#1 2019-08-09 02:15:46

dd-b
Member
Minneapolis, MN USA
2018-04-16
69

Period in keywords splits keywords

I see this was reported in 2012, but it's still in the current version.

It's bad enough that commas in keywords get treated as separators. As far as I can tell, the problem was that early IPTC keyword implementations used embedded characters to support multiple keywords and had no escape mechanism. This badly screwed up my first couple of years of keywording: I imported some stuff using "last, first" for people's names into the wrong software and they all got broken apart. Ouch!

This seems to be an import/sync issue. I've seen no problem creating keywords with periods in them using the website tools; nothing strange happens then.

But now -- can't use period either?  Who keeps picking these random characters and taking them away?  What character will go next?  Here, or somewhere else?  Who can tell me what characters I can safely use?

Best we can do locally is not take away anything *we* don't absolutely have to, which might be just "," and ";"?   Does anybody know a reason we *should* be splitting imported keywords on dots? 

The previous thread points out where to fix the code.  I'll give it a try, but of course a local patch is a problem every time I update, so I'm looking for a more stable long-term solution.  It'll be something like a 10-line patch file.


    Piwigo 2.9.5
    Operating system: Linux
    PHP: 7.0.33 [2019-08-08 17:15:10]
    MySQL: 5.7.25-log [2019-08-08 17:15:10]
    Graphics Library: ImageMagick 6.9.7-4


#2 2019-08-09 03:16:40

dd-b
Member
Minneapolis, MN USA
2018-04-16
69

Re: Period in keywords splits keywords

Vastly easier than I expected!

The regex used to split keywords, which includes the dot or period character, is already configurable.

So, to fix this, all I had to do was add the following line to my local/config/config.inc.php:

Code:

$conf['metadata_keyword_separator_regex'] = '/[,;]/';

(This line is identical to the default version over in config_default.inc.php, except that I have removed the dot from inside the square brackets.)

That change fixed my problem.  I'm attaching a small jpeg that has a number of keywords with periods in them that I used as a test, in case anybody else needs to test anything related to this.
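
For anyone who wants to see the effect of dropping the dot before touching their config, here is a minimal standalone sketch. It assumes the default regex is '/[.,;]/' (the line above with the dot put back; check your own config_default.inc.php). The sample string and the split_keywords() helper are made up for illustration and are not Piwigo code; how Piwigo post-processes the pieces may differ.

```php
<?php
// Made-up sample keyword string, for illustration only.
$raw = 'J. R. R. Tolkien; St. Paul; v1.2';

// Assumed default separator regex (dot included in the character class).
$default_regex = '/[.,;]/';
// The override from local/config/config.inc.php shown above.
$local_regex = '/[,;]/';

// Hypothetical helper: split on the regex, trim each piece, drop empty ones.
function split_keywords($regex, $raw)
{
    $parts = array_map('trim', preg_split($regex, $raw));
    return array_values(array_filter($parts, 'strlen'));
}

// With the dot as a separator, the names and the version string fall apart.
print_r(split_keywords($default_regex, $raw));

// Without the dot, only the semicolons separate keywords.
print_r(split_keywords($local_regex, $raw));
```

With the dot in the character class the sample breaks into eight fragments; with the override it stays as three keywords.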


#3 2019-08-09 07:23:06

JohnB
Member
2019-08-02
16

Re: Period in keywords splits keywords

Very useful. Otherwise any photos whose keywords were people's names with initials would be broken. This should be the default.


#4 2019-08-09 07:38:00

dd-b
Member
Minneapolis, MN USA
2018-04-16
69

Re: Period in keywords splits keywords

I do think the default should be to NOT split keywords with periods in them, yes.  It's a nasty surprise to anybody who uses periods in their keywords -- which is a whole bunch of people (I've looked at a LOT of IPTC fields over the decades).  We don't want people trying out or starting to use Piwigo having nasty surprises more often than necessary. 

(Subject to somebody coming along explaining why it's actually important; but since nothing else I've found so far does it, I very strongly suspect it's a vestige of somebody with a strong opinion and/or a weird situation where it made sense, and we really would be better off changing the default.)


#5 2019-08-13 15:35:45

plg
Piwigo Team
Nantes, France, Europe
2002-04-05
13786

Re: Period in keywords splits keywords

I don't remember why period [.] is used by default. It was added in Piwigo commit cf5f9f4e (on GitHub) back in 2006 (yes, that's 13 years ago).

I agree it's not a good idea; we should remove it from the default.

I also think that forcing comma to be a separator is not great either. Sometimes it's relevant to have a comma in a tag. Maybe we should implement a clean "CSV" (comma-separated values) algorithm, with the PHP function str_getcsv for example. Such a string:

Code:

first tag, "a, tag, with, commas", last.tag

would produce 3 tags only. It could be a configuration setting: either you use a regex to separate values, or a CSV parser.
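
A rough sketch of the CSV idea, run on the example string above. This is only an illustration of str_getcsv, not existing Piwigo code; the trim() step is my assumption about how surrounding spaces should be handled.

```php
<?php
// The example string from above.
$raw = 'first tag, "a, tag, with, commas", last.tag';

// str_getcsv() splits on the delimiter but keeps a quoted field whole.
// Delimiter, enclosure and escape character are passed explicitly.
$fields = str_getcsv($raw, ',', '"', '\\');

// Assumption: spaces left around unquoted fields are not wanted in tags.
$tags = array_map('trim', $fields);

print_r($tags);
```

The dot in the last tag is left alone because the only separator here is the comma.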


#6 2019-08-13 16:39:52

dd-b
Member
Minneapolis, MN USA
2018-04-16
69

Re: Period in keywords splits keywords

Lots of software does have trouble with keywords with commas -- but as far as I can tell, the XMP format doesn't, and exiftool doesn't, so getting ourselves on the right side of that line would be good. This bad behavior is probably left over from pre-XMP days?

A quick test shows that the commas in exiftool's output are just its default list separator, and that it can extract a keyword with a comma in it (it also inserted that keyword, not shown):

Code:

$ exiftool -n -sep "*" -keywords t2-latest.jpg
Keywords                        : Keyword_BBBBB*new01*new02, with embedded comma
