MantisBT - Piwigo
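ID: 0003218
Project: Piwigo
Category: synchronization
View Status: public
Date Submitted: 2015.04.09 16:51
Last Update: 2015.04.09 16:51
Reporter: mindhaq
Priority: normal
Severity: major
Reproducibility: always
Status: new
Resolution: open
Platform: Linux
OS: Ubuntu
OS Version: 12.04
Product Version: 2.7.4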
 
Browser: any
PHP Version: 5.3.10-1ubuntu3.17
Web Server: Apache 2.2.x
0003218: Normalize UTF-8 filenames before synchronization
Filenames scanned during synchronization should be normalized to the precomposed Unicode form (NFC) using PHP's Normalizer class, at least when the default encoding is UTF-8 and PHP 5.3 or later (with the intl extension) is available.

http://us2.php.net/manual/en/class.normalizer.php
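
A minimal sketch of what such a normalization step could look like. The function name normalize_filename_nfc is hypothetical and not part of Piwigo; the sketch only assumes that the Normalizer class from the intl extension may or may not be present, and leaves the name unchanged when it is missing or when normalization fails.

```php
<?php
// Hypothetical helper (not Piwigo code): return the filename in NFC
// (precomposed) form, or unchanged if that is not possible.
function normalize_filename_nfc($filename)
{
    // Normalizer is provided by the intl extension (PHP >= 5.3).
    if (!class_exists('Normalizer')) {
        return $filename;
    }

    $normalized = Normalizer::normalize($filename, Normalizer::FORM_C);

    // On failure (e.g. invalid UTF-8) normalize() does not return a string;
    // fall back to the original name in that case.
    return is_string($normalized) ? $normalized : $filename;
}
```

Such a helper would presumably be applied to each scanned filename before it is checked against $conf['sync_chars_regex']; where exactly that hook belongs in the synchronization code is not covered here.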
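Unicode allows two different ways to represent umlaut characters such as the German ä, ö, ü: a single precomposed code point, or a base letter followed by a combining diaeresis. Both are valid UTF-8 but produce different byte sequences. For technical background on that see e.g. the following: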

http://stackoverflow.com/questions/1089966/utf8-filenames-in-php-and-different-unicode-encodings
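
To make the difference concrete, here is a small illustration of the two byte sequences for ä. The variable names are chosen only for this example.

```php
<?php
// Precomposed (NFC): U+00E4, encoded in UTF-8 as the bytes C3 A4.
$precomposed = "\xC3\xA4";

// Decomposed (NFD): "a" (61) followed by U+0308 COMBINING DIAERESIS (CC 88).
$decomposed = "a\xCC\x88";

echo bin2hex($precomposed), "\n";        // c3a4
echo bin2hex($decomposed), "\n";         // 61cc88
var_dump($precomposed === $decomposed);  // bool(false)
```

Both strings render as the same character, but a byte-wise comparison treats them as different.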

I upload images from my Mac to my Linux server via SSH, and some of the filenames contain umlaut characters. For those files to be imported, I already changed the regex to the following:

$conf['sync_chars_regex'] = '/^[\s,.\'\pL0-9-_.]+$/u';

This works fine for files with umlauts that I create directly on the server.

On the Mac, umlauts such as ä are encoded in the decomposed format 0x61cc88, i.e. "a" followed by U+0308 (combining diaeresis). When I upload the files, they keep that filename. This is no problem for Linux; the names are displayed just fine. However, PHP's regex parser does not match those decomposed characters with the \pL character group (which it should). So it would be great if those filenames were normalized before matching.
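
The behaviour can be reproduced with a short script. This is only a sketch: the filename "März.jpg" is a made-up example, and the normalization step is skipped when the Normalizer class (intl extension) is not installed. The decomposed name fails presumably because U+0308 belongs to the Mark category (\pM) rather than the Letter category (\pL).

```php
<?php
// The pattern from this report.
$regex = '/^[\s,.\'\pL0-9-_.]+$/u';

// Hypothetical filename "März.jpg" in both forms.
$decomposed  = "Ma\xCC\x88rz.jpg"; // as uploaded from the Mac
$precomposed = "M\xC3\xA4rz.jpg";  // as created directly on Linux

var_dump(preg_match($regex, $decomposed));  // int(0): rejected
var_dump(preg_match($regex, $precomposed)); // int(1): accepted

// Normalizing to NFC first lets the uploaded name pass as well.
if (class_exists('Normalizer')) {
    $normalized = Normalizer::normalize($decomposed, Normalizer::FORM_C);
    var_dump(preg_match($regex, $normalized)); // int(1)
}
```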
Issue History
2015.04.09 16:51  mindhaq  New Issue