
#1 2012-01-15 06:09:53

hwang00
Member
2011-02-18
4

support unicode characters in pathname and filename

I am trying to keep remote files (on the server) synchronized with local files (on my PC). However, I organize my local photos by directory name (the album name) and filename (the photo's description), both written in my own non-Western language.
So some improvements to the core features would be very welcome:

* Support Unicode (non-Western characters) in the pathnames and filenames of data in the "galleries" folder.
* Keep the original filename of uploaded files, including when it contains non-ASCII characters.

This would be a great benefit to all non-English users, and it would also make FTP/remote batch management of photo files more logical and easier.

Last edited by hwang00 (2012-01-15 06:13:42)

Offline

 

#2 2012-05-11 15:14:57

miblo69
Guest

Re: support unicode characters in pathname and filename

I totally agree with this! The gallery seems to have most of the features I have been looking for, but the lack of support for UTF-8 or international characters is a show stopper.
Thanks,
~Mike

 

#3 2012-06-05 14:04:48

miblo69
Member
Stockholm, Sweden
2012-06-05
27

Re: support unicode characters in pathname and filename

I have gone so far as to try and manually update the code (using 2.4 RC3) and identify where the problems are, to allow UTF-8 characters in filenames and directories when doing a 'Local Sync'.

But it's driving me nuts! There must be some character conversion going on 'behind the scenes' that doesn't allow for UTF-8. I roughly know where the problem lies, but I just can't find a solution. If someone is willing to lend a hand, I'll gladly find time to revise the PHP code. Once finished, I'll supply it to the original developers.

I am working on the Local Sync stuff, and am editing i.php. 

- First, I added "$conf['sync_chars_regex'] = '/^[a-zA-Z0-9-_.åäöÅÄÖ ]+$/';" to config.inc.php in /local/config/ (yes, Swedish is my native language...).

- Second, I added a urldecode call in i.php:
Line 199: changed from '$req = ltrim($req, '/');' to '$req = urldecode(ltrim($req, '/'));'
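
To illustrate the urldecode step: a browser percent-encodes the UTF-8 bytes of a non-ASCII path, so the raw request string does not match the name stored on disk or in the database until it is decoded. A minimal sketch, where the request path is a made-up example and not taken from i.php:

```php
<?php
// Hypothetical request path as a browser would send it:
// "Å" is the UTF-8 byte pair C3 85, percent-encoded as %C3%85.
$req = '/galleries/2017/%C3%85sas%20Bilder/photo.jpg';

// Same transformation as the modified line in i.php.
$req = urldecode(ltrim($req, '/'));

echo $req, "\n";
```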

BUT the query starting on line 449 NEVER matches if $page['src_location'] contains a UTF-8 encoded string. All ASCII characters, including spaces, work, but as soon as öäåÖÄÅ appear in it, it fails and returns nothing.

The really strange part is that when I print the value of $page['src_location']  as a HEX string (using error_log and printing to a file), and compare it with the actual field contents (running a manual mysql query) - they are identical!

When the query is run through php/piwigo - no match. Manually with mysql - it matches.

If someone could tell me what the difference is - I'd be able to proceed with the 'Internationalization' of Piwigo.

Running on a Linux host,  3.3.7-1.fc16.i686, PHP 5.3.13, UTF-8 all over, Apache/2.2.22.

Many Thanks for any pointers!
~Mike

Offline

 

#4 2012-06-05 15:27:41

miblo69
Member
Stockholm, Sweden
2012-06-05
27

Re: support unicode characters in pathname and filename

Jumping Jee-bees! I made it work! :-)

Don't ask me why, but I added a call to the PHP function:
pwg_db_check_charset();

before this line:
if ( ($row=pwg_db_fetch_assoc(pwg_query($query))) )
in the file i.php.

And then it just worked! This is in addition to the previously posted urldecode change.
I also set DB_COLLATE to 'utf8_general_ci' (it is blank by default).
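
One plausible explanation, offered as an assumption rather than something confirmed in this thread: if the database connection is not using UTF-8, the server reinterprets each byte of the UTF-8 string as a separate single-byte character, so the value being compared is no longer the one stored, even though the bytes PHP sends look identical in a hex dump. The sketch below only imitates that reinterpretation with mb_convert_encoding (it needs the mbstring extension); it does not touch Piwigo or MySQL.

```php
<?php
// "ö" as UTF-8 bytes, written with escapes so the file encoding does not matter.
$utf8 = "\xC3\xB6";

// Imitate a single-byte connection: treat each byte as one ISO-8859-1
// character and convert the result to UTF-8.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";     // c3b6
echo bin2hex($misread), "\n";  // c383c2b6
var_dump($utf8 === $misread);  // bool(false)
```

If this is what happened, it would fit the symptom that pure ASCII names matched while names with öäåÖÄÅ did not, since ASCII bytes are the same in both encodings.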

~Mike

Offline

 

#5 2012-06-05 16:33:45

flop25
Piwigo Team
2006-07-06
7037

Re: support unicode characters in pathname and filename

Thanks a lot! I think it will be in 2.4.1.
Could you open an entry in our bugtracker? Thanks again: http://piwigo.org/bugs/main_page.php



Offline

 

#6 2017-03-05 11:07:38

miblo69
Member
Stockholm, Sweden
2012-06-05
27

Re: support unicode characters in pathname and filename

It has been a long time since my last post... But I am running the latest version, 2.8.6, and have always (for years) had problems with filenames containing international (Swedish, in my case) characters.

Especially if the directory name starts with a Swedish character, as in '/galleries/2017/Åsas Bilder', it is displayed as 'sas Bilder', and Quick Local Synchronization adds it again every time.

However, I found a fix that solves it for my purposes. The problem was that basename() does not handle multibyte characters correctly. The solution was to add an mb_basename() function and modify site_update.php in two places.

In functions.inc.php I added:

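/**
* Attempt to fix multibyte filenames
* MB 2017-03-04
* Originally found here:
* http://stackoverflow.com/questions/4451 … e-is-utf-8
*/
if (!function_exists("mb_basename"))
{
    function mb_basename($path)
    {
        // Wrap every non-space character in an ASCII marker so that no path
        // component starts with a multibyte character when basename() runs.
        $separator = " qq ";
        $path = preg_replace("/[^ ]/u", $separator."\$0".$separator, $path);
        $base = basename($path);
        // Strip the markers again to recover the original name.
        $base = str_replace($separator, "", $base);
        return $base;
    }
}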


Then edit site_update.php on line 216 from $dir = basename($fulldir); to $dir = mb_basename($fulldir);
And on line 502 change $filename = basename($path); to $filename = mb_basename($path);
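
A possible alternative to the marker trick, shown only as a sketch and not tested against Piwigo: since the byte 0x2F ('/') never occurs inside a multibyte UTF-8 sequence, the last path component can be cut out with plain byte-string functions, which do not consult the locale. The function name utf8_basename is made up for this example, and it handles only '/' as a separator.

```php
<?php
if (!function_exists('utf8_basename'))
{
    // Hypothetical helper: returns the last component of a '/'-separated
    // path without calling the locale-dependent basename().
    function utf8_basename($path)
    {
        // Ignore trailing slashes, as basename() does.
        $trimmed = rtrim($path, '/');
        $pos = strrpos($trimmed, '/');
        return ($pos === false) ? $trimmed : substr($trimmed, $pos + 1);
    }
}

echo utf8_basename("/galleries/2017/\xC3\x85sas Bilder"), "\n";
```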

It works for me. It is not extensively tested, and I don't know if it breaks other things. But I am very happy that I could get Quick Local Synchronization to work correctly with Swedish characters.


Of course, this requires that you also have edited your $conf['sync_chars_regex']  to allow International/Swedish characters.

Offline

 

#7 2017-03-15 16:38:19

miblo69
Member
Stockholm, Sweden
2012-06-05
27

Re: support unicode characters in pathname and filename

After some more testing, I found an even simpler solution.

In .../local/config/config.inc.php I added:
setlocale(LC_ALL,'sv_SE.utf8');

This matches my system's locale, which is also installed. You can see which locales are installed by running 'locale -a' from the CLI.
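
A slightly more defensive variant, as a sketch: setlocale() returns false when none of the requested locales is installed, so the result is worth checking. Restricting the call to LC_CTYPE is a suggestion of this example, not something tested in this thread; it changes only character handling, whereas LC_ALL also switches number formatting (on PHP versions before 8.0 a Swedish locale makes floats print with a decimal comma). The candidate list is an assumption and should be adapted to the output of 'locale -a'.

```php
<?php
// Candidate locales, tried in order; adapt to what `locale -a` lists.
$candidates = array('sv_SE.utf8', 'en_US.utf8', 'C.UTF-8');

// Returns the name of the locale that was set, or false if none is installed.
$chosen = setlocale(LC_CTYPE, $candidates);

if ($chosen === false)
{
    error_log('No UTF-8 locale from the candidate list is installed');
}
else
{
    echo "LC_CTYPE is now $chosen\n";
}
```

For multibyte character handling, any installed UTF-8 locale should behave the same way, so the country part of the name matters little for this purpose.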

EDIT: Spelling.
EDIT2: setlocale was wrong. Was missing a parameter.

Last edited by miblo69 (2017-04-18 15:31:21)

Offline

 

#8 2017-04-18 12:38:52

rob777
Member
2017-02-11
9

Re: support unicode characters in pathname and filename

Hi,
Your posts are interesting.
I'm in a similar situation: I have file names containing special characters like minus, plus, space... and even Asian characters.

Piwigo refuses to import them: wrong filename


I tried to add :
setlocale(LC_ALL, 'en_US.utf8');


Without success.

But I'm not sure how this locale setting works.
As I understand it, UTF-8 covers all characters worldwide, so there shouldn't need to be a country in the name...?

In my case, locale -a returns :

C
C.UTF-8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_IN.utf8
en_NG
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM
en_ZM.utf8
en_ZW.utf8
POSIX


How can I set this so that Piwigo accepts any characters (special, Western, Asian, Russian...)?

Thank you.

Last edited by rob777 (2017-04-19 09:25:41)

Offline

 

#9 2017-04-18 14:43:38

rob777
Member
2017-02-11
9

Re: support unicode characters in pathname and filename

Actually, I managed to get most of these files to import by adding

$conf['sync_chars_regex'] = '/^[a-zA-Z0-9-_.+$%(){}[\],~#*;\\/|<>"\'!?^:;=@àéèçëû&ùêôãúâîü]/';

to local/config/config.inc.php (localFiles editor plugin).

It doesn't help for Russian/Asian characters, though.
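
For scripts beyond Western European ones, listing individual characters does not scale. A possible approach, sketched under the assumption that Piwigo passes this setting to preg_match() for each file and directory name (not verified here): use Unicode property classes together with the /u modifier, so any letter, combining mark, or digit in any script is accepted. The set of extra punctuation allowed below is an arbitrary choice for this example.

```php
<?php
// \p{L} = any letter, \p{M} = combining marks, \p{N} = any digit;
// the /u modifier makes PCRE treat pattern and subject as UTF-8.
$conf['sync_chars_regex'] = '/^[\p{L}\p{M}\p{N} _.+\-]+$/u';

$names = array('Åsas Bilder', 'фото-2017', '写真.jpg', 'bad|name');
foreach ($names as $name)
{
    $ok = preg_match($conf['sync_chars_regex'], $name) === 1;
    echo $name, ': ', ($ok ? 'accepted' : 'rejected'), "\n";
}
```

The trailing +$ makes the check apply to the whole name; a pattern that ends right after the character class tests only the first character.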

Offline

 

#10 2017-04-18 15:33:09

miblo69
Member
Stockholm, Sweden
2012-06-05
27

Re: support unicode characters in pathname and filename

My bad. The correct setting that worked for me in config.inc.php was:

setlocale(LC_ALL,'sv_SE.utf8');

(Also edited my earlier post with the correction)

Offline

 

#11 2017-04-19 09:25:10

rob777
Member
2017-02-11
9

Re: support unicode characters in pathname and filename

Yes, you are right, this is what I did too, but I'm still not sure how it can be used to handle multiple languages at the same time (say Swedish, Russian, and Chinese, for instance).

Offline

 