Announcement

#1 2021-08-08 20:35:45

intbonus
Member
2021-08-07
1

Bash script to remove duplicate gallery images & hello!

Hi team!

First-time Piwigo user and forum poster here. I set up a self-hosted version and ported all my Google Photos images to it. Amazing project and great fun to set up and figure out : )

One of the problems I had with my photo export was a lot of duplicate images in different sizes and qualities. I'm a newbie at writing shell scripts, so I tried my hand at writing one to solve this problem. I couldn't find a solution for this in the forums, so I hope you don't mind me sharing my attempt.

Notes
This works for image files in folders inside the ./galleries/ folder of Piwigo! Or indeed any folder of images you might have on Linux.
This script assumes your file names have no spaces in them, because Piwigo requires that! I'll enhance it to handle file names with spaces at some point. Make sure none of your files and folders have spaces in their names before running. You can use the rename utility to sort that out.
I'm also going to add a "move" mode soon that will move duplicates to a directory you specify. If you're savvy with bash scripts you can easily sort this out yourself on line 77.
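
For the space-removal step, here is a minimal sketch that uses plain find and mv rather than the rename utility mentioned above. The function name strip_spaces is made up for illustration and is not part of the script from this post; it assumes bash and a find that supports -print0 (GNU find does).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): replace every space
# with an underscore in file and folder names below the given directory.
strip_spaces() {
    local root="${1:-.}"
    # -depth lists a folder's contents before the folder itself, so renaming
    # a parent never invalidates paths that are still waiting to be handled.
    find "$root" -depth -name '* *' -print0 |
        while IFS= read -r -d '' path; do
            local dir base
            dir=$(dirname "$path")
            base=$(basename "$path")
            # -n tells mv not to overwrite an existing file of the same name.
            mv -n -- "$path" "$dir/${base// /_}"
        done
}
```

Calling `strip_spaces ./galleries` would turn `my album/beach day.jpg` into `my_album/beach_day.jpg`. Try it on a copy of your folder first.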

Step 1
Install the very useful findimagedupes package.

Code:

sudo apt-get install findimagedupes

This command-line tool runs through all the images in a given folder, analyses each one, and outputs one line per group of matching files. By default, two images count as duplicates if they are at least a 90% match. More on how it works.
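
If you want to try findimagedupes on its own before using the script, a small wrapper like the one below may help. It is only a sketch: scan_dupes is a made-up name, and whether the script in this post passes the same options is an assumption. The --recurse and --threshold options come from the findimagedupes man page.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not from the original script): scan a folder and its
# subfolders, printing one line per set of visually similar images.
scan_dupes() {
    local dir="${1:-.}"          # folder to scan, current directory by default
    local threshold="${2:-90%}"  # similarity threshold; 90% is the tool's default
    findimagedupes --recurse --threshold="$threshold" "$dir"
}
```

For example, `scan_dupes ./galleries 95%` asks for a stricter match. Each output line lists the files in one duplicate set separated by spaces, which is why file names containing spaces cause trouble.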

Step 2
Here's the bash script I wrote called image-dupes.sh. As I said, pretty new at this so it isn't perfect!

The script won't delete any files until you change deletefiles to true in the script.
It defaults to the current directory if you don't specify one.

Code:

chmod +x image-dupes.sh
./image-dupes.sh /path/to/images

The script will use findimagedupes to run through all images in the given folder, and print out all the dupes it found along with the file size of each one.

The script selects the largest file to keep. This was just my preference, to keep the highest quality one :)
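
To illustrate the "keep the largest" idea, here is a rough sketch of how one duplicate set could be handled. It is not the script from this post: pick_largest and handle_dupe_set are hypothetical names, and only the deletefiles switch is taken from the description above. Like the original, it assumes file names without spaces once the sets are read from findimagedupes output.

```shell
#!/usr/bin/env bash
deletefiles=false   # safety switch as described in the post; true enables rm

# Hypothetical helper: print the largest file (in bytes) among the paths given.
pick_largest() {
    local best="" best_size=-1 size f
    for f in "$@"; do
        size=$(wc -c <"$f")
        if (( size > best_size )); then
            best=$f
            best_size=$size
        fi
    done
    printf '%s\n' "$best"
}

# Hypothetical helper: report one duplicate set, and delete everything except
# the largest file only when deletefiles is set to true.
handle_dupe_set() {
    local keep f
    keep=$(pick_largest "$@")
    echo "keep  $keep ($(wc -c <"$keep") bytes)"
    for f in "$@"; do
        if [[ $f == "$keep" ]]; then
            continue
        fi
        echo "dupe  $f ($(wc -c <"$f") bytes)"
        if [[ $deletefiles == true ]]; then
            rm -- "$f"
        fi
    done
}
```

Each line of findimagedupes output could then be fed in with something like `while read -r -a files; do handle_dupe_set "${files[@]}"; done`. When two files have the same size, this sketch keeps whichever is listed first.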

Step 3
Back up your images, then change deletefiles=false to deletefiles=true and re-run the script. This time it'll delete all duplicates, and keep the largest original file intact!
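
One possible way to take that backup is a timestamped tar archive. This is just a sketch under my own assumptions: backup_images and the archive name are invented for illustration, and any backup method you already trust is fine instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): archive a folder of
# images into a timestamped .tar.gz and print the archive's path.
backup_images() {
    local src="${1:-.}"       # folder to back up
    local dest="${2:-$HOME}"  # where to write the archive; keep it outside src
    local archive
    archive="$dest/images-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    # -C switches into src first, so paths inside the archive are relative.
    tar -czf "$archive" -C "$src" .
    printf '%s\n' "$archive"
}
```

For example, `backup_images ./galleries /mnt/backup` (the destination path here is only a placeholder) writes the archive and prints where it went. You can check the contents with `tar -tzf` before deleting anything.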

Once your duplicates are deleted, run Synchronisation again in the Tools section of Piwigo admin and your dupes will be gone.

Thanks again for Piwigo, it's pretty cool and helps me move away from relying on Google for my stuff! My gallery is at https://pics.seanmaddison.uk/ with not many public pics yet while I sort them out.

Last edited by intbonus (2021-08-08 21:08:18)
