Digital Trash Heap

punkcoder@lemmy.world to Selfhosted@lemmy.world – 17 points

So here’s the problem I have: I have several generations of backups, which are currently taking up huge amounts of space on my NAS. I want to be able to go through and process all of the files on it while deduplicating, and possibly tag any files I find that are helpful. Is anyone aware of a good tool to help accomplish this task? Again, because of the nature of the backups, I don’t want to use any software I’m not running locally.

Thanks in advance.


How are your backups currently stored? As simple copies of the files, like you would make with rsync? I assume you're on a Linux NAS, in which case fdupes would likely fit the bill. meld would be another option, and it also has a GUI if your NAS isn't headless.
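If you want to see roughly what a duplicate finder does before pointing one at your NAS, here is a minimal Python sketch of the general idea: group files by size, then hash only the candidates that share a size. It is not fdupes' actual implementation, and the names `file_digest` and `find_duplicates` are made up for illustration. It only reports duplicates and deletes nothing.

```python
import hashlib
import os
import sys
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    # Step 1: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[(size, file_digest(path))].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print each group of identical files, separated by a blank line.
    for group in find_duplicates(sys.argv[1]):
        print("\n".join(group), end="\n\n")
```

Run it with the directory to scan as the only argument. Everything stays local, which matches what OP asked for.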

For future backups, restic might be a nice option, as it deduplicates data each time you run a backup. You can set retention policies (e.g. 7 daily, 4 weekly, 2 monthly, etc.) so that only backups at regular intervals are kept.
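To make the retention idea concrete, here is a rough Python sketch of how a "7 daily, 4 weekly, 2 monthly" policy could pick which snapshots survive. This is a simplified model and not restic's own algorithm; `select_keep` is a hypothetical helper. In restic itself the equivalent is something along the lines of `restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 2`.

```python
from datetime import datetime


def select_keep(snapshots, daily=7, weekly=4, monthly=2):
    """Return the set of snapshot timestamps to keep.

    For each rule, walk the snapshots newest-first and keep the newest
    snapshot in each of the most recent N distinct periods.
    """
    rules = [
        (daily, lambda t: t.date()),
        (weekly, lambda t: tuple(t.isocalendar())[:2]),  # (ISO year, ISO week)
        (monthly, lambda t: (t.year, t.month)),
    ]
    keep = set()
    ordered = sorted(snapshots, reverse=True)
    for limit, bucket in rules:
        seen = []
        for ts in ordered:
            key = bucket(ts)
            if key in seen:
                continue  # an older snapshot in a period already covered
            if len(seen) >= limit:
                break  # this rule has kept enough periods
            seen.append(key)
            keep.add(ts)
    return keep
```

A snapshot survives if any rule picks it, so the newest snapshot usually counts toward all three rules at once. Everything outside the returned set is a candidate for deletion.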

Borg Backup would also fit the bill for backups going forward, especially if OP is still backing up to a local server (as opposed to cloud object storage).

I haven't tried Borg, but have noticed it mentioned pretty often in data hoarder forums. What do you like about it?

It deduplicates aggressively at the block level. So if your files don't change much, each additional backup takes very little space. And if a file changes a little, Borg only backs up what's changed instead of the whole file again.
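A toy Python sketch of the idea, in case it helps: split data into chunks, store each unique chunk once under its hash, and keep a list of hashes per backup. This is a heavy simplification. It uses tiny fixed-size chunks for readability, which is an assumption made here and not a description of how Borg actually chunks data, and `ChunkStore` is a made-up name.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: each unique chunk is stored only once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes

    def backup(self, data):
        """Store data; return (list of chunk digests, count of new bytes stored)."""
        manifest, new_bytes = [], 0
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks never seen before take up additional space.
                self.chunks[digest] = chunk
                new_bytes += len(chunk)
            manifest.append(digest)
        return manifest, new_bytes

    def restore(self, manifest):
        """Rebuild the original data from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in manifest)


store = ChunkStore()
first, stored_first = store.backup(b"aaaabbbbccccdddd")
second, stored_second = store.backup(b"aaaabbbbXXXXdddd")
print(stored_first, stored_second)  # 16 4
```

The second backup differs in one chunk, so only those 4 bytes are new. Both backups can still be restored in full from their own manifests.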

Borg also has a rich ecosystem of wrappers and tools (borgmatic, Vorta, etc.) that extend its functionality and make it easier to use.

Interesting, sounds like it's worth checking out. Plus, as a Star Trek fan, I approve of the name 😄

The only thing I can think of is to do a restore of all the backups in sequence, assuming they're all of the same thing. That would give you one consolidated image. Then you could run some deduplication and take a new single backup, if desired.
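If the backups are plain directory trees, the "restore in sequence" step could look something like this Python sketch. It assumes each backup is already mounted or extracted as a folder and that you pass them oldest first; `consolidate` is a hypothetical helper. Try it on a scratch copy first.

```python
import shutil


def consolidate(backup_dirs, target):
    """Copy each backup tree into target in order, so later backups win."""
    for src in backup_dirs:
        # dirs_exist_ok=True (Python 3.8+) merges into an existing target,
        # overwriting files that share the same relative path.
        shutil.copytree(src, target, dirs_exist_ok=True)
```

Files that exist only in older backups are kept, and files present in several backups end up as their newest version. After that you can deduplicate the single consolidated tree.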

But it's so subjective that I don't think there's really any way to automate it. I would mount all the backups, go through everything, pick out what you want to keep, and delete the rest.

Look at it this way. If you've had the backup for years, and never needed to restore any of those files, how likely are you to ever need them in the future? Even if you did delete something you later wanted, how life-threatening would it be to not have it?

Or you could take the easy way out and just add more storage.