How do you all go about backing up your data, on Linux?

Kalcifer@lemmy.world to Linux@lemmy.ml – 192 points –

I'm trying to find a good method of making periodic, incremental backups. I assume that the most minimal approach would be to have a Cronjob run rsync periodically, but I'm curious what other solutions may exist.

I'm interested in both command-line, and GUI solutions.


Timeshift is a great tool for creating incremental backups. Basically it's a frontend for rsync, and it works great. If needed, you can also use it from the CLI.
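
For reference, the CLI side is simple enough to script; a rough sketch (the comment text is just an example):

```
# create an on-demand snapshot (rsync mode)
sudo timeshift --create --comments "before system upgrade"

# list existing snapshots, or restore one interactively
sudo timeshift --list
sudo timeshift --restore
```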

Is it just me, or does the backup topic come up every few days on !linux@lemmy.ml and !selfhosted@lemmy.world?

To stay on topic: I use the restic + autorestic combo. Pretty simple; I made a repo with a small script to generate the config for different machines, and that's it. I store backups between machines and on B2.

It is a critical one. Maybe needs to be part of an FAQ with link to discussion.

It hasn't succeeded in nagging me to properly back up my data yet, so I think it needs to be discussed even more.


I have a bash script that backs all my stuff up to my home server with Borg. My servers have cron jobs that run similar scripts.
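
Not my exact script, but the general shape of a Borg-to-home-server job is roughly this (host and paths are made up):

```
# one-off: create an encrypted repo on the home server
borg init --encryption=repokey ssh://user@homeserver/srv/backups/laptop

# in the nightly script: archive selected dirs, then prune old archives
borg create --stats --compression zstd \
    ssh://user@homeserver/srv/backups/laptop::'{hostname}-{now:%Y-%m-%d}' \
    /home/user/Documents /home/user/Pictures /home/user/projects

borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
    ssh://user@homeserver/srv/backups/laptop
```

A crontab line like `0 2 * * * /home/user/bin/backup.sh` then runs it nightly.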

I use Back In Time to backup my important data on an external drive. And for snapshots I use timeshift.


Doesn't Timeshift serve the same purpose, or is it just a matter of preference?

Yes, it is the same purpose, kinda. But timeshift runs as a cron and allows for an easy rollback, while I use BIT for manual backups.

Pika Backup (GUI for borgbackup) is a great app for backups. It has all the features you might expect from backup software and "just works".

I use restic (https://restic.net/), which can use rclone to connect to a variety of backends (e.g. OneDrive, Mega, Dropbox, etc.). Also, resticprofile makes it easier to run (it hides the flags in a config file). I use it manually, but a cron job would be easy to implement (a tutorial is here: https://forum.yunohost.org/t/daily-automated-backups-using-restic/16812).

Restic does not need rclone and can use many remote storage services directly. I do restic backups directly to Backblaze.
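
For example, the native B2 backend only needs a couple of environment variables; something like this (bucket name and paths are placeholders):

```
# B2 credentials and repository for restic's native backend
export B2_ACCOUNT_ID="<key id>"
export B2_ACCOUNT_KEY="<application key>"
export RESTIC_REPOSITORY="b2:my-backup-bucket:hostname"
export RESTIC_PASSWORD_FILE="$HOME/.config/restic/password"

restic init          # once, to create the repository
restic backup ~/ --exclude-caches
restic forget --prune --keep-daily 7 --keep-weekly 5 --keep-monthly 12
```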

I like rsnapshot, run from a cron job at various useful intervals. Backups are hardlinked and rotated so that eventually the disk usage reaches a very slowly growing steady state.

I also use it. A big benefit is that you don't need special software to access your backup.

Been using rsnapshot for years, has saved me more than once
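
For anyone new to it, a minimal rsnapshot setup is just a config plus a few cron lines; something like this (paths and retention counts are examples):

```
# /etc/rsnapshot.conf (excerpt) -- fields must be separated by tabs
snapshot_root   /mnt/backup/rsnapshot/
retain  alpha   6
retain  beta    7
retain  gamma   4
backup  /home/  localhost/
backup  /etc/   localhost/

# crontab driving the rotation (alpha every 4 hours, beta daily, gamma weekly)
0 */4 * * *   /usr/bin/rsnapshot alpha
30 3 * * *    /usr/bin/rsnapshot beta
0 4 * * 1     /usr/bin/rsnapshot gamma
```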

Exactly like you think. Cronjob runs a periodic rsync of a handful of directories under /home. My OS is on a different drive that doesn't get backed up. My configs are in an ansible repository hosted on my home server and backed up the same way.
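
For anyone wanting a starting point, that amounts to a single crontab entry; something like (paths and schedule are examples):

```
# crontab -e  -- nightly at 01:30
30 1 * * * rsync -a --delete /home/user/Documents /home/user/Pictures /home/user/projects /mnt/backup/home/ >> /home/user/backup.log 2>&1
```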

rsync + Backblaze B2. Backblaze is stupid cheap.

Cost is about $10 per year.

Duplicity (cli) with deja-dup (gui) has saved my sorry ass many times.

I do periodic backups of my system from live usb via Borg Backup to a samba share.

Used to use Duplicati but it was buggy and would often need manual intervention to repair corruption. I gave up on it.

Now use Restic to Backblaze B2. I've been very happy.

I've used restic in the past; it's good but requires a great deal of setup if memory serves me correctly. I'm currently using Duplicati on both Ubuntu and Windows and I've never had any issues. Thanks for sharing your experience though; I'll be vigilant.

Restic to B2 is made of win.

The quick, change-only backups in a single executable intrigued me; the ability to mount snapshots to get at, e.g., a single file hooked me. The wide, effortless support for services like Backblaze made me an advocate.

I back up nightly to a local disk, and twice a week to B2. Everywhere. I have some 6 machines I do this on; one holds the family photos and our music library, and is near a TB by itself. I still pay only a few dollars per month to B2; it's a great service.

Check out Pika backup. It's a beautiful frontend for Borg. And Borg is the shit.

Kopia or Restic. Both do incremental, deduplicated backups and support many storage services.

Kopia provides a UI for the end user and has integrated scheduling. Restic is a powerful CLI tool that you build your backup system on, but usually one does not need more than a cron job for that. I use a set of custom systemd jobs and generators for my restic backups.

Keep in mind that backups on local, constantly connected storage are hardly backups at all. When the machine fails hard, the backups are lost together with the original data. So Timeshift alone is not really a solution. Also: test your backups.
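
For anyone who prefers systemd over cron, a restic service + timer pair can be as small as this sketch (unit names, repo path, and retention are assumptions, not my actual setup):

```
# /etc/systemd/system/restic-backup.service
[Unit]
Description=Restic backup

[Service]
Type=oneshot
Environment=RESTIC_REPOSITORY=/mnt/backup/restic-repo
Environment=RESTIC_PASSWORD_FILE=/etc/restic/password
ExecStart=/usr/bin/restic backup /home /etc --exclude-caches
ExecStart=/usr/bin/restic forget --prune --keep-daily 7 --keep-weekly 4

# /etc/systemd/system/restic-backup.timer
[Unit]
Description=Run restic backup nightly

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with `systemctl enable --now restic-backup.timer`.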

I rotate between a few computers. Everything is synced between them with syncthing and they all have automatic btrfs snapshots. So I have several physical points to roll back from.

For a worst case scenario everything is also synced offsite weekly to a pCloud share. I have a little script that mounts it with pcloudfs, encfs and then rsyncs any updates.

I don't, really. I don't have much data that is irreplaceable.

The ones that are get backed up manually to Proton Drive and my NAS (manually via SMB).

BTRFS filesystem, Snapper for taking periodic snapshots and snap-sync for saving one to an external drive every now and then.

BTRFS is what makes everything incremental.
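
For anyone unfamiliar, the Snapper side is only a couple of commands (the config name is an example); snap-sync then ships a chosen snapshot to the external drive:

```
# one-time: put /home under snapper, then snapshots are cheap to create and list
sudo snapper -c home create-config /home
sudo snapper -c home create --description "manual checkpoint"
sudo snapper -c home list
```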

by the way, syncthing is great if you need bi-directional sync.
not exactly what you're looking for (sth like Duplicacy?) but you should probably know about it as it's a great tool.

Git projects and system configs are on GitHub (see etckeeper); the rest is synced to my self-hosted Nextcloud instance using their desktop client. There I have periodic backups using Borg for both the files and the Nextcloud database.
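
For anyone who hasn't used it, etckeeper is basically just git wrapped around /etc; something like this gets it onto a remote (the URL is a placeholder):

```
# /etc under version control with etckeeper
sudo etckeeper init
sudo etckeeper commit "initial import"
cd /etc
sudo git remote add origin git@github.com:example/etc-backup.git
sudo git push -u origin master
```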

All my devices use Syncthing via Tailscale to get my data to my server.

From there, my server backs up nightly to rsync.net via BorgBackup.

I then have Zabbix monitoring my backups to make sure a daily is always uploaded.

I use an rsync + btrfs snapshot solution.

  1. Use rsync to incrementally collect all data into a btrfs subvolume
  2. Deduplicate using duperemove
  3. Create a read-only snapshot of the subvolume
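
Concretely, the three steps can be as small as this (drive paths and snapshot naming are just examples):

```
# 1. pull everything into the backup subvolume
rsync -aHAX --delete /home/ /mnt/backupdrive/data/

# 2. deduplicate within the subvolume
duperemove -rdh /mnt/backupdrive/data/

# 3. freeze the result as a read-only, dated snapshot
btrfs subvolume snapshot -r /mnt/backupdrive/data \
    "/mnt/backupdrive/snapshots/$(date +%Y-%m-%d)"
```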

I don't have a backup server, just an external drive that I only connect during backup.

Deduplication is mediocre; I am still looking for a snapshot-aware duperemove replacement.

I'm not trying to start a flame war, but I'm genuinely curious. Why do people like btrfs over zfs? Btrfs seems very much so "not ready for prime time".

btrfs is included in the Linux kernel; ZFS is not on most distros.
An external kernel module breaking on a kernel upgrade does happen occasionally, and that small chance is probably scary enough for a lot of people.

Features necessary for most btrfs use cases are all stable, plus btrfs is readily available in the Linux kernel, whereas for ZFS you need an additional kernel module. The availability advantage of btrfs is a big plus in case of disaster, i.e. no additional work is required to recover your files.

(All the above only applies if your primary OS is Linux, if you use Solaris then zfs might be better.)

I've only ever run ZFS on a Proxmox/server system, but doesn't it need a not-insignificant amount of resources to run? BTRFS is not flawless, but it does have a pretty good feature set.

Restic since 2018, both to locally hosted storage and to remote over ssh. I've "stuff I care about" and "stuff that can be relatively easily replaced" fairly well separated so my filtering rules are not too complicated. I used duplicity for many years before that and afbackup to DLT IV tapes prior to that.

I do a periodic backup with Vorta to my server. The server does a daily backup to an S3 service with Restic.

I use Syncthing on several devices to replicate data I want to keep backups of: family photos, journals, important docs, etc. Works perfectly, and I run a relay node to give back to the community, given I am on an unlimited data connection.

I use syncthing for my documents as well. My source code is in GitHub if it's important, and I can reinstall everything else if I need.

DejaDup on one computer. Another is using Syncthing, and on another I do a manual Grsync. I really should have a better plan, lol.

Timeshift for system files, and my home folder manually.

I just run my own nextcloud instance. Everything important is synced to that with the nextcloud desktop client, and the server keeps a month's worth of backups on my NAS via rsync.

I use btrbk to send btrfs snapshots to a local NAS. Consistent backups with no downtime. The only annoyance (for me at least) is that both send and receive ends must use the same SELinux policy or labels won't match.
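
A minimal btrbk.conf for that kind of setup looks roughly like this (volume, subvolume, and target paths are assumptions, not my actual config):

```
# /etc/btrbk/btrbk.conf
timestamp_format        long
snapshot_preserve_min   2d
snapshot_preserve       14d
target_preserve         20d 10w

volume /mnt/btr_pool
  snapshot_dir  btrbk_snapshots
  subvolume     home
    target send-receive  ssh://nas.local/mnt/backup/btrbk
```

A cron job or systemd timer then just calls `btrbk run`.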

At the core it has always been rsync and Cron. Sure I add a NAS and things like rclone+cryptomator to have extra copies of synchronized data (mostly documents and media files) spread around, but it's always rsync+Cron at the core.

I run ZFS on my servers and then replicate to other ZFS servers with Syncoid.

Just keep in mind that a replica is not a backup.

If you lose or corrupt a file and you don't find out for a few months, it's gone on the replicas too.

Correct! I have Sanoid take daily and monthly snapshots on the source server, which replicate to the destination. Every now and then, I run a diff between the last known-good monthly snapshot and the most recent one which has been replicated to the destination. If I am happy with what files have changed, I delete the previous known-good snapshot and the one I diff'd becomes the new known-good. That helps keep me safe from ransomware on the source. The destination pulls from the source to prevent the source from tampering with the backup. Also helps when you're running low on storage.
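
For anyone curious, the Sanoid policy side is just a small config (the dataset name is an example, not my actual pool):

```
# /etc/sanoid/sanoid.conf on the source
[tank/data]
        use_template = production

[template_production]
        hourly = 24
        daily = 30
        monthly = 12
        autosnap = yes
        autoprune = yes
```

The destination then pulls with something like `syncoid root@source:tank/data backuppool/data`, so the source never holds credentials for the backup box.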

I use Restic, called from cron, with a password file containing a long randomly generated key.

I back up with Restic to a repository on a different local hard drive (not part of my main RAID array), with --exclude-caches as well as excluding lots of files that can easily be re-generated / re-installed / re-downloaded (so my backups are focused on important data). I make sure to include all important data including /etc (and also backup the output of dpkg --get-selections as part of my backup). I auto-prune my repository to apply a policy on how far back I keep (de-duplicated) Restic snapshots.

Once the backup completes, my script runs du -s on the backup and emails me if it is unexpectedly too big (e.g. I forgot to exclude some new massive file), otherwise it uses rclone sync to sync the archive from the local disk to Backblaze B2.
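
A stripped-down sketch of that kind of script, with placeholder paths, repo location, excludes, addresses and size threshold (not my actual setup):

```
#!/bin/sh
set -e

export RESTIC_REPOSITORY=/mnt/backupdisk/restic-repo
export RESTIC_PASSWORD_FILE=/root/.restic-password

# record the package list so it gets captured in the backup
dpkg --get-selections > /etc/dpkg-selections.txt

restic backup / --exclude-caches \
    --exclude /proc --exclude /sys --exclude /dev --exclude /mnt
restic forget --prune --keep-daily 7 --keep-weekly 5 --keep-monthly 12

# warn if the repo is unexpectedly large (e.g. a forgotten exclude)
SIZE_KB=$(du -s "$RESTIC_REPOSITORY" | cut -f1)
if [ "$SIZE_KB" -gt 200000000 ]; then   # ~200 GB, arbitrary threshold
    echo "restic repo is ${SIZE_KB} KB" | mail -s "backup size warning" admin@example.com
fi

# mirror the local repository to Backblaze B2 (rclone remote name is a placeholder)
rclone sync "$RESTIC_REPOSITORY" b2remote:my-bucket/restic-repo
```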

I backup my password for B2 (in an encrypted password database) separately, along with the Restic decryption key. Restore procedure is: if the local hard drive is intact, restore with Restic from the last good snapshot on the local repository. If it is also destroyed, rclone sync the archive from Backblaze B2 to local, and then restore from that with Restic.

For Postgres databases I do something different (they aren't included in my Restic backups, except for config files): I back them up with pgbackrest to Backblaze B2, with archive_mode on and an archive_command to archive WALs to Backblaze. This allows me to do PITR recovery (back to a point in accordance with my pgbackrest retention policy).
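
The PostgreSQL side of that boils down to two settings in postgresql.conf (the stanza name "main" here is an assumption):

```
# postgresql.conf -- ship WALs to the pgbackrest repository
archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
```

PITR is then a `pgbackrest restore` with `--type=time` and a `--target` timestamp inside the retention window.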

For Docker containers, I create them with docker-compose, and keep the docker-compose.yml so I can easily re-create them. I avoid keeping state in volumes, and instead use volume mounts to a location on the host, and back up the contents for important state (or use PostgreSQL for state instead where the service supports it).
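
As an illustration (a hypothetical service, not any specific one of mine), bind mounts in a compose file look like this:

```
# docker-compose.yml -- host bind mounts instead of named volumes
services:
  app:
    image: nginx:alpine
    volumes:
      - ./config:/etc/nginx/conf.d:ro   # config lives next to the compose file
      - ./data:/usr/share/nginx/html    # state stays on the host, easy to back up
```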

I do most of my work on NFS, with ZFS backing on RAIDZ2, and send snapshots for offline backup.

Don't have a serious offsite setup yet, but it's coming.

I use rsync to an external drive, but before that I toyed a bit with pika backup.

I don't automate my backups because I physically connect my drive to perform the task.

Setup

Machine A:

  • RAIDz1 takes care of single-disk failure

  • ZFS doing regular snapshots

  • Syncthing replicates the data off-site to Machine B

Machine B:

  • RAIDz1 takes care of single-disk failure

  • ZFS doing regular snapshots

  • Syncthing receiving data from Machine A

Implications

  • Any single-disk hardware failure on machine A or B results in no data loss

  • Physical destruction of A won't affect B and the other way around

  • Any accidentally deleted or changed file can be recovered from a previous snapshot

  • Any ZFS corruption at A doesn't affect B because send/recv isn't used. The two filesystems are completely independent

  • Any malicious data destruction on A can be recovered from B even if it's replicated via snapshot at B. The reverse is also true. A malicious actor would have to have root access on both A and B in order to destroy the data and the snapshots on both machines to prevent recovery

  • Any data destruction caused by Syncthing can be recovered from snapshot at A or B
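
For reference, the ZFS pieces of a setup like this are just a couple of commands; a rough sketch (disk names, pool name, and schedule are assumptions; the cron line could equally be replaced by sanoid or zfs-auto-snapshot):

```
# create the single-parity pool
zpool create tank raidz1 /dev/sda /dev/sdb /dev/sdc

# crontab: hourly recursive snapshots, named by timestamp (zfs path may differ per distro)
0 * * * * /usr/sbin/zfs snapshot -r "tank@auto-$(date +\%Y\%m\%d-\%H\%M)"
```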

GitHub for projects, Syncthing to my NAS for some config files, and that's pretty much it; I don't care about the rest.

I use luckyBackup to mirror to an external drive, and I also use Duplicacy to back up two other separate drives at the same time. Have a read of the data hoarder wiki on backups.

I use timeshift. It really is the best. For servers I go with restic.

I use Timeshift because it was pre-installed. But I can vouch for it; it works really well, and lets you choose and tweak every single thing in a legible user interface!

When I do something really dumb, I typically just use dd to create an ISO. I should probably find something better.
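
For what it's worth, the dd one-liner for that is something like this (device and destination are examples; strictly speaking it produces a raw image rather than an ISO):

```
# raw image of the whole disk onto an external drive
sudo dd if=/dev/sda of=/mnt/external/sda-$(date +%F).img bs=4M status=progress conv=fsync
```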

I run OpenMediaVault and I back up using BorgBackup. Super easy to set up, use, and modify.

I use Pika Backup, which uses Borg under the hood. It's pretty good, with amazing documentation. The main issue I have with it is that it's really finicky and kind of a pain to set up, even if it "just works" after that.

Can you restore from it? That's the part I've always struggled with.

The way pika backup handles it, it loads the backup as a folder you can browse. I've used it a few times when hopping distros to copy and paste stuff from my home folder. Not very elegant, but it works and is very intuitive, even if I wish I could just hit a button and reset everything to the snapshot.

A separate NAS on an Atom CPU, with btrfs in RAID 10, exposed over NFS.

Periodic backup to an external drive via Deja Dup. Plus, I keep all important docs in Google Drive, and all photos are in Google Photos. So it's really only my music which isn't in the cloud, but I might try uploading it to Drive as well one day.

Most of my data is backed up to (or just stored on) a VPS in the first instance, and then I backup the VPS to a local NAS daily using rsnapshot (the NAS is just a few old hard drives attached to a Raspberry Pi until I can get something more robust). Very occasionally I'll back the NAS up to a separate drive. I also occasionally backup my laptop directly to a separate hard drive.

Not a particularly robust solution, but it gives me some peace of mind. I would like to build a better NAS that can support RAID, as I was never able to get it working with the Pi.

I use Duplicacy to encrypt and backup my data to OneDrive on a schedule. If Proton ever creates a Linux client for Drive, then I'll switch to that, but I'm not holding my breath.

Either an external hard drive or a pendrive. Just put one of those on a keychain and voila, a perfect backup solution that does not need internet access.

...it's not dumb if it (still) works. :^)

Vorta + borgbase

The yearly subscription is cheap and fits my storage needs by quite some margin. Gives me peace of mind to have an off-site back up.

I also store my documents on Google Drive.

Restic to a Synology NAS, and Synology's software for cloud backup.

Anything important I keep in my Dropbox folder, so then I have a copy on my desktop, laptop, and in the cloud.

When I turn off my desktop, I use restic to backup my Dropbox folder to a local external hard drive, and then restic runs again to back up to Wasabi which is a storage service like amazon's S3.

Same exact process when I turn off my laptop, except sometimes I don't have my laptop's external HD plugged in, so that gets skipped.

So that's three local copies, two local backups, and two remote backup storage locations. Not bad.

Changes I might make:

  • add another remote location
  • rotate local physical backup device somewhere (that seems like a lot of work)
  • move to Nextcloud or Seafile instead of Dropbox

I used Seafile for a long time, but I couldn't keep it up, so I switched to Dropbox.

Advice, thoughts welcome.

I actually move my Documents, Pictures, and other important folders into my Dropbox folder and symlink them back to their original locations.

This gives me the same Docs, Pics, etc. folders synced on every computer.
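
In case it's useful to anyone, that's just a move plus a symlink per folder (home paths assumed):

```
# run once per folder, per machine
mv ~/Documents ~/Dropbox/Documents
ln -s ~/Dropbox/Documents ~/Documents
```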

I use duplicity to a drive mounted off a Pi for local, tarsnap for remote. Both are command-line tools; tarsnap charges for their servers based on exact usage. (And thanks for the reminder; I'm due for another review of exactly what parts of which drives I'm backing up.)

ZFS send/receive and snapshots.

Does this method allow you to pick what you need to back up, or is it the entire filesystem?

It allows me to copy select datasets inside the pool.

So I can choose rpool/USERDATA/so-n-so_123xu4 for user so-n-so. I can also choose to copy some or all of rpool/ROOT/ubuntu_abcdef and its nested datasets.

I settle for backing up users and rpool/ROOT/ubuntu_abcdef, ignoring the stuff in the var datasets. This gets me my users' homes, root's home, and /opt. 'Tis all I need. I have snapshots and mirrored M.2 SSDs for handling most other problems (which I've not yet had).

The only bugger is /boot (on bpool). Kernel updates pile up in there and fill it up, even if you remove them via apt... because snapshots. So I have to be careful to clean up its snapshots.
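
A rough sketch of what that looks like on the command line, reusing the dataset name above (snapshot names and the target host/pool are placeholders):

```
# initial full send of one dataset
zfs snapshot rpool/USERDATA/so-n-so_123xu4@backup-2024-01-01
zfs send rpool/USERDATA/so-n-so_123xu4@backup-2024-01-01 | \
    ssh backuphost zfs receive -u backuppool/so-n-so

# later runs only send the delta between two snapshots
zfs send -i @backup-2024-01-01 rpool/USERDATA/so-n-so_123xu4@backup-2024-02-01 | \
    ssh backuphost zfs receive -u backuppool/so-n-so
```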

Good ol' fashioned rsync once a day to a remote ZFS server with daily ZFS snapshots (rsync.net). Very fast because it only needs to send changed/new files, and it has saved my hide several times when I needed to access deleted files or old versions of some files from the ZFS snapshots.

Get a Mac, use Time Machine. Go all in on the ecosystem: phone, watch, iPad, TV. I resisted for years, but it's so good, man, and Apple Silicon is just leaps beyond everything else.

Someone asking for a Linux backup solution may prefer to avoid the Apple 'ecosystem'.

Time Machine is not a real backup; it is unreliable. I've had corrupted Time Machine backups, and its backups are non-portable: you can only read them on an Apple machine. Apple Silicon is also not leaps beyond everything else; a 7000-series AMD chip will trade blows on performance per watt given the same power target (source: I measured it; a 60-watt power limit on a 7950X will closely match an M1 Ultra at the same 60 watts of power).

Sure their laptops are tuned better out of the box and have great battery life, but that's not because of the Apple Silicon. Apple had good battery life before, even when their laptops had the same Intel chip as any other laptop. Why? Because of software.

Like before, their new M-chips are nothing special. Apple Silicon chips are great, but so are other modern chips. Apple Silicon is not "leaps beyond everything else".

If you look past their shiny fanboy-bait chips, you realize you pay **huge** markups on RAM and storage. Apple's RAM and storage isn't anything special, but it's a lot more expensive than any other high-end RAM and storage modules, and it's not like their RAM or storage is better because, again, an AMD chip can just use regular RAM modules and an NVMe SSD and it will match the M-chip performance given the same power target. Except you can replace the RAM modules and the SSD on the AMD chipset for reasonable prices.

In the end, a MacBook is a great product and there's no other laptop that really gets close to its performance given its size. But that's it; that's where Apple's advantage ends. Past their ultra-light MacBooks, you get overpriced hardware and crazy expensive upgrades, with an OS that isn't better, more reliable or more stable than Windows 11 (source: I use macOS and Windows 11 daily). You can buy a slightly thicker laptop (and it will still be thin and light) with replaceable RAM and SSD and it will easily match the performance of the magic M1 chip with only a slight reduction in potential battery life. But guess what: if you actually USE your laptop for anything, the battery life of any laptop will quickly drop to 2-3 hours at best.

And that's just laptops. If you want actual work done, you get a desktop, and for the price of any Apple desktop you can easily get a PC to outperform it. In some cases, you can buy a PC that outperforms the Apple desktop AND buy a MacBook for on the go, and still have money left over. Except for power consumption, of course, but who cares about power consumption on a work machine? Only Apple fanboys care about that, because that's the only thing they have going for them. My time is more expensive than my power bill.