How do you back up your data?

onlinepersona@programming.dev to Selfhosted@lemmy.world – 107 points –

I've been considering paying for a European provider, mounting their service with rclone, and thus having it be transparent to most anything I host.

How do y'all back up your data?


RAID is backup, right?

It protects against drive failure. That is the threat I am most worried about, so it's fine for me.

drive failure

Perhaps unintended, but that singular is very much relevant. Unless you're doing RAID 6 or the like, a simultaneous failure of two drives still means data loss. It's also worth noting that drives of the same model and batch tend to fail after similar amounts of time.

same model and batch

This is why when you buy hard drives, you should split the order across several stores rather than buying all of them from one store. You're much more likely to get drives from different batches.

Oh, don't worry, they're a random mix of old drives I had lying around; they're most certainly not the same model, let alone batch!

(But yes, fair call if you have a big NAS. I have 2TB in my desktop.)

Two hard drives of the same size, one on site and one off site.

Where do you keep your off-site one? Like a friend or family member's house?

I keep one in a bank deposit box. It costs like $10/year, it's fireproof and climate controlled, and it's exactly the right size for a 3.5" disk. I rotate every couple of months, because it's like a 10-15 minute process to get into the vault.

So your backed-up data can be as old as a couple of months and requires manual interaction? I guess that's better than nothing, but I'm looking for something more automated. I'm not sure what my options are for cloud storage or whether they are safe from deletion. Or if having it in a closet in a friend's house is really the best option.

I have a live local backup to guard against hardware/system failure. I figure the only reason I'd have to go to the off-site backup is destruction of my home, and if that ever happens then recreating a couple of months worth of critical data will not be an undue burden.

If I had work or consulting product on my home systems, I'd probably keep a cloud backup by daily rsync, but I'm not going to spend the bandwidth to remote backup the whole system off site. It's bad enough bringing down a few tens of gigabytes - sending up several terabytes, even in the background, just isn't practical for me.
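If I ever did set that up, it's basically a one-liner in cron. A minimal sketch; the host and paths are made up:

```
# Hypothetical nightly push of just the work product, not the whole system
0 3 * * * rsync -az --delete /home/me/consulting/ me@backup.example.com:backups/consulting/
```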

At home and at the shop where I work. At work the drives are actually stored in a Faraday cage.

Either works; if you don't trust them, encryption is always an option.

Tape is the best medium for archiving data.

I really want to use tape for backups, but holy expensive. Those tape drives are thousands of dollars.

Damn, the last time I thought about this (20 years ago) I was able to buy a tape drive for a PC for like ........ I wanna say $250-300?? I forget the format, it was very very common though and tapes were dirt cheap, maybe $10-12 a pop. Worked great, if you were willing to sit around and swap tapes out as needed.

I think the problem is that normal consumers wouldn't ever buy a tape drive, so the only options still being produced are enterprise grade. The tapes are still pretty cheap, but the drives are absurd.

So tape doesn't make sense for the typical person, unless you don't have to buy the equipment and store it.

But if you're even a small company, it becomes cheaper to use tape.

Companies don't like deleting data. Ever. In fact some industries have laws that say they can't delete data.

For example, the company I work in is small, but old. Our accounting department alone requires complex automated processes to do things each day that require data to be backed up.

From the beginning of time. I shit you not. There is no compression even.

And at the drop of a hat, the IT dept needs to be able to restore a backup from any time in the past. Although this almost never happens outside of the current pay cycle, they need to have the option available.

The best way they have to facilitate this (I hate it - like I said they're old) is to simply write everything multiple times a night. And it's everything since we started using digital storage. Yes, it's overkill and makes no sense, but that's the way it is for us. And that's the way it is for a lot of companies.

So, when we're talking about that amount of data, and tape having a storage cost advantage of 4:1 over disk, it more than pays for all the overhead for enterprise level backups.

I bought an incredibly overkill tape system a few years ago and then the power supply exploded in it and I never bothered to replace it. Still, definitely worth it

Yes, tape has very steep entry costs and requires maintenance and storage.

Most of the time it doesn't make sense for a person to use it, but rather for a corporate entity that needs to back up petabytes of data multiple times a day.


Local to Synology. Synology to AWS with Synology's backup app. It costs me pennies per day.

Same, although AWS is my plan B. For plan A I have an older Synology that is a full backup target.

On site? I put enterprise drives in my NAS. Always have, and I've never had a drive fail. If one does, RAID is good until the replacement arrives.

RAID is not backup. RAID helps you against drive failure.

Backup helps you if you or some script screwed up your data, or you need to go back to last month's version of a file for whatever other reason.

AWS helps if your house burns down and you need to set up again from scratch.

Versioning is a feature completely separate from RAID or dual NAS or whatever else you do. Your example of the house burning down is exactly why I questioned the dual NAS... both NAS units will be toast.

So please, tell me again why you need two NAS units for versioning? Maybe you're doing some goofy hack; then OK. That's still silly. Just do proper versioning. If you're coding, just use git. Don't reinvent the wheel.

I'm stunned that you are unfamiliar with the versioning feature of backups. In my bubble this has been best practice since Apple came along with Time Machine, but really we tried it even before that with rsync, albeit only with limited success.

This is different from git because it takes care of all files and configurations, and it does so automatically. Furthermore, it also includes rules for when to thin out and discard old versions, because space remains an issue.

Synology's backup tool is quite similar to Time Machine, and that's what I am using the second NAS for. I used to have a USB hard drive for that task, but it crashed, and my old Synology and a few old disks were available. That's better because it also protects against a number of attacks that make all mounted paths unusable.

Git is not a backup tool. It’s a versioning tool, best used for text files.

Your condescension is matched only by your reading comprehension. I do not know what your requirements are. You said coding and alluded to versioning, so I tossed out git. Enjoy your tech debt. I hope it serves you well and supports your ego for many years.

Your condescension is matched only by your reading comprehension.

Bruh. Look into a mirror.

I keep important files on my NAS, and use Borgbackup with Borgmatic for backups. I've got a storage VPS with HostHatch that's $10/month for 10TB of space (was a special Black Friday deal a few years ago).

Make sure you don't just have one backup copy. If you discover that a file was corrupted three weeks ago, you should be able to restore the file from a three week old backup. rsync and rclone will only give you a single backup. Borg dedupes files across backups so storing months of daily backups often isn't a problem, especially if the files rarely change.

Also make sure that ransomware or an attacker can't mess up your backup. This means it should NOT be mounted as a file system on the client, and ideally the backup system has some way of allowing new backups while disallowing deleting old ones from the client side. Borg's "append only" mode is perfect for this. Even if an attacker were to get onto your client system and try to delete the backups, Borg's append-only mode just marks them as deleted until you run a compact on the server side, so you can easily recover.
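For reference, a minimal sketch of that append-only setup over SSH; the server name, paths, and key are placeholders:

```
# On the server, in ~/.ssh/authorized_keys: force this client into append-only mode
command="borg serve --append-only --restrict-to-path /srv/borg/client1",restrict ssh-ed25519 AAAA... client1

# On the client: initialize the repo once, then run dated backups
borg init --encryption=repokey ssh://backup@server.example.com/srv/borg/client1
borg create --stats ssh://backup@server.example.com/srv/borg/client1::{hostname}-{now} /home /etc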

Manually plug in a few disks every once in a while and copy the important stuff. Disks are offline for the most part.

I do an automated nightly backup via restic to Backblaze B2. Every month, I manually run a script to copy the latest backup from B2 to two local HDDs that I keep offline. Every half year, I restore the latest backup on my PC to make sure everything works in case I need it. For peace of mind, my automated backup includes a health check through healthchecks.io, so if anything goes wrong, I get a notification.

It's pretty low-maintenance and gives a high degree of resilience:

  • A ransomware attack won't affect my local HDDs, so at most I'll lose a month's worth of data.

  • A house fire or server failure won't affect B2, so at most I'll lose a day's worth of data.

 

restic has been very solid, includes encryption out of the box, and I like the simplicity of it. Easily automated with cron etc. Backblaze B2 is one of the cheapest cloud storage providers I could find, an alternative might be Wasabi if you have >1TB of data.
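The nightly job is a short script. Roughly this, with the bucket, password file, and check UUID as placeholders:

```
#!/bin/sh
# Nightly restic backup to Backblaze B2, with a healthchecks.io ping at the end
export RESTIC_REPOSITORY="b2:my-backup-bucket:/"
export RESTIC_PASSWORD_FILE="/root/.restic-pass"
export B2_ACCOUNT_ID="<key id>"
export B2_ACCOUNT_KEY="<application key>"

if restic backup /home /etc && restic forget --keep-daily 7 --keep-monthly 6 --prune; then
    curl -fsS "https://hc-ping.com/<check-uuid>" > /dev/null        # success ping
else
    curl -fsS "https://hc-ping.com/<check-uuid>/fail" > /dev/null   # failure ping
fi
```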

How much are you backing up? Admittedly Backblaze looks cheap, but at $6/TB that leaves me with $84 per month, or just over $1000 per year.

I'm seriously considering an RPi 3 with a couple of external disks in an outbuilding instead of the cloud.

Oh, I think we're talking different orders of magnitude here. I'm in the <1TB range, probably around 100GB. At that size, the cost is negligible.

Isn't Backblaze like $6 per TB? 🤔🤔🤔

So $216 a year?

$6 x 14TB = $84/month x 12 months = $1008 per year, or did I misread the prices?

Sorry, I thought you or somebody said they store 3TB. Probably I'm mistaken, sorry 🥲

Also, you know it's possible to set up backups to run when the drive connects; it's also a good idea to turn off networking beforehand 😶‍🌫️ (It's also possible to do a "timer USB hub". It's not very off-site, but a switch can turn on every X days, the machine mounts the drive and does the backup, then the USB hub turns off. Imagine putting it in a fireproof safe with a small hole for a USB cable.)

Also, I'm using ntfy.sh for notifications. And if you're using RAID, you can set it up to notify on a drive failure.

rclone to Dropbox and OpenDrive for things I care about, like photo backups and RAW backups, and an encrypted rclone volume to both for things that need to be backed up but also kept secure, such as scans of my tax returns, mortgage paperwork, etc. I maintain this script for the actual rclone automation via cron.
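In case it helps anyone replicate this: the encrypted volume is just a crypt remote layered over the cloud remote. A rough sketch, with remote and path names made up:

```
# One-time: "rclone config" to create the cloud remote (e.g. "dropbox:")
# and a crypt remote (e.g. "secure:") wrapping dropbox:encrypted

# Then the cron script is just plain syncs:
rclone sync ~/Pictures/raw dropbox:backups/raw     # non-sensitive, stored as-is
rclone sync ~/Documents/taxes secure:taxes         # encrypted client-side by the crypt remote
```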

The only type of data I care about is photos and video I’ve taken. Everything else is replaceable.

My phone -> Immich -> Backblaze B2, and some Google Drive.

Linux ISOs I can always redownload.

I sync all my files across 4 different computers in my house (rsync and Nextcloud) and then backups on OneDrive and Google Drive.

4 different computers? Wow...

The 4 different computers are my vr desktop, my laptop, my home server, and my wife’s computer 🤪

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

| Fewer Letters | More Letters |
| --- | --- |
| ESXi | VMware virtual machine hypervisor |
| Git | Popular version control system, primarily for code |
| NAS | Network-Attached Storage |
| RAID | Redundant Array of Independent Disks for mass storage |
| SSD | Solid State Drive mass storage |
| VPS | Virtual Private Server (opposed to shared hosting) |


Synology NAS where all computers get backed up to locally. Restic for Linux, Time Machine for Mac, Active Backup for Windows.

The NAS backs up most of its data (what I trust enough to put in the cloud), encrypted, to Google Drive every night; occasionally I back the NAS up to an external 8TB hard drive.

I have a cheap 2-bay Synology NAS in an offsite location that acts solely as a backup target for my main NAS, as well as a USB drive locally.

Backups run every night with duplicacy

I exclude media files (movies, TV shows, ...) from my backup routine due to the sheer amount of data accumulated over time and the fact that most of it can be re-acquired from public sources if disaster recovery is ever needed.

Device sync to Nextcloud -> rsync data & DB onto NAS -> nightly backup to rsync.net, plus quarterly offsite/offline HDD swaps.

I also copy Zoneminder recordings, configs, some server logs, and my main machine’s ~/ onto the NAS.

The offsite HDD is just a bog standard USB 4TB drive with one big LUKS2 volume on it.

It's all relatively simple. It's easy to complicate your backups to the point where you rely on Veeam checkpointing your ESXi disks and replicating incrementals to another device that puts them all back together... but it's much better to have a system that's simple and just works.
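In that same spirit of simplicity, the offsite LUKS2 drive only needs a handful of commands. A sketch, with /dev/sdX and the paths as placeholders (double-check the device name; luksFormat is destructive):

```
cryptsetup luksFormat --type luks2 /dev/sdX     # one-time format, wipes the drive
cryptsetup open /dev/sdX offsite                # unlock; prompts for the passphrase
mkfs.ext4 /dev/mapper/offsite                   # one-time filesystem creation
mount /dev/mapper/offsite /mnt/offsite
rsync -a /srv/backup/ /mnt/offsite/             # copy the latest backup over
umount /mnt/offsite && cryptsetup close offsite
```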

I have a Synology NAS that holds all my important data. Then it does nightly backups to Synology C2.

I miss back in the day. I used to be able to store all my stuff on CD-Rs; hell, before that it was floppies. File sizes have grown exponentially, and programs/apps all have huge sizes. Pictures and videos are my biggest issue, but I'd also like to back up games that I've downloaded so I don't have to download them again. I can back up old games no problem, but modern games? Many are 100+ GB now, and in time they all will be, and 200GB will be the standard, then a terabyte and more.

Anyway, until I can afford and find a 20TB SSD, I'm just using DVDs for everything but games and large programs. Quick to write, solid, tangible, etc. If I could afford a bunch of flash drives I'd probably do that instead.

If you can afford it and it's important data, I'd ofc recommend backing up to a large SSD, THEN to a cloud (or more) as a failsafe... then also using flash drives/DVDs etc. for an additional failsafe for the super important stuff.

I mean, if it's important backup all you can.

I've got priceless memories in my Google Photos library, but of course Google removed the ability to view them in my native photos app and download them easily. So instead I either have to back up and save ALL of it in Google Drive or download specific albums. Idk, I wouldn't personally recommend Google as a true backup, as you never know; personally I'd just use DVDs and flash drives for that stuff.

I've finally settled on Duplicacy. I've tried several CLI tools, messed with and love rclone, tried the other GUI backup tools, circled back to Duplicacy.

I run a weekly app data backup of my unRAID docker containers, which is stored on an external SSD attached via USB to the server. The day after that runs, Duplicacy does a backup of that folder to Backblaze B2. My Immich library is backed up nightly and also sent to B2 via Duplicacy. Currently, those are the only bits of critical data on the server. I will add more as I finalize a client backup for the Win10, Linux, and MacOS devices at home, but it will follow this trend.
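For anyone curious, the Duplicacy side is only a couple of commands once the storage exists. A rough sketch, assuming a B2 bucket; all names here are made up:

```
# One-time: initialize the directory as a Duplicacy repository backed by B2
cd /mnt/user/appdata-backup
duplicacy init appdata b2://my-duplicacy-bucket

# From cron, the day after the local app-data backup runs:
duplicacy backup -stats
```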

docker cp piped into restic, uploading to Wasabi. Works well; I recently recovered from a hard drive failure and everything just worked.

Right now just a spare hard drive on a Pi that I rsync to, but I'm looking for better options as well.

I backup my ESXi VMs and NAS file shares to local server storage using an encrypted Veeam job and have a copy job to a local NAS with iSCSI storage presented.

From there I have another host VM accessing that same iSCSI share, uploading the encrypted backup to Backblaze. Unlimited "local" storage for $70/year? Yes please! (iSCSI appears local to Backblaze. They know and have already stated they don't care.)

I'm backing up about 4TB to them currently using this method.

Mine is kind of similar. Hyper-V backed up with Veeam to a separate logical disk (same RAID array, different HDDs). Veeam backups are replicated to iDrive with rsync.

I need to readjust my replication schedule to prioritize the critical backups because my upload speed isn't fast enough to do a full replication that often.

Everything to Crashplan.

Critical data also goes to Tarsnap.

I perform a backup once a week from my main desktop to an HDD; then once a month I copy important data/files from all nodes (Proxmox, RPis, and main desktop) to two "cold" unplugged HDDs, and that's the only time I connect them. I do all of that using rsync with backup.sh and coldbackup.sh.

I use syncthing for notes across mobile/desktop/notebook, for that and other important files the backup goes to Google Drive or MEGA (besides the offline backup).

I want to try S3 Glacier since it's cheaper for cloud backup... has anyone tried it?

I want to try S3 Glacier since it's cheaper for cloud backup... has anyone tried it

tl;dr it's too expensive for what it is (cold storage), retrieval fees are painful, and you can often find hot storage for a similar price or cheaper.

The fees to restore data make it cost-prohibitive to do disaster recovery runs (where you pretend that a disaster has happened and you have to restore from backup), and we all know that if you don't test your backups, you don't actually have backups.

Restores are also slow - it takes several hours from when you request a download until you're actually able to download it, unless you pay more for an expedited restore or instant retrieval storage. This is because the data is physically stored on tapes in cold storage, and AWS staff have to physically locate the right tapes and load them.

Glacier also isn't even that cheap? It looks like it's around $4/TB/month, whereas $5/TB/month is a very common price for hot storage and plenty of providers have plans around that price point. I wouldn't pay more than that. If you need to store a lot of data, a storage VPS, Hetzner storage box, Hetzner auction server, etc would be cheaper. You can easily get hot storage for less than $3/TB/month if you look around :)

I am a simple man, and I like simple setups that are easy to maintain.

When it comes to my pictures and private data, I have them on one portable disk that I rsync over to another portable disk on a monthly basis.

When it comes to my application logs and data, I back them up to an S3-compatible bucket with s3cmd, at a frequency of my choosing, as a cron job. The S3 bucket is configured for a "write once, read many" mechanism to avoid alteration of the data. And if the cron job fails, I get a notification through ntfy.

Quite simple, and robust.
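The cron job itself is tiny. A minimal sketch, with the bucket and ntfy topic as placeholders:

```
#!/bin/sh
# Push application logs/data to the WORM-configured bucket; alert via ntfy on failure
if ! s3cmd sync --no-delete-removed /var/log/myapp/ s3://my-worm-bucket/myapp/; then
    curl -d "S3 backup failed on $(hostname)" ntfy.sh/my-backup-alerts
fi
```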

I do an s3 sync every five minutes of my important files to a versioned bucket in AWS, with S3-IA and Glacier Instant Retrieval policies depending on directory. This also doubles as my Dropbox replacement, and I use S3 Explorer to view/sync from my phone.
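If anyone wants to copy this, it's roughly two crontab lines; one way to get per-directory storage classes is to set them at sync time. A sketch where the bucket and paths are made up, and the bucket has versioning enabled:

```
*/5 * * * * aws s3 sync /home/me/documents s3://my-sync-bucket/documents --storage-class STANDARD_IA
*/5 * * * * aws s3 sync /home/me/archive   s3://my-sync-bucket/archive   --storage-class GLACIER_IR
```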

I use a combination of technologies.

I keep most of my documents in sync between all my computers with SyncThing. It’s not a true backup solution, but it protects me from a drive failing in my desktop or someone stealing my laptop.

My entire drive gets backed up locally to an external hard drive using Borg. That provides me with the ability to go back in time and backs up all of my large files such as family photos and home videos.

Important documents get cloud backup with Restic to BackBlaze B2. Unfortunately, I don’t want to pay for the storage capacity to save all of my photos and videos, so those are a little less protected than they should be, but B2 gives me the peace of mind that my documents will survive a regional disaster like flooding or fire.

I use both Borg and Restic because I started with Borg many years ago and didn’t want to lose all of my backup history, but can’t use it with B2. I used to use one of the unlimited cloud single-computer solutions like Mozy or Carbonite but have multiple computers and their software was buggy, then they increased the price significantly. When I switched to B2, I found Restic worked well with it. I think they’re both solid solutions, but the way Restic works and the commands make more sense to me.

I have a lot of photos that I take. Amazon Photos gives me unlimited storage to back them all up, but it's terrible. When Amazon Drive existed, I could grab a folder and drop it in the Photos area of Drive. My folder structure was maintained and it was easy to see what I'd already backed up or what else needed to be sent. Then Drive was discontinued, and the only way to manage my photos is through the terrible web interface. There is no folder structure, putting photos in albums is unwieldy, and I have no confidence in the system's ability to give me back my photos if I needed to recover from data loss. Uploading a bunch of photos through the web page is slow and fails more often than not, leaving me to painstakingly figure out what went through and what failed, or just upload the whole thing again, creating duplicates. Most of the time, I can't even find a photo or album I'm searching for. I hate that it exists as it does; it would fill a specific need if it didn't have such a terrible interface.

I wish I had a friend who would share a few TB of storage with me, but I'm pretty happy with my system, even though it has some gaps.

Syncthing's file versioning has got me out of many a jam

Oh yeah, good point! I have file versioning turned on, too, so if I do need to roll back a file, SyncThing probably has a good copy.

Various HDD full-data backups maintained with FreeFileSync, important-files backup on ProtonDrive. Multi-device autosync with Syncthing (phones, tablet, PCs).

I have two machines that back up to a local server using Borg. That whole server in turn backs up to Jottacloud using restic with encryption enabled.

By the way, I wouldn't use rclone for backups. Use restic or something similar that does incremental backups. Because if you do rclone and then later discover that some files were corrupted locally, then your files are gone. With incremental backups you would still be able to retrieve them.

Oh, or do you mean backing up the stuff that is on the cloud?

I do exactly the same. I do not have a lot of data I feel a need to backup. I have a nightly job that zips and then encrypts my data, then rclones it to off site storage.

Backblaze. Easy and cheap. It’s fire and forget for the most part.

Cheap second NAS that I power up every now and again; then I run a DSynchronize profile which replicates the important stuff (video), and all the stuff I could never replace I put on a USB drive and keep elsewhere.

I do a Clonezilla image on an old 3.5'' drive from time to time; most of my documents are stored in the cloud, so I'm pretty safe in terms of "up-to-dateness".

Backend storage is all ZFS. I have a big external drive plugged in via USB on my ZFS box, and that's where my daily backups go.

I have two old PCs that I run ZFS on as well. One automatically turns on every week, and ZFS backs up to it. The other PC is completely manual; I just turn it on and back up every so often, usually every 2-4 weeks.

For off-site backups, I use Syncthing, running on a server at a family member's house a few miles away.

I picked Syncthing over ZFS because I actually wanted a little more than an off-site copy: a two-way sync between our two locations, so both locations have a local copy they can edit and change.

For a long time I did 1 hot copy (e.g. on my laptop), 1 LAN/homelab copy (e.g. Syncthing on a VM), and 1 cloud copy ... less a backup scheme than a redundancy scheme, albeit with file versioning turned on on the homelab copy so I could be protected from oopsies.

I'm finally teaching myself duplicity in order to set up a backup system for a webdev business I'm working on ... it ain't bad.

rsync over ssh (my server is in the next room) which puts the backup on an internal drive. I also have an inotify watch to zap a copy from there to an external USB drive.
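The watch is a small loop around inotifywait (from inotify-tools); the paths here are placeholders:

```
#!/bin/sh
# Mirror the internal backup drive onto the external USB drive whenever it changes
inotifywait -m -r -e close_write,move,create,delete /srv/backup |
while read -r _event; do
    rsync -a --delete /srv/backup/ /mnt/usb-backup/
done
```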

I have my data backed up locally on an HDD, though I'm planning on building a server machine to hold more data with parity (not just for backups). Important data I have backed up in Google Drive and Proton Drive, both encrypted before upload. It isn't that big; I don't back up media or anything in the cloud. Oh, and I have some stuff in Mega, but I stopped adding to that years ago. I should probably delete that account, thanks for the reminder!

I still need to get it set up, but I'll have 3: One on my NAS, one on a local USB drive, and one offline backup. I'll use rsync for the job.

The main storage is a NAS that is mounted read-only most of the time and has two drives in a RAID mirror. Plus rclone to push a remote, client-side encrypted backup to Backblaze.

Nightly backups to an on-prem NAS. Then an rsync to a second off-site NAS at my folks' house.

Encrypted files sent to Google Cloud Storage (bucket) for long-term archival. Comes out pretty cheap like that.

I have a compressed copy of my server's config files on a separate drive, and every night restic makes a snapshot and stores it on a separate drive attached to a Raspberry Pi 3.

Local versioning with btrfs, an rsync copy to another machine on my home network, and rsync to a NAS at my parents' home.
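The btrfs versioning can be as small as a dated read-only snapshot from cron; a sketch with placeholder paths, assuming /data is a btrfs subvolume:

```
btrfs subvolume snapshot -r /data /data/.snapshots/$(date +%F)
```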

My work uses Google Drive for sync/backup, so that is covered by them.

Personal data is automatically synced (Syncthing) between three computers in different rooms of my home, plus some of the files are copied to my phone and tablet. I'm considering also adding an online server for further redundancy.