From reddit selfhosted: What do you wish you knew from the start
reddit.com
I saw this post today on Reddit and was curious to see if views are similar here as they are there.
- What are the best benefits of self-hosting?
- What do you wish you would have known as a beginner starting out?
- What resources do you know of to help a non-computer-scientist/engineer get started in self-hosting?
The big thing for #2 would be to separate out what you actually need vs what people keep recommending.
General guidance is useful, but there's a lot of 'You need ZFS!' and 'You should use K8s!' and 'Use X software!'
My life got immensely easier when I figured out I did not need any features ZFS brought to the table, and I did not need any of the features K8s brought to the table, and that less is absolutely more. I ended up doing MergerFS with a proper offsite backup method because, well, it's shockingly low-complexity.
And I ended up doing Docker with a bunch of compose files and bind mounts, because it's shockingly low-complexity. And it's just running on Debian, instead of some OS that has a couple of layers of additional software to make things "easier" because, again, it's low-complexity.
I can re-deploy the entire stack on new hardware in about ~10 minutes (I've tested this a few times just to make sure my backup scripts work), and there's basically zero vendor tie-in or dependencies that you'd have to get working first since it's just a pile of tarballs and packages from the distro's package manager on, well, ANY distro.
I have made that migration myself, going from a Raspberry Pi 4 to an N100-based NAS. It was 10 minutes for the software stack, as you said. This is not taking into account the media migration, which was done in the background over a few hours on WiFi (I had everything on an external hard drive at the time).
That last part is the only thing I would change about my self hosting solution. Yes, the NAS has a nice form factor, is power efficient and has so far been very optimal for my needs (no lag like rpi4), however I have seen they don’t really sell motherboard or parts to repair them. They want you to replace it with another one. Reason 2 on the same is vendor lock in. Depending on the options you select when creating the storage groups/pools (whatever they are called), you could be stuck needing to get something from the same vendor to read your data if the device stops working but the disks are salvageable. Reason 3 is they’ve had security incidents so a lot of the “features” I would not recommend using ever to avoid exposing your data to ransomware over the internet. I don’t trust their competitors either. I know how commercial software is made with the smallest amount of care for security best practices.
Yeah, I just use plain boring desktop hardware. (Oh no! I'm experiencing data corruption due to the lack of ECC!) It's cheap, it's available, it's trivial to upgrade and expand, and there are very few gotchas in there: you get pretty much exactly what it looks like you get.
Also nice is that you can have a Ship of Theseus NAS by upgrading what needs upgrading as you go along, and you aren't tied into entire platform swaps unless it makes sense - my last big rebuild was 3 years ago, but this is basically a 10 year old NAS at this point.
So did you buy ecc ram?
btrfs with its send/receive (incremental fs-level backups) is already stable enough for almost everything (it just has some issues with RAID 5/6), and is much more performant than ZFS. It is also in the Linux kernel tree, which is hugely useful. That is, of course, if ZFS-like functionality is what you're looking for.
"Already stable enough"
My only experience with btrfs was when trying out Opensuse Tumbleweed. Within a couple days my home partition was busted, next time it was another partition. No idea if the problems could be fixed as these were fairly new installations to give Opensuse a try and I couldn't be bothered to fix a system that's troubling me from the very beginning.
Between all the options that just work (TM), btrfs is the one I've learned to stay away from.
EDIT: that was four or five years ago
And I’ve been using it for 15 years, six of those 15 in RAID 5/6 with zero issues, so YMMV I guess. Sorry you experienced problems.
Honestly it's not; BTRFS has been in my 'that's neat, but it's still got a non-zero chance of deciding to light everything on fire because it's bored' list for, uh, a decade now?
The NAS build is old enough to more or less predate BTRFS being usable (closing in on a decade since I did the initial OS install, jeez) and none of the features matter for what I'm storing: if every drive in my NAS died today, I'd be very annoyed for a couple of hours during the rebuild, and would lose terabytes of linux ISOs that I can just download again, if I wanted to use Jellyfin to install them a 2nd time. (Any data I care about is pulled offsite at least once a day, so I've got pretty comprehensive backups minus the ISOs.)
I know EXT4 and mergerfs and snapraid are not cool, or have shiny features, but I've also had zero problems with them over the last decade, even between Ubuntu upgrades (16.04, 18.04, 20.04, 22.04) and hardware platform upgrades (6600k, 8700k, 10950k) and the entire replacement of all the system drives (hdd -> ssd -> nvme) and the expansion of and replacement of dead HDDs, of varying sizes (4tb drives to 8tb drives to 16tb drives to some 20tb drives).
It all just... worked, and at no point was I concerned about the filesystem not working if I replaced or upgraded or changed something, which is not something ZFS or BTRFS would have guaranteed during that same time window.
IMHO 99% of the time btrfs features are used as a band-aid for things that would be much better done otherwise. Generally by using a stable distro and a decent backup solution (like Debian + Borg). And you get to use a truly stable, proven, boring fs like ext4 or xfs.
Stable yes, but no protection from bitrot, and the journal of ext4 is the band aid, instead of a cow fs like zfs or btrfs.
You can protect important data with backups, which you should do anyway, and in practice I feel like the added complexity of BTRFS and ZFS is not worth the COW.
BTRFS is cool but they tried to cram way too much too fast into it and it added a ton of complexity and it's still not 100% done after all these years. A COW mode for ext4 would have been adopted much faster.
Can you elaborate on how your backup script re-deploys on new hardware? Sounds very nice to have.
It's a really simple script.
Everything is deployed with a docker compose, and all the docker volume data are bind mounts and, for example, a Jellyfin install would have everything in /stacks/jellyfin.
The backup script makes a tarball of each service individually (and stops the stack if there's anything in there doing database things or anything else that might end up being inconsistent by just archiving the filesystem), and uploads them to an S3 storage provider AND burns them to a Blu-ray.
The recovery script does the opposite: it downloads and unarchives the data.
As long as you're on Linux and have Docker, it should just magically work.
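For anyone wanting to try the same idea, here is a minimal Python sketch of the tarball-per-service approach described above. It is not the poster's actual script: the function names are made up for illustration, and stopping the stack before archiving, the S3 upload, and the disc burn are all left out. It only assumes each service lives in its own directory under one root, such as /stacks/jellyfin.

```python
import tarfile
from pathlib import Path


def backup_stacks(stacks_root: Path, backup_dir: Path) -> list[Path]:
    """Write one gzip-compressed tarball per service directory under stacks_root."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for service_dir in sorted(p for p in stacks_root.iterdir() if p.is_dir()):
        archive = backup_dir / f"{service_dir.name}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            # arcname keeps paths relative to the root, e.g. "jellyfin/..."
            tar.add(service_dir, arcname=service_dir.name)
        archives.append(archive)
    return archives


def restore_stacks(backup_dir: Path, stacks_root: Path) -> None:
    """Unpack every tarball found in backup_dir back under stacks_root."""
    stacks_root.mkdir(parents=True, exist_ok=True)
    for archive in sorted(backup_dir.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(stacks_root)
```

Restoring into a scratch directory and comparing it with the live data is a cheap way to check that the archives contain what you expect before you actually need them.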
I see! Thanks, will try to back up my docker compose services this way.
If you write the script yourself, just make sure you test it a couple of times, and preferably with different datasets from different runs.
I found some edge-case stuff that would have prevented a restore even after I had tested it successfully a couple of times (some permission issues due to changes in containers and whatnot were resulting in less than the expected data being archived and restored).
To piggy back on your “You don’t need k8s or high availability”,
If you want to optimize your setup in a way that’s actually beneficial on the small, self hosted scale, then what you should aim for is reproducibility. Docker compose, Ansible, NixOS, whatever your pleasure. The ability to quickly take your entire environment from one box and move it to another, either because you’re switching cloud providers or got a nicer hardware box from a garage sale.
When Linode was acquired by Akamai and subsequently renamed, I moved all my cloud containers to Vultr by rsyncing the folder structure to the new VM over SSH, then running the compose file on the new server. The entire migration short of changing DNS records took like 5 minutes of hands-on time.
Ansible is so simple yet so elegant.
I've been in love with the concept of ansible since I discovered it almost a decade ago, but I still hate how verbose it is, and how cumbersome the yaml based DSL is. You can have a role that basically does the job of 3 lines of bash and it'll need 3 yaml files in 4 directories.
About 3 years ago I wrote a big ansible playbook that would fully configure my home server, desktop and laptop from a minimal arch install. Then I used said playbook for my laptop and server.
I just got a new laptop and went to look at the playbook but realised it probably needs to be updated in a few places. I got feelings of dread thinking about reading all that yaml and updating it.
So instead I'm just gonna rewrite everything in simple python with a few helper functions. The few roles I rewrote are already so much cleaner and shorter. Should be way faster and more user friendly and maintainable.
I'll keep ansible for actual deployments.
I have a k3s cluster for fun and I can admit that k8s is way too complicated.
I don't want to dig hours through documentation to find what I'm looking for. The docs sometimes feel like they were written for software devs and you should figure part of the solution yourself.
I have an ExternalName service that keeps fucking up my cluster every time it restarts, bringing down my ingresses, because for some reason it doesn't work and I have no idea where to look to figure out why it doesn't work - I just end up killing the service and reapplying the yaml file and it works.
I had to diagnose why my SSL certificates would get stuck in "issuing" in cert-manager, had to dig through 4 or 5 different resources until I got to an actual, descriptive error message telling me that I configured my ClusterIssuer wrongly.
I wanted a k3s cluster to learn but every time I have issues with it I realize it's a terrible idea.
I wish I had podman + compose but it does seem like a docker-compose is more complicated. Also, I wish I could do ansible but I have no idea where to start (nor how it works).
EDIT: oh yeah I also lost IPv6 support because k3s by default doesn't enable v6 and I was planning on using Hetzner CCM to have a 2 node cluster until I realized Hetzner Networks don't support v6.
I just moved everything from vultr to self host because of their latest changes.
EDIT: As I suspected, the changes that u/mesamunefire is referencing are the ones that were taken out of context a while back and incorrectly assumed to apply to users' VPSes and the data on them, which is not the case. Those terms only apply to information posted publicly to their website, like the community forums.
What changes would those be
Can't speak for OP, but I bailed on Vultr because of how they handled the arbitration agreement change. Basically, I couldn't access my containers without accepting the new TOS, so I "hacked" the website with Inspect Element so I could access support to close my account. For me, the arbitration change wasn't the issue (my current host has similar policies), but being forced to accept a new TOS to use my account. I had no option to disagree or "remind me later," I literally only had an "accept" button. I refuse to use any service that treats me like that.
I'm now with Hetzner, so we'll see if they pull that nonsense. I only use the VPS to get around my ISP's CGNAT (WireGuard VPN w/ HAProxy at the edge to route domains), so if they pull the same nonsense, I'll copy my config to another VPS.
How is Hetzner?
Seems to work fine and meets my needs. I'm in the US though, and options are pretty limited. So far the only snag I've had was having to upload my ID, I guess their signup process is a little strict.
Nice. I'm all about new providers
https://old.reddit.com/r/webdev/comments/1boz5ne/vultr_new_tos_claims_all_commercial_rights_to/ " You hereby grant to Vultr a non-exclusive, perpetual, irrevocable, royalty-free, fully paid-up, worldwide license (including the right to sublicense through multiple tiers) to use, reproduce, process, adapt, publicly perform, publicly display, modify, prepare derivative works, publish, transmit and distribute each of your User Content, or any portion thereof, in any form, medium or distribution method now known or hereafter existing, known or developed, and otherwise use and commercialize the User Content in any way that Vultr deems appropriate, without any further consent, notice and/or compensation to you or to any third parties, for purposes of providing the Services to you."
And you could not opt out. You had to click agree in order to login. That's the biggest one.
It was later removed after the fact but there were other changes that sucked.
I had customer data as well as some personal stuff on a couple of servers. It was low hanging fruit so I just started self hosting. It's silly how much rights they suddenly wanted. Not worth the hassle, they just provide basic boxes to begin with.
They also would not let you log in without accepting those new rights, nor were you able to opt out. So I just threw my infra on some local systems, deleted everything and then had to say yes to their TOS. Again, silly, and a great way to lose business.
That only applies to posts on their forums. Not the content on your VPS
Nope. It's the content.
Incorrect. It applies only to the forums. It does not apply in any way, shape, or form to your VPS or the content on it. It’s one thing to be mistaken, but let’s not spread misinformation on purpose.
It appears after the controversy they removed the parts https://arstechnica.com/tech-policy/2024/03/after-overreaching-tos-angers-users-cloud-provider-vultr-backs-off/
But when I read the tos, it was pretty clear it was not limited.
You also had to agree without an opt out which was scammy. There are better providers out there.
I don’t think you read the TOS. I think you read the out of context snippet and assumed that it applied to your VPS. They removed that bit because it was confusing, not because it was not limited.
Being forced to agree to a TOS change without an opt out is scummy, but that’s not limited to Vultr. Companies are not out there with multiple versions of TOS based on what people agree to or not. At that point you’re better off not using a VPS.
Welp I'm an anonymous person on the Internet so you can believe what you want. I could say that my job is literally mass deploying servers (devops) but if you don't believe me that I said that I read it then I'm not sure what we can agree on.
Let's just stop while we are both ahead. It's a Thursday, good day for coffee yeah? Hope you have a good day.
I had a similar experience with NixOS-anywhere and a VPS issue. Reset the OS, setup SSH key access and ran NixOS-anywhere and within like 15 minutes was back up and running.
Not needing Kubernetes is a broad statement. It allows for better management of storage and literally gives you a configurable reverse-proxy configured with YAML if you know what you're doing.
Yes, but you don't need Kubernetes from the start.
Well I guess podman works fine for the first few months. Interestingly I still use Buildah heavily for building my custom images
I find a lot of stuff is using docker compose, which works with Podman, but using straight docker is easier, especially if it's nothing web-facing
Funnily enough Docker compose has never worked for me on Podman. There always seems to be something that is incompatible (also due to me running on Debian). However, I feel like it should become a standard amongst homelabbers and professionals to use Kubernetes manifests going forward, since it is the most portable.
Heavy disagree on the storage statement from what I've used and seen but it works for lots of people so not going to detract. NFS is always a pain but longhorn seems to have advantages
NFS is a pain, no question about it. I used to use longhorn but these days since I'm doing a single node k3s I'm just doing hostpath. It's that PVCs make intuitive sense to me, but I guess podman will likely work just fine for such cases other than canary deployments and OOTB service-meshes
I wish I knew not to trust closed source self-hosted applications, such as Plex. Would have saved a lot of time and money.
Can you elaborate?
Plex is a great example here. I've been a Hetzner customer for many, many years, and bought a lifetime license to Plex, only to receive a notification from Plex a few months later that I am no longer allowed to self-host Plex for myself (and only myself) at Hetzner and that they will block all access to my self-hosted Plex instance. I tried to ask for leniency or a refund, but that was wasted effort as well.
In short, I was caught in the crossfire when a for-profit company tried to please Hollywood by attempting to reduce piracy, so they could get new VC funding.
...
I am now a happy Jellyfin user and warmly recommend all Plex users to try it, the Jellyfin community is awesome!
(Use your favourite search engine to look up "Hetzner Plex ban" for more details)
@zutto @warlaan Searching about, this was Plex banning the use of Plex on Hetzner's IP block, right? Not a decision made by Hetzner?
Yes, correct.
I apologize if someone misunderstood my reply, Plex was the bad actor here.
Are you still on Hetzner? How's their customer support in general?
Still with Hetzner yeah. Haven't had to deal with Hetzner customer support in the recent years at all, but they have been great in the past.
It is much easier to buy one "hefty" physical machine and run ProxMox with virtual machines for servers than it is to run multiple Raspberry Pis. After living that life for years, I'm a ProxMox shill now. Backups are important (read the other comments), and ProxMox makes backup/restore easy. Because eventually you will fuck a server up beyond repair, you will lose data, and you will feel terrible about it. Learn from my mistakes.
My reason for self hosting is being in control of my shit, and not the cloud provider.
I run jellyfin, soulseek, freshRSS, audiobookshelf and nextcloud. All of that on a pi 4 with an SSD attached and then accessible via wireguard. That SSD is also accessible as an NFS share.
As I had already known Linux very well before I started my own cloud, I didn't really have to learn much.
The biggest resource I could recommend is that GitHub repository where a huge amount of awesomely selfhosted solutions are linked.
Awesome Self-Hosted
Yes that one, thanks.
I'll parrot the top reply from Reddit on that one: to me, self hosting starts as a learning journey. There's no right or wrong way, if anything I intentionally do whacky weird things to test the limits of my knowledge. The mistakes and troubles are when you learn. You don't really understand the significance of good backups until you had to restore from them.
Even in production, it differs wildly. I have customers whom I set up a bare metal Ubuntu in some datacenter for cheap, they've been running on that setup for 10 years. Small mom and pop shop, they will never need a whole cluster of machines. Then at my day job we're looking at things like Kubernetes and very heavyweight stacks because we handle a lot of traffic.
Some people self-host a PiHole on a Raspberry Pi and that's all they need. Some people have entire NAS setups with smart TVs accessing their Plex/Jellyfin servers for the whole extended family. I host my own emails, which is a pain in the ass to get working reliably and clean your IP reputation.
I guess the only thing you should know is, you need some time to commit to maintaining your stuff if you don't want it to break or get breached (if exposed to the Internet), and a willingness to learn because self hosting isn't a turnkey experience. It can be a turnkey installation but when your SD card/drives fail you're still on your own to troubleshoot and fix it. You don't set up a NextCloud server to replace Google Drive with the expectation that you shove the server in a closet forever. Owning your infrastructure and data comes at a small but very important upkeep time investment.
Benefits:
- Cheap storage that I can use both locally and as a private cloud. Very convenient for ~~piracy~~ storing all my legally obtained files.
- Network-wide adblocking. Massive for mobile games/apps.
- Private VPN. Really useful for using public networks and bypassing network restrictions.
- Gives me an excuse to buy really cool, old server and networking hardware.
As for things I wish I knew... Don't use windows for servers. Just don't.
SMB sucks, try NFS.
Use docker, managing 5 or 10 different apps without containers is a nightmare.
Bold of you to assume I'm a computer scientist or engineer or that I have a degree lmao. I just hate ads, subscriptions and network restrictions, so I learned how to avoid those things. As for resources to get started... Look up TrueNAS scale. It basically does all of the work for you.
How's the network wide ad blocking work, that would solve a big issue with my kids.
You either set the DNS settings per device to the system running PiHole / AdGuard Home, or if your router allows, set the DNS there. It's ideal to set it on the router.
Any time a device makes a DNS request to a domain, it's checked against the list. If found, it's stopped. If not found, it gets sent upstream to your choice of a public DNS configured during setup. I use Cloudflare (1.1.1.1, 1.0.0.1).
For #1 I would say not to focus on learning the same kind of thing that you started at some point recently. It took me a few months to get my local setup going since I would do it after work (also similar skills) and get tired of poking around.
At some point I gave up and started doing other things that brought me joy (video games, paint night with YouTube tutorials, movies/TV). When I finally decided to get back to it, it was enjoyable again. If I have to re-do it from scratch it could be done in probably a few hours or at most some nights after work and would be enjoyable since the annoying “got ya” lessons are somewhere on memory or some searches away that could be filtered much quicker.
I've learned a number of tools I'd never used before, and refreshed my skills from when I used to be a sysadmin back in college. I can also do things other people don't loudly recommend, but fit my style (Proxmox + Puppet for VMs), which is nice. If you have the right skills, it's arbitrarily flexible.
What electricity costs in my area. $0.32/KWh at the wrong time of day. Pricier hardware could have saved me money in the long run. Bigger drives could also mean fewer, and thus less power consumption.
Google, selfhosting communities like this one, and tutorial-oriented YouTubers like NetworkChuck. Get ideas from people, learn enough to make it happen, then tweak it so you understand it. Repeat, and you'll eventually know a lot.
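On the electricity point, a quick back-of-the-envelope calculation shows why it adds up. This sketch assumes a constant draw and a flat rate all year, which overstates things if the $0.32/kWh figure only applies at peak times; the wattage in the usage note is a hypothetical number, not a measurement from this setup.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost(watts: float, price_per_kwh: float) -> float:
    """Yearly cost of a device drawing `watts` continuously at a flat rate."""
    kwh_per_year = watts / 1000 * HOURS_PER_YEAR
    return kwh_per_year * price_per_kwh
```

For example, annual_cost(50, 0.32) comes to roughly $140 a year for a box idling at 50 W, so shaving even 20 W off the idle draw is worth about $56 a year at that rate.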
I assume you have this on a UPS. What about using a smart plug to switch to UPS during the expensive part of the day, then back to mains to charge when it's cheaper? I imagine that needs a bigger UPS than one would ordinarily spec, and that cost would probably outweigh the electric bill, but never know.
That’s not really what a UPS is designed for, they’re meant to last minutes. Long enough for a clean shutdown or to start a generator.
You’d want something like a whole house battery backup instead.
Is there a story attached to no. 2?
Well, turns out that when you host a private service that allows others to share files, they might share files that they are not allowed to share. But in return your door gets kicked in in the morning and suddenly no one wants to take credit for the actual upload anymore.
Yeah.... Becoming a public-facing file host for others to use seems rather irresponsible.
If/when a user's given a means of uploading files to my server, there's no method/permissions for them to share those files with others; it's really just for them to send files to me. (Filebrowser is pretty good for that)
That and almost nothing is public access; auth or gtfo.
Podman quadlets have been a blessing. They basically let you manage containers as if they were simple services. You just plop a container unit file in /etc/containers/systemd/, daemon-reload and presto, you've got a service that other containers or services can depend on.
Is containers here used in the same context as docker? I'm not familiar with podman.
Just about but it's more experimental.
I would've wished
My experience varies wildly from yours, so please don't take this bit as gospel.
Have yet to find a container that doesn't work perfectly well in podman. The options may not be the same. Most issues I've found with running containers boil down to things that would be equally a problem in docker. A sample:
And that's it. I generally run things once from the podman command line, then use podlet to create a quadlet out of that configuration, something you can't do with docker. If you are having any trouble with running containers under podman, try the --privileged shortcut, see that it works, and then double back if you think you really need rootless.
data stays local for the most part. Every file you send to the cloud becomes property of the cloud. Yeah, you get access, but so does the hosting provider, their 3rd party resources, and typical government compliances. Hard drives are cheap and fast enough.
not quite answering this right, but I very much enjoy learning and evolving. But technology changes and sometimes implementing new software like caddy/traefik on existing setups is a PITA! I suppose if I went back in time, I would tell myself to do it the hard way and save a headache later. I wouldn't have listened to me though.
Portainer is so nice, but has quirks. It's no replacement for the command line, but wow, does it save time. The console is nerdy, but when time is on the line, find a good GUI.
For item #1, self hosted solutions like home assistant also allow using “smart” devices without the cloud in some instances. You are not at the mercy of a vendor going out of business or dropping support and your devices becoming bricks.
Not all devices are compatible, but from what I’ve learned, I would never buy another device with so called “smart” features if it is not compatible with home assistant.
For #2 and #3, it’s probably exceedingly obvious, but wish I would have truly understood ssh, remote VS Code, and enough git to put my configs on a git server.
So much easier to manage things now that I’m not trying to edit docker compose files with nano and hoping and praying I find the issue when I mess something up.
I know this is coming up on my radar, but I am not quite sure where to start. Might you have any resources on hand to point me in the right direction?
Especially once I have everything dialed in the way I want, I'd love to be able to pull from my own repo to get stuff running again/spin up a new instance
Honestly, I learned a ton from these guys: https://www.smarthomebeginner.com/
I've diverged a good bit since then of the services I've added and the specifics of how I configure things (I still use Traefik whereas I think they've shifted to Nginx), but they have a great example of a GitHub repo and what it looks like to manage a self-hosted server.
For 2.: use dns-01 challenge to generate wildcard SSL certs. Saves so much time and nerves.
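For the curious, this is roughly what a client has to publish for a dns-01 challenge, sketched in Python from the ACME spec (RFC 8555). The function names are made up, and the token and account key thumbprint are placeholders you would get from the CA and your account key in a real run; talking to the CA and updating your DNS provider are not shown.

```python
import base64
import hashlib


def b64url(data: bytes) -> str:
    """URL-safe base64 without padding, as ACME uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def dns01_record(domain: str, token: str, account_thumbprint: str) -> tuple[str, str]:
    """Return (record name, TXT value) to publish for a dns-01 challenge."""
    # A wildcard like "*.example.com" is validated on the base domain.
    base = domain[2:] if domain.startswith("*.") else domain
    # Key authorization: challenge token, a dot, then the account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return f"_acme-challenge.{base}", b64url(digest)
```

Since validation only needs a TXT record, the server never has to be reachable from the internet, which is a big part of why this works so well for internal services.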
My open-source, zero dependency JS library for requesting and generating certs with dns01: https://github.com/clshortfuse/acmejs
I only coded for name.com but it is compatible with anything really. Also can run in the browser, which could be useful in a pinch.
For me #2 would be "you have ADHD and won't be able to be medicated so just don't"
I've mentioned elsewhere my server upgrade project took longer than expected.
Just last night I threw it all into the trash because I just can't anymore
Things like changes to TOS or services can be seriously mitigated by hosting it yourself. What happens if Spotify changes the music they host or inserts ads into everything? Well, for me, nothing. On the flip side, if some of my stuff goes down, kids and wife will bark. But honestly it's mostly set it and forget it.
KISS is a thing that applies to many things in life. Anything "smart" in your home should ideally function without your "smart" features working. I.e., light switches should be dumb light switches if something breaks, etc. Also don't get caught up in using rack or enterprise gear. You can learn just as much using smaller, fatter desktops with bigger fans and air cooling as with power-hungry rack servers with 80mm fans that blow your eardrums out. My entire lab runs on old Dell workstations and Raspberry Pis.
https://www.servethehome.com/
That sex isn't love.
And love isn't sex
NixOS is awesome!
although maybe not for beginners. for beginners use docker compose and do backups however you like
Can you clarify on how NixOS is great for selfhosting? I was going to do mint.
As far as operating systems go, I would recommend Debian or Ubuntu. These are very widely used and there are many resources. And if you are brave, you can start without a desktop.
you configure your whole server in one file (including docker/podman services), installation and configurations is taken care of by the package manager, you pretty much only need to know one file to admin your system
and no extra stuff is installed only what you specify so you have a minimal resource usage.
i think this is awesome
Not as good as Ansible although they are different tools