Best approach for Docker resilience with two hosts

Sim@lemmy.nz to Selfhosted@lemmy.world – 33 points

I'm running Docker on Ubuntu Server with around 50 containers, most of them administered via Portainer. Configuration files and small databases for the container applications are stored on the local SSD; media and larger files are stored on a NAS.

NAS data and the container folders are backed up.

I have a second identical machine doing nothing. What would you recommend researching to add resilience to this setup? Top priority is quick and easy restoration should the SSD fail - everything else is relatively easy to replace.

I'll create an SSD RAID but I like the idea of a second host.


You can use docker swarm (or a better container orchestrator) to have the containers automatically fail over to the second host

Swarm will also spread the load out over both hosts, but all your data would need to be accessible by both hosts

Thanks. That means I need to move all data off the hosts on to, say, a NAS - then the NAS becomes the single point of failure. Can I operate a swarm without doing that but still duplicate everything from host 1 to host 2, so host 2 could take over relatively seamlessly (apart from local DNS and moving port forwarding to nginx on the remaining host)?

I think you can run a Ceph or GlusterFS cluster to share files between the hosts

I think 3 nodes are required for that
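The three-node figure most likely comes from majority quorum, which is how Ceph monitors and Docker Swarm managers decide whether the cluster is still allowed to make changes. A minimal sketch of the arithmetic, assuming a plain majority rule (the function names are made up for illustration):

```python
def quorum(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority still remains."""
    return nodes - quorum(nodes)


for n in (2, 3, 5):
    print(f"{n} nodes: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

With two nodes the quorum is two, so losing either host stops the cluster from agreeing on anything. Three is the smallest count that survives one failure.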

Thanks. Can I use my existing, single Docker to start a new swarm, or do I have to start from scratch?

Container orchestration is what you're looking for. Kubernetes is the most popular option, but it might be overkill; it's hard to say based on your setup. Either way, knowing how to run it is definitely useful experience.

Thanks. Could I achieve a simple 2-host solution with Kubernetes though?

Nothing about k8s is simple. But yes you can achieve that.

Take a look at Rancher for actually running a cluster.

I put my Docker containers on a mirrored ZFS pool and keep enough spare parts on hand in case of breakdowns.

So you have Docker itself on a single host (with spare parts) and all the containers on fault-tolerant storage, and the most work you'd have to do in the event of a host drive failure is to re-install the OS and Docker itself?

I have the OS (with Docker) mirrored too. So no reinstalling, just swapping disks or other parts in case of a failure. I hope. A motherboard swap is the worst downtime. I have done this, and I needed to fiddle with the network settings to get the server up again because the network interface name had changed.

Learning K8s is a lot to take on, but it will pay off as your needs expand in the long term — and if you decide to go into infra/ops at work.

It might be enough to just rsync everything to the secondary machine regularly, have the inactive machine monitor the active one, and start all the services on the inactive machine when the active one stops responding.
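The monitoring half of that idea could look roughly like the sketch below, run on the standby host. It is only an illustration under several assumptions: the rsync job is scheduled separately (for example from cron), the services are described by a Compose file at a hypothetical path, and "not responding" means a TCP port on the active host refuses connections several times in a row. All function names are invented here.

```python
import socket
import subprocess
import time


def primary_is_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the active host succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_take_over(history: list, threshold: int) -> bool:
    """True only when the last `threshold` checks all failed."""
    return len(history) >= threshold and not any(history[-threshold:])


def start_services(compose_file: str = "/srv/docker/docker-compose.yml") -> None:
    """Start the containers locally (the path is a placeholder)."""
    subprocess.run(["docker", "compose", "-f", compose_file, "up", "-d"], check=True)


def watch(check, start, threshold: int = 3, interval: float = 10.0, sleep=time.sleep) -> None:
    """Poll `check` until it fails `threshold` times in a row, then call `start` once."""
    history = []
    while True:
        history.append(check())
        history = history[-threshold:]  # keep only the most recent results
        if should_take_over(history, threshold):
            start()
            return
        sleep(interval)
```

On the standby this might be started as `watch(lambda: primary_is_up("192.168.1.10", 22), start_services)`, where the address and port are placeholders. Requiring several consecutive failures avoids taking over on a single dropped packet, but it does not protect against the case where only the link between the two hosts fails and both end up running the same services.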

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters   More Letters
DNS             Domain Name Service/System
HTTP            Hypertext Transfer Protocol, the Web
NAS             Network-Attached Storage
k8s             Kubernetes container management package
nginx           Popular HTTP server

4 acronyms in this thread; the most compressed thread commented on today has 8 acronyms.

[Thread #162 for this sub, first seen 24th Sep 2023, 17:15]