How to stagger automated upgrades?

remram@lemmy.ml to Linux@lemmy.ml – 40 points –

I am using unattended-upgrades across multiple servers. I would like package updates to be rolled out gradually, either randomly or to a subset of test/staging machines first. Is there a way to do that for APT on Ubuntu?

An obvious option is to set some machines to update on Monday and the others to update on Wednesday, but that gives me only weekly updates...

The goal of course is to avoid a Crowdstrike-like situation on my Ubuntu machines.

Edit: For example, an updated openssh-server comes out. One fifth of the machines update that day, another fifth update the next day, and the rest update 3 days later.



I invite you to re-read the second paragraph of my post.

You're just throwing things I already listed back at me. I mentioned a staging environment, I mentioned a schedule was a (bad) option.

An obvious option is to set some machines to update on Monday and the others to update on Wednesday, but that gives me only weekly updates…

You can literally schedule them by the minute, but okay buddy.

I'll never not be stumped by people who are looking for answers shitting all over those answers.

Maybe I'm not being clear.

I want to stagger updates, giving time to make sure they work before they hit the whole fleet.

If a new SSH version comes out on Tuesday, I want it installed on 1/3 of the machines on Tuesday, another third on Wednesday, and the rest on Friday. Or similar.

Having machines update on a schedule means I get much less frequent updates, and it doesn't even guarantee that updates hit the staging environment first (what if they're released just before the prod update time?)
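To make the "thirds over several days" idea concrete, here is a minimal Python sketch of one way to assign machines to rollout groups without a central list: hash the hostname into a stable number and map it onto a plan. The plan values mirror the Tuesday/Wednesday/Friday example; the names ROLLOUT_PLAN, host_fraction and delay_days are made up for illustration, and nothing here talks to APT or unattended-upgrades.

```python
import hashlib

# Hypothetical rollout plan: (cumulative share of the fleet, delay in days).
# One third waits 0 days, the next third 1 day, the rest 3 days.
ROLLOUT_PLAN = [(1 / 3, 0), (2 / 3, 1), (1.0, 3)]


def host_fraction(hostname: str) -> float:
    """Map a hostname to a stable pseudo-random number in [0, 1)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def delay_days(hostname: str, plan=ROLLOUT_PLAN) -> int:
    """Return how many days this host waits after an update is released."""
    fraction = host_fraction(hostname)
    for cumulative_share, delay in plan:
        if fraction < cumulative_share:
            return delay
    # Fallback for plans whose last share is below 1.0.
    return plan[-1][1]
```

Because the hash depends only on the hostname, a machine always lands in the same group. With a small fleet the groups will not be exact thirds, so an explicit host-to-group list may be the better choice there.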

You could set your staging environment PCs to be checking for updates hourly and installing them daily.

You could set your other PCs to just be downloading the updates daily but only install them on certain days of the week.

That means your staging servers would be constantly updated, while your other servers only download the updates and wait until a certain day to install them.

I'm not sure you can set the timer based on a specific package being updated without some bash scripting. You would need to check which packages are getting updated on your staging servers, and then have that script update the unattended-upgrades control files on the second- and third-tier PCs in the fleet to adjust when they're supposed to install those updates.
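The gating logic such a script would need can be sketched in a few lines of Python (rather than bash, purely for readability). It assumes something else records when staging first installed each package version and something else performs the install; releasable, first_seen and tier_delay are hypothetical names, and the package data is made up.

```python
from datetime import datetime, timedelta


def releasable(candidates, first_seen, tier_delay, now):
    """Return the pending (package, version) pairs this tier may install now.

    candidates: (package, version) pairs pending on this host
    first_seen: dict of (package, version) -> datetime staging installed it
    tier_delay: timedelta this tier waits after staging
    """
    ready = []
    for key in candidates:
        seen_at = first_seen.get(key)
        # Versions staging has never installed stay held back.
        if seen_at is not None and now - seen_at >= tier_delay:
            ready.append(key)
    return ready


# Example with made-up data: staging installed openssh-server two days ago.
now = datetime(2024, 7, 26, 12, 0)
first_seen = {("openssh-server", "2.0"): now - timedelta(days=2)}
pending = [("openssh-server", "2.0"), ("curl", "8.0")]
print(releasable(pending, first_seen, timedelta(days=1), now))
```

A second-tier machine would call this with a one-day delay and a third-tier machine with a longer one, then install only what comes back.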

I can't currently find anything on prohibiting specific packages or only installing selected updates from the downloaded updates. Perhaps you could use a mix of systemd downloading the updates and a cronjob for installing them?


Further, Ubuntu/Debian is technically already doing this as well. They already have staggered rollouts in APT.

If you've ever updated via the command line and seen the message "The following packages have been kept back" or "The following upgrades have been deferred due to phasing", it's because they're purposefully withholding those updates from you, to make sure they roll out safely to everyone. That way, if a handful of users who get a phased rollout have issues, the rollout can be undone before it goes out to everyone.

I found the page about "phased upgrades" (somehow missed it searching for "staggered", "incremental", "delayed", etc). Thanks for the pointer!

Unfortunately it doesn't seem configurable on my end, and it rolls out in about 54 hours so it can take out most of my machines before I have time to react (my first machine might update ~20h into the phased rollout, the rest will break within 24h). Bummer!
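For a rough feel of that timeline, here is a back-of-the-envelope Python sketch. It assumes the phased percentage starts at 10% and grows by 10 points every 6 hours, which is what produces the 54-hour figure; those numbers are my assumption about the schedule, and the sketch ignores rollouts that get paused or pulled.

```python
def phased_percentage(hours_since_release, start=10, step=10, step_hours=6):
    """Estimate the share of machines eligible for a phased update.

    Assumed schedule: `start` percent at release, plus `step` points
    every `step_hours` hours, capped at 100.
    """
    steps = int(hours_since_release // step_hours)
    return min(100, start + step * steps)


# Print the assumed ramp at a few points in time.
for hours in (0, 20, 44, 54):
    print(hours, phased_percentage(hours))
```

Under these assumptions, 20 hours in is already the 40% step, and the remaining steps follow 6 hours apart, which leaves little time to react.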

That doesn't even have anything to do with this. Phased upgrades are about CHANNELS. As in a select number of systems get the upgrades before anyone else. This is similar to a staging environment in that it minimizes risk. You clearly do not understand what you are asking for here, and are unable to articulate it well enough for us to understand either. I suggest you ask in a different way with more information.

Minimizing risk is LITERALLY what I asked for. You clearly don't understand what I asked for.

And I in return am asking you to STFU, do some reading, and come back when you're better informed to properly ask for your FREE HELP and get answers.

These are people wasting their time on you right now. You're being a demanding little prick. WE DONT NEED TO GIVE YOU SHIIIIIIIT, BRRAAAAAHHHHH

You should be more courteous to the guy who has been responding to you, because he's giving you exactly what you're asking for, you just don't know how to ask for it properly. Just a piece of advice 🤌

That being said, since you don't know what you're afraid of exactly, I can tell you in my long history of running thousands of Linux machines, containers and VMs at scale, I've never once seen an unattended upgrade do anything that couldn't immediately be rolled back or fixed. The worst I've seen is services impacted that do not start. So why don't you just chill out a tiny bit about your Jellyfin server or whatever you're being rude about.

I find it hard to stay courteous in the presence of people like you, who reply without reading my post, call me "duder" and say I "don't understand what I am asking for".

Thankfully, I did get a great answer from someone else.

You have no idea what you're asking for, that's why everyone pointed you in one direction, only to have you bitch and complain we "didn't read the post" and whine about it.

To actually answer your question, you need some kind of job scheduling service that manages the whole operation. Whether that’s SSM or Ansible or something else. With Ansible, you can set a parallel parameter that will say that you only update 3 or so at a time until they are all done. If one of those upgrades fails, then it will abort the process. There’s a parameter to make it die if any host fails, but I don’t recall it right now.

I think I would want a bigger delay; a faulty upgrade might only break something hours later.
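Batch-at-a-time with a longer soak period between batches could look roughly like the Python sketch below. It is tool-agnostic: upgrade and healthy are hypothetical callbacks you would supply (run the upgrade over SSH, probe a service, etc.), and rolling_upgrade is a made-up name. As far as I know, the Ansible settings being described are the play-level serial keyword for batch size and any_errors_fatal for aborting, but this sketch does not use Ansible.

```python
import time


def rolling_upgrade(hosts, batch_size, soak_seconds, upgrade, healthy,
                    sleep=time.sleep):
    """Upgrade hosts in batches, waiting between batches, aborting on failure.

    upgrade(host) -> bool and healthy(host) -> bool are caller-supplied.
    Returns the list of upgraded hosts if the whole rollout succeeds.
    """
    done = []
    for start in range(0, len(hosts), batch_size):
        for host in hosts[start:start + batch_size]:
            if not upgrade(host):
                raise RuntimeError(f"upgrade failed on {host}; rollout aborted")
            done.append(host)
        # Let the batch soak before touching more machines.
        if start + batch_size < len(hosts):
            sleep(soak_seconds)
        # Re-check everything upgraded so far, since breakage can show up late.
        broken = [h for h in done if not healthy(h)]
        if broken:
            raise RuntimeError(f"unhealthy after upgrade: {broken}; rollout aborted")
    return done
```

Setting soak_seconds to a day or more gives the delay asked for here, at the cost of a slower rollout.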