Ran into an issue with the latest Arch Linux update, how to prevent it in the future?

finestnothing@lemmy.world to Linux@lemmy.ml – 44 points

I've been using Linux for the better part of 4 years so I'm not new to it, but I've always learned stuff on an as-needed basis. Today I ran into an issue that I want to prevent in the future since I had a mini heart attack thinking about how my last backup on this system was... Never since I'm an idiot who forgot to set it up like I have on my laptop. Here are my steps:

  • Ran sudo pacman -Syu; sudo pacman -Syy like I do every few days
  • packages updated
  • restarted computer
  • can only boot into emergency mode

The journal was really long so I moved past it and went to the pacman logs: linux had updated from 6.4.3.1-1 to 6.4.3.1-2. Nothing else looked important enough to cause the system to only boot into emergency mode (gcc, vbox, some libs), so I did a quick pacman -U to the cached 6.4.3.1-1 version for both linux and linux-headers and rebooted. Hurrah, it was fixed! But I have no idea why it happened, or how to prevent it.
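For anyone retracing these steps, here is a rough sketch of the two lookups involved: which packages an update touched, and which older builds are still in the cache. It assumes the default pacman locations (/var/log/pacman.log and /var/cache/pacman/pkg) and .pkg.tar.zst packages; the function names are made up for illustration.

```shell
# Hypothetical helpers; the paths are pacman's defaults and may differ on your system.

# Print the "[ALPM] upgraded ..." log entries, optionally for a single day only.
# Usage: list_upgrades [logfile] [YYYY-MM-DD]
list_upgrades() {
    log="${1:-/var/log/pacman.log}"
    day="${2:-}"
    grep -F '[ALPM] upgraded ' "$log" | grep -F "[$day"
}

# Print the cached package files for exactly one package name.
# The [0-9] after the dash keeps "linux" from also matching "linux-headers".
# Usage: cached_versions <package> [cachedir]
cached_versions() {
    pkg="$1"
    dir="${2:-/var/cache/pacman/pkg}"
    ls "$dir"/"$pkg"-[0-9]*.pkg.tar.zst 2>/dev/null
}
```

Downgrading is then sudo pacman -U with the file paths that cached_versions linux and cached_versions linux-headers print for the older build.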

Has anyone else run into this issue when updating? Any advice for preventing future crashes or issues like this so I don't fear updating?

Edit: Thanks to everyone for your advice! I ended up following multiple bits of advice. I reinstalled Arch to get btrfs as the filesystem (didn't have anything important other than some docker-compose files and books yet) and grabbed the linux-lts kernel as a backup as well. I haven't configured snapper yet, but it's on my list of things to do.


You could install the linux-lts kernel alongside the one you have already installed to have the option to just boot into that one when a kernel update seems to be the problem.
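A rough sketch of that setup, assuming the stock Arch layout where kernel images live in /boot as vmlinuz-<package>, and assuming GRUB as the bootloader (other bootloaders need their own entry for the new kernel). The helper name is made up; it only lists which kernels are present so you can confirm the fallback is really there.

```shell
# Install the LTS kernel next to the regular one (run as root):
#   pacman -S linux-lts linux-lts-headers
# With GRUB, regenerate the menu so the new kernel shows up:
#   grub-mkconfig -o /boot/grub/grub.cfg

# Hypothetical helper: list the kernels that have an image in the boot directory.
# Usage: installed_kernels [bootdir]
installed_kernels() {
    dir="${1:-/boot}"
    for image in "$dir"/vmlinuz-*; do
        if [ -e "$image" ]; then
            basename "$image" | sed 's/^vmlinuz-//'
        fi
    done
}
```

If installed_kernels prints both linux and linux-lts, there is a second kernel to pick from the boot menu when the first one misbehaves.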

Another thing would be to look into backup solutions that execute automatically when updating your system. Personally I have my system on BTRFS subvolumes and a package called snapper to manage the snapshots (backups). Alternatively the package timeshift gets mentioned a lot when discussing backup solutions.

Otherwise you did exactly what I have done to fix almost every issue I ever had. Downgrading the likely culprit and updating again a bit later.

I second btrfs with snapper. With snapper, you can set it up so that it automatically makes snapshots at a timed interval and/or when you run your package manager. You can restore any of your saved snapshots from the snapper app or even from GRUB.

It's a bit hard to set up, but some distros come with it set up by default. You could install one if you don't want to figure btrfs setup out and are open to OS hopping. OP, you mentioned you're using Arch; Garuda Linux is an Arch-based distro that comes with btrfs and the GRUB snapper integration set up by default.

Thanks for the info! I've tried garuda and didn't like it, but I'll try snapper!

Ran sudo pacman -Syu; sudo pacman -Syy like I do every few days

-Syy forces a refresh of all package databases, even if they appear to be up to date.

In my opinion, this makes no sense, especially after you have already run pacman -Syu. Basically, you only generate additional, unnecessary traffic on the mirror you are using. pacman -Syu is normally sufficient.

The journal was really long so I moved past it

The systemd journal can easily be filtered. For example, journalctl -p err -b -1 shows all entries from the previous boot that are marked as error, critical, alert or emergency.
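As a small sketch, the same filter wrapped in a function so the boot and the priority can be changed without retyping the flags. boot_errors is a made-up name; the journalctl options are the ones from the example above plus --no-pager.

```shell
# Hypothetical wrapper around journalctl.
# Usage: boot_errors [boot-offset] [priority]
#   boot-offset: 0 = current boot, -1 = previous boot (default), -2 = the one before
#   priority:    err (default) also includes crit, alert and emerg
boot_errors() {
    journalctl -p "${2:-err}" -b "${1:--1}" --no-pager
}
```

journalctl --list-boots shows which boot offsets the journal still has.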

Has anyone else run into this issue when updating?

Not me, but other users have. Some of them use a distribution other than Arch (or one based on it). Judging by the reported problems, the current kernel is probably quite a minefield.

Any advice for preventing future crashes or issues like this so I don’t fear updating?

As other users have already recommended, you could additionally install the LTS kernel. And if you use BTRFS as a file system, create snapshots before an update (https://wiki.archlinux.org/title/snapper#Wrapping_pacman_transactions_in_snapshots).
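The wiki section linked above covers doing this automatically through pacman hooks (the snap-pac package). Done by hand, it would look roughly like the sketch below. It assumes a snapper config named root already exists and that the function is run as root; snap_upgrade is a made-up name.

```shell
# Hypothetical helper: wrap a full upgrade in a snapper pre/post snapshot pair.
# Usage: snap_upgrade [snapper-config]
snap_upgrade() {
    config="${1:-root}"
    # --print-number prints the new snapshot's number so the post snapshot can refer to it.
    pre=$(snapper -c "$config" create --type pre --print-number \
        --cleanup-algorithm number --description "pacman -Syu") || return 1
    pacman -Syu
    status=$?
    # Take the post snapshot even if the upgrade failed, so the pair stays complete.
    snapper -c "$config" create --type post --pre-number "$pre" \
        --cleanup-algorithm number --description "pacman -Syu"
    return "$status"
}
```

snapper list then shows the pair, and the pre snapshot is the state to roll back to if the reboot goes wrong.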

And it should be obvious that important data should be backed up on a regular basis.

What if I run pacman -Syyu?

I guess that's a bit better than the original command in question. But from what I understand it's still unnecessary and there is simply no need to force the refresh. A regular pacman -Syu is all you need and will refresh all databases that need it.

Sadly, I can’t help you there, but I must applaud your attitude of figuring out what happened and why.

When reading forum posts about Windows, Android or iOS stuff, it’s infuriatingly common to find a list of potential fixes without any explanations. Many people don’t know what went wrong, or why, but they do have some ideas what might fix it. Unfortunately, they just can’t tell you why a particular action is supposed to fix anything, because they don’t understand the root causes.

The number of times I've seen "well if x and y don't work, you might have to reinstall Windows" keeps me up at night

If turning it on and off again didn’t help, it’s time to reinstall. 🙈

Maybe rebuilding the ramdisk failed during the original upgrade?
One of the post-install stages after upgrading the kernel is rebuilding the initramfs - a tiny environment for bootstrapping the main OS.
If you trigger it manually with mkinitcpio --allpresets you'll notice it has fancy colorful output, with clearly visible warnings/errors.
However when invoked as part of an upgrade this coloring is removed, making errors difficult to spot.
I had this stage randomly fail a few times, resulting in an unbootable system like you described - solution was to just trigger a manual rebuild or reinstall the kernel with pacman -U.
It's possible that this is what actually fixed things when you downgraded the kernel.
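A minimal sketch of that manual rebuild which prints only the problem lines. It assumes mkinitcpio marks them as "==> ERROR:" and "==> WARNING:" and uses -P (short for --allpresets) with -n to turn colors off so the match is on plain text; rebuild_initramfs is a made-up name. Run it as root.

```shell
# Hypothetical helper: rebuild every preset and show only warnings and errors.
rebuild_initramfs() {
    out=$(mkinitcpio -P -n 2>&1)
    status=$?
    printf '%s\n' "$out" | grep -E '==> (ERROR|WARNING):'
    # Report mkinitcpio's own exit status, not grep's.
    return "$status"
}
```

No output and a zero exit status means the images were rebuilt cleanly; warnings about missing firmware for hardware you don't have are common and usually harmless.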

The only difference between those two versions of linux is that the new one was built with a newer version of gcc. That doesn't really narrow the problem down, though. As far as I'm aware, emergency mode is caused by either a kernel panic or a failure to mount a needed filesystem. I'm leaning towards a corrupted kernel, since it doesn't sound like you changed your fstab or had any problem mounting /. I would run fsck -f on your boot partition, then try to re-download and reinstall the new package.

If that doesn't work, then you can add IgnorePkg = linux linux-headers to pacman.conf so you can update without installing the broken package, until you resolve the underlying issue. Or you can install a different kernel altogether.
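For reference, a sketch of both ways to hold the kernel back: the persistent setting goes in the [options] section of /etc/pacman.conf, and pacman's --ignore flag does the same for a single run. upgrade_except is a made-up helper name. Keep in mind this is a partial upgrade, which Arch does not officially support, so treat it as a stopgap.

```shell
# Persistent variant, in /etc/pacman.conf:
#   [options]
#   IgnorePkg = linux linux-headers

# One-off variant: upgrade everything except the named packages (run as root).
# Usage: upgrade_except linux linux-headers
upgrade_except() {
    # --ignore takes a comma-separated list, so join the arguments with commas.
    ignored=$(IFS=,; echo "$*")
    pacman -Syu --ignore "$ignored"
}
```

Remove the IgnorePkg line again once a fixed kernel package is out, otherwise the kernel stays pinned indefinitely.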

As for preventing problems in the future, there's only so much you can do. Check archlinux.org before updating to see if anything requires manual intervention, and pay close attention while running pacman in case something goes wrong. You already seem to know the most important part, which is to keep a set of packages that are certain to work, so you can easily downgrade if a crash does happen.

Use timeshift, it's easy peasy to set up. It's saved my bacon a couple of times now, most recently last week, when kernel 6.4 came out and my old nvidia driver wouldn't work with it. You just jump into a tty, run timeshift --restore and choose a previous backup; it takes all of ten minutes. I have it doing a backup every day to a second HD, keeping the last five, and it doesn't take up much space. I don't think there's an easier option than that.

Advice? Sure, set up timeshift backups.
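As a rough sketch, taking a snapshot by hand right before an upgrade could look like this. It assumes timeshift is already configured with a backup location and that the function is run as root; backup_then_upgrade is a made-up name.

```shell
# Hypothetical helper: take an on-demand timeshift snapshot, then upgrade.
# The upgrade is skipped if the snapshot could not be created.
backup_then_upgrade() {
    timeshift --create --comments "before pacman -Syu" || return 1
    pacman -Syu
}
```

timeshift --list shows the snapshots that timeshift --restore will offer.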

Or if possible, switch to btrfs and install snapper + the grub integration. Will make it possible to go back to a previous state even from grub.

Not if you mess up grub tho, then you're screwed.

You don't need timeshift because arch never breaks.

Source: arch users

Arch breaks once in a while... Like anything else in my experience

But the arch users told me it never breaks, could they have lied to me?

In about two years, the update process broke twice... I had to manually add or remove packages so that the update process would complete successfully