Krait

@Krait@discuss.tchncs.de
1 Post – 15 Comments
Joined 1 year ago

I did, yes. One of the first ideas I got.

Both drives were encrypted: the Samsung as the root drive (encrypted except for the EFI partition), and the Kioxia fully encrypted and mounted via crypttab, with a key file residing on the encrypted Samsung partition for automatic unlock. Since I have been reinstalling quite often lately, though, I couldn't be bothered to set up encryption for the second drive again, so it stays unused atm. Trim is enabled via a kernel parameter, but no longer in the fstab directly (I'm running BTRFS now, and from what I've gathered, passing the ssd option to BTRFS is enough to enable trim; verified with lsblk --discard).
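For anyone wanting to double-check the discard side, here is a minimal sketch that reads the same kind of values lsblk --discard reports (DISC-GRAN and DISC-MAX) straight from sysfs. The function names and the sysfs_root parameter are made up for illustration. Note that a non-zero value only shows that the block device accepts discards; it does not prove that the filesystem is actually issuing them, and on an encrypted setup the dm-crypt layer has to pass discards through as well.

```python
from pathlib import Path


def discard_limits(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the discard limits the kernel exposes for a block device.

    `device` is a kernel name such as "nvme0n1" (hypothetical example).
    `sysfs_root` is only a parameter so the function can be tested
    against a fake directory tree.
    """
    queue = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in ("discard_granularity", "discard_max_bytes"):
        limits[name] = int((queue / name).read_text().strip())
    return limits


def supports_discard(device: str, sysfs_root: str = "/sys/block") -> bool:
    """A device that cannot discard reports discard_max_bytes == 0."""
    return discard_limits(device, sysfs_root)["discard_max_bytes"] > 0


if __name__ == "__main__":
    import sys

    # Usage: python discard_check.py nvme0n1 nvme1n1
    for dev in sys.argv[1:]:
        print(dev, discard_limits(dev), supports_discard(dev))
```

Running it with the kernel names of both drives should print the limits for each of them.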

Me neither lol

Bingo, that's what I do

Which fee are you referring to? Never heard of that


Thanks for helping! Unfortunately, journalctl doesn't really show anything. Running journalctl -b -1 prints "Specifying boot ID or boot offset has no effect, no persistent journal was found.", which is a bit strange. It used to just show the previous boot's journal (I couldn't find anything suspicious in it though), but nothing error-related. I assume that's because the filesystem gets mounted ro right after the crash, so nothing could be written to the journal. EDIT: The only errors I have seen in dmesg were related to a Broadcom PCIe wireless card (which I have now removed for further testing): brcmfmac: brcmf_c_process_clm_blob: no clm_blob available (err=-2), device may have limited channels available. I have read that this is a common message for this type of card (Broadcom 43602), though, and that it's nothing to worry about.
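As far as I understand it, that journalctl message usually means journald is only keeping a volatile journal under /run/log/journal. With the default Storage=auto setting, logs are persisted only if the /var/log/journal directory exists. A small sketch to check this (the function name and the root parameter are made up for illustration; root lets you point it at a mounted copy of the system from a live ISO):

```python
from pathlib import Path


def has_persistent_journal(root: str = "/") -> bool:
    """Return True if <root>/var/log/journal holds at least one journal file.

    With journald's default Storage=auto, a missing /var/log/journal
    directory means logs are kept in memory only and are lost on reboot.
    """
    journal_dir = Path(root) / "var" / "log" / "journal"
    if not journal_dir.is_dir():
        return False
    # Journal files live in a per-machine-id subdirectory.
    return any(journal_dir.rglob("*.journal"))


if __name__ == "__main__":
    print("persistent journal found:", has_persistent_journal())
```

If this returns False after a reinstall, a missing directory is a more likely explanation than the crash itself, though the ro remount would still stop anything from being written at crash time.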

The fstab looks like this (redacted UUIDs for clearer formatting):

/ btrfs rw,relatime,ssd,space_cache=v2,subvolid=256,subvol=/@ 0 0
/home btrfs rw,relatime,ssd,space_cache=v2,subvolid=257,subvol=/@home 0 0
/.snapshots btrfs rw,relatime,ssd,space_cache=v2,subvolid=258,subvol=/@.snapshots 0 0
/opt btrfs rw,relatime,ssd,space_cache=v2,subvolid=259,subvol=/@opt 0 0
/root btrfs rw,relatime,ssd,space_cache=v2,subvolid=260,subvol=/@root 0 0
/srv btrfs rw,relatime,ssd,space_cache=v2,subvolid=261,subvol=/@srv 0 0
/var btrfs rw,relatime,ssd,space_cache=v2,subvolid=262,subvol=/@var 0 0
/var/lib/portables btrfs rw,relatime,ssd,space_cache=v2,subvolid=263,subvol=/@var/lib/portables 0 0
/var/lib/machines btrfs rw,relatime,ssd,space_cache=v2,subvolid=264,subvol=/@var/lib/machines 0 0
/var/lib/libvirt/images btrfs rw,relatime,ssd,space_cache=v2,subvolid=265,subvol=/@var/lib/libvirt/images 0 0
/var/spool btrfs rw,relatime,ssd,space_cache=v2,subvolid=266,subvol=/@var/spool 0 0
/var/cache/pacman/pkg btrfs rw,relatime,ssd,space_cache=v2,subvolid=267,subvol=/@var/cache/pacman/pkg 0 0
/var/log btrfs rw,relatime,ssd,space_cache=v2,subvolid=268,subvol=/@var/log 0 0
/var/tmp btrfs rw,relatime,ssd,space_cache=v2,subvolid=269,subvol=/@var/tmp 0 0
/boot vfat rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 2
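To make the mount options easier to compare, here is a rough sketch that parses entries in the redacted five-field form used above (mount point, filesystem type, options, dump, pass; a real fstab line has the device as an extra first field). The Entry type and both function names are made up for illustration. It lists the mount points of a given filesystem type that lack a given option, for example an explicit discard option.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    mountpoint: str
    fstype: str
    options: dict
    dump: int
    passno: int


def parse_redacted_entry(line: str) -> Entry:
    """Parse one entry in the redacted form: no device/UUID field."""
    mountpoint, fstype, opts, dump, passno = line.split()
    options = {}
    for opt in opts.split(","):
        # Flags such as "rw" get an empty string as their value.
        key, _, value = opt.partition("=")
        options[key] = value
    return Entry(mountpoint, fstype, options, int(dump), int(passno))


def missing_option(lines, fstype: str, option: str) -> list:
    """Mount points of the given fstype whose options lack `option`."""
    entries = (parse_redacted_entry(line) for line in lines)
    return [e.mountpoint for e in entries
            if e.fstype == fstype and option not in e.options]


if __name__ == "__main__":
    sample = [
        "/ btrfs rw,relatime,ssd,space_cache=v2,subvolid=256,subvol=/@ 0 0",
        "/boot vfat rw,relatime,fmask=0022,dmask=0022,utf8,errors=remount-ro 0 2",
    ]
    print(missing_option(sample, "btrfs", "discard"))
```

An option missing from the fstab does not necessarily mean it is inactive, since the kernel may apply defaults; the options shown in /proc/mounts are the ones actually in effect.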

Regarding firmware updates, I have tried running fwupd, but no updates are available. I also tried both Samsung's and Kioxia's update tools on Windows; both drives are running the latest firmware.

Thanks for the detailed reply, I will check the Reddit post out. My PSU should be powerful enough, though, and it is relatively recent (3-4 years old, so I assume the deterioration should not be that bad).

Thanks, will try. Although I don't think this'll be an issue, as I have more than enough memory, swap should only be used for hibernation

Danke!

Happens with both drives, I have tried each possible permutation (Samsung in slot 1 and 2, kioxia in slot 1 and 2, and even only installing one drive at a time)


Unfortunately I don't have a spare PSU, but I might try to measure the 3.3 volt rail with a multimeter (don't own an oscilloscope unfortunately) while under load and see what happens


Thanks, I'll try that. I loaded the drive using dd a couple of times, and that did bring the system down a couple of times. I was writing to the filesystem though, while the system was booted


Unfortunately I have no other system at hand at the moment that's able to accept NVMe drives :( I could try using Windows for a couple of days to see whether the issue is really Linux-related, but I am trying to avoid that lol


I did, yes, but to no avail. The dmesg output I posted is from after the drive was remounted ro, and is the best I could get. After some time, the system stops responding completely.

Boot a live ISO with the flags recommended in the kernel message and do some tests on the bare drives. That way you won’t have the filesystem and subsequently the rest of the system giving out on you while you’re debugging.

Which tests are you referring to exactly? I have read about badblocks, for example, and that it's not much use for SSDs in general: bad-block remapping happens automatically in the drive's controller, so bad blocks stay invisible to the OS. SMART values look great for both drives, about 20TBW on the Samsung drive, and a lot less on the Kioxia drive.
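One non-destructive option (an assumption on my part, not necessarily what was meant above) is a plain sequential read pass over the bare device from a live ISO, to see whether sustained reads alone make the drive drop out. A minimal sketch, where the function name and the parameters are made up for illustration; it only ever opens the path for reading:

```python
import time
from typing import Optional


def read_pass(path: str, chunk_size: int = 1 << 20,
              limit: Optional[int] = None):
    """Sequentially read `path` and return (bytes_read, seconds, error).

    `path` can be a regular file or a block device (the latter needs root).
    `limit` caps the number of bytes read; None reads until end of file.
    An OSError (e.g. an I/O error when the drive drops out) is returned
    rather than raised, so the caller can see how far the pass got.
    """
    total = 0
    error = None
    start = time.monotonic()
    try:
        with open(path, "rb", buffering=0) as f:
            while limit is None or total < limit:
                want = chunk_size if limit is None else min(chunk_size, limit - total)
                chunk = f.read(want)
                if not chunk:
                    break  # end of file or device
                total += len(chunk)
    except OSError as exc:
        error = exc
    return total, time.monotonic() - start, error


if __name__ == "__main__":
    import sys

    # Usage: python read_pass.py <path-to-file-or-device>
    done, secs, err = read_pass(sys.argv[1])
    print(f"read {done} bytes in {secs:.1f}s, error: {err}")
```

Reads can be served from the page cache if the data was touched recently, so this is only a rough load test, not a benchmark.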


This