Noob Question Thread: Ask Any Questions About Linux!

Cyclohexane@lemmy.ml (mod) to Linux@lemmy.ml – 314 points

I thought I'd make this thread for all of you out there who have questions but are afraid to ask them. This is your chance!

I'll try my best to answer any questions here, but I hope others in the community will contribute too!

How do programs that report available space, like 'lsblk', 'df', 'zfs list', etc., see hardlinks when they estimate disk space?

If I am trying to manage disk space, does the file system correctly display disk space (for example, in zfs list)? Or does it think that I have duplicate files/directories because it can't tell what is a hardlink?

Also, during move operations, zfs dataset migrations, etc... does the hardlinked file continue tracking where the original is? I know it is almost impossible at a system level to discern which is the original.

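I'm not super familiar with ZFS so I can't elaborate much on those bits, but hardlinks are just additional directory entries (names) that point to the same inode, which is the filesystem's internal record for a file. There is no real "original" and "copy": every name is equally valid, and the data is only freed once the last name is removed and no process still has the file open. The concept of a hardlink is basically a file-level concept. Commands like df work on a filesystem level: they don't know or care about the individual files/links etc. Instead, they report the usage figures that the filesystem itself provides. lsblk sits even lower and only lists block devices and their sizes. So hardlinks or not, it makes no difference to them, and the space used by a hardlinked file is counted once.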
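To make that concrete, here is a minimal Python sketch (the file names and the helper `link_info` are made up for illustration). It creates a file and a hardlink to it, shows that both names share one inode, and then asks the filesystem for its free space with `os.statvfs`, which is the kind of filesystem-level query df-style tools rely on.

```python
import os
import tempfile

def link_info(directory):
    """Create a file plus a hard link in `directory` and return both stat results."""
    original = os.path.join(directory, "original.txt")  # hypothetical names
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as f:
        f.write("hello\n")
    os.link(original, alias)  # adds a second directory entry for the same inode
    return os.stat(original), os.stat(alias)

with tempfile.TemporaryDirectory() as d:
    a, b = link_info(d)
    print("same inode:", a.st_ino == b.st_ino)
    print("link count:", a.st_nlink)
    # Filesystem-level numbers: no per-file or per-link information involved.
    vfs = os.statvfs(d)
    print("free bytes reported by the filesystem:", vfs.f_bavail * vfs.f_frsize)
```

Neither name is marked as the "real" one anywhere; the inode just carries a link count.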

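Now this is contrary to how tools like du, ncdu etc. work: they traverse the directories and add up the actual sizes of the files they find. du in particular is clever about it: if it comes across several hardlinks to the same file during a single run, it counts that file only once, even when the links sit in different directories (GNU du has -l / --count-links if you want every link counted). If you run du separately on two directories that each hold a link, though, both results will include the file. Other file-level programs may or may not take this into account, so you'll have to verify their behavior.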
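A rough sketch of that du-style deduplication in Python, for illustration only. This is not du's actual implementation: `disk_usage` is a made-up helper, it ignores the space taken by the directories themselves, and it assumes `st_blocks` is in 512-byte units, as it is on Linux.

```python
import os

def disk_usage(root):
    """Sum allocated bytes of files under `root`, counting each inode only once."""
    seen = set()
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)  # identifies a file regardless of its name
            if key in seen:
                continue  # another hard link to a file that was already counted
            seen.add(key)
            total += st.st_blocks * 512  # assumption: 512-byte units (Linux)
    return total
```

Calling `disk_usage("/some/dir")` once over a tree counts a hardlinked file once. Calling it separately on two subdirectories that each contain a link would count it in both results, because the `seen` set starts empty each time.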

As for move operations, it depends largely on whether the move is within the same filesystem or across filesystems, and the tools or commands used to perform the move.

When a file or directory is moved within the same filesystem, it generally doesn't affect hardlinks in a significant way. The inode remains the same, as do the data blocks on the disk. Only the directory entries pointing to the inode are updated. This means if you move a file that has hardlinks pointing to it within the same filesystem, all the links still point to the same inode, and hence, to the same content. The move operation does not affect the integrity or the accessibility of the hardlinks.
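Here is a small Python sketch of the same-filesystem case, with made-up file names and a hypothetical helper `move_within_fs`. It moves a hardlinked file into a subdirectory with `os.rename` and compares inode numbers before and after.

```python
import os
import tempfile

def move_within_fs(directory):
    """Move a hard-linked file inside one filesystem and report what happened."""
    src = os.path.join(directory, "report.txt")        # hypothetical names
    link = os.path.join(directory, "report-link.txt")
    dest_dir = os.path.join(directory, "archive")
    os.mkdir(dest_dir)
    with open(src, "w") as f:
        f.write("contents\n")
    os.link(src, link)
    inode_before = os.stat(src).st_ino
    dest = os.path.join(dest_dir, "report.txt")
    os.rename(src, dest)  # same filesystem: only the directory entry changes
    link_st = os.stat(link)
    return inode_before, os.stat(dest).st_ino, link_st.st_ino, link_st.st_nlink

with tempfile.TemporaryDirectory() as d:
    before, after, link_inode, nlink = move_within_fs(d)
    print("inode unchanged by the move:", before == after)
    print("link still points to it:", link_inode == after, "link count:", nlink)
```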

Moving files or directories across different filesystems (including external storage) behaves differently, because each filesystem has its own set of inodes.

  • The move operation in this scenario is effectively a copy followed by a delete. The file is copied to the target filesystem, which assigns it a new inode, and then the original file is deleted from the source filesystem.

  • If the file had other hardlinks on the original filesystem, they are not carried over to the new filesystem. Deleting the source only removes that one name; the remaining links keep pointing to the original inode, so its data stays on the source filesystem and no space is freed there. After the move, those links and the newly copied file are completely independent, and changing one does not affect the other.
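The copy-then-delete behaviour can be sketched in Python. Two separate mount points can't be assumed here, so the helper below (`simulate_cross_fs_move`, a made-up name) performs the two steps by hand inside temporary directories. As far as I know, this is also what `shutil.move` falls back to when a plain rename is not possible.

```python
import os
import shutil
import tempfile

def simulate_cross_fs_move(src_dir, dst_dir):
    """Imitate a cross-filesystem move: copy to a new inode, then delete the source name."""
    src = os.path.join(src_dir, "notes.txt")        # hypothetical names
    link = os.path.join(src_dir, "notes-link.txt")
    with open(src, "w") as f:
        f.write("contents\n")
    os.link(src, link)
    dest = os.path.join(dst_dir, "notes.txt")
    shutil.copy2(src, dest)  # step 1: data is copied into a brand-new inode
    os.unlink(src)           # step 2: only the source *name* is removed
    return os.stat(link), os.stat(dest)

with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    link_st, dest_st = simulate_cross_fs_move(a, b)
    print("old link still exists, link count:", link_st.st_nlink)
    print("moved file shares its inode:", link_st.st_ino == dest_st.st_ino)
```

The leftover link keeps the old data alive on the source side, which is why a "move" like this may not free any space there.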

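I believe hardlinks shouldn't affect zfs migrations either, since, as per my understanding, a dataset migration should preserve the inode and object ID information. Keep in mind that each ZFS dataset is a separate filesystem, so hardlinks can only exist within one dataset, never across two.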

This really clears things up for me, thanks! I guess I am not so "new" (been using linux for 8 years now), but every article I read on hardlinks just confused me. This is much more of a "layman's" explanation for me!