Data rot and how to prevent it

Data rot, or data degradation, is a problem related to storage media. Over time, storage media gradually decays and some of the data is lost or partially altered, even though the file that holds it still exists.

To understand the problem in simplified terms, think of a file as an area of the disk where the file’s information is stored. With data rot, the information is not lost all at once (as happens with severe disk corruption); instead, bits get flipped at random and very slowly, sometimes over years. On solid-state media such as flash, the reason is that data is stored as electrical charges, which can leak very slowly due to imperfect insulation. The chip itself may not be damaged and may work fine if re-programmed, but the data is no longer intact. Magnetic media such as hard disks degrade too, as bits gradually lose their magnetic orientation.

Some examples of data rot: an image that has lost some pixels, or an audio file that sounds choppy after sitting on the disk for a while. Or maybe a video file that played perfectly before but has lately started showing colored rectangles in some frames. It can take anywhere from a few months to a decade, depending on the quality of the storage hardware.

Data rot cannot be avoided through incremental backups, because once it happens you can only back up the rotten copy. RAID levels with redundancy (mirroring or parity) can protect you, but regular users rarely use RAID.

A few of the most advanced filesystems take care of this with integrity checking and self-repairing algorithms: btrfs and ReFS. Btrfs is a Linux filesystem and can be used on desktops as a general-purpose filesystem. ReFS, on the other hand, targets Windows Server platforms. An older filesystem (still in use) with such capabilities is ZFS.
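On filesystems without built-in checksumming, the integrity-checking idea can be approximated by hand. The sketch below is only an illustration, assuming GNU coreutils (sha256sum) is available; make_manifest and verify_manifest are hypothetical helper names, not part of any tool. It can only detect a changed file, not repair it, and a file you edited on purpose will also show up as changed.

```shell
#!/bin/sh
# Sketch: detect silent corruption with a checksum manifest.
# make_manifest and verify_manifest are hypothetical helper names.

# Record a SHA-256 checksum for every regular file under DIR.
# Keep the manifest outside DIR so it does not checksum itself.
make_manifest() {
    dir=$1
    manifest=$2
    ( cd "$dir" && find . -type f -exec sha256sum {} + ) > "$manifest"
}

# Re-read every file and compare it with the manifest.
# Returns non-zero if any file no longer matches.
verify_manifest() {
    dir=$1
    manifest=$2
    ( cd "$dir" && sha256sum -c - ) < "$manifest"
}

# Small demonstration on a throwaway directory.
demo=$(mktemp -d)
printf 'holiday photo bytes\n' > "$demo/photo.raw"
make_manifest "$demo" "$demo.sha256"
verify_manifest "$demo" "$demo.sha256" && echo "all files intact"

# Simulate rot by overwriting the first byte in place.
printf 'X' | dd of="$demo/photo.raw" bs=1 count=1 conv=notrunc 2>/dev/null
verify_manifest "$demo" "$demo.sha256" || echo "corruption detected"
```

Run the verification before each backup: if a file you never touched fails the check, restore it from an older backup instead of overwriting the good copy with the rotten one.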

Btrfs: resize, defragment, compress

The most advertised and advanced recent filesystem from Linuxland is btrfs. It is extremely flexible and is included in many major distros, in some as the default. There are valid reasons behind that. You can do many exciting things with btrfs not only offline (like converting an ext3/4 volume to btrfs) but also while it is online (mounted). We will discuss three such features in this article.

Resize

Let’s say you have mounted a btrfs filesystem to /mnt/volume. Here are three common ways to resize it.

//Resize volume to 25GB
# btrfs filesystem resize 25G /mnt/volume
//Shrink volume by 5GB
# btrfs filesystem resize -5G /mnt/volume
//Expand volume by 5GB
# btrfs filesystem resize +5G /mnt/volume
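
Since a mistyped size can shrink a live volume, it may help to preview the command first. The following is a minimal sketch, not a btrfs feature: btrfs_resize is a hypothetical wrapper, the RUN variable is an assumption of this example, and the size check is deliberately stricter than what btrfs itself accepts. Note that btrfs filesystem resize only changes the filesystem, not the underlying partition.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper around "btrfs filesystem resize".
# Prints the command by default; set RUN=1 to really execute it (needs root).
btrfs_resize() {
    size=$1
    mnt=$2
    # Loose sanity check: "max", or an optionally signed number
    # followed by a K/M/G/T suffix (stricter than btrfs itself).
    if ! printf '%s\n' "$size" | grep -Eq '^(max|[+-]?[0-9]+[KMGTkmgt])$'; then
        echo "unsupported size argument: $size" >&2
        return 1
    fi
    if [ "${RUN:-0}" = 1 ]; then
        btrfs filesystem resize "$size" "$mnt"
    else
        echo "btrfs filesystem resize $size $mnt"
    fi
}

# Preview the three resize commands from the article.
btrfs_resize 25G /mnt/volume
btrfs_resize -5G /mnt/volume
btrfs_resize +5G /mnt/volume
```
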

Defragment

You can defragment a single file on a btrfs filesystem! Let’s see.

# btrfs filesystem defragment /

Guess what? The above command just defragments the metadata of the / directory of the volume and not the whole volume. To defragment the volume, use

//defragment all file data recursively, verbose
# btrfs filesystem defragment -r -v /
//defragment all file metadata
# find / -xdev -type d -print -exec btrfs filesystem defragment '{}' \;

Use the autodefrag option in /etc/fstab to defragment the volume automatically.

Compress

Use the following options with mount or add in /etc/fstab:

//use LZO compression (faster)
compress=lzo
//use zlib compression (better compression)
compress=zlib

Only the files created or modified post mount will be compressed the above way. To re-compress everything on the volume:

# btrfs filesystem defragment -r -v -clzo /
OR
# btrfs filesystem defragment -r -v -czlib /

You can also compress a single file by omitting the -r option.
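
To make the autodefrag and compress options permanent, they go into the options field of the volume’s /etc/fstab line. The sketch below just prints such a line so you can review it before adding it by hand. btrfs_fstab_line is a hypothetical helper, the UUID is a placeholder, and the trailing "0 0" (no dump, no boot-time fsck) is an assumption of this example.

```shell
#!/bin/sh
# Hypothetical helper: print an /etc/fstab line for a btrfs volume.
# Arguments: filesystem UUID, mount point, comma-separated mount options.
btrfs_fstab_line() {
    uuid=$1
    mnt=$2
    opts=$3
    # Fields: device, mount point, type, options, dump, fsck pass.
    printf 'UUID=%s %s btrfs %s 0 0\n' "$uuid" "$mnt" "$opts"
}

# Placeholder UUID; find the real one with "blkid" on your system.
btrfs_fstab_line 1234-PLACEHOLDER /mnt/volume defaults,autodefrag,compress=lzo
```
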

Linux filesystem hierarchy: time for an overhaul?

An important aspect that often decides the success of a piece of software is simplicity: how easily can end users understand and use it? Does it have a steep learning curve? While many Linux programs are a delight to use, the Linux filesystem probably doesn’t fall under this category because of its hierarchy and nomenclature.

Access ext4 filesystem from Windows

While many tools were available to access ext2 and ext3 filesystems from Windows, things changed with ext4. At one point, no tool could read or write ext4 filesystems from Windows as long as extents were enabled. However, if you have a dual-boot system, you may need this functionality more often than you think. Here are two Windows utilities that can access ext4 filesystems:

  • Ext for Windows: From Paragon and free for personal use. It can both read and write an ext4 filesystem. Supports all Windows flavours up to the most recent Windows 8.1.
  • Ext2read: As the name indicates, this software can only read the ext4 filesystem and copy files from it. If you are paranoid about corrupting your Linux filesystem due to any unusual external access this is the one for you.

Defragment ext4

Does the ext4 filesystem need defragmentation? The answer, in short, is yes. The ext4 filesystem is robust, but it still gets fragmented in a few scenarios (like pidgin chat logs or browser cache) or after prolonged usage, though much less than FAT or NTFS. ext4 comes with its own defrag tool, e4defrag, which is present in some distros including Ubuntu.

  • To check for fragmentation:
    sudo e4defrag -c /
  • To defragment:
    sudo e4defrag /

exFAT kernel module arrives on Linux

Until now, Microsoft exFAT support on Linux was provided through a userland, FUSE-based filesystem. Now there’s a kernel module for exFAT support. exFAT is useful where NTFS can’t be used due to its inherent structural overheads and FAT32 is unfit due to its limitations, particularly on flash drives. Arch Linux users can download and use it today! The project’s Git page: exfat-nofuse.

FUSE: Linux userland filesystems

Have you ever come across an entry like the one below in the output of mount and wondered which filesystem resides on the block device?

/dev/sda1 on /mnt/sda1 type fuseblk (rw,relatime,user_id=0,group_id=0,allow_other,blksize=4096)

It’s impossible to tell from the above entry alone, because the actual filesystem driver is implemented in userspace on top of the FUSE library and the kernel only reports the generic fuseblk type. A well-known example of a filesystem using FUSE is ntfs-3g, which comes with nearly all Linux distros nowadays to support NTFS. FUSE can be considered an additional driver layer that communicates with the kernel VFS and relays filesystem operations between the kernel and the userspace filesystem (written using the FUSE library). Here’s a simple example of a filesystem written using FUSE. As you can see, the procedure is similar to writing a regular kernel module, but the key structure containing the FUSE function pointers is fuse_operations.
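
To find out what really sits on such a device, you can ask the device itself instead of the mount table. The sketch below assumes input in the "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)" format quoted above and mount points without spaces; fuseblk_devices is a hypothetical helper name. The blkid call in the closing comment uses its standard -o and -s options and usually needs root.

```shell
#!/bin/sh
# List the block devices that mount-style output reports as fuseblk.
# Reads lines of the form: DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)
fuseblk_devices() {
    awk '$2 == "on" && $4 == "type" && $5 == "fuseblk" { print $1 }'
}

# Example using the entry quoted in the article.
printf '%s\n' \
    '/dev/sda1 on /mnt/sda1 type fuseblk (rw,relatime,user_id=0,group_id=0,allow_other,blksize=4096)' |
    fuseblk_devices

# On a real system, probe each device for its on-disk filesystem type:
#   mount | fuseblk_devices | while read -r dev; do
#       blkid -o value -s TYPE "$dev"
#   done
```

For an NTFS partition mounted through ntfs-3g, the probe reports ntfs even though the mount table only says fuseblk.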

Tune ext4 performance


Power tips to get the best performance on ext4 volumes.

Some of the options available for tuning the ext4 filesystem are very powerful and can even corrupt your filesystem if used incorrectly. Handle with care! For example, do not use data=writeback for root partitions with ext4; my system didn’t boot on Ubuntu 12.04 after I tried it. Keep a distro installed on a USB stick, like SliTaz, handy for these kinds of issues.

To turn off journaling completely:

$ sudo tune2fs -O ^has_journal /dev/sdXn

To enable faster large directories:

$ sudo tune2fs -O dir_index /dev/sdXn

To reduce percentage of reserved blocks:

//reserve 0.1% of blocks
$ sudo tune2fs -m 0.1 /dev/sdXn
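
Given how risky these options are, one way to experiment is on a throwaway image file rather than a real partition. The sketch below assumes e2fsprogs (mkfs.ext4, tune2fs) and GNU truncate are installed; make_practice_image and fs_features are hypothetical helper names. No root is needed because nothing is mounted.

```shell
#!/bin/sh
# Sketch: try tune2fs options on a throwaway image file, not a real partition.
PATH="$PATH:/sbin:/usr/sbin"   # mkfs.ext4 and tune2fs often live in sbin

# Create a small ext4 filesystem inside a regular file.
make_practice_image() {
    truncate -s 64M "$1" && mkfs.ext4 -q -F "$1"
}

# Print the feature list that "tune2fs -l" reports for a device or image.
fs_features() {
    tune2fs -l "$1" | sed -n 's/^Filesystem features: *//p'
}

if command -v mkfs.ext4 >/dev/null 2>&1 && command -v tune2fs >/dev/null 2>&1; then
    img=$(mktemp)
    make_practice_image "$img"
    echo "before: $(fs_features "$img")"
    # Same option as in the article, applied to the image file.
    tune2fs -O ^has_journal "$img" >/dev/null
    echo "after:  $(fs_features "$img")"
    rm -f "$img"
else
    echo "e2fsprogs not found; nothing to do"
fi
```

The same fs_features check works on a real device (with sudo) to confirm that a change such as dir_index actually took effect.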

For the record, my fstab entry for the NTFS partition I have is:

UUID=658822A60A30C96C /mnt/NTFS ntfs noauto,noatime,nodiratime,noacl,noinherit,uid=1000,nouser_xattr 0 2


Corrupt superblock? Recover ext2/ext3/ext4 filesystem data

My build VM got corrupted today and I was on the verge of losing my temporary changes scattered across 4 different code branches. I googled a while for a solution and I found the following links very useful:

Additionally, using TestDisk I was able to copy all my modified code even before I tried fixing the disk. TestDisk came to my rescue once before as well when my 1 TB NTFS formatted externally powered disk was not getting detected anywhere.

Another interesting article on how to handle bad blocks on a hard disk.