Jeff Bonwick, who led the team at Sun Microsystems that developed ZFS, called it "... the last word in file systems." It is indeed worthy of the praise, considering its advanced yet easily maintainable features. ZFS, a pseudo-acronym for what was earlier called the Zettabyte File System, is a 128-bit filesystem, as opposed to the presently available 64-bit filesystems like ext4 and others.
Some of its excellent features include:
• Simplified administration: ZFS has a well-planned hierarchical structure, with the uberblock (the parent of all blocks) and the disk label at the top, followed by pool-wide metadata, the filesystem's metadata, directories and files. The uberblock checksum is used as the digital signature for the entire filesystem. Besides property inheritance (which utilises the hierarchical structure), ZFS automatically manages mounting, sharing, compression, ACLs, quotas, reservations and so on, making administration easier and more effective. The filesystems in ZFS can be compared to directories in ordinary filesystems like ext3, and most administration tasks are done using just two commands: zfs and zpool.
• Pooled storage: ZFS has revolutionised filesystem implementation and management with the introduction of storage pools. Concepts like datasets (a generic term for volumes, filesystems, snapshots and clones) and pools (a large storage area available to the datasets) make filesystem handling easier for the administrator. Much as a process draws on virtual memory, a filesystem can grow its space usage as required, with no predetermined limit unless a 'quota' is set on it within the pool. Quotas can be set, changed or removed at will. A minimum 'reservation' of space can also be specified for each filesystem. One important consequence of the storage pool is that a separate volume management layer is no longer needed, which removes a lot of complexity for the administrator.
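The way quotas and reservations interact inside a shared pool can be sketched with a toy model in Python. This is only an illustration of the accounting idea, not how ZFS is implemented; the `Pool` class, its methods and the unit-less sizes are all hypothetical names invented for this example.

```python
class Pool:
    """Toy storage pool: every dataset draws from one shared capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.datasets = {}  # name -> {"used", "quota", "reservation"}

    def _held(self, skip=None):
        # Space a dataset holds is the larger of what it uses and what it reserved.
        return sum(max(d["used"], d["reservation"])
                   for name, d in self.datasets.items() if name != skip)

    def create(self, name, quota=None, reservation=0):
        if self._held() + reservation > self.capacity:
            raise ValueError("pool cannot honour this reservation")
        self.datasets[name] = {"used": 0, "quota": quota,
                               "reservation": reservation}

    def set_quota(self, name, quota):
        # Quotas can be set, changed or removed (None) at any time.
        self.datasets[name]["quota"] = quota

    def write(self, name, size):
        d = self.datasets[name]
        new_used = d["used"] + size
        if d["quota"] is not None and new_used > d["quota"]:
            raise ValueError(f"{name}: quota exceeded")
        if self._held(skip=name) + max(new_used, d["reservation"]) > self.capacity:
            raise ValueError("pool out of space")
        d["used"] = new_used


pool = Pool(capacity=100)
pool.create("home", quota=30)        # may never use more than 30
pool.create("db", reservation=40)    # always has 40 set aside
pool.write("home", 30)               # fills the quota exactly
pool.set_quota("home", None)         # remove the quota at will
pool.write("home", 30)               # now limited only by the pool itself
print(pool.datasets["home"]["used"])
```

Note that neither dataset was given a size when it was created: "home" simply grows until it hits its quota or the pool runs out, while the reservation for "db" stays untouchable by other datasets.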
• Transactional paradigm: Being a transactional filesystem, ZFS is guaranteed to be consistent, according to its developers. Data management in ZFS uses copy-on-write semantics, which ensure that live data is never overwritten in place; a reference to the old data is always maintained. A sequence of filesystem operations is either committed or ignored as a whole, thereby preventing any corruption of the filesystem due to a power failure or some other outage. This, in effect, removes the need for fsck, the traditional filesystem check and repair tool.
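The copy-on-write idea can be illustrated with a small Python sketch. It assumes a drastically simplified model in which a single `root` mapping plays the role of the top-level block; the `CowStore` and `Transaction` classes are hypothetical and exist only for this example.

```python
class CowStore:
    """Toy copy-on-write store: blocks are written once and never modified."""

    def __init__(self):
        self.blocks = {}   # block id -> bytes
        self.next_id = 0
        self.root = {}     # committed view: file name -> block id

    def _alloc(self, data):
        bid = self.next_id
        self.next_id += 1
        self.blocks[bid] = data
        return bid

    def begin(self):
        return Transaction(self)

    def read(self, name):
        return self.blocks[self.root[name]]


class Transaction:
    def __init__(self, store):
        self.store = store
        self.view = dict(store.root)  # private copy of the pointers only

    def write(self, name, data):
        # New data goes to a fresh block; the old block stays untouched.
        self.view[name] = self.store._alloc(data)

    def commit(self):
        # One pointer swap makes every write in the transaction visible.
        self.store.root = self.view


store = CowStore()
tx = store.begin()
tx.write("a.txt", b"version 1")
tx.commit()

crashed = store.begin()
crashed.write("a.txt", b"version 2")
crashed.write("b.txt", b"half-finished work")
# Simulated power failure: commit() is never called.

print(store.read("a.txt"))     # still the last committed version
print("b.txt" in store.root)   # the partial transaction left no trace
```

Because the committed view only changes in the final swap, an interrupted transaction cannot leave the store half-updated, which is the property that makes a separate repair pass unnecessary in this model.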
• Scrubbing and self-healing: Since data and even metadata are checksummed, data scrubbing (an operation that checks data integrity within a filesystem or, in other words, data validation) is performed easily within ZFS. The checksum algorithm is user-selectable, ranging from fletcher2 to SHA-256, and produces a 256-bit checksum. Besides checking for data integrity and preventing silent corruption, ZFS also provides mechanisms for self-healing, mainly through RAID-Z and mirroring.
The two RAID-Z variations, single-parity and double-parity, are in fact slight variations of RAID-5 and RAID-6, respectively. They mainly aim to eliminate the write hole, strengthening data integrity. In addition, resilvering (or resyncing) helps in replacing a corrupted or faulty device with a new one.
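How checksums and parity work together during a scrub can be sketched in Python. This is a simplified illustration under stated assumptions: one stripe, one XOR parity block in the style of single parity, SHA-256 as the checksum, and checksums kept apart from the data. The function names are hypothetical, and the sketch does not verify the parity block itself, which a real scrub would also do.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def checksum(block):
    return hashlib.sha256(block).digest()


def write_stripe(data_blocks):
    """Store each data block on its own 'device', plus one parity device."""
    devices = [bytearray(b) for b in data_blocks]
    devices.append(bytearray(xor_blocks(data_blocks)))
    sums = [checksum(b) for b in data_blocks]  # kept separately from the data
    return devices, sums


def scrub(devices, sums):
    """Verify every data block; rebuild any bad one from the others plus parity."""
    repaired = []
    for i, expected in enumerate(sums):
        if checksum(bytes(devices[i])) == expected:
            continue
        others = [bytes(d) for j, d in enumerate(devices) if j != i]
        candidate = xor_blocks(others)
        if checksum(candidate) != expected:
            raise IOError(f"block {i} is unrecoverable")
        devices[i][:] = candidate  # self-healing: write the good copy back
        repaired.append(i)
    return repaired


devices, sums = write_stripe([b"AAAA", b"BBBB", b"CCCC"])
devices[1][0] ^= 0xFF            # silent corruption on one device
print(scrub(devices, sums))      # indices of the blocks that were repaired
print(bytes(devices[1]))
```

The checksum is what tells the scrub which block is wrong; parity alone can only say that something in the stripe is inconsistent. With a single parity block, two damaged blocks in the same stripe cannot be rebuilt, and the sketch raises an error in that case.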
• Scalability: The team behind ZFS decided to go for a 128-bit filesystem, even though 64-bit filesystems like ext4 have come up only recently. Its storage limit is an enormous 256 quadrillion zettabytes (2^128 bytes, counted in binary units), which is practically impossible to reach in the near future: fully populating a 128-bit storage pool would, literally, require more energy than boiling the oceans, as Bonwick pointed out. Directories can have up to 2^48 (256 trillion in binary units, or about 281 trillion) entries. No limit exists on the number of filesystems or on the number of files that a filesystem can contain.
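The figures quoted above follow from simple powers of two, provided "zettabyte", "quadrillion" and "trillion" are read in their binary senses (2^70, 2^50 and 2^40). That reading is an assumption made here to reconcile the numbers; the short Python check below spells it out.

```python
ZETTA_BINARY = 2 ** 70        # bytes in a binary zettabyte (zebibyte)
QUADRILLION_BINARY = 2 ** 50
TRILLION_BINARY = 2 ** 40

pool_bytes = 2 ** 128                        # address space of a 128-bit pool
pool_zettabytes = pool_bytes // ZETTA_BINARY # 2**58
dir_entries = 2 ** 48                        # entries per directory

print(pool_zettabytes // QUADRILLION_BINARY) # 256 (binary quadrillions)
print(dir_entries // TRILLION_BINARY)        # 256 (binary trillions)
print(dir_entries)                           # 281474976710656 in plain decimal
```

In strict decimal terms the directory limit is therefore closer to 281 trillion than to 256 trillion, and the pool limit works out to roughly 3.4 x 10^38 bytes.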
• Snapshots and clones: A snapshot is a read-only copy of a filesystem or volume at a particular point in time. It consumes additional space only as the live data changes, and the data it references is not freed from the filesystem until explicitly requested, which gives further options for maintaining data integrity. A clone is a writable filesystem generated from a snapshot. The creation of snapshots and clones in ZFS is very simple, and is always pointed out as one of its big advantages.
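Why a snapshot costs almost nothing to create can be shown with a toy Python model in which a filesystem is just a table of pointers into a shared block store. The `Dataset` class and the `clone` function are hypothetical names for this sketch, and freeing of unreferenced blocks is left out entirely.

```python
from types import MappingProxyType


class Dataset:
    """Toy filesystem: a table of pointers into a shared block store."""

    def __init__(self, blocks, pointers=None):
        self.blocks = blocks                   # shared store: block id -> bytes
        self.pointers = dict(pointers or {})   # file name -> block id

    def write(self, name, data):
        bid = len(self.blocks)
        self.blocks[bid] = data    # always a new block, never an overwrite
        self.pointers[name] = bid

    def read(self, name):
        return self.blocks[self.pointers[name]]

    def snapshot(self):
        # Copies only the pointers and freezes them; no data is duplicated.
        return MappingProxyType(dict(self.pointers))


def clone(blocks, snap):
    """A clone is a writable dataset that starts from a snapshot's pointers."""
    return Dataset(blocks, snap)


blocks = {}
fs = Dataset(blocks)
fs.write("report.txt", b"draft")
snap = fs.snapshot()                 # no new blocks allocated here
fs.write("report.txt", b"final")     # space is consumed only now

print(blocks[snap["report.txt"]])    # the snapshot still sees the old data
print(fs.read("report.txt"))

work = clone(blocks, snap)
work.write("notes.txt", b"experiment")
print("notes.txt" in fs.pointers)    # the clone does not affect the original
```

The snapshot and the live filesystem share every block until one of them diverges, which is why space is consumed only when data changes.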
ZFS and Linux
ZFS is the standard filesystem for the Solaris/OpenSolaris OS, whose source code is published under the CDDL (Common Development and Distribution License). However, from the beginning (and hopefully forever), the Linux kernel has been licensed under the GPLv2, which prevents any other code from being linked with the GPL'd Linux kernel unless that code's licence is GPLv2-compatible. So the open source code of ZFS cannot be added or linked to the kernel code like any other filesystem, either as part of the kernel or as a kernel module. As a workaround, some solutions pointed out by the open source community are:
1. A 'court ruling' (in either the US or the EU, where ZFS is mainly used) stating that the GPL and the CDDL are compatible.
2. One of the parties (Linux or Solaris) changing the licence of its code to a mutually compatible one.
3. A GPL'd reimplementation of ZFS from scratch, which would have to be free from all the 56 patents that Sun has taken on the ZFS code.
4. A method of making ZFS usable on Linux, which is possible only through dynamic linking between the two codebases; this is allowed.