FullFAT is almost out of the door now, and there’s one thing I learnt from implementing the FAT file-system: it’s not a very clean design, nor is it particularly good at dealing with large numbers of files within a directory.
Talking about file-systems, Microsoft have recently started touting their exFAT file-system, which is essentially FAT64. There’s just one problem with it: the specifications are not open, and Microsoft holds the patents and rights to it. We need a completely public domain, simple to implement, unambiguous and stable file-system for all the media devices that currently exist.
That’s why, as of today, I am announcing a new file-system. It’s designed to be computationally simple (ideal for embedded devices) and well structured, allowing fast traversal through large directories. The design is not based on FAT in any way, and is built from the ground up using ideas I learnt from writing FullFAT.
The actual file-system itself is a specification; however, I am also providing a reference implementation in the public domain. I shall also try to write Linux and Windows integration drivers, and promote the file-system as much as possible to get wide support for it.
The specification also recommends a special partitioning scheme including a small FAT12 partition to hold drivers for the most common platforms, until the FS is adopted more widely.
The following are just a few of the key points my file-system aims to achieve:
- High Performance Read and Write.
- Computationally Simple to handle.
- Large File Support (64-bit file size field – theoretical maximum file size is 8192 petabytes).
- Large Volume Support (supports all block sizes that are multiples of 512 bytes: 1TB with a 512-byte block size, 2TB with 1024, 4TB with 2048, 8TB with 4096, 16TB with 8192).
- Really Fast Directory Traversal – Special Directory structure organises objects more efficiently for reading and writing.
- Really Fast Freespace allocation – A very small and tight free-space bitmap is used to designate free space in terms of clusters.
- Varying Cluster sizes (any cluster size from the block size up to 2MB can be specified, in 512-byte multiples of course).
- BlockChains – These are linked lists of blocks within allocated clusters. They are primarily used for chains of directory data and file meta-data as well as file data maps.
- File-data maps – these are like the FAT tables in FAT, except the pointers are compressed into extents. They are variably sized cluster addresses describing the chain of clusters making up a single file. The standard size is 32 bits per cluster address; however, a 64-bit address length can be specified (allowing compatibility with future storage media). 32 bits is enough to address 2TB with a cluster size of at least 1024 bytes.
- Fragmentation is reduced through algorithms acting on the free-space bitmap.
- All data structures are 4-byte aligned, their sizes are multiples of 4, and they can never cross sector boundaries (thus helping to reduce computational complexity).
- UTF-8 Filenames – 1024 bytes are used for name storage, that is, names of up to 1024 single-byte characters, or 256 four-byte Unicode characters.
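To make the size figures in the list above easier to check, here is a small arithmetic sketch. The volume sizes quoted happen to match 2^31 addressable blocks, and the 8192-petabyte file limit matches 2^63 bytes. Both are inferences from the numbers rather than anything the design states, and the names `ADDRESSABLE_BLOCKS` and `max_volume_bytes` are made up for this illustration.

```python
# Assumption inferred from the quoted figures, not taken from the design:
# a volume can address 2**31 blocks.
ADDRESSABLE_BLOCKS = 2 ** 31  # hypothetical constant

TIB = 2 ** 40  # bytes in one (binary) terabyte
PIB = 2 ** 50  # bytes in one (binary) petabyte


def max_volume_bytes(block_size: int) -> int:
    """Largest volume reachable with ADDRESSABLE_BLOCKS blocks of block_size bytes."""
    if block_size <= 0 or block_size % 512 != 0:
        raise ValueError("block size must be a positive multiple of 512")
    return ADDRESSABLE_BLOCKS * block_size


if __name__ == "__main__":
    for size in (512, 1024, 2048, 4096, 8192):
        print(size, "byte blocks:", max_volume_bytes(size) // TIB, "TB")
    # The quoted file-size limit expressed in petabytes.
    print("2**63 bytes =", 2 ** 63 // PIB, "PB")
```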
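The free-space bitmap idea from the list above could look roughly like the following. This is only a minimal sketch, assuming one bit per cluster and a simple first-fit search for a contiguous run (one possible way of keeping fragmentation down). The class name `FreeSpaceBitmap` and its methods are hypothetical; the real layout and allocation algorithms are still being designed.

```python
class FreeSpaceBitmap:
    """One bit per cluster: 0 = free, 1 = in use (assumed encoding)."""

    def __init__(self, cluster_count: int) -> None:
        self.cluster_count = cluster_count
        self.bits = bytearray((cluster_count + 7) // 8)

    def is_used(self, cluster: int) -> bool:
        return bool(self.bits[cluster // 8] & (1 << (cluster % 8)))

    def _set(self, cluster: int, used: bool) -> None:
        mask = 1 << (cluster % 8)
        if used:
            self.bits[cluster // 8] |= mask
        else:
            self.bits[cluster // 8] &= ~mask & 0xFF

    def allocate(self, count: int) -> int:
        """Mark the first run of `count` free clusters as used; return its start."""
        if count <= 0:
            raise ValueError("count must be positive")
        run_start, run_len = 0, 0
        for cluster in range(self.cluster_count):
            if self.is_used(cluster):
                # A used cluster breaks the current run; restart after it.
                run_start, run_len = cluster + 1, 0
                continue
            run_len += 1
            if run_len == count:
                for c in range(run_start, run_start + count):
                    self._set(c, True)
                return run_start
        raise OSError("no contiguous run of free clusters is large enough")

    def free(self, start: int, count: int) -> None:
        """Mark `count` clusters beginning at `start` as free again."""
        for c in range(start, start + count):
            self._set(c, False)
```

Asking for a contiguous run rather than the first free cluster means a file's data tends to stay in one piece, at the cost of a longer scan when the volume is nearly full.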
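As an illustration of what "pointers compressed into extents" means for the file-data maps, the sketch below turns a FAT-style chain of cluster addresses into (start, length) pairs and back. The function names and the tuple representation are assumptions made for this example, not the on-disk format.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (first cluster, number of contiguous clusters)


def compress_chain(clusters: List[int]) -> List[Extent]:
    """Collapse a chain of cluster addresses into contiguous extents."""
    extents: List[Extent] = []
    for cluster in clusters:
        if extents and extents[-1][0] + extents[-1][1] == cluster:
            # The cluster directly follows the last extent, so extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((cluster, 1))
    return extents


def expand_extents(extents: List[Extent]) -> List[int]:
    """Recover the full cluster chain from a list of extents."""
    return [start + i for start, length in extents for i in range(length)]


if __name__ == "__main__":
    chain = [10, 11, 12, 40, 41, 7]
    print(compress_chain(chain))
```

An unfragmented file collapses to a single extent regardless of its size, which is where the saving over a one-entry-per-cluster table comes from.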
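The filename limit is a byte budget, not a character count, so how many characters fit depends on how each one encodes in UTF-8. A minimal check, assuming the 1024-byte name field from the list above and a hypothetical helper called `encode_name`, might be:

```python
MAX_NAME_BYTES = 1024  # size of the name storage area


def encode_name(name: str) -> bytes:
    """Encode a filename as UTF-8, rejecting names that exceed the byte budget."""
    raw = name.encode("utf-8")
    if len(raw) > MAX_NAME_BYTES:
        raise ValueError(
            f"name needs {len(raw)} bytes, limit is {MAX_NAME_BYTES}"
        )
    return raw


if __name__ == "__main__":
    print(len(encode_name("a" * 1024)))          # 1024 one-byte characters
    print(len(encode_name("\U0001F600" * 256)))  # 256 four-byte characters
```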
It would be nice to get the EFS specification out there and really get it adopted. However, I am also aware that such a task is not easy; one only has to look at the Ogg Vorbis project to know that. Perhaps this design will only serve as a good educational tool, but it would also be great to get it widely adopted. I shall keep this blog updated as I go through the design process. I would also like to encourage others to make suggestions or help in the design phase. There are many ideas I have, but maybe there are much nicer ways of doing things.
For more information you may email me. (EFS doesn’t exist yet; I’m still working on my reference and design blueprints.)
James