struct xfs_agf *agf = (struct xfs_agf *)(ag->map + XFS_AGF_OFFSET); if (be32_to_cpu(agf->agf_magicnum) != XFS_AGF_MAGIC) return -EINVAL; // or crash, which is more fun No buffer cache. No I/O scheduling. Just the filesystem’s raw data laid out in virtual memory. XFS’s extent B+tree is elegant: internal nodes point to other blocks, leaves point to extents. In kernel space, traversing it is cheap. In fuse-xfs , every bmap lookup might require reading several blocks—each of which is a pread() or a memory access, depending on your cache.
The solution? . When fuse-xfs opens a file, it walks the entire B+tree and caches the extent list in a flat array. Memory-heavy? Yes. But it turns a 10ms seek into a 50µs array walk. 4. Writing: The Journaling Shim XFS’s journal (the “log”) is complex. It supports rolling transactions, buffer pinning, and tail pushing. fuse-xfs implements a naïve log : each write transaction is appended to a journal.bin file. On mount, we replay by applying every logged operation in order.
There’s a moment in every systems programmer’s life where they stare at a kernel panic, a corrupted superblock, or an unreachable inode, and think: “I wish I could just put a breakpoint inside the filesystem.”
static void xfs_lookup(fuse_req_t req, fuse_ino_t parent, const char *name) { struct xfs_inode *ip = xfs_iget(parent); xfs_dirent_t *de = xfs_dir_lookup(ip, name); fuse_reply_entry(req, &(struct fuse_entry_param){ .ino = de->inumber, .generation = ip->i_generation, .attr_timeout = 1.0, .entry_timeout = 1.0 }); } XFS divides the disk into equal-sized Allocation Groups. In fuse-xfs , each AG is a mmap() of a region in a backing file ( /var/lib/fuse-xfs/ag0.bin ). Reads and writes become pointer dereferences. fuse-xfs
Want to understand delayed allocation? Step through xfs_iomap_write_delay() in userspace with printfs . Curious about AG btree splits? Corrupt an AG by writing random bytes and watch fuse-xfs segfault at the exact line of code where validation fails.
But fuse-xfs isn’t a port. It’s a reconstruction .
So go ahead. Write your own fuse-ext4 . Or fuse-zfs . Or fuse-ntfs . Mount your system’s root partition read-only and watch every lookup and read call pass through your printf . You’ll never look at df -h the same way again. struct xfs_agf *agf = (struct xfs_agf *)(ag->map +
fuse-xfs is available at github.com/yourname/fuse-xfs . Use it on loopback files only. I am not responsible for lost data, but I am responsible for your sudden, deep understanding of B+trees.
You can’t. Not easily. The kernel is a fortress, and filesystems are its moat. Enter (Filesystem in USErspace). It’s the drawbridge. But FUSE has a reputation: it’s slow, it’s “toy” grade, and it lacks the low-level power of ext4 or xfs .
This is where the kernel-to-userspace shift gets interesting. In the kernel, XFS uses xfs_buf_t with b_ops for verification. In fuse-xfs , we just cast: XFS’s extent B+tree is elegant: internal nodes point
And when someone asks, “Why would you run a filesystem in userspace?” — you’ll know the answer.
So when I decided to write fuse-xfs —a userspace implementation of the —I wasn’t trying to build a production storage engine. I was trying to answer a single question: Can we take the soul of XFS (its allocation groups, B+tree extents, and delayed allocation) and lift it into userspace without losing its identity? Here’s what I learned. The Heresy: Userspace XFS XFS, designed by SGI in the ’90s, is a kernel beast . It assumes it owns the hardware. It assumes it can reorder writes, bypass the page cache when needed, and manipulate memory directly via kmem_cache . Porting that to userspace is not just difficult—it’s borderline heretical.
Why? Because XFS inodes have a generation number (to handle inode reuse), and the low-level API lets us pass that back to the kernel’s dcache.