ZFS on Linux project was started at 2008, did received a boost at 2013 and right now developers says that the project is stable, however still can’t recommend it at production.
I had a discussion one time, where my friend asked my ‘why can’t we look to the ZFS as hybrid storage? It has ZIL and L2ARC, so we can talk about 2 tier storage, don’t we?”
It was the very interesting discussion, so I decided to sum all our arguments here for you.
I need to start with a short overview of ZFS features we need to talk about.
ZFS have advanced cache algorithm allowing very efficient data management, but we didn’t look for it in details because it will require another post to observe.
ZFS cache includes ZIL (ZFS Intense Log) where all of the data to be the written is stored, then later flushed as a transactional write. What we need to know is what ZFS allow you to add an SSD device (or few SSD devices in a vdev group) as a SLOG which transcribes of Separate Logging device. All synchronous writes will be cached to the SLOG first and will be written to the disks next. But all asynchronous writes will go to disks beside of SLOG.
SSD SLOG device is not a write-cache, and it can’t increase total throughput of your ZFS appliance! What it can is to decrease write latency, keep it stable and let other spinning disks works at the comfortable level. Not all applications can use advantages of SSD SLOG device! Local read and write will not use SLOG, but NFS and iSCSI services will use it, so as database services.
It also can’t be bigger than ¼ of ARC size, and you will not have more raw write performance from your system than without SLOG, it’s very important to understand.
So I can resume that ZFS SSD cache can’t be treated like real first-tier write cache at hybrid storage systems. Hybrid storage using SSD drives for storing all ‘hot’ data along with caching read and write requests, and then most rare using blocks will go to the second tier drives.
It can be similar with ZFS SLOG, but really haven’t much in common.
What about read cache then?
With ZFS, you can add L2ARC device, which can be used for read cache functionality on ZFS.
How it’s working: when some block went unused and need to be pushed from ZFS ARC, it will go to L2ARC device if existed. L2ARC device is typically a fast SSD drive.
And if requested block will be stored at L2ARC device it can be retrieved from fast SSD storage instead of relatively slow HDD’s.
So can ZFS L2ARC be treated like fully functional read SSD cache?
No, I afraid.
ZFS SLOG and L2ARC are working independently from each other, so if some blocks from the same file are stored at ZIL, other blocks can be pushed out from L2ARC due inactivity.
ZFS doesn’t know much about your data, how it’s related and where you need to store it.
If you have write-intensive application then you can found that data you’re writing to the storage will push out of L2ARC data you need to read, and so on.
Hybrid storage systems can store hot data at first tier SSD storage way more efficiently using SSD both for read and write cache, while ZFS limited by half-functional write caching and more functional read cache only.
Let’s resume: ZFS is a great filesystem and ZFS on Linux is a very ambitious project, already proved yourself as stable enough to use in non-critical environments.
But it can’t be treated as hybrid storage in the full meaning of the word.
David Kovacs is passionate about entrepreneurship and triathlon. Presently, he is spending most of his time with his friends trying to kick off an awesome project.
https://www.linkedin.com/in/kovacsd
You must be logged in to post a comment.