Matthew Dillon has a very detailed commit message with changes to make sure Hammer will run overnight cleanups on systems with as little as 256M of RAM. I think you can find that much RAM in breakfast cereal boxes these days.
Category: Hammer
How much RAM is too little?
If you’re running DragonFly on a very low-end system, you may be wondering about memory requirements for Hammer. Hammer is much less RAM-hungry than ZFS, so it looks like you can get away with 128M, as long as you don’t mind the occasional error message. You can manually tweak settings for it if you like. 256M is plenty.
It still strikes me as odd to consider systems with less than 1G of RAM as “low-memory”. What rich times we live in!
Libhammer added
Antonio Huete Jimenez’s ‘libhammer’, a library to make various Hammer functions available to userland programs, has been added. It implements ‘hammer info’ only at this point, if I understand correctly.
A new Hammer presentation
Francois Tigeot presented a set of slides about Hammer at a recent Irill conference. PDFs of the slides are available at his site, in English and French.
Quotas almost sort of working
Francois Tigeot took an old Summer of Code proposal, VFS Quotas, and started running with it. He’s made some progress, as he detailed in a recent post to kernel@ (with code!), but the nullfs-mount nature of PFSs in Hammer is making it difficult.
Deduplication now eats less RAM
Well, if you tell it to do so. Matthew Dillon has added a user-settable limit to the amount of memory used during deduplication, so if your Hammer-using system is low on RAM, you can conserve. This is probably most useful if you are running DragonFly in an extremely small VM, or if your name is Venkatesh.
(inside joke; Venkatesh has a crazy old desktop for DragonFly.)
Pulse-width modulated time-domain multiplexer!
I really just like that phrase and the action movie feeling of using it, like “Watch out! The pulse-width modulated time-domain multiplexer is targeting us!” Sorta like a PU-36 space modulator. It’s actually a recently-committed mechanism to improve write performance in Hammer, but my idea sounds more exciting.
Deduplication real world results
I’ve posted about my own results with Hammer deduplication here before, but Siju George put together results from his workplace using actual files in production. He recovered 138G from a 1T disk, and recovered 20% of space from another disk. Not bad for something that’s nearly automatic, and completely free.
Trying out deduplication
I moved to DragonFly 2.10 over the past few days, and I tried out deduplication, to see what kind of results I would get. The procedure is outlined below. I’m using /home here as an example, just to reduce the amount of text pasted in. Here’s the df output for /home before starting:
/pfs/@@-1:00004 966000640 566434576 399566064 59% /home
Move my various Hammer pseudo-file systems to version 5, which supports deduplication.
# hammer version-upgrade /home 5
Issue a deduplication simulate command, to see what it guesses will be the savings:
# hammer dedup-simulate /home
Dedup-simulate /home: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 4
Dedup-simulate /home succeeded
Simulated dedup ratio = 1.22
That ratio turned out to be pretty accurate for the actual deduplication. I didn’t time it, unfortunately. I don’t know if the time taken is proportional to the amount of deduplication or the total volume of data, though I suspect the latter.
# hammer dedup /home
Dedup /home: objspace 8000000000000000:0000 7fffffffffffffff:ffff pfs_id 4
Dedup /home succeeded
Dedup ratio = 1.22
462 GB referenced
378 GB allocated
14 MB skipped
6869 CRC collisions
0 SHA collisions
0 bigblock underflows
The end result?
/pfs/@@-1:00004 966000640 505887504 460113136 52% /home
That data space is shared across all file systems, and it’s a 1TB disk, so it’s 7%, or 70GB. I was hoping for more, but I don’t have any obviously duplicated data (no local mail store, no on-disk backups), so perhaps this is normal. 70GB that I didn’t have before is no bad thing, though.
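The percentages df prints are rounded, so the arithmetic behind these figures can be checked directly from the two df lines and the dedup summary. What follows is a minimal Python sketch, assuming the df columns are 1K blocks (the usual reading for a 1TB disk that shows 966000640 blocks) and that the reported ratio is simply referenced data divided by allocated data. The function names are illustrative only.

```python
# Figures copied from the two df lines and the 'hammer dedup' summary.
# Assumption: df is reporting 1K blocks.
BEFORE = "/pfs/@@-1:00004 966000640 566434576 399566064 59% /home"
AFTER = "/pfs/@@-1:00004 966000640 505887504 460113136 52% /home"


def parse_df(line):
    """Split one df line into (total, used, available) block counts."""
    fields = line.split()
    return int(fields[1]), int(fields[2]), int(fields[3])


def space_freed(before, after):
    """Return (blocks freed, percent of the whole volume freed)."""
    total, used_before, _ = parse_df(before)
    _, used_after, _ = parse_df(after)
    freed = used_before - used_after
    return freed, 100.0 * freed / total


freed_kib, freed_pct = space_freed(BEFORE, AFTER)
freed_gib = freed_kib / (1024 * 1024)

# Assumed definition of the ratio: "GB referenced" over "GB allocated".
dedup_ratio = 462 / 378

print(f"freed {freed_kib} KiB, about {freed_gib:.1f} GiB "
      f"({freed_pct:.1f}% of the volume)")
print(f"dedup ratio {dedup_ratio:.2f}")
```

Under those assumptions the "used" column gives a slightly smaller figure than the rounded percentages do: a bit over 6% of the volume, or close to 58 GiB, while 462/378 reproduces the reported 1.22.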
Incidentally, I was able to upgrade my installed software from pkgsrc-2009Q4 to pkgsrc-2011Q1 entirely using pkg_radd -u <pkgname>. Remarkably quick and painless, though pkgin may have been able to do it even faster since it would pull from the same place.
Hot-swap and a bad disk
If you follow this thread, it has some discussion on how to handle a multi-disk setup and Hammer. If a disk is going bad, you can try mirroring, though you have to be careful how your pseudo-file systems are set up.
More on Hammer design
I mentioned it before, but Matthew Dillon’s updated his Hammer document, and posted about it. Read on, especially if you like extremely complex plans.
Edit: first link fixed, plus there’s a followup.
Remember to enable deduplication
I didn’t think of this, but I needed it: if you have an older Hammer system that now can perform deduplication because you upgraded to DragonFly 2.10, make sure to add it to the configuration for that file system, or else it won’t run.
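As a rough illustration of what "adding it to the configuration" means: the per-filesystem cleanup configuration (the one edited with 'hammer viconfig') is a short list of directive lines, and deduplication is only scheduled when a 'dedup' line is present and not commented out. The helper below is a hypothetical Python sketch, not part of Hammer. The sample configuration and the '1d 5m' values are assumptions modelled on the other directives, so check hammer(8) on your own system for the real defaults.

```python
def _directive(line):
    """Return the first word of a config line, ignoring a leading '#'."""
    words = line.strip().lstrip("#").split()
    return words[0] if words else ""


def enable_dedup(config_text, schedule="1d 5m"):
    """Return config_text with an active 'dedup' line in it."""
    lines = config_text.splitlines()
    # Pass 1: an active dedup line means there is nothing to do.
    for line in lines:
        if not line.lstrip().startswith("#") and _directive(line) == "dedup":
            return config_text
    # Pass 2: uncomment the first commented-out dedup line, if any.
    for i, line in enumerate(lines):
        if line.lstrip().startswith("#") and _directive(line) == "dedup":
            lines[i] = line.strip().lstrip("#").strip()
            return "\n".join(lines) + "\n"
    # Otherwise append a new directive (schedule is an assumed value).
    lines.append(f"dedup {schedule}")
    return "\n".join(lines) + "\n"


# Assumed example of a configuration written before dedup existed.
OLD_CONFIG = "snapshots 1d 60d\nprune 1d 5m\nreblock 1d 5m\nrecopy 30d 10m\n"

print(enable_dedup(OLD_CONFIG), end="")
```

In practice you would make the same one-line change by hand in the editor; the code only spells out the three cases (already enabled, commented out, missing).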
Hammer and the future
Matthew Dillon’s been thinking about Hammer, and how to implement clustering well enough to work as a sort of RAID replacement. He’s written up a document describing his plans. Some highlights:
- writable history snapshots
- quotas and accounting
- live rebuilds of data from mirrors
- and the same history, mirroring, and snapshots as before.
It’s going to be a while before this “Hammer 2” becomes a finished product, though, so don’t count on it for the next release.
RAM vs. deduplication
Tomas Bodzar asked about RAM usage with Hammer and deduplication, pointing at this example that shows ZFS requiring… I’m not sure. Lots? Anyway, Matthew Dillon noted that offline deduplication in Hammer would use available RAM/swap for CRCs on all files, but only a limited subset for ‘live’ dedup. For a real-world example, Venkatesh Srinivas described deduplicating about 600G down to 400G, with a machine having only 256M of RAM. Yes, only 256M.
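The CRC bookkeeping mentioned here, together with the "CRC collisions" and "SHA collisions" counters that 'hammer dedup' prints, suggests a two-stage match: a cheap checksum to find candidates, then a stronger hash to confirm them. The Python sketch below is a toy model of that idea only, not Hammer's actual code. The block size, the function name, and the in-memory layout are all assumptions made for illustration.

```python
import hashlib
import zlib
from collections import defaultdict


def dedup_blocks(blocks):
    """Toy two-stage dedup over a list of bytes objects.

    Returns (unique_blocks, crc_collisions). A CRC collision here means
    two different blocks shared a CRC32 and were told apart by SHA-256.
    """
    by_crc = defaultdict(dict)  # crc32 -> {sha256 digest: block}
    unique = []
    crc_collisions = 0
    for block in blocks:
        crc = zlib.crc32(block)
        digest = hashlib.sha256(block).digest()
        seen = by_crc[crc]
        if digest in seen:
            continue  # true duplicate: the existing copy can be shared
        if seen:
            crc_collisions += 1  # same CRC, different content
        seen[digest] = block
        unique.append(block)
    return unique, crc_collisions


# Five 4 KiB blocks, two of which repeat earlier ones.
blocks = [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"C" * 4096, b"B" * 4096]
unique, collisions = dedup_blocks(blocks)
ratio = len(blocks) / len(unique)

print(f"{len(blocks)} blocks referenced, {len(unique)} stored, ratio {ratio:.2f}")
```

Even in the toy version, every unique block needs a table entry in memory, which is the kind of cost that grows with the amount of data scanned and makes the RAM question worth asking.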
More Hammer documentation
Thomas Nikolajsen has put together more information on Hammer, including formatting and the new deduplication features, conveniently located in the man pages and some other spots.
Double buffering in Hammer usually useful
Enabling the vfs.hammer.double_buffer=1 sysctl will greatly improve Hammer performance when you’ve exceeded your memory cache (at a possible slight penalty when you have not) and also speed things up when using live deduplication.
Update: Venkatesh Srinivas says:
“double_buffer makes sense when: 1) you want all CRCs to be checked on reads. 2) you’re running live dedup and care about dedup performance rather than say read-heavy performance; 3) you have swapcache but are often running into the vnode limit in what you can cache.”
So, not always useful.
Hammer speed improvements
Yeah, so those Phoronix benchmarks are crap, but Matthew Dillon went and implemented some things that would speed up Hammer write performance in any case. Read his summary for details.
New Hammer version
The default Hammer version in DragonFly is now version 5, which is the one that includes deduplication. Enjoy, bleeding-edge users! Otherwise, wait for the next release.
Version 6 is there, but don’t upgrade to it yet; there aren’t significant user-visible changes, and the usual disclaimers for new versions apply.
Phoronix benchmarks for Hammer
A Phoronix test of DragonFly’s Hammer filesystem turned up, via Siju George. It’s not really a benchmark as much as it is a speed test, and it’s not a realistic comparison, but it’s interesting to see numbers.
They need a graph that shows how much historical data can be recovered by each file system, or how long fsck takes after a crash.
Update: Matthew Dillon points out the many ways these tests are wrong.
Deduplication arrives
Ilya Dryomov’s work on deduplication for Hammer has been committed to the tree in an early test form. I guess I need to pay up as part of the code bounty. If you’re wondering how much space it will save, but don’t want to try non-production code yet, there’s a ‘hammer dedup-simulate’ command that will estimate the saving ratio.
This is great news – deduplication is so valuable it adds an extra zero onto the price of any storage device that can do it.
Encrypted HAMMER volumes possible
I haven’t covered this enough: thanks to Alex Hornung, it’s possible to create a HAMMER volume and have it be encrypted. Matthias Schmidt has done just this, and has provided an rconfig(8) script to automate the process. (Or to crib from if you prefer to do it by hand.)
Lazy reading: the return of ACID, SSI, weirdness
A smaller set of links, but still the same volume of reading material.
- Samuel Greear linked to this lengthy writeup on how to have both the consistency of ACID and the scaling of NoSQL. Astute observers may notice the similarities between the plan described and the way HAMMER works.
- Joerg Sonnenberger pointed out to me, after my words on The BSD Show!, that MOSIX is an open source single-system-image implementation, though it appears to be designed for specialized high-speed networks rather than the more general case of DragonFly.
- This seems bizarre. (via)
Updates and improvements for HAMMER, crypto
Matthew Dillon posted a summary of recent bugfixes in HAMMER and kqueue, which means if you are running a bleeding-edge version of DragonFly built in the last few weeks, you should update.
He also mentions a “significant improvement in performance” in disk encryption. How significant? Over three times as fast.
New HAMMER catastrophic recovery tool
Matthew Dillon reports that DragonFly now has a catastrophic recovery tool for HAMMER filesystems, with pertinent details.
Summary of recent kernel work
Matthew Dillon has provided some details about recent kernel work, along with a release forecast.
What of OpenSolaris?
You have probably seen reports declaring the demise of OpenSolaris by now, many taking a less than conservative approach in reporting the news one way or the other. So what do you make of the news? By all accounts, the source code (including future changes) for things such as ZFS will continue to be published under the CDDL. Will Oracle closing up development make it impossible for operating systems like FreeBSD to maintain ZFS without forking it? What do you think the ramifications will be for DragonFly’s HAMMER and DragonFly in general?
Dedup, please
Matthew Dillon made a minor change to HAMMER that would help any future deduplication work. There’s also a deduplication code bounty out on the recently-updated Code Bounties page…
I’ve been NAS-shopping, and I’ve found that deduplication ability seems to add an extra zero on the end of a device’s price tag. It would be very nice for HAMMER.
Messylaneous: reading, catchup
I apologize; I’ve been missing. Here’s some misc links while I get back in gear:
- A very good reason to be interested in Hammer over ZFS: nobody will threaten lawsuits over Hammer.
- 10 tricks for admins. I’m posting it cause I can never remember that thing with tunneling ssh out. (via)
- This Gaming Life, as a free download. An excellent book that is in physical form on my shelf right now. Yes, unrelated.
Rolling everything back
If you have a Hammer filesystem, and you want to roll the entire thing back to a previous snapshot – all files, everywhere – it can be accomplished with one command.
Keeping a mirror-stream going
A note, in part for my own benefit: the @reboot crontab entry is all you need to get a HAMMER mirror-stream going again after a reboot/shutdown.
More on Hammer and Samba
Matthew Dillon went into detail on just how Hammer snapshots could be shared out via Samba.
Hammer via Windows
Siju George is making a Hammer volume’s snapshots available through Samba, with the result that some Windows-using developers get historical snapshots for free.
2.6.2, 2.7.2 created; please update
Matthew Dillon identified a possible data corruption bug in Hammer with a nearly-full filesystem. It’s serious enough that he’s tagged 2.6.2 and 2.7.2 so that people can update; his message about it describes how to check for corruption.
Last-ditch ways to check your disk
If you’re worried that your Hammer disk may be going bad – and I mean bad like physically bad – you can check it with dd, or see what the hammer tool lists as bad.
Even more PostgreSQL benchmarking
Jan Lentfer’s done some new benchmarking of PostgreSQL on Hammer. There are further suggestions, and a more complete benchmark is planned, taking advantage of the Hammer improvements in 2.6. In the meantime, you can look at previous benchmarks.
Hammer and OS X
Daniel Lorch has ported Hammer to Mac OS X, of all things. It’s not complete, but he’s moving right along.
New location for Hammer on Linux
Daniel Lorch’s work on porting Hammer to Linux (read-only, currently) has been moved to a new location.
REDOs done, and 64-bit vkernels too
Matthew Dillon has implemented what he calls “REDO” records in Hammer, which reduce the amount of time taken flushing data to disk. It’ll be in the 2.6 release, but it isn’t on by default.
Jordan Gordeev’s work on 64-bit vkernels has also been brought in, so virtual systems are now available for x86_64 users.
32 to 64-bit Hammer mirroring fixed
Michael Neumann has fixed the ability to stream Hammer data between 32 and 64 bit systems. However, this is a change to 64-bit systems that requires them to match; make sure that you are not mixing 64-bit systems built before and after this commit on the 21st.
I can’t find the commit message in the mail archive, so I’ll quote it here:
HAMMER config details
Pulled from a larger conversation: a description of the settings for a HAMMER filesystem, and what they mean. I can tell from experience that extremely active disks will need extra cleanup time…
A week’s worth of posts for you
I can’t keep up with all the things to post. I desperately want to clear my inbox, so here’s a week’s worth of posts all smushed together. Enjoy!
- Naoya Sugioka’s tmpfs work is almost ready to go.
- Francois Tigeot is looking to find supported RAID hardware for DragonFly; the LSI1068e isn’t useable. Freddie Cash listed a number of different and fully supported cards, and Francois listed some other potential choices.
- While talking about hardware, Steve O’Hara-Smith reported excellent results with a particular Atom 330-based board and DragonFly.
- Stathis Kamperis has added to ‘hammer snapls’ output; an example is in his submit@ message.
- The 2.6 release of DragonFly, scheduled for March, will have version 4 of HAMMER. 2.4 has version 2. Upgrading from version 2 to 4 can happen in place, live, and only needs to happen once per volume, not per PFS. That’s about as easy as it gets. More details are available.
- The default sshd config has been updated; this shouldn’t affect your normal operations unless you’re using one of the mentioned options.
- Oliver Fromme linked to more discussion of SSD durability.
- Also, Matthew Dillon posted more notes and benchmark numbers for his swapcache work. There’s been some side benefits too. A man page for swapcache is now available.
- Aggelos Economopoulos’s libevtr has been added, for event tracing. He’s posted some additional notes on this work-in-progress.
- We now have /var/log/daemon, too.
- Notes on prepping for Google Summer of Code 2010 from the GSOC Discussion list; I don’t know if that link is readable for nonsubscribers.
- The Definitive Guide to PC-BSD is out at the end of this short month. Dru writes good books.
- Did you know FreeCiv (a Civilization clone, of sorts) is playable in a web browser? Goodbye free time! Details are available at my favoritest game site.
Phew.
New HAMMER presentation
Michael Neumann presented a talk on HAMMER at the Karlsruher Institut für Technologie on January 27th. His slides (in English) are now available in PDF or ODP formats, and are listed on the dragonflybsd.org Presentations page.
Watch out on the bleeding edge with UFS
If you’re running DragonFly 2.5 and updated in the past week or so, and have UFS disks, there’s some instability introduced by Matthew Dillon’s recent work. It ought to be better by next week.
Users of Hammer, or of UFS only as /boot, don’t have anything to worry about.
More REDO
That didn’t take long: Matthew Dillon has an update on his REDO work; he’s about halfway there. His summary includes instructions on how to test this new work, including ways to change how Hammer syncs to disk.
Hammer downgrade and upgrade bug
Thomas Nikolajsen experienced firsthand a bug where downgrading a Hammer PFS master to a slave and then later making it a master again lost all data. Lucky him… The problem’s now fixed.
Hammer disk removal
Thanks to Michael Neumann, it’s now possible to remove a drive from a Hammer volume. It’s experimental, so all the standard warnings apply.
This can’t be done on a root volume, for hopefully obvious reasons.
Hammer expanding
Did you know a Hammer volume can span multiple disks? And that you can add extra disks later on? There are no RAID-like features – it’s just a straight multiple-disk volume, but it works. The Hammer command to do it is now “hammer volume-add”.
Hammer saves my bacon
Some of the ikiwiki configuration files on dragonflybsd.org were accidentally overwritten during a software upgrade. Normally this would mean some work to locate and replace them from backups, but since it was a Hammer volume, a quick look in /var/hammer/usr/… found them for me.
I want to point out what Hammer does, here. Restoring from backup isn’t new – it is in fact probably one of the most basic and necessary of system administration duties. However, Hammer makes it so easy that the incremental work of using it falls to almost nothing. There’s no extra preparation or syntax to learn for retrieval, which is wonderful. Hammer’s easy fix has helped me out several times now. Any other backup system would probably have worked too, but I would have spent that time just restoring things back to normal.
Hammer version 4 now available
Matthew Dillon has made version 4 of Hammer the default; the upgrade is a relatively painless ‘hammer version-upgrade’ command. This new version cuts out a chunk of the disk syncs needed, speeding up Hammer disk operations.
More links again
I like linkblogging, especially because there’s been a lot of good stuff floating about:
- Matthew Dillon detailed some of the problems he had using hardlinks to create backups – problems Hammer solves.
- The History of the Internet in a Nutshell: pretty good, though it says Unix “influenced” Linux and FreeBSD. Influenced is right for Linux, but there are parts of the different BSDs that are from UNIX directly.
- From O’Reilly: The War for the Web. The walled garden that failed in the long run for Compuserve and AOL and so on is being resurrected. (via)
- Along the same lines: The Death of the URL.
Accessing Hammer config via NFS
Thomas Nikolajsen came up with some ideas for making the configuration files for a given Hammer volume accessible, even when that volume is being presented over NFS. He’s looking for more ideas.
Small undo fix
If anyone wants a project, there’s apparently a small undo bug that I’ve encountered. It is a small fix in terms of changes, for any takers.
Hammer version 4 status
There’s a status report from Matthew Dillon about his work on version 4 of Hammer, including the always enjoyable stories of tests that involve yanking the SATA cable from the drive.
Easier Hammer manipulation
‘mike’ made this interesting csh script that allows autocompletion of Hammer sub-commands. For example, type ‘hammer’ and then cycle through the available hammer commands as you would through file names.
More Postgres benchmarking
Jan Lentfer repeated his Postgres tests on DragonFly with some system changes suggested by Matthew Dillon, and noticed a speed increase. (See previous report.)
Bug report reading
This description of a Hammer bug makes for interesting reading, since it delves into the sequence of events where data is actually laid down on disk. Interesting reading for a geek, admittedly…
Hammer version 3 in testing
Version 3 of Hammer is now available in bleeding-edge DragonFly, though it’s still experimental. The biggest reason for this version bump is to move the /snapshots folder to /var for all Hammer filesystems. This means an accidental rm -rf won’t destroy snapshots, as I’ve done. The saved data is still on the original partition, as just the metadata is saved to /var. More explication is available.
Postgres benchmarks
Jan Lentfer performed some Postgres benchmarks on DragonFly. It’s elaborate enough that it’s in the form of a PDF attached to the message I’ve linked. There are some additional variations that haven’t been tried yet.
Vigorous file system activity seemed to lower performance in the long term on Hammer, which is certainly something to investigate. More testing please!
Saved by Hammer
A script I was running on avalon.dragonflybsd.org yesterday afternoon removed the packages, iso images, and snapshots stored there. (Sorry!) Hammer saved my bacon, with a snapshot of the 182G of missing files immediately available.
Hammer mirror header output, just in case
If you back up the pseudo-file-systems (PFS) on your Hammer volume to a non-Hammer disk, and then need to restore them to a new Hammer volume, and then realize you forgot to write down the shared-uuid, well, then, YONETANI Tomokazu has a patch for you. I haven’t seen this committed yet, but it appears valuable.
Hammer speedups
Matthew Dillon’s made some changes to Hammer that make performance during mixed operations (reading and writing requests at the same time) much faster. This should work for everyone, though AHCI/SILI/SCSI users will notice it more. The new writing system is called ‘BIOQ’.
Hammer improvements, tests
Matthew Dillon’s made some improvements to Hammer’s read and write processes. To quantify this, he’s tested Hammer and UFS with blogbench and written up the results. The tl;dr summary: UFS performs well until the system cache runs out, and then it halts. Hammer has some overhead from saving all history, but doesn’t stop working under a much heavier load.
Messylaneous: books, lawsuits, git, more
Dear universe, including DragonFly people: stop doing so much stuff. It’s hard to keep up.
- Git in One Hour, an O’Reilly webcast. You need to register (free) and so on, but what the heck. O’Reilly doesn’t show crap.
- Poul-Henning Kamp is suing to recover the cost of Vista on his Lenovo laptop. (He’s installing FreeBSD.) I hope it comes out in his favor, though it will have little legal effect here in the U.S. (via)
- I didn’t realize this until I chimed in on the mailing lists, but one of the best books about file systems is freely available as a PDF.
- Another benefit of Hammer: you can’t run out of inodes, nor is it possible to have too many hardlinks.
- Some notes on pf usage in DragonFly. I know some parts have been mentioned before, but it’s good to sum up.
Spreading Hammer
Want to bring Hammer to Slackware Linux? People want it, and there’s some work already in progress.
PFS and NFS now play nicely together
It is now possible to mount a Hammer PFS via NFS, though you’ll want to use NFSv3.
Hammer gets bigger
Hey, look at what Michael Neumann’s doing: making Hammer expandable! It will be possible to expand your Hammer volumes while online, even.
(note: it’s experimental; don’t be surprised if it destroys data.)
New Hammer option
The hammer command now has an ‘info’ option, which gives a great deal of information on your Hammer drives, thanks to Antonio Huete Jimenez. (Committed)
New Hammer version
Matthew Dillon has a new version of Hammer, which speeds up listings from programs like ‘ls -la’ and ‘find’. This is only in 2.3.1.x code right now, so don’t force an upgrade via hammer version-upgrade if you’re still on DragonFly 2.2. His post includes some benchmarks.
On a side note: sili(4) tests look good.
Other ways to back up Hammer
As Jim Chapman found out, dump only works for UFS, and not for Hammer. Matthew Dillon outlined the different mirroring and snapshot methods that Hammer makes available.
Mirroring with Hammer
Siju George described his efforts to set up a continuous, automatic backup system using Hammer, with some interesting results. Matthew Dillon chimed in with some suggestions.
Minor Hammer changes
Matthew Dillon’s made some small changes to Hammer; it should result in a small speedup when copying data.
Better speed for cleanup
A number of people have noticed that Hammer’s pruning (which by default runs once at midnight) makes systems temporarily unresponsive. Matthew Dillon’s committed a fix for this, with warnings of more improvements to come.
Bulk build speed stats
I recently did a bulk build of pkgsrc on two similar machines; the only significant difference being extra CPU work being done on one system, and Hammer snapshots on the other. However, they’re diverging in speed over time, which is interesting but not yet conclusive. Read my post about it for more details.
A good benchmarking project would be testing Hammer with snapshots on and with snapshots off.
Hammer presentation at BSDDay
Sdävtaker is giving a “Hammer administration” presentation at BSDDay, May 29th and 30th in Argentina. (His presentation is on the second day.)
Hammer porting notes
Daniel Lorch, the student working on a port of Hammer to Linux, has a blog, with some notes on progress. I found this April item entertaining.
lseek(2) extensions for Hammer?
Pedro F. Giffuni suggested that the SEEK_HOLE and SEEK_DATA lseek extensions would be good additions to Hammer, and linked to a Sun paper that went into more detail.
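For readers who have not met these extensions: SEEK_DATA moves the file offset to the next region that contains data, and SEEK_HOLE moves it to the next hole, so a program can skip the unallocated parts of a sparse file. A minimal Python sketch follows, assuming a platform and filesystem where os.SEEK_DATA and os.SEEK_HOLE are available. The helper names are hypothetical, and where the constants are missing the code simply treats the whole file as data.

```python
import errno
import os
import tempfile


def make_sparse_file(path, hole_bytes, payload):
    """Create a file with a hole of hole_bytes followed by payload."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)
        f.write(payload)


def data_extents(path):
    """Return (start, end) byte ranges that hold data, skipping holes."""
    size = os.path.getsize(path)
    if not (hasattr(os, "SEEK_DATA") and hasattr(os, "SEEK_HOLE")):
        return [(0, size)] if size else []  # fallback: everything is data
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # no more data after offset
                    break
                raise
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "sparse.bin")
    make_sparse_file(demo, 1 << 20, b"hammer")
    print(data_extents(demo))
```

On a filesystem that tracks holes, the reported extent starts near the 1 MiB mark rather than at zero; on one that does not, the whole file comes back as a single data extent, which is still a valid answer under the lseek semantics.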
Encryption ideas for backups
Sdävtaker came up with a potential idea for encrypting backup files, and Matthew Dillon followed up with another way that uses Hammer.
Hammer for Linux, and others
Daniel Lorch is working on a port of Hammer to Linux’s VFS, though since he’s using FUSE, it will be able to reach other systems, like NetBSD. The code is accessible.