A conversation about compilers in the DragonFly base system led peeter (must) to describe his group’s use of OpenMPI on DragonFly for physics calculations. Apparently he’s had a significant performance improvement on DragonFly.
Category: Committed Code
Remember the new scheduler work? Well, it continued, and now Francois Tigeot has posted pgbench benchmarks of the progress and benchmarks of DragonFly vs. other operating systems. The links are to PDFs; scroll down, as each has multiple pages.
The summary result: If you’re running Postgres, you probably want to do it on DragonFly. The numbers are the best results for any BSD, even better to some extent than Linux, which has had its own issues with schedulers and Postgres. DragonFly 3.2 will include these improvements.
John Marino has accomplished the difficult task of putting gcc 4.7 into DragonFly. Version 4.4 is still the default, and the older 4.1 version has been disabled. If you want to try this newer version, setting WORLD_CCVER=gcc47 will build kernel and world that way too. If you’re curious about what’s different in this version of gcc, there’s a 4.7 changelog.
Are we the only BSD with this new a version in base? I think so.
P.S.: You’ll want to do a full buildworld if you’re running DragonFly 3.1
P.P.S.: you may need to put ‘NO_GCC47=true’ in make.conf, going from IRC comments.
P.P.P.S.: Nope, now it’s fine.
The combination of Mihai Carabas’s successful Summer of Code work on the scheduler and the recent Postgres benchmarking got Matthew Dillon to start thinking about making UNIX domain sockets work better, a shortcut around the buffer cache, scheduler improvements and then a new default scheduler, along with a change in idle CPU behavior. The best place to understand all the changes is in his long post to users@.
We should have benchmarks soon to show the performance improvements from all this.
If you do, they don’t get cleaned up during the normal ‘hammer cleanup’ nightly routine. Chris Turner has added a way to manually specify them as a cleanup target.
I’m pretty sure in this case ‘offline’ means ‘nothing streaming to it from a master disk’. I think.
Matthew Dillon has created an experiment: shared page table mappings. It’s controlled by a sysctl, since it’s still experimental. The real-world effect is reducing the number of memory faults as a process uses up memory, and decreasing the overall memory usage. The obvious benchmark is Postgres speed; this makes the initial expansion of memory usage much less of a drag on speed due to a high memory fault rate.
If all this mention of faulting sounds like a problem, remember that memory faults on BSD are normal; causing a fault is how a program indicates it needs more memory space. This is in contrast to Linux, where memory is allocated a different way. Or at least, that’s my understanding. (If you know better, please comment.)
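For anyone who wants to watch faults happen from user space, here is a minimal Python sketch. It maps an anonymous region, touches one byte per page, and compares the process's minor-fault counter before and after. The function name `count_faults_while_touching` and the 16 MB default are illustrative choices of mine, not anything from DragonFly. The counter comes from `resource.getrusage`, which is Unix-only, and some sandboxed environments may report it as zero.

```python
import mmap
import resource


def count_faults_while_touching(size_bytes=16 * 1024 * 1024):
    """Return the minor page faults counted while touching fresh memory."""
    page = mmap.PAGESIZE
    # Anonymous mapping: the kernel hands out address space, but typically
    # does not attach physical pages until each page is first touched.
    region = mmap.mmap(-1, size_bytes)
    before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    for offset in range(0, size_bytes, page):
        region[offset] = 1  # first write to a page may trigger a fault
    after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    region.close()
    return after - before


if __name__ == "__main__":
    print("minor faults while touching memory:", count_faults_while_touching())
```

The exact number depends on the system's page size and on whether the kernel uses larger pages behind the scenes, so treat the output as a rough indicator rather than a benchmark.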
These are small, but they make life easier: Hammer now has a scoreboard file, for viewing of mirror-streams running in the background. There’s also a ssh-remote directive, so you can use ssh without enabling an interactive shell, and a HAMMER_RSH environment variable so different remote shells can be used. These are all for Hammer 1.
John Marino is working on updating tcl in pkgsrc. It’s apparently quite messy to update, which may be why it has sat out of date for some time. Never one to rest, he’s also been making FUSE filesystems work on DragonFly. (Here’s a FUSE explanation, if you need it.)
Also this. Someday I’m going to write a “games on DragonFly” feature, or series.
Matthew Dillon recently posted a status report for Hammer 2. Of interest is the spanning tree protocol being built to handle messages between Hammer volumes. As he says in the message:
For example, we want to be able to have millions of diskless or cache-only clients be able to connect into a cluster and have it actually work…
(No, it doesn’t do this, yet.)
Pierre Abbat noticed that bc(1)’s usage of something that wasn’t GNU readline made it harder to use; Sascha Wildner changed it to use libedit. Pierre’s other complaint, that BSD man page output stays on-screen when completed, is a positive feature. Linux systems that clear man page output enrage me, because I expect to be able to take advantage of my scroll buffer.
John Marino has added a ‘gcc47’ compiler ccvar, so you can build world and kernel with it. ‘It’ is actually gcc-aux, since it seems to work better than the basic (“vanilla”?) gcc47. You also get Ada support, though that wasn’t the driving reason to pick it. This is brand new, so don’t try it unless you’re ready to discover issues.
Is there any other BSD able to use gcc 4.7 for world/kernel? Even 4.6? Most of the attention has been on clang.
Nuno Antunes is still working on that netgraph upgrade. Among other changes, ng_tty has been added. What’s it do? Something with ppp, I think.
Sascha Wildner has made it easier to use alternative syntax checking systems as a “lint” make target in DragonFly. His usage of coccinelle, as one of these alternatives, has already found many bugs – just today, for instance.
Is “alternative syntax checking systems” the right phrase for this? I don’t know. “Correctness checker”? My phrases all sound like something you’d read on a government form.
Reading this HAMMER2 commit carefully shows some future plans: remote cluster control, and the ability to mount nonlocal HAMMER2 volumes. A reminder: those are future plans, not what you can do now.
It’s possible to accidentally truncate your password when using DES encryption if it contains the byte 0x80 in UTF-8 encoding. It’s fixed.
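The commit is the place to look for what actually went wrong. As an illustration only, here is one way this kind of bug can arise, sketched in Python. Traditional DES password hashing builds an 8-byte key by shifting each password byte left by one bit. If the code decides "end of string" by testing the shifted byte instead of the original, then 0x80 shifted left in 8 bits becomes zero and looks like a terminator. The function names here are hypothetical and this is an assumption about the shape of the bug, not a copy of the real code.

```python
def build_des_key_buggy(password: bytes) -> bytes:
    """Build an 8-byte key; stops advancing when the SHIFTED byte is zero."""
    key = bytearray(8)
    pos = 0
    for i in range(8):
        byte = password[pos] if pos < len(password) else 0
        shifted = (byte << 1) & 0xFF  # emulate 8-bit unsigned char storage
        key[i] = shifted
        if shifted != 0:  # bug: 0x80 << 1 wraps to 0 and looks like the end
            pos += 1
    return bytes(key)


def build_des_key_fixed(password: bytes) -> bytes:
    """Same as above, but tests the ORIGINAL byte for the terminator."""
    key = bytearray(8)
    pos = 0
    for i in range(8):
        byte = password[pos] if pos < len(password) else 0
        key[i] = (byte << 1) & 0xFF
        if byte != 0:
            pos += 1
    return bytes(key)


if __name__ == "__main__":
    # U+00C0 encodes to the bytes C3 80 in UTF-8, so it carries a 0x80 byte.
    first = "ab\u00c0cd".encode("utf-8")
    second = "ab\u00c0zz".encode("utf-8")
    print("buggy keys match:", build_des_key_buggy(first) == build_des_key_buggy(second))
    print("fixed keys match:", build_des_key_fixed(first) == build_des_key_fixed(second))
```

In the buggy version, everything after the 0x80 byte is ignored, so two different passwords that agree up to that point produce the same key.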
If you are running bleeding-edge DragonFly, libpthread was broken for a short period. If you built anything in the last … 12 hours? You may want to rebuild it. If that doesn’t describe you, it’s a nonevent.
It’s funny that I’m reporting a short-term break in bleeding-edge operating system code as any sort of surprise. It shows something about how stable DragonFly-master is most of the time.
A few recent updates imported to DragonFly from FreeBSD: Francois Tigeot updated amdsbwd(4), an AMD south bridge watchdog. Sascha Wildner updated arcmsr(4), the Areca RAID controller driver, and Peter Avalos updated pw(8).
In the other direction, FreeBSD now has GNU hash support for rtld, based on John Marino’s work in DragonFly.
Sepherosa Ziehau added “Rescue Retransmission for SACK-based Loss Recovery Algorithm” in a commit, where he details just where this would be handy. It’s on by default and the sysctl net.inet.tcp.rescuesack can be used to turn it off.
Francois Tigeot has followed up with a description of how to enable and disable quotas on DragonFly, which will work for most any local file system, unless rebooted. There’s also the vquota(8) man page.
DragonFly now has an optimized scoreboard for SACK, thanks to Sepherosa Ziehau. What’s that mean? SACK is a way to make sure only the needed parts of a TCP transmission get retransmitted, when multiple packets are lost. The scoreboard is where the packets needing retransmission are tracked. So, the result of these improvements is better performance in packet-lossy situations.
(Please correct me if your understanding is better than mine; my explanation is based on stumbling around the Internet for a few minutes of reading.)
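To make the scoreboard idea a little more concrete, here is a toy Python sketch. Given the cumulative ACK and the SACK blocks the receiver has reported, it works out which byte ranges are still missing below the highest SACKed byte. The function name `find_holes` and the half-open `(start, end)` ranges are my own conventions for illustration; this is not how DragonFly's kernel code is written, and it ignores sequence number wraparound.

```python
def find_holes(cum_ack, sack_blocks):
    """Return missing [start, end) ranges between cum_ack and the highest SACK."""
    holes = []
    cursor = cum_ack  # everything below this is known to have arrived
    for start, end in sorted(sack_blocks):
        if end <= cursor:
            continue  # block adds nothing beyond what is already covered
        if start > cursor:
            holes.append((cursor, start))  # gap the sender should retransmit
        cursor = max(cursor, end)
    return holes


if __name__ == "__main__":
    # Receiver has everything up to 1000, plus two later chunks.
    print(find_holes(1000, [(3000, 4000), (1500, 2000)]))
```

Only the gaps get retransmitted; the SACKed chunks are left alone, which is where the savings come from when several packets are lost at once.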
Sepherosa Ziehau has made changes to the initial TCP congestion window, based on a number of papers he links to in his post. The immediate effect is if you’re on DragonFly-current, you will need to do a full buildworld on your next upgrade. The long term effect could be improvements in latency by improving reactions to bufferbloat. Or not; this is pretty technical.
If you’re trying DragonFly 3 in a virtual machine, you may have noticed some issues in booting in (for instance) Qemu. Sepherosa Ziehau committed a change that sets the sysctl hw.ioapic_enable to 0 in virtual environments. It can always be turned back on, but the recent MSI/MSI-X improvements seem to cause trouble in some virtual environments. You can also set that tunable at boot to get an initial install going.
(I haven’t had trouble in Virtualbox or VMWare, so you may or may not need this.)
Here’s an interesting side effect that came up in Hammer 2 development: deleting files can potentially require modification of only one parent element. If I’m reading it right, that means deletion always takes about the same time, independent of the amount of data being deleted. Your ‘rm -rf /largedrive’ could complete, removing multiple terabytes of data before you realize it. I suppose it’s silly to complain about speedy results. Of course, being Hammer, it would still be available in history.
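As a toy model of why that could be constant-time, consider a tree where each directory holds references to its children. Detaching a subtree means removing one entry from the parent, no matter how much hangs below it. The `Node` class and `unlink` function are hypothetical names for this sketch, and it says nothing about how Hammer 2 actually lays out or reclaims blocks.

```python
class Node:
    """A directory or file in a toy in-memory tree."""

    def __init__(self, name):
        self.name = name
        self.children = {}


def unlink(parent, name):
    """Detach a whole subtree by removing a single entry from the parent."""
    # Only the parent is modified; the subtree itself is not walked.
    return parent.children.pop(name)


if __name__ == "__main__":
    root = Node("/")
    big = Node("largedrive")
    root.children[big.name] = big
    for i in range(100000):
        big.children[f"file{i}"] = Node(f"file{i}")

    detached = unlink(root, "largedrive")
    print("still visible from root:", "largedrive" in root.children)
    print("entries in detached subtree:", len(detached.children))
```

The detached subtree is still intact after the unlink, which loosely mirrors the point about the data remaining available in history.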
Thanks to John Marino’s work, it’s now possible to build the DragonFly kernel and world using gold, and have it work. You just have to set WORLD_LDVER to make it work. I don’t think there’s any user-visible change from this, other than a tiny speedup in building. I don’t know if any other BSD is using gold yet.
For the curious and technically oriented, Hammer 2 development can be watched directly by looking for any commits marked ‘hammer2’. There’s been a lot, and if you want to see the code as it flows in, here’s your chance.
John Marino has made it possible to build world and kernel on DragonFly using GCC 4.6 in the form of gnat-aux. (We’re currently on GCC version 4.4.) Note that version 4.6 isn’t included with DragonFly, so you would need to download and compile a very recent version of lang/gnat-aux, and set CCVER=gcc46 before building world and kernel to try this out.
Update: John Marino points out in comments that you need to set WORLD_CCVER, not CCVER as his original message said.
ISDN support has been removed from DragonFly. It was not useful at this point, because it’s rarely used any more. It does make me feel a little sad; this was the technology everyone said was the future before cable modems and DSL were figured out.
A bit of symmetry in that title, there. Old ATA, which was replaced years ago, is finally gone. This should affect nobody…
Matthias Schmidt found a discussion about DragonFly’s password encryption. The result, if I am reading it correctly, is that brute-forcing the password from available hashes is quicker than it should be. Matthias also found a contributed fix. Samuel Greear updated to match the reference SHA implementation also in Linux, with this very pertinent warning.
Matthew Dillon has a very detailed commit message with changes to make sure Hammer will run overnight cleanups in situations as low as 256M of RAM. I think you can find that much RAM in breakfast cereal boxes these days.
What happens when you break enough things in DragonFly that you become a source of test cases? As Antonio Huete Jimenez (AKA “tuxillo” on IRC) found out, you get a stress test named after you.
There’s been a rare segfault present in DragonFly for quite some time. It’s been difficult to reproduce, and the 2.12 release due some months ago was held up specifically to fix it. Matthew Dillon was, after many days (months?) of work, able to replicate it reliably and eventually find a way around what appears to be a new AMD-specific bug. Read his very detailed explanation of what he did to get to this point.
Francois Tigeot benchmarked his accounting work with blogbench, and posted a PDF with the results. Dmitrij D. Czarkoff made a simpler graph, which can be used to draw the conclusion: blogbench didn’t work well for estimating the impact of VFS accounting. If you want to try accounting yourself, put
vfs.accounting_enabled="1" in your /boot/loader.conf.
(The normal DragonFly mailarchive isn’t updating because it feeds from DragonFly NNTP, and that’s not updating, so I’m using Gmane for post links.)
There is now a NO_BINUTILS221 option, added by Sascha Wildner, that will keep your system from building binutils 2.21 during a buildworld. The system will still build binutils 2.22, so there will still be a functioning ld on the system. Use this along with NO_GCC41 (so only gcc 4.4 gets built) to speed up your buildworlds, if you like.
Francois Tigeot has been working for quite a while on a VFS accounting system. It doesn’t restrict to a quota (yet), but it will give you byte totals for each mounted filesystem. It has been committed, so it looks like a good way to tell which PFS is eating your disk.
Update: Francois pointed out he’s still adding parts for this. So it’s not quite done yet, but soon.
Buildworlds are now much faster, because they can run themselves in parallel. Invoke it using the -j option to make. Matthew Dillon saw a 25% reduction in time when using ‘make -j 12 buildworld’ on a 4-core system. You may need to manually update xinstall and mkdir:
cd /usr/src/usr.bin/xinstall
make clean; make obj; make all install
cd /usr/src/bin/mkdir
make clean; make obj; make all install
It’ll also use more memory than a non-parallel build, but heck, that’s cheap these days.
Venkatesh Srinivas made a minor change to a ddb backtrace – it now prints the raw instruction pointers. On x86_64, a backtrace would not print the correct objects out, so this is better. It’s a minor change, but I’m pointing it out because it totally helped solve a problem for me on a package-building machine.
The general rule of thumb is that if you have a function written in an interpreted language (Perl, Python, etc.), it’ll be faster in C. If you need it faster than that, you go to assembly. Prepare to have your world rocked: Venkatesh Srinivas found that strlen() in libc was actually slower written in assembly than in C. His commit message has numbers to back that up.
It’s another throughput tweak from Sepherosa Ziehau: soaccept is run differently when pulling in network data from a socket. The commit message once again shows the results of the change using httperf.
Binutils in DragonFly is now up to version 2.22 – the commit linked is one of several.
Some time ago, Matthew Dillon worked on a bulk build system that built as much of pkgsrc in parallel as possible. It’s in the tree now as ‘fastbulk’, for anyone wanting to try it out. I used it a bit; I didn’t measure the degree of speed increase, but was able to get about 70% of the packages built.
Sepherosa Ziehau has implemented another networking speedup. Read the commit message for details on what he changed, since it’s rather in-depth. He shows an 18% improvement in netperf results.