Archive for 'apache'
Not a Rio!
Ok, so yet more generosity from Sun today. I’ve been told we can keep the T2000! Incredibly, the T1000 is still on its way, and that will be with us some time next week – but I can now recycle the cardboard from the T2000. Brilliant!
We would actually be more than comfortable deploying a T2000 as a host for ftp.heanet.ie, but things are not that simple. We’d have to migrate the 12TB of data from XFS to UFS, for a start. We’d also prefer to buy a server, with a support contract and so on (update: the donations actually come with SunService contracts), anyway. But the T2000 would give us better performance, and a much better NFS stack, which greatly expands our options for the future. When we next purchase an ftp.heanet.ie, Niagara is at the top of our list.
So, the T2000 will go to RedBrick, and we’re working on something else for the T1000. In the immediate future, it will serve as an excellent development and testing machine for Apache, and we’ll use it for the dtrace work (now well underway, by the way). It will also be incredibly useful for us to have a machine for ongoing comparison purposes, and maybe even some OpenSolaris hacking.
Stay tuned for more benchmarking work though, results will be coming soon.
Update: Sun have also very kindly offered to cover Nóirín’s travel costs to ApacheCon Europe in Dublin (read how this will help her, here). Nóirín has been doing a lot of work behind the scenes editing my blog-posts into readable English (and also improving the spelling to above the level of a 10-year-old!) and with Sun’s help, the universe has been particularly efficient at getting the karmic reward in order.
Erie in Eire
Good news, SUN are donating a Niagara T1000 to us (and hence RedBrick), excellent! Damien, from the Sun performance team here in Ireland, called to tell us it’s on its way. We should have it in the next week or two. This isn’t part of the original contest, it’s from the Sun engineering team, and we’re very grateful. I’m sure RedBrick will put it to excellent use, and we’ll get them thinking about a server name right away!
After doing a lot of analysis today, we’ve figured out what was up with the benchmarks – we were inserting a systematic delay of about 30ms into every connection. How? By not using epoll() on the client side. Turns out that the blocking select() time in the clients was actually a very significant factor in how quickly the benchmarks were running.
What this means is that although the results from the previous benchmarks are almost certainly still in the correct order, and the comparison of relative performance is valid, the absolute requests-per-second numbers are invalid, and there’s a lot of room for improvement, probably beyond 25,000 requests per second. I’m now planning to give everything a serious try with the latest Nevada build, instead of update 1, which is what we’re currently running.
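As a rough illustration of why a fixed client-side delay distorts the absolute numbers but not the ordering: in a closed-loop benchmark, each connection can only issue its next request once the previous one has finished, so throughput is capped at concurrency divided by the time per request. The concurrency and service time below are made-up values chosen only to show the effect; only the 30ms delay comes from the analysis above.

```python
def max_requests_per_second(concurrency, service_time_s, client_delay_s=0.0):
    """Upper bound on throughput for a closed-loop benchmark client.

    Each of `concurrency` connections has one request in flight at a time,
    and each request costs the server's service time plus whatever delay
    the client itself adds.
    """
    per_request = service_time_s + client_delay_s
    if per_request <= 0:
        raise ValueError("per-request time must be positive")
    return concurrency / per_request


# Hypothetical values for illustration only (not taken from the benchmark runs).
CONCURRENCY = 100
SERVICE_TIME_S = 0.005   # assumed 5 ms for the server to answer
CLIENT_DELAY_S = 0.030   # the ~30 ms systematic delay described above

with_delay = max_requests_per_second(CONCURRENCY, SERVICE_TIME_S, CLIENT_DELAY_S)
without_delay = max_requests_per_second(CONCURRENCY, SERVICE_TIME_S)
print(f"with client delay:    {with_delay:8.0f} req/s")
print(f"without client delay: {without_delay:8.0f} req/s")
```

Because the same delay is added for every server under test, the ranking survives, while the ceiling on the absolute figure is set largely by the client.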
We also have a list of other cool things to try out, and from the sounds of things, the guys in EastPoint have extremely extensive knowledge about tunings and just what levels of performance can be achieved. As a lot of their work is being opened up as part of the OpenSolaris effort, there may well be some great ways Sun and Apache can help each other out. Hopefully, we might even see some of them at ApacheCon.
We’ve still got over a month with the T2000 left though, and plenty more to try out on it, so we’re not letting it rest just yet. It’s a very, very strong candidate for the next ftp.heanet.ie iteration, and it looks like there may be a solution within UFS to our single-threaded I/O numbers. ZFS would definitely help out there too.
It’s hard for me to look neutral anymore, since we’ve been donated a box, or maybe it’s easier – since we don’t really have anything to gain – but I have to say that I am very, very impressed, not only by Sun’s latest technologies and their approach to engineering (which has always been good), but also by the genuinely open and receptive nature they’ve shown. I’ve had contact with a few Sun employees during this trial, and they’ve all been extremely helpful. There was never any hint of pressure, or even corporate schmooze, just simple and honest advice, which is good to see. I still think Solaris has a bit to go before we could mass-deploy it in production again, mainly to do with the lack of a decent packaging system. That said, Sun now looks like an organisation which is genuinely receptive to feedback, so before the 60 days are up, I’ll try and write a blog post comparing Solaris to other platforms, for common administrative and automation tasks.
Anyway, thanks Sun, especially all of the people who’ve been speaking to me over the past few days – we’ll put the server to good use.
Niagara Benchmarks: Update
Since I’ve been posting material up on Niagara, a few other httpd and solaris folk have been chiming in with some more expert opinions. I’ve also received more dder output from various platforms, which I plan to post up when I get a chance.
What I couldn’t wait for though, is to communicate the effect some single changes in our benchmarking setup have achieved. A few days ago I raved about the 5700 requests per second I was getting out of the Niagara box. Turns out that was a load of crap, here’s what I’m getting now;
Requests per second: 15298.68 [#/sec] (mean)
And here’s what an active ftp.heanet.ie is pushing;
Requests per second: 4445.26 [#/sec] (mean)
Abandoning siege, and using the latest version of ab is revealing some fundamental limitations in our previous benchmarks and some errors in our assumptions. O.k., so my assumptions. My bad, and I wanted to try and rectify it as quickly as possible. As I scramble together some time over the next few days, we’ll recheck our other results, and see what else is lurking under the hood in terms of performance. We suspect there’s more room for growth.
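When rechecking results across several runs, it helps to pull the headline figure out of ab's output mechanically rather than by eye. The following is a minimal sketch that assumes the summary line has exactly the form shown above; the function and variable names are my own, not part of ab.

```python
import re

# Matches the summary line printed by ab, for example:
#   Requests per second:    15298.68 [#/sec] (mean)
RPS_LINE = re.compile(
    r"^Requests per second:\s+([0-9.]+)\s+\[#/sec\]\s+\(mean\)",
    re.MULTILINE,
)


def requests_per_second(ab_output):
    """Return the mean requests/sec found in ab's output, or None if absent."""
    match = RPS_LINE.search(ab_output)
    return float(match.group(1)) if match else None


# The two summary lines quoted above.
niagara = requests_per_second("Requests per second: 15298.68 [#/sec] (mean)")
live_ftp = requests_per_second("Requests per second: 4445.26 [#/sec] (mean)")
print(f"T2000: {niagara} req/s, live ftp.heanet.ie: {live_ftp} req/s")
print(f"ratio: {niagara / live_ftp:.2f}x")
```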
Thanks to Brian Akins for nailing the problem. Oh, and we’re hearing a lot of very very good things about, and seeing some very nice numbers from, Opteron systems too.
Niagara vs ftp.heanet.ie Showdown
So, after a week with the Niagara T2000, I’ve managed to find some time to do some more detailed benchmarks, and the results are very impressive. The T2000 is definitely an impressive piece of equipment; it seems very, very capable, and we may very well end up going with the platform for our mirror server. Bottom line, the T2000 was able to handle over 3 times the number of transactions per second and about 60% more concurrent downloads than the current ftp.heanet.ie machine (a dual Itanium with 32GB of memory) running identical software. Its advantages were even bigger again when compared to a well-specced x86 machine. Not bad!
The Introduction
ftp.heanet.ie is one of the single busiest webservers in the world. We handle many millions of downloads per day, but unusually for a high-demand site, we do it all from one machine. This is usually a bad idea, but as a mirror server has built-in resilience (in the form of a world-wide network of mirrors), and as we can’t afford 20 terabytes of ultra-scalable, network-available storage, we use a single machine with directly attached storage, and rely on our ability to tune the machine to within an inch of its life. We regularly serve up to 1.2 Gigabit/sec, and have handled over 27,000 concurrent downloads. There’s some more detail on our previous set-up (which is mostly identical to the current one) in my paper on Apache Scalability.
Over four years ago, when I started in HEAnet, Solaris and Sparc hardware represented about 50% of our Unix systems. Now it represents less than 2%, so I’ve had less and less opportunity to tinker on Solaris in the last few years, but have kept up with it enough to know how to use dtrace, and to still understand the Solaris fundamentals. At ApacheCon US 2005, Covalent had a T2000 along as a demonstration machine. I got to play with it a little and was very impressed. Unlike prior experiences, this machine felt very responsive. There was no waiting for the output of commands, no listening to the whirring of hard disks, and the benchmarking numbers it was producing weren’t bad either.
When Jonathan Schwartz announced the “Free Niagara box for 60 Days” deal, we jumped at the opportunity to test one of these boxes – which may be ideal for our needs. It took a while for Sun to iron out some administrative problems, but they certainly held up their end of the deal, and a nice shiny T2000 arrived a little over a week ago, for us to try out.
The Machines
To get a better sense of the machine’s performance in comparison to our other options, we rustled together a Dell 2850 Dual 3.2GHz Xeon with 12GB of RAM, running Debian, and our current Dell 7250 Itanium (which is a dual 1.5GHz with 32GB of RAM).
Throughout the benchmarking, the machine used for firing off the benchmarks (using ab, httperf and siege) was another Dell 2850, this time a dual 2.8GHz Xeon with 4GB of memory. For performing the concurrency and latency tests, we used more, similarly-configured (and identical each time), 2850s and 2650s to run yet more parallel benchmarks.
As ftp.heanet.ie is a live system which we can’t simply take off-air because we want to complete some benchmarks, we ran the tests during its quietest periods of use. To be fair, we also made sure that the other two systems – when benchmarked – were loaded with a baseline of 40 requests per second, with an average concurrency of around 300. After initially determining which machines were “winning” the benchmarks we tried to structure the load to favour the “loser” of the benchmarks, if any decision was needed. This means that where one machine comes out on top, the margin by which it wins is actually a conservative estimate.
Ordinarily, we try to drastically reduce the number of services on a machine, to free up memory and scheduler time on the system. However, as the T2000 came with a large number of services running, and it’s not entirely easy to determine what is and isn’t actually a critical service, we shut down obvious candidates – such as the various network filesystem daemons – but left some others alone. Again, if anything, this means that our results are actually conservative for the Sun, although they probably do reflect a real-world set-up, which will have these services running.
The Preparation
As no system comes configured perfectly for such extreme tests, we did a number of things to each machine we tested, to achieve as much performance as we could manage. Since my Solaris skills are rustier than my Linux skills by a fair margin, it’s more than possible that our benchmarks under-represent the performance of the T2000.
- T2000
- The first thing we did after receiving the system was to get smpatch configured, and to run “smpatch update”. Getting the system completely up to date took a good 6 hours, and that still only covered critical and security updates, as we don’t have a subscription for everything else. Being a Debian and Ubuntu user, this is annoying. “apt-get update && apt-get dist-upgrade” would have done the same thing, and upgraded everything in about 15 minutes, at the very, very longest. Hopefully though, that will be improved upon.
Next, we installed the SUNWspro suite, in order to have a compiler, linker and so on – which is mighty useful for compiling Apache from source! Some reasonably trivial invocations of apachebench seem to show that this compiler produces faster binaries than gcc. Over the years, there have been claims that 64-bit binaries are actually slower than 32-bit binaries. Our testing didn’t show much of a difference, but just in case there is one, we used 32-bit builds of Apache, though with the correct largefile-magic, so that we could still transfer very large files.
We didn’t apply many Solaris kernel tunings, mainly because the Solaris team seem to be working hard to get rid of them, and putting a lot of effort into making the default behaviour ultra-scalable. Nevertheless, we upped max_nprocs various times to cope with the insane number of processes we were creating. Keeping an eye on tcp:tcp_conn_hash_size with ndd seemed to show little problem with the default values, and this is the main Solaris tunable we’ve had to tune in the past.
Apart from mounting the filesystems with the “noatime” mount-option, we did no filesystem tuning, which is something I’m keen to improve on, particularly if we can try out ZFS. Again, if anything, this means that the performance of the T2000 may be under-represented. However, as our benchmarking was restricted to just 3 files, with no directory traversals, probably not by much. If anyone has any pointers on intensive filesystem tuning on Solaris, please send them my way!
- Itanium 7250
- The Itanium box runs version 2.6.15.2 of the Linux kernel and our list of related sysctl’s looks like this;
    net/core/wmem_default=5000000
    net/core/wmem_max=5000000
    net/core/rmem_default=5000000
    net/core/rmem_max=5000000
    net/ipv4/tcp_rmem="8192 87380 1747600"
    net/ipv4/tcp_wmem="8192 87380 1747600"
    net/ipv4/tcp_wmem="8192 10000000 10000000"
    net/core/netdev_max_backlog=25
We also up the txqueuelen on our interfaces to 50000, for achieving super-high throughput to our Geant users. The XFS filesystem was mounted with the “noatime” and “ihashsize=65535” mount options.
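For anyone wanting to sanity-check a list like this before applying it, here is a small sketch that parses the slash-separated keys, maps them to their /proc/sys paths and flags any key that is set more than once. The helper names are my own. It only prints what would be written, and it assumes the settings are applied in order, so a later entry for the same key wins.

```python
from collections import Counter

# Settings copied from the list above, in the same slash-separated form.
SETTINGS = """\
net/core/wmem_default=5000000
net/core/wmem_max=5000000
net/core/rmem_default=5000000
net/core/rmem_max=5000000
net/ipv4/tcp_rmem="8192 87380 1747600"
net/ipv4/tcp_wmem="8192 87380 1747600"
net/ipv4/tcp_wmem="8192 10000000 10000000"
net/core/netdev_max_backlog=25
"""


def parse_settings(text):
    """Return (key, value) pairs in file order, with surrounding quotes stripped."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        pairs.append((key.strip(), value.strip().strip('"')))
    return pairs


def proc_path(key):
    """Map a slash- or dot-separated sysctl key to its /proc/sys path."""
    return "/proc/sys/" + key.replace(".", "/")


pairs = parse_settings(SETTINGS)
duplicates = [k for k, n in Counter(k for k, _ in pairs).items() if n > 1]
effective = dict(pairs)  # later entries override earlier ones

for key, value in effective.items():
    print(f"{proc_path(key)} <- {value}")
print("set more than once:", duplicates)
```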
- 2850 Xeon
- For the sake of consistency, the 2.6.15.2 kernel was also installed on the Xeon box, with the same system and interface settings as the Itanium box. The ext3 filesystem used was mounted with the “noatime” mount-option.
- Apache
- Common to each box were the usual Apache tunings we apply. For each machine, we tried to determine the quickest MPM to use. In the case of the two Dell boxes, this was the event MPM, which was ahead of the worker MPM by about 2%. We couldn’t get the event MPM working on Solaris (more about that later), so we used the worker MPM – which was over twice as fast as prefork on the platform.
As Solaris seemed to respond better to more LWP’s than PID’s, we ran with 64 threads per child – which is not at all an unreasonable number. Increasing beyond this did give us slightly better results, but the potential for 64 downloads to die at once, when there’s a problem, is just about enough real-world risk to deal with, for me. The relevant configuration stanza looks like:
    <IfModule mpm_worker_module>
        ServerLimit          1563
        ThreadLimit          64
        StartServers         10
        MaxClients           100032
        MinSpareThreads      25
        MaxSpareThreads      75
        ThreadsPerChild      64
        MaxRequestsPerChild  0
    </IfModule>

Note: these are stupid values for a real-world server, and will waste a lot of memory for the scoreboard. They are really only useful if you are doing some insane benchmarking and testing.
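The numbers in that stanza hang together in a specific way: with the worker MPM, MaxClients cannot exceed ServerLimit multiplied by ThreadsPerChild, and ThreadsPerChild cannot exceed ThreadLimit. A quick sketch of that check, using the values above (the helper name is my own):

```python
def worker_capacity(server_limit, threads_per_child):
    """Maximum simultaneous connections a worker-MPM configuration can serve."""
    return server_limit * threads_per_child


# Values taken from the configuration stanza above.
config = {
    "ServerLimit": 1563,
    "ThreadLimit": 64,
    "ThreadsPerChild": 64,
    "MaxClients": 100032,
}

capacity = worker_capacity(config["ServerLimit"], config["ThreadsPerChild"])

# ThreadsPerChild must fit within ThreadLimit, and MaxClients within capacity.
assert config["ThreadsPerChild"] <= config["ThreadLimit"]
assert config["MaxClients"] <= capacity

print(f"{config['ServerLimit']} children x {config['ThreadsPerChild']} threads = {capacity}")
```

Here MaxClients sits exactly at the ceiling, which is why ServerLimit has such an odd-looking value.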
We naturally set “AllowOverride None”. Interestingly, although sendfile() functions flawlessly on Solaris (unlike on Linux), using it seemed to have an impact on performance. Using it did reduce the amount of memory used by Apache on the box, but it gave slower performance than just read() and write() – so perhaps its blocking characteristics are slightly different. Thus, we set “EnableSendfile off” and used MMap instead (via “EnableMmap”) which seemed to be the fastest way to ship bytes.
Another hack we applied to speed up Apache was to change the default buffer size, which is buried in the bowels of APR and can only be changed at build-time. In each case, the buffer size was changed as per the most efficient value (as determined by our previous benchmarks on single-threaded I/O). Don’t try this at home kids, unless you really know what you’re doing.
So, with our tunings applied, we set about performing our benchmarks, and for the sake of sticking with the showdown theme, I’ve divided the results into good, bad and ugly. (No, there weren’t really any ugly results – it’s just a fun theme for a post).
The Good
- Power Usage
- As I’d previously blogged, one of the first things we were able to measure was the power usage of the machine. Much to my amazement, it remained at the original level (+/- 20%) of current draw for the duration of our tests, peaking at a mere 1.2 Amps, or about 290 Watts. This compares pretty favourably with our Dells, though I should add that the Dells both have more disks in their chassis than the T2000.
| Machine | Average draw | Peak | Yearly cost |
|---|---|---|---|
| Sun T2000 | 1 Ampere | 1.2 Amperes | €210 |
| Dell 2850 | 1.6 Amperes | 2 Amperes | €350 |
| Dell 7250 | 1.8 Amperes | 2.2 Amperes | €395 |

Costs are calculated on the average draw, at the Irish commercial ESB rate, and do not include cooling costs (roughly triple the number to get the overall yearly cost). Electricity supply was 240V, so multiply the Amperes by 240 to get the raw numbers of watts.

These results were calculated using an APC metered PDU. This is not a scientific instrument, and it’s entirely possible that results are inaccurate. Some rough calibration did show that the unit produced consistent results, so personally I’m confident enough about the order in which the machines are ranked, but I wouldn’t go so far as to be certain of the raw numbers produced. We really need a good power meter to produce that kind of reliability.
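A minimal sketch of the arithmetic behind the table, for anyone who wants to redo it with their own tariff. It multiplies amperes by the 240V supply as described, ignores power factor, and backs out the per-kWh rate implied by each quoted yearly cost. That implied rate is derived from the table, not a published ESB tariff.

```python
SUPPLY_VOLTS = 240            # supply voltage stated above
HOURS_PER_YEAR = 24 * 365

# (average amps, peak amps, quoted yearly cost in euro) from the table above.
MACHINES = {
    "Sun T2000": (1.0, 1.2, 210),
    "Dell 2850": (1.6, 2.0, 350),
    "Dell 7250": (1.8, 2.2, 395),
}


def watts(amps, volts=SUPPLY_VOLTS):
    """Amps times volts; a rough figure that ignores power factor."""
    return amps * volts


def kwh_per_year(amps):
    """Energy used in a year at a constant current draw."""
    return watts(amps) * HOURS_PER_YEAR / 1000


for name, (avg_amps, peak_amps, yearly_cost) in MACHINES.items():
    energy = kwh_per_year(avg_amps)
    implied_rate = yearly_cost / energy  # derived from the table, not a tariff
    print(
        f"{name}: avg {watts(avg_amps):.0f} W, peak {watts(peak_amps):.0f} W, "
        f"{energy:.0f} kWh/yr, implied {implied_rate:.3f} euro/kWh"
    )
```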
- Requests per second
- How many requests the machine can handle in a second is probably the most valuable statistic when talking about webserver performance. It’s a direct measure of how many user requests you can handle. Fellow ASF committer, Dan Diephouse, has been producing some interesting stats for requests-per-second for webservices (and they are impressive), however we were more interested in how many plain-old static files the machine could really ship in a hurry. And without further ado, those numbers are;

Sun’s own benchmarks have quoted up to 2500 requests per second, which we didn’t find particularly impressive. Our current box – merely a dual Itanium – can do 2700 requests per-second without much trouble. I’m happy to confirm though, that the tricks we do to reduce Apache’s memory usage on Linux have as much of an effect on Solaris. Our results are averaged over 5 runs of the testing, during which the T2000 managed a very, very impressive 5718 requests per second. Not bad!
Despite the new kernel, the x86 box still struggled to push out a disappointing 982 requests per second, while our Itanium churned through a reliable 2712 requests per second.
- Concurrency
- Unfortunately, neither the siege nor apachebench utilities can cope with the levels of concurrency we test with these days, as there are simply far too many sockets involved. Tuning the client machine itself becomes a serious task in order to be able to cope with the sheer volume of outbound requests. We currently have some commercial traffic generation and scaling testers in our test-lab, but we decided not to use those either. Instead, multiple servers were thrown at the problem and we used 11 machines all-in, all running instances of siege at the same time. The instances were fired off by hand, but within a few seconds of each other, and more than enough requests (100,000) were used, to ensure that the processes were given enough time to ramp up to the level of parallelism required. Each machine was on the same LAN as the server we were benchmarking.
With those limitations in mind, the test certainly allowed us to find out the rough breaking point of each machine. On any system, sustaining over 10,000 concurrent requests would involve denying some requests outright, but the cut-off or breaking point was defined as the point when the server got to 50% availability. We used some other tricks, like assigning the server multiple IP addresses and targeting each client at a different address, to a) give the tuple-tracking code in the IP stacks an easier time and b) allow us to easily track how many clients each server was sustaining.
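The 50% cut-off can be expressed as a small function over (concurrency, availability) samples. This is only a sketch of the definition above; the sample data is invented to exercise the function and does not come from our test runs.

```python
def breaking_point(samples, threshold=0.5):
    """Return the lowest concurrency whose availability is at or below threshold.

    `samples` is an iterable of (concurrency, availability) pairs, with
    availability expressed as a fraction between 0 and 1.
    """
    for concurrency, availability in sorted(samples):
        if availability <= threshold:
            return concurrency
    return None  # availability never dropped to the threshold in the tested range


# Hypothetical measurements, invented purely for illustration.
example = [(10_000, 0.97), (20_000, 0.88), (30_000, 0.71), (40_000, 0.49)]
print(breaking_point(example))
```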
Also, in each case, the system was pretty much unusable by the time we were done! After killing all of the connections, the Linux boxes would take about 5 minutes before becoming responsive enough that we could get to a shell prompt. The T2000 would take about 20 minutes, although I think that if we reserved more processes for the root uid, that might change – sshd seemed responsive enough, but would block on fork() when trying to create a shell process.

As you can see, the T2000 was able to sustain about 83,000 concurrent downloads, and my limited dtrace skills tell me that thread-creation at that point seemed to be the main limiting factor, which is hardly surprising. For us, that number represents an upper limit on what the machine could handle when faced with a barrage of clients. Of course, no server should ever be allowed to get into that kind of insane territory, but it’s always good to know that there is plenty of headroom. More to the point, it means that availability at the lower levels of concurrency is much higher. Compared to the 57,000 concurrent connections our Itanium box, and the 27,000 our Xeon box can handle, it looks like the T2000 would be a very, very good choice of server for our load.
- Latency vs concurrency
- I would have liked to have been able to measure availability vs concurrency, but unfortunately our method of testing doesn’t really allow for this. Although we can sum the availabilities as seen by each client participating in the benchmark, this doesn’t always time-average correctly. In other words, if we used two client systems, and client A reported 90% availability and client B reported 80% availability, does that mean 85% uptime overall, or 80%? Unfortunately, it doesn’t mean either. Averaging only works if the two figures are perfectly overlapped in time, so it’s an average – but weighted in proportion to the lack of an overlap. The real availability is somewhere between 80% and 85%, and it’s very hard to figure out where. If the client systems were identical in hardware terms, we could come close to solving the problem by firing off the benchmarks with the at command, but our systems aren’t all that close in terms of spec.
Instead, what we can do, is to measure the latency as it increases with concurrency, in each case taking the worst value from our benchmarking clients. Benchmarking from a single system shows that there is a very high degree of correlation between an increase in latency and a decrease in availability, so this measurable gives us a good idea of both.

Overall, the T2000 performs very impressively. At very low numbers of concurrency, it actually has a higher latency than either of the Dell machines we tested, but these latencies are of the order of tens of milliseconds. In other words, the network latency makes a bigger difference in the overall scheme of things.
With no concurrency at all, the T2000 would exhibit latency of 9 milliseconds, compared to the Itanium’s 1 millisecond (and in fact, ab actually outputs 0, so it’s less than 1 millisecond) and at 1000 concurrent requests the T2000 would have 48 milliseconds, compared to 12 milliseconds for the Dual Itanium box. However, as we scaled up the concurrency, the latency numbers change fairly rapidly, in favour of the T2000. Due to the huge changes in scale, we’ve had to use a logarithmic graph, but at 50,000 concurrent downloads, our Itanium would take up to 38 seconds to respond to a client, compared to the T2000’s 26 seconds. At 83,000 downloads, which only the T2000 could manage, the latency had gone up to 57 seconds, but it still responded.
Overall, I think it’s fair to say that while the T2000 doesn’t seem to have ultra-low latency performance, it has much better scalability and provides much better availability as more and more connections are added. So again, overall, the T2000 is still the better webserver.
The Bad
I’m a bit reluctant to label these results “bad”, because they really are in areas in which Sun have never claimed the machine will perform. The Niagara platform is architected for parallelism, it’s not supposed to give great performance for any single-threaded task. If you have a load which requires great performance to a single client, Sun have an array of other hardware they’d prefer to sell you instead. However, since some aspects of single-threaded performance do have a direct impact on webserver performance, I’ve included some relevant ones here.
- Single-threaded I/O
- As I’ve previously blogged, one of the first benchmarks we run on any machine is to determine how much I/O a single-threaded task can drive, and what the most efficient buffer size to achieve that is. There’s much more detail in the linked blog post, but the summary information can be easily graphed:

These results may be attributable in part to the relatively slow system disks that the T2000 ships with, and much better performance can probably be derived by using a faster disk setup. On the other hand, the performance Linux achieves is mainly due to the very aggressive vfs caching it performs. Unlike the Linux box, the T2000 produces the same throughput numbers whether it is the first time or the tenth time it has read a file. Linux, on the other hand, takes much longer to serve a file the first time, but after that, it’s served from RAM.
It’s also useful to put these results in context; what they mean is that a single-threaded task, doing as pure and simple an I/O task as possible, can push 3.5Gigabytes per second. The Niagara box comes with 4 Gigabit/sec interfaces, so even a single-threaded task could fill that, 7 times over. Still, if I were deploying a load with a large and very active database component, I would do some more extensive testing to ensure that any single-threaded I/O constraints had no overall effect.
- Single-download throughput
- After gathering the numbers on single-threaded I/O, and confirming that the T2000 could easily saturate its 4 Gigabit interfaces – at any level of concurrency high enough to generate that level of traffic – we decided to see if the I/O numbers exhibited themselves for a single download. To perform this benchmark we went back to basics, and used curl and wget to grab a 1 Gigabyte file repeatedly. To help the systems out, we increased the MTU to 9000 bytes and made sure the TCP window size was big enough to take the entire file straight away. We also monitored for any packet loss during the tests (there was none).
Due to the way we handle the load-balancing of our network interfaces on the Linux boxes, which is per-flow, any single download is limited to 1Gigabit/second. Sure enough, wget reported a neat 123 MB/sec fairly reliably. Since the balancing was per-flow, it’s entirely possible the machine can actually ship faster downloads, and neither system seemed under any strain while doing this. With the T2000 on the other hand, we could push no more than 48 MB/sec, which is still a very respectable 384Mbit/sec.
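The unit conversions in this section are easy to get wrong, so here is a short sketch of them. It uses the same convention as the figures above (8 bits per byte, with no distinction between decimal and binary megabytes); the function name is my own.

```python
def mbytes_to_mbits(mbytes_per_sec):
    """Convert MB/sec to Mbit/sec using 8 bits per byte."""
    return mbytes_per_sec * 8


# Single-download rates reported by wget.
print(mbytes_to_mbits(48))     # T2000
print(mbytes_to_mbits(123))    # Linux boxes, capped per flow at 1 Gigabit/sec

# Single-threaded I/O figure (3.5 Gigabytes/sec) set against the T2000's
# 4 Gigabit/sec of network interfaces.
single_thread_gbits = 3.5 * 8
interface_gbits = 4
print(single_thread_gbits / interface_gbits)
```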

Apart from increasing the MTU and Window size, we didn’t apply any Solaris-specific tunings for improving these numbers, so again, it’s possible that these numbers are under-representing true possible performance. And once again, we really have to put these numbers into context. As a whole, the T2000 has no problems saturating its 4 Gigabit/sec of connectivity, and that’s what it’s designed for – parallelism. All our numbers mean, is that if you wanted truly incredible performance for any single download, this probably isn’t the right architecture. Outside of where I work, and other high-speed research networks, I’m not aware of any place where high-speed, single-flow statistics really matter a whole lot, especially for HTTP. The network is usually a limiting factor anyway. I mean, how many people have jumboframe capable multi-gig WANs?
The Ugly
Ok, so ugly is a bad choice of word. But like I said, this is a “showdown”. While testing the event MPM, we did manage to upset the Solaris kernel to the extent that it actually crashed;
    panic[cpu21]/thread=300024a7020: BAD TRAP: type=31 rp=2a102c87720 addr=0 mmu_fsr=0 occurred in module "genunix" due to a NULL pointer dereference
    httpd: trap type = 0x31
    pid=652, pc=0x10fb4dc, sp=0x2a102c86fc1, tstate=0x4400001607, context=0x514
    g1-g7: 0, 0, 12, 38, 0, 0, 300024a7020
Nice! I haven’t looked into this in detail yet, but it’s likely due to the unusual synchronisation semantics the event MPM features right now. The event MPM is marked as experimental, and if you’re not an Apache developer, you probably shouldn’t be running it. Still, the thread-handling code within the MPM all runs as a non-root user, so it really shouldn’t be able to cause the kernel to crash. Then again, it was handling about 30,000 requests at the time, with no accept mutex. This isn’t exactly within the normal range of expected behavior for a userland application. Since switching to the worker MPM, we’ve had flawless performance and not a single crash.
The Conclusion
The T2000 is one very impressive piece of kit, and at a list price of around €15,000 ($16,995), costs less than half of the price of the dual Itanium we’ve been benchmarking it against (it’s also less than I can price up a comparable X86 box for – seems to be the memory that does it). We may very well go with the platform for our next iteration of ftp.heanet.ie.
The benchmarks we’ve run were all run with our own load in mind, but hopefully they’re still of some use to others. If you’re thinking about giving the platform a try, do run your own benchmarks though, don’t take our word for it. It’s always better to have these things validated and improved upon.
The Future
We’re not finished benchmarking just yet, we still have more planned! The Niagara box has some impressive SSL-offload features, and if we get a chance, we’d like to test those capabilities. We just need to get the hacked-up engine3-supporting versions of openssl and flood onto the box, which will involve a bit of research. Some of the Apache SpamAssassin guys may try running some SpamAssassin benchmarks on the machine too, which should be impressive, as they lend themselves to parallelisation very well. We’re also going to try and improve on our above tests, and I’ll keep blogging about the results as we manage to do that.
Rather tantalisingly, there’s a comment on Dan Kegel’s C10K page saying that “Doug Royer noted that he’d gotten 100,000 connections on Solaris 2.6 while he was working on the Sun calendar server”, but it doesn’t give any details of the hardware involved. But still, 100,000 connections, on 2.6! It gives me hope that with more tuning, the T2000 might be capable of scaling beyond the 83,000 we had.
If I develop some more free time, I also hope to use the machine to instrument Apache httpd (and maybe apr) for dtrace. Do check out Matty’s mod_dtrace though, for a cool module which instruments all of the handlers.
In the meantime, you can check out all of my blog posts about the Niagara box through my new Niagara category. Mads is also keeping tabs on other benchmarks taking place within the ASF community.
The Cheeky Part!
I don’t know what the status of Jonathan’s offer to be allowed to keep a server, at the discretion of the Niagara team, is – but we might as well give it a try.
Although we’re seriously considering the platform for the future, HEAnet doesn’t have a use for a Niagara box right now, but the other participants in our benchmarking efforts (and hopefully we’ll be blogging their results soon enough too) do – DCU’s Networking Society, RedBrick. RedBrick just celebrated 10 years as a networking society, and 5 years ago, Sun donated a massive E450 to the society, on which we ran our 2000 user shell server for 2 years.
We even pulled out all of the stops at the time, and had the Taoiseach (the Irish Prime Minister) turn out to launch the machine. I’m hoping we can convince SUN to donate the Niagara box to RedBrick, where they can use it for even more testing and benchmarking, as it really is an ideal machine for a shell environment. Lots and lots of low-memory parallel tasks.
So if you thought this round-up was of any use, digg it, link to it, or mail it to your local Sun Niagara team member, and we’ll see if we can be useful enough to merit a donation!
Sponsor Noirin!
Noirin is an Apache httpd-docs committer, and among other things has authored the Irish translation of the Apache error documents, improved the mod_rewrite documentation, done a massive overhaul of the mod_ssl documentation (now it no longer references things which haven’t existed in 5 years!), added event MPM documentation and fixed a load of grammar and spelling mistakes. Although she normally lives in Dublin, she’s currently spending a year in Munich on a European exchange (Erasmus) study programme (she’s doing a degree in Computational Linguistics). She’s one of httpd’s youngest committers and one of very few women committers (which we’re trying to encourage!).
Annoyingly, although ApacheCon is in Dublin this year, she isn’t, and as a poor student even getting to the hackathon isn’t very easy for her, with the World Cup in Germany adding a lot to the flight costs around the time. It would cost her about 400 euro just to be in Dublin for the hackathon, and ApacheCon itself costs around 900 euro. So she’s looking for sponsors, corporate or otherwise.
Busy, Busy, Busy
I didn’t blog at all during January, and I didn’t get to code as much on Apache stuff as I would have liked either. It looks like it’s going to be like this for a while yet, but I’m now finally at a stage where I can explain why I’m so busy!
We’re building a new data-centre, over a pretty short period of time (it has to be occupiable – by servers – on May 1st) and believe me, this is no small amount of work. We’ve been running tender evaluations, designing cabinet layouts, working out budgets, negotiating SLAs and contracts and lots more besides. As the build progresses I’ll try to blog about it, including photos and so on. We’re doing some things in a little bit of an unusual way and I’ll try to explain our reasoning along the way. Hopefully this will prove of use to others too.
But before that, I should cover a little of what I’ve been up to in the last 6 weeks.
3 weeks ago, I went over to visit Nóirín, and we went up Zugspitze where we had a great, if somewhat cold, time spending the night in an Igloo! You can read Nóirín’s write up on it here and there’s a bunch of photos too.

While in Munich, we also caught a Jacques Loussier gig, which I thought was a bit odd to be honest, but was good to have been at nonetheless.
On the Digital Rights Ireland front, we’ve been working hard to be in a position to accept donations as well as handling some behind-the-scenes legal work. This week, along with the TCD Dublin Legal Workshop, we’ll be hosting a talk from Suw Charman on Friday. This should be great, and if you can make it at all, please do. The DRI invite and write up is here.
On the Apache front, one of the really annoying aspects of being so busy is that I haven’t been able to find the time to do much coding. I had to back out of the execd work I had started, but hopefully in a few months I’ll get a chance to get back to it. To mitigate my own sense of guilt over this, I volunteered to RM the latest httpd 2.0.x release, and I’m glad I did. It’s a lot of work, it took me at least 60 hours to get 2.0.x into a releasable state (we’re now waiting on some licensing issues to be clarified before a candidate is rolled) – but unlike coding, this work can be easily broken up. It’s possible to do 15 minutes or an hour here and there and have it all add up productively.
When I code, I need to do it in large uninterrupted blocks or I lose my concentration and start being unproductive. If you’re involved in any Open Source projects, I’d say volunteering to RM is a great way of contributing when you don’t have the space to get a load of coding done, but want to help nonetheless. Though do get very, very familiar with how to manage code merges!
I’m considering proposing two talks for ApacheCon Europe, but before I do, it’d be useful to hear any feedback on what people would like to hear. My current ideas are:
- Scaling Apache httpd to 50,000 concurrent users
This talk would be an update of the talk I gave last year, only now with even bigger numbers. It would include the standard tuning/benchmarking basics but also new things like the pluggable schedulers in Linux, the siege utility, the event MPM in much more detail (and how it improves performance over worker), the new graceful-stop feature and how that helps, our experiences on the Itanium platform and Itanium-specific tunings and a bit on mod_ftp thrown in for good measure.
- IPv6 at the ASF
This talk would be a few things in one: a brief introduction to IPv6 from the point of view of a typical user of ASF software (mostly server software), the common platform bugs and how to avoid them, a survey and report of IPv6 support in all ASF software (I pretty much have this part done), and then some details on IPv6 from an ASF developer point of view, what’s needed and so on, using APR as an example (we have a load of bug-workarounds in the APR IPv6 code – it’s one of the best sources of platform bug documentation).
If these interest you, or turn you off, or if you can think of anything else better, do tell!
Update:
A reader got in contact with me to ask how the trial of the latest Sun kit went. As I blogged last December, Sun announced a free trial of their Niagara boxes for people to determine how good they are and to consider buying some. As far as I can tell, this trial is vapourware. We never heard back, despite filling in the form again and mailing just to be sure. A few other people we’ve talked to attested to a similar experience. I guess Sun still suck.
Update 2:
A look at Sun’s revised form for the trial shows that Ireland isn’t on the list of selectable countries, which might explain why we never heard back. What a load of crap. Sun definitely suck.
Apachecon Europe 2006 in Dublin
After very nearly happening some time ago, some backroom lobbying, lots of hard work and slogging from the planners and who knows what else, Apachecon Europe 2006 will be in Dublin, Ireland!
From: Rich Bowen
Date: Fri, 17 Feb 2006 15:30:11 -0500
To: announce@apachecon.com
Subject: ApacheCon EU 2006

The ApacheCon Planners are pleased to announce that ApacheCon Europe 2006 will be held in Dublin, Ireland, at the Burlington Hotel (http://www.jurysdoyle.com/ireland/doyle_burlington.htm), June 26-30. Further details to follow as they are available. CFP to follow shortly. Please feel free to spread this information far and wide.

-- Rich Bowen
ApacheCons are absolutely brilliant events, and not just for us committer folk. They are a great place to meet the people who make software tick, discuss new ideas, learn new skills and have a great time.
Pictured (clockwise from bottom left): Dirk-Willem van Gulik, Me, Henri Yandell, Greg Stein, Wilfredo Sánchez, Sander Striker, Justin Erenkrantz at the ApacheCon US 2005 Hackathon. More here from Julian Cash.
It will be in the swish Burlington Hotel, a mere 10 minutes’ walk from the City Centre, a stop on the Aircoach route and well-connected by major roads. See you there!
Justin Mason is a God
… at least according to Tim Bray.

He’s also got a photo of a seemingly disinterested me;

… getting a demo of the dtrace stuff. In fairness, dtrace is really impressive, and a few of us httpd committers hope to add dtrace inspection points to Apache over the coming months, as it looks really useful. Matty has already done some really useful work in this regard over here, where you’ll find mod_dtrace and some other interesting examples. They should be taken with a pinch of salt though: Matty isn’t quite getting how pools work and some of his examples misinterpret what’s going on, but it’s excellent work all the same.
I thought Tim’s keynote was actually very good, but I found one part of it really disappointing. Their raw numbers on the Niagara system were totally bogus. He quoted a figure of 25,000 requests/second from the box, both in the presentation and the article, but that’s really not its limit. It can do way, way better. We had a close look at the system and found amongst other things that the httpd was a 32-bit build, which was a bad start, and as Tim points out the benchmarking was being done with ab from a laptop. Since our dual Itanium box can push 25,000 reqs/second in its sleep, I’m guessing their Niagara box can easily push at least 4 times that number, especially with the event MPM. It maxed at 290 Mbit/second too, which is quite low (despite what Tim might think). Just last week we shipped 1.2Gbit/sec in production without actually noticing much at all, so again I’d say the Niagara box can at least quadruple that kind of number. Of course it’s great that the numbers were wrong in the conservative direction, the opposite of the usual corporate PR.
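For what it’s worth, here is a back-of-the-envelope sketch of that arithmetic. It assumes (the keynote figures don’t actually say so) that the 25,000 requests/second and the 290 Mbit/second came from the same benchmark run, and that 1 Mbit is 1,000,000 bits. The variable and function names are just made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumption (not stated in the post): the 25,000 requests/second and the
# 290 Mbit/second come from the same benchmark run, and 1 Mbit = 1,000,000 bits.

QUOTED_REQUESTS_PER_SEC = 25_000
QUOTED_MBIT_PER_SEC = 290
SCALE_FACTOR = 4  # the "at least 4 times" guess from the post


def avg_bytes_per_response(requests_per_sec, mbit_per_sec):
    """Average bytes on the wire per response at the given rates."""
    bytes_per_sec = mbit_per_sec * 1_000_000 / 8
    return bytes_per_sec / requests_per_sec


avg_bytes = avg_bytes_per_response(QUOTED_REQUESTS_PER_SEC, QUOTED_MBIT_PER_SEC)
scaled_requests_per_sec = QUOTED_REQUESTS_PER_SEC * SCALE_FACTOR
scaled_mbit_per_sec = QUOTED_MBIT_PER_SEC * SCALE_FACTOR

print(f"average response size: {avg_bytes:.0f} bytes")
print(f"scaled request rate:   {scaled_requests_per_sec} requests/second")
print(f"scaled throughput:     {scaled_mbit_per_sec / 1000:.2f} Gbit/second")
```

Under those assumptions the quoted run works out at about 1,450 bytes per response, and quadrupling it gives 100,000 requests/second at roughly 1.16 Gbit/second, which is in the same ballpark as the 1.2Gbit/sec production figure.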
The guys mentioned that they have some detailed Specweb stats, so it’ll be neat to see those. The platform did really “feel” fast (by which I mean responsive), not the usual sparc sense of treacle-like slowness, and dtrace really is an amazing utility. I know I’ll be looking very very seriously at the platform, and I already like their low-end X64 boxes as an alternative to Dell. So I have to agree with God, Sun really have made a gigantic sea-change, and it is kind of mind-blowing. Good stuff!
Update: I’ve applied for a free box from Sun for 60 days, to benchmark it myself, thoroughly.
@ Apachecon US
Apachecon US is starting in earnest now, and so far it’s great. The hackathons on Saturday and Sunday went well, with something like 80 committers tapping away. We made some progress on mod_ftp, some httpd stuff and I have some apr examples build system stuff sitting around to commit now. It was surprisingly productive. Yesterday myself and Justin had lunch with Cory Doctorow, which was really productive and he’s helped us out with a great amount of contacts and advice for Digital Rights Ireland. We also got a brief mention in this keynote, which is very cool. There are already over 450 attendees registered too.

On Friday I went into downtown San Diego and spent a few hours at LISA (interrupted by the World Cup draw) and the Gaslight district before heading out to a great music store way out in the suburbs. Cool place. The weather here is amazing, and it’s as hot as any Irish summer every lunch time. Considering this is December, I have to wonder what this place gets like in the middle of summer. The Navy docks are huge, and it can be a bit surprising to suddenly see 5 stories of submarine cruising past.
Apache httpd 2.2.0 released
After 3 years of cumulative work and 10 years to the day since the release of Apache 1.0.0, we’ve released Apache httpd 2.2.0. Here’s the announcement, and the list of major new features.
This version of Apache httpd has been burned in on pretty major sites, so here’s to hoping it gets some good reviews. Happy Birthday Apache!


