Archive for March, 2006
Ok, so yet more generosity from Sun today. I’ve been told we can keep the T2000! Incredibly, the T1000 is still on its way, and that will be with us some time next week – but I can now recycle the cardboard from the T2000. Brilliant!
We would actually be more than comfortable deploying a T2000 as a host for ftp.heanet.ie, but things are not that simple. We’d have to migrate the 12TB of data from XFS to UFS, for a start. We’d also prefer to buy a server anyway, with a support contract and so on (update: the donations actually come with SunService contracts). But the T2000 would give us better performance, and a much better NFS stack, which greatly expands our options for the future. When we next purchase a server for ftp.heanet.ie, Niagara is at the top of our list.
So, the T2000 will go to RedBrick, and we’re working on something else for the T1000. In the immediate future, it will serve as an excellent development and testing machine for Apache, and we’ll use it for the dtrace work (now well underway, by the way). It will also be incredibly useful for us to have a machine for ongoing comparison purposes, and maybe even some OpenSolaris hacking.
Stay tuned for more benchmarking work though; results will be coming soon.
Update: Sun have also very kindly offered to cover Nóirín’s travel costs to ApacheCon Europe in Dublin (read how this will help her, here). Nóirín has been doing a lot of work behind the scenes editing my blog-posts into readable English (and also raising the spelling above the level of a 10-year-old!) and with Sun’s help, the universe has been particularly efficient at getting the karmic reward in order.
Good news: Sun are donating a Niagara T1000 to us (and hence RedBrick), excellent! Damien, from the Sun performance team here in Ireland, called to tell us it’s on its way. We should have it in the next week or two. This isn’t part of the original contest, it’s from the Sun engineering team, and we’re very grateful. I’m sure RedBrick will put it to excellent use, and we’ll get them thinking about a server name right away!
After doing a lot of analysis today, we’ve figured out what was up with the benchmarks – we were inserting a systematic delay of about 30ms into every connection. How? By not using epoll() on the client side. Turns out that the blocking select() time in the clients was actually a very significant factor in how quickly the benchmarks were running.
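To make the select()-versus-epoll() point a bit more concrete, here is a minimal sketch of a client-side read loop driven by readiness events. It is not the benchmark client we actually ran; it is written in Python purely for illustration, and the function name `collect_responses` and the `expected_bytes` parameter are made up for the example. Python's `selectors.DefaultSelector` picks the most efficient mechanism the platform offers (epoll on Linux).

```python
import selectors
import socket


def collect_responses(socks, expected_bytes):
    """Read from many non-blocking sockets as each one becomes readable.

    Hypothetical helper: returns a dict mapping each socket to the bytes
    received from it. A socket is finished once it has delivered
    expected_bytes or the peer has closed the connection.
    """
    sel = selectors.DefaultSelector()  # epoll on Linux, other backends elsewhere
    received = {}
    for s in socks:
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ)
        received[s] = b""

    pending = len(socks)
    while pending:
        # Blocks only until at least one registered socket is readable.
        for key, _events in sel.select():
            s = key.fileobj
            chunk = s.recv(4096)
            received[s] += chunk
            if not chunk or len(received[s]) >= expected_bytes:
                sel.unregister(s)
                pending -= 1
    sel.close()
    return received
```

The point of the design is that the client wakes up as soon as any connection has data, so time spent waiting in the client is not added on top of the server's response time.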
What this means is that although the results from the previous benchmarks are almost certainly still in the correct order, and the comparison of relative performances is valid, the absolute requests per second numbers are invalid, and there’s a lot of room for improvement, probably beyond 25,000 requests per second. I’m now planning to give everything a serious try with the latest Nevada build, instead of update 1, which is what we’re currently running.
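Some back-of-the-envelope arithmetic shows why a fixed per-connection delay distorts the absolute numbers. This is only a rough sketch: it assumes the roughly 30ms is paid once per request and that each connection issues its requests strictly one after another, which is a simplification of how the clients really behaved. The function names are hypothetical.

```python
def max_rate_per_connection(delay_ms):
    """Upper bound on requests/second for one serial connection
    when every request carries a fixed delay of delay_ms."""
    return 1000 / delay_ms


def connections_needed(target_rps, delay_ms):
    """Smallest number of serial connections needed to reach target_rps
    when each request costs at least delay_ms (ceiling division)."""
    return -((-target_rps * delay_ms) // 1000)


print(round(max_rate_per_connection(30), 1))  # about 33.3 requests/second
print(connections_needed(25000, 30))          # 750 connections in flight
```

Under those assumptions, the client-side delay alone caps each connection at about 33 requests per second, regardless of how fast the server is.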
We also have a list of other cool things to try out, and from the sounds of things, the guys in EastPoint have extremely extensive knowledge about tunings and just what levels of performance can be achieved. As a lot of their work is being opened up as part of the OpenSolaris effort, there may well be some great ways Sun and Apache can help each other out. Hopefully, we might even see some of them at ApacheCon.
We’ve still got over a month with the T2000 left though, and plenty more to try out on it, so we’re not letting it rest just yet. It’s a very, very strong candidate for the next ftp.heanet.ie iteration, and it looks like there may be a solution within UFS to our single-threaded I/O numbers. ZFS would definitely help out there too.
It’s hard for me to look neutral anymore, since we’ve been donated a box, or maybe it’s easier – since we don’t really have anything to gain – but I have to say that I am very, very impressed, not only by Sun’s latest technologies and their approach to engineering (which has always been good), but also by the genuinely open and receptive nature they’ve shown. I’ve had contact with a few Sun employees during this trial, and they’ve all been extremely helpful. There was never any hint of pressure, or even corporate schmooze, just simple and honest advice, which is good to see. I still think Solaris has a bit to go before we could mass-deploy it in production again, mainly to do with the lack of a decent packaging system. That said, Sun now looks like an organisation which is genuinely receptive to feedback, so before the 60 days are up, I’ll try and write a blog post comparing Solaris to other platforms, for common administrative and automation tasks.
Anyway, thanks Sun, especially all of the people who’ve been speaking to me over the past few days – we’ll put the server to good use.