Archive for 'niagara'

More Ubuntu on T2000

Posted on April 13, 2006, under apache, general, niagara.

Fabbione, the Ubuntu-sparc maintainer, got in touch and helped me out with our Ubuntu on T2000 issues. It turns out that installing dapper can be a bit sensitive to changes in the archive while the install is running. A re-install fixed the problems, and this issue will disappear entirely once dapper is marked stable – as the archive will settle down.

He also pointed me at Dave Miller’s latest kernel with T1 fixes, which I’ve built from git, and showed me the way to the libc6-sparcv9v and libc6-sparc64v packages, which contain runtime optimisations for the platform. And the result is stunning. Ubuntu is now outperforming even Solaris Express, and we’re sustaining 22,183.43 requests per second – using out of the box Apache 2.2.0. No kernel tuning, no Apache tuning, nothing beyond “CFLAGS=-Os” was applied.
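To put that figure next to the 18,210.57 requests per second from the first Ubuntu run (reported in the earlier post below), here is a quick back-of-the-envelope sketch. The function name is only an illustrative choice, and this is plain arithmetic on the two published numbers, not a new measurement.

```python
def percent_gain(before: float, after: float) -> float:
    """Return the relative improvement of `after` over `before`, in percent."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (after - before) / before * 100.0


# Figures quoted in the two posts (requests per second).
first_run = 18210.57   # first Ubuntu run, earlier post
second_run = 22183.43  # newer kernel plus the optimised libc6 packages

gain = percent_gain(first_run, second_run)
print(f"{gain:.1f}% more requests per second")
```

That works out to a gain of a little over a fifth from the kernel and libc changes alone.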

Frankly, I’m amazed. Amazed enough that I’ve rebooted into Nevada twice now just to confirm it’s not a change in the test environment. This machine just gets better and better, and Linux/Ubuntu really helps it get there. Of course things like hardware SSL acceleration don’t quite work in Linux yet, but I’m sure it’ll get there.

Now I know we’re going to be buying a few T1 boxes this year, and although I’ll be using Solaris for debugging and development work (where it is a superb environment), it’s looking less and less attractive for production deployments.

Ubuntu on a T2000

Posted on April 12, 2006, under apache, general, niagara.

After going through the pain of setting up a rarp daemon and debugging a very odd tftp problem (it turns out Sun’s OBP tftp client and tftpd-hpa do not interoperate), we finally got Ubuntu installed on the T2000. I absolutely love old-school boot managers. I just really like the simplicity of being able to connect to a serial line (especially from our Cyclades), and do anything. With the T2000 I can interrupt to the OBP, but also to the ALOM (advanced lights out manager) and the system controller. It’s great, but it could do with a “boot pxe” option :-)

The debian-install system isn’t as optimised for 9600 baud serial as the Nevada or Solaris 10 installers were, so it is occasionally annoying to sit through the screen refresh, but overall the install took much less time (about 30 minutes, compared to about 1 hour and 30 minutes) and worked without any big problems.

The ubuntu sparc distribution is a bit hairy right now, and I wouldn’t recommend it for anyone who isn’t very experienced with linux. For example, out of the box, /dev/null, /dev/zero, /dev/random and more have the wrong permissions – which breaks a lot of things – and the dummy start-stop-daemon binary has been left installed, which means that almost no init scripts actually work.
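A quick way to spot the device-permission problem is to check the nodes directly. The following is a minimal sketch, assuming the usual Linux convention that these nodes are character devices readable and writable by everyone (mode 0666); the function names and the list of paths are illustrative choices rather than anything from the installer.

```python
import os
import stat

# Assumption: these nodes should be character devices that every user
# can read and write (mode 0666), as on a typical Linux system.
EXPECTED_DEVICES = ("/dev/null", "/dev/zero", "/dev/random")


def mode_is_world_rw(mode: int) -> bool:
    """True if `mode` is a character device with rw for user, group and other."""
    wanted = 0o666
    return stat.S_ISCHR(mode) and (stat.S_IMODE(mode) & wanted) == wanted


def find_bad_devices(paths=EXPECTED_DEVICES):
    """Return the paths that are missing or lack world read/write access."""
    bad = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # A missing or unreadable node is also worth reporting.
            bad.append(path)
            continue
        if not mode_is_world_rw(mode):
            bad.append(path)
    return bad


if __name__ == "__main__":
    for path in find_bad_devices():
        print(f"check permissions on {path}")
```

On a healthy system this prints nothing; any path it does print is a candidate for a manual look with ls -l.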

However, for all that, apt does work – and it kicks the living daylights out of anything Solaris has to offer in the way of patch or package management. It took me about 3 minutes to get a fully working compile environment up and running, and about 20 seconds to get the box fully up to date. Sun badly need this kind of ease-of-deployment, or there’s simply no way people like me could consider deploying Solaris on hundreds of machines. We keep over 100 Debian boxes up-to-date with an average of about 15 minutes’ work per week, using a combination of apt, sudo and our own apticron. Solaris simply has nothing that competes.
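To give a flavour of why this is so little work, here is a rough sketch of driving apt across a set of machines. It is not a description of the setup mentioned above: the host names are hypothetical, and it assumes key-based ssh plus sudo rights that do not prompt for a password. It only builds and prints the commands, so nothing is changed until you choose to run them.

```python
import shlex

# Hypothetical host names; substitute your own inventory.
HOSTS = ["web1.example.org", "web2.example.org"]


def upgrade_commands(host: str) -> list[list[str]]:
    """Build the ssh invocations that would refresh and upgrade one host."""
    return [
        ["ssh", host, "sudo", "apt-get", "update"],
        ["ssh", host, "sudo", "apt-get", "-y", "upgrade"],
    ]


def plan(hosts=HOSTS) -> list[str]:
    """Return the commands as shell-quoted strings, without running anything."""
    return [shlex.join(cmd) for host in hosts for cmd in upgrade_commands(host)]


if __name__ == "__main__":
    # Dry run: print the plan so it can be reviewed before use.
    for line in plan():
        print(line)
```

Keeping it as a dry run is deliberate: an unattended upgrade across many boxes deserves a look at the plan first.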

Ok, problem number 1 is that Solaris isn’t a distro, so it can’t really provide updates for most packages, but there isn’t even a credible way of upgrading the native Solaris-bundled stuff. Solaris may be great at providing the tick-box-compatible certification necessary for running many commercial applications, but it really is dire when compared to a good Linux distro in terms of ease of administration. It’s literally a waste of time. Everyone in Sun who works on Solaris should be forced to abandon BFU and start using what their customers have to deal with. Allow to bake for about 10 weeks, and out will come apt.

Anyway, back to Ubuntu:

colmmacc@murphy:~$ cat /proc/cpuinfo
cpu             : UltraSparc T1 (Niagara)
fpu             : UltraSparc T1 integrated FPU
prom            : OBP 4.19.0 2005/10/27 17:24
type            : sun4v
ncpus probed    : 32
ncpus active    : 32
D$ parity tl1   : 0
I$ parity tl1   : 0
Cpu0Bogo        : 2001.08
Cpu0ClkTck      : 000000003b9aca00
Cpu1Bogo        : 2000.02
Cpu1ClkTck      : 000000003b9aca00
Cpu2Bogo        : 2000.02
Cpu2ClkTck      : 000000003b9aca00
Cpu3Bogo        : 2000.01
Cpu3ClkTck      : 000000003b9aca00
Cpu4Bogo        : 2000.03
Cpu4ClkTck      : 000000003b9aca00
Cpu5Bogo        : 2000.03
Cpu5ClkTck      : 000000003b9aca00
Cpu6Bogo        : 2000.01
Cpu6ClkTck      : 000000003b9aca00
Cpu7Bogo        : 2000.02
Cpu7ClkTck      : 000000003b9aca00
Cpu8Bogo        : 2000.03
Cpu8ClkTck      : 000000003b9aca00
Cpu9Bogo        : 2000.03
Cpu9ClkTck      : 000000003b9aca00
Cpu10Bogo       : 2000.02
Cpu10ClkTck     : 000000003b9aca00
Cpu11Bogo       : 2000.02
Cpu11ClkTck     : 000000003b9aca00
Cpu12Bogo       : 2000.03
Cpu12ClkTck     : 000000003b9aca00
Cpu13Bogo       : 2000.02
Cpu13ClkTck     : 000000003b9aca00
Cpu14Bogo       : 2000.02
Cpu14ClkTck     : 000000003b9aca00
Cpu15Bogo       : 2000.02
Cpu15ClkTck     : 000000003b9aca00
Cpu16Bogo       : 2000.03
Cpu16ClkTck     : 000000003b9aca00
Cpu17Bogo       : 2000.02
Cpu17ClkTck     : 000000003b9aca00
Cpu18Bogo       : 2000.02
Cpu18ClkTck     : 000000003b9aca00
Cpu19Bogo       : 2000.02
Cpu19ClkTck     : 000000003b9aca00
Cpu20Bogo       : 2000.03
Cpu20ClkTck     : 000000003b9aca00
Cpu21Bogo       : 2000.02
Cpu21ClkTck     : 000000003b9aca00
Cpu22Bogo       : 2000.02
Cpu22ClkTck     : 000000003b9aca00
Cpu23Bogo       : 2000.02
Cpu23ClkTck     : 000000003b9aca00
Cpu24Bogo       : 2000.03
Cpu24ClkTck     : 000000003b9aca00
Cpu25Bogo       : 2000.02
Cpu25ClkTck     : 000000003b9aca00
Cpu26Bogo       : 2000.03
Cpu26ClkTck     : 000000003b9aca00
Cpu27Bogo       : 2000.02
Cpu27ClkTck     : 000000003b9aca00
Cpu28Bogo       : 2000.03
Cpu28ClkTck     : 000000003b9aca00
Cpu29Bogo       : 2000.02
Cpu29ClkTck     : 000000003b9aca00
Cpu30Bogo       : 2000.02
Cpu30ClkTck     : 000000003b9aca00
Cpu31Bogo       : 2000.02
Cpu31ClkTck     : 000000003b9aca00
MMU Type        : Hypervisor (sun4v)
CPU0:           online
CPU1:           online
CPU2:           online
CPU3:           online
CPU4:           online
CPU5:           online
CPU6:           online
CPU7:           online
CPU8:           online
CPU9:           online
CPU10:          online
CPU11:          online
CPU12:          online
CPU13:          online
CPU14:          online
CPU15:          online
CPU16:          online
CPU17:          online
CPU18:          online
CPU19:          online
CPU20:          online
CPU21:          online
CPU22:          online
CPU23:          online
CPU24:          online
CPU25:          online
CPU26:          online
CPU27:          online
CPU28:          online
CPU29:          online
CPU30:          online
CPU31:          online
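For anyone who would rather not read 130 lines by eye, here is a small sketch that pulls the interesting numbers out of output in this format. It assumes the ClkTck field is the CPU tick rate in hexadecimal ticks per second; the function names and the shortened sample are illustrative.

```python
def parse_cpuinfo(text: str) -> dict[str, str]:
    """Split 'key : value' lines from /proc/cpuinfo into a dictionary."""
    fields = {}
    for line in text.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def summarise(fields: dict[str, str]) -> dict[str, float]:
    """Count online CPUs and convert CPU 0's hex tick rate to MHz."""
    online = sum(
        1 for key, value in fields.items()
        if key.startswith("CPU") and value == "online"
    )
    clock_hz = int(fields["Cpu0ClkTck"], 16)
    return {"online": online, "cpu0_mhz": clock_hz / 1e6}


# A shortened sample in the same format as the listing above.
SAMPLE = """\
ncpus active    : 32
Cpu0Bogo        : 2001.08
Cpu0ClkTck      : 000000003b9aca00
CPU0:           online
CPU1:           online
"""

if __name__ == "__main__":
    print(summarise(parse_cpuinfo(SAMPLE)))
```

The hex value 3b9aca00 is 1,000,000,000, so under that assumption each of the 32 hardware threads is ticking at 1 GHz.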

Nice! But how does it perform? Surprisingly well actually. Linux performs much better on the single-threaded I/O test. Again, I’ll hold off on the graphs, but the dd tests under Linux were roughly 10 times faster than Solaris. In rather an odd way, this didn’t translate into better single-download performance though – instead, under Linux we could only push about 60MB/sec.

And now for the number that really matters: requests per second. Linux managed a neat 18,210.57 requests per second, which is within 10% of Nevada and more or less identical to Solaris 10. I should point out that the ridiculous Linux OOM killer did rear its ugly head during our more insane scalability testing though (try starting 500,000 threads due to a typo and you’ll find out all about it!). Solaris handles OOM much more gracefully IMO.

Another note is that “make -j 64” under Linux built Apache (and apr, apr-util) in about 2 minutes, compared to 5 minutes for “dmake -j 64” under Solaris. That could be due to almost anything, though, and it probably isn’t indicative of better FLOP performance. All in all, I’d say that Linux – which only started working on T1 a few short weeks ago – compares reasonably favourably with Solaris on the T2000. I’d also say that the T2000 is even better value for money with this information in hand, because it presents a greater range of options.

I wouldn’t rush out and run Linux on the box in production quite yet though. Ubuntu-sparc still needs some work, and there are doubtless many T1 kernel bugs yet to be found and ironed out. dtrace still represents a huge win on Solaris, and when the T1000 arrives, I can see us running dtrace dozens of times a day – it’s already helped me determine a huge amount of information useful for tuning Apache, and for reworking some of the code. But that said, for a production environment, once the Linux kernel and the Ubuntu distro get a bit more stable, they would be my personal choices for the T2000. The working day is just too short to sacrifice the wins of apt.

This is probably my last really large post on the T2000, as in a week, I’m going to drive it over to RedBrick, but it’s been great fun benchmarking the box, and I’m very, very grateful to Sun – for both their kind donations, and the opportunity to test the platform. It is very, very impressive kit. My Scaling Apache talk is going to be on at ApacheCon EU 2006 (check out the new logo btw), and I’ll be including some more detail there too, including some of the more useful information we’ve gleaned from using dtrace (I’m currently working on a per-nanosecond break down of an Apache request – how cool is that!), so do come along to that.