I wish I had saved some MRTG graphs, but the R510 has now replaced a decrepit 2-CPU, SATA-based generic shitbox as the sole MTA and MUA for close to 10,000 users. Disk IO is 10X faster (the 2x12-core CPU probably 100X). The old box used to spike to load averages well above 100 whenever the Monday morning newsletter got sent out to all 10,000 recipients, or when some hapless user forwarded their entire inbox to Hotmail. No more. I have yet to see the load average spike above 3. Flat-line. Everyone gets their email a few seconds after it's sent. Best Box Ever.
By the way, we upgraded the MTA/MUA software, CommuniGate Pro, at the same time. If you have the bucks, buy it. It isn't a nightmare to install and configure, like Sendmail or Postfix; support is excellent; it has a web browser interface for users too inept to install Thunderbird; and, unlike Exchange, it is standards compliant and doesn't need a $5000 war chest of tools for backup and administration.
[ view entry ] ( 21 views ) | permalink
Still doomed to have all-local storage on my hosts, I desperately needed something new to host email services for 5000 people. It's a worst-case scenario - 6 million tiny files. We try to spread it out over as many filesystems as possible. The old box has an oldish OS, three XFS filesystems, and is at 100% iowait a lot of the time.
I selected a Dell R510 since Sun is basically out of business (all the sales people seem to have been sacked, and Oracle doesn't seem to have realized yet that Sun made computers). I selected Xeon 5650 processors to take advantage of a 1.3GHz bus, and got the box fully loaded with 14 disks. The disks have been configured as 7 RAID1 devices.
The proof is in the numbers. Here is some iostat output from testing the (ext3) filesystems by rsyncing one filesystem with 2 million files to 2 other filesystems:
avg-cpu:   %user   %nice    %sys %iowait   %idle
            3.42    0.00    2.64    6.63   87.32
Device:  rrqm/s    wrqm/s      r/s     w/s     rsec/s    wsec/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb1     153.07    137.53  1915.09    7.70  180466.57   1161.82  90233.28    580.91    94.46     1.76    0.92   0.36  69.68
sdc1       0.00  11401.15     3.05  253.77      23.99  93244.58     11.99  46622.29   363.16    60.19  234.41   1.69  43.51
sdd1       0.00  10175.01     1.10  299.80       8.80  83828.49      4.40  41914.24   278.62    46.66  153.69   1.57  47.26
Basically, I'm getting 1800+ read IOPS (the r/s column) and 500+ write IOPS (w/s, summed over the two target filesystems) through the R510's H700, and it's still loafing. In addition, in each generation of PowerEdges the out-of-band (iDRAC) management and server monitoring tools have gotten a little better, until they are finally easy to set up. Not too bad.
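If you'd rather not add up those columns by eye, something like the following rough sketch could do it. It assumes the old sysstat column order shown in this post (r/s in the 4th field, w/s in the 5th) and that the devices of interest are named sd*; the function name sum_iops is made up for illustration.

```shell
#!/bin/sh
# Sum the r/s and w/s columns of one extended iostat report read from stdin.
# Assumes the column layout shown in this post: device name in field 1,
# r/s in field 4, w/s in field 5. Newer sysstat versions order the columns
# differently, so check the header line before trusting these field numbers.
sum_iops() {
    awk '
        # Only device lines (names starting with "sd") are counted;
        # the header and avg-cpu lines are skipped.
        $1 ~ /^sd/ { reads += $4; writes += $5 }
        END { printf "read_iops=%.2f write_iops=%.2f\n", reads, writes }
    '
}

# Usage sketch: save a single report to a file, then run
#   sum_iops < one_report.txt
```

Fed the three device lines from the test above, this should print read_iops=1919.24 write_iops=561.27.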
[ view entry ] ( 94 views ) | permalink
Is anyone else having this problem with Nagios 3.2.1 (the current version)? It seems to go insane from time to time, and when I look, ownerships are messed up on nagios.cmd and the config files, and nagios.cmd is occasionally transmogrified from a named pipe to a plain file (with the wrong ownership as well). When that happens, plugins don't report back, my active checks all break, and everyone gets paged for no reason.
I've seen some comments that imply SELinux might somehow be responsible, and only on Red Hat / CentOS. I'm running SELinux in "permissive" mode but I might as well get rid of it altogether. I'll report back if anything (doesn't) happen.
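In the meantime, a quick way to catch the problem before the pager does might be a check along these lines. This is only a sketch: the function name is made up, and the path and owner you pass in are assumptions that depend on how your Nagios was installed.

```shell
#!/bin/sh
# Report whether a Nagios command file is still a named pipe owned by
# the expected user. Both arguments are assumptions to adjust locally.
check_cmd_pipe() {
    pipe_path=$1       # e.g. /usr/local/nagios/var/rw/nagios.cmd (assumed location)
    expected_owner=$2  # e.g. nagios (assumed user name)

    if [ ! -p "$pipe_path" ]; then
        echo "CRITICAL: $pipe_path is not a named pipe"
        return 2
    fi

    # stat -c is the GNU coreutils form, which Red Hat and CentOS ship.
    owner=$(stat -c %U "$pipe_path")
    if [ "$owner" != "$expected_owner" ]; then
        echo "WARNING: $pipe_path is owned by $owner, expected $expected_owner"
        return 1
    fi

    echo "OK: $pipe_path is a named pipe owned by $owner"
    return 0
}

# Usage sketch, e.g. from cron:
#   check_cmd_pipe /usr/local/nagios/var/rw/nagios.cmd nagios
```

The return codes loosely follow the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL), though running it from Nagios itself is obviously not much use when Nagios is the thing that is broken.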
[ view entry ] ( 129 views ) | permalink
Until I started this job, I didn't know about Whiptail. No link to a project page here - it doesn't seem to exist as a standalone Open Source project anywhere (it ships as part of the newt library), but it comes with most Linux distros. This app pops up curses-style dialogs, forms, and lists in a terminal. How come I didn't know about it until recently? I wouldn't have had to learn all that fancy web stuff.
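For anyone else who hasn't met it, here is a minimal sketch of what using it looks like. The function names and the service list are made up for illustration; the --yesno and --menu options and the trailing height/width arguments are standard whiptail.

```shell
#!/bin/sh
# Ask a yes/no question in a whiptail dialog (8 rows high, 60 columns wide).
# whiptail exits 0 for Yes, 1 for No, and 255 if the user presses Esc.
confirm() {
    whiptail --title "Confirm" --yesno "$1" 8 60
}

# Show a menu and print the chosen tag on stdout.
# whiptail writes the selection to stderr, so stdout and stderr are swapped
# through spare file descriptor 3 to make the result capturable with $(...).
# The service names below are hypothetical examples.
choose_service() {
    whiptail --title "Services" --menu "Pick a service to restart" 15 60 3 \
        "nagios"  "Monitoring daemon" \
        "postfix" "Mail transfer agent" \
        "httpd"   "Web server" \
        3>&1 1>&2 2>&3
}

# Usage sketch (needs a real terminal):
#   service=$(choose_service) && confirm "Restart $service?" \
#       && echo "would restart $service"
```

The file descriptor shuffle is the only non-obvious part: without it, the menu choice lands on stderr and the command substitution captures nothing.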
[ view entry ] ( 133 views ) | permalink
I needed a new server, and management found me an X4100 at a garage sale. Not a bad server, but it's been EOLed by Sun, and Sun either never shipped mpt tools with the box, dropped support, or they got tossed in the dumpster when Oracle moved in.
Anyway, you will probably want to monitor your LSI MPT RAID if you find one, so here's how to do it if your distro does not come with the "mpt-status" command:
- Obtain mpt-status from http://freshmeat.net/projects/mptstatus/
- Obtain the X4100 resource CD from Sun. You may have to pay for this. Hopefully you got one with your box. I have an ISO file called X4100_X4200_ResourceCD_4.
- Install the mpt driver from the RPMs on the CD: mptlinux-4.00.05.00-1-rhel5.x86_64.rpm
- Activate the mptctl driver (your distro should have come with mptbase and mptsas): "/etc/rc3.d/S99fusion.mptctl start". Set up an rc3.d link to start this driver on boot!
- You should see mptctl, mptsas, mptscsih (maybe), and mptbase in the output of lsmod at this point. If not, keep hunting for drivers.
- Also on the Sun CDROM is mptlinux-4.00.05.00-src.tar.gz. Create the directory and extract this source into /tmp/mptlinux-4.00.05.00-src.
- Extract the mpt-status source into /tmp/mpt-status-1.2.0.
- Edit the mpt-status Makefile so that it contains:
    KERNEL_PATH := /usr/src/kernels/2.6.18-164.15.1.el5-x86_64/include
    CFLAGS := -Iincl -Wall -W -O2 \
        -I${KERNEL_PATH} \
        -I/tmp/mptlinux-4.00.05.00-src/message/fusion
- Run make, and it works!
# ./mpt-status -i 2
ioc0 vol_id 2 type IM, 2 phy, 67 GB, state OPTIMAL, flags ENABLED
ioc0 phy 1 scsi_id 4 SEAGATE ST973401LSUN72G 0556, 68 GB, state ONLINE, flags NONE
ioc0 phy 0 scsi_id 3 SEAGATE ST973401LSUN72G 0556, 68 GB, state ONLINE, flags NONE
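Since the whole point is monitoring, here is a rough sketch of turning that output into a pass/fail check. It assumes the line format shown above (volume lines containing "vol_id" and reporting state OPTIMAL, disk lines of the form "phy N scsi_id" reporting state ONLINE); the function name is made up, and anything other than those two states is simply treated as bad.

```shell
#!/bin/sh
# Read mpt-status output on stdin and report RAID health.
# Assumes the line format shown in this post: volume lines contain "vol_id"
# and should say "state OPTIMAL"; disk lines contain "phy N scsi_id" and
# should say "state ONLINE". Exit codes follow the Nagios plugin
# convention (0 = OK, 2 = CRITICAL).
check_mpt_status() {
    awk '
        / vol_id /             { vols++;  if ($0 !~ /state OPTIMAL/) bad++ }
        / phy [0-9]+ scsi_id / { disks++; if ($0 !~ /state ONLINE/)  bad++ }
        END {
            if (vols == 0) { print "CRITICAL: no volumes found in input"; exit 2 }
            if (bad > 0)   { printf "CRITICAL: %d volume(s)/disk(s) not healthy\n", bad; exit 2 }
            printf "OK: %d volume(s), %d disk(s) healthy\n", vols, disks
        }
    '
}

# Usage sketch:
#   ./mpt-status -i 2 | check_mpt_status
```

Treating empty input as CRITICAL is deliberate: if the mptctl driver is not loaded, mpt-status prints nothing useful, and that should not count as a healthy array.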
[ view entry ] ( 194 views ) | permalink
