Here's the hardware setup:
- Storage array: MSA P2000 G3 10GbE iSCSI with dual controllers, each with a pair of 10GbE NICs (total 4 x 10GbE). The system includes two D2700 expansion units.
- Switch: 2 x HP BladeSystem 6120XG.
- Servers: HP BL460c Gen8 - one of the models with the HP FlexFabric 10Gb 2-port 554FLB FlexibleLOM, which provides hardware accelerated iSCSI.
- Hypervisor: VMware ESXi 5.1.
- OS: SUSE Linux Enterprise Server 11 SP2.
- Enterprise drives: 900GB 6G SAS 10K rpm SFF (2.5"); tested with a group of 10 in RAID-10.
- Midline drives: 1TB 6G SAS 7.2K rpm SFF (2.5"); tested with a group of 14 in RAID-10.
The diagram below shows how the pieces are connected. Each array controller has one port connected to each of the two switches. The ports going to one switch are in one iSCSI VLAN and the ports going to the second switch are in a second VLAN; this allows multipath clients to choose the most direct path and ensures paths aren't crossing the uplinks between the switches.
Inside the blade enclosure each server has two connections, one to each switch. On the first switch the untagged VLAN for the server ports is the first iSCSI VLAN, and on the second switch it's the second VLAN. The hardware iSCSI initiator uses the untagged VLAN; there are also tagged VLANs on each server port, used for VMware management and virtual machine networks.
[Figure: wiring diagram]
There are also interlinks between the switches, and uplinks to top-of-rack switches, none of which carry iSCSI traffic.
The hardware initiators show up in VMware as storage HBAs and each is assigned an IP in the corresponding iSCSI VLAN. The NICs also show up as a pair of regular ethernet NICs and can be teamed as you'd expect for load balancing management and virtual machine networks.
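The two-VLAN layout described above can be modeled in a few lines of Python to check the claim that no iSCSI path crosses the links between the switches. This is only a sketch: the VLAN IDs and the port and HBA names are made up for illustration and do not come from the actual configuration.

```python
from itertools import product

# Hypothetical VLAN IDs; the post does not give the real ones.
VLAN_A, VLAN_B = 101, 102

# Each storage controller has one port on each switch.
# Tuples are (name, switch, vlan); names are illustrative only.
target_ports = [
    ("ctrlA-p1", "switch1", VLAN_A),
    ("ctrlA-p2", "switch2", VLAN_B),
    ("ctrlB-p1", "switch1", VLAN_A),
    ("ctrlB-p2", "switch2", VLAN_B),
]

# Each blade has one hardware initiator (HBA) per switch,
# sitting in that switch's untagged iSCSI VLAN.
initiators = [
    ("hba1", "switch1", VLAN_A),
    ("hba2", "switch2", VLAN_B),
]

def usable_paths(initiators, target_ports):
    """Return initiator/target pairs that share an iSCSI VLAN."""
    return [
        (i_name, t_name, i_switch, t_switch)
        for (i_name, i_switch, i_vlan), (t_name, t_switch, t_vlan)
        in product(initiators, target_ports)
        if i_vlan == t_vlan
    ]

paths = usable_paths(initiators, target_ports)
for i_name, t_name, i_switch, t_switch in paths:
    crosses = i_switch != t_switch
    print(f"{i_name} -> {t_name} via {i_switch} (crosses interlink: {crosses})")
```

In this model each server ends up with four paths (two per HBA, one to each controller), and every path stays on a single switch.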
I used fio to benchmark the enterprise and midline drives separately. Because I was looking to simulate an Oracle ASM system with enterprise drives doing on-disk backup to midline drives, the filesystem setup is a bit inconsistent: the enterprise drive tests use a plain disk (VMDK) while the midline tests use an ext3 filesystem on top of a disk. The effect should be fairly minimal.
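The post doesn't include the actual fio job files, so the following is only a guess at what jobs matching the parameters reported in the results below (1 MB sequential reads with 16 jobs; 8 KB random I/O with 4 jobs and a 50/50 read/write mix) might look like. The helper `render_fio_job`, the `/dev/sdX` target, and the `direct`, `runtime`, and `time_based` settings are assumptions, not the settings that were actually used.

```python
def render_fio_job(name, options):
    """Render one fio job section in fio's INI-style job file format."""
    lines = [f"[{name}]"]
    for key, value in options.items():
        # Flag-style options such as group_reporting take no value.
        lines.append(key if value is None else f"{key}={value}")
    return "\n".join(lines) + "\n"

# Placeholder target; /dev/sdX is hypothetical, not from the post.
TARGET = "/dev/sdX"

# Assumed settings shared by both jobs (not stated in the post).
COMMON = {
    "filename": TARGET,
    "direct": 1,          # bypass the guest page cache
    "runtime": 60,        # seconds
    "time_based": None,
    "group_reporting": None,
}

# Sequential read, 1 MB blocks, 16 parallel jobs.
seq_read = render_fio_job(
    "seq-read-1m",
    {"rw": "read", "bs": "1m", "numjobs": 16, **COMMON},
)

# Random I/O, 8 KB blocks, 4 parallel jobs, 50% read / 50% write.
rand_mix = render_fio_job(
    "rand-8k-mix",
    {"rw": "randrw", "rwmixread": 50, "bs": "8k", "numjobs": 4, **COMMON},
)

print(seq_read)
print(rand_mix)
```

Each rendered section can be saved to a file and passed to `fio` on the command line. Changing `rwmixread` is how the read/write ratio would be varied for the mixed workload.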
Sequential I/O, 1 MB blocks:
- Enterprise (10K RPM) disk group: read 366429 KB/sec (split across 16 parallel jobs)
- Midline (7.2K RPM) disk group: read 380451 KB/sec (1 job only)
- Separately, I measured a write speed of 348642 KB/sec
The enterprise sequential numbers should have been higher but I was using a ridiculous number of parallel jobs (16) to simulate a particular workload. The midline disk group also has more spindles (14 midline drives compared to 10 enterprise disks) which may have helped.
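To put the sequential numbers in context against the 10GbE links, here is a quick conversion to Gbit/s. It assumes that fio's "KB" means 1024 bytes, which may not hold for every fio version, and it ignores iSCSI and TCP overhead.

```python
KIB = 1024  # assumption: fio's "KB" here means 1024 bytes

# Sequential figures copied from the results above, in KB/sec.
sequential_kb_s = {
    "enterprise read (16 jobs)": 366429,
    "midline read (1 job)": 380451,
    "write": 348642,
}

def kb_s_to_gbit_s(kb_s, bytes_per_kb=KIB):
    """Convert a KB/sec throughput figure to gigabits per second."""
    return kb_s * bytes_per_kb * 8 / 1e9

for label, kb_s in sequential_kb_s.items():
    print(f"{label}: {kb_s_to_gbit_s(kb_s):.2f} Gbit/s")
```

Under that assumption all three figures land at roughly 3 Gbit/s, so none of them comes close to the line rate of a single 10GbE port.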
Random I/O, 4 parallel jobs, 8 KB blocks, 50% read, 50% write:
- Enterprise (10K RPM) disk group: read 5481 KB/sec, write 5482 KB/sec, total 10963 KB/sec
  - 1370 IOPS, read latency 5.3 +/- 3.2 ms, write latency 0.51 +/- 0.46 ms
- Midline (7.2K RPM) disk group: read 3611 KB/sec, write 3602 KB/sec, total 7213 KB/sec
  - 901 IOPS, read latency 8.3 +/- 5.3 ms, write latency 0.51 +/- 0.98 ms
You can see the effect of the slower seeks here: the difference in bandwidth is roughly proportional to the difference in rotational speed. The IOPS numbers above are reasonably close to the theoretical (1/seek time) x number of drives in RAID-10.
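The arithmetic behind those statements can be checked with a short script. It derives IOPS from the total throughput and the 8 KB block size, then inverts the (1/seek time) x number of drives rule to see what per-drive time per I/O the measurements imply. Treat it as a rough sanity check only: the simple formula ignores RAID-10 write mirroring and the array's cache, and the function names are just for illustration.

```python
BLOCK_KB = 8  # block size used in the random I/O test

# Figures copied from the random I/O results above.
groups = {
    "enterprise": {"total_kb_s": 10963, "drives": 10, "rpm": 10000},
    "midline": {"total_kb_s": 7213, "drives": 14, "rpm": 7200},
}

def iops_from_bandwidth(total_kb_s, block_kb=BLOCK_KB):
    """IOPS implied by a throughput figure at a fixed block size."""
    return total_kb_s / block_kb

def implied_service_ms(drives, iops):
    """Per-drive time per I/O if IOPS = (1 / service time) x drives."""
    return drives / iops * 1000.0

def half_rotation_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    return 60_000.0 / rpm / 2.0

for name, g in groups.items():
    iops = iops_from_bandwidth(g["total_kb_s"])
    print(f"{name}: {iops:.1f} IOPS, "
          f"implied {implied_service_ms(g['drives'], iops):.1f} ms per I/O per drive, "
          f"average rotational latency {half_rotation_ms(g['rpm']):.1f} ms")

bw_ratio = groups["enterprise"]["total_kb_s"] / groups["midline"]["total_kb_s"]
rpm_ratio = groups["enterprise"]["rpm"] / groups["midline"]["rpm"]
print(f"bandwidth ratio {bw_ratio:.2f}, rotational speed ratio {rpm_ratio:.2f}")
```

The throughput-derived IOPS match the reported 1370 and 901 to within rounding. The implied per-drive service times come out at about 7.3 ms for the enterprise group and 15.5 ms for the midline group, and the bandwidth ratio is about 1.52 against a rotational speed ratio of about 1.39.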
The low write latencies are due to the write cache on the storage array. This was kind of counter-intuitive: the result is that random writes are faster than random reads, at least until the cache fills up. I tested with a 50% read / 50% write mix, but when I turned it up to 90% reads my total throughput went way down.