Part 1: Initial Set-up and Testing
This blog post describes how we built a high-performing NAS server using off-the-shelf components and open source software (FreeNAS). The NAS has the following characteristics:
- total cost (before tax & shipping): $2,631
- total usable storage: 16.6 TiB
- cost / usable GiB: $0.15/GiB
- IOPS: 884 
- sequential read: 1882MB/s 
- sequential write: 993MB/s 
- double-parity RAID: RAID-Z2
[2014-10-31 We have updated the performance numbers. The old numbers were wrong (they were too low). Specifically, the old performance numbers were generated by a benchmark which was CPU-bound, not disk-bound. We re-generated the numbers by running 8 benchmarks in parallel and aggregating the results]
[Author’s note & disclosure: This FreeNAS server is a personal project, intended for my home lab. At work we use an EMC VNX5400, with which we are quite pleased—in fact we are planning to triple its storage. I am employed by Pivotal Labs, which is partly-owned by EMC, a storage manufacturer]
1. The Equipment
Prices do not include tax and shipping. Prices were current as of September, 2014.
- $375: 1 × Supermicro A1SAi-2750F Mini-ITX 20W 8-Core Intel C2750 motherboard. This motherboard includes the 8-core Intel Atom C2750. We chose this particular motherboard and chipset combination for its mini-ITX form factor (space is at a premium in our setup) and its low TDP (thermal design power), which is better for the environment, lowers electricity bills, and makes heat dissipation a non-issue. The low TDP allows for passive CPU cooling (i.e. no CPU fan).
- $372: 4 × Kingston KVR13LSE9/8 8GB ECC SODIMM. 32GiB is a good match for our aggregate drive size (28TB), for ZFS allocates approximately 1GiB RAM for every 1TB raw storage (i.e. we use 28GiB for ZFS, leaving 4GiB for the operating system and L2ARC).
- $1,190: 7 × Seagate 4TB NAS HDD ST4000VN000. According to Calomel.org, ‘You do not have the get the most expensive SAS drives … Look for any manufacture [sic] which label their drives as RE or Raid Enabled or “For NAS systems.”‘ Calomel.org goes on to warn against using power saving or ECO mode drives, for they have poor RAID performance.
- $238: 1 × LSI SAS 9211-8i 6Gb/s SAS Host Bus Adapter. Although expensive SAS drives don’t offer much value over less-expensive Raid Enabled SATA drives, the SAS/SATA controller makes a big difference, at times more than three-fold. Calomel.org’s section “All SATA controllers are NOT created equal” has a convincing series of benchmarks.
- $215: 1 × Crucial MX100 512GB SSD. This drive is intended as a caching drive (ZFS’s L2ARC for reads and ZFS’s ZIL SLOG for writes). We chose this drive in particular because it offered power-loss protection (important for synchronous writes); however, Anandtech pointed out in the section “The Truth About Micron’s Power-Loss Protection” that “… the M500, M550 and MX100 do not have power-loss protection—what they have is circuitry that protects against corruption of existing data in the case of a power-loss.” We encourage the research of alternative SSDs if power-loss protection is desired.
- $110: 1 × Corsair HX650 650 watt power supply. We believe we over-specified the wattage required for the power supply. This poster recommends the Silverstone ST45SF-G, whose size is half that of a standard ATX power supply, making it well-suited for our chassis.
- $100: 1 × Lian Li PC-Q25B Mini ITX chassis. We like this chassis because in spite of its small form factor we are able to install 7 × 3.5″ drives, 1 × 2.5″ SSD, and the LSI controller.
- $31: 2 × HighPoint SF-8087 → 4 × SATA cables. Although we used a different manufacturer, these cables should work.
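The RAM sizing in the equipment list follows the rule of thumb quoted there (roughly 1GiB of RAM per 1TB of raw storage). A minimal sketch of that arithmetic; the function and variable names are illustrative assumptions, not part of FreeNAS:

```shell
#!/bin/sh
# Rule of thumb from the text: about 1 GiB of RAM per 1 TB of raw storage.
# zfs_ram_gib is a hypothetical helper, not FreeNAS tooling.
zfs_ram_gib() {
    drives="$1"         # number of drives in the pool
    tb_per_drive="$2"   # raw capacity of each drive, in TB
    echo $(( drives * tb_per_drive ))
}

installed_gib=32                  # 4 x 8GB SODIMMs
zfs_gib=$(zfs_ram_gib 7 4)        # 7 x 4TB drives
echo "ZFS: ${zfs_gib} GiB, remaining: $(( installed_gib - zfs_gib )) GiB"
```

With our parts this prints 28 GiB for ZFS and 4 GiB remaining, matching the figures in the list.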
The inside view of the NAS. Note the interesting 3.5″ drive layout: 5 of them are in a column, and the remaining 2 (along with the SSD) are installed near the LSI controller
For easier installation of the power supply, we recommend removing the retaining screw of the controller card (it’s not needed). Note that in the photo the screw has already been removed.
The assembly is straightforward.
- Do not plug a 4-pin 12V DC power into connector J1—it’s meant as an alternative power source when the 24-pin ATX power is not in use
- Use a consistent system when plugging in SATA data connectors. The LSI card has 2 × SF-8087 connectors, each of which fan out to 4 SATA connectors. We counted the drives from the top, so the topmost drive was connected to SF-8087 port 1 SATA cable 1, the second topmost drive was connected to SF-8087 port 1 SATA cable 2, etc… We connected the SSD to SF-8087 port 2 cable 4 (i.e. the final cable).
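The cabling convention above can be written down as a small mapping. This is only a sketch of our numbering scheme (devices counted from the top, 4 SATA cables per SF-8087 port); `position_to_cable` is a hypothetical helper name:

```shell
#!/bin/sh
# Map a device position (1-8, counted from the top) to an SF-8087 port
# and SATA cable, assuming each port fans out to 4 cables.
position_to_cable() {
    pos="$1"
    port=$(( (pos - 1) / 4 + 1 ))    # positions 1-4 -> port 1, 5-8 -> port 2
    cable=$(( (pos - 1) % 4 + 1 ))   # cable 1-4 within that port
    echo "port ${port} cable ${cable}"
}

# Positions 1-7 are the hard drives; position 8 is the SSD.
for pos in 1 2 3 4 5 6 7 8; do
    echo "device ${pos}: $(position_to_cable "$pos")"
done
```

Position 8 maps to port 2 cable 4, which is where we connected the SSD.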
3. Power On
There are two caveats to the initial power on:
- the unit’s power draw is so low that the Corsair power supply’s fan will not turn on
- make sure your VGA monitor is turned on and working (ours wasn’t)
4. Installing FreeNAS (OS X)
We download the USB image from here. We follow the OS X instructions from the manual; Windows, Linux, and FreeBSD users should consult the manual for their respective operating system.
4.1 OS X: Destroy USB drive’s Second GPT Table
If you have previously-formatted your USB drive with GPT (not MBR) partitioning, you will need to wipe the second GPT table as described here. These are the commands we used. Your commands will be similar, but the sector numbers will be different. Be cautious.
# use diskutil list to find the device name of our (inserted USB)
diskutil list
# in this case it's "/dev/disk2"
diskutil info /dev/disk2
# "Total Size: 16.0 GB (16008609792 Bytes) (exactly 31266816 512-Byte-Units)"
# Determine the beginning of the final 8 blocks (512-byte blocks, final 4kB):
# 31266816 - 8 = 31266808
# wipe the last 4k bytes
sudo dd if=/dev/zero of=/dev/disk2 bs=512 oseek=31266808
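Since the sector numbers depend on the size of your USB drive, the offset arithmetic can be scripted rather than done by hand. A minimal sketch, assuming 512-byte sectors; `last_4k_offset` is a hypothetical helper that takes the byte count reported by `diskutil info`:

```shell
#!/bin/sh
# Compute the dd offset for wiping the second GPT table: the start of the
# last 8 sectors (8 x 512 bytes = 4 kB) of the device.
last_4k_offset() {
    total_bytes="$1"                               # "Total Size" in bytes
    sector_size=512                                # assumed sector size
    total_sectors=$(( total_bytes / sector_size ))
    echo $(( total_sectors - 8 ))
}

# Our 16.0 GB drive: prints 31266808
last_4k_offset 16008609792
```

The printed value is what goes after `oseek=` in the `dd` command. Double-check the device name before running `dd`.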
4.2 OS X: Create FreeNAS USB Image
Per the FreeNAS user manual:
cd ~/Downloads
xzcat FreeNAS-188.8.131.52-RELEASE-x64.img.xz > FreeNAS-184.108.40.206-RELEASE-x64.img
# use diskutil list to find the device name of our (inserted USB)
diskutil list
# in this case it's "/dev/disk2"
sudo dd if=FreeNAS-220.127.116.11-RELEASE-x64.img of=/dev/disk2 bs=64k
5. Boot FreeNAS
We do the following:
- place the USB key in one of the black USB 2 slots, not one of the blue USB 3 slots (USB 3.0 support is available if needed; check the FreeNAS User Guide for more information)
- connect an ethernet cable to the ethernet port that is closest to the blue USB slots
- turn on the machine: it boots from the USB key without needing modified BIOS settings
We see the following screen:
The FreeNAS console. Many basic administration tasks can be performed here, mostly related to configuring the network. As our DHCP server has supplied network connectivity to the FreeNAS, we are able to configure it via the richer web interface
6. Configuring FreeNAS
We log into our NAS box via our browser: http://nas.nono.com (we have previously created a DNS entry (nas.nono.com), assigned an IP address (10.9.9.80), determined the NAS’s ethernet MAC address, and entered that information into our DHCP server’s configuration).
6.1 Our first task: set the root password.
6.2 Basic Settings
- click the System icon
- Settings → General
- Protocol: HTTPS
- click Save
We are redirected to an HTTPS connection with a self-signed cert. We click through the warnings.
- click the System icon
- System Information → Hostname
- click Edit
- Hostname: nas.nono.com
- click OK
We enable ssh in order to allow us to install the disk benchmarking package (bonnie++). We enable AFP, for that will be our primary filesharing protocol. We also enable iSCSI for our ESXi host. We enable CIFS for good measure (we don’t have Windows clients, but we may in the future).
- click the Services icon
- click the SSH slider to turn it on
- click the wrench next to the SSH slider.
- check Login as Root with password
- click OK
- click the AFP slider to turn it on
- click the CIFS slider to turn it on
- click the iSCSI slider to turn it on
6.3 Create ZFS Volume
We create one big volume. We choose ZFS’s RAID-Z2:
- click the Storage icon
- click Active Volumes tab
- click ZFS Volume Manager
- Volume name: tank
- under Available disks, click + next to 1 – 4.0TB (7 drives, show) (we are ignoring the 512GB SSD for the time being)
- Volume Layout: RaidZ2 (ignore the non-optimal warning)
- click Add Volume
7. Enable Filesharing
7.1 Create User
We create user ‘cunnie’ for sharing:
- From the left hand navbar: Account → Users → Add User
- Username: cunnie
- Full Name: Brian Cunnie
- Password: some-password-here
- Password confirmation: some-password-here
- click OK
7.2 Create Directory
ssh email@example.com
mkdir /mnt/tank/big
chmod 1777 !$
exit
7.3 Create Share
- Click the Sharing icon
- select Apple (AFP)
- click Add Apple (AFP) Share
- Name: big
- Path: /mnt/tank/big
- Allow List: cunnie
- Time Machine: checked
- click OK
- click Yes (enable this service)
7.4 Access Share from OS X Machine
- switch to Finder
- press cmd-k to bring up Connect to Server dialog
- Server Address: afp://nas.nono.com
- click Connect
- Name: Brian Cunnie
- Password: some-password-here
8. Benchmarking FreeNAS
We use bonnie++ to benchmark our machine for the following reasons:
- it’s a venerable benchmark
- it allows easy comparison to Calomel.org’s bonnie++ benchmarks
We use a file size of 80GiB to eliminate the RAM cache (ARC) skewing the numbers—we are measuring disk performance, not RAM performance.
ssh firstname.lastname@example.org
# we remount the root filesystem as read-write so that we
# can install bonnie++
mount -o rw /
pkg_add -r bonnie++
# we add root to sudoers because that will allow us
# to run bonnie++ as a _non-root_ user, which it requires.
cat >> /usr/local/etc/sudoers <<EOF
root ALL=(ALL) NOPASSWD: ALL
EOF
# create a temporary directory to hold bonnie++'s
# scratch files
mkdir /mnt/tank/tmp
chmod 1777 !$
# 9 series of runs, 8 jobs in parallel, median value
# kick off 8 jobs (8 cores) to minimize CPU-bottleneck
foreach I (0 1 2 3 4 5 6 7 8)
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  ( sudo -u nobody bonnie++ -m "RAIDZ2_8C" -r 8192 -s 81920 -d /mnt/tank/tmp/ -f -b -n 1; date ) >> /mnt/tank/tmp/bonnie.txt &
  wait
  sleep 60
end
The raw bonnie++ output is available on GitHub. The summary (median scores): (w=993MB/s, r=1882MB/s, IOPS=884)
9.1 IOPS could be improved
The IOPS (~884) are respectable: more than four times those of a 15k RPM SAS drive (~175-210 IOPS), but still far below what a high-end SSD offers (e.g. an Intel X25-M G2 (MLC) posts ~8,600). We feel that using the SSD as a second-level cache could improve our numbers dramatically.
9.2 No SSD
We never put the SSD to use. We plan to use the SSD as both an L2ARC (ZFS read cache) and a ZIL SLOG (a ZFS write cache for synchronous writes).
9.3 Gigabit Bottleneck
Our NAS’s performance is severely limited by the throughput of its gigabit interface on its sequential reads and writes. Our ethernet interface is limited to ~111 MB/s, but our sequential reads can reach almost seventeen times that (1882MB/s).
We can partly address that by using LACP (aggregating the throughput of the 4 available ethernet interfaces).
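To put the bottleneck in numbers, here is a back-of-the-envelope sketch. It assumes the best case for LACP, in which the 4 links simply add up; real-world aggregation may well deliver less. The `ratio` helper is our own illustrative function:

```shell
#!/bin/sh
# Compare disk throughput with network throughput (all values in MB/s).
ratio() {
    # print a / b rounded to one decimal place
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

link_mbs=111      # measured limit of one gigabit interface
read_mbs=1882     # sequential read from the benchmark
links=4           # ethernet interfaces available for aggregation

echo "one link:   disk is $(ratio "$read_mbs" "$link_mbs")x faster than the network"
aggregate_mbs=$(( link_mbs * links ))
echo "four links: ${aggregate_mbs} MB/s, disk is still $(ratio "$read_mbs" "$aggregate_mbs")x faster"
```

Even in the ideal case, four aggregated links give 444 MB/s, so sequential reads would still outrun the network by a factor of about four.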
The fans in the case were noisier than expected: not clicking or tapping, but a discernible hum.
The system runs cool. With a room temperature of 23.3°C (74° Fahrenheit), these are the readings we recorded after the machine had been powered on for 12 hours:
- CPU: 30°C
- System: 32°C
- Peripheral: 31°C
- DIMMA1: 30°C
- DIMMA2: 32°C
- DIMMB1: 33°C
- DIMMB2: 34°C
No component is warmer than body temperature. We are especially impressed with the low CPU temperature, doubly so that it’s passively cooled.
9.6 No Hot Swap
It would be nice if the system had a hot-swap feature. It doesn’t. In the event we need to replace a drive, we’ll be powering the system down.
9.7 Pool Alignment
FreeNAS does the right thing: it creates 4kB-aligned pools by default (instead of 512B-aligned pools). This should be more efficient, though results vary. See Calomel.org’s section, Performance of 512b versus 4K aligned pools, for an in-depth discussion and benchmarks.
10. More Extensive Benchmarking
In our follow-on post, we tune our ZFS fileserver for optimal iSCSI performance.
1 These numbers are not terribly exact. To overcome being artificially limited by the CPU, we were forced to run 8 benchmarks in parallel. This had two serious shortcomings:
- The individual benchmarks weren’t synchronized—benchmarks finished as much as ten seconds apart. While one benchmark was finishing up its rewriting portion, another had already moved on to the reading portion, causing a distortion in the usage pattern.
- The numbers weren’t derived by summing the numbers from a single run of 8 benchmarks. Instead, all the benchmark results were aggregated, and the median 8 values were taken and summed.
For those interested in the raw benchmark data, they can be seen here.
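The aggregation described in this note (sort all per-job results, take the median 8 values, sum them) can be sketched as follows. This is an illustration of the idea rather than the exact script we ran; `sum_middle_n` is a hypothetical helper and the sample values are made up:

```shell
#!/bin/sh
# Sum the middle N values of a list of numbers read from stdin, one per line.
sum_middle_n() {
    n="$1"
    sort -n | awk -v n="$n" '
        { v[NR] = $1 }                      # values arrive sorted ascending
        END {
            if (NR < n) exit 1              # not enough samples
            start = int((NR - n) / 2) + 1   # first index of the middle window
            for (i = start; i < start + n; i++) sum += v[i]
            print sum
        }'
}

# Ten made-up per-job throughput samples (MB/s); sum the middle two.
printf '%s\n' 120 118 125 130 110 122 119 127 121 124 | sum_middle_n 2
```

For our runs the input would be the per-job figures extracted from bonnie.txt, with `sum_middle_n 8` giving the aggregate for 8 parallel jobs.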
2 We feel that double-parity RAID is a safer approach than single-parity (e.g. RAID 5). Adam Leventhal, in his article for the ACM, describes the challenges that large capacity disks pose to a RAID 5 solution. A NetApp paper states, “… in the event of a drive failure, utilizing a SATA RAID 5 group (using 2TB disk drives) can mean a 33.28% chance of data loss per year” (italics ours).
3 We aren’t concerned about a non-optimal configuration (i.e. the number of disks (less parity) should optimally be a power of 2)—we have reservations about the statement, “the number of disks should be a power of 2 for best performance”. A serverfault post states, “As a general rule, the performance boost of adding an additional spinddle [sic] will exceed the performance cost of having a sub-optimal drive count”. Also, we are enabling compression on the ZFS volume, which means that the stripe size will be variable rather than a power of 2 (we are guessing; we may be wrong), which de-couples the stripe size from the disks’ block size.
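For reference, the check behind the warning is simple: with RAID-Z2, two disks' worth of capacity goes to parity, so our 7-drive volume has 5 data disks. A small sketch (the helper name is ours, not anything FreeNAS provides):

```shell
#!/bin/sh
# Return success if the argument is a power of 2 (1, 2, 4, 8, ...).
is_power_of_two() {
    n="$1"
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

total_disks=7
parity_disks=2                       # RAID-Z2 is double-parity
data_disks=$(( total_disks - parity_disks ))

if is_power_of_two "$data_disks"; then
    echo "${data_disks} data disks: a power of 2"
else
    echo "${data_disks} data disks: not a power of 2"
fi
```

By this measure, a 6-drive or 10-drive RAID-Z2 volume (4 or 8 data disks) would not trigger the warning.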
Calomel.org has one of the most comprehensive sets of ZFS benchmarks and good advice for maximizing the performance of ZFS, some of it not obvious (e.g. the importance of a good controller).