Why Fusion-io SLC Duos may outperform MLC Duos by more than expected, and some mitigations
Recently I decided to revisit the issue of write throttling that occurs on Fusion-io cards under load. I found this article http://www.hookbag.ca/media/d/b5/d987/d98720b9b537819f/original.pdf that explains the topic thoroughly, specifically for HP server configurations. I am presently running 7 HP-branded Fusion-io (IO Accelerator) Duo cards in a DL370 G6. Although HP does not sell external power cable kits with the Duos and does not encourage their use, the cables are available through a kit as explained in the article.
Although the impact of running the Duos without external power is small, it is real. I ran the fio-status -fj command specified in the article and discovered that throttling had occurred on virtually all of the Duos, but the worst offenders were the 2 MLC Duos: a 640 GB and a 1.28 TB card. Throttling on the SLC Duos was much lower, despite the database load being evenly distributed. One of the cards with high throttling was actually in a PCIe x16 slot. The HP DL370 G6 includes 3 x16 slots. All three support up to 75 W, although one is electrically an x8 slot with an x16 form factor.
Below is a section of output from the fio-status -fj command showing the throttling:
"write_pwr_throttling_count" : "5013",
"write_pwr_throttling_sec_since_last" : "58400",
"write_reg_power_level" : "Inactive",
"write_reg_thermal_level" : "Inactive",
"write_reg_total_level" : "Inactive",
"write_throttling_reason" : "No reason given.",
"write_throttling_state" : "None"
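To compare cards it helps to pull these counters out programmatically. The following is a minimal Python sketch, assuming the fragment has been saved as text; the helper names (parse_fragment, summarize_throttling) are my own, and reading write_pwr_throttling_sec_since_last as seconds since the last throttling event is an assumption based only on the field name.

```python
import json

# A few of the fields from the fragment above, pasted as plain text.
SAMPLE = '''
"write_pwr_throttling_count" : "5013",
"write_pwr_throttling_sec_since_last" : "58400",
"write_throttling_state" : "None"
'''

def parse_fragment(text):
    """Turn a pasted run of key/value lines into a dict."""
    # Normalise typographic quotes that creep in when output is pasted into a document.
    cleaned = text.replace("\u201c", '"').replace("\u201d", '"')
    # The fragment is the inside of a JSON object, so wrap it in braces.
    return json.loads("{" + cleaned.strip().rstrip(",") + "}")

def summarize_throttling(fields):
    """Summarise the power-throttling counters (values arrive as strings)."""
    count = int(fields.get("write_pwr_throttling_count", "0"))
    # Assumption: this field is the number of seconds since the last event.
    secs = int(fields.get("write_pwr_throttling_sec_since_last", "0"))
    return {
        "count": count,
        "hours_since_last": round(secs / 3600, 1),
        "state": fields.get("write_throttling_state", "unknown"),
    }

summary = summarize_throttling(parse_fragment(SAMPLE))
print(summary)
```

Running the same summary for each card before and after a configuration change gives a quick way to see whether the throttling count is still climbing.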
The default behavior of the driver is to use only 25 W regardless of the type of slot in which a card is located. The fio-config utility provides a way to override this default and let the driver draw up to the card's maximum power rating from the PCIe slot. This capability must be used very carefully, since trying to draw more power than a slot can supply can damage the motherboard. After checking, double-checking, and triple-checking twice, I verified the serial numbers of the cards in the x16 slots. I then used fio-config to set the override for those specific serial numbers.
After rebooting, I checked the throttle counts again. Sure enough, the throttling counts were eliminated for the cards in the x16 slots. The only card still showing significant throttling was the 1.28 TB card, which was not included in the override because it was not in an x16 slot. What was also interesting was that the SLC cards did not experience nearly the degree of throttling that the MLC cards did. In addition, fio-status reported only a very small power draw beyond 25 W for the overridden cards, except for the 640 GB MLC. The SLC cards that were given the power override barely used it. My next step will be to turn off the power override for one of the SLC cards in an x16 slot, move the other MLC card into that x16 slot, and then turn on the power override for that MLC card.
The output below illustrates the difference between the cards. Note that although both exceeded 25 W, the maximum power draw is much lower for the SLC card than for the MLC card.
Adapter: Dual Adapter
HP 320GB SLC PCIe ioDrive Duo for ProLiant Servers, Product Number:600281-B21
ioDrive Duo HL, PN:00190000107
External Power Override: ON
External Power: NOT connected
PCIe Bus voltage: avg 11.97V min 11.86V max 12.00V
PCIe Bus current: avg 0.92A max 2.13A
PCIe Bus power: avg 10.95W max 25.32W
Adapter: Dual Adapter
HP 640GB MLC PCIe ioDrive Duo for ProLiant Servers, Product Number:600282-B21
ioDrive Duo HL, PN:00190000108
External Power Override: ON
External Power: NOT connected
PCIe Bus voltage: avg 11.98V min 11.86V max 12.01V
PCIe Bus current: avg 0.96A max 2.62A
PCIe Bus power: avg 11.47W max 31.15W
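The comparison between the two cards can be made explicit by extracting the peak power figures and subtracting the 25 W default. This is a rough Python sketch that works only on the two lines per card shown in REPORT, copied from the output above; the regular expressions and the function name peak_over_limit are illustrative assumptions, not part of any Fusion-io tooling.

```python
import re

DEFAULT_LIMIT_W = 25.0  # default driver power budget described in this post

# Two lines per card, copied from the fio-status output above.
REPORT = """\
HP 320GB SLC PCIe ioDrive Duo for ProLiant Servers, Product Number:600281-B21
PCIe Bus power: avg 10.95W max 25.32W
HP 640GB MLC PCIe ioDrive Duo for ProLiant Servers, Product Number:600282-B21
PCIe Bus power: avg 11.47W max 31.15W
"""

NAME_RE = re.compile(r"HP (\d+GB (?:SLC|MLC))")
POWER_RE = re.compile(r"PCIe Bus power: avg ([\d.]+)W max ([\d.]+)W")

def peak_over_limit(report, limit_w=DEFAULT_LIMIT_W):
    """Map each card to how far its peak draw exceeds limit_w, in watts."""
    results = {}
    current = None
    for line in report.splitlines():
        name = NAME_RE.search(line)
        if name:
            current = name.group(1)  # remember which card the next power line belongs to
            continue
        power = POWER_RE.search(line)
        if power and current:
            max_w = float(power.group(2))
            results[current] = round(max_w - limit_w, 2)
    return results

overage = peak_over_limit(REPORT)
for card, watts in overage.items():
    print(f"{card}: peak exceeds {DEFAULT_LIMIT_W:.0f} W by {watts:.2f} W")
```

On these numbers the SLC card's peak is only a fraction of a watt over the default budget, while the MLC card's peak is several watts over it.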
More research is needed to validate whether the MLC adapters are in fact more apt to be write-throttled than the SLC cards, but based on my experience that seems to be the case. This lines up with my anecdotal experience of the MLC cards showing delays under heavy write load prior to this tweaking. The workaround, short of going to the trouble of attaching auxiliary power, is to deploy the MLC cards in x16 slots and enable (very carefully) the power override. Since the override is tied to the card's serial number, this should not be done unless you are sure the cards will not be moved to different slots in the server at some point. That is a pretty big assumption to make, and it may not be worth the risk for a small amount of extra write throughput. Note that this affects only write throughput, not read throughput.
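One way to reduce the risk of cards being moved is to keep a record of which serial number was approved for override in which slot, and to compare it with what is actually installed after any hardware work. The sketch below is purely illustrative: the serial numbers and slot labels are placeholders, and how the observed mapping is collected is left as an assumption.

```python
# Hypothetical inventory: serial numbers and slot labels are placeholders,
# recorded at the time the power override was enabled.
APPROVED = {"SERIAL-A": "slot-x16-1", "SERIAL-B": "slot-x16-2"}

def override_mismatches(approved, observed):
    """Return serials whose observed slot differs from the slot approved for override."""
    return sorted(s for s, slot in approved.items() if observed.get(s) != slot)

# Hypothetical state after maintenance: SERIAL-B has been moved to another slot.
observed = {"SERIAL-A": "slot-x16-1", "SERIAL-B": "slot-x8-4"}
mismatches = override_mismatches(APPROVED, observed)
print(mismatches)
```

Any serial that shows up in the mismatch list should have its override turned off before the server goes back under load.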