Changing the default VMware Round Robin IO Operation Limit value for Pure Storage FlashArray devices

This is a topic I have posted about in the past, but this time I am going to speak about it with the Pure Storage FlashArray. Anyone familiar with the VMware Native Multipathing Plugin (NMP) probably knows about the Round Robin “IOPS” value, which I will also refer to interchangeably as the IO Operation Limit. This value dictates how often NMP switches paths to the device: after a configured number of I/Os, NMP moves to a different path. The default value is 1,000, but it can be set as low as 1. For the highest performance, Pure recommends changing this setting to 1 for all devices. The tricky thing is that it has to be done for every device on every host, and a simple way to do this isn’t immediately obvious. Here is the procedure.

The most common method employed to do this was setting it on each device using esxcli, but this is not exactly the most scalable method, since it requires doing it to every device on every host until the end of time. What is much easier is to create a rule that sets an IOPS value for every Pure device that comes in. The SATP that claims Pure devices is the standard ALUA one, VMW_SATP_ALUA, so a rule needs to be assigned for Pure devices claimed by that SATP. First you need some information.

To create a rule specific enough to encompass only Pure devices we need to get the vendor information from an existing device. The simplest way to do this (or a simple one at least) is to just grep the vmkernel log after a rescan:

grep -i scsiscan /var/log/vmkernel.log

This will give you lines that look like so:

2014-05-14T21:54:50.756Z cpu13:33081 opID=2ac75bde)ScsiScan: 976: Path 'vmhba3:C0:T5:L11': Vendor: 'PURE ' Model: 'FlashArray ' Rev: '342 '
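If the log is noisy, the two fields can be pulled out with a small filter. This is only a sketch: the function name scsiscan_vendor_model is made up for illustration, and it assumes the log lines follow the Vendor: '...' Model: '...' layout of the sample line above.

```shell
# Hypothetical helper: read ScsiScan log lines on stdin and print each
# distinct vendor/model pair as VENDOR|MODEL with the padding stripped.
# Assumes the "Vendor: '...' Model: '...'" layout of the sample line.
scsiscan_vendor_model() {
    sed -n "s/.*Vendor: '\([^']*\)' *Model: '\([^']*\)'.*/\1|\2/p" |
        sed 's/ *|/|/; s/ *$//' |
        sort -u
}

# Usage on a host:
# grep -i scsiscan /var/log/vmkernel.log | scsiscan_vendor_model
```

The output has one line per array type seen by the host, which makes it easy to spot the exact strings to feed into a rule.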

We just need to take the vendor and model names, which unsurprisingly are PURE and FlashArray respectively. To create a rule that both makes sure Pure devices use round robin and sets the IOPS value to 1, run this command on all of your ESXi hosts:

esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"

This is case sensitive so make sure you type this exactly as above.

***See how to do this with PowerCLI here***

Note that existing devices will not get this change! If they are currently using MRU or something or have a different IOPS value this will not change them. You either need to specifically change existing devices or unclaim and reclaim them (which requires the device going offline) or reboot the host. If you want to change specific devices without taking them offline you can run (with a different NAA of course):

esxcli storage nmp device set -d naa.6006016055711d00cff95e65664ee011 --psp=VMW_PSP_RR

esxcli storage nmp psp roundrobin deviceconfig set -d naa.6006016055711d00cff95e65664ee011 -I 1 -t iops
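To apply those two commands to every existing Pure device on a host in one pass, something like the following could work. It is a sketch, not a tested procedure: the function name is hypothetical, and it assumes FlashArray volumes show up with the naa.624a9370 prefix (as in the example device later in this post) and that each device name starts its own unindented line in the device list output.

```shell
# Hypothetical helper: set round robin and an IO Operation Limit of 1 on
# every existing device whose name starts with naa.624a9370.
# Assumption: that prefix identifies FlashArray volumes on this host.
set_pure_devices_rr_iops1() {
    for dev in $(esxcli storage nmp device list | grep '^naa\.624a9370'); do
        # Same two commands as above, once per matching device.
        esxcli storage nmp device set -d "$dev" --psp=VMW_PSP_RR
        esxcli storage nmp psp roundrobin deviceconfig set -d "$dev" -I 1 -t iops
    done
}
```

Define the function in the ESXi shell, then run set_pure_devices_rr_iops1 once per host. Devices from other vendors are skipped because their names do not match the prefix.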

Regardless, all new devices will now be claimed with round robin using an IOPS value of 1 from this point on. You can check the IO Operation Limit value for a given device by running:

esxcli storage nmp psp roundrobin deviceconfig get --device naa.624a9370753d69fe46db318d00010000

 Byte Limit: 10485760
 Device: naa.624a9370753d69fe46db318d00010000
 IOOperation Limit: 1
 Limit Type: Default
 Use Active Unoptimized Paths: false
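If you only want the number, for example to check it in a script, a one-line filter is enough. This is a sketch with a made-up function name, and it assumes the field is labeled IOOperation Limit exactly as in the output above.

```shell
# Hypothetical helper: print only the IOOperation Limit value from
# "deviceconfig get" output read on stdin.
io_operation_limit() {
    sed -n 's/^ *IOOperation Limit: *//p'
}

# Usage on a host:
# esxcli storage nmp psp roundrobin deviceconfig get --device naa.624a9370753d69fe46db318d00010000 | io_operation_limit
```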

To change or remove the rule, you cannot simply run the command again with 1,000 or whatever other number. You must first remove the rule, and then you can either create a new one with a different number or leave it without a rule to use 1,000 again.

esxcli storage nmp satp rule remove -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=1"
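The remove-then-add sequence can be wrapped up so the two steps always happen together. A minimal sketch, with a hypothetical function name, assuming the existing rule was created against VMW_SATP_ALUA with the same vendor, model, and PSP values used earlier in this post:

```shell
# Hypothetical helper: replace the Pure rule's IOPS value.
# $1 = the value in the existing rule, $2 = the value for the new rule.
# The remove must match the options the rule was created with, and the
# add only runs if the remove succeeded.
change_pure_iops() {
    old=$1
    new=$2
    esxcli storage nmp satp rule remove -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=$old" &&
        esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "PURE" -M "FlashArray" -P "VMW_PSP_RR" -O "iops=$new"
}
```

For example, change_pure_iops 1 10 would swap a rule that sets 1 for one that sets 10. As with the original rule, this only affects devices claimed after the change.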

If you don’t remember what you set or want to take a look at the existing rules, run:

esxcli storage nmp satp rule list -s VMW_SATP_ALUA

Pretty straightforward!

17 Replies to “Changing the default VMware Round Robin IO Operation Limit value for Pure Storage FlashArray devices”

Bjørn A. Jørgensen (@bajorgensen) says:

August 27, 2014 at 5:59 am

Why is it so hard for vendors to add their best practice claim rules to ESX? I am going mad adding this for different vendors.

1. Cody Hosterman says:
  
  August 29, 2014 at 2:24 pm
  
  Agreed–I wish this was easier for vendors to change. When I was at EMC it took us almost four years to get them to change the VMAX default to RR from Fixed.
  
  1. codyhosterman says:
    
    December 4, 2017 at 9:44 am
    
    By the way, Pure Storage best practices are now default in ESXi, so you do not need to do this anymore.
    
Jason Taylor says:

September 8, 2014 at 4:35 pm

Don’t you also want to set TPGS to on in this case?

1. Cody Hosterman says:
  
  September 8, 2014 at 4:55 pm
  
  No, since the FlashArray is active/active, TPGS doesn’t really need to be messed with. Could set it to off, but it doesn’t really matter.
  
Pingback: Pure Storage vSphere Web Client Plugin 2.0 Released | Cody Hosterman
Pingback: Setting up iSCSI with VMware ESXi and the FlashArray | Cody Hosterman
Payet says:

March 7, 2016 at 10:29 am

If you just don’t want to pay too much attention to the naa attribute, using these two commands should help.

RR activation :
# for i in `esxcli storage nmp device list | grep PURE | awk '{gsub(/[()]/,""); print $8}'` ; do esxcli storage nmp device set -d $i --psp=VMW_PSP_RR; done

Path Switching to 1 :
# for i in `esxcli storage nmp device list | grep PURE | awk '{gsub(/[()]/,""); print $8}'` ; do esxcli storage nmp psp roundrobin deviceconfig set -d $i -I 1 -t iops; done

Chris Adkin says:

November 10, 2017 at 3:07 am

Hi Cody, Isn’t round robin now the default for vsphere 6.5 ?

1. codyhosterman says:
  
  November 10, 2017 at 7:11 am
  
  Yup: https://www.codyhosterman.com/2017/07/nmp-multipathing-rules-for-the-flasharray-are-now-default/
  
John says:

December 4, 2017 at 9:15 am

I am not seeing any ScsiScan info in /var/log/vmkernel.log
Do you know of another way to retrieve the information?

1. codyhosterman says:
  
  December 4, 2017 at 9:46 am
  
  For any device presented you should see it. esxcfg-scsidevs -l should show it too. What vendor are you looking to configure for?
  
John says:

December 4, 2017 at 12:14 pm

Oh wow that is cool, I ran “esxcfg-scsidevs -l” and looks like there are 15 different ones.
Three pertain to “Vendor: PURE” and “Model: FlashArray”.
I just have one array, should there be 15 different naa.#’s?
Each is Multipath Plugin: NMP

1. codyhosterman says:
  
  December 4, 2017 at 12:28 pm
  
  Every volume (or datastore or LUN or whatever you want to call it) you provision will have its own NAA. The NAA is based on the volume serial number, so each one has a unique NAA, as it is what VMware uses to identify each datastore uniquely. For the FlashArray, though, the vendor and model info will always be PURE and FlashArray. This is not unique to a volume; instead it is common to all storage from our array. To create a SATP rule you would use those values for us. If you are running the latest versions of ESXi, though, you do not need to do this anymore.
  
John says:

December 4, 2017 at 12:33 pm

Ah ok so I am on esxi 5.5. I’d just run the rule for PURE FlashArray and should be good. Or alternatively, I’d upgrade to esxi 6.5 and wouldn’t need to add the rule to change the round robin io limit?

Thanks again

1. codyhosterman says:
  
  December 4, 2017 at 12:49 pm
  
  You’re welcome! Yep, exactly! If you are on 5.5, run that rule on each ESXi host once and you are good. If you are on 6.0 Express Patch 5 or later, or 6.5 U1 or later, you don’t need to do it at all, as these recommendations are now default in ESXi for the FlashArray in those releases and later.
  
Pingback: Updated FlashArray VMware Best Practices PowerCLI Scripts | Cody Hosterman

