Monday, 26 August 2013

VMFS-formatted USB sticks in VMware vSphere 5.1

UPDATE: vSphere 5.5 behaves in the same way - still no solution other than trying to find older USB keys on eBay and hoping for the best! :(

Over the last couple of weeks I've been building a VMware home lab (it'll probably get a post of its own in the near future) to replace 'eddie', my ageing Core 2 Duo server. One of the constraints I enforced during the build was for the lab to be 'self consuming'.

By that I mean it mustn't rely on external devices for DHCP/DNS or storage - the lab will be using power all the time so I wanted to minimise the amount of running equipment.

For now I want to focus on the use of VMFS formatted partitions on USB keys / sticks / thumb drives with ESXi 5.1 because I assumed it would be straightforward. The summary is simple: ESXi is just as fussy with USB devices as it is with everything else.


Things started off well with a 4GB Kingston DataTraveler G3 (Vendor 0951, Product 1643)

# cd /dev/disks
# /etc/init.d/usbarbitrator stop
The USB arbitrator service needs to be stopped first, otherwise ESXi will claim the USB stick as a passthrough device for use by VMs instead of presenting it to the host as a disk.

Now insert the 4GB USB stick.
# esxcfg-scsidevs -c | grep USB
mpx.vmhba33:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0                                                      988MB     NMP     Local USB Direct-Access (mpx.vmhba33:C0:T0:L0)
mpx.vmhba39:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0                                                      3690MB    NMP     Local USB Direct-Access (mpx.vmhba39:C0:T0:L0)
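
If you want just the device names and sizes out of that listing, it is easy to pick apart in a script. This is a minimal Python sketch, assuming the six-column layout shown above (device ID, type, console path, size in MB, plugin, display name). The function name parse_scsidevs is made up for illustration and is not part of any VMware tool.

```python
def parse_scsidevs(text):
    """Parse 'esxcfg-scsidevs -c' style lines into a list of dicts.

    Assumes the column layout seen in the listing above; lines that do
    not fit that layout (headers, blank lines) are skipped.
    """
    devices = []
    for line in text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue
        uid, dev_type, path, size, plugin, name = parts
        # The size column looks like '3690MB'; anything else is ignored.
        if not (size.endswith("MB") and size[:-2].isdigit()):
            continue
        devices.append({
            "uid": uid,
            "path": path,
            "size_mb": int(size[:-2]),
            "is_usb": "USB" in name,
        })
    return devices


sample = (
    "mpx.vmhba33:C0:T0:L0  Direct-Access  /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0  "
    "988MB  NMP  Local USB Direct-Access (mpx.vmhba33:C0:T0:L0)\n"
    "mpx.vmhba39:C0:T0:L0  Direct-Access  /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0  "
    "3690MB  NMP  Local USB Direct-Access (mpx.vmhba39:C0:T0:L0)\n"
)

for dev in parse_scsidevs(sample):
    if dev["is_usb"]:
        print(dev["uid"], dev["size_mb"], "MB")
```

The sample text is copied from the listing above with the column padding shortened; real output may contain other device types that this sketch has not been tried against.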

Now I can see two USB devices - a 1GB and a 4GB. I am using the 1GB to boot the ESXi host - no problem there since there are no VMFS partitions on the boot device - just the small boot partition, a few FATs and a VMkernel diagnostic partition. The interesting stuff is on the 4GB device. Here are the commands I used to create a new GPT disk label and a VMFS5 partition to go on it:


# partedUtil mklabel mpx.vmhba39\:C0\:T0\:L0 gpt
# partedUtil setptbl mpx.vmhba39\:C0\:T0\:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0"
gpt  
0 0 0 0 
1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0
# vmkfstools -C vmfs5 mpx.vmhba39\:C0\:T0\:L0:1  
create fs deviceName:'mpx.vmhba39:C0:T0:L0:1', fsShortName:'vmfs5', fsName:'(null)' 
deviceFullPath:/dev/disks/mpx.vmhba39:C0:T0:L0:1 deviceFile:mpx.vmhba39:C0:T0:L0:1 
Checking if remote hosts are using this device as a valid file system. This may take a few seconds... 
Creating vmfs5 file system on "mpx.vmhba39:C0:T0:L0:1" with blockSize 1048576 and volume label "none". 
Successfully created new volume: 521b9571-a9085b97-0a71-0015176fe0a0

At this point, you can simply click 'Refresh' in the VMware vSphere Client and the new volume will appear. Rename it to something sensible (or use the -S option in vmkfstools) and we're done.
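
The three numbers in the setptbl string are the partition number, the start sector and the end sector, and the end sector has to fit on the stick. Here is a rough Python sketch of that arithmetic. It assumes 512-byte sectors, that the "MB" figure from esxcfg-scsidevs means 1024 x 1024 bytes, and a standard GPT layout that reserves the last 33 sectors for the backup table. The helper names are hypothetical. On a real host, the figure reported by 'partedUtil getUsableSectors' should be trusted over this estimate.

```python
SECTOR_BYTES = 512          # assumption: the stick presents 512-byte sectors
GPT_BACKUP_SECTORS = 33     # backup GPT header (1) + 128 entries of 128 bytes (32)
VMFS_GUID = "AA31E02A400F11DB9590000C2911D1B8"  # partition type GUID used in the post


def sectors_from_mb(size_mb):
    """Estimate the sector count from a size reported in MB (taken as MiB)."""
    return size_mb * 1024 * 1024 // SECTOR_BYTES


def last_usable_sector(total_sectors):
    """Highest sector a partition may end on under a standard GPT layout."""
    return total_sectors - 1 - GPT_BACKUP_SECTORS


def setptbl_spec(end_sector, start_sector=2048, number=1):
    """Build the quoted partition description passed to 'partedUtil setptbl'."""
    if end_sector <= start_sector:
        raise ValueError("end sector must be greater than start sector")
    return f"{number} {start_sector} {end_sector} {VMFS_GUID} 0"


total = sectors_from_mb(3690)
print("estimated sectors:", total)
print("estimated last usable sector:", last_usable_sector(total))
print("partition size in bytes:", (7550550 - 2048 + 1) * SECTOR_BYTES)
print(setptbl_spec(7550550))
```

Under those assumptions the end sector of 7550550 used above sits a little below the estimated limit for the 3690MB stick, which is why the same value could be reused on the slightly larger Lexar later on.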

Great. Now I got on with my life for a week and then came back to this work using the next of the USB sticks that I had lying around. This time it's a 4GB Lexar JumpDrive TwistTurn (Vendor 05dc, Product a815)

http://www.lexar.com/products/lexar-jumpdrive-twistturn-usb-flash-drive

So, same again, right?

# esxcfg-scsidevs -c | grep USB 
mpx.vmhba33:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0                                                      988MB     NMP     Local USB Direct-Access (mpx.vmhba33:C0:T0:L0) 
mpx.vmhba39:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0                                                      3690MB    NMP     Local USB Direct-Access (mpx.vmhba39:C0:T0:L0) 
mpx.vmhba40:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba40:C0:T0:L0                                                      3748MB    NMP     Local USB Direct-Access (mpx.vmhba40:C0:T0:L0)

Ah OK, the Lexar is a bit bigger. Not all the 4GB sticks are the same. Anyway, let's continue as before:

# partedUtil mklabel mpx.vmhba40:C0:T0:L0 gpt 
# partedUtil setptbl mpx.vmhba40:C0:T0:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0" 
gpt 
0 0 0 0 
1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0   
# vmkfstools -C vmfs5 mpx.vmhba40:C0:T0:L0:1  
create fs deviceName:'mpx.vmhba40:C0:T0:L0:1', fsShortName:'vmfs5', fsName:'(null)' 
deviceFullPath:/dev/disks/mpx.vmhba40:C0:T0:L0:1 deviceFile:mpx.vmhba40:C0:T0:L0:1 
Checking if remote hosts are using this device as a valid file system. This may take a few seconds... 
Creating vmfs5 file system on "mpx.vmhba40:C0:T0:L0:1" with blockSize 1048576 and volume label "none". 
Usage: vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/vml... or,
       vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/naa... or,
       vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/mpx.vmhbaA:T:L:P
Error: vmkfstools failed: vmkernel is not loaded or call not implemented.

wat

I had a look in the vmkernel.log (typically through the output of 'dmesg') and found a couple of lines which seemed relevant, since they came up at the time when I issued the 'vmkfstools' command:
2013-08-26T17:40:09.176Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x41240080c0c0, 0) to dev "mpx.vmhba40:C0:T0:L0" on path "vmhba40:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
2013-08-26T17:40:09.176Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x41240080c0c0) 0x1a, CmdSN 0x1a4b from world 0 to dev "mpx.vmhba40:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

The sense data seems interesting. The VMware KB article 289902 discusses the sense keys - 0x5 is "ILLEGAL REQUEST" and the linked page on the T10 website goes into further detail that '0x24 0x0' is INVALID FIELD IN CDB. 
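
Decoding those fields by hand gets tedious, so here is a small Python sketch that pulls the host, device and plugin status plus the sense triplet out of a vmkernel.log line. The lookup tables contain only the values seen in this post. The names for opcode 0x1a (MODE SENSE(6)) and device status 0x2 (CHECK CONDITION) come from the SCSI standards rather than from the KB article, and the regular expression is an assumption based on the two log formats shown above.

```python
import re

# Only the values that appear in this post; extend from the T10 tables as needed.
OPCODES = {0x1A: "MODE SENSE(6)"}
DEVICE_STATUS = {0x2: "CHECK CONDITION"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x24, 0x00): "INVALID FIELD IN CDB"}

LINE = re.compile(
    r"Cmd(?:\(0x[0-9a-f]+\))? (0x[0-9a-f]+).*?"
    r"H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+) "
    r"Valid sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)",
    re.IGNORECASE,
)


def decode(line):
    """Return a dict describing the SCSI failure in one log line, or None."""
    match = LINE.search(line)
    if match is None:
        return None
    opcode, host, device, plugin, key, asc, ascq = (int(v, 16) for v in match.groups())
    return {
        "opcode": OPCODES.get(opcode, hex(opcode)),
        "host_status": host,
        "device_status": DEVICE_STATUS.get(device, hex(device)),
        "plugin_status": plugin,
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional_sense": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }


example = (
    '2013-08-26T17:40:09.176Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x41240080c0c0) 0x1a, '
    'CmdSN 0x1a4b from world 0 to dev "mpx.vmhba40:C0:T0:L0" failed '
    'H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.'
)
print(decode(example))
```

Read this way, the log says the stick rejected a MODE SENSE(6) request containing a field it does not support.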

However, this is all a red herring, since the DataTraveler G3 produces the same log entries and it's working fine:


2013-08-26T18:53:31.861Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x412400810cc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba39:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
2013-08-26T18:53:31.861Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x412400810cc0) 0x1a, CmdSN 0x4b6 from world 0 to dev "mpx.vmhba39:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

Conclusion

The conclusion is a very unsatisfactory "ESXi 5.1 is fussy" since really there's no news there at all. I also have no idea how to pre-judge which USB sticks will work and which will fail, so I've just gone and bought another three DataTraveler G3s since I need them to complete the lab build.

Comments welcome! :)

<blink>HILARIOUS UPDATE</blink>

Those three identical 4GB sticks I ordered? They arrived. Look:



The one which is not in the packet is currently in the first MicroServer.

It exhibits exactly the same behaviour as the Lexar sticks. 

There are a number of differences:
  • Product IDs: the working stick is Vendor 0951, Product 1643 whilst the new one is Vendor 0951, Product 1665
  • Model strings: 'G3' in the working and '2.0' in the non-working, despite both bearing the 'G3' moniker on the plastic casing
  • Usable capacity: 3690MB in the working and 3695MB in the non-working
  • SCSI revision: the working stick natively reports version 2, whilst the non-working stick natively reports version 4. For the non-working stick there is a kernel message about patching the inquiry data so that the version 4 device appears to be version 2

The SCSI revision number appears to be the crux of the matter.


Working Kingston DataTraveler:

2013-08-29T20:18:13.102Z cpu0:210103)usb-storage: detected SCSI revision number 2 on vmhba32 
2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 888: Path 'vmhba32:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler G3 '  Rev: '1.00' 
2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 891: Path 'vmhba32:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)

Non-working Kingston DataTraveler:

2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: detected SCSI revision number 4 on vmhba38
2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba38
2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 888: Path 'vmhba38:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler 2.0'  Rev: '1.00' 
2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 891: Path 'vmhba38:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)

Non-working Lexar JumpDrive:

2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: detected SCSI revision number 4 on vmhba40 
2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba40 
2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: patching inquiry data to clear TPGS (3) on vmhba40 
2013-08-29T20:30:25.002Z cpu0:210740)ScsiScan: 888: Path 'vmhba40:C0:T0:L0': Vendor: 'Lexar   '  Model: 'USB Flash Drive '  Rev: '1100' 
2013-08-29T20:30:25.002Z cpu0:210740)ScsiScan: 891: Path 'vmhba40:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)
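
To check a pile of sticks quickly, the usb-storage lines can be summarised per adapter. This is a Python sketch, assuming the two message formats shown in the logs above. The function name is hypothetical, and the link between "patched" and "won't format" rests only on the three sticks tested here.

```python
import re

DETECTED = re.compile(
    r"usb-storage: detected SCSI revision number (\d+) on (vmhba\d+)"
)
PATCHED = re.compile(
    r"usb-storage: patching inquiry data to change SCSI revision number "
    r"from (\d+) to (\d+) on (vmhba\d+)"
)


def summarise_revisions(log_text):
    """Map each adapter to its native SCSI revision and whether it was patched."""
    adapters = {}
    for line in log_text.splitlines():
        found = DETECTED.search(line)
        if found:
            adapters[found.group(2)] = {"native": int(found.group(1)), "patched": False}
            continue
        found = PATCHED.search(line)
        if found:
            entry = adapters.setdefault(
                found.group(3), {"native": int(found.group(1)), "patched": False}
            )
            entry["patched"] = True
    return adapters


sample_log = """\
2013-08-29T20:18:13.102Z cpu0:210103)usb-storage: detected SCSI revision number 2 on vmhba32
2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: detected SCSI revision number 4 on vmhba38
2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba38
"""

for adapter, info in summarise_revisions(sample_log).items():
    print(adapter, info)
```

In the sample, vmhba32 (the working stick) is left alone and vmhba38 (the non-working one) is flagged as patched.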

I did find that the 'usb-storage' kernel module being used supports per-device 'quirks':

~ # esxcfg-module -i usb-storage
esxcfg-module module information input file: /usr/lib/vmware/vmkmod/usb-storage 
License: GPL
Version: Version 1.0-3vmw, Build: 1065491, Interface: 9.2  
Built on: Mar 23 2013
[...]
quirks: string   supplemental list of device IDs and their quirks

Since the ESXi 'usb-storage' module is derived from the Linux driver of the same name (note the GPL licence above), I found a clue as to what the various quirks mean: http://lxr.free-electrons.com/source/drivers/usb/storage/usb.c so I tried setting a whole batch of the available options for a Lexar stick, leaving out 'Ignore Device' (option 'i'), with the following command:


# esxcfg-module -s quirks=5dc:a815:abcdehlnoprsw usb-storage
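
The value follows the upstream Linux convention of VendorID:ProductID:Flags with the IDs in hex. If you are trying several sticks, a short Python sketch like this can build the option string. It assumes ESXi accepts the same format as Linux, including comma-separated entries for more than one device, which this post has not verified. The helper names are invented for illustration.

```python
import re


def quirk_entry(vendor_id, product_id, flags):
    """Format one VendorID:ProductID:Flags entry with zero-padded hex IDs."""
    if not re.fullmatch(r"[a-z]+", flags):
        raise ValueError("flags must be one or more lowercase letters")
    if not (0 <= vendor_id <= 0xFFFF and 0 <= product_id <= 0xFFFF):
        raise ValueError("USB vendor and product IDs are 16-bit values")
    return f"{vendor_id:04x}:{product_id:04x}:{flags}"


def quirks_option(entries):
    """Join entries into the string passed to 'esxcfg-module -s'."""
    return "quirks=" + ",".join(entries)


lexar = quirk_entry(0x05DC, 0xA815, "abcdehlnoprsw")
print(quirks_option([lexar]))
```

The sketch pads the vendor ID to four digits (05dc), whereas the command above wrote it as 5dc. Both should parse to the same hex value, but only the unpadded form was actually tried in this post.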

I know the module settings took effect because this was logged:



2013-08-29T21:16:22.137Z cpu0:2489)<6>usb-storage 1-4:1.0: Quirks match for vid 05dc pid a815: 392b1

However, even after a reboot of that ESXi host, there was no change in the behaviour :( Boo! Let's hope that the kernel with vSphere 5.5 is a little more forgiving!


12 comments:

  1. I was trying to create some "scratch space"/persistent storage (and potentially some smaller VMs for storage controller diags) on a largish (8GB) USB stick we are using for ESXi. I get the same sense errors as well. Pity.

  2. Can you clarify -- are you saying that some 4GB G3 memory sticks can be used as a VMFS-formatted datastore, and that it is possible to install a VM directly onto one?

    Replies
    1. Hey Eilz,

      Yep that's exactly it, and there's no way I know of to determine which ones are compatible.

      As it happens, I've now been running into problems with the one G3 stick that I was able to format to VMFS due to the limited write cycles on a USB stick.

      As a result I'm now moving away from this approach back to traditional shared NFS.

  3. For this line
    # partedUtil setptbl mpx.vmhba39\:C0\:T0\:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0"

    What are the numbers I would put in for 27 gigs on a usb drive?

  4. Hi Jordan,

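    Where I have the figure of 7550550, that is the partition's end sector, counted in 512-byte sectors (7550550 x 512 bytes is approx 3.6GB usable space). For a 27GB partition starting at sector 2048 you'd want an end sector of approx 52,700,000 (27,000,000,000 / 512), as long as that is no higher than the last usable sector that 'partedUtil getUsableSectors' reports for the device.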

  5. Same issue exists in 5.5, unfortunately.

    I thought I'd figured a way to get around it. I installed ESXi inside a VM (yes, you can do that) and created a VMDK file for it with the exact disk geometry (C/H/S) of my 16GB USB stick. I then used your commands to partition that VMDK and create a vmfs5 volume on it.

    Then, I tried to use the dd command to copy the whole disk to my USB stick. No luck, ESXi refused to write to it. So I rebooted my VM into a Linux LiveCD and from there I was able to dd onto the USB stick.

    Booting back into ESXi, I was *so* close! My vmfs5 volume on my USB stick did show up in my list of storage devices! But it was greyed out. When I right-clicked on it and chose "Mount", I got the error 'Call "HostStorageSystem.MountVmfsVolume" for object "storageSystem" on ESXi "192.168.100.104" failed. An unknown error has occurred.' and a message in the kernel log: "cpu0:35198 opID=f74363fb)LVM: 11752: Failed to open device mpx.vmhba33:C0:T0:L0:1 : Not supported"

    Ugh, they REALLY don't want you putting vmfs volumes on USB!

  6. Same as above. I successfully created a datastore on USB from ESXi 6.0 and it works fine there, but when I moved the same drive (formatted in 6.0) to ESXi 5.5 it shows up greyed out (inactive and unmounted) :(
