Monday, 26 August 2013

VMFS-formatted USB sticks in VMware vSphere 5.1

UPDATE: vSphere 5.5 behaves in the same way - still no solution other than trying to find older USB keys on eBay and hoping for the best! :(

Over the last couple of weeks I've been building a VMware home lab (it'll probably get a post of its own in the near future) to replace 'eddie', my ageing Core 2 Duo server. One of the constraints I enforced during the build was for the lab to be 'self consuming'.

By that I mean it mustn't rely on external devices for DHCP/DNS or storage - the lab will be using power all the time so I wanted to minimise the amount of running equipment.

For now I want to focus on the use of VMFS formatted partitions on USB keys / sticks / thumb drives with ESXi 5.1 because I assumed it would be straightforward. The summary is simple: ESXi is just as fussy with USB devices as it is with everything else.


Things started off well with a 4GB Kingston DataTraveler G3 (Vendor 0951, Product 1643)

# cd /dev/disks
# /etc/init.d/usbarbitrator stop
This service needs to be stopped first, otherwise ESXi will offer the stick as a passthrough device for use by VMs.

Now insert the 4GB USB stick.
# esxcfg-scsidevs -c | grep USB
mpx.vmhba33:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0                                                      988MB     NMP     Local USB Direct-Access (mpx.vmhba33:C0:T0:L0)
mpx.vmhba39:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0                                                      3690MB    NMP     Local USB Direct-Access (mpx.vmhba39:C0:T0:L0)
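
For scripting, the device name and size can be pulled out of that listing. This is only a minimal sketch: it assumes the column layout shown above (name first, size fourth) and that awk is available in the ESXi shell, and `usb_devices` is a hypothetical helper name rather than an ESXi command.

```shell
# usb_devices: read `esxcfg-scsidevs -c` output on stdin and print
# "<device> <size>" for every line that mentions USB.
# Assumes the column order seen in the listing above.
usb_devices() {
    awk '/USB/ { print $1, $4 }'
}

# On the host this would be used as:
#   esxcfg-scsidevs -c | usb_devices
```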

Now I can see two USB devices - a 1GB and a 4GB. I am using the 1GB to boot the ESXi host - no problem there since there are no VMFS partitions on the boot device - just the small boot, a few FATs and a VMkernel diagnostic partition. The interesting stuff is on the 4GB device. Here are the commands I used to create a new GPT disk label and a VMFS5 partition to go on it:


# partedUtil mklabel mpx.vmhba39\:C0\:T0\:L0 gpt
# partedUtil setptbl mpx.vmhba39\:C0\:T0\:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0"
gpt  
0 0 0 0 
1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0
# vmkfstools -C vmfs5 mpx.vmhba39\:C0\:T0\:L0:1  
create fs deviceName:'mpx.vmhba39:C0:T0:L0:1', fsShortName:'vmfs5', fsName:'(null)' 
deviceFullPath:/dev/disks/mpx.vmhba39:C0:T0:L0:1 deviceFile:mpx.vmhba39:C0:T0:L0:1 
Checking if remote hosts are using this device as a valid file system. This may take a few seconds... 
Creating vmfs5 file system on "mpx.vmhba39:C0:T0:L0:1" with blockSize 1048576 and volume label "none". 
Successfully created new volume: 521b9571-a9085b97-0a71-0015176fe0a0

At this point, you can simply click 'Refresh' in the VMware vSphere Client and the new volume will appear. Rename it to something sensible (or use the -S option in vmkfstools) and we're done.
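
The two numbers in the partition line (2048 and 7550550) are the first and last sector of the partition, so sizing a partition for a different stick is just sector arithmetic. A small sketch, assuming 512-byte sectors (which is consistent with the 3690MB device above); `part_mib` and `end_sector_for_mib` are hypothetical helper names.

```shell
# Sector arithmetic for a partedUtil partition line.
# Assumes 512-byte sectors, i.e. 2048 sectors per MiB.

# part_mib START END: size in whole MiB of a partition spanning START..END.
part_mib() {
    echo $(( ($2 - $1 + 1) / 2048 ))
}

# end_sector_for_mib START MIB: last sector of a MIB-sized partition
# that begins at sector START.
end_sector_for_mib() {
    echo $(( $1 + $2 * 2048 - 1 ))
}

part_mib 2048 7550550          # prints 3685
end_sector_for_mib 2048 3685   # prints 7548927
```

Whatever end sector is chosen has to stay inside the disk's usable range, which `partedUtil getUsableSectors` reports for a given device.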

Great. Now I got on with my life for a week and then came back to this work using the next of the USB sticks that I had lying around. This time it's a 4GB Lexar JumpDrive TwistTurn (Vendor 05dc, Product a815)

http://www.lexar.com/products/lexar-jumpdrive-twistturn-usb-flash-drive

So, same again, right?

# esxcfg-scsidevs -c | grep USB 
mpx.vmhba33:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba33:C0:T0:L0                                                      988MB     NMP     Local USB Direct-Access (mpx.vmhba33:C0:T0:L0) 
mpx.vmhba39:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba39:C0:T0:L0                                                      3690MB    NMP     Local USB Direct-Access (mpx.vmhba39:C0:T0:L0) 
mpx.vmhba40:C0:T0:L0                                                      Direct-Access    /vmfs/devices/disks/mpx.vmhba40:C0:T0:L0                                                      3748MB    NMP     Local USB Direct-Access (mpx.vmhba40:C0:T0:L0)
Ah OK, the Lexar is a bit bigger. Not all the 4GB sticks are the same. Anyway, let's continue as before:

# partedUtil mklabel mpx.vmhba40:C0:T0:L0 gpt 
# partedUtil setptbl mpx.vmhba40:C0:T0:L0 gpt "1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0" 
gpt 
0 0 0 0 
1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0   
# vmkfstools -C vmfs5 mpx.vmhba40:C0:T0:L0:1  
create fs deviceName:'mpx.vmhba40:C0:T0:L0:1', fsShortName:'vmfs5', fsName:'(null)' 
deviceFullPath:/dev/disks/mpx.vmhba40:C0:T0:L0:1 deviceFile:mpx.vmhba40:C0:T0:L0:1 
Checking if remote hosts are using this device as a valid file system. This may take a few seconds... 
Creating vmfs5 file system on "mpx.vmhba40:C0:T0:L0:1" with blockSize 1048576 and volume label "none". 
Usage: vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/vml... or,       vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/naa... or,       vmkfstools -C [vmfs3|vmfs5] /vmfs/devices/disks/mpx.vmhbaA:T:L:P 
Error: vmkfstools failed: vmkernel is not loaded or call not implemented.

wat

I had a look in the vmkernel.log (typically through the output of 'dmesg') and found a couple of lines which seemed relevant, since they came up at the time when I issued the 'vmkfstools' command:
2013-08-26T17:40:09.176Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x41240080c0c0, 0) to dev "mpx.vmhba40:C0:T0:L0" on path "vmhba40:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
2013-08-26T17:40:09.176Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x41240080c0c0) 0x1a, CmdSN 0x1a4b from world 0 to dev "mpx.vmhba40:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

The sense data seems interesting. VMware KB article 289902 discusses the sense keys: 0x5 is "ILLEGAL REQUEST", and the linked page on the T10 website goes into further detail: '0x24 0x0' is INVALID FIELD IN CDB. In other words, the stick rejected a field in command 0x1a, which is MODE SENSE(6).
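
Picking the sense triple out of log lines by eye gets tedious, so here is a rough sketch of doing it in the shell. It assumes the "Valid sense data:" wording seen in the log lines above; `sense_triple` and `sense_key_name` are hypothetical helper names, and only the eight lowest sense keys are mapped.

```shell
# sense_triple: read vmkernel log lines on stdin and print
# "<sense key> <ASC> <ASCQ>" for each line carrying valid sense data.
sense_triple() {
    sed -n 's/.*Valid sense data: \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\).*/\1 \2 \3/p'
}

# sense_key_name KEY: name of a SCSI sense key (the first value of the triple).
sense_key_name() {
    case "$1" in
        0x0) echo "NO SENSE" ;;
        0x1) echo "RECOVERED ERROR" ;;
        0x2) echo "NOT READY" ;;
        0x3) echo "MEDIUM ERROR" ;;
        0x4) echo "HARDWARE ERROR" ;;
        0x5) echo "ILLEGAL REQUEST" ;;
        0x6) echo "UNIT ATTENTION" ;;
        0x7) echo "DATA PROTECT" ;;
        *)   echo "UNKNOWN" ;;
    esac
}

# On the host, for example:
#   dmesg | sense_triple | sort | uniq -c
```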

However, this is all a red herring since the DataTraveler G3 is also seeing the same logs and it's working fine:


2013-08-26T18:53:31.861Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2319: Cmd 0x1a (0x412400810cc0, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba39:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE
2013-08-26T18:53:31.861Z cpu1:2049)ScsiDeviceIO: 2331: Cmd(0x412400810cc0) 0x1a, CmdSN 0x4b6 from world 0 to dev "mpx.vmhba39:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

Conclusion

The conclusion is a very unsatisfactory "ESXi 5.1 is fussy" since really there's no news there at all. I also have no idea how to pre-judge which USB sticks will work and which will fail, so I've just gone and bought another three DataTraveler G3s since I need them to complete the lab build.

Comments welcome! :)

<blink>HILARIOUS UPDATE</blink>

Those three identical 4GB sticks I ordered? They arrived. Look:



The one which is not in the packet is currently in the first MicroServer.

It exhibits exactly the same behaviour as the Lexar sticks. 

There are a number of differences:
  • Product IDs: the working stick is Vendor 0951, Product 1643 whilst the new one is Vendor 0951, Product 1665
  • Model strings: 'G3' in the working and '2.0' in the non-working, despite both bearing the 'G3' moniker on the plastic casing
  • Usable capacity: 3690MB in the working and 3695MB in the non-working
  • SCSI revision: the working stick is natively revision 2, whilst the non-working one is natively revision 4. For the latter, there is a kernel message about patching the inquiry data so that revision 4 appears as revision 2

The SCSI revision number appears to be the crux of the matter.


Working Kingston DataTraveler:

2013-08-29T20:18:13.102Z cpu0:210103)usb-storage: detected SCSI revision number 2 on vmhba32 
2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 888: Path 'vmhba32:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler G3 '  Rev: '1.00' 
2013-08-29T20:18:13.102Z cpu0:210105)ScsiScan: 891: Path 'vmhba32:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)

Non-working Kingston DataTraveler:

2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: detected SCSI revision number 4 on vmhba38
2013-08-29T19:47:33.356Z cpu1:29884)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba38
2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 888: Path 'vmhba38:C0:T0:L0': Vendor: 'Kingston'  Model: 'DataTraveler 2.0'  Rev: '1.00' 
2013-08-29T19:47:33.357Z cpu1:29885)ScsiScan: 891: Path 'vmhba38:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)

Non-working Lexar JumpDrive:

2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: detected SCSI revision number 4 on vmhba40 
2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: patching inquiry data to change SCSI revision number from 4 to 2 on vmhba40 
2013-08-29T20:30:25.002Z cpu0:210739)usb-storage: patching inquiry data to clear TPGS (3) on vmhba40 
2013-08-29T20:30:25.002Z cpu0:210740)ScsiScan: 888: Path 'vmhba40:C0:T0:L0': Vendor: 'Lexar   '  Model: 'USB Flash Drive '  Rev: '1100' 
2013-08-29T20:30:25.002Z cpu0:210740)ScsiScan: 891: Path 'vmhba40:C0:T0:L0': Type: 0x0, ANSI rev: 2, TPGS: 0 (none)
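
To check a stick quickly, the native revision can be pulled out of the log rather than read by eye. A minimal sketch, assuming the "detected SCSI revision number" message format seen in the excerpts above (other builds may word it differently); `native_scsi_rev` is a hypothetical helper name.

```shell
# native_scsi_rev: read vmkernel log lines on stdin and print
# "<adapter> <native revision>" for each usb-storage detection message.
native_scsi_rev() {
    sed -n 's/.*usb-storage: detected SCSI revision number \([0-9][0-9]*\) on \(vmhba[0-9][0-9]*\).*/\2 \1/p'
}

# On the host, for example:
#   dmesg | native_scsi_rev
```

Going by the three sticks above, a native revision of 2 went with the working stick and 4 with the two that failed, although three samples is hardly proof.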

I did find that the 'usb-storage' kernel module being used supports per-device 'quirks':

~ # esxcfg-module -i usb-storage
esxcfg-module module information input file: /usr/lib/vmware/vmkmod/usb-storage 
License: GPL
Version: Version 1.0-3vmw, Build: 1065491, Interface: 9.2  
Built on: Mar 23 2013
[...]
quirks: string   supplemental list of device IDs and their quirks

Since this module appears to be derived from the Linux usb-storage driver (note the GPL licence above), the Linux source gave me a clue as to what the various quirks mean: http://lxr.free-electrons.com/source/drivers/usb/storage/usb.c so I tried setting all the available options for a Lexar stick except for 'Ignore Device' (option 'i') with the following command:


# esxcfg-module -s quirks=5dc:a815:abcdehlnoprsw usb-storage

I know the module settings took effect because this was logged:



2013-08-29T21:16:22.137Z cpu0:2489)<6>usb-storage 1-4:1.0: Quirks match for vid 05dc pid a815: 392b1
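
The value at the end of that line looks like a bitmask of the driver's quirk flags. Below is a rough sketch for decoding it. The bit values are an assumption: they are the US_FL_* definitions from mainline Linux's include/linux/usb_usual.h, which the ESXi build of the driver may or may not share, and only a subset of flags is listed. `decode_quirks` is a hypothetical helper name.

```shell
# decode_quirks MASK: print the names of the quirk flags set in MASK.
# ASSUMPTION: bit values follow mainline Linux's US_FL_* definitions;
# the ESXi build of usb-storage may not match them.
decode_quirks() {
    mask=$(( $1 ))
    names=""
    for entry in \
        0x00001:SINGLE_LUN \
        0x00010:FIX_CAPACITY \
        0x00020:IGNORE_RESIDUE \
        0x00080:NOT_LOCKABLE \
        0x00200:NO_WP_DETECT \
        0x00400:MAX_SECTORS_64 \
        0x00800:IGNORE_DEVICE \
        0x01000:CAPACITY_HEURISTICS \
        0x08000:SANE_SENSE \
        0x10000:CAPACITY_OK \
        0x20000:BAD_SENSE
    do
        bit=${entry%%:*}     # part before the colon: the bit value
        name=${entry#*:}     # part after the colon: the flag name
        if [ $(( mask & $bit )) -ne 0 ]; then
            names="${names:+$names }$name"
        fi
    done
    echo "$names"
}

decode_quirks 0x392b1
```

Under those assumptions, 392b1 decodes to nine flags (SINGLE_LUN, FIX_CAPACITY, IGNORE_RESIDUE, NOT_LOCKABLE, NO_WP_DETECT, CAPACITY_HEURISTICS, SANE_SENSE, CAPACITY_OK, BAD_SENSE), which is fewer than the thirteen letters passed in. That would suggest this driver build does not recognise every letter the current Linux source lists.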

However, even after a reboot of that ESXi host, there was no change in the behaviour :( Boo! Let's hope that the kernel with vSphere 5.5 is a little more forgiving!


10 comments:

  1. I was trying to create some "scratch space"/persistent-storage (and potentially some smaller VMs for storage controller diags) on a largish USB (8gb) we are using for ESXi. I get the same sense errors as well. Pity.

  2. Can you clarify -- are you saying that some 4GB G3 memory sticks can be used as a VMFS-formatted datastore, and that it is possible to install a VM directly onto it?

    Replies
    1. Hey Eilz,

      Yep that's exactly it, and there's no way I know of to determine which ones are compatible.

      As it happens, I've now been running into problems with the one G3 stick that I was able to format to VMFS due to the limited write cycles on a USB stick.

      As a result I'm now moving away from this approach back to traditional shared NFS.

  3. For this line
    # partedUtil setptbl mpx.vmhba39\:C0\:T0\:L0 gpt \"1 2048 7550550 AA31E02A400F11DB9590000C2911D1B8 0"

    What are the numbers I would put in for 27 gigs on a usb drive?

  4. Hi Jordan,

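    Where I have the figure of 7550550, that is the last sector of the partition, counted in 512-byte sectors (2048 is the first sector), which gives approx 3.6GB of usable space. For a 27GB partition (27,000,000,000 bytes) you'd want an end sector of approx 52,700,000, provided that is still within the usable sectors of the stick.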

  5. Same issue exists in 5.5, unfortunately.

    I thought I'd figured a way to get around it. I installed ESXi inside a VM (yes, you can do that) and created a VMDK file for it with the exact disk geometry (C/H/S) of my 16GB USB stick. I then used your commands to partition that VMDK and create a vmfs5 volume on it.

    Then, I tried to use the dd command to copy the whole disk to my USB stick. No luck, ESXi refused to write to it. So I rebooted my VM into a Linux LiveCD and from there I was able to dd onto the USB stick.

    Booting back into ESXi, I was *so* close! My vmfs5 volume on my USB stick did show up in my list of storage devices! But, it was greyed out. When I right-clicked on it and chose "Mount", I got the error 'Call "HostStorageSystem.MountVmfsVolume" for object "storageSystem" on ESXi "192.168.100.104" failed. An unknown error has occurred.' and a message in the kernel log: "cpu0:35198 opID=f74363fb)LVM: 11752: Failed to open device mpx.vmhba33:C0:T0:L0:1 : Not supported"

    Ugh, they REALLY don't want you putting vmfs volumes on USB!

  6. Same as above: I successfully created a datastore on USB from ESXi 6.0 and it works fine there. The same drive (formatted in 6.0), moved on to ESXi 5.5, shows up but greyed out (inactive and unmounted) :(
