Several weeks back, I came across a really strange post on the VMTN communities asking how to change the Device ID (DID) and Vendor ID (VID) for a USB device that has been passed through to a VM from ESXi. The device in question is the Google Coral USB Edge TPU (Tensor Processing Unit) Accelerator, a relatively inexpensive device that can help accelerate machine learning (ML) inferencing. With all the buzz these days around Generative AI and ChatGPT, I can only imagine its popularity has grown even further, but I did not realize how popular this device has been in the community, especially among those wanting to use it with ESXi.

The initial observation reported by this user, and by many others in the Coral community, was that ESXi was showing the incorrect VID/DID for the Coral USB device and that, because of this, the device was not working correctly when passed through to a VM. They were looking for a way to change the DID/VID value from 1a6e:089a (Global Unichip Corp.) to 18d1:9302 (Google Inc.).

Interestingly enough, a couple of weeks ago, my buddy Alan Renouf had also shared that he recently purchased the Coral USB device, so I figured I would check with him first to see if he was observing the same behavior that was being reported, which he was. I had been going through the GitHub reports to try to better understand the issue and some of the previous workarounds that users had applied, including disabling the vmkusb module, which I definitely do not recommend, especially for more recent releases of ESXi where doing so will simply disable all USB functionality on your ESXi host.

I still could not wrap my head around the issue as the reports did not make any sense in terms of the DID/VID not being claimed correctly or that it needed to change to properly function. This also did not make sense when speaking with our USB expert (Songtao who also developed our USB Network Native Driver for ESXi), so I decided to bite the bullet and purchase the Coral USB device, which apparently is difficult to obtain unless you overpay on Amazon, which I did.

After some exhaustive testing and debugging with Songtao, I think I finally understood what was actually going on, and many of the assumptions that had been floating around were simply incorrect or were missing information. Putting ESXi aside for a second, the Coral USB device is actually a USB composite device, and this will make more sense later. If you plug the Coral USB device into any system, it is expected to have the DID/VID value of 1a6e:089a (Global Unichip Corp.), and this is the correct and expected behavior.

Before you can use the Coral USB device, firmware must be flashed onto the device, which happens indirectly when you run one of the Coral examples, and this is actually what changes the DID/VID value to 18d1:9302 (Google Inc.). In fact, using another Coral project from Google called webcoral, you can manually perform the firmware update as shared in this blog post.

So while the Coral USB device must be flashed with the correct firmware to function correctly, ESXi was correctly seeing the initial DID/VID, and this is also true for any other operating system when you first plug in the Coral USB device. With this information, we ran some additional experiments where the following error was observed from the VM when it initially attempts to communicate with the Coral USB device:

Failed to load delegate from libedgetpu.so.1

While on the surface it may look like a failed attempt, what actually happened was that the firmware was successfully flashed onto the Coral USB device, but ESXi was not aware of this change and was not expecting the device to change. This was further validated by additional testing using VMware Fusion, to first understand the expected behavior of the device before proceeding to find a solution for ESXi. Once we understood what was needed, I was able to debug further with Songtao and we came up with a pretty simple solution that makes ESXi aware of the updated Coral USB device, after which the VM was able to use the passthrough device without any issues.

As with any technical issue, it is extremely important to actually understand what is happening, especially if you are looking to find or ask for a solution. Initial observations can also be misleading and add confusion when reporting issues.

I have personally not worked with any USB device that has ever behaved this way, so I cannot say if this is common or not, but I do think the device could have been simplified in its design. Perhaps this was a design consideration to ensure the device was always running the latest firmware, but it definitely is one of the stranger USB devices that I have ever come across, and Songtao agreed.

Note: Google also has a Coral PCIe Edge TPU Accelerator that many folks have also reported issues with on ESXi, but it turns out this device does NOT actually conform to the PCIe standard and violates the PCIe specification, as shared by one of our Principal Engineers at VMware, and therefore cannot be used for passthrough with ESXi. If anyone from the Google Coral team is reading this, there is a recommendation in the link above on how to remediate this problem if you are interested in enabling this for your users requesting support for ESXi.

Below are the step-by-step instructions for getting the Coral USB device to function in passthrough mode with a VM using recent ESXi 7.x and 8.x releases.

Step 1 - I will assume you have already set up a VM to run the Coral software. If not, install a supported operating system for use with Coral. For my setup, I am using Ubuntu 20.04. Make sure you have a USB 3.1 controller configured when adding the Coral USB device. If the VM is powered on, go ahead and shut it down as we need to make one additional configuration change to the VM.



Step 2 - Edit the VM Advanced Setting and add the following setting:

usb.quirks.device0 = 0x18d1:0x9302 skip-reset, skip-refresh, skip-setconfig
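If you would rather verify this setting from a shell than from the UI, here is a minimal sketch. It assumes the setting ends up in the VM's .vmx file in the usual key = "value" form; the function name vmx_has_coral_quirk and the example path are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch: return success if the given .vmx file already carries a USB quirk
# entry for the initialized Coral ID (0x18d1:0x9302).
# Assumption: the entry is stored as  usb.quirks.deviceN = "0x18d1:0x9302 ..."
vmx_has_coral_quirk() {
    grep -Eq '^usb\.quirks\.device[0-9]+ *= *"?0x18d1:0x9302 ' "$1"
}
```

For example, vmx_has_coral_quirk /vmfs/volumes/datastore1/coral-vm/coral-vm.vmx && echo "quirk present" (the path is a placeholder for wherever your VM lives).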



Step 3 - Power on the VM and then run through the initial Coral setup instructions which will initialize the Coral USB device and update it with the required firmware. It is expected that you will see the Failed to load delegate from libedgetpu.so.1 error message, but the underlying Coral USB device has already been successfully flashed.



Step 4 - Log in to the ESXi Shell to confirm that the Coral USB device is still showing the default value of Global Unichip Corp. by running the lsusb command as shown in the screenshot below.
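Since the whole issue comes down to which of the two IDs the host currently reports, a small helper can make the check less error-prone than eyeballing the output. This is only a sketch: the function name coral_state is hypothetical, and it simply looks for the two ID strings mentioned in this post in whatever lsusb-style text it is given.

```shell
#!/bin/sh
# Sketch: classify the Coral USB device state from lsusb-style text on stdin.
# Prints one of: initialized, uninitialized, absent.
coral_state() {
    input=$(cat)
    case "$input" in
        *18d1:9302*) echo "initialized" ;;    # Google Inc. (firmware flashed)
        *1a6e:089a*) echo "uninitialized" ;;  # Global Unichip Corp. (default)
        *)           echo "absent" ;;         # neither ID was found
    esac
}
```

You would run it as lsusb | coral_state, either in the ESXi Shell or inside the VM. At this point in the steps, the ESXi host should still report the default Global Unichip Corp. ID.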



Step 5 - Next, we need to make ESXi aware of the updated Coral USB device and there are two options in achieving this:

Reboot ESXi - As long as you do NOT unplug the Coral USB device from the physical ESXi host, as it has already been successfully initialized, this will be the quickest method

Reload USB Module - If you prefer not to reboot, we can make ESXi aware of the updated Coral USB device by reloading the USB module

To reload the USB module, log in to the ESXi Shell and run the following commands:

/etc/init.d/usbarbitrator stop
vmkload_mod -u vmkusb;vmkload_mod vmkusb
kill -SIGHUP $(ps -C | grep vmkdevmgr | awk '{print $1}')
/etc/init.d/usbarbitrator start

Note: If you have the USB Network Native Driver for ESXi installed, use the following commands instead to reload the USB module:

/etc/init.d/usbarbitrator stop
vmkload_mod -u vmkusb_nic_fling;vmkload_mod vmkusb_nic_fling
kill -SIGHUP $(ps -C | grep vmkdevmgr | awk '{print $1}')
/etc/init.d/usbarbitrator start



Step 6 - We can now confirm that the updated Coral USB device is showing the expected value of Google Inc. by re-running the lsusb command. It may take a second or two after the previous step, but you should then see the updated DID/VID for the Coral USB device as shown in the screenshot below.
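Because the new ID can take a moment to show up, a short polling loop saves you from re-running lsusb by hand. This is a rough sketch, not something from the Coral or ESXi tooling: the function name wait_for_usb_id and the LSUSB override variable are hypothetical, and the one-second interval is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: poll lsusb-style output until the given ID appears or we give up.
# $1 = ID to wait for (e.g. 18d1:9302), $2 = number of tries (default 10).
# LSUSB can be set to a different command for testing; it defaults to lsusb.
wait_for_usb_id() {
    id=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if ${LSUSB:-lsusb} 2>/dev/null | grep -q "ID $id"; then
            return 0    # the requested ID is now visible
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1            # the ID never appeared within the allotted tries
}
```

For example, wait_for_usb_id 18d1:9302 10 && echo "Coral USB device is ready" waits up to roughly ten seconds for the Google Inc. ID to appear.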



Step 7 - Finally, we can confirm that our VM can also see the updated Coral USB device by running the lsusb command within the OS. If we now re-run the Coral setup example, we can see that the operation completes successfully and the VM can properly communicate with the Coral USB device! 😎



Once the Coral USB device has been successfully initialized, its state persists across both VM and ESXi host reboots. The Coral USB device only returns to its default state when it is physically unplugged from the ESXi host, in which case you just need to re-run Steps 3 and 5.

With the Coral USB device now fully functional on ESXi, I am definitely interested in hearing how our users will be leveraging this device whether that is with the popular Frigate NVR application or for other ML inferencing solutions.