WilliamLam.com

  • About
    • About
    • Privacy
  • VMware Cloud Foundation
  • VKS
  • Homelab
    • Resources
    • Nested Virtualization
  • VMware Nostalgia
  • Apple

How to Decrease iSCSI Login Timeout on ESXi 5?

10.14.2011 by William Lam // 6 Comments

VMware recently identified an issue where the iSCSI boot process may take longer than expected on ESXi 5. This can occur when the iSCSI targets are unavailable while the ESXi host is booting up and the additional retry code that was added in ESXi 5 can cause a delay in the host startup. It has been noted that the number of retry iteration has been hard coded in the iSCSI stack to nine and this can not be modified.

UPDATE1: VMware just released the patch for iSCSI delay in ESXi 5 Express Patch 01 - kb.vmware.com/kb/2007108

Here is what you would see in /var/log/syslog.log after ESXi host boots up:

Being the curious person that I am, I decided to see if this hard coded value can actually be modified or tweaked. I believe I have found a way to decrease the delay significantly but this has only been tested in a limited lab environment. If you are potentially impacted by this iSCSI boot delay and would like to test this unsupported configuration change, I would be interested to see if this fact reduces the timing.

*** DISCLAIMER ***

This is not supported by VMware, please test this in a staging/development environment before pushing it out to any critical system
*** DISCLAIMER ***

My initial thought was check out the iscsid.conf configuration file, but noticed that VMware does not make use of this default configuration file (not automatically backed up by ESXi), instead it uses a small sqlite database file located in /etc/vmwre/vmkiscsid/vmkiscsid.db.

Since the vmkiscsid.db is actively backed up ESXi, any changes would persist through system reboot and does not require any special tricks.

UPDATE: Thanks to Andy Banta's comment below, you can view the iSCSI sqlite database by using an unsupported --dump-db and -x flag in vmkiscsid.

To dump the entire database, you can use the following command:

vmkiscsid --dump-db

To execute a specific SQL query such as viewing a particular database table, you can use the following command:

vmkiscsid -x "select * from discovery"

Again, this is an unsupported flag by VMware, use at your own risk but it does save you from copying the iSCSI database to another host for edits.

To view the contents of the sqlite file, I scp'ed the file off to a remote host that has sqlite3 client such as the VMware vMA appliance. After a bit of digging with some trial and error, I found the parameter  discovery.ltime in the discovery table would alter the time it takes for the iSCSI boot up process to complete when the iSCSI targets are unavailable during boot up. Before you make any changes, make a backup of your vmkiscsid.db file in case you need the original file.

To view the sqlite file, use the following command:

sqlite3 vmkiscsid.db
.mode line *
select * from discovery;

The default value is 27 and what I have found is that as long as it is not this particular value, the retry loop is not executed or it exits almost immediately after retrying for only a few seconds.

To change the value, I used the following SQL command:

update discovery set 'discovery.ltime'=1; 

To verify the value, you can use the following SQL command:

.mode line *
select * from discovery;

Once you have confirmed the change, you will need to type .quit to exit and then upload the modified vmkiscsid.db file to our ESXi 5 host.

Next to ensure the changes are saved immediately to the backup bootbank, run the /sbin/auto-backup.sh which will force an ESXi backup.

At this point, to test you can disconnect your iSCSI target from the network and reboot your ESXi 5 host, it should hopefully decrease the amount of time it takes to go through the iSCSI boot process.

As you can see from this screenshot of the ESXi syslog.log, it took only 15 seconds to retry and the continue through the boot up process.

In my test environment, I setup a vESXi host with software iSCSI initiator which binded to three VMkernel interfaces and connected to ten iSCSI targets on a Nexenta VSA. I disconnected the network adapter on iSCSI target and modified the discovery.ltime on ESXi host and checked out the logs to see how long it took to get pass the retry code.

Here is a table of the results:

discovery.ltime iSCSI Bootup Delay
1 15sec
3 15sec
6 15sec
12 15sec
24 15sec
26 15sec
27 7min
28 15sec
81 15sec

As you can see, only the value of 27 causes the extremely long delay (7 minutes in my environment) and all other values all behave pretty much the same (15secs roughly). VMware did mention hard coding the number of iterations to 9 and when you divide 27 into 9 you get 3. I tried using values that were multiple of three and I was not able to find any correlation to the delay other than it not taking as long as using the value 27. I also initially tested with 5 iSCSI targets and doubled it to 10 but it did not seem to be a factor in the overall delay.

I did experiment with other configuration parameters such as node.session.initial_login_retry_max, but it did not change the amount of time or iterations of the iSCSI retry code. Ultimately, I believe due to the hard coding of the retry iterations, by modifying discovery.ltime it bypasses the retry code or reduces the amount of retry all together. I am not an iscsid expert, so there is a possibility that other parameter changes could decrease the amount of wait time.

UPDATE: Please take a look at Andy Banta's comment below regarding the significance of the value 27 and the official definitions of the discovery.ltime and node.session.initial_login_retry_max parameters. Even though the behavior from my testing seems to reduce the iSCSI boot delay, there is an official fix coming from VMware.

UPDATE1: VMware just released the patch for iSCSI delay in ESXi 5 Express Patch 01 - kb.vmware.com/kb/2007108

I would be interested to see if this hack holds true for environments with multiple network portals or great number of iSCSI targets. If you are interested or would like to test this theory, please leave a comment regarding your environment.

Categories // Uncategorized Tags // ESXi 5.0, iscsi, vSphere 5.0

Search

Thank Author

Author

William is Distinguished Platform Engineering Architect in the VMware Cloud Foundation (VCF) Division at Broadcom. His primary focus is helping customers and partners build, run and operate a modern Private Cloud using the VMware Cloud Foundation (VCF) platform.

Connect

  • Bluesky
  • Email
  • GitHub
  • LinkedIn
  • Mastodon
  • Reddit
  • RSS
  • Twitter
  • Vimeo

Recent

  • VMUG Connect 2025 - Minimal VMware Cloud Foundation (VCF) 5.x in a Box  05/15/2025
  • Programmatically accessing the Broadcom Compatibility Guide (BCG) 05/06/2025
  • Quick Tip - Validating Broadcom Download Token  05/01/2025
  • Supported chipsets for the USB Network Native Driver for ESXi Fling 04/23/2025
  • vCenter Identity Federation with Authelia 04/16/2025

Advertisment

Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use.
To find out more, including how to control cookies, see here: Cookie Policy

Copyright WilliamLam.com © 2025