WilliamLam.com

How to create vCenter Alarm to alert on ESXi 5.5u1 NFS APD issue?

04.19.2014 by William Lam // 14 Comments

As some of you may have heard, there is currently a known issue with NFS-based datastores (including VSA NFS datastores) after upgrading to vSphere 5.5 Update 1. The issue causes NFS datastores to disconnect and go into an APD (All Paths Down) state. VMware is aware of the problem and you can follow KB 2076392 for the latest updates.

While going through my Twitter stream this morning, I noticed an interesting question from fellow Blogger and friend Jase McCarty who asked the following:

[Screenshot: vsphere55u1-nfs-apd-alarm-2]

I was quite surprised to hear that there were no vCenter Alarms being triggered for this issue. I decided to take a look at the KB to better understand the symptoms and see if there was anything I could do to help. From what I can tell, the only way to identify this particular problem is by looking at the logs; the KB includes an example of what you would see.

Once I took a look at the logs, I knew there were at least two ways to get alerts. One option would be to leverage vCenter Log Insight and create a query based on the particular string, but not every customer is using Log Insight and it does require a bit of setup. The second, more obvious option for me would be to key off of the VMkernel VOBs that are being generated, which I have written about in the past for detecting duplicate IP Addresses for ESXi and VSAN component threshold count.

Here are the steps to create the vCenter Alarm:

Step 1 - Create a new vCenter Alarm and give it a name. Select "Hosts" for Monitor and "Specific event occurring ..." for Monitor for.

[Screenshot: vsphere55u1-nfs-apd-alarm-0]

Step 2 - For the Trigger, you will add the following VOB entries (just copy/paste them in):

  • esx.problem.storage.apd.start
  • esx.problem.vmfs.nfs.server.disconnect
  • esx.problem.storage.apd.timeout

Note: The alarm will activate if ANY of the VOBs are seen, since it is an OR statement. It would have been nice to be able to group these together (as an AND condition) to generate the alarm.
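
To make the OR behavior concrete, here is a minimal Python sketch that models the trigger logic outside of vCenter. The `AlarmDefinition` class, its `fires_on` method, and the alarm name are hypothetical and used only for illustration; they are not part of any VMware API.

```python
from dataclasses import dataclass

# The three VOB event IDs listed in Step 2.
APD_VOBS = (
    "esx.problem.storage.apd.start",
    "esx.problem.vmfs.nfs.server.disconnect",
    "esx.problem.storage.apd.timeout",
)


@dataclass(frozen=True)
class AlarmDefinition:
    """Hypothetical stand-in for the alarm configured in Steps 1 and 2."""
    name: str
    monitor: str = "Hosts"
    triggers: tuple = APD_VOBS

    def fires_on(self, event_type_id: str) -> bool:
        # OR semantics: a single matching VOB is enough to trip the alarm.
        return event_type_id in self.triggers


alarm = AlarmDefinition(name="NFS APD (example name)")
print(alarm.fires_on("esx.problem.storage.apd.start"))
# The next ID is made up to show a non-matching event.
print(alarm.fires_on("esx.problem.some.other.event"))
```

An AND grouping would instead require every listed VOB to be seen before the alarm trips, which is what the event trigger list does not offer here.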

[Screenshot: vsphere55u1-nfs-apd-alarm-1]

Once the alarm has been created, you will at least have a way to get notified if you are potentially affected by this problem. I would still highly recommend you subscribe to KB 2076392 for all the latest updates.

Categories // ESXi, vSphere 5.5 Tags // apd, ESXi 5.5, nfs, vob, vSphere 5.5

Comments

  1. vroomblog says

    04/30/2014 at 12:01 pm

    Thanks for the alarm. Is there a way to have the same for FC storage?

    • William Lam says

      04/30/2014 at 12:50 pm

      Take a look at this article for other vSphere VOBs including generic storage ones that you can use http://www.virtuallyghetto.com/2014/04/other-handy-vsphere-vobs-for-creating-vcenter-alarms.html

  2. Admin says

    05/30/2014 at 1:49 pm

    Is there a difference in how the alarm triggers are reported in the fat client vs. the Web Client?
    I have the screenshots, but I'm not sure if I can attach them to the comment.

  3. Steve H says

    06/03/2014 at 9:21 pm

    Do these VOB alerts work on vmware 5.0?

    • William Lam says

      06/04/2014 at 5:15 am

      They should, but you can always confirm by checking whether these VOBs have been defined in 5.0.

  4. Jim Millard says

    10/01/2014 at 11:45 am

    The alarms are nice, but I've noticed two things about them: 1) they never go from red to green after being tripped, and 2) there's no information about the datastore that tripped the alarm.

    Yes, the instructions above indicate that there are limitations in the way the alarm trigger works (the "or vs and" factor), but it's sort of weird to see these alarms tripped after upgrading to 5.5U2 _and_ removing NFS stores from the cluster...

    • William Lam says

      10/07/2014 at 3:04 pm

      Jim,

      1) I forget offhand whether you can create an alarm that sends an alert but does not stay red. In most cases, admins would want to see it and then acknowledge it; otherwise you would never know an alarm had fired unless you were watching at the time.

      2) You're right, this is an area we could improve. I would guess that if you were using the API, you could pull more information about the object that tripped the alarm. I thought this was also possible within the Events view when an alarm is tripped, but I haven't tested it myself.

  5. stacycarter says

    03/21/2016 at 11:10 am

    William,

    Is there any way to have the vCenter alert include the NFS datastore name in the body of the email, rather than just the UUID?

    • William Lam says

      03/21/2016 at 11:35 am

      Yes, you'll need to identify which alarm environment variable contains that info. Some more details: https://pubs.vmware.com/vsphere-60/index.jsp?topic=%2Fcom.vmware.vsphere.monitoring.doc%2FGUID-AB74502C-5F01-478D-AF66-672AB5B8065C.html and https://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.bsa.doc_40/vc_admin_guide/working_with_alarms/r_alarm_environment_variables.html

      What I normally do is print out all environment variables as part of the trigger, then identify which variable I need for a given alarm. You may also want to check out this recent Reddit thread, which could be helpful: https://www.reddit.com/r/vmware/comments/4b6lpq/change_the_summary_line_of_email_sent_by_vcenter/
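
The "print out all variables" approach could look something like the Python sketch below. It assumes the alarm action runs a script on the vCenter Server with the VMWARE_ALARM* variables present in the script's environment; how the script is attached to the alarm is not shown, and the function names are hypothetical.

```python
import os


def alarm_variables(environ=os.environ):
    """Return all VMWARE_ALARM* variables as a sorted list of (name, value)."""
    return sorted(
        (name, value)
        for name, value in environ.items()
        if name.startswith("VMWARE_ALARM")
    )


def format_dump(pairs):
    # Empty values are printed as-is so unpopulated variables stay visible.
    return "\n".join(f"{name} = {value}" for name, value in pairs)


if __name__ == "__main__":
    # In practice you would likely redirect this output to a file or an email body.
    print(format_dump(alarm_variables()))
```

Passing the environment in as a parameter keeps the filtering logic easy to check on a workstation with a plain dictionary before it is used on a real alarm.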

      • stacycarter says

        02/28/2017 at 12:59 pm

        Hey William - We tried printing out all environmental variables as part of the trigger, however 4 of the variables were blank for this alarm, including the one I expected to contain the datastore name.
        VMWARE_ALARM_EVENT_VM
        VMWARE_ALARM_EVENT_NETWORK
        VMWARE_ALARM_EVENT_DATASTORE
        VMWARE_ALARM_EVENT_DVS

        The rest of the environmental variables we printed out did have info, but did not contain the datastore name 🙁

        • William Lam says

          02/28/2017 at 7:20 pm

          Not all VMWARE_ALARM* variables will always be populated; it depends on the event that was triggered. In this particular case, I suspect the "datastore" that the alarm triggered on is stored in another variable ...

          Would you mind sharing the other VMWARE_ALARM* properties that were returned?

          • stacycarter says

            03/01/2017 at 9:48 am

            Sure. Here is what we got (redacted):

            VMWARE_ALARM_NAME = [name we gave alarm]
            VMWARE_ALARM_ID = [alarm id]
            VMWARE_ALARM_TARGET_NAME = [host fqdn]
            VMWARE_ALARM_TARGET_ID = [host id]
            VMWARE_ALARM_OLDSTATUS = Gray
            VMWARE_ALARM_NEWSTATUS = Red
            VMWARE_ALARM_TRIGGERINGSUMMARY = Event: All paths are down
            Summary: Device or filesystem with identifier [***********] has entered the All Paths Down state.
            Date: [date alarm triggered]
            Host: [host fqdn]
            Resource pool: [cluster name]
            Data center: [datacenter name]
            Arguments:
            eventTypeId = esx.problem.storage.apd.start
            objectId = [host id]
            objectName = [host fqdn]
            1 = [datastore identifier]

            VMWARE_ALARM_DECLARINGSUMMARY = ([Event alarm expression: All paths are down; Status = Red] OR [Event alarm expression: All Paths Down timed out, I/Os will be fast failed; Status = Red] OR [Event alarm expression: Lost connection to NFS server; Status = Red])
            VMWARE_ALARM_ALARMVALUE = Event details
            VMWARE_ALARM_EVENTDESCRIPTION = Device or filesystem with identifier [***********] has entered the All Paths Down state.
            VMWARE_ALARM_EVENT_USERNAME =
            VMWARE_ALARM_EVENT_DATACENTER = [datacenter name]
            VMWARE_ALARM_EVENT_COMPUTERESOURCE = [cluster name]
            VMWARE_ALARM_EVENT_HOST = [host fqdn]
            VMWARE_ALARM_EVENT_VM =
            VMWARE_ALARM_EVENT_NETWORK =
            VMWARE_ALARM_EVENT_DATASTORE =
            VMWARE_ALARM_EVENT_DVS =

          • stacycarter says

            03/13/2017 at 3:04 pm

            Hi William - Just checking in to see if you were able to figure out which variable the datastore name is stored in? Did the additional info below help at all? Thanks!

          • William Lam says

            03/14/2017 at 8:54 am

            It looks like you may have to construct the Datastore Name from "1 = [datastore identifier]", as it's not included as part of the alarm.
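
As a rough illustration of that idea, the Python sketch below pulls the "1" argument out of a VMWARE_ALARM_TRIGGERINGSUMMARY value and maps it to a name. It assumes the summary is laid out like the redacted output above, with "key = value" lines after "Arguments:". The function names, the sample identifier, and the lookup dictionary are hypothetical; how you build the identifier-to-name mapping for your environment is not covered here.

```python
def parse_arguments(triggering_summary: str) -> dict:
    """Collect the 'key = value' lines that follow 'Arguments:'."""
    args = {}
    in_args = False
    for line in triggering_summary.splitlines():
        line = line.strip()
        if line == "Arguments:":
            in_args = True
            continue
        if in_args:
            key, sep, value = line.partition(" = ")
            if sep:
                args[key] = value
    return args


def datastore_name(triggering_summary, names_by_identifier):
    """Map argument '1' (the identifier) to a name via a caller-supplied dict."""
    identifier = parse_arguments(triggering_summary).get("1")
    # Fall back to the raw identifier when no name is known.
    return names_by_identifier.get(identifier, identifier)


# Sample shaped like the redacted output; the identifier and name are made up.
sample = """Event: All paths are down
Arguments:
eventTypeId = esx.problem.storage.apd.start
1 = example-identifier
"""
print(datastore_name(sample, {"example-identifier": "example-nfs-datastore"}))
```

Falling back to the raw identifier means the notification still carries something useful when the mapping has no entry for it.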
