Pascal's Wereld

My virtualization somethings, somewhats and the like.

EUC, Operations Management, vExpert

vROPS for Horizon: Getting more with NVIDIA vGPU insights

NVIDIA GRID Management Pack

Multimedia requirements have become a commodity in virtual desktop use cases. Where previously only a few users needed multimedia, hardware graphics acceleration is now often used for the complete virtual desktop estate. With that, we also need more insight into how the (virtual) GPUs are behaving. By default, vROPS for Horizon does not give you insight into GPU performance. The only GPU-related performance metrics come from, for example, in-guest tools (perfmon, GPU-Z or GPU sizer), GPU hardware stats with nvidia-smi, or other desktop metrics (display protocol session details or compute resource usage). With the release of the NVIDIA Virtual GPU Management Pack, we now have the option of getting these insights in vROPS for Horizon. This management pack brings a new level of visibility into the health, performance, and efficiency of your virtual desktop estate with NVIDIA virtual GPU, right down to the application level.
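For comparison, this is roughly what collecting the nvidia-smi hardware stats by hand looks like. It is a minimal sketch, assuming nvidia-smi is available on the PATH of the machine where the script runs. The function names and the chosen fields are my own picks for illustration, not something the management pack uses.

```python
import csv
import subprocess

# Fields requested from nvidia-smi; the order here is the column order of the output.
FIELDS = ["index", "name", "temperature.gpu", "utilization.gpu",
          "memory.used", "memory.total"]


def _to_int(value):
    """Return the value as int, or None for entries such as [N/A]."""
    try:
        return int(value)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for record in csv.reader(text.splitlines(), skipinitialspace=True):
        if len(record) != len(FIELDS):
            continue  # skip empty or unexpected lines
        row = dict(zip(FIELDS, record))
        for key in FIELDS:
            if key != "name":
                row[key] = _to_int(row[key])
        rows.append(row)
    return rows


def query_gpus():
    """Run nvidia-smi once and return the parsed per-GPU stats."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)
```

Calling query_gpus() returns a single point-in-time snapshot for one machine, which is exactly the limitation: there is no history and no overview across the estate.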

The NVIDIA Virtual GPU Management Pack supports all of NVIDIA’s virtual GPU products, including the Quadro Virtual Data Center Workstation (Quadro vDWS), the NVIDIA GRID Virtual PC (GRID vPC) and the NVIDIA GRID Virtual Apps (GRID vApps) products.

Let us see how we can make the vROPS for Horizon NVIDIA vGPU Management Pack work.

Using NVIDIA vGPU Management Pack

Requirements

  • vROPS for Horizon version – You will need vROPS for Horizon 6.5, which was released last September. This also makes it possible to upgrade vROPS itself to 6.6. If you want some information about upgrading, see my previous post https://www.pascalswereld.nl/2017/09/26/vrops-upgrading-vrops-for-horizon-6-5-and-vrops-6-6/
  • GRID version – The NVIDIA virtual GPU software version 5.0 or later driver package is configured on the hosts in your graphics-accelerated ESXi compute cluster.
  • NVIDIA Virtual GPU Management Pack – The NVIDIA Virtual GPU Management Pack is downloaded and installed as a vROPS solution. Download it from the VMware Marketplace. With this, you will get the collecting adapter, NVIDIA dashboards, and GPU alert definitions.
  • Connectivity vROPS to Hosts – vROPS for Horizon needs data flow to read the hardware GPU components at the host. A CIM Secure Server to CIM client (TCP 5989) connection from vROPS to the host is required here. If there is some filtering in between, then nothing is shown in the dashboards, while the solution will still show green on collecting. In that case, you will need to open this port.
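Because the solution can show green while the dashboards stay empty, it helps to test the TCP 5989 path directly. Below is a minimal sketch that tries a plain TCP connect to each host. The function names and the host names in the usage note are made up for illustration, and the script is assumed to run from a machine in the same network segment as the vROPS node or collector.

```python
import socket

CIM_PORT = 5989  # CIM Secure Server port from the requirement above


def check_tcp_port(host, port=CIM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_hosts(hosts, port=CIM_PORT, timeout=3.0):
    """Check a list of hosts and return a {host: reachable} mapping."""
    return {host: check_tcp_port(host, port, timeout) for host in hosts}
```

For example, check_hosts(["esx-gpu-01.example.local", "esx-gpu-02.example.local"]) returns a dictionary with True or False per host. A True only tells you the port is reachable through the firewalls. It does not prove that the CIM service on the host answers queries correctly.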


Configure

Configuring the NVIDIA Management Pack is easy if the above requirements are met. After installing the NVIDIA Virtual GPU Management Pack for VMware vRealize Operations, you must configure it by creating an NVIDIA vGPU adapter instance. This is like configuring the vSphere Adapter: you will again need a connection to the vCenter that is managing your GPU-enabled compute. It would be nice if we could re-use some of those adapters. Once we have filled in the vCenter FQDN and the credentials to be used from vROPS, we test the connection and save the settings. And we have lift off. Give it some 10-15 minutes to complete multiple data collection cycles (the default is 5 minutes) before something is shown on the dashboards.

Getting those dashboards

The dashboards will give you a GPU and host overview, encode and decode details, and memory utilization at the host or guest level. Or go even more in-depth with the application-level monitoring capabilities. In the dashboard list, select one of the NVIDIA dashboards.

NVIDIA Dashboards

Let us see what dashboards are shown here:

  • NVIDIA Environment Overview – Showing an overview of the hardware state (temperature and alerts), and top GPU resource and en-/decoding utilization views
    (some GPUs are a little high on temperature)

NVIDIA Overview

  • NVIDIA Host Summary – Focus on the host details
  • NVIDIA GPU Summary – Focus on the physical GPU details.
  • NVIDIA vGPU Summary – Select a vGPU in your environment and see its properties, relationships and other specifics.

 

NVIDIA Summary

  • NVIDIA Application Summary – Focus on a vGPU and the processes of the applications using that vGPU. Go in depth on the utilization characteristics for that specific vGPU and process combination.

NVIDIA Application Summary

If you have vGPUs running in your Horizon virtual desktop estate, vROPS for Horizon with the NVIDIA vGPU management pack is a requirement for an optimal, centralized operational management solution. You get not just one-to-one insight into a single desktop, but an overview across multiple virtual desktops and their vGPUs.

– Enjoy having more NVIDIA GPU insights!

Sources: vmware.com and NVIDIA.com

 


2 Comments

  1. Zahir

    Hey Pascal,

    I have truly enjoyed learning more about vROPS for Horizon. I have a project for which I may need your technical guidance.

    1. vRealize for Horizon Agent installed yet not in use.
    a. Customer intends to implement vRealize in order to monitor session
    performance as well as to gather information for usage based chargebacks.
    b. Customer currently utilizes LiquidWare Stratusphere UX for this purpose
    but would like more information regarding vRealize as a replacement option.
    2. VMware Unified Access Gateways (UAG) not in use. Consider upgrading to allow for the enhanced UAGs to replace Security Servers.
    3. Testing found high levels of packet loss, significant bandwidth fluctuations in PCOIP logs.
    a. Can I troubleshoot and monitor this by addressing #1?
    4. Auto provisioning of RDSH servers to pools.
    a. This was not part of the Health Check but is a critical component in their Horizon environment.

    • Hey Zahir,

      Thank you!
      You could use vRealize. But there is more to it than just installing the products. E.g. for troubleshooting, it also depends on which layers are configured and monitored there. If the root cause of the packet loss is outside of the collected data, you would need to look further at that component or layer. But the same applies to LW as well.
      For the project guidance please use my contact details to get in touch and discuss outside of the comments.
