Channel: VMware Communities: Message List

High guest CPU when a guest is on a particular host


Hello forum, 

 

   I have an unusual problem and am looking for some more help.  I have cases open with VMware and HP, and HP is engaging Microsoft to troubleshoot this issue.

 

Symptoms - Random VM guests, when vMotioned to a particular host, experience 100% CPU usage.  When a guest is vMotioned back off that host, its CPU
returns to normal.  Host CPU is not affected.  These are random guests (SQL, web servers, app servers) and the processes are also random.  VMware thinks it may be some sort of CPU
power setting, because when you look at esxtop under CPU the difference between PCPU USED(%) and PCPU UTIL(%) is very large.
HP thinks it is a driver issue and they are currently analyzing a crash dump of one of the guests/hosts having the problem.
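
For anyone who wants to compare hosts on the USED vs UTIL gap, here is a minimal sketch that averages the per-PCPU difference from an esxtop batch capture (something like `esxtop -b -d 5 -n 12 > capture.csv`). The column-name fragments are assumptions about how the batch header labels these counters, not something confirmed in this thread, so check them against the header of your own capture. `pcpu_gaps` is a hypothetical helper name.

```python
import csv
import io

# Assumed header fragments; verify against the first line of your own capture.
USED_SUFFIX = "\\% Processor Time"  # assumed to correspond to PCPU USED(%)
UTIL_SUFFIX = "\\% Util Time"       # assumed to correspond to PCPU UTIL(%)
PCPU_MARKER = "\\Physical Cpu("


def pcpu_gaps(csv_text):
    """Return {pcpu_label: mean(UTIL - USED)} over all samples in the capture."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for row in reader if row]

    used, util = {}, {}
    for idx, name in enumerate(header):
        if PCPU_MARKER not in name:
            continue
        # The label is whatever sits between "Physical Cpu(" and ")".
        label = name.split(PCPU_MARKER, 1)[1].split(")", 1)[0]
        if name.endswith(USED_SUFFIX):
            used[label] = idx
        elif name.endswith(UTIL_SUFFIX):
            util[label] = idx

    gaps = {}
    for label in used.keys() & util.keys():
        diffs = [float(r[util[label]]) - float(r[used[label]]) for r in rows]
        if diffs:
            gaps[label] = sum(diffs) / len(diffs)
    return gaps
```

Running this against a capture from a "bad" host and one from a "good" host with similar load should show whether the gap is specific to the problem host.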

 

 

From HP

  

The OS is spinning in driver code, which we have identified as the guest network driver stack; the customer can DISABLE/ENABLE the interface to clear the condition.  The driver is supplied by VMware.  You can trigger a crash dump using the “.crash” command in WinDbg.

 

1)      Firmware driver interoperability

2)      Driver/User code interoperability

  a.       Note that Windows can still control the guest, and upon disabling the interface the problem goes away.

  b.      This implies something is generating an enormous number of interrupts or user-level instructions, and yet the code does not remain in KERNEL space, as the USER-level API is still able to control the “pseudo hardware interface”.  If the code were looping in KERNEL space and never interrupting, the user-level requests would simply queue and never get an instruction to the hardware!

We need the Windows dump to ascertain the context of the loop.
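
The DISABLE/ENABLE workaround HP describes can be scripted inside an affected Windows guest. This is only a sketch: it shells out to `netsh interface set interface`, the interface name is a placeholder you would replace with the guest's actual adapter name, and the helper names are hypothetical. It needs an elevated prompt, and it drops the guest's network connectivity for a moment, so run it from the console rather than over RDP.

```python
import subprocess


def bounce_commands(interface_name):
    """Build the two netsh invocations that disable, then re-enable, an interface."""
    base = ["netsh", "interface", "set", "interface", interface_name]
    return [base + ["admin=disabled"], base + ["admin=enabled"]]


def bounce_interface(interface_name, run=subprocess.check_call):
    """Run the disable/enable pair; `run` is injectable so the commands can be dry-run."""
    for command in bounce_commands(interface_name):
        run(command)


if __name__ == "__main__":
    # "Local Area Connection" is a placeholder adapter name (assumption).
    for command in bounce_commands("Local Area Connection"):
        print(" ".join(command))
```

As written, the script only prints the two commands; call `bounce_interface(...)` to actually execute them.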

 

  Environment

 

  • 14 hosts running ESX 5.1 build 1021289 (this issue also occurred on ESX 4.1)
    • Did a fresh install on the hosts, not an upgrade
  • vSphere cluster
    • Running DRS (currently set to partial automation because of this issue)
    • Running HA
    • vMotion network is 10 Gb and separate from guest/host networks
  • 14 HP BL460c G7 blades in two different enclosures
    • All firmware has been upgraded to the latest within the last month.
    • Changed BIOS power settings to Maximum Performance
    • Changed BIOS power settings to OS controlled
      • Changed power management in vSphere to maximum performance
    • Changed C-state to NO C-state and C3 state.
    • Enabled and disabled Hyper-Threading

 

 

Any help with this would be MUCH appreciated.
