After months of waiting, VMware has finally made Update 1 available for vSphere/vCenter 5.1. We had hoped it would fix some of the half-baked items we noticed after deploying vCenter 5.1. As of right now, I personally can't say whether those issues and annoyances have been fixed.
Unfortunately, I can't log in to the "preferred" web client that VMware wants us to adopt so badly.
According to KB article 2050941, the admin account I use to log in to vCenter belongs to too many groups in Active Directory. Are you kidding me? The article says there is no definitive number of groups that acts as the threshold, but it is normally around 19. I belong to 24, while a co-worker who can log in belongs to 20, so clearly our threshold is somewhere in between. My question is: how long has VMware been running 5.1 U1 in their labs without ever noticing this issue?
There are three workarounds for this issue.
- Log in to vCenter Server via the vSphere Client using the Use Windows session credentials option. - So now I need to use a client that doesn't include the new 5.1 features?
- Work with your Active Directory administrator to modify the group membership of the vCenter Server login account to a minimum. - hahaha! There's a reason why I belong to so many groups. My day-to-day activities depend on those memberships.
- Limit the number of domain based identity sources to no more than one. - We have users from around the world logging in that need those identity sources available. Odds are most of them can't log in either, though.
Yet again, VMware has released software and updates that seem half-baked and not fully tested, even for the largest of their customers. This just adds more fuel to the fire that is pushing us to seriously consider Microsoft's latest Hyper-V release. Twelve hosts are yet to be ordered this year for a refresh of old vSphere hosts in our environment. Maybe they will be Hyper-V hosts instead.
I just found out that John Troyer has opened up the application process for vExpert 2013. This is the first year I will be applying for it; I didn't feel I deserved that recognition in previous years. Hopefully John and company feel I meet their criteria. For those of you who are unaware of the vExpert program, see below.
These are the bloggers, book authors, VMUG leaders, speakers, tool builders, community leaders and general enthusiasts. They work as IT admins and architects for VMware customers, they act as trusted advisors and implementors for VMware partners or as independent consultants, and some work for VMware itself. All of them have the passion and enthusiasm for technology and applying technology to solve problems. They have contributed to the success of us all by sharing their knowledge and expertise over their days, nights, and weekends. They are, quite frankly, the most interesting and talented group of people I’ve ever been in a room with.
There are three paths that can be taken by a vExpert:
The Evangelist Path includes book authors, bloggers, tool builders, public speakers, VMTN contributors, and other IT professionals who share their knowledge and passion with others with the leverage of a personal public platform to reach many people. Employees of VMware can also apply via the Evangelist path. A VMware employee reference is recommended if your activities weren’t all in public or were in a language other than English.
The Customer Path is for leaders from VMware customer organizations. They have been internal champions in their organizations, or worked with VMware to build success stories, act as customer references, given public interviews, spoken at conferences, or were VMUG leaders. A VMware employee reference is recommended if your activities weren’t all in public.
The VPN (VMware Partner Network) Path is for employees of our partner companies who lead with passion and by example, who are committed to continuous learning through accreditations and certifications and to making their technical knowledge and expertise available to many. This can take the shape of event participation, video, IP generation, as well as public speaking engagements. A VMware employee reference is required for VPN Path candidates.
After some issues with booking a new venue, the Omaha VMUG team has locked in a date and vendor for our Q1 meeting. It will be held on April 16 (yes, that is technically Q2) at The Old Mattress Factory in their party room. We have locked in Xangati as the primary vendor to speak about their product. VMware will naturally be there with their update and will also provide some vCOPs info. Drew, David, and I will be presenting in the middle with some free management and monitoring tools that could be of use to any VMware administrator.
If you would like more information about the meeting please check out the official page here.
I have added a new page labeled "Scripts" as you can see from the menu above. It includes a few of the PowerCLI scripts that our team uses in our production environment. Some are very simple while others are more than "one-liners." Either way, they have been useful for us and may be useful to you. As more are written, I will add them to the list. Enjoy!
2012 was definitely a busy year, not only for me but for the whole team I belong to. We met many of the goals we had planned, like further automation within our virtual environment, and achieved plenty that was unplanned. The biggest was how our virtualization team pulled together to design, develop, and implement a VDI environment at VMworld 2012 with a week of notice and no prior knowledge of VMware View.
Personally, I also completed my classes at a local college to further my degree. This took a lot out of my free time but was beneficial as I did learn a few things, especially in project management, and most importantly, I got that piece of paper.
Now that I have completed my schooling, for the time being, I will have more time to put into this blog. I have definitely neglected it. I plan to add more posts about our adventures with PowerCLI and vCenter Orchestrator. We also plan to implement a Hyper-V 2012 environment in the coming year so I may throw some things in about that. Hey, it is still virtualization! 2013 looks to be packed with projects all across the board so I should have plenty to blog about.
I'm officially on vacation until 2013, so I'll see you then! Happy Holidays!
A couple weeks ago, our Network Operations team stumbled upon numerous MAC addresses flapping on their Cisco switches. We began investigating where these MACs lived in the data center, as every switch stack in every row was seeing the issue. An example of what we saw is listed below:
|10/24/2012 2:27:19 PM||appsw01-c8-gis-omaedc||Warning||1622524: . Host 0021.5add.383d in vlan 700 is flapping between port Po9 and port Po8|
|10/24/2012 2:27:19 PM||appsw01-c8-gis-omaedc||Warning||1622523: . Host 0025.b382.2561 in vlan 707 is flapping between port Po37 and port Po36|
|10/24/2012 2:27:19 PM||appsw01-c8-gis-omaedc||Warning||1622520: . Host 0025.b382.2561 in vlan 425 is flapping between port Po37 and port Po36|
|10/24/2012 2:27:19 PM||appsw01-c8-gis-omaedc||Warning||1622522: . Host 0025.b382.2561 in vlan 703 is flapping between port Po37 and port Po36|
|10/24/2012 2:27:19 PM||appsw01-c8-gis-omaedc||Warning||1622521: . Host 0025.b382.2561 in vlan 450 is flapping between port Po37 and port Po36|
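To get a quick picture of scope, entries like the ones above can be tallied straight from a syslog export. Here is a minimal Python sketch; the message format and sample lines are copied from the excerpt above, and the helper name is my own:

```python
import re
from collections import Counter

# Sample messages copied from the switch log excerpt above.
LOG_LINES = [
    "Host 0021.5add.383d in vlan 700 is flapping between port Po9 and port Po8",
    "Host 0025.b382.2561 in vlan 707 is flapping between port Po37 and port Po36",
    "Host 0025.b382.2561 in vlan 425 is flapping between port Po37 and port Po36",
    "Host 0025.b382.2561 in vlan 703 is flapping between port Po37 and port Po36",
    "Host 0025.b382.2561 in vlan 450 is flapping between port Po37 and port Po36",
]

# Cisco logs MACs in dotted-quad form: xxxx.xxxx.xxxx
PATTERN = re.compile(
    r"Host (?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}) "
    r"in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<p1>\S+) and port (?P<p2>\S+)"
)

def flap_counts(lines):
    """Count how many flap messages each MAC address generated."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group("mac")] += 1
    return counts

print(flap_counts(LOG_LINES))
# One MAC (0025.b382.2561) shows up in four VLANs between the same
# port-channel pair, which is what pointed us at a single shared device.
```

Sorting the tally by count made it obvious the same few MACs were flapping everywhere, which is what led us to the enclosures.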
After digging, we found that it was the stacking link MAC address on the Virtual Connect modules in our HP c7000 enclosures. Next, I had to determine whether it was coming from every enclosure (29 total) or only from certain enclosures that had something in common. An email was sent to our HP Account Support Manager describing the issue and asking whether he had any prior experience with it. He mentioned he had seen instances relating to ESX servers, NIC drivers, or LLDP packets not being handled correctly.
Through our investigation, we found that even enclosures without ESX servers were causing the issue. We doubted the NIC driver theory, since it was the stacking link. Our network team went down the LLDP route initially, but it resulted in no change. We called HP support to go further, and one of the engineers provided the following customer advisory. The description matched our symptoms; notably, the enclosures still on VC 3.15 (we are in our last month of VC upgrades to 3.60, just in time to start upgrades to 3.70!) were not causing the issue. The advisory indicates the Network Loop Protection setting was introduced in version 3.51 and affects later versions. The NLP frame being transmitted every five seconds also lined up with what we saw in the logs.
HP support could not say whether any pings would be lost when the setting was disabled. The advisory is a bit vague on that question. They, along with our Account Support Manager, said it shouldn't cause any disruption, but they had no first-hand knowledge of whether the disable process would actually interrupt the network, so we scheduled a change window late at night.
Good news followed immediately. As soon as I applied the change, no pings were lost and our switch logs began clearing up. Days later, there is still no flapping and our switch CPU usage has dropped back to normal levels. Unfortunately, this issue doesn't appear to be among the fixes in VC 3.70.
As I mentioned, I was at VMworld both as event staff and as an attendee, and I planned to take pictures. I didn't take as many as I thought I would, as I was much busier than I had planned, but I did take quite a few behind-the-scenes shots when I thought of it. Sorry for the delay, as I've had an extremely limited amount of time since VMworld, but here they are!
This year will be the third year in a row I have attended VMworld. Every year gets better and better. The last two years I was fortunate enough to get full conference passes through one of the agencies under my company's holding. That agency has been the primary company that sets up much of VMworld, along with other large conferences, for many years. The last two years, I not only attended VMworld for my company's benefit but also to provide an unbiased opinion on how VMworld was presented digitally and physically. I asked many other attendees around me for their opinions and passed them on. It was a great gig!
This year, something a bit different happened. Our agency was denied those passes that we received in years past. No problem, as my VP realizes the great benefits of us going to VMworld every year so we were bound to go anyway. A little over a week ago, my contact from our agency called me in a panic needing some IT assistance for VMworld. Our company decided that we would assist in their time of need. Therefore, I will be helping set up a small portion of VMworld! I will begin on Thursday the 23rd and plan to take a few behind the scenes photos if I find something interesting during the build. I also plan to take plenty of photos during the event and share them here. Check out my twitter updates for pictures as I take them. I will post them to the website during the little downtime that I get.
I'm excited for the opportunity to help out where I can. Too bad they won't need my services in Barcelona!
Our team recently performed an upgrade of our Virtual Connect modules to ensure compatibility for the new G8 blades. We have 29 C7000 enclosures in our primary data center and all of them are slated to get the new VC 3.60 firmware that was released earlier this month. We planned to upgrade six enclosures to start with as we had a planned downtime scheduled for a Saturday afternoon. We initially kicked off the Onboard Administrator and iLO firmware upgrades with HP SUM that is packaged in the latest SPP release. For some reason, HP SUM would not authenticate with our VC modules so we were forced to use the Virtual Connect Support Utility (VCSU). We've used the VCSU in the past and had good results so we weren't too discouraged.
We began the VC upgrades after the OA and iLO updates were finished. Within VCSU, we noticed the status stalled around 40% for a very long time but eventually completed. We moved on and finished all six enclosures. The OAs reported the VC firmware as v3.60, just as we expected. Then some of our staff logged into a couple of clustered nodes residing in the upgraded enclosures and noticed zero network connectivity on half of the network teams. When we logged into the VCM itself, we saw the firmware had not been upgraded to v3.60, regardless of what the OA had reported. More digging followed, and some of the VCMs were unavailable and inaccessible.
Troubleshooting began to bring them back online. Reseating the modules did not work. Someone then tried simply powering the modules off and back on, which allowed us to regain access to the VCM. The firmware version of 3.60 was then correctly reported in the VCM, and network connectivity was restored. A call to HP Support provided less than desirable results; we were even told to just sit and wait for one of the modules to come back by itself. After many hours of battling, all of the modules were restored and reported firmware v3.60.
Later we contacted our HP solutions architect, who was able to bring a VC engineer onto the call. We described our situation and indicated we were upgrading from versions 3.15 and 3.18 on VC 1/10Gb Ethernet modules and Flex-10 10Gb modules. The issue was the same across both starting versions and both module types. The HP VC engineer confirmed they have had reports of this from other customers: when upgrading to v3.60 from a version older than v3.51, a module hangs during the reboot phase while it attempts to confirm the configuration with the other module. The excessive hang at 40% that we experienced is caused by this, and a power off/on of the module clears the hang and restores its functionality. He also said a customer advisory about this known issue would be published soon. In the meantime, he gave us workarounds to prevent it from occurring:
- Upgrade from pre-v3.51 firmware to v3.51 and then perform an additional upgrade from v3.51 to v3.60 using SPP or VCSU
- Upgrade from pre-v3.51 firmware directly to v3.60 and perform a "Reset" of the modules using SPP or VCSU
- Upgrade from pre-v3.51 firmware directly to v3.60 using VCSU and order the upgrade to occur in parallel instead of the default staggered approach
Option 3 can only be performed using VCSU as the SPP (HP SUM) approach does not allow the upgrade to occur in parallel. Be aware that a parallel upgrade will cause an outage of approximately 45 seconds as the modules reboot themselves.
If you are currently running a VC firmware older than v3.51 and plan to upgrade to v3.60, be aware of the issue above. Plan to use one of the three workarounds to reduce the issues you may run into.
Update! - HP's Advisory about this issue was released on June 29th, 2012 and can be accessed through this link.
We can all agree that larger vSphere environments bring many challenges. One of them is updating all of the VMware Tools on the numerous VM guests. One of the solutions our team decided to use was to enable the VMware Tools Update during power cycling, previously discussed here. This worked for us well in vSphere 4.x and updated the majority of Tools in our environment. VMware has changed this feature in vSphere 5 however.
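For reference, the knob behind this feature is the per-VM toolsUpgradePolicy property in the vSphere API, which takes the values "manual" and "upgradeAtPowerCycle". The Python sketch below is illustrative only: the function name and the dict it returns are my own stand-ins for the real config spec a client such as PowerCLI or pyVmomi would send to vCenter.

```python
# Illustrative sketch only: models the reconfigure payload that toggles
# "Check and upgrade VMware Tools before each power-on" on a VM.
# The property name and its values match the vSphere API; the dict
# wrapper is a hypothetical stand-in for a real vim.vm.ConfigSpec.

VALID_POLICIES = ("manual", "upgradeAtPowerCycle")

def tools_upgrade_spec(upgrade_at_power_cycle=True):
    """Build a config-spec-style dict setting the Tools upgrade policy."""
    policy = "upgradeAtPowerCycle" if upgrade_at_power_cycle else "manual"
    assert policy in VALID_POLICIES
    return {"tools": {"toolsUpgradePolicy": policy}}

print(tools_upgrade_spec())
# {'tools': {'toolsUpgradePolicy': 'upgradeAtPowerCycle'}}
```

Applying that one setting across hundreds of guests is exactly why we scripted it rather than clicking through each VM's options.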
Our environments were recently upgraded to vSphere 5 and everything was looking great! Then we had our routine maintenance weekend. This would update well over half of the tools and we would get the rest on the next maintenance weekend. That was the plan.
Throughout the next week, we started receiving reports of users logging into their servers and being prompted about a VMware Tools install, with a message that the server needed to be rebooted for the changes to take effect. The user naturally clicked 'No', but the server proceeded to reboot anyway. Wait, the user clicked 'No', right? That's what they said. We didn't think much of the first couple of reports, but then some of our own team experienced the issue, so we knew this wasn't just "user error."
We got VMware on the phone to investigate. They had no reports of this happening anywhere. After a long discussion, we were informed that the order of operations for the VMware Tools upgrade during power cycling was changed in vSphere 5. Instead of updating the Tools on the way down, it now updates them after the server has already rebooted. I don't know why this was changed, but it now requires an additional reboot on top of the reboot that initiated the Tools install.
But what about the users who clicked 'No' when asked whether to reboot, only to watch the server reboot anyway? As time went on and more info was gathered, we found this was only happening on Windows 2003 x86 servers. The user was presented with the following dialog box:
In the instances we saw, the above dialog box was always shown. Servers that weren't Windows 2003 x86 showed a second dialog box, specific to VMware Tools, that honored 'No' as an option. Windows 2003 x86 servers, however, did not present this second dialog box and proceeded to reboot. The Windows event logs then recorded the following event:
VMware did some further troubleshooting and had us perform additional testing. When we encountered these dialog boxes on a Windows 2003 x86 server within the vSphere console, the server did not reboot. It did reboot when a user had logged in using a remote desktop client.
VMware still has not given us a definitive answer as to why this is occurring, as it is the first they have seen of it. I will post an update here when we get a verdict. In the meantime, make sure the new order of the Tools upgrade during power cycling fits your needs.
VMware never confirmed the testing described above. However, we are standing by our findings, so approach your updates with caution.