VMware ESXi 5.5 – Unable to Consolidate virtual machine disk files

I’ve been working on an issue over the past couple of days where a backup has been failing constantly. The problem was isolated to a warning on the VM that its disks needed to be consolidated. Nothing major, or so I thought. I had a look at the datastore where the VM resides and found 185 snapshot vmdk disks. Well that can’t be right! So I did a bit of investigation and found a number of VMware KB articles on the problem. The basic option is to follow KB 2003638 and run a standard consolidation by going to Snapshot -> Consolidate.
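As a rough way to confirm a snapshot count like the 185 disks mentioned above, a file listing from the VM's datastore folder can be tallied with a few lines of Python. This is only a sketch: it assumes the usual naming where snapshot descriptors look like `<vm>-000001.vmdk` and their data files like `<vm>-000001-delta.vmdk`, and the function name and sample file names are made up for illustration.

```python
import re

# Assumption: snapshot descriptors end in "-NNNNNN.vmdk" and their data
# files end in "-NNNNNN-delta.vmdk". Other disk formats may be named differently.
SNAPSHOT_RE = re.compile(r"-\d{6}(-delta)?\.vmdk$")


def count_snapshot_disks(file_names):
    """Return (descriptor_count, delta_count) for the given file names."""
    descriptors = 0
    deltas = 0
    for name in file_names:
        match = SNAPSHOT_RE.search(name)
        if match is None:
            continue  # base disk or unrelated file
        if match.group(1):
            deltas += 1
        else:
            descriptors += 1
    return descriptors, deltas


# Hypothetical listing copied from the datastore browser.
listing = [
    "myvm.vmdk",
    "myvm-flat.vmdk",
    "myvm-000001.vmdk",
    "myvm-000001-delta.vmdk",
    "myvm-000002.vmdk",
]
print(count_snapshot_disks(listing))
```

The function takes plain file names rather than a path, so the listing can come from wherever is convenient, for example `os.listdir()` or a copy and paste from the datastore browser.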

You’ll then be prompted with a Yes/No dialog asking whether to consolidate the redo logs. Select Yes.

At this point it looked as if the consolidation was going to work, but at about 20% it failed. The error showed that the file was locked.

There are a number of recommendations for removing the lock on the file. One is to run a vMotion/svMotion of the VM to another host. Unfortunately, both of these are standalone ESXi hosts with no vMotion network or capabilities, so that couldn’t be done. Some people recommend rebooting the ESXi host to release the lock but, as noted above, there was no vMotion network, and these hosts run production manufacturing systems that cannot just be rebooted at random. Waiting on a downtime approval would take too long. The next step was to restart the management agents on the ESXi host, which was done by connecting to the host via SSH.

Cisco UCS – FSM:FAILED: Ethernet traffic flow monitoring configuration error

During a recent Cisco UCS upgrade I noticed an error for ethlanflowmon which was a critical alert. I hadn’t seen the problem before and it occurred right after I had upgraded UCS Manager firmware as per the steps listed in a previous post I wrote about UCS Firmware Upgrade. Before proceeding to upgrade the Fabric Interconnects I wanted to clear all alerts where possible. The alert for “FSM:FAILED: Ethernet traffic flow monitoring configuration error on” both switches was a cause for concern.

On further investigation I found that this is a known bug when upgrading to versions 2.2(2) and above. I was upgrading from version 2.2(1d) to 2.2(3d). Despite being a critical alert, the issue does not impact any services. The new UCSM software is looking for features on the FIs that do not exist yet because the FIs have not been upgraded. As soon as you upgrade the FIs the critical alert clears. More information can be found on Cisco’s support page for bug CSCul11595.
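The condition described above (UCSM already at 2.2(2) or later while the FIs are still on an older release) can be expressed as a simple version comparison. The sketch below is illustrative only: the function names are made up, the version format is assumed to be `major.minor(maintenance+letter)` as in 2.2(1d), and the 2.2(2) threshold comes from the description in this post rather than from Cisco's bug notes.

```python
import re

# Assumed format: "2.2(3d)" -> major 2, minor 2, maintenance 3, patch letter "d".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")


def parse_ucs_version(text):
    """Turn a string such as '2.2(3d)' into a sortable tuple (2, 2, 3, 'd')."""
    match = VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, maintenance, patch = match.groups()
    return int(major), int(minor), int(maintenance), patch


# Threshold as described in the post, not taken from Cisco's bug details.
THRESHOLD = parse_ucs_version("2.2(2)")


def flowmon_alert_expected(ucsm_version, fi_version):
    """True while UCSM is at or above the threshold but the FIs are still below it."""
    return (parse_ucs_version(ucsm_version) >= THRESHOLD
            and parse_ucs_version(fi_version) < THRESHOLD)


# Mid-upgrade state from this post: UCSM on 2.2(3d), FIs still on 2.2(1d).
print(flowmon_alert_expected("2.2(3d)", "2.2(1d)"))
```

Once both components report the same post-threshold version, the function returns False, which matches the alert clearing after the FI upgrade.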

Cisco UCS – CIMC did not detect storage controller error

During a recent UCS firmware upgrade I had quite a few blades show up with the error “CIMC did not detect storage”. Within UCSM I could see that the blade had a critical alert. It started after I upgraded the UCS Manager firmware, as documented in a previous post I wrote about UCS Firmware Upgrades. I did some searching to find what might be causing the issue, and the best answer I could find was from the Cisco community forums: disassociate the blade, decommission it and reseat it within the chassis. I later spoke to a Cisco engineer who advised the same steps, but said it was also possible to do this without reseating the blade. This also looks like it’s a problem when upgrading from 2.2(1d) to other versions of UCSM, but I haven’t been able to validate whether it’s only that version or if it also affects others.

The full error I saw was code F1004: “Controller 1 on server 2/1 is inoperable. Reason: CIMC did not detect storage”.

Within UCSM I could see there was an issue with the blade.

Before proceeding with the upgrade of the FIs, IOMs and blades themselves I wanted to clear any alerts within UCSM, particularly critical alerts. The steps I followed to bring the blade back online were to go to the blade and select Server Maintenance.

The Life of NetApp – Bring out your dead!

There’s a quality scene in Monty Python and the Holy Grail where the dead are being called out to be loaded onto a cart and taken away. Are new players in the market doing the same to NetApp? Even though they continue to say that they’re not dead, everyone is writing them off and chucking them on the death-cart.

It’s easy to see why NetApp is being called to bring out its dead. There are more and more players appearing in the storage market with serious differentiators to NetApp. Just look at the list of potential competitors like Pure Storage, Tintri, SimpliVity, Nutanix and Nimble. And that’s not including the fully software-defined storage groups such as Maxta, Stratoscale and a host of others. There’s also the old adversary EMC. All of these vendors have released new and innovative products in the past year and they have managed their marketing message far better than NetApp has. NetApp has been painfully slow at getting a smooth transition in place for its 7-Mode customers to Clustered Data ONTAP (cDOT). A lot of critics of NetApp also point to the fact that it is so heavily reliant on the ONTAP software. I personally don’t see an issue with that reliance. Don’t change something just to create a new release for the sake of it. But the marketing message and the community’s perception of NetApp have caused a number of issues for them.
