Learning “AWS Backup” Restorations for On-Prem VMWare VMs

CF Webtools has maintained VMWare ESXi guest OS instances, managed by vCenter, for about 7 years. They are a mix of Linux and Windows Server OSs and are maintained at a secure and redundant co-location data center. While an expensive up-front investment, it has paid for itself over those years, and we plan to continue with that solution for about another 5 years. A recent upgrade to the next major version proved that virtual machines take a fraction of the time for maintenance compared to bare metal instances. Granted, there’s some spin-up time when things work for so long that you must remember, research, and troubleshoot procedures all over again. Managed cloud takes almost all that time out of the equation, making it my favorite, though I do miss hands-on hardware here and there.

Some of our on-prem VMs are critical, and some are not. The critical ones have always been backed up with different solutions, depending upon what they are and what the recovery needs look like. However, almost all have come with challenges. So I wanted to look for a VM snapshot-based cloud backup solution that I could trust and would be budget-friendly.

My first direction was to research Veeam. Their solution is very well known. However, as a small business without an existing account, it was a struggle to get the attention of Veeam and CDW. I was able to lean on one of our hardware vendors, xByte, who hooked us up with one of their Veeam partners. But its per-instance license model turned out to be fairly costly compared to our existing solutions. So I continued my search.

I then found that AWS Backup has an on-prem VMWare solution. AWS Backup is relatively new to the backup game, but its implementations are continually growing. We currently use that service for all our AWS EC2 backups. That service was a godsend after numerous awful implementations of custom Lambda/CloudWatch scripts and an EBS Automation method. Finally, a solution that should have been around since the start of EC2.

As of November 2021, AWS Backup offers backup for on-prem VMWare vCenter servers. You must install their Storage Gateway virtual appliance as the “middleman” agent. I was hoping for an “agentless” solution; however, we only pay $0.05/GB-Mo for warm storage and $0.01/GB-Mo for cold storage. That’s a considerable saving, considering we do not have to pay for a license per instance, and there are no incoming bandwidth fees! We will have to pay for bandwidth on on-prem restores, but since those are very rare and bandwidth is relatively cheap, it’s a non-issue. We’d have to pay for storage with any solution, so there’s no change there.
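To put those rates in perspective, here is a rough sketch of the storage arithmetic. The two rates are the ones quoted above; the 2,000 GB warm and 5,000 GB cold footprint is a made-up example, not our actual usage, and restore bandwidth is left out.

```python
# Rates quoted in the post, in USD per GB-month.
WARM_RATE = 0.05
COLD_RATE = 0.01


def monthly_storage_cost(warm_gb: float, cold_gb: float) -> float:
    """Estimate the monthly storage bill in USD for the given footprint."""
    return round(warm_gb * WARM_RATE + cold_gb * COLD_RATE, 2)


if __name__ == "__main__":
    # Hypothetical footprint, for illustration only.
    estimate = monthly_storage_cost(warm_gb=2000, cold_gb=5000)
    print(f"Estimated storage cost: ${estimate:.2f} per month")
```

Because there is no per-instance license, the estimate depends only on how much data is stored, not on how many VMs it came from.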

Another significant advantage is we get a single backup solution for both on-prem and AWS Cloud. It’s one less piece of software we must be familiar with, document, troubleshoot, and keep updated. Outside of an office domain controller, we also anticipate a complete cutover to AWS in 5 years.


#aws-backup, #backup, #vm, #vmware, #vsphere

Resolving VMWare Converter File I/O Error

Here at CF Webtools we’ve been shifting towards virtual machines to replace our dedicated “iron,” as Mark Kruger likes to say. Let me say for starters that I’m very impressed with the Dell PowerEdge VRTX Shared Infrastructure Platform. There is so much back-end power in that thing that we’ve been able to move our entire set of staging platforms onto one M620 Server Node along with a RAID 6 shared PERC disk array. It handles it like a champ and each virtual server is extremely fast. After about a year of testing with no real issues, we’ve been able to move some of our production servers onto another server node. We’re looking forward to adding a couple more server nodes as well. Each node runs VMware ESXi Hypervisor 6.0 via RAID 1 SD cards.

In addition, we’re migrating some workstation VMs away from Hyper-V and onto a separate Dell server that we’ve reclaimed.

But here’s the real reason I’m writing today:

FAILED: A file I/O error occurred while accessing ''.

I get this error when using “VMware vCenter Converter Standalone 6.0.0” to convert any powered-on Windows machine onto one of the ESXi instances. I don’t get this issue when converting powered-on Linux machines. Very odd, and the forum threads Google turned up haven’t been very helpful. Mostly it’s just a lot of “run chkdsk” and “check fully qualified domain name resolution.”

I’m not going to cover Linux conversions here since they work. Basically, a powered-on Windows conversion installs a helper VM on the machine to be converted. It runs as a service, and you can either uninstall it manually when finished or let it uninstall automatically.

Something, probably this helper service, then takes a snapshot of the source system. Then the helper VM does a block-level clone for each volume it finds.

Mine always failed after the snapshot and before the clone.

What I did was use the “Export logs…” link in the converter. The line I found interesting while reading the file vmware-converter-server-1.log was:

error vmware-converter-server[01288] [Originator@6876 sub=Ufa.HTTPService] Failed to read request; stream: <io_obj p:0x03dc40ac, h:-1, <pipe '\\.\pipe\vmware-converter-server-soap'>, <pipe '\\.\pipe\vmware-converter-server-soap'>>, error: class Vmacore::TimeoutException(Operation timed out)
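If you’d rather not read the exported logs by eye, a small script can pull out lines like that one. This is only a sketch: the pattern is modeled on the single log line quoted above, so I’m assuming other lines in the file follow the same “level, process[pid], [… sub=Component], message” shape. The sample lines in the script are an abbreviated copy of that line plus a made-up “info” line.

```python
import re

# Pattern modeled on the one log line quoted in the post (an assumption
# about the general layout of vmware-converter-server-1.log).
LINE_RE = re.compile(
    r"(?P<level>\w+) vmware-converter-server\[\d+\] "
    r"\[.*?sub=(?P<sub>[^\]\s]+)\]\s*(?P<msg>.*)"
)


def find_timeouts(lines):
    """Return (component, message) pairs for error lines that mention a timeout."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        if match.group("level") == "error" and "TimeoutException" in match.group("msg"):
            hits.append((match.group("sub"), match.group("msg")))
    return hits


if __name__ == "__main__":
    sample = [
        # Hypothetical filler line, not from a real log.
        "info vmware-converter-server[01288] [Originator@6876 sub=Default] Starting task",
        # Abbreviated version of the line quoted in the post.
        "error vmware-converter-server[01288] [Originator@6876 sub=Ufa.HTTPService] "
        "Failed to read request; error: class Vmacore::TimeoutException(Operation timed out)",
    ]
    for component, message in find_timeouts(sample):
        print(component, "->", message)
```

To run it against a real export, pass an open file handle for vmware-converter-server-1.log to find_timeouts instead of the sample list.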

After some Google searching it dawned on me that I am using two IP subnets. One for the general network and one for management. My machine runs 10.0.0.* (general) and 10.1.1.* (management) subnets. The source system has 10.0.0.* assigned to it while the destination ESXi server has 10.1.1.* assigned to it.

Because my machine sits on both networks, the converter running on it could talk to both the source and the destination machines just fine.

However, once things get rolling, the process moves from my machine to direct communication between the source and destination. My machine merely monitors the progress. That makes sense: cut out the middleman and you get efficient network data transfer. But the source, with only a 10.0.0.* address, had no way to reach the destination on 10.1.1.*.
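The situation can be sketched with Python’s ipaddress module. The host addresses and the /24 masks below are assumptions for illustration (the post only gives 10.0.0.* and 10.1.1.*), and the check assumes there is no routing between the two subnets.

```python
import ipaddress


def shared_networks(host_a_ifaces, host_b_ifaces):
    """Return the set of networks on which both hosts have an interface."""
    nets_a = {ipaddress.ip_interface(i).network for i in host_a_ifaces}
    nets_b = {ipaddress.ip_interface(i).network for i in host_b_ifaces}
    return nets_a & nets_b


if __name__ == "__main__":
    # Hypothetical addresses; only the subnets come from the post.
    workstation = ["10.0.0.20/24", "10.1.1.20/24"]  # general + management
    source = ["10.0.0.30/24"]                        # general only
    destination = ["10.1.1.40/24"]                   # management only

    print("workstation <-> source:     ", shared_networks(workstation, source))
    print("workstation <-> destination:", shared_networks(workstation, destination))
    print("source <-> destination:     ", shared_networks(source, destination))
```

The workstation shares a network with each machine, which is why the setup steps succeed, while the source and destination share none, which matches the timeout once the clone is supposed to start.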

So the fix here was to bind a temporary management subnet address (10.1.1.*) to the source machine’s NIC. Now the helper VM is able to communicate with the destination server over that management subnet.
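One way to script that temporary binding on a Windows source machine is with netsh. The sketch below only builds and prints the commands rather than running them. The interface name "Ethernet", the address 10.1.1.50, and the 255.255.255.0 mask are placeholders I made up; substitute whatever is free on your management subnet.

```python
def add_address_cmd(interface: str, address: str, mask: str) -> list:
    """Build the netsh command that adds a secondary IPv4 address to a NIC."""
    return ["netsh", "interface", "ipv4", "add", "address", interface, address, mask]


def delete_address_cmd(interface: str, address: str) -> list:
    """Build the netsh command that removes that address again afterwards."""
    return ["netsh", "interface", "ipv4", "delete", "address", interface, address]


if __name__ == "__main__":
    # Placeholder values, not taken from the post.
    nic, temp_ip, mask = "Ethernet", "10.1.1.50", "255.255.255.0"

    print("Before the conversion:", " ".join(add_address_cmd(nic, temp_ip, mask)))
    print("After the conversion: ", " ".join(delete_address_cmd(nic, temp_ip)))
```

Each list can be handed to subprocess.run from an elevated prompt on the source machine; removing the address once the conversion finishes keeps the binding temporary.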

#convert, #esxi, #file-io-error, #vmware, #windows