I was recently faced with an interesting problem. A company wanted to cost the migration of thousands of VMs to Azure using a lift-and-shift approach (also known as rehost). Due to the short deadline, we were not able to get our hands on detailed data. All we were given was a machine name, CPU core count, RAM and a description field that was sometimes populated. Utilisation, storage and network usage were notably missing. We knew we couldn’t cost the migration accurately because of these unknowns, but CPU core count and RAM were enough to cost the VMs themselves. I must also add that the VMs varied greatly in their hardware specifications.

Microsoft offers a pricing calculator, but it only supports manual input, which disqualified it for our use case. A few Microsoft employees wrote web applications that automate the pricing of VMs by importing Excel spreadsheets or CSV files. The ones I tried only offered USD as a currency and choked on anything bigger than a few hundred VMs. The output file used an en-US culture, so it had to be post-processed before being opened in Excel. I didn’t have the time to review and select a commercial solution (Azure Migrate requires creating a VM on-premises, which was not possible). At the end of the day I came up with a semi-automated process that did the trick, but I felt that not much work would be required to empower teams to price VMs based on a limited data set.

## Requirements

I wanted to build something that would fit my use case (AUD and en-au) but could also be used by people anywhere:

• Support thousands of VMs
• Support all currencies
• Support all cultures
• Ability to automate refreshing of the pricing
• Timeboxed to a weekend

Note: you can find the code on GitHub.

## What I came up with

My first goal was to retrieve the pricing from Azure. I initially considered the Resource RateCard (part of the Billing API) but the banner below didn’t fill me with confidence:

The Billing API requires authentication and parameters to be passed in, which would have increased the complexity of the solution. I knew one place would have mostly up-to-date pricing: the Virtual Machines Pricing page. This page displays all the available instances for a specific region. It is also possible to select a culture, currency and operating system. There was one problem though: the data is available as HTML markup instead of an API.

### Puppeteer

Puppeteer allows you to control Chrome or, as the project puts it:

Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.
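To give an idea of what scraping a pricing table with Puppeteer can look like, here is a minimal sketch. It is not the code from the repository: the function names `extractTableRows` and `scrapePricing`, the `table tr` selector and the URL passed in are all assumptions made for illustration, and the real page markup may well need a more specific selector.

```javascript
// Hypothetical sketch, not the repository's code.
// Reads every row matched by `rowSelector` and returns its cells as trimmed strings.
// The callback runs inside the browser page, so it must not rely on outer variables.
async function extractTableRows(page, rowSelector) {
  return page.$$eval(rowSelector, (rows) =>
    rows
      .map((row) =>
        Array.from(row.querySelectorAll("th, td"), (cell) =>
          cell.textContent.trim()
        )
      )
      .filter((cells) => cells.length > 0)
  );
}

// Launches headless Chrome, loads `url` and returns the table rows.
// `url` is supplied by the caller; "table tr" is an assumed selector.
async function scrapePricing(url) {
  // Loaded lazily so the helper above can be used without Puppeteer installed.
  const { default: puppeteer } = await import("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle2" });
    return await extractTableRows(page, "table tr");
  } finally {
    await browser.close();
  }
}
```

Calling `scrapePricing(url)` with the address of a pricing page resolves to an array of rows, each row being an array of cell texts that can then be written to disk.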

It feels more natural to write a crawler in JavaScript (JavaScript is the language of the web after all). In my limited experience, Puppeteer is also significantly faster than Selenium WebDriver. I didn’t bother creating an npm package, so you’ll have to clone the repository and follow the instructions. This is the kind of output you can expect from the tool:

I made the assumption that a single culture and currency would be used throughout a pricing session, which is why I only encoded the region and operating system in the generated file names. Calling this tool can easily be automated as it doesn’t require any configuration and generates files on disk. You could run it at regular intervals and publish the artefacts.

Once we’ve got our hands on the pricing, all we need to do is size the VMs and cost them.
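One way to read "size and cost" is: for each VM, pick the cheapest instance that offers at least as many cores and as much RAM. The sketch below illustrates that idea in JavaScript. It is an assumption about the approach, not the Coster's actual implementation (which is a .NET Core application), and the instance names, field names and prices are made up for the example.

```javascript
// Hypothetical sketch: instance names and prices are illustrative, not real Azure data.

// Returns the cheapest instance with at least the VM's cores and RAM,
// or null when no instance is large enough.
function pickCheapestInstance(vm, instances) {
  const candidates = instances.filter(
    (instance) => instance.cores >= vm.cores && instance.ramGb >= vm.ramGb
  );
  if (candidates.length === 0) {
    return null;
  }
  return candidates.reduce((best, instance) =>
    instance.pricePerHour < best.pricePerHour ? instance : best
  );
}

// Maps every VM to its selected instance and hourly price (null when unmatched).
function costVms(vms, instances) {
  return vms.map((vm) => {
    const instance = pickCheapestInstance(vm, instances);
    return {
      name: vm.name,
      instance: instance ? instance.name : null,
      pricePerHour: instance ? instance.pricePerHour : null,
    };
  });
}

const exampleInstances = [
  { name: "small", cores: 2, ramGb: 4, pricePerHour: 0.1 },
  { name: "medium", cores: 4, ramGb: 16, pricePerHour: 0.3 },
  { name: "large", cores: 8, ramGb: 32, pricePerHour: 0.6 },
];

const exampleVms = [
  { name: "web01", cores: 2, ramGb: 8 },
  { name: "db01", cores: 6, ramGb: 24 },
  { name: "huge01", cores: 64, ramGb: 512 },
];

console.log(costVms(exampleVms, exampleInstances));
```

In this example `web01` lands on `medium` because `small` lacks RAM, `db01` lands on `large`, and `huge01` is left unmatched so it can be flagged for manual review rather than silently mispriced.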

### Coster

The Coster is a .NET Core console application. Again, I didn’t bother pushing a NuGet package, so you’ll have to build from source. You’ll need the pricing files generated by the Parser (the Puppeteer-based tool described above). The input expected by the Coster is a CSV file with the following format:

Once done, the Coster will write a CSV file:

## Conclusion

Let me know if you’re using these tools and I’ll tidy up and publish packages on npm and NuGet. I can’t say I’ve tested them extensively so be ready for some rough edges!