Making an architectural decision on a simple DCIM ETL application
Hello, blog!
This week at work, I've been working on a PHP-IPAM [1] to NetBox [2] migration. These are Data Center Infrastructure Management (DCIM) systems that help us keep track of assets at our various sites.
We use an Excel spreadsheet (per site) with deployment information such as cage locations, racks, rack units to place the devices in, networking connections, and other configuration details. This information is then passed on to the on-site personnel to make it easy for them to set up the site.
Our team members would manually enter the data from the spreadsheet into the IPAM software. I was confident that we could replace this manual process with some Python automation code to read the spreadsheet and then update NetBox automatically.
Step 1 - Scoping the project
A problem well-stated is a problem half-solved.
What we needed our code to do could be cleanly split into 3 parts:
- (Extract) Parse information from the deployment spreadsheet
- (Transform) Convert the values into the fields NetBox expects
- (Load) POST the data to NetBox using the API
So this looks like an ETL problem. The core task is translating the human-readable values in the spreadsheet into the NetBox IDs the API expects.
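As a rough sketch, the pipeline breaks down into three small functions. The function names, spreadsheet layout, and openpyxl usage below are my own illustration, not the actual code:

# Rough sketch of the three ETL stages; names and details are illustrative.
import openpyxl  # assumes the deployment sheet is an .xlsx file

def extract(path):
    """Yield one row per device from the deployment spreadsheet."""
    sheet = openpyxl.load_workbook(path, data_only=True).active
    yield from sheet.iter_rows(min_row=2, values_only=True)  # skip the header row

def transform(row):
    """Turn human-readable values (site, role, rack, ...) into a NetBox payload."""
    raise NotImplementedError  # this is the interesting part, covered in Step 2

def load(payload):
    """POST the payload to the NetBox devices endpoint."""
    raise NotImplementedError  # covered in Step 2

def run(path):
    for row in extract(path):
        load(transform(row))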
Step 2 - Implementing the project
The JSON payload POSTed to NetBox to create a device looks like the following:
{
    "name": "compute-119.tr2-pr-rh",
    "device_type": 365,
    "role": 4,
    "site": 38,
    "location": 34,
    "rack": 187,
    "position": 15,
    "face": "front",
    "airflow": "front-to-rear",
    "status": "active"
}
As you can see, most of the fields take numeric IDs, but the deployment spreadsheet has human-readable values: device type 'server', role 'compute', site 'TR2', location 'cage 002:112:039', and so on.
So, in order to POST a device to NetBox, we first need to resolve these IDs. Let's explore three strategies for doing this:
1. Naïve per-row lookups
For each row of data in the deployment sheet, we make one call to NetBox per field that needs an ID. The advantage of this approach is that it's very simple to reason about, but it's slow for large spreadsheets and multiplies the number of requests hitting NetBox.
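Roughly, it would look like the sketch below. The URL, token, and column names are placeholders, and the filters assume the standard NetBox REST API query parameters:

# Strategy 1 sketch: one GET per field, for every row. Placeholders throughout.
import requests

NETBOX = "https://netbox.example.com/api"          # placeholder URL
HEADERS = {"Authorization": "Token <redacted>"}    # placeholder token

def lookup_id(endpoint, **filters):
    """Return the NetBox ID of the first object matching the given filters."""
    r = requests.get(f"{NETBOX}/{endpoint}/", params=filters, headers=HEADERS, timeout=30)
    r.raise_for_status()
    return r.json()["results"][0]["id"]

def transform(row):
    # Four round-trips to NetBox for a single spreadsheet row.
    return {
        "name": row["name"],
        "device_type": lookup_id("dcim/device-types", model=row["device_type"]),
        "role": lookup_id("dcim/device-roles", name=row["role"]),
        "site": lookup_id("dcim/sites", name=row["site"]),
        "rack": lookup_id("dcim/racks", name=row["rack"]),
        "position": row["position"],
        "face": "front",
        "status": "active",
    }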
2. Preload reference data into in-memory dictionaries (best for our use case and the one chosen)
In this method, before processing the spreadsheet, we load all the relevant reference tables once and then resolve everything in memory. Dictionary lookups in Python are O(1), so it's fast, and we reduce the number of API calls made to NetBox (DRY: we're not asking for the same information multiple times). The downside is the added complexity of the startup logic, plus we need clear rules for matching: for example, the site in the spreadsheet could be named 'tr2-site' while NetBox stores it as 'TR2', so we need some 'fuzzy-matching' rules.
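Here's a sketch of the preload step. The endpoints are the standard NetBox ones, but the URL, token, and normalization rule are just illustrations of the kind of matching logic we needed:

# Strategy 2 sketch: one GET per reference table, then O(1) dictionary lookups.
import requests

NETBOX = "https://netbox.example.com/api"          # placeholder URL
HEADERS = {"Authorization": "Token <redacted>"}    # placeholder token

def normalize(value):
    """Example matching rule: make 'tr2-site' and 'TR2' land on the same key."""
    return str(value).lower().replace("-site", "").strip()

def preload(endpoint, key_field="name"):
    """Fetch every object from one reference table and map normalized name -> ID."""
    mapping, url = {}, f"{NETBOX}/{endpoint}/"
    while url:                                      # follow NetBox's pagination links
        r = requests.get(url, headers=HEADERS, timeout=30)
        r.raise_for_status()
        page = r.json()
        for obj in page["results"]:
            mapping[normalize(obj[key_field])] = obj["id"]
        url = page["next"]
    return mapping

sites = preload("dcim/sites")
roles = preload("dcim/device-roles")
racks = preload("dcim/racks")
device_types = preload("dcim/device-types", key_field="model")

# Resolving a spreadsheet value is now a dictionary lookup, not an API call:
site_id = sites[normalize("tr2-site")]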
3. Maintain a metadata cache / mapping DB
We could introduce a small DB/table that tracks the mappings: a small "metadata service" kept in sync with NetBox by a cron job, with our import job reading from this cache rather than from NetBox directly. This involves managing the extra service/DB, though, so it was not selected. If in the future we rise to hyperscaler levels where we're adding servers daily, then I'd formalize the metadata cache and set up a periodic sync.
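For completeness, the cache could be as small as a single SQLite table. The sketch below is my own illustration of the idea, not something we built:

# Strategy 3 sketch: a tiny mapping table in SQLite, refreshed periodically by a cron job.
import sqlite3

def refresh_cache(db_path, mappings):
    """mappings: {"site": {"tr2": 38, ...}, "rack": {...}, ...} pulled from NetBox."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS netbox_ids "
        "(kind TEXT, name TEXT, id INTEGER, PRIMARY KEY (kind, name))"
    )
    for kind, table in mappings.items():
        con.executemany(
            "INSERT OR REPLACE INTO netbox_ids VALUES (?, ?, ?)",
            [(kind, name, obj_id) for name, obj_id in table.items()],
        )
    con.commit()
    con.close()

def cached_id(db_path, kind, name):
    """The import job reads IDs from the local cache instead of calling NetBox."""
    con = sqlite3.connect(db_path)
    row = con.execute(
        "SELECT id FROM netbox_ids WHERE kind = ? AND name = ?", (kind, name)
    ).fetchone()
    con.close()
    return row[0] if row else None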
So, in the end I built a batch import service that preloads reference data from NetBox into in-memory dictionaries. It then reads the deployment sheet row by row, builds the JSON payload shown above, and uploads it to NetBox with a POST API call.
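The load step itself is one POST per row against the devices endpoint, roughly like this (URL and token are placeholders again):

# Load step sketch: POST the assembled payload to NetBox.
import requests

NETBOX = "https://netbox.example.com/api"          # placeholder URL
HEADERS = {"Authorization": "Token <redacted>"}    # placeholder token

payload = {
    "name": "compute-119.tr2-pr-rh",
    "device_type": 365,
    "role": 4,
    "site": 38,
    "location": 34,
    "rack": 187,
    "position": 15,
    "face": "front",
    "airflow": "front-to-rear",
    "status": "active",
}

r = requests.post(f"{NETBOX}/dcim/devices/", json=payload, headers=HEADERS, timeout=30)
r.raise_for_status()          # NetBox returns 201 Created on success
print(r.json()["id"])         # the response includes the new device's ID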
Thanks for reading! Feel free to let me know if there's a better way of doing this that I haven't considered.
References:
[1] https://phpipam.net/
[2] https://netboxlabs.com/