April 16, 2020 • 7 minute read • By Luca Dubies

Building Windows Containers For GPU Acceleration

Introduction

Following our hardware setup as described in Part 1 of this blog series, it’s time to look at the containers we want to run on our system.

While GPU support for dockerized workloads on Linux has existed for years, Microsoft only caught up with the 1809 update for Windows 10 in late 2018. With Windows-based containers, there are a few restrictions one can stumble upon. Luckily, they are merely annoying or disrupt your usual workflow, rather than posing real problems to a project’s success. This blog post summarizes a few things we learned are important for Windows-based containers.

Windows Base Images And Versions

Microsoft itself hosts a plethora of different base images for Windows on its Microsoft Container Registry, varying greatly in feature set (and also in data size; I had to increase our VMs’ storage to cope with the big docker image).

Because we have special prerequisites, we need to find the right base image, one that allows us to use our GPUs to accelerate DirectX. Microsoft clearly states that it has to be the full Windows base image, and that we can’t use the smaller Server Core or Nano Server images. Also, it has to be version 1809 or newer!
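The two constraints above (full Windows image, version 1809 or newer) can be written down as a small check. The following is only a sketch in Python: the function name is made up for illustration, the repository paths are the ones I believe the Microsoft Container Registry uses for the three image flavors, and comparing version tags as integers is an assumption that only holds for four-digit tags such as 1809, 1903 or 1909.

```python
# Repository paths on the Microsoft Container Registry
# (assumed here, verify them before relying on this table).
MCR_REPOSITORIES = {
    "windows": "mcr.microsoft.com/windows",
    "servercore": "mcr.microsoft.com/windows/servercore",
    "nanoserver": "mcr.microsoft.com/windows/nanoserver",
}

# First Windows release with GPU acceleration for containers.
MIN_GPU_VERSION = 1809


def gpu_base_image(flavor: str, version: str) -> str:
    """Return the image reference, or raise if it cannot be GPU accelerated.

    Hypothetical helper: it only encodes the two rules from the text.
    """
    if flavor not in MCR_REPOSITORIES:
        raise ValueError(f"unknown image flavor: {flavor}")
    if flavor != "windows":
        raise ValueError("GPU acceleration needs the full Windows base image")
    if not version.isdigit() or int(version) < MIN_GPU_VERSION:
        raise ValueError("GPU acceleration needs version 1809 or newer")
    return f"{MCR_REPOSITORIES[flavor]}:{version}"
```

Calling gpu_base_image("windows", "1809") gives back the reference to put in the FROM line of a Dockerfile, while asking for "servercore" or "nanoserver" raises an error instead of silently producing an image that can never use the GPU.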

But taking a closer look at the basic working concepts of Windows containers leads to the conclusion that a little more care has to be taken when choosing which Windows build to use.

Different Isolation Modes

Near the end of the aforementioned article it is stated that no GPU acceleration is supported for Hyper-V isolated containers. Well, what is the difference between Hyper-V isolated containers and their counterparts, and how does it influence our decision for a specific Windows build?

For the avid Linux user who has to battle with isolation on Windows for the first time, Hyper-V might be a new term. Hyper-V is Microsoft’s hypervisor, used for virtualizing hardware. It is available in a standalone version, but also comes natively with almost all Windows 10 editions. As we don’t want our containers to run on this technology, we have to find the alternative.

The docs for Windows containers state that the container shares the kernel with the host OS, using process isolation to, well, isolate the container from the rest of the system. This is basically how containers work on Linux too. It is also the source of the restriction that the base image version can’t be newer than the host Windows version, because the Windows kernel might change significantly between versions. But apparently Windows containers with older base images can be run successfully on newer Windows versions.

And that is exactly where Hyper-V comes in. When dealing with an older base image, the container will be executed in a Hyper-V virtual machine instead of using process isolation. To avoid that, it is now clear that the base image version has to exactly match the host Windows version if we want to profit from GPU-accelerated workloads on Windows 10. A few more details about the different modes can be found here.
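The rule described in the last two paragraphs can be summarized in a few lines. This is a minimal sketch of the behaviour as described in this post, not of Docker’s actual implementation; the function names are hypothetical, and the build numbers in the comments (17763 for 1809, 18362 for 1903, 18363 for 1909) are the ones I know for these releases.

```python
def isolation_mode(host_build: int, image_build: int) -> str:
    """Model which isolation mode a container ends up in.

    Both arguments are Windows build numbers, for example
    17763 (release 1809), 18362 (1903) or 18363 (1909).
    """
    if image_build > host_build:
        # A base image newer than the host cannot run at all.
        raise ValueError("base image is newer than the host")
    if image_build < host_build:
        # Older image: wrapped in a Hyper-V VM, so no GPU acceleration.
        return "hyperv"
    # Same build: the container shares the host kernel.
    return "process"


def gpu_possible(host_build: int, image_build: int) -> bool:
    """GPU acceleration is only an option under process isolation."""
    try:
        return isolation_mode(host_build, image_build) == "process"
    except ValueError:
        return False
```

For a host on 1909, gpu_possible(18363, 17763) is False because the 1809 image would fall back to Hyper-V, while gpu_possible(18363, 18363) is True.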

Visualization of the different isolation modes

Building The Image

Now that we know exactly which base image to use, building a docker image for an application should take just a few simple steps. But I ran into a problem myself.

You see, while working with docker, I got used to the workflow of building my container image wherever the code I wanted to dockerize was, and just sending the image to whatever machine I wanted it to run on. That means I didn’t want to build the image on the VMs, but on the machine I use to develop the workload. But each time I tried to pull the correct base image, the Microsoft repository answered with “no matching manifest for unknown in the manifest list entries”!

While rechecking Windows versions for what seemed like the tenth time, it occurred to me to check my own machine. And would you look at that, I had not yet made the jump to the latest release. As I never intended to run the containers on this machine, that should not have had any consequences. Still, after an update, docker had no issues pulling the desired base image and building. So while it is clearly stated which Windows versions are compatible for running containers, no information is given that the same restrictions somehow also apply to building the images.

While I initially blamed myself for somehow missing this information, I later stumbled across some of the reviews for the Windows images. Seeing loads of people run into the same issue, with no information to be found anywhere, prompted me to include this little escapade in this blog.

What We’ve Learned

Because we aim to use GPUs to assist our dockerized workloads, the choice of the Windows base image is not trivial. But by understanding some of the functionality of Windows containers and the resulting restrictions, we can pin down what exactly we need for our project. It is also important to keep in mind that these restrictions might affect your build workflow, so try to build your Windows images on the target systems or make sure that all involved systems share the same version of Windows.
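To catch a version mismatch before docker does, a build script could compare the Windows release of the build machine with the tag of the base image. The sketch below is an assumption-laden illustration: the function names are hypothetical, the build-to-release table is partial and maintained by hand, and it assumes that Python’s platform.version() returns a string of the form major.minor.build on Windows (for example 10.0.17763).

```python
import platform

# Release tag per OS build number (partial, hand-maintained table).
RELEASE_BY_BUILD = {
    17763: "1809",
    18362: "1903",
    18363: "1909",
    19041: "2004",
}


def release_tag(os_version: str) -> str:
    """Map a version string such as '10.0.17763' to a tag such as '1809'."""
    parts = os_version.split(".")
    if len(parts) < 3 or not parts[2].isdigit():
        raise ValueError(f"unexpected version string: {os_version}")
    build = int(parts[2])
    if build not in RELEASE_BY_BUILD:
        raise ValueError(f"unknown Windows build: {build}")
    return RELEASE_BY_BUILD[build]


def can_build_for(target_tag: str, os_version: str) -> bool:
    """True if the build machine runs the same release as the base image."""
    return release_tag(os_version) == target_tag


if __name__ == "__main__":
    # Only meaningful on Windows; elsewhere platform.version() differs.
    if platform.system() == "Windows":
        print(can_build_for("1809", platform.version()))
```

The check is deliberately strict and follows the recommendation above that all involved systems share the same Windows version.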

What’s Next

After taking a peek at our hardware setup in Part 1 of this series and exploring part of the image build process in this blog, Part 3 will show how these containers can be dynamically deployed to fit our current needs.