Tags: docker, virtual-machine, emulation, instructions

How does containerization software like Docker translate CPU instructions?


I recently ran into a bug where a Python library used a CPU instruction that exists on one x86 processor but not on another, so the program crashed (Illegal instruction) on one system but not on the other. That got me thinking about the benefits of containerization for creating a well-defined run-time environment for my software. But my brain ground to a halt when I realized how low-level this is, and I could not figure out, either by reasoning or by reading on the internet, at what level the isolation provided by software like Docker operates.

Question

So my question is: Would containerization software, like Docker or LXC, be able to emulate an instruction which does not exist on the physical hardware? And would a full VM be able to deal with it if a container could not?

Anecdotal information

Thought I'd fill in the blanks, just because people were curious.

The specific scenario I was caught by was when trying to apply Reed-Solomon erasure coding to a data object. I'm using the PyECLib library which implements Vandermonde Reed-Solomon via the liberasurecode library (which in turn uses jerasure, I believe).

Minimal Working Example

This piece of code runs without errors on a compatible processor, but crashes with Illegal instruction on some older processors:

from pyeclib.ec_iface import ECDriver

ec_driver = ECDriver(k=1, m=5, ec_type='liberasurecode_rs_vand')
ec_driver.encode(b'foo')

Environment

I'm using Python 3.6 on multiple Linux platforms. The notable case where things go wrong is in an LXC container running Fedora 25 on the processor specified below, but I'd bet LXC and Fedora have little to do with it.

I've tried both pyeclib 1.4 and 1.1, and the same thing happens with both.

These processors make my program crash:

Here are some processors which work fine:


Solution

  • Containers don't translate instructions. A program running in a container runs directly on the host CPU, exactly like any other program on the same machine, except that it has separate ("namespaced") instances of certain things, like the filesystem, the network stack, and the system hostname. The CPU isn't emulated or virtualized (any more than usual, anyway).

    Virtual machines can support instructions not supported on the host machine, but they do not necessarily do so. If they do, it will usually come at a substantial cost in performance.
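
Since a container runs on the host's real CPU, one practical consequence is that you can check up front whether that CPU advertises the instruction set extensions your code needs, instead of waiting for the crash. Below is a minimal sketch, assuming an x86 Linux system where the kernel lists CPU features on a "flags" line in /proc/cpuinfo. The helper names and the flag names in the usage note are hypothetical; the question does not say which instruction PyECLib actually tripped on.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text.

    Only the first "flags" line is used; on x86 Linux every logical CPU
    normally reports the same set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(required, cpuinfo_text):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path) as f:
        return f.read()
```

To use it, call something like missing_features(["ssse3", "avx2"], read_cpuinfo()) before importing the library that needs those extensions, and refuse to start if the result is non-empty. Run inside a Docker or LXC container, this should report the same flags as on the host, which is the point made above: the container does not change what the CPU can execute.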