Tags: performance, optimization

How Long Do Programming Operations Take?


We've all probably seen this table from Peter Norvig's essay Teach Yourself Programming in Ten Years:

COMPUTER ACTION                             TIME [NANOSECONDS]
execute typical instruction                                  1
fetch from L1 cache memory                                 0.5
branch misprediction                                         5
fetch from L2 cache memory                                   7
mutex lock/unlock                                           25
fetch from main memory                                     100
send 2K bytes over 1 Gbps network                       20,000
read 1 MB sequentially from memory                     250,000
fetch from new disk location [seek]                  8,000,000
read 1 MB sequentially from disk                    20,000,000
send packet from US to Europe and back             150,000,000

The problem is, I have no clue what any of this means in the real world.

You see, all of the most popular languages nowadays are so heavily abstracted that you basically never need to touch a piece of memory yourself. All I know is "I type in thing, computer go brrrrr", or "I type in thing, computer go [TASK NOT RESPONDING] then go brrrrr". Optimization is hard when all you know is functions and libraries, and the only timing info available is in bits and bytes.

I'll get to the point of understanding the memory implications of each line of code I write eventually, but for now, what I want to know is:

How long do your average programming operations take?

Things like:

  • reading and manipulating a string
  • creating primitives
  • comparing two primitives
  • comparing two objects
  • iterating a loop
  • transferring control to (or "calling") a method/function
  • accessing a "list"
  • initializing an object

I'm also interested in the implications in other languages - that is, whether certain languages are better at certain tasks than others. For example, I know that array manipulation is awful in JS compared to languages like Java, and I'd imagine functions would be far easier to alter in something like Lisp. So which operations are more efficient in some languages than in others?

Thank you!


Solution

  • For one, the numbers are questionable. Most of them are more or less correct, albeit SSDs have improved disk latency by a couple of orders of magnitude.

    Now, back to your question. There are three groups of languages/runtimes.

    1. Compiled to native code like C++ or Golang. Usually but not always, that’s the fastest option available.

    2. Interpreted, like most Lua and Python implementations. Interpreters are slow by design, but many of them call lots of native functions. For those calls the cost is close to native code, minus a small overhead for the indirect jump and the marshalling of arguments and return values. For instance, if you write and benchmark Python code that reads and manipulates lots of strings, the Python runtime will call native code as long as you stick to built-in string methods, so it will be quite fast.

    3. Just-in-time compiled, like Java, .NET, or modern JavaScript implementations. These vary a lot. The best C# or Java implementations approach the performance of native code. JavaScript generally does not, because the language is dynamically typed and it's very hard to produce fast code from that.
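
    One practical note before the individual operations: for this third group, naive measurements lie, because the first thousands of iterations run interpreted before the compiler kicks in. Here's a minimal Java sketch of a warmup-then-measure loop; the class and method names are made up for illustration, and a dedicated harness like JMH is the better tool for serious measurements.

    ```java
    // Minimal micro-benchmark sketch (illustrative; prefer JMH for real work).
    public class TimingSketch {

        // The operation under test: a trivial summation loop.
        static long work(int n) {
            long sum = 0;
            for (int i = 0; i < n; i++) sum += i;
            return sum;
        }

        public static void main(String[] args) {
            // Warmup: give the JIT a chance to compile work() before timing it.
            for (int i = 0; i < 10_000; i++) work(1_000);

            long start = System.nanoTime();
            long result = work(1_000_000);
            long elapsed = System.nanoTime() - start;

            // Print the result so the JIT can't eliminate the loop as dead code.
            System.out.println("result=" + result + ", elapsed=" + elapsed + " ns");
        }
    }
    ```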

    Reading and manipulating a String

    If a string takes 1 MB, reading it is “read 1MB sequentially from memory” in your table. Manipulation costs vary depending on what precisely you do with the string.
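
    As a hedged illustration of how much manipulation costs can vary, the Java sketch below builds a string two ways: repeated concatenation copies the whole string on every append (quadratic total work), while StringBuilder appends into a growable buffer (roughly linear).

    ```java
    // Same resulting string, very different cost profiles.
    public class StringCost {
        public static void main(String[] args) {
            int n = 1 << 20; // about one million characters

            // Repeated concatenation copies the entire string on each append:
            // fine for a thousand characters, ruinous at the megabyte scale.
            String s = "";
            for (int i = 0; i < 1_000; i++) s += "x";

            // StringBuilder appends into a growable buffer: O(n) total work.
            StringBuilder sb = new StringBuilder(n);
            for (int i = 0; i < n; i++) sb.append('x');
            String t = sb.toString();

            System.out.println(s.length() + " " + t.length());
        }
    }
    ```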

    Creating primitives

    If you mean defining a local variable, then for languages which use the native stack (all compiled ones, plus some others like C#) the cost is “execute typical instruction”. For languages which only use the heap, the cost is a malloc(), which is somewhat more expensive.
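
    Java shows both costs side by side, since primitives live in stack slots or registers while boxed values live on the heap. A small sketch (the class name is illustrative):

    ```java
    // Stack-allocated primitive vs. heap-allocated boxed value.
    public class PrimitiveCost {
        public static void main(String[] args) {
            int a = 42;                         // a stack slot or register:
                                                // "execute typical instruction"
            Integer b = Integer.valueOf(1_000); // a heap allocation (values outside
                                                // the small-integer cache are allocated)
            System.out.println(a + b);          // unboxing is another cheap instruction
        }
    }
    ```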

    Comparing two primitives

    For strongly typed languages it's “execute typical instruction”; for dynamically typed ones, an indirect jump.

    Comparing two objects

    Varies widely. Comparing pointers is “execute typical instruction”. Comparing values depends on the values and can potentially be very expensive.
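
    A Java sketch of the two extremes: comparing references is a single pointer check, while comparing the values of two large arrays walks every element.

    ```java
    import java.util.Arrays;

    // Pointer comparison vs. value comparison of two 1M-element arrays.
    public class CompareCost {
        public static void main(String[] args) {
            int[] x = new int[1_000_000];
            int[] y = x.clone();

            System.out.println(x == y);              // one instruction: false
            System.out.println(Arrays.equals(x, y)); // compares up to 1M elements: true
        }
    }
    ```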

    Iterating a loop

    For most languages that's a typical instruction per iteration; however, some of them can sometimes run long loops faster, using automatic vectorization or parallelization under the hood.
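
    The loops that benefit are the simple, independent ones. The sketch below is the kind of counted Java loop a JIT such as HotSpot's C2 may auto-vectorize into SIMD instructions; whether that actually happens depends on the JVM version and the hardware.

    ```java
    // A loop shape that optimizing compilers commonly auto-vectorize:
    // a counted loop with independent element-wise work.
    public class LoopCost {
        public static void main(String[] args) {
            int[] a = new int[1 << 20];
            int[] b = new int[1 << 20];

            for (int i = 0; i < a.length; i++) a[i] = i;

            // No iteration depends on another, so the compiler may emit SIMD
            // instructions that handle several elements at once.
            for (int i = 0; i < a.length; i++) b[i] = a[i] * 2;

            System.out.println(b[12_345]);
        }
    }
    ```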

    Transferring control to (or "calling") a method/function

    For one, many languages can inline automatically, as needed and when possible. Failing that, the cost of calling a function or non-virtual method is a few typical instructions. Virtual methods can cost more: when the call is predicted (you're calling the same virtual method many times in a loop, and it resolves to the same code location every time), the cost is about the same; when it's not predicted, it's about the cost of “branch misprediction”.
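
    The Java sketch below (type names made up; records used for brevity) shows the difference. If the array only ever holds Circles, the call site stays monomorphic, the JIT can inline area(), and the virtual call is nearly free; an unpredictable mix of implementations pays an indirect jump per call, plus a misprediction whenever the predictor guesses wrong.

    ```java
    // Virtual dispatch cost depends on how predictable the call site is.
    interface Shape { double area(); }
    record Circle(double r) implements Shape { public double area() { return Math.PI * r * r; } }
    record Square(double s) implements Shape { public double area() { return s * s; } }

    public class DispatchCost {
        static double total(Shape[] shapes) {
            double sum = 0;
            // Monomorphic (all Circles): inlined, nearly free.
            // Mixed and unpredictable: indirect jump per call, ~5 ns extra
            // whenever the branch predictor guesses wrong.
            for (Shape s : shapes) sum += s.area();
            return sum;
        }

        public static void main(String[] args) {
            Shape[] same  = { new Circle(1), new Circle(2), new Circle(3) };
            Shape[] mixed = { new Circle(1), new Square(2), new Circle(3) };
            System.out.println(total(same) + " " + total(mixed));
        }
    }
    ```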

    Accessing a "list"

    Impossible to answer in general, because implementations differ too much between runtime libraries. In languages like Java or C#, lists are backed by an array and share an array's performance characteristics. If you mean a linked list, that depends on caching: in the best case “fetch from L1 cache memory” per element, in the worst case “fetch from main memory” for each element being accessed.
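
    In Java terms the contrast looks like the sketch below: ArrayList.get(i) is a bounds check plus an array load, while LinkedList.get(i) walks node pointers from the nearest end, so indexed iteration over a linked list degenerates into quadratic pointer chasing.

    ```java
    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;

    // Array-backed vs. linked list: same interface, different memory access.
    public class ListCost {
        public static void main(String[] args) {
            List<Integer> array  = new ArrayList<>();
            List<Integer> linked = new LinkedList<>();
            for (int i = 0; i < 100_000; i++) { array.add(i); linked.add(i); }

            long sum = 0;
            // Cache-friendly: sequential loads from one contiguous array.
            for (int i = 0; i < array.size(); i++) sum += array.get(i);

            // Pointer chasing: get(i) walks from the nearest end every time,
            // so this loop is O(n^2) and each hop can miss the cache.
            for (int i = 0; i < linked.size(); i++) sum += linked.get(i);

            System.out.println(sum);
        }
    }
    ```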

    Initializing an object

    Depends on way too many things. It can even be free in some cases.
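
    “Free” is not an exaggeration. In the hedged Java sketch below, the Point object never escapes the method, so a JIT with escape analysis (HotSpot has it) may scalar-replace it, keeping x and y in registers and never allocating at all; whether that fires depends on the JVM.

    ```java
    // An allocation the JIT may eliminate entirely via escape analysis.
    public class AllocCost {
        static final class Point {
            final int x, y;
            Point(int x, int y) { this.x = x; this.y = y; }
        }

        static long sumOfSquares(int n) {
            long sum = 0;
            for (int i = 0; i < n; i++) {
                Point p = new Point(i, i);  // never escapes this method, so it is
                sum += (long) p.x * p.y;    // a candidate for scalar replacement
            }
            return sum;
        }

        public static void main(String[] args) {
            System.out.println(sumOfSquares(1_000_000));
        }
    }
    ```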

    does each method included in the object incur a runtime penalty when the class is instantiated?

    In most languages, no. It can be “yes” in very dynamically typed languages; even Python is not dynamic enough for that, but JavaScript probably is.

    The rest of the questions depend too much on the specific language/runtime, and also on your code. Many C++ compilers can de-virtualize method calls when the compiler can deduce the concrete implementation. If your getValue method just returns a field instead of computing things, most compilers/runtimes/interpreters will inline the call, so going through the getter generates code equivalent to accessing the field directly.
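
    A final Java sketch of that point (the Box class is made up): once the JIT has warmed up, the two loops below typically compile to the same machine code, because the trivial getter is inlined away.

    ```java
    // A trivial getter is an easy inlining target: after JIT compilation,
    // both loops usually reduce to the same machine code.
    public class GetterCost {
        static final class Box {
            int value;
            Box(int value) { this.value = value; }
            int getValue() { return value; }  // just returns a field
        }

        public static void main(String[] args) {
            Box box = new Box(7);
            long a = 0, b = 0;
            for (int i = 0; i < 1_000_000; i++) a += box.getValue(); // via getter
            for (int i = 0; i < 1_000_000; i++) b += box.value;      // direct access
            System.out.println(a + " " + b);
        }
    }
    ```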