python pypy rpython

What is statically typed in RPython?


It is often stated that RPython (a subset of Python) is statically typed. (E.g. on Wikipedia.)

Initially, I wondered how they would add that to Python, and thought they might require statements such as assert isinstance(arg1, ...) at the beginning of each function (though I couldn't really believe that).

Then I looked at some RPython code and it doesn't really look statically typed at all. In many cases, it might be that the compiler can prove that a function argument can only be of certain types but definitely not in all cases.

E.g., this is the RPython implementation of string.split:

def split(value, by, maxsplit=-1):
    bylen = len(by)
    if bylen == 0:
        raise ValueError("empty separator")

    res = []
    start = 0
    while maxsplit != 0:
        next = value.find(by, start)
        if next < 0:
            break
        res.append(value[start:next])
        start = next + bylen
        maxsplit -= 1   # NB. if it's already < 0, it stays < 0

    res.append(value[start:len(value)])
    return res

In the PyPy documentation about RPython, it is said: "variables should contain values of at most one type".

So, do function arguments also count as variables? Or in what sense is RPython statically typed? Or is this actually misstated?


Solution

  • So, do function arguments also count as variables?

    Of course they do. They always do in pretty much every language.

    Or in what sense is RPython statically typed? Or is this actually misstated?

    The statement is correct. RPython is not Python: it is a subset of it and can be run as ordinary Python code, but when you actually compile RPython code, much of Python's dynamism is taken away from you (albeit only after import time, so you can still use metaclasses, generate code from strings, etc., which is used to great effect in some modules). As a result, the compiler (which is not the Python compiler, and is also vastly different from traditional compilers; see the associated documentation) can indeed determine that types are used statically. More accurately, code that relies on dynamic typing makes it past the parser, but results in a type error at a later stage of translation.

    In many cases, it might be that the compiler can prove that a function argument can only be of certain types but definitely not in all cases.

    Of course not. There's a lot of code that isn't statically typed, and quite a bit of statically typed code that the current annotator can't prove to be statically typed. But when such code is encountered, it's a compilation error, period.
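
    As a rough illustration of the "at most one type per variable" rule, here is a minimal sketch in plain Python. The function names are hypothetical and not taken from PyPy. All three functions run fine under an ordinary Python interpreter; the assumption (not checked against a particular RPython release) is that the RPython annotator would accept the first and reject the second at translation time, because x would have to be both an integer and a string. The third function has no type declarations at all, and its argument types are assumed to be inferred from how it is called.

```python
def consistent(flag):
    # x is an integer on every path, so a single type can be inferred for it.
    if flag:
        x = 1
    else:
        x = 2
    return x


def inconsistent(flag):
    # x is an integer on one path and a string on the other.
    # Valid Python, but presumably a translation-time type error in RPython.
    if flag:
        x = 1
    else:
        x = "two"
    return x


def add(a, b):
    # No annotations: the types of a and b are assumed to be inferred
    # from the call sites seen across the whole program.
    return a + b


if __name__ == "__main__":
    print(consistent(True))
    print(inconsistent(False))
    print(add(1, 2))
```

    If every call to add in a translated program passed integers, a and b would presumably be inferred as integers. Mixing add(1, 2) with add("a", "b") in the same program would be the argument-level analogue of inconsistent.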

    There are a few points that are important to realize:

    There's a lot of very interesting material on how the whole translation and typing actually works. For example, The RPython Toolchain describes the translation process in general, including type inference, and The RPython Typer describes the type system(s) used.