Is there any way to force a compiler or an assembler to only generate RIP-relative addressing code?
I am trying to find a way to provide a mapping from conventional programming models to a Turing complete abstract model of computation.
This seems to be related to your discussion in comments on
Is C actually Turing-complete? on cs.SE (which I happened to see recently before you posted this).
Note that PC-relative addressing doesn't help you achieve unbounded storage. The required data size can be an unbounded amount larger than the code size, so the offset part of PC-relative addressing would need to be of unbounded size. (And it normally is only usable for static storage.)
I thought you were suggesting having pointers that were relative to their own addresses (not to code), which would still require unbounded register widths for a traditional ISA like x86-64 so you might as well just use the Random Access Machine abstract model of computation. x86-64 needs the whole absolute address in a register, or at least 2 parts that sum to the absolute address. ([base + idx*scale]
where scale is a 2-bit left shift).
add rdi, [rdi]
adds a pointed-to offset to a pointer (like C ptr += *ptr
), but still needs the result to fit in a register.
If you mean stopping compilers from using absolute memory addressing for static data, then yes, that's easy, use gcc -fPIE
or -fPIC
.
But if you mean use only [rip + rel32]
addressing, never [reg]
or any subset of the general [base + idx*scale + disp0/8/32]
alternative to [RIP + rel32]
, then no, of course not for real-world compilers. RIP-relative addressing can only access static storage, so limiting yourself to that would mean no stack space and no pointers. x86-64's only RIP-relative addressing mode is [rip + rel32]
where the rel32 is a constant embedded in the machine code, not a register value.
(Perhaps you could use self-modifying code to modify the rel32
of a RIP+rel32 addressing mode, but no mainstream compiler will do that. It's not obvious how you'd manage re-entrancy for stack space with only one copy of a function's machine code that you're modifying, but perhaps keeping the right data in stack space would let you restore the caller's rel32 offsets).
In hand-written asm, you can of course do whatever you want, but limiting yourself to rewriting rel32 displacements makes it (strictly?) less powerful than vanilla x86-64, not more Turing complete.
If you're looking for an addressing mode like [PC + other_register]
, I think 32-bit ARM has that. It has indexed addressing, and the program counter is accessible as one of the 16 general-purpose registers (unlike in AArch64). So that would let you do PC-relative indexing of static arrays. Again, not that this helps in any obvious way. For any given instruction at a fixed PC to address any of an unbounded number of memory locations, the "other register" would have to have unbounded width.
Unbounded Turing-complete C:
I believe this is impossible unless you loosen the language to remove the fact that every type (including pointers) has some fixed width decided ahead of time, not based on the size of the input you want to deal with.
A Turing-complete C implementation could call malloc
in a loop an unbounded number of times, for example reading lines of input with fgets
and adding each line as it arrives into a binary tree with a standard recursive approach. Using a standard node layout based on C pointers:
struct node { struct node *left, *right; const char *str; };
. Then later traverse that tree and output the lines in sorted order.
For a tree to work, any existing node needs to be able to point to a newly-allocated note. Section-relative addressing doesn't get you any closer to that, as far as I can see. This binary-tree example might be a good litmus test for unbounded C, involving pointers to other objects with an arrangement depending on the input.
What you describe in comments seems to be writing local parts of a UTM state machine in x86 asm, with each state having its own 2GiB memory space and the ability to jump forward or backward to the next state. There's no clear way to have true random access or real pointers, only within the code for one state.
Using a finite C implementation for each step of a UTM doesn't give you an overall Turing-complete C implementation, it gives you a Turing machine with tape-like non-random-access when your problem sizes exceed what you can do inside one "state" or "memory bank" or whatever you call it.