c, assembly, x86, nasm, tcc

tcc: Use C standard functions in assembly code


I have simple assembly files created by NASM. I want to link them with tcc. For debugging I want to use printf() in my assembly code. But when I do so, tcc fails with tcc: undefined symbol 'printf'.

Here is a minimal example that reproduces the error:

extern printf
hello: db "Hello world!",0

global main
main:
    push hello
    call printf
    pop eax
    ret

Console:

nasm -felf hello.asm
tcc hello.o
tcc: undefined symbol 'printf'

When I use gcc hello.o everything works fine, so it has to be a tcc-specific problem. How do I get this to work with tcc?

Edit: I'm using a Windows version of NASM and TCC to generate 32-bit Windows executables.


Solution

  • It appears that TCC requires specific type information for functions with external linkage, such as printf. By default NASM emits references to external symbols with the NOTYPE type in ELF objects. This appears to confuse TCC, which seems to expect external function symbols to be marked with the FUNC type.


    I discovered this by taking the simple C program:

    #include <stdio.h>
    int main()
    {
        printf ("hello\n");
    }
    

    and compiling it to an object file (TCC uses ELF objects by default) with a command like:

    tcc -c simple.c 
    

    This generates simple.o. I happened to use OBJDUMP to display the assembly code and ELF headers. I didn't see anything unusual in the code but the symbol table in the headers showed a difference. If you use the program READELF you can get a detailed dump of the symbols.

    readelf -s simple.o
    
    Symbol table '.symtab' contains 5 entries:
       Num:    Value  Size Type    Bind   Vis      Ndx Name
         0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 00000000     0 FILE    LOCAL  DEFAULT  ABS simple.c
         2: 00000000     7 OBJECT  LOCAL  DEFAULT    2 L.0
         3: 00000000    26 FUNC    GLOBAL DEFAULT    1 main
         4: 00000000     0 FUNC    GLOBAL DEFAULT  UND printf
    

    Of particular interest is the symbol table entry for printf:

        4: 00000000     0 FUNC    GLOBAL DEFAULT  UND printf
    

    If you were to dump the ELF headers for your hello.o object you'd see something similar to this:

    readelf -s hello.o
    
    Symbol table '.symtab' contains 6 entries:
       Num:    Value  Size Type    Bind   Vis      Ndx Name
         0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 00000000     0 FILE    LOCAL  DEFAULT  ABS hello.asm
         2: 00000000     0 SECTION LOCAL  DEFAULT    1
         3: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 hello
         4: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
         5: 0000000d     0 NOTYPE  GLOBAL DEFAULT    1 main
    

    Notice how the symbol printf in hello.o differs from the one in simple.o above: its type is NOTYPE rather than FUNC. By default NASM gives symbols the NOTYPE type, and that includes symbols declared extern.
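
    To make the difference concrete: the Type and Bind columns that readelf prints both come from a single byte in each ELF symbol table entry, st_info, with the binding in the high four bits and the type in the low four bits. The sketch below decodes that byte. The constant values are the ones from the ELF specification; the helper names (sym_bind, sym_type, sym_info) are made up for this illustration.

```c
#include <assert.h>

/* Symbol type and binding values as defined by the ELF specification. */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };
enum { STB_LOCAL = 0, STB_GLOBAL = 1 };

/* Binding lives in the high nibble of st_info. */
unsigned sym_bind(unsigned char st_info) { return st_info >> 4; }

/* Type lives in the low nibble of st_info. */
unsigned sym_type(unsigned char st_info) { return st_info & 0x0f; }

/* Pack a binding and a type back into one st_info byte. */
unsigned char sym_info(unsigned bind, unsigned type)
{
    assert(bind <= 0x0f && type <= 0x0f);   /* each field is only 4 bits wide */
    return (unsigned char)((bind << 4) | type);
}
```

    Read this way, the printf entry in hello.o (GLOBAL, NOTYPE) has st_info 0x10, while the one in simple.o (GLOBAL, FUNC) has st_info 0x12. The two entries differ by that single byte.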


    Use YASM instead of NASM

    I don't know of a way to make NASM emit the FUNC type instead of NOTYPE for a symbol declared extern, so I can't offer a fix within NASM itself. As a check, I changed the type in a hex editor, and the object then linked and ran as expected.
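
    If you would rather script the hex-editor change than do it by hand, something along these lines should work. This is only a sketch under stated assumptions: it handles 32-bit little-endian ELF objects only, it works on a copy of the file already loaded into memory, and the function name mark_symbol_as_func is made up for this example. The field offsets are those of the ELF32 header, section header and symbol entry layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SHT_SYMTAB = 2, STT_FUNC = 2 };                    /* values from the ELF spec */
enum { EHDR_SIZE = 52, SHDR_SIZE = 40, SYM_SIZE = 16 };   /* ELF32 structure sizes */

/* Little-endian readers, so the code does not depend on host byte order. */
static uint32_t rd16(const unsigned char *p) { return p[0] | ((uint32_t)p[1] << 8); }
static uint32_t rd32(const unsigned char *p) { return rd16(p) | (rd16(p + 2) << 16); }

/* True when the byte range [off, off + size) lies inside an image of len bytes. */
static int in_range(uint64_t off, uint64_t size, size_t len)
{
    return off <= len && size <= len - off;
}

/* Hypothetical helper: set the type of every symbol called `name` to FUNC,
 * keeping its binding. Returns the number of symbol entries changed. */
int mark_symbol_as_func(unsigned char *img, size_t len, const char *name)
{
    assert(img != NULL && name != NULL);
    /* Require the ELF magic, EI_CLASS = 1 (32-bit), EI_DATA = 1 (little-endian). */
    if (len < EHDR_SIZE || memcmp(img, "\177ELF", 4) != 0 || img[4] != 1 || img[5] != 1)
        return 0;

    /* e_shoff, e_shentsize and e_shnum from the ELF32 header. */
    uint32_t shoff = rd32(img + 32), shentsize = rd16(img + 46), shnum = rd16(img + 48);
    if (shentsize < SHDR_SIZE || !in_range(shoff, (uint64_t)shentsize * shnum, len))
        return 0;

    size_t name_len = strlen(name);
    int patched = 0;
    for (uint32_t i = 0; i < shnum; i++) {
        const unsigned char *sh = img + shoff + (size_t)i * shentsize;
        uint32_t link = rd32(sh + 24);            /* sh_link: index of the string table */
        if (rd32(sh + 4) != SHT_SYMTAB || link >= shnum)
            continue;
        const unsigned char *strsh = img + shoff + (size_t)link * shentsize;
        uint32_t sym_off = rd32(sh + 16), sym_size = rd32(sh + 20);
        uint32_t str_off = rd32(strsh + 16), str_size = rd32(strsh + 20);
        if (!in_range(sym_off, sym_size, len) || !in_range(str_off, str_size, len))
            continue;
        for (uint32_t s = 0; sym_size - s >= (uint32_t)SYM_SIZE; s += SYM_SIZE) {
            unsigned char *sym = img + sym_off + s;
            uint32_t st_name = rd32(sym);         /* offset of the name in the string table */
            if (st_name >= str_size || name_len >= str_size - st_name)
                continue;                         /* name would run past the string table */
            if (memcmp(img + str_off + st_name, name, name_len + 1) == 0) {
                sym[12] = (unsigned char)((sym[12] & 0xf0) | STT_FUNC);  /* st_info */
                patched++;
            }
        }
    }
    return patched;
}
```

    A small driver would read hello.o into a buffer, call mark_symbol_as_func(buf, size, "printf"), and write the buffer back out. Running readelf -s on the result is a quick way to confirm that printf now shows FUNC before handing the object to tcc.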

    One alternative is to download YASM (a rewrite of NASM). For the most part NASM and YASM work the same. YASM's command line is mostly compatible with NASM so you should be able to use it as a direct replacement. YASM has an extra feature that allows you to specify the type of a symbol with the type directive:

    9.3.3. TYPE: Set symbol type
    
    ELF’s symbol table has the capability of indicating whether a symbol is a
    function or data. While this can be specified directly in the GLOBAL
    directive (see Section 9.4), the TYPE directive allows specifying the
    symbol type for any symbol, including local symbols.
    
    The directive takes two parameters; the first parameter is the symbol
    name, and the second is the symbol type. The symbol type must be either
    function or object. An unrecognized type will cause a warning to be
    generated. Example of use:
    
    func:
            ret
    type func function
    section .data
    var dd 4
    type var object
    

    You'd only have to add an extra line of type information to your assembly code for each external function you use. Your assembly code could be modified to look like:

    extern printf
    type printf function
    
    hello: db "Hello world!",0
    
    global main
    main:
        push hello
        call printf
        pop eax
        ret
    

    It should assemble and link with this:

    yasm -felf hello.asm -o hello.o
    tcc hello.o -o hello.exe