
Haskell LLVM -- Duplicate Functions Created


The problem I am having with the LLVM-Haskell bindings is that I am getting "duplicated" names. The best way to explain it is with a small concrete example. (The example is contrived, and for something this small there are easy ways around it, but it does show the problem.)

import Data.Int (Int32)
import Data.Word (Word32)
import LLVM.Core

putc :: TFunction (Int32 -> IO Word32)
putc = newNamedFunction ExternalLinkage "putchar"

simple :: TFunction (Int32 -> IO Word32)
simple = do
    internalputc <- putc
    createNamedFunction ExternalLinkage "simple" $ \x -> do
        call internalputc x
        call internalputc x
        ret (0 :: Word32)

easy :: TFunction (Int32 -> IO Word32)
easy = do 
    internalputc <- putc
    internalsimple <- simple
    createNamedFunction ExternalLinkage "easy" $ \x -> do
        call internalsimple x
        y <- add x (42 :: Int32)
        call internalputc y
        ret (0 :: Word32)

main :: IO ()
main = do
    m <- newNamedModule "Main"
    defineModule m easy
    writeBitcodeToFile "SillyLib" m

If you run this Haskell program (it needs imports from Data.Int, Data.Word, and LLVM.Core), you get the following output:

; ModuleID = 'SillyLib'

declare i32 @putchar(i32)

declare i32 @putchar1(i32)

define i32 @simple(i32) {
_L1:
  %1 = call i32 @putchar1(i32 %0)
  %2 = call i32 @putchar1(i32 %0)
  ret i32 0
}

define i32 @easy(i32) {
_L1:
  %1 = call i32 @simple(i32 %0)
  %2 = add i32 %0, 42
  %3 = call i32 @putchar(i32 %2)
  ret i32 0
}

The problem is that in the IR, the external putchar is declared twice, the second time under the name putchar1. I have a good sense of why this happens, but not of a nice general way around it; i.e., I don't want to have to put everything inside one giant CodeGenModule.

This brings me to another, related issue: is the LLVM-Haskell binding appropriate for building the backend of a compiler? Perhaps with a reasonable solution to the above I can figure out a way to use it, but it seems simpler just to write the IR code by hand.


Solution

  • You're running newNamedFunction ExternalLinkage "putchar" inside the CodeGenModule monad twice (once via putc in easy and once via putc in simple), and each run has the side effect of adding a putchar declaration to the module. That this results in two declarations instead of an error is probably a bug; please consider reporting it. To fix this, make putc a parameter of simple and easy, so that the declaration is created only once. This will look approximately as follows (not tested):

    simple :: Function (Int32 -> IO Word32) -> TFunction (Int32 -> IO Word32)
    simple putc =
        createNamedFunction ExternalLinkage "simple" $ \x -> do
            call putc x
            call putc x
            ret (0 :: Word32)
    
    easy :: Function (Int32 -> IO Word32) -> Function (Int32 -> IO Word32) 
            -> TFunction (Int32 -> IO Word32)
    easy putc simple' =
        createNamedFunction ExternalLinkage "easy" $ \x -> do
            call simple' x
            y <- add x (42 :: Int32)
            call putc y
            ret (0 :: Word32)
    
    main :: IO ()
    main = do
        m <- newNamedModule "Main"
        defineModule m $ do
            putc <- newNamedFunction ExternalLinkage "putchar"
            simple' <- simple putc
            easy putc simple'
        writeBitcodeToFile "SillyLib" m
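
To see why re-running the declaring action duplicates the symbol, here is a toy model that does not use the LLVM bindings at all. `ToyModule`, `declare`, `runToy`, and the `*Bad`/`*Good` names are made up for illustration, and the renaming scheme (`putchar1`) only imitates the IR output in the question; it is an assumption about the observed behavior, not the library's implementation.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Toy stand-in for a module: the symbol names declared so far, in order.
type ToyModule = IORef [String]

-- Toy stand-in for newNamedFunction/createNamedFunction: every run adds a
-- symbol, and a clashing name gets a numeric suffix (assumed behavior).
declare :: ToyModule -> String -> IO String
declare m name = do
    taken <- readIORef m
    let fresh = firstFree taken (name : [name ++ show n | n <- [1 :: Int ..]])
    modifyIORef m (++ [fresh])
    return fresh
  where
    firstFree taken (c : cs)
        | c `elem` taken = firstFree taken cs
        | otherwise      = c
    firstFree _ []       = name  -- unreachable: the candidate list is infinite

-- The question's pattern: each definition re-runs the declaring action.
simpleBad :: ToyModule -> IO String
simpleBad m = do
    _putc <- declare m "putchar"
    declare m "simple"

easyBad :: ToyModule -> IO String
easyBad m = do
    _putc   <- declare m "putchar"
    _simple <- simpleBad m
    declare m "easy"

-- The pattern above: handles are created once and passed in as parameters.
simpleGood :: ToyModule -> String -> IO String
simpleGood m _putc = declare m "simple"

easyGood :: ToyModule -> String -> String -> IO String
easyGood m _putc _simple = declare m "easy"

goodBuild :: ToyModule -> IO String
goodBuild m = do
    putc    <- declare m "putchar"
    simple' <- simpleGood m putc
    easyGood m putc simple'

-- Run a builder against an empty toy module and return the declared symbols.
runToy :: (ToyModule -> IO a) -> IO [String]
runToy build = do
    m <- newIORef []
    _ <- build m
    readIORef m

main :: IO ()
main = do
    runToy easyBad   >>= print  -- ["putchar","putchar1","simple","easy"]
    runToy goodBuild >>= print  -- ["putchar","simple","easy"]
```

The point of the sketch is that a value of type `IO String` here (or `TFunction ...` in the real bindings) is a recipe for a declaration, not the declaration itself, so binding it in two places performs the side effect twice. Passing the resulting handle around keeps the definitions separate without forcing them into one large block.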