Haskell LLVM -- Duplicate Functions Created

With the Haskell LLVM bindings I am getting "duplicated" names in the generated IR. A small concrete example explains it best. The example is contrived, and for something this small there are easy workarounds, but it does show the problem.

putc :: TFunction (Int32 -> IO Word32)
putc = newNamedFunction ExternalLinkage "putchar"

simple :: TFunction (Int32 -> IO Word32)
simple = do
    internalputc <- putc
    createNamedFunction ExternalLinkage "simple" $ \x -> do
        call internalputc x
        call internalputc x
        ret (0 :: Word32)

easy :: TFunction (Int32 -> IO Word32)
easy = do 
    internalputc <- putc
    internalsimple <- simple
    createNamedFunction ExternalLinkage "easy" $ \x -> do
        call internalsimple x
        y <- add x (42 :: Int32)
        call internalputc y
        ret (0 :: Word32)

main :: IO ()
main = do
    m <- newNamedModule "Main"
    defineModule m easy
    writeBitcodeToFile "SillyLib" m

If you run this Haskell program (you'll need imports such as Data.Int, Data.Word, and LLVM.Core) and disassemble the bitcode it writes, you'll get the following IR.

; ModuleID = 'SillyLib'

declare i32 @putchar(i32)

declare i32 @putchar1(i32)

define i32 @simple(i32) {
_L1:
  %1 = call i32 @putchar1(i32 %0)
  %2 = call i32 @putchar1(i32 %0)
  ret i32 0
}

define i32 @easy(i32) {
_L1:
  %1 = call i32 @simple(i32 %0)
  %2 = add i32 %0, 42
  %3 = call i32 @putchar(i32 %2)
  ret i32 0
}

The problem is that in the IR the (external) putchar is declared twice, the second time under the name putchar1. I have a good sense of why this happens, but not of a nice, general way around it. In particular, I don't want to have to put everything inside one giant CodeGenModule.

This brings me to a related question: are the LLVM-Haskell bindings appropriate for building the backend of a compiler? Perhaps, with a reasonable solution to the above, I can figure out a way to use them... but it seems simpler just to write the IR code by hand.

Solution

You're running the putc action, newNamedFunction ExternalLinkage "putchar", inside the CodeGenModule monad twice: once in simple and once in easy. Each run has the side effect of adding a putchar declaration to the module, so the module ends up with two. That this yields a second, renamed declaration instead of an error is probably a bug; please consider reporting it. To fix this, make putc a parameter of simple and easy. It will look approximately as follows (not tested):

simple :: Function (Int32 -> IO Word32) -> TFunction (Int32 -> IO Word32)
simple putc =
    createNamedFunction ExternalLinkage "simple" $ \x -> do
        call putc x
        call putc x
        ret (0 :: Word32)

easy :: Function (Int32 -> IO Word32) -> Function (Int32 -> IO Word32) 
        -> TFunction (Int32 -> IO Word32)
easy putc simple' =
    createNamedFunction ExternalLinkage "easy" $ \x -> do
        call simple' x
        y <- add x (42 :: Int32)
        call putc y
        ret (0 :: Word32)

main :: IO ()
main = do
    m <- newNamedModule "Main"
    defineModule m $ do
        -- Declare putchar exactly once, then share the handle.
        putc <- newNamedFunction ExternalLinkage "putchar"
        simple' <- simple putc
        easy putc simple'
    writeBitcodeToFile "SillyLib" m
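
To see why running the action twice produces two declarations, here is a toy model that uses nothing from the llvm package. Builder, declare, freshName, symbolsOf and the Dup/Shared definitions are all hypothetical names made up for this sketch, and the numeric-suffix renaming rule is an assumption chosen only to mirror the putchar1 in the IR above. It is a rough illustration of the monadic side effect, not how the bindings are implemented.

```haskell
-- Toy model only: Builder stands in for CodeGenModule and tracks the
-- symbol names added so far. None of these names come from the llvm package.
newtype Builder a = Builder { runBuilder :: [String] -> (a, [String]) }

instance Functor Builder where
  fmap f (Builder g) = Builder $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Builder where
  pure a = Builder $ \s -> (a, s)
  Builder f <*> Builder g = Builder $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad Builder where
  Builder g >>= k = Builder $ \s ->
    let (a, s') = g s in runBuilder (k a) s'

-- Pick the name itself if free, otherwise the first free numbered variant.
-- (Assumed renaming rule, chosen to mirror putchar1 in the IR above.)
freshName :: [String] -> String -> String
freshName taken name
  | name `notElem` taken = name
  | otherwise            = go 1
  where
    go :: Int -> String
    go n
      | candidate `notElem` taken = candidate
      | otherwise                 = go (n + 1)
      where candidate = name ++ show n

-- Every run of this action adds one more symbol to the module.
declare :: String -> Builder String
declare name = Builder $ \taken ->
  let fresh = freshName taken name in (fresh, taken ++ [fresh])

-- Run a builder against an empty module and list the symbols it added.
symbolsOf :: Builder a -> [String]
symbolsOf b = snd (runBuilder b [])

putc :: Builder String
putc = declare "putchar"

-- Shape of the question: each definition runs putc for itself.
simpleDup :: Builder String
simpleDup = putc >> declare "simple"

easyDup :: Builder String
easyDup = putc >> simpleDup >> declare "easy"

-- Shape of the fix: definitions take the already-declared handle.
simpleShared :: String -> Builder String
simpleShared _putc = declare "simple"

easyShared :: String -> String -> Builder String
easyShared _putc _simple = declare "easy"

sharedProgram :: Builder String
sharedProgram = do
  p <- putc
  s <- simpleShared p
  easyShared p s
```

Loaded into GHCi, symbolsOf easyDup evaluates to ["putchar","putchar1","simple","easy"], while symbolsOf sharedProgram evaluates to ["putchar","simple","easy"]. The point is that putc is a recipe for adding a declaration, not the declaration itself, so the handle it returns is the thing to pass around.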