I am very interested in generating command-line shell script source code from code written in a compiled language that promotes robustness, performs well and is platform-independent (OCaml, for instance). Basically, you would program in the compiled language whatever interactions with the OS you want (I would propose: the more complex interactions, or ones that are not easy to do in a platform-independent way), and then compile it, preferably to a native binary executable, which would generate a shell script that effects in the shell what you programmed in the compiled language. [ADDED]: By 'effects', I mean setting environment variables and shell options, executing certain non-standard commands (the standard scripting 'glue' would be handled by the compiled executable and kept out of the generated shell script) and such.
I have not found any such solution so far. It seems to be relatively easy* to realize compared to other possibilities of today, like compiling OCaml to JavaScript.
What I do not mean
Practical applications
Some of you may doubt the practical usefulness of what I describe. One practical application is to define a shell profile based on various conditions (for example the platform/OS on which the profile is being sourced, what follows from the security policy, the concrete shell, whether the shell is a login or non-login shell, and whether it is interactive or non-interactive). The advantages over a (well-crafted) generic shell profile written as a shell script would be: performance (native machine code that can emit a compact, optimized script, instead of a human-written script being interpreted), robustness (type checking, exception handling, compile-time verification of functionality, cryptographic signing of the resulting binary executable), capabilities (less or no reliance on userland CLI tools, no restriction to the minimum functionality covered by the CLI tools of all possible platforms) and cross-platform functionality (in practice, standards like the Single UNIX Specification only mean so much, and many shell profile concepts carry over to non-Unix platforms like Windows, with its PowerShell, too).
Implementation details, side issues
Thank you for any further thoughts, and especially for concrete suggestions!
There are no Haskell libraries for this, but you can implement this using abstract syntax trees. I'll build up a simple toy example that builds an abstract language-independent syntax tree and then applies a back-end that converts the tree into the equivalent Bash script.
I will use two tricks for modelling syntax trees in Haskell: GADTs to model typed expressions, and free monads to build the DSL on top of them.
The GADT trick is rather simple, and I use several language extensions to sweeten the syntax:
{-# LANGUAGE GADTs
           , FlexibleInstances
           , RebindableSyntax
           , OverloadedStrings #-}

import Data.String
import Prelude hiding ((++))

type UniqueID = Integer

newtype VStr = VStr UniqueID
newtype VInt = VInt UniqueID

data Expr a where
    StrL   :: String  -> Expr String   -- String literal
    IntL   :: Integer -> Expr Integer  -- Integer literal
    StrV   :: VStr    -> Expr String   -- String variable
    IntV   :: VInt    -> Expr Integer  -- Integer variable
    Plus   :: Expr Integer -> Expr Integer -> Expr Integer
    Concat :: Expr String  -> Expr String  -> Expr String
    Shown  :: Expr Integer -> Expr String

instance Num (Expr Integer) where
    fromInteger = IntL
    (+)         = Plus
    (*)         = undefined
    abs         = undefined
    signum      = undefined

instance IsString (Expr String) where
    fromString = StrL

(++) :: Expr String -> Expr String -> Expr String
(++) = Concat
This lets us build typed Bash expressions in our DSL. I only implemented a few primitive operations, but you could easily imagine how you could extend it with others.
If we didn't use any language extensions, we might write expressions like:
Concat (StrL "Test") (Shown (Plus (IntL 4) (IntL 5))) :: Expr String
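This is okay, but not very sexy. The `Num` instance above defines `fromInteger = IntL`, which is what numeric literals desugar to (with `RebindableSyntax` enabled, a literal uses whichever `fromInteger` is in scope, here the Prelude one), so you can replace `(IntL n)` with just `n`: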
Concat (StrL "Test") (Shown (Plus 4 5)) :: Expr String
The `Num` instance for `Expr Integer` also maps `+` to `Plus`, so you can add numeric expressions using `+`:
Concat (StrL "Test") (Shown (4 + 5)) :: Expr String
Similarly, I use `OverloadedStrings` so that you can replace all occurrences of `(StrL str)` with just `str`:
Concat "Test" (Shown (4 + 5)) :: Expr String
I also override the Prelude `(++)` operator so that we can concatenate expressions as if they were Haskell strings:
"Test" ++ Shown (4 + 5) :: Expr String
Other than the `Shown` cast from integers to strings, it looks just like native Haskell code. Neat!
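To see what the type index on `Expr` buys beyond nicer syntax, here is a minimal sketch of a direct evaluator. It works on a trimmed-down copy of `Expr` (the variable constructors are left out, as a simplifying assumption, so that no environment is needed), and `eval` is a hypothetical helper that is not part of the code above. Because each constructor fixes the result type, `eval` can return a real Haskell `String` or `Integer` without any runtime tags:

```haskell
{-# LANGUAGE GADTs #-}

-- Trimmed-down copy of the Expr type from above; StrV and IntV are
-- omitted so that evaluation does not need a variable environment.
data Expr a where
  StrL   :: String  -> Expr String
  IntL   :: Integer -> Expr Integer
  Plus   :: Expr Integer -> Expr Integer -> Expr Integer
  Concat :: Expr String  -> Expr String  -> Expr String
  Shown  :: Expr Integer -> Expr String

-- Hypothetical helper: matching on a constructor tells the type checker
-- what `a` is in that branch, so every branch is checked at its own type.
eval :: Expr a -> a
eval expr = case expr of
  StrL s     -> s
  IntL n     -> n
  Plus a b   -> eval a + eval b
  Concat a b -> eval a ++ eval b
  Shown e    -> show (eval e)

main :: IO ()
main = putStrLn (eval (Concat (StrL "Test") (Shown (Plus (IntL 4) (IntL 5)))))
-- prints: Test9
```

An ill-typed term such as `Plus (StrL "a") (IntL 1)` is rejected at compile time, which is the property the rest of the answer relies on.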
Now we need a way to create a user-friendly DSL, preferably with `Monad` syntactic sugar. This is where free monads come in.
A free monad takes a functor representing a single step in a syntax tree and creates a syntax tree from it. As a bonus, it is a monad for any functor, so you can assemble these syntax trees using `do` notation.
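For readers who have not met free monads before, the following is a hand-written sketch of the idea, using only `base`. It is an illustration rather than the library's source: the `free` package's `Control.Monad.Free` provides an equivalent `Free` type and `liftF`. The one-instruction language `LogF` and the names `say`, `render` and `demo` are hypothetical.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A free monad written out by hand: either a finished computation (Pure)
-- or one step of the functor f whose holes contain the rest of the program.
data Free f a
  = Pure a
  | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)  -- push the continuation into the step

-- Turn a single step into a one-step program.
liftF :: Functor f => f a -> Free f a
liftF = Free . fmap Pure

-- Hypothetical one-instruction language, just for this illustration.
data LogF next = Say String next
  deriving Functor

say :: String -> Free LogF ()
say msg = liftF (Say msg ())

-- A pure interpreter that walks the tree and collects the messages.
render :: Free LogF a -> [String]
render (Pure _)           = []
render (Free (Say msg k)) = msg : render k

demo :: [String]
demo = render (do say "one"; say "two")

main :: IO ()
main = mapM_ putStrLn demo
-- prints "one" and then "two", each on its own line
```

The only requirement on `LogF` is a `Functor` instance, which is why the code below derives `Functor` for its step type.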
To demonstrate it, I'll add some more code to the previous code segment:
-- This is in addition to the previous code
{-# LANGUAGE DeriveFunctor #-}

import Control.Monad.Free

data ScriptF next
    = NewInt (Expr Integer) (VInt -> next)
    | NewStr (Expr String ) (VStr -> next)
    | SetStr VStr (Expr String ) next
    | SetInt VInt (Expr Integer) next
    | Echo (Expr String) next
    | Exit (Expr Integer)
    deriving (Functor)

type Script = Free ScriptF

newInt :: Expr Integer -> Script VInt
newInt n = liftF $ NewInt n id

newStr :: Expr String -> Script VStr
newStr str = liftF $ NewStr str id

setStr :: VStr -> Expr String -> Script ()
setStr v expr = liftF $ SetStr v expr ()

setInt :: VInt -> Expr Integer -> Script ()
setInt v expr = liftF $ SetInt v expr ()

echo :: Expr String -> Script ()
echo expr = liftF $ Echo expr ()

exit :: Expr Integer -> Script r
exit expr = liftF $ Exit expr
The `ScriptF` functor represents a single step in our DSL. `Free` essentially creates a list of `ScriptF` steps and defines a monad where we can assemble lists of these steps. You can think of the `liftF` function as taking a single step and creating a list with one action.
We can then use `do` notation to assemble these steps, where `do` notation concatenates these lists of actions:
script :: Script r
script = do
    hello <- newStr "Hello, "
    world <- newStr "World!"
    setStr hello (StrV hello ++ StrV world)
    echo ("hello: " ++ StrV hello)
    echo ("world: " ++ StrV world)
    x <- newInt 4
    y <- newInt 5
    exit (IntV x + IntV y)
This shows how we assemble the primitive steps we just defined. This has all the nice properties of monads, including support for monadic combinators, like `forM_`:
import Control.Monad

script2 :: Script ()
script2 = forM_ [1..5] $ \i -> do
    x <- newInt (IntL i)
    setInt x (IntV x + 5)
    echo (Shown (IntV x))
Notice how our `Script` monad enforces type safety even though our target language might be untyped. You can't accidentally use a `String` literal where it expects an `Integer` or vice versa. You must explicitly convert between them using type-safe conversions like `Shown`.
Also note that the `Script` monad swallows any commands after the `exit` statement. They are ignored before they even reach the interpreter. Of course, you can change this behavior by rewriting the `Exit` constructor to accept a subsequent `next` step.
These abstract syntax trees are pure, meaning that we can inspect and interpret them purely. We can define several backends, such as a Bash backend that converts our `Script` monad to the equivalent Bash script:
bashExpr :: Expr a -> String
bashExpr expr = case expr of
    StrL str           -> str
    IntL int           -> show int
    StrV (VStr nID)    -> "${S" <> show nID <> "}"
    IntV (VInt nID)    -> "${I" <> show nID <> "}"
    Plus expr1 expr2   ->
        concat ["$((", bashExpr expr1, "+", bashExpr expr2, "))"]
    Concat expr1 expr2 -> bashExpr expr1 <> bashExpr expr2
    Shown expr'        -> bashExpr expr'

bashBackend :: Script r -> String
bashBackend script = go 0 0 script where
    -- nStrs and nInts count the string and integer variables allocated so far
    go nStrs nInts script =
        case script of
            Free f -> case f of
                NewInt e k ->
                    "I" <> show nInts <> "=" <> bashExpr e <> "\n" <>
                        go nStrs (nInts + 1) (k (VInt nInts))
                NewStr e k ->
                    "S" <> show nStrs <> "=" <> bashExpr e <> "\n" <>
                        go (nStrs + 1) nInts (k (VStr nStrs))
                SetStr (VStr nID) e script' ->
                    "S" <> show nID <> "=" <> bashExpr e <> "\n" <>
                        go nStrs nInts script'
                SetInt (VInt nID) e script' ->
                    "I" <> show nID <> "=" <> bashExpr e <> "\n" <>
                        go nStrs nInts script'
                Echo e script' ->
                    "echo " <> bashExpr e <> "\n" <>
                        go nStrs nInts script'
                Exit e ->
                    "exit " <> bashExpr e <> "\n"
            Pure _ -> ""
I defined two interpreters: one for the expression syntax tree and one for the monadic DSL syntax tree. These two interpreters compile any language-independent program into the equivalent Bash program, represented as a String. Of course, the choice of representation is entirely up to you.
This interpreter automatically creates fresh unique variables each time our `Script` monad requests a new variable.
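Because the syntax tree is plain data, the same program can be handed to more than one backend. The following is a minimal sketch of that idea on a deliberately reduced instruction set (plain Haskell strings and integers, no variables), so that it fits in a few lines. It assumes the `free` package, as the code above does; `toBash`, `runPure` and `demo` are hypothetical names, and the reduced `ScriptF` is not the one defined earlier.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Control.Monad.Free (Free (..), liftF)

-- Reduced, hypothetical instruction set: no Expr type and no variables.
data ScriptF next
  = Echo String next
  | Exit Integer
  deriving Functor

type Script = Free ScriptF

echo :: String -> Script ()
echo s = liftF (Echo s ())

exit :: Integer -> Script r
exit n = liftF (Exit n)

-- Backend 1: render the program as shell text.
toBash :: Script r -> String
toBash (Free (Echo s k)) = "echo " ++ s ++ "\n" ++ toBash k
toBash (Free (Exit n))   = "exit " ++ show n ++ "\n"
toBash (Pure _)          = ""

-- Backend 2: run the program purely, returning the echoed lines and the
-- exit code (0 when the script ends without an explicit exit).
runPure :: Script r -> ([String], Integer)
runPure (Free (Echo s k)) = let (out, code) = runPure k in (s : out, code)
runPure (Free (Exit n))   = ([], n)
runPure (Pure _)          = ([], 0)

demo :: Script ()
demo = do
  echo "building"
  echo "done"
  exit 3

main :: IO ()
main = do
  putStr (toBash demo)
  print (runPure demo)
```

A pure backend like `runPure` is handy for unit-testing a script's logic without starting a shell, while `toBash` produces the artifact you actually ship.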
Let's try out `bashBackend` and see if it works:
>>> putStr $ bashBackend script
S0=Hello,
S1=World!
S0=${S0}${S1}
echo hello: ${S0}
echo world: ${S1}
I0=4
I1=5
exit $((${I0}+${I1}))
It generates a Bash script that executes the equivalent language-independent program. Similarly, it translates `script2` just fine, too:
>>> putStr $ bashBackend script2
I0=1
I0=$((${I0}+5))
echo ${I0}
I1=2
I1=$((${I1}+5))
echo ${I1}
I2=3
I2=$((${I2}+5))
echo ${I2}
I3=4
I3=$((${I3}+5))
echo ${I3}
I4=5
I4=$((${I4}+5))
echo ${I4}
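One limitation is visible in the generated text: `bashExpr` emits string literals verbatim, so Bash re-parses them. In the first script, for example, the assignment generated for `"Hello, "` loses its trailing space because the value is unquoted, and a literal containing `$`, `;` or `*` would be interpreted by the shell. A minimal sketch of a fix, assuming POSIX single-quote rules, is to quote literals before emitting them. `shellQuote` is a hypothetical helper, not part of the code above, and this standalone version uses the ordinary Prelude `(++)` (inside the answer's module you would use `<>`, since `(++)` is hidden there):

```haskell
-- Hypothetical helper: wrap a literal in single quotes for a POSIX shell.
-- Inside single quotes every character is literal, so the only character
-- that needs care is the single quote itself, which is emitted as '\''
-- (close the quoted string, add an escaped quote, reopen the string).
shellQuote :: String -> String
shellQuote s = "'" ++ concatMap escape s ++ "'"
  where
    escape '\'' = "'\\''"
    escape c    = [c]

main :: IO ()
main = do
  putStrLn ("S0=" ++ shellQuote "Hello, ")  -- prints: S0='Hello, '
  putStrLn ("S1=" ++ shellQuote "it's")     -- prints: S1='it'\''s'
```

With this in place, the `StrL` case of `bashExpr` could return `shellQuote str`. Variable references such as `${S0}` would additionally need double quotes wherever word splitting matters, for instance as arguments to `echo`.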
So this is obviously not comprehensive, but hopefully that gives you some ideas for how you would implement this idiomatically in Haskell. If you want to learn more about the use of free monads, I recommend you read:
I've also attached the complete code here:
{-# LANGUAGE GADTs
           , FlexibleInstances
           , RebindableSyntax
           , DeriveFunctor
           , OverloadedStrings #-}

import Control.Monad.Free
import Control.Monad
import Data.Monoid
import Data.String
import Prelude hiding ((++))

type UniqueID = Integer

newtype VStr = VStr UniqueID
newtype VInt = VInt UniqueID

data Expr a where
    StrL   :: String  -> Expr String   -- String literal
    IntL   :: Integer -> Expr Integer  -- Integer literal
    StrV   :: VStr    -> Expr String   -- String variable
    IntV   :: VInt    -> Expr Integer  -- Integer variable
    Plus   :: Expr Integer -> Expr Integer -> Expr Integer
    Concat :: Expr String  -> Expr String  -> Expr String
    Shown  :: Expr Integer -> Expr String

instance Num (Expr Integer) where
    fromInteger = IntL
    (+)         = Plus
    (*)         = undefined
    abs         = undefined
    signum      = undefined

instance IsString (Expr String) where
    fromString = StrL

(++) :: Expr String -> Expr String -> Expr String
(++) = Concat

data ScriptF next
    = NewInt (Expr Integer) (VInt -> next)
    | NewStr (Expr String ) (VStr -> next)
    | SetStr VStr (Expr String ) next
    | SetInt VInt (Expr Integer) next
    | Echo (Expr String) next
    | Exit (Expr Integer)
    deriving (Functor)

type Script = Free ScriptF

newInt :: Expr Integer -> Script VInt
newInt n = liftF $ NewInt n id

newStr :: Expr String -> Script VStr
newStr str = liftF $ NewStr str id

setStr :: VStr -> Expr String -> Script ()
setStr v expr = liftF $ SetStr v expr ()

setInt :: VInt -> Expr Integer -> Script ()
setInt v expr = liftF $ SetInt v expr ()

echo :: Expr String -> Script ()
echo expr = liftF $ Echo expr ()

exit :: Expr Integer -> Script r
exit expr = liftF $ Exit expr

script :: Script r
script = do
    hello <- newStr "Hello, "
    world <- newStr "World!"
    setStr hello (StrV hello ++ StrV world)
    echo ("hello: " ++ StrV hello)
    echo ("world: " ++ StrV world)
    x <- newInt 4
    y <- newInt 5
    exit (IntV x + IntV y)

script2 :: Script ()
script2 = forM_ [1..5] $ \i -> do
    x <- newInt (IntL i)
    setInt x (IntV x + 5)
    echo (Shown (IntV x))

bashExpr :: Expr a -> String
bashExpr expr = case expr of
    StrL str           -> str
    IntL int           -> show int
    StrV (VStr nID)    -> "${S" <> show nID <> "}"
    IntV (VInt nID)    -> "${I" <> show nID <> "}"
    Plus expr1 expr2   ->
        concat ["$((", bashExpr expr1, "+", bashExpr expr2, "))"]
    Concat expr1 expr2 -> bashExpr expr1 <> bashExpr expr2
    Shown expr'        -> bashExpr expr'

bashBackend :: Script r -> String
bashBackend script = go 0 0 script where
    -- nStrs and nInts count the string and integer variables allocated so far
    go nStrs nInts script =
        case script of
            Free f -> case f of
                NewInt e k ->
                    "I" <> show nInts <> "=" <> bashExpr e <> "\n" <>
                        go nStrs (nInts + 1) (k (VInt nInts))
                NewStr e k ->
                    "S" <> show nStrs <> "=" <> bashExpr e <> "\n" <>
                        go (nStrs + 1) nInts (k (VStr nStrs))
                SetStr (VStr nID) e script' ->
                    "S" <> show nID <> "=" <> bashExpr e <> "\n" <>
                        go nStrs nInts script'
                SetInt (VInt nID) e script' ->
                    "I" <> show nID <> "=" <> bashExpr e <> "\n" <>
                        go nStrs nInts script'
                Echo e script' ->
                    "echo " <> bashExpr e <> "\n" <>
                        go nStrs nInts script'
                Exit e ->
                    "exit " <> bashExpr e <> "\n"
            Pure _ -> ""