I'd like to implement a simple sandbox using Linux namespaces and Go to execute a command.
To prevent the command from writing to disk, it is executed as another user using Credential: &syscall.Credential{Uid: uint32(1), Gid: uint32(1)}.
However, I get this error: "fork/exec /Main: operation not permitted".
Even if I change the code to Credential: &syscall.Credential{Uid: uint32(0), Gid: uint32(0)}, the same error occurs.
container.go is as follows:
// +build linux
// +build go1.12

package main

import (
	"flag"
	"fmt"
	"os"
	"os/exec"
	"os/user"
	"strconv"
	"syscall"

	"github.com/docker/docker/pkg/reexec"
)
func init() {
	// Register "justiceInit" => justiceInit() on every start.
	reexec.Register("justiceInit", justiceInit)

	/**
	 * 0. `init()` adds the key "justiceInit" to the `registeredInitializers` map;
	 * 1. reexec.Init() checks whether the key `os.Args[0]` exists in `registeredInitializers`;
	 * 2. the first time this binary is invoked, the key is os.Args[0], AKA "/path/to/clike_container",
	 *    for which the lookup in `registeredInitializers` returns `false`;
	 * 3. `main()` calls the binary itself via reexec.Command("justiceInit", args...);
	 * 4. the second time this binary is invoked, the key is os.Args[0], AKA "justiceInit",
	 *    which exists in `registeredInitializers`;
	 * 5. the value `justiceInit()` is invoked; any hooks (like setting the hostname) before fork() can be placed here.
	 */
	if reexec.Init() {
		os.Exit(0)
	}
}
func justiceInit() {
	command := os.Args[1]

	cmd := exec.Command(command)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	// Set uid and gid to another user.
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Setpgid:    true,
		Credential: &syscall.Credential{Uid: uint32(1), Gid: uint32(1)},
	}
	cmd.Env = []string{"PS1=[justice] # "}

	// The error "fork/exec /Main: operation not permitted" is returned here.
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}

// Logs are printed to os.Stderr.
func main() {
	command := flag.String("command", "./Main", "the command to execute in the sandbox")
	username := flag.String("username", "root", "the user to execute the command as")
	flag.Parse()

	u, err := user.Lookup(*username)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(0)
	}
	uid, _ := strconv.Atoi(u.Uid)
	gid, _ := strconv.Atoi(u.Gid)

	// Only the command is passed on; the basedir and timeout flags
	// are not defined in this excerpt.
	cmd := reexec.Command("justiceInit", *command)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWNS |
			syscall.CLONE_NEWUTS |
			syscall.CLONE_NEWIPC |
			syscall.CLONE_NEWPID |
			syscall.CLONE_NEWNET |
			syscall.CLONE_NEWUSER,
		// Container ID 0 maps to the invoking user, container ID 1 to the target user.
		UidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: os.Getuid(), Size: 1},
			{ContainerID: 1, HostID: uid, Size: 1},
		},
		GidMappings: []syscall.SysProcIDMap{
			{ContainerID: 0, HostID: os.Getgid(), Size: 1},
			{ContainerID: 1, HostID: gid, Size: 1},
		},
	}

	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	os.Exit(0)
}
When I run sudo ./container -command='/Main' -username='nobody', the error "fork/exec /Main: operation not permitted" occurs.
The user inside the user namespace of justiceInit should be root, but it cannot set the uid and gid using Credential.
I'm new to Linux and namespaces, so maybe I misunderstand something. How should I fix this error? Thanks very much!
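One way to check which IDs the child really sees is to dump /proc/self/uid_map and /proc/self/gid_map from inside justiceInit. The following is only a minimal sketch of that idea: parseIDMap is a hypothetical helper that is not part of container.go, and it assumes the usual three-column map format (ID inside the namespace, ID outside, length).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// parseIDMap parses the contents of a uid_map or gid_map file. Each
// non-empty line is expected to hold three integers:
// "<id-inside-ns> <id-outside-ns> <length>".
func parseIDMap(content string) ([]syscall.SysProcIDMap, error) {
	var maps []syscall.SysProcIDMap
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) != 3 {
			return nil, fmt.Errorf("malformed map line %q", line)
		}
		var nums [3]int
		for i, f := range fields {
			n, err := strconv.Atoi(f)
			if err != nil {
				return nil, fmt.Errorf("malformed map line %q: %v", line, err)
			}
			nums[i] = n
		}
		maps = append(maps, syscall.SysProcIDMap{
			ContainerID: nums[0],
			HostID:      nums[1],
			Size:        nums[2],
		})
	}
	return maps, nil
}

func main() {
	// Print the mappings of the current process; inside the sandbox this
	// would show the two entries configured in main().
	for _, name := range []string{"uid_map", "gid_map"} {
		data, err := os.ReadFile("/proc/self/" + name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		maps, err := parseIDMap(string(data))
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s: %+v\n", name, maps)
	}
}
```

If the mappings look as expected, the IDs themselves are probably not the cause of the failure, which points toward a permission on one of the syscalls made before exec.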
Following the recommendation from @Charles Duffy, I traced the source code of cmd.Run() and found this:
type SysProcAttr struct {
	UidMappings []SysProcIDMap // User ID mappings for user namespaces.
	GidMappings []SysProcIDMap // Group ID mappings for user namespaces.
	// GidMappingsEnableSetgroups enabling setgroups syscall.
	// If false, then setgroups syscall will be disabled for the child process.
	// This parameter is no-op if GidMappings == nil. Otherwise for unprivileged
	// users this should be set to false for mappings work.
	GidMappingsEnableSetgroups bool
}
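Thus, if GidMappingsEnableSetgroups is false (the default), processes in the new user namespace, including justiceInit, are not allowed to use the setgroups syscall, regardless of whether they have root privileges inside the namespace. Because Credential is set, Go calls setgroups in the forked child before setgid and setuid (unless Credential.NoSetGroups is true), so the exec fails with "operation not permitted" even for Uid 0 and Gid 0.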
So when I set cmd.SysProcAttr.GidMappingsEnableSetgroups to true in the function main as follows, it works!
cmd.Stdin = os.Stdin
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
cmd.SysProcAttr = &syscall.SysProcAttr{
	// ...
	GidMappingsEnableSetgroups: true,
}