dockergoiowaitgroup

loading docker image fails


I am using golang, docker client to load a docker image in .tar format.

func loadImageFromTar(cli *client.Client, tarFilePath string) (string, error) {
    // Read tar file
    tarFile, err := os.Open(tarFilePath)
    if err != nil {
        return "", fmt.Errorf("failed to open tar file: %w", err)
    }
    defer tarFile.Close()

    // Create a pipe to stream data between tar reader and Docker client
    pr, pw := io.Pipe()

    // Set up a WaitGroup for synchronization
    var wg sync.WaitGroup
    wg.Add(2)

    // Load the Docker image in a separate goroutine
    var imageLoadResponse types.ImageLoadResponse
    go func() {
        defer wg.Done()
        imageLoadResponse, err = cli.ImageLoad(context.Background(), pr, false)
        if err != nil {
            err = fmt.Errorf("failed to load Docker image: %w", err)
        }
    }()

    // Read tar file metadata and copy the tar file to the pipe writer in a separate goroutine
    var repoTag string
    go func() {
        defer wg.Done()
        defer pw.Close()

        tarReader := tar.NewReader(tarFile)

        for {
            header, err := tarReader.Next()
            if err == io.EOF {
                break
            }
            if err != nil {
                err = fmt.Errorf("failed to read tar header: %w", err)
                fmt.Printf("Error: %v", err)
                return
            }

            // Extract the repository and tag from the manifest file
            if header.Name == "manifest.json" {
                data, err := io.ReadAll(tarReader)
                if err != nil {
                    err = fmt.Errorf("failed to read manifest file: %w", err)
                    fmt.Printf("Error: %v", err)
                    return
                }

                var manifest []map[string]interface{}
                err = json.Unmarshal(data, &manifest)
                if err != nil {
                    err = fmt.Errorf("failed to unmarshal manifest: %w", err)
                    fmt.Printf("Error: %v", err)
                    return
                }

                repoTag = manifest[0]["RepoTags"].([]interface{})[0].(string)
            }

            // Copy the tar file data to the pipe writer
            _, err = io.Copy(pw, tarReader)
            if err != nil {
                err = fmt.Errorf("failed to copy tar data: %w", err)
                fmt.Printf("Error: %v", err)
                return
            }
        }
    }()

    // Wait for both goroutines to finish
    wg.Wait()

    // Check if any error occurred in the goroutines
    if err != nil {
        return "", err
    }

    // Close the image load response body
    defer imageLoadResponse.Body.Close()

    // Get the image ID
    imageID, err := getImageIDByRepoTag(cli, repoTag)
    if err != nil {
        return "", fmt.Errorf("failed to get image ID: %w", err)
    }

    return imageID, nil
}

// func: getImageIDByRepoTag

func getImageIDByRepoTag(cli *client.Client, repoTag string) (string, error) {
    images, err := cli.ImageList(context.Background(), types.ImageListOptions{})
    if err != nil {
        return "", fmt.Errorf("failed to list images: %w", err)
    }

    for _, image := range images {
        for _, tag := range image.RepoTags {
            if tag == repoTag {
                return image.ID, nil
            }
        }
    }

    return "", fmt.Errorf("image ID not found for repo tag: %s", repoTag)
}

The getImageIDByRepoTag always returns fmt.Errorf("image ID not found for repo tag: %s", repoTag). Also when I run docker images I do not see the image being loaded. Looks like the image load is not completing .

In my other code, The docker images load usually takes time although the docker client cli.ImageLoad returns immediately. I usually add some 30 seconds wait time before checking for getImageIDByRepoTag. Adding a wait time also did not help in this case.

Thanks


Solution

  • There's a couple of issues:


    To avoid reading the tar byte stream twice, you could use io.TeeReader. This allows you to read the tar archive - to scan for the manifest file - but also have this stream written in full elsewhere (i.e. to the docker client).

    Create the TeeReader:

    tr := io.TeeReader(tarFile, pw)  // reading `tr` will read the tarFile - but simultaneously write to `pw`
    

    the image load will now read from this (instead of the pipe):

    //imageLoadResponse, err = cli.ImageLoad(context.Background(), pr, false)
    imageLoadResponse, err = cli.ImageLoad(context.Background(), tr, false)
    

    then change your archive/tar reader to read from the pipe:

    //tarReader := tar.NewReader(tarFile) // direct from file
    tarReader := tar.NewReader(pr) // read from pipe (indirectly from the file)
    

    you can then drop the your io.Copy block:

    // no longer needed:
    //
    // _, err = io.Copy(pw, tarReader)
    //
    

    since the tar-inspection code will read the entire stream to EOF.

    P.S. you may want to reset the io.EOF to nil to avoid thinking a EOF is a more critical error, when you inspect any potential errors from either of the goroutines:

    header, err = tarReader.Next()
    if err == io.EOF {
        err = nil  //  EOF is a non-fatal error here
        break
    }