go file-io unmarshalling ulimit defer-keyword

Go ioutil using too many file descriptors/leak?


I am going through a list of files and unmarshalling the XML data in them into an array of structs, rArray. I intend to process about 18000 files. After about 1300 files have been processed, the program panics and says that too many files are open. If I limit the number of files processed to a safe 1000, the program does not crash.

As seen below, I am using ioutil.ReadFile to read the file data.

for _, f := range files {

    func() {
        data, err := ioutil.ReadFile("./" + recordDir + "/" + f.Name())
        if err != nil {
            fmt.Printf("error reading %v\n", err)
            return
        } else {
            if strings.Contains(filepath.Ext(f.Name()), "xml") {

                // unmarshal data and put into struct array
                // (a, the index into rArray, is set outside this snippet)
                err = xml.Unmarshal(data, &rArray[a])
                if err != nil {
                    fmt.Printf("error decoding %v: %v\n", f.Name(), err)
                    return
                }
            }
        }
    }()
}

I am not sure if Go is using too many file descriptors or not closing the files fast enough.

After reading https://groups.google.com/forum/#!topic/golang-nuts/7yXXjgcOikM and viewing the ioutil source at http://golang.org/src/pkg/io/ioutil/ioutil.go, I see that ioutil.ReadFile uses defer to close the file. A deferred call runs when the function containing the defer statement returns, and here that function is ReadFile() itself, so the file should be closed before ReadFile() hands the data back to my loop. Am I correct in this understanding? I also tried wrapping the ioutil.ReadFile part of my code in a function, but it made no difference.
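The defer timing described above can be checked without touching real files. The following is a minimal sketch: readLike and trace are hypothetical names that only mimic the open / defer close / read shape of ioutil.ReadFile and record the order of events in a slice.

```go
package main

import "fmt"

// readLike mimics the shape of ioutil.ReadFile: open, defer close, read.
// No real file is involved; events are appended to log in the order they occur.
func readLike(name string, log *[]string) {
	*log = append(*log, "open "+name)
	// The deferred call runs when readLike returns, not when the caller returns.
	defer func() { *log = append(*log, "close "+name) }()
	*log = append(*log, "read "+name)
}

// trace calls readLike once per name and returns the recorded event order.
func trace(names ...string) []string {
	var log []string
	for _, n := range names {
		readLike(n, &log)
	}
	return log
}

func main() {
	for _, event := range trace("a.xml", "b.xml") {
		fmt.Println(event)
	}
}
```

Each "close" is recorded before the next "open", which is consistent with the reading that a function shaped like ReadFile does not keep descriptors open across loop iterations.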

My ulimit is set to unlimited.
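In bash, a bare ulimit reports the file-size limit (-f), so "unlimited" there may say nothing about descriptors; the open-file limit is the one shown by ulimit -n. As a hedged sketch, assuming a Unix-like system, the limit that the Go process itself sees can be read with syscall.Getrlimit. The helper name openFileLimit is hypothetical.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard limits on open file descriptors
// for the current process. Unix-only: it relies on RLIMIT_NOFILE.
func openFileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	// The field types differ between platforms, so convert explicitly.
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

Running this in the same environment as the failing program (for example on the Heroku dyno rather than locally) shows the limit that actually applies there.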

UPDATE: I believe that the "too many open files" error is actually occurring in my Unzip function.

func Unzip(src, dest string) error {
    r, err := zip.OpenReader(src)
    if err != nil {
        return err
    }

    for _, f := range r.File {
        rc, err := f.Open()
        if err != nil {
            panic(err)
        }

        path := filepath.Join(dest, f.Name)
        if f.FileInfo().IsDir() {
            os.MkdirAll(path, f.Mode())
        } else {
            f, err := os.OpenFile(
                path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, f.Mode())
            if err != nil {
                panic(err)
            }

            _, err = io.Copy(f, rc)
            if err != nil {
                panic(err)
            }
            f.Close()
        }
        rc.Close()
    }
    r.Close()
    return nil
}

I initially got the Unzip function from https://gist.github.com/hnaohiro/4572580, but on further inspection the use of defer in the gist author's function seemed wrong: the files would only be closed after Unzip() returned, which is too late because by then 18000 file descriptors would be open. ;)

I replaced the deferred Closes with explicit Close() as shown above, but am still getting the same "too many open files" error. Is there a problem with my modified Unzip function?

UPDATE #2: Oops, I was running this on Heroku and was pushing my changes to the wrong app this entire time. Lesson learned: verify the target app in the Heroku Toolbelt.

The Unzip code from https://gist.github.com/hnaohiro/4572580 does not work because it does not close any file until all files have been processed.

My unzip code with explicit close above works and so does the defer version in @peterSO's answer.
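The difference between deferring inside a loop and deferring inside a per-item function can be sketched without real files. In the example below, tracker is a hypothetical stand-in for file descriptors that records the peak number of handles open at once; deferInLoop and deferPerItem are illustrative names, not part of any library.

```go
package main

import "fmt"

// tracker counts fake open handles and remembers the highest count seen.
type tracker struct{ open, peak int }

func (t *tracker) acquire() {
	t.open++
	if t.open > t.peak {
		t.peak = t.open
	}
}

func (t *tracker) release() { t.open-- }

// deferInLoop defers every release until the whole function returns,
// so all n handles are held open at the same time.
func deferInLoop(n int) int {
	t := &tracker{}
	for i := 0; i < n; i++ {
		t.acquire()
		defer t.release()
	}
	return t.peak
}

// deferPerItem runs each iteration in its own function call, so the
// deferred release fires before the next handle is acquired.
func deferPerItem(n int) int {
	t := &tracker{}
	for i := 0; i < n; i++ {
		func() {
			t.acquire()
			defer t.release()
		}()
	}
	return t.peak
}

func main() {
	fmt.Println("defer in loop, peak open:", deferInLoop(18000))
	fmt.Println("defer per item, peak open:", deferPerItem(18000))
}
```

With 18000 items the first variant peaks at 18000 open handles and the second at 1, which matches the behaviour reported for the gist version versus the explicit-close and per-file-function versions.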


Solution

  • I would modify the Unzip function from https://gist.github.com/hnaohiro/4572580 to the following:

    package main
    
    import (
        "archive/zip"
        "io"
        "log"
        "os"
        "path/filepath"
    )
    
    func unzipFile(f *zip.File, dest string) error {
        rc, err := f.Open()
        if err != nil {
            return err
        }
        defer rc.Close()
    
        path := filepath.Join(dest, f.Name)
        if f.FileInfo().IsDir() {
            err := os.MkdirAll(path, f.Mode())
            if err != nil {
                return err
            }
        } else {
            f, err := os.OpenFile(
                path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, f.Mode())
            if err != nil {
                return err
            }
            defer f.Close()
    
            _, err = io.Copy(f, rc)
            if err != nil {
                return err
            }
        }
        return nil
    }
    
    func Unzip(src, dest string) error {
        r, err := zip.OpenReader(src)
        if err != nil {
            return err
        }
        defer r.Close()
    
        for _, f := range r.File {
            err := unzipFile(f, dest)
            if err != nil {
                return err
            }
        }
    
        return nil
    }
    
    func main() {
        err := Unzip("./sample.zip", "./out")
        if err != nil {
            log.Fatal(err)
        }
    }