swift, memory-management

How to force releasing memory


I’m trying to iterate over a large file tree, reading URLs and creating hash values, but there’s a memory leak I can’t fix. Memory consumption keeps growing until the app stops working. For testing, I tried to read and hash 600 GB / 350K files, with single files of up to 25 GB, writing each URL and hash value to a simple text file. Production data would be about 50-70 TB / 1.8M files.

I have read quite a few articles and posts and came across ‘autoreleasepool’, but nothing has worked so far. The memory consumed while hashing the files isn’t released, and the application’s memory footprint grows rapidly.

    func readWrite(_ url: URL) {

        let enumerator = FileManager().enumerator(at: url, includingPropertiesForKeys: [.isRegularFileKey])

        while let element = enumerator?.nextObject() as? URL {
            do {
                if element.isFileURL {
                    let data = try Data(contentsOf: element)
                    let hash = createFileHash(element)

                    let stringData = Data("\(hash),\(element.description)\n".utf8)
                    write(stringData, url)
                }
            } catch {
                print("\(error.localizedDescription)")
            }
        }
    }

    private func createFileHash(_ url: URL) -> Int {
        autoreleasepool {
            guard let data = try? Data(contentsOf: url) else { return 0 }
            return data.hashValue
        }
    }

Solution

  • Your use of autoreleasepool successfully releases the Data(contentsOf: url) that you create in createFileHash. However, there are also two lines in your while loop that create Data objects:

    let data = try Data(contentsOf: element)
    // and
    let stringData = Data("\(hash),\(element.description)\n".utf8)
    

    The Data created by the first line is not released, because it is created outside any autoreleasepool block. You also don't seem to use this data variable, so you could simply delete the line. For the rest of this answer, I will assume that this is not your full code and that you do use it somewhere.

    The Data created by the second line is purely a Swift value, so as far as I know it is deallocated normally, like any other Swift object.

    You should wrap the whole body of the while loop in autoreleasepool, so that temporary objects are drained at the end of every iteration. You can then remove the autoreleasepool in createFileHash (assuming this is the only place where you call createFileHash).

    while let element = enumerator?.nextObject() as? URL {
        autoreleasepool {
            do {
                if element.isFileURL {
                    let data = try Data(contentsOf: element)
                    let hash = createFileHash(element)
                    
                    let stringData = Data("\(hash),\(element.description)\n".utf8)
                    write(stringData, url)
                }
            } catch {
                print("\(error.localizedDescription)")
            }
        }
    }
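
Even with the pool in place, Data(contentsOf:) still asks for a whole file at once, which is a lot for files of up to 25 GB. A minimal sketch of one way to limit that, assuming Apple platforms (macOS 10.15.4 or later) and assuming a content hash such as SHA-256 is acceptable in place of hashValue, which Swift does not guarantee to be stable across program executions. The helper name streamedSHA256 and the 1 MiB chunk size are hypothetical, not part of the original code:

```swift
import Foundation
import CryptoKit

/// Hypothetical helper (not from the question): hashes a file chunk by chunk,
/// so only about `chunkSize` bytes of file content are held in memory at a time.
func streamedSHA256(of url: URL, chunkSize: Int = 1 << 20) throws -> String {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }

    var hasher = SHA256()
    var reachedEnd = false
    while !reachedEnd {
        // Drain any autoreleased temporaries from this read before the next one.
        try autoreleasepool {
            if let chunk = try handle.read(upToCount: chunkSize), !chunk.isEmpty {
                hasher.update(data: chunk)
            } else {
                reachedEnd = true
            }
        }
    }
    // Render the digest bytes as a lowercase hex string.
    return hasher.finalize().map { String(format: "%02x", $0) }.joined()
}
```

Inside the loop above, let hash = try streamedSHA256(of: element) could then stand in for both the Data(contentsOf: element) line and the createFileHash call. The outer autoreleasepool around the loop body is still worth keeping for whatever else the iteration allocates.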