swift async-await structured-concurrency asyncstream asyncsequence

TaskGroup limit amount of memory usage for lots of tasks


I'm trying to build a chunked file-uploading mechanism using modern Swift Concurrency. I have a streamed file reader that reads files chunk by chunk, 1 MB at a time. It takes two closures: nextChunk: (DataChunk) -> Void and completion: () -> Void. The first is called once for each chunk-sized block of data read from the InputStream.

To make this reader work with Swift Concurrency, I wrote an extension that wraps it in an AsyncStream, which seemed the most suitable type for this case.

public extension StreamedFileReader {
    func read() -> AsyncStream<DataChunk> {
        AsyncStream { continuation in
            self.read(nextChunk: { chunk in
                continuation.yield(chunk)
            }, completion: {
                continuation.finish()
            })
        }
    }
}

Using this AsyncStream, I read a file iteratively and make network calls like this:

func process(_ url: URL) async {
    // ...
    do {
        for await chunk in reader.read() {
            let request = // ...
            _ = try await service.upload(data: chunk.data, request: request)
        }
    } catch let error {
        reader.cancelReading()
        print(error)
    }
}

The issue is that I'm not aware of any limiting mechanism that would prevent more than N network calls from executing at once. So when I try to upload a huge file (5 GB), memory consumption grows drastically. That defeats the purpose of streamed reading: it would be just as easy to read the entire file into memory (I'm joking, but that's how it looks).

In contrast, if I use good old GCD, everything works like a charm:

func process(_ url: URL) {
    let semaphore = DispatchSemaphore(value: 5) // limit to no more than 5 requests at a given time
    let uploadGroup = DispatchGroup()
    let uploadQueue = DispatchQueue.global(qos: .userInitiated)
    uploadQueue.async(group: uploadGroup) {
        // ...
        reader.read(nextChunk: { chunk in
            let request = // ...
            uploadGroup.enter()
            semaphore.wait()
            service.upload(chunk: chunk, request: request) {
                uploadGroup.leave()
                semaphore.signal()
            }
        }, completion: { _ in
            print("read completed")
        })
    }    
}

Admittedly, this is not exactly the same behavior, since it uses a concurrent DispatchQueue whereas iterating the AsyncStream runs sequentially. So I did a little research and found that TaskGroup is probably what I need here, since it lets me run async tasks in parallel.

I tried it this way:

func process(_ url: URL) async {
    // ...
    do {
        let totalParts = try await withThrowingTaskGroup(of: Void.self) { [service] group -> Int in
            var counter = 1
            for await chunk in reader.read() {
                let request = // ...
                group.addTask {
                    _ = try await service.upload(data: chunk.data, request: request)
                }
                counter = chunk.index
            }
            return counter
        }
    } catch let error {
        reader.cancelReading()
        print(error)
    }
}

In that case, memory consumption is even higher than in the example that iterates the AsyncStream!

I suspect there must be some condition under which I should suspend the group or the task, so that group.addTask is called only when the group can actually handle the tasks I'm about to add, but I have no idea how to do that.

I found this Q/A and tried calling try await group.next() on every 5th chunk, but it didn't help at all.

Is there any mechanism similar to DispatchGroup + DispatchSemaphore but for modern concurrency?

UPDATE: To better demonstrate the difference between the three approaches, here are screenshots of the memory report:

AsyncStream iterating

[screenshot: memory report]

AsyncStream + TaskGroup (using try await group.next() on each 5th chunk)

[screenshot: memory report]

GCD DispatchQueue + DispatchGroup + DispatchSemaphore

[screenshot: memory report]


Solution

  • The key problem is the use of AsyncStream. Your AsyncStream reads data and yields chunks faster than they can be uploaded.

    Consider this MCVE, where I simulate a stream of 100 chunks of 1 MB each:

    import Foundation
    import os.log
    
    private let log = OSLog(subsystem: "Test", category: .pointsOfInterest)
    
    struct Chunk {
        let index: Int
        let data: Data
    }
    
    actor FileMock {
        let maxChunks = 100
        let chunkSize = 1_000_000
        var index = 0
    
        func nextChunk() -> Chunk? {
            guard index < maxChunks else { print("done"); return nil }
            defer { index += 1 }
            return Chunk(index: index, data: Data(repeating: UInt8(index & 0xff), count: chunkSize))
        }
    
        func chunks() -> AsyncStream<Chunk> {
            AsyncStream { continuation in
                index = 0
                while let chunk = nextChunk() {
                    os_signpost(.event, log: log, name: "chunk")
                    continuation.yield(chunk)
                }
    
                continuation.finish()
            }
        }
    }
    

    And

    func uploadAll() async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            let chunks = await FileMock().chunks()
            var index = 0
            for await chunk in chunks {
                index += 1
                if index > 5 {
                    try await group.next()
                }
                group.addTask { [self] in
                    try await upload(chunk)
                }
            }
            try await group.waitForAll()
        }
    }
    
    func upload(_ chunk: Chunk) async throws {
        let id = OSSignpostID(log: log)
        os_signpost(.begin, log: log, name: #function, signpostID: id, "%d start", chunk.index)
        try await Task.sleep(nanoseconds: 1 * NSEC_PER_SEC)
        os_signpost(.end, log: log, name: #function, signpostID: id, "end")
    }
    

    When I do that, I see memory spike to 150 MB as the AsyncStream rapidly yields all of the chunks up front:

    [screenshot: memory graph with signposts]

    Note that all the signposts, showing when the Data objects are created, are clumped at the start of the process.


    Note, the documentation warns us that the sequence might conceivably generate values faster than they can be consumed:

    An arbitrary source of elements can produce elements faster than they are consumed by a caller iterating over them. Because of this, AsyncStream defines a buffering behavior, allowing the stream to buffer a specific number of oldest or newest elements. By default, the buffer limit is Int.max, which means the value is unbounded.

    Unfortunately, the buffering alternatives, .bufferingOldest and .bufferingNewest, do not slow the producer down; they simply discard values once the buffer is full. In some AsyncStreams, that might be a viable solution (e.g., if you are tracking the user location, you might only care about the most recent location), but when uploading chunks of a file, you obviously cannot have it discard chunks when the buffer is full.
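
    To make the discarding behavior concrete, here is a minimal sketch. The function name collectWithBoundedBuffer and the buffer size of 2 are chosen only for illustration; the producer yields ten integers before the consumer starts iterating:

```swift
// Sketch: a bounded AsyncStream buffer drops elements instead of pausing the producer.
// `collectWithBoundedBuffer` is a hypothetical name used only for this illustration.
func collectWithBoundedBuffer() async -> [Int] {
    // Keep at most the 2 newest elements; older ones are discarded when the buffer is full.
    let stream = AsyncStream<Int>(bufferingPolicy: .bufferingNewest(2)) { continuation in
        // This closure runs synchronously during initialization, so all ten
        // values are yielded before anything has a chance to consume them.
        for value in 0..<10 {
            continuation.yield(value)
        }
        continuation.finish()
    }

    var received: [Int] = []
    for await value in stream {
        received.append(value)
    }
    return received
}
```

    Because the producer never waits for the consumer, only the last two values survive in this sketch. With file chunks, that would mean silently losing most of the file.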


    So, rather than AsyncStream, just wrap your file reading with a custom AsyncSequence, which will not read the next chunk until it is actually needed, dramatically reducing peak memory usage, e.g.:

    struct FileMock: AsyncSequence {
        typealias Element = Chunk
    
        struct AsyncIterator : AsyncIteratorProtocol {
            let chunkSize = 1_000_000
            let maxChunks = 100
            var current = 0
    
            mutating func next() async -> Chunk? {
                os_signpost(.event, log: log, name: "chunk")
    
                guard current < maxChunks else { return nil }
                defer { current += 1 }
                return Chunk(index: current, data: Data(repeating: UInt8(current & 0xff), count: chunkSize))
            }
        }
    
        func makeAsyncIterator() -> AsyncIterator {
            return AsyncIterator()
        }
    }
    

    And

    func uploadAll() async throws {
        try await withThrowingTaskGroup(of: Void.self) { group in
            var index = 0
            for await chunk in FileMock() {
                index += 1
                if index > 5 {
                    try await group.next()
                }
                group.addTask { [self] in
                    try await upload(chunk)
                }
            }
            try await group.waitForAll()
        }
    }
    

    And that avoids loading all 100 MB into memory at once. Note that the vertical scale on the memory graph is different, but you can see that peak usage is 100 MB lower than in the graph above, and the signposts, showing when data is read into memory, are now distributed throughout the graph rather than clumped at the start:

    [screenshot: memory graph with signposts]

    Now, obviously, I am only mocking the reading of a large file with Chunk/Data objects and mocking the upload with a Task.sleep, but it hopefully illustrates the basic idea.
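
    For an actual file, the same idea might look roughly like the following sketch. The names FileChunk and ChunkedFile are hypothetical (they are not the StreamedFileReader from the question), and it assumes that a synchronous FileHandle read per chunk is acceptable:

```swift
import Foundation

// Hypothetical chunk type for this sketch.
struct FileChunk {
    let index: Int
    let data: Data
}

// Sketch: an AsyncSequence that reads the next chunk only when the consumer asks for it.
struct ChunkedFile: AsyncSequence {
    typealias Element = FileChunk

    let url: URL
    let chunkSize: Int

    struct AsyncIterator: AsyncIteratorProtocol {
        let url: URL
        let chunkSize: Int
        private var handle: FileHandle?
        private var index = 0
        private var finished = false

        init(url: URL, chunkSize: Int) {
            self.url = url
            self.chunkSize = chunkSize
        }

        mutating func next() async throws -> FileChunk? {
            guard !finished else { return nil }

            // Open the file lazily, on the first request for a chunk.
            if handle == nil {
                handle = try FileHandle(forReadingFrom: url)
            }

            // Note: this is a blocking read; it is kept simple for illustration.
            guard let data = try handle?.read(upToCount: chunkSize), !data.isEmpty else {
                finished = true
                try handle?.close()
                return nil
            }

            defer { index += 1 }
            return FileChunk(index: index, data: data)
        }
    }

    func makeAsyncIterator() -> AsyncIterator {
        AsyncIterator(url: url, chunkSize: chunkSize)
    }
}
```

    Since next() can throw here, it would be iterated with for try await chunk in ChunkedFile(url: url, chunkSize: 1_000_000). If iteration stops early, the file handle is not explicitly closed in this sketch.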

    Bottom line, do not use AsyncStream to read the file, but rather consider a custom AsyncSequence or other pattern that reads the file in as the chunks are needed.
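
    To tie this back to the DispatchSemaphore(value: 5) in the question, the throttling loop used above could be factored into a small helper. This is only a sketch: forEachLimited and its parameter names are hypothetical, and it assumes the sequence produces elements on demand, as the custom AsyncSequence does:

```swift
// Sketch: run `operation` for each element, with at most `maxConcurrent` tasks in flight.
// `forEachLimited` is a hypothetical helper, not a standard library API.
func forEachLimited<S: AsyncSequence>(
    _ sequence: S,
    maxConcurrent: Int,
    operation: @escaping @Sendable (S.Element) async throws -> Void
) async throws where S.Element: Sendable {
    try await withThrowingTaskGroup(of: Void.self) { group in
        var inFlight = 0

        for try await element in sequence {
            // When the limit is reached, wait for one task to finish
            // before pulling the next element's task into the group.
            if inFlight >= maxConcurrent {
                _ = try await group.next()
                inFlight -= 1
            }

            group.addTask {
                try await operation(element)
            }
            inFlight += 1
        }

        // Wait for the remaining tasks and surface any error they throw.
        try await group.waitForAll()
    }
}
```

    A call such as try await forEachLimited(FileMock(), maxConcurrent: 5) { chunk in try await upload(chunk) } would then play the role of the semaphore. Because the loop suspends in group.next() before asking the sequence for another element, the limit also bounds how many chunks are held in memory.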


    A few other observations: