ios swift multithreading nsoperation urlsession

How To Download Multiple Files Sequentially using URLSession downloadTask in Swift


I have an app that has to download multiple large files. I want it to download each file one by one sequentially instead of concurrently. When it runs concurrently, the app gets overloaded and crashes.

So I tried wrapping a downloadTask inside a BlockOperation and setting maxConcurrentOperationCount = 1 on the queue. I wrote the code below, but it didn't work: both files still get downloaded concurrently.

class ViewController: UIViewController, URLSessionDelegate, URLSessionDownloadDelegate {

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        processURLs()
    }

    func download(url: URL) {
        let session: URLSession = URLSession(configuration: .default, delegate: self, delegateQueue: nil)
        let downloadTask = session.downloadTask(with: URLRequest(url: url))
        downloadTask.resume()
    }

    func processURLs(){
        //setup queue and set max concurrent to 1
        var queue = OperationQueue()
        queue.name = "Download queue"
        queue.maxConcurrentOperationCount = 1

        let url = URL(string: "http://azspeastus.blob.core.windows.net/azurespeed/100MB.bin?sv=2014-02-14&sr=b&sig=%2FZNzdvvzwYO%2BQUbrLBQTalz%2F8zByvrUWD%2BDfLmkpZuQ%3D&se=2015-09-01T01%3A48%3A51Z&sp=r")
        let url2 = URL(string: "http://azspwestus.blob.core.windows.net/azurespeed/100MB.bin?sv=2014-02-14&sr=b&sig=ufnzd4x9h1FKmLsODfnbiszXd4EyMDUJgWhj48QfQ9A%3D&se=2015-09-01T01%3A48%3A51Z&sp=r")

        let urls = [url, url2].compactMap { $0 }
        for url in urls {
            let operation = BlockOperation {
                print("starting download")
                self.download(url: url)
            }

            queue.addOperation(operation)
        }
    }

    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didFinishDownloadingTo location: URL) {
        …
    }

    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didResumeAtOffset fileOffset: Int64, expectedTotalBytes: Int64) {
        …
    }

    func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) {
        var progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
        print(progress)
    }
}

How can I write this properly to achieve my goal of only downloading one file at a time?


Solution

  • Your code won't work because URLSessionDownloadTask runs asynchronously: resume() returns immediately, so each BlockOperation completes before its download is done. The operations do fire off sequentially, but the download tasks they start continue asynchronously and in parallel.
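
    To see this effect in isolation, here is a minimal sketch that replaces the download with a delayed block (an assumption, made so that it runs without a network connection). `EventLog` is a hypothetical helper, not part of the question's code.

```swift
import Foundation

/// Thread-safe event log (hypothetical helper); the closures below run on several threads.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var storage: [String] = []

    func add(_ event: String) {
        lock.lock(); defer { lock.unlock() }
        storage.append(event)
    }

    var events: [String] {
        lock.lock(); defer { lock.unlock() }
        return storage
    }
}

let log = EventLog()
let queue = OperationQueue()
queue.maxConcurrentOperationCount = 1   // serial, as in the question

let pending = DispatchGroup()
for index in 1...2 {
    pending.enter()
    queue.addOperation {
        log.add("operation \(index) started")
        // Stand-in for `downloadTask.resume()`: it returns at once and finishes later.
        DispatchQueue.global().asyncAfter(deadline: .now() + 0.5) {
            log.add("work \(index) finished")
            pending.leave()
        }
        // The block ends here, so the queue already considers this operation finished.
    }
}

pending.wait()   // acceptable in a script; do not block an app's main thread like this
print(log.events)
```

    Both "started" entries are logged before either "finished" entry, even though the queue is serial.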

    While there are work-arounds one can contemplate (e.g., recursive patterns that initiate one request after the prior one finishes, a non-zero semaphore waited on from a background thread, etc.), the more elegant solution is to use one of the proven asynchronous frameworks.
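
    For completeness, the semaphore work-around might be sketched as follows. The names `runLimited`, `jobs`, and `done` are hypothetical; each job stands for starting one download and calling `done` from its completion handler. The semaphore starts at the permitted number of concurrent jobs (1 for strictly sequential downloads) and is only ever waited on from a background thread.

```swift
import Foundation

/// Starts each job in order, letting at most `maxConcurrent` of them run at once.
/// Every job receives a `done` closure that it must call exactly once when its
/// asynchronous work (e.g. a download's completion handler) has finished.
/// `completion` is called on a background queue after all jobs are done.
func runLimited(
    maxConcurrent: Int,
    jobs: [(_ done: @escaping () -> Void) -> Void],
    completion: @escaping () -> Void
) {
    let semaphore = DispatchSemaphore(value: maxConcurrent)
    let group = DispatchGroup()

    // Wait on a background thread only; never block the main thread on a semaphore.
    DispatchQueue.global(qos: .utility).async {
        for job in jobs {
            semaphore.wait()          // blocks until one of the slots is free
            group.enter()
            job {
                semaphore.signal()    // free the slot for the next job
                group.leave()
            }
        }
        group.notify(queue: .global(), execute: completion)
    }
}
```

    This works, but it ties up a thread for the whole duration of the downloads, which is one reason the approaches below are preferable.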

    In iOS 15 and later, we would use the async-await method download(from:delegate:), e.g.

    func downloadFiles() async throws {
        let folder = try FileManager.default
            .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
    
        for url in urls {
            let (source, _) = try await URLSession.shared.download(from: url)
            let destination = folder.appendingPathComponent(url.lastPathComponent)
            try FileManager.default.moveItem(at: source, to: destination)
        }
    }
    

    Where

    override func viewDidLoad() {
        super.viewDidLoad()
    
        Task {
            do {
                try await downloadFiles()
            } catch {
                print(error)
            }
        }
    }
    

    That only works in iOS 15 and later (or macOS 12 and later). Xcode 13.2 and later does let you use async-await back to iOS 13, but you have to write your own async rendition of download. See Cancelling an async/await Network Request for a sample implementation. You would then call this rendition for iOS 13 and later:

    func downloadFiles() async throws {
        let folder = try FileManager.default
            .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
    
        for url in urls {
            let (source, _) = try await URLSession.shared.download(with: url)
            let destination = folder.appendingPathComponent(url.lastPathComponent)
            try FileManager.default.moveItem(at: source, to: destination)
        }
    }
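
    The `download(with:)` method called above is not part of URLSession. As a rough sketch of what such a rendition could look like (an assumption on my part, not the implementation from the linked answer), one can bridge the completion-handler API with a checked continuation. Note that this simplified version does not propagate task cancellation, which is the topic of the linked answer.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking   // URLSession lives here on Linux
#endif

extension URLSession {
    /// Hypothetical `download(with:)`: bridges the completion-handler API to async-await.
    @available(iOS 13.0, macOS 10.15, tvOS 13.0, watchOS 6.0, *)
    func download(with url: URL) async throws -> (URL, URLResponse) {
        try await withCheckedThrowingContinuation { continuation in
            let task = downloadTask(with: url) { location, response, error in
                guard let location = location, let response = response else {
                    continuation.resume(throwing: error ?? URLError(.badServerResponse))
                    return
                }
                do {
                    // The system deletes the file at `location` once this closure returns,
                    // so move it somewhere that outlives the handler before resuming.
                    let kept = FileManager.default.temporaryDirectory
                        .appendingPathComponent(UUID().uuidString)
                    try FileManager.default.moveItem(at: location, to: kept)
                    continuation.resume(returning: (kept, response))
                } catch {
                    continuation.resume(throwing: error)
                }
            }
            task.resume()
        }
    }
}
```

    The intermediate move matters: with the completion-handler API, the temporary file is gone by the time the awaiting caller resumes, so the caller could not move it itself.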
    

    In iOS versions prior to 13, if you wanted to control the degree of concurrency of a series of asynchronous tasks, you would reach for an asynchronous Operation subclass.

    Or, in iOS 13 and later, you might also consider Combine. (There are other third-party asynchronous programming frameworks, but I will restrict myself to Apple-provided approaches.)

    Both of these are described below in my original answer.


    Operation

    To address this, you can wrap the requests in an asynchronous Operation subclass. See Configuring Operations for Concurrent Execution in the Concurrency Programming Guide for more information.

    But before I illustrate how to do this in your situation (the delegate-based URLSession), let me first show you the simpler solution when using the completion handler rendition. We'll later build upon this for your more complicated question. So, in Swift 3 and later:

    class DownloadOperation : AsynchronousOperation {
        var task: URLSessionTask!
        
        init(session: URLSession, url: URL) {
            super.init()
            
            task = session.downloadTask(with: url) { temporaryURL, response, error in
                defer { self.finish() }
                
                guard
                    let httpResponse = response as? HTTPURLResponse,
                    200..<300 ~= httpResponse.statusCode
                else {
                    // handle invalid return codes however you'd like
                    return
                }
    
                guard let temporaryURL = temporaryURL, error == nil else {
                    print(error ?? "Unknown error")
                    return
                }
                
                do {
                    let manager = FileManager.default
                    let destinationURL = try manager.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: false)
                        .appendingPathComponent(url.lastPathComponent)
                    try? manager.removeItem(at: destinationURL)                   // remove the old one, if any
                    try manager.moveItem(at: temporaryURL, to: destinationURL)    // move new one there
                } catch let moveError {
                    print("\(moveError)")
                }
            }
        }
        
        override func cancel() {
            task.cancel()
            super.cancel()
        }
        
        override func main() {
            task.resume()
        }
        
    }
    

    Where

    /// Asynchronous operation base class
    ///
    /// This abstract class emits all of the necessary KVO notifications of `isFinished`
    /// and `isExecuting` for a concurrent `Operation` subclass. You can subclass this and
    /// implement asynchronous operations. All you must do is:
    ///
    /// - override `main()` with the code that initiates the asynchronous task;
    ///
    /// - call the `finish()` function when the asynchronous task is done;
    ///
    /// - optionally, periodically check the `isCancelled` status, performing any clean-up
    ///   necessary and then ensuring that `finish()` is called; or
    ///   override the `cancel` method, calling `super.cancel()` and then cleaning up
    ///   and ensuring `finish()` is called.
    
    class AsynchronousOperation: Operation {
        
        /// State for this operation.
        
        @objc private enum OperationState: Int {
            case ready
            case executing
            case finished
        }
        
        /// Concurrent queue for synchronizing access to `state`.
        
        private let stateQueue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".rw.state", attributes: .concurrent)
        
        /// Private backing stored property for `state`.
        
        private var rawState: OperationState = .ready
        
        /// The state of the operation
        
        @objc private dynamic var state: OperationState {
            get { return stateQueue.sync { rawState } }
            set { stateQueue.sync(flags: .barrier) { rawState = newValue } }
        }
        
        // MARK: - Various `Operation` properties
        
        open         override var isReady:        Bool { return state == .ready && super.isReady }
        public final override var isExecuting:    Bool { return state == .executing }
        public final override var isFinished:     Bool { return state == .finished }
        
        // KVO for dependent properties
        
        open override class func keyPathsForValuesAffectingValue(forKey key: String) -> Set<String> {
            if ["isReady", "isFinished", "isExecuting"].contains(key) {
                return [#keyPath(state)]
            }
            
            return super.keyPathsForValuesAffectingValue(forKey: key)
        }
        
        // Start
        
        public final override func start() {
            if isCancelled {
                finish()
                return
            }
            
            state = .executing
            
            main()
        }
        
        /// Subclasses must implement this to perform their work and they must not call `super`. The default implementation of this function calls `fatalError`.
        
        open override func main() {
            fatalError("Subclasses must implement `main`.")
        }
        
        /// Call this function to finish an operation that is currently executing
        
        public final func finish() {
            if !isFinished { state = .finished }
        }
    }
    

    Then you can do:

    for url in urls {
        queue.addOperation(DownloadOperation(session: session, url: url))
    }
    

    So that's one very easy way to wrap asynchronous URLSession/NSURLSession requests in an asynchronous Operation/NSOperation subclass. More generally, this is a useful pattern: use AsynchronousOperation to wrap up some asynchronous task in an Operation/NSOperation object.
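
    As an illustration of that more general use, here is a sketch that wraps a simple delay rather than a network request. To keep it self-contained, it does not use the `AsynchronousOperation` class above but a reduced stand-in with manual KVO notifications; `SimpleAsyncOperation`, `DelayOperation`, and `EventLog` are hypothetical names.

```swift
import Foundation

/// Reduced stand-in for the `AsynchronousOperation` base class (illustration only).
class SimpleAsyncOperation: Operation {
    private let lock = NSLock()
    private var executing = false
    private var finished = false

    override var isAsynchronous: Bool { true }
    override var isExecuting: Bool { lock.lock(); defer { lock.unlock() }; return executing }
    override var isFinished: Bool { lock.lock(); defer { lock.unlock() }; return finished }

    override func start() {
        if isCancelled { finish(); return }
        update(executing: true, finished: false)
        main()
    }

    /// Call this when the asynchronous work is done.
    func finish() { update(executing: false, finished: true) }

    private func update(executing: Bool, finished: Bool) {
        // `Operation` relies on KVO notifications for these two keys.
        willChangeValue(forKey: "isExecuting")
        willChangeValue(forKey: "isFinished")
        lock.lock()
        self.executing = executing
        self.finished = finished
        lock.unlock()
        didChangeValue(forKey: "isFinished")
        didChangeValue(forKey: "isExecuting")
    }
}

/// Example subclass: "work" that completes asynchronously after a delay.
final class DelayOperation: SimpleAsyncOperation {
    private let label: String
    private let delay: TimeInterval
    private let log: (String) -> Void

    init(label: String, delay: TimeInterval, log: @escaping (String) -> Void) {
        self.label = label
        self.delay = delay
        self.log = log
        super.init()
    }

    override func main() {
        log("\(label) start")
        DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
            self.log("\(self.label) end")
            self.finish()   // only now may the serial queue start the next operation
        }
    }
}

/// Thread-safe event log used to record the order of events.
final class EventLog {
    private let lock = NSLock()
    private var storage: [String] = []
    func add(_ event: String) { lock.lock(); storage.append(event); lock.unlock() }
    var events: [String] { lock.lock(); defer { lock.unlock() }; return storage }
}

let log = EventLog()
let queue = OperationQueue()
queue.maxConcurrentOperationCount = 1
queue.addOperation(DelayOperation(label: "A", delay: 0.3, log: log.add))
queue.addOperation(DelayOperation(label: "B", delay: 0.1, log: log.add))
queue.waitUntilAllOperationsAreFinished()   // fine in a script, not on an app's main thread
print(log.events)
```

    Although B has the shorter delay, it does not start until A has called `finish()`, which is exactly the behavior a plain BlockOperation cannot provide.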

    Unfortunately, in your question, you wanted to use delegate-based URLSession/NSURLSession so you could monitor the progress of the downloads. This is more complicated.

    This is because the "task complete" NSURLSession delegate methods are called on the session object's delegate, not on anything specific to the individual task. This is an infuriating design feature of NSURLSession (Apple did it to simplify background sessions, which isn't relevant here, but we're stuck with that design limitation).

    But we have to asynchronously complete the operations as the tasks finish. So we need some way for the session to figure out which operation to complete when didCompleteWithError is called. Now you could have each operation have its own NSURLSession object, but it turns out that this is pretty inefficient.

    So, to handle that, I maintain a dictionary, keyed by the task's taskIdentifier, which identifies the appropriate operation. That way, when the download finishes, you can "complete" the correct asynchronous operation. Thus:

    /// Manager of asynchronous download `Operation` objects
    
    class DownloadManager: NSObject {
        
        /// Dictionary of operations, keyed by the `taskIdentifier` of the `URLSessionTask`
        
        fileprivate var operations = [Int: DownloadOperation]()
        
        /// Serial OperationQueue for downloads
        
        private let queue: OperationQueue = {
            let _queue = OperationQueue()
            _queue.name = "download"
            _queue.maxConcurrentOperationCount = 1    // I'd usually use values like 3 or 4 for performance reasons, but OP asked about downloading one at a time
            
            return _queue
        }()
        
        /// Delegate-based `URLSession` for DownloadManager
        
        lazy var session: URLSession = {
            let configuration = URLSessionConfiguration.default
            return URLSession(configuration: configuration, delegate: self, delegateQueue: nil)
        }()
        
        /// Add download
        ///
        /// - parameter url:  The URL of the file to be downloaded
        ///
        /// - returns:        The DownloadOperation of the operation that was queued
        
        @discardableResult
        func queueDownload(_ url: URL) -> DownloadOperation {
            let operation = DownloadOperation(session: session, url: url)
            operations[operation.task.taskIdentifier] = operation
            queue.addOperation(operation)
            return operation
        }
        
        /// Cancel all queued operations
        
        func cancelAll() {
            queue.cancelAllOperations()
        }
        
    }
    
    // MARK: URLSessionDownloadDelegate methods
    
    extension DownloadManager: URLSessionDownloadDelegate {
        
        func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didFinishDownloadingTo location: URL) {
            operations[downloadTask.taskIdentifier]?.urlSession(session, downloadTask: downloadTask, didFinishDownloadingTo: location)
        }
        
        func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) {
            operations[downloadTask.taskIdentifier]?.urlSession(session, downloadTask: downloadTask, didWriteData: bytesWritten, totalBytesWritten: totalBytesWritten, totalBytesExpectedToWrite: totalBytesExpectedToWrite)
        }
    }
    
    // MARK: URLSessionTaskDelegate methods
    
    extension DownloadManager: URLSessionTaskDelegate {
        
        func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?)  {
            let key = task.taskIdentifier
            operations[key]?.urlSession(session, task: task, didCompleteWithError: error)
            operations.removeValue(forKey: key)
        }
        
    }
    
    /// Asynchronous Operation subclass for downloading
    
    class DownloadOperation : AsynchronousOperation {
        let task: URLSessionTask
        
        init(session: URLSession, url: URL) {
            task = session.downloadTask(with: url)
            super.init()
        }
        
        override func cancel() {
            task.cancel()
            super.cancel()
        }
        
        override func main() {
            task.resume()
        }
    }
    
    // MARK: URLSessionDownloadDelegate methods
    
    extension DownloadOperation: URLSessionDownloadDelegate {
        
        func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didFinishDownloadingTo location: URL) {
            guard
                let httpResponse = downloadTask.response as? HTTPURLResponse,
                200..<300 ~= httpResponse.statusCode
            else {
                // handle invalid return codes however you'd like
                return
            }
    
            do {
                let manager = FileManager.default
                let destinationURL = try manager
                    .url(for: .applicationSupportDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
                    .appendingPathComponent(downloadTask.originalRequest!.url!.lastPathComponent)
                try? manager.removeItem(at: destinationURL)
                try manager.moveItem(at: location, to: destinationURL)
            } catch {
                print(error)
            }
        }
        
        func urlSession(_ session: URLSession, downloadTask: URLSessionDownloadTask, didWriteData bytesWritten: Int64, totalBytesWritten: Int64, totalBytesExpectedToWrite: Int64) {
            let progress = Double(totalBytesWritten) / Double(totalBytesExpectedToWrite)
            print("\(downloadTask.originalRequest!.url!.absoluteString) \(progress)")
        }
    }
    
    // MARK: URLSessionTaskDelegate methods
    
    extension DownloadOperation: URLSessionTaskDelegate {
        
        func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?)  {
            defer { finish() }
            
            if let error = error {
                print(error)
                return
            }
            
            // do whatever you want upon success
        }
        
    }
    

    And then use it like so:

    let downloadManager = DownloadManager()
    
    override func viewDidLoad() {
        super.viewDidLoad()
        
        let urlStrings = [
            "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/s72-55482.jpg",
            "http://spaceflight.nasa.gov/gallery/images/apollo/apollo10/hires/as10-34-5162.jpg",
            "http://spaceflight.nasa.gov/gallery/images/apollo-soyuz/apollo-soyuz/hires/s75-33375.jpg",
            "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/as17-134-20380.jpg",
            "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/as17-140-21497.jpg",
            "http://spaceflight.nasa.gov/gallery/images/apollo/apollo17/hires/as17-148-22727.jpg"
        ]
        let urls = urlStrings.compactMap { URL(string: $0) }
        
        let completion = BlockOperation {
            print("all done")
        }
        
        for url in urls {
            let operation = downloadManager.queueDownload(url)
            completion.addDependency(operation)
        }
    
        OperationQueue.main.addOperation(completion)
    }
    

    See revision history for Swift 2 implementation.


    Combine

    For Combine, the idea would be to create a Publisher for URLSessionDownloadTask. Then you can do something like:

    var downloadRequests: AnyCancellable?
    
    /// Download a series of assets
    
    func downloadAssets() {
        downloadRequests = downloadsPublisher(for: urls, maxConcurrent: 1).sink { completion in
            switch completion {
            case .finished:
                print("done")
    
            case .failure(let error):
                print("failed", error)
            }
        } receiveValue: { destinationUrl in
            print(destinationUrl)
        }
    }
    
    /// Publisher for single download
    ///
    /// Copy downloaded resource to caches folder.
    ///
    /// - Parameter url: `URL` being downloaded.
    /// - Returns: Publisher for the URL with final destination of the downloaded asset.
    
    func downloadPublisher(for url: URL) -> AnyPublisher<URL, Error> {
        URLSession.shared.downloadTaskPublisher(for: url)
            .tryCompactMap {
                let destination = try FileManager.default
                    .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
                    .appendingPathComponent(url.lastPathComponent)
                try FileManager.default.moveItem(at: $0.location, to: destination)
                return destination
            }
            .receive(on: RunLoop.main)
            .eraseToAnyPublisher()
    }
    
    /// Publisher for a series of downloads
    ///
    /// This downloads not more than `maxConcurrent` assets at a given time.
    ///
    /// - Parameters:
    ///   - urls: Array of `URL`s of assets to be downloaded.
    ///   - maxConcurrent: The maximum number of downloads to run at any given time (default 4).
    /// - Returns: Publisher for the URLs with final destination of the downloaded assets.
    
    func downloadsPublisher(for urls: [URL], maxConcurrent: Int = 4) -> AnyPublisher<URL, Error> {
        Publishers.Sequence(sequence: urls.map { downloadPublisher(for: $0) })
            .flatMap(maxPublishers: .max(maxConcurrent)) { $0 }
            .eraseToAnyPublisher()
    }
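
    The concurrency limit in `downloadsPublisher` comes entirely from `flatMap(maxPublishers:)`. To see that operator in isolation, here is a small sketch that uses delayed values instead of downloads (an assumption, so that it runs without a network connection or the custom publisher). The names `delayedValue` and `Collector` are hypothetical.

```swift
import Foundation
import Combine

/// Emits `value` after `delay` seconds. `Deferred` postpones the work until
/// subscription, because a bare `Future` starts as soon as it is created.
func delayedValue(_ value: Int, after delay: TimeInterval) -> AnyPublisher<Int, Never> {
    Deferred {
        Future<Int, Never> { promise in
            DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
                promise(.success(value))
            }
        }
    }
    .eraseToAnyPublisher()
}

/// Thread-safe collector for the received values.
final class Collector {
    private let lock = NSLock()
    private var storage: [Int] = []
    func add(_ value: Int) { lock.lock(); storage.append(value); lock.unlock() }
    var values: [Int] { lock.lock(); defer { lock.unlock() }; return storage }
}

let collector = Collector()
let finished = DispatchSemaphore(value: 0)

// The first publisher is the slowest, so any overlap would reorder the output.
let publishers = [
    delayedValue(1, after: 0.3),
    delayedValue(2, after: 0.2),
    delayedValue(3, after: 0.1)
]

let cancellable = publishers.publisher
    .flatMap(maxPublishers: .max(1)) { $0 }   // subscribe to one inner publisher at a time
    .sink(
        receiveCompletion: { _ in finished.signal() },
        receiveValue: { collector.add($0) }
    )

finished.wait()   // acceptable in a script; do not block an app's main thread like this
print(collector.values)
```

    With `.max(1)` the values arrive in their original order; raising the limit lets the shorter delays overtake the longer ones.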
    

    Now, unfortunately, Apple supplies a DataTaskPublisher (which loads the full asset into memory, which is not an acceptable solution for large assets) but no equivalent for download tasks. One can, however, refer to their source code and adapt it to create a DownloadTaskPublisher:

    //  DownloadTaskPublisher.swift
    //
    //  Created by Robert Ryan on 9/28/20.
    //
    //  Adapted from Apple's `DataTaskPublisher` at:
    //  https://github.com/apple/swift/blob/88b093e9d77d6201935a2c2fb13f27d961836777/stdlib/public/Darwin/Foundation/Publishers%2BURLSession.swift
    
    import Foundation
    import Combine
    
    // MARK: Download Tasks
    
    @available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 6.0, *)
    extension URLSession {
        /// Returns a publisher that wraps a URL session download task for a given URL.
        ///
        /// The publisher publishes the temporary file location and response when the task completes, or terminates if the task fails with an error.
        ///
        /// - Parameter url: The URL for which to create a download task.
        /// - Returns: A publisher that wraps a download task for the URL.
    
        public func downloadTaskPublisher(for url: URL) -> DownloadTaskPublisher {
            let request = URLRequest(url: url)
            return DownloadTaskPublisher(request: request, session: self)
        }
    
        /// Returns a publisher that wraps a URL session download task for a given URL request.
        ///
        /// The publisher publishes the temporary file location and response when the task completes, or terminates if the task fails with an error.
        ///
        /// - Parameter request: The URL request for which to create a download task.
        /// - Returns: A publisher that wraps a download task for the URL request.
    
        public func downloadTaskPublisher(for request: URLRequest) -> DownloadTaskPublisher {
            return DownloadTaskPublisher(request: request, session: self)
        }
    
        public struct DownloadTaskPublisher: Publisher {
            public typealias Output = (location: URL, response: URLResponse)
            public typealias Failure = URLError
    
            public let request: URLRequest
            public let session: URLSession
    
            public init(request: URLRequest, session: URLSession) {
                self.request = request
                self.session = session
            }
    
            public func receive<S: Subscriber>(subscriber: S) where Failure == S.Failure, Output == S.Input {
                subscriber.receive(subscription: Inner(self, subscriber))
            }
    
            private typealias Parent = DownloadTaskPublisher
            private final class Inner<Downstream: Subscriber>: Subscription, CustomStringConvertible, CustomReflectable, CustomPlaygroundDisplayConvertible
            where
                Downstream.Input == Parent.Output,
                Downstream.Failure == Parent.Failure
            {
                typealias Input = Downstream.Input
                typealias Failure = Downstream.Failure
    
                private let lock: NSLocking
                private var parent: Parent?               // GuardedBy(lock)
                private var downstream: Downstream?       // GuardedBy(lock)
                private var demand: Subscribers.Demand    // GuardedBy(lock)
                private var task: URLSessionDownloadTask! // GuardedBy(lock)
                var description: String { return "DownloadTaskPublisher" }
                var customMirror: Mirror {
                    lock.lock()
                    defer { lock.unlock() }
                    return Mirror(self, children: [
                        "task": task as Any,
                        "downstream": downstream as Any,
                        "parent": parent as Any,
                        "demand": demand,
                    ])
                }
                var playgroundDescription: Any { return description }
    
                init(_ parent: Parent, _ downstream: Downstream) {
                    self.lock = NSLock()
                    self.parent = parent
                    self.downstream = downstream
                    self.demand = .max(0)
                }
    
                // MARK: - Upward Signals
                func request(_ d: Subscribers.Demand) {
                    precondition(d > 0, "Invalid request of zero demand")
    
                    lock.lock()
                    guard let p = parent else {
                        // We've already been cancelled so bail
                        lock.unlock()
                        return
                    }
    
                    // Avoid issues around `self` before init by setting up only once here
                    if self.task == nil {
                        let task = p.session.downloadTask(
                            with: p.request,
                            completionHandler: handleResponse(location:response:error:)
                        )
                        self.task = task
                    }
    
                    self.demand += d
                    let task = self.task!
                    lock.unlock()
    
                    task.resume()
                }
    
                private func handleResponse(location: URL?, response: URLResponse?, error: Error?) {
                    lock.lock()
                    guard demand > 0,
                          parent != nil,
                          let ds = downstream
                    else {
                        lock.unlock()
                        return
                    }
    
                    parent = nil
                    downstream = nil
    
                    // We clear demand since this is a single shot shape
                    demand = .max(0)
                    task = nil
                    lock.unlock()
    
                    if let location = location, let response = response, error == nil {
                        _ = ds.receive((location, response))
                        ds.receive(completion: .finished)
                    } else {
                        let urlError = error as? URLError ?? URLError(.unknown)
                        ds.receive(completion: .failure(urlError))
                    }
                }
    
                func cancel() {
                    lock.lock()
                    guard parent != nil else {
                        lock.unlock()
                        return
                    }
                    parent = nil
                    downstream = nil
                    demand = .max(0)
                    let task = self.task
                    self.task = nil
                    lock.unlock()
                    task?.cancel()
                }
            }
        }
    }
    

    Now, unfortunately, that does not use the URLSession delegate pattern, but rather the completion handler rendition. But one could conceivably adapt it for the delegate pattern.

    Also, this will stop downloads when one fails. If you don't want it to stop just because one fails, you could conceivably define it to Never fail, and instead replaceError with nil:

    /// Publisher for single download
    ///
    /// Copy downloaded resource to caches folder.
    ///
    /// - Parameter url: `URL` being downloaded.
    /// - Returns: Publisher for the URL with final destination of the downloaded asset. Returns `nil` if request failed.
    
    func downloadPublisher(for url: URL) -> AnyPublisher<URL?, Never> {
        URLSession.shared.downloadTaskPublisher(for: url)
            .tryCompactMap {
                let destination = try FileManager.default
                    .url(for: .cachesDirectory, in: .userDomainMask, appropriateFor: nil, create: true)
                    .appendingPathComponent(url.lastPathComponent)
                try FileManager.default.moveItem(at: $0.location, to: destination)
                return destination
            }
            .replaceError(with: nil)
            .receive(on: RunLoop.main)
            .eraseToAnyPublisher()
    }
    
    /// Publisher for a series of downloads
    ///
    /// This downloads not more than `maxConcurrent` assets at a given time.
    ///
    /// - Parameters:
    ///   - urls: Array of `URL`s of assets to be downloaded.
    ///   - maxConcurrent: The maximum number of downloads to run at any given time (default 4).
    /// - Returns: Publisher for the URLs with final destination of the downloaded assets.
    
    func downloadsPublisher(for urls: [URL], maxConcurrent: Int = 4) -> AnyPublisher<URL?, Never> {
        Publishers.Sequence(sequence: urls.map { downloadPublisher(for: $0) })
            .flatMap(maxPublishers: .max(maxConcurrent)) { $0 }
            .eraseToAnyPublisher()
    }
    

    Perhaps needless to say, I would generally discourage the downloading of assets/files sequentially. You should allow them to run concurrently, but control the degree of concurrency so your app is not overloaded. All of the patterns outlined above constrain the degree of concurrency to something reasonable.
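
    For the async-await approach, one way to constrain the degree of concurrency is a task group that never has more than a fixed number of child tasks in flight. The following is a sketch under that assumption; `forEachLimited` is a hypothetical helper, not a system API.

```swift
import Foundation

/// Runs `operation` for every item, keeping at most `maxConcurrent` child tasks in flight.
@available(iOS 13.0, macOS 10.15, tvOS 13.0, watchOS 6.0, *)
func forEachLimited<Item: Sendable>(
    _ items: [Item],
    maxConcurrent: Int,
    operation: @escaping @Sendable (Item) async throws -> Void
) async throws {
    precondition(maxConcurrent > 0, "maxConcurrent must be positive")

    try await withThrowingTaskGroup(of: Void.self) { group in
        for (index, item) in items.enumerated() {
            // Once the limit is reached, wait for one child to finish before adding another.
            if index >= maxConcurrent {
                _ = try await group.next()
            }
            group.addTask { try await operation(item) }
        }
        try await group.waitForAll()
    }
}
```

    Inside the closure you would download one URL and move the resulting file, as in `downloadFiles()` above. Passing `maxConcurrent: 1` reproduces the strictly sequential behavior, while a value such as 4 allows some overlap without flooding the app.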