Tags: ios, swift, html-parsing, nsurlsessiondatatask, swiftsoup

URL data task does not show the right content when parsed with SwiftSoup (Swift 5)


I am pretty new to Swift and have an app that performs a simple URL data task and parses the HTML contents of a website. I was trying to load certain elements, but I wasn't getting the content that I see on the website when I inspect it manually. I don't really know what the problem is.

I guess my question is: is there a way to load the content as it would appear if I opened this website manually?

Here is the relevant code:

import Foundation
import SwiftSoup

let config = URLSessionConfiguration.default
config.httpAdditionalHeaders = ["User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"]

let session = URLSession(configuration: config)

// `link` is the page address as a String, defined elsewhere.
let url = URL(string: link)

let task = session.dataTask(with: url!) { [self] (data, response, error) in
    // Stop early if the request failed or the body is not valid UTF-8.
    guard let data = data, error == nil,
          let htmlContent = String(data: data, encoding: .utf8) else {
        print("error")
        return
    }
    do {
        let doc: Document = try SwiftSoup.parse(htmlContent)

        let elements = try doc.getAllElements().array()

    } catch Exception.Error(type: let type, Message: let message) {
        print(type)
        print(message)
    } catch {
        print("error")
    }
}
// A data task does nothing until it is resumed.
task.resume()

Please let me know if there is any way to do this, even if it involves using a different package to parse the data. It is very important for my app. I would highly appreciate any help possible!

Thanks.


Solution

  • I found a solution that works for me. Here is the relevant code:

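    import UIKit
    import WebKit
    import SwiftSoup

    // The members below live in a UIViewController subclass that conforms to WKNavigationDelegate.
    private let webView: WKWebView = {
        let prefs = WKPreferences()
        // JavaScript is on by default; this property is deprecated since iOS 14.
        prefs.javaScriptEnabled = true
        let config = WKWebViewConfiguration()
        config.preferences = prefs
        let webView = WKWebView(frame: .zero, configuration: config)
        return webView
    }()

    override func viewDidLoad() {
        super.viewDidLoad()

        view.addSubview(webView)
        webView.navigationDelegate = self
        // Start loading the page here with webView.load(URLRequest(url: ...)).
    }

    // Called when the main document has finished loading.
    func webView(_ webView: WKWebView, didFinish navigation: WKNavigation!) {
        parseData()
    }

    func parseData() {
        // Wait a fixed 5 seconds so the page's scripts can render their content.
        DispatchQueue.main.asyncAfter(deadline: .now() + 5.0) { [weak self] in
            // The view controller may already be gone when the delay elapses.
            self?.webView.evaluateJavaScript("document.body.innerHTML") { result, error in
                guard let htmlContent = result as? String, error == nil else {
                    print("error")
                    return
                }

                do {
                    let doc = try SwiftSoup.parse(htmlContent)
                    let allProducts = try doc.getAllElements().array()
                } catch {
                    print("error")
                }
            }
        }
    }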

    Using a WebView to load the website first and then parsing the data after a delay is a working solution for me. A fixed delay is probably not the best idea, so if anyone has another suggestion, it would be highly appreciated!
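One possible way to avoid the fixed delay is to poll the page until the element you care about actually exists, and only then read the HTML. The following is a rough sketch rather than a tested drop-in: `readinessScript(forSelector:)` and `waitForElement(_:attemptsLeft:interval:completion:)` are hypothetical helpers (not part of WebKit or SwiftSoup), and the default of 20 attempts at 0.5 second intervals is an arbitrary assumption you would tune for your site.

```swift
import Foundation
import WebKit

/// Builds a JavaScript expression that is true once `selector` matches an element.
/// Hypothetical helper, not part of WebKit or SwiftSoup.
func readinessScript(forSelector selector: String) -> String {
    // Escape backslashes and single quotes so the selector stays a valid JS string literal.
    let escaped = selector
        .replacingOccurrences(of: "\\", with: "\\\\")
        .replacingOccurrences(of: "'", with: "\\'")
    return "document.querySelector('\(escaped)') !== null"
}

extension WKWebView {
    /// Polls the page until `selector` matches, then passes the rendered HTML to `completion`.
    /// Passes nil if the attempts run out or the web view goes away.
    func waitForElement(_ selector: String,
                        attemptsLeft: Int = 20,          // assumed default, tune as needed
                        interval: TimeInterval = 0.5,    // assumed default, tune as needed
                        completion: @escaping (String?) -> Void) {
        guard attemptsLeft > 0 else {
            completion(nil)
            return
        }
        evaluateJavaScript(readinessScript(forSelector: selector)) { [weak self] result, _ in
            guard let self = self else {
                completion(nil)
                return
            }
            if (result as? Bool) == true {
                // The element exists, so grab the rendered markup.
                self.evaluateJavaScript("document.documentElement.outerHTML") { html, _ in
                    completion(html as? String)
                }
            } else {
                // Not there yet: try again after a short pause.
                DispatchQueue.main.asyncAfter(deadline: .now() + interval) {
                    self.waitForElement(selector,
                                        attemptsLeft: attemptsLeft - 1,
                                        interval: interval,
                                        completion: completion)
                }
            }
        }
    }
}
```

With something like this, `webView(_:didFinish:)` could call `webView.waitForElement(".product") { html in ... }` and hand the non-nil `html` to `SwiftSoup.parse`, where `.product` is only a placeholder for whatever selector identifies the content on your page. The attempt limit still bounds the total wait, so a page that never renders the element ends with `nil` instead of hanging forever.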