python-3.x web-scraping scrapy

Why does the Scrapfly API sometimes return 422 and sometimes 200 for the same JS scenario?


I'm using the Scrapfly API to scrape a webpage via a GET request with a JavaScript scenario like this:

    "js_scenario": [
    { "fill": { "selector": "form#peoplesearch input#topsearch", "value": data.first_name + " " + data.last_name } },
    { "fill": { "selector": "form#peoplesearch input#city_state", "value": data.city + ", " + data.state } },
    { "click": { "selector": "form#peoplesearch div.form-search__submit.submit-button" } },
    { "wait_for_navigation": { "timeout": 3000 } }
]

Sometimes the API responds with 200 OK, and other times I get a 422 Unprocessable Entity error — even when the input data seems fine and the same JS scenario is sent.


Question: What could be causing Scrapfly to return 422 inconsistently? Is there something wrong with my js_scenario, or could this be due to timing/page-load behavior?


Solution

  • I faced the same issue.

    In my case, I examined the response JSON from the Scrapfly API and noticed that `"executed": false` appeared under one of the input steps inside the browser-context key of the response. After digging deeper, I found that sometimes the input element wasn't fully loaded in the page before the scenario tried to interact with it. That's why it returned a 422: the request couldn't be fully processed.
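    You can spot this failure mode programmatically by scanning the scenario report in the response body for steps flagged `"executed": false`. The exact key layout below is an assumption based on the structure described above; check your own response JSON for the real field names:

    ```python
    # Hypothetical excerpt of a Scrapfly-style response body; real key
    # names may differ from this sketch.
    response_json = {
        "browser-context": {
            "js_scenario": [
                {"action": "fill", "selector": "input#topsearch", "executed": True},
                {"action": "click", "selector": "div.submit-button", "executed": False},
            ]
        }
    }

    # Collect every step the browser never managed to run.
    failed_steps = [
        step for step in response_json["browser-context"]["js_scenario"]
        if not step.get("executed", False)
    ]
    for step in failed_steps:
        print(f"step {step['action']!r} on {step['selector']!r} did not execute")
    ```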

    To fix this, I added a wait_for_selector step before filling the input; it waits until the element is present on the page. You can just add:

    { "wait_for_selector": { "selector": "form#peoplesearch input#topsearch" }},
    

    Once I added that wait_for_selector, the 422 errors stopped completely.
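    Putting it together, the full scenario with the extra wait step might look like this (selectors copied from the question; the literal fill values are placeholders for the dynamic data):

    ```python
    js_scenario = [
        # Wait until the search input is actually attached before typing into it.
        {"wait_for_selector": {"selector": "form#peoplesearch input#topsearch"}},
        {"fill": {"selector": "form#peoplesearch input#topsearch", "value": "Jane Doe"}},
        {"fill": {"selector": "form#peoplesearch input#city_state", "value": "Austin, TX"}},
        {"click": {"selector": "form#peoplesearch div.form-search__submit.submit-button"}},
        {"wait_for_navigation": {"timeout": 3000}},
    ]
    ```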