I want to use Python to fill this form.
I tried using Mechanize but this is a Microsoft Form which uses JavaScript and has no form tag and no GET/POST URL. Maybe BeautifulSoup/Selenium can do this, but I do not have any experience in scraping JS forms. Can anyone help me out and suggest how to go about this?
Here's what I've tried. Mechanize is unable to recognize any form on the page:
import mechanize

def main():
    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.set_handle_refresh(False)
    br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
    response = br.open("https://forms.office.com/Pages/ResponsePage.aspx?id=8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u")
    for form in br.forms():
        print("Form name:", form.name)  # prints nothing
        print(form)  # prints nothing

if __name__ == '__main__':
    main()
Selenium works fine.
You'll need to install it first:
pip install selenium
Then this runs:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = "https://forms.office.com/Pages/ResponsePage.aspx?id=8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u"
driver.get(url)

# find_element(By.XPATH, ...) is the locator API in Selenium 4.
# Each input is located via the question title shown above it.
name = driver.find_element(By.XPATH, "//div[@class='question-title-box'][.//span[text()='NAME']]/following-sibling::*//input")
name.send_keys("hello, World")

section_selection = "F"
section = driver.find_element(By.XPATH, "//div[@class='question-title-box'][.//span[text()='Section']]/following-sibling::*//input[@value='" + section_selection + "']")
section.click()

date = driver.find_element(By.XPATH, "//input[contains(@placeholder, 'Please input date')]")
date.send_keys("01/12/2020")

submit = driver.find_element(By.XPATH, "//div[text()='Submit']")
submit.click()
The XPaths are a little long, but they're based on the question text, so they should be fairly stable.
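If the form has many questions, the repeated XPath pattern can be factored into small helpers. This is only a sketch: the function names `text_input_xpath` and `choice_input_xpath` are made up for illustration, and it assumes every question follows the same `question-title-box` layout seen on this form.

```python
def text_input_xpath(title):
    """XPath for the input that follows the question with the given title.

    Assumes the title contains no single quotes, since it is placed
    inside a single-quoted XPath string literal.
    """
    return (
        "//div[@class='question-title-box']"
        f"[.//span[text()='{title}']]"
        "/following-sibling::*//input"
    )


def choice_input_xpath(title, value):
    """XPath for one option (matched by its value attribute) of a choice question."""
    return text_input_xpath(title) + f"[@value='{value}']"


print(text_input_xpath("NAME"))
print(choice_input_xpath("Section", "F"))
```

Each returned string can be passed to Selenium's find-by-XPath call in place of the literal XPaths.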
For an alternative approach: when you say there is no POST URL, did you check devtools? The Network tab exposes the destination of the form:
Request URL: https://forms.office.com/formapi/api/aebbf9f0-23da-49e3-98bf-32171abbc9bc/users/f70e502c-96b2-4239-aca3-764dea371071/forms('8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u')/responses
Request Method: POST
It also exposes the payload. This is the first submit:
{startDate: "2020-08-17T10:40:18.504Z", submitDate: "2020-08-17T10:40:18.507Z",…}
answers: "[{"questionId":"r8f09d63e6f6f42feb2f8f4f8ed3f9389","answer1":"Hello, World"},{"questionId":"r28fe12073dfa47399f8ce95ae679dccf","answer1":"G"},{"questionId":"r8f9e9fedcc2e410c80bfa1e0e3ef9750","answer1":"2020-08-28"}]"
startDate: "2020-08-17T10:40:18.504Z"
submitDate: "2020-08-17T10:40:18.507Z"
The GUIDs in the POST URL and the question IDs seem to be static for this form. They don't change between runs. This is the second run:
{startDate: "2020-08-17T10:43:48.544Z", submitDate: "2020-08-17T10:43:48.546Z",…}
answers: "[{"questionId":"r8f09d63e6f6f42feb2f8f4f8ed3f9389","answer1":"test me"},{"questionId":"r28fe12073dfa47399f8ce95ae679dccf","answer1":"G"},{"questionId":"r8f9e9fedcc2e410c80bfa1e0e3ef9750","answer1":"2020-08-12"}]"
startDate: "2020-08-17T10:43:48.544Z"
submitDate: "2020-08-17T10:43:48.546Z"
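Note that `answers` is itself a JSON-encoded string inside the payload, not a nested list. A quick way to check that the question IDs match between the two captures is to decode both strings and compare. This is a small sketch using the two captured values above; the helper name `question_ids` is just for illustration.

```python
import json

# The "answers" strings copied from the two captured requests.
first_run = '[{"questionId":"r8f09d63e6f6f42feb2f8f4f8ed3f9389","answer1":"Hello, World"},{"questionId":"r28fe12073dfa47399f8ce95ae679dccf","answer1":"G"},{"questionId":"r8f9e9fedcc2e410c80bfa1e0e3ef9750","answer1":"2020-08-28"}]'
second_run = '[{"questionId":"r8f09d63e6f6f42feb2f8f4f8ed3f9389","answer1":"test me"},{"questionId":"r28fe12073dfa47399f8ce95ae679dccf","answer1":"G"},{"questionId":"r8f9e9fedcc2e410c80bfa1e0e3ef9750","answer1":"2020-08-12"}]'


def question_ids(answers_json):
    """Decode the answers string and return the question IDs in order."""
    return [answer["questionId"] for answer in json.loads(answers_json)]


ids_match = question_ids(first_run) == question_ids(second_run)
print(ids_match)
```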
Once you've captured this request, you'll probably be able to submit through the API without a GUI.
Just to make sure, I tried it and it succeeded:
import requests
url = "https://forms.office.com/formapi/api/aebbf9f0-23da-49e3-98bf-32171abbc9bc/users/f70e502c-96b2-4239-aca3-764dea371071/forms('8Pm7rtoj40mYvzIXGrvJvCxQDveyljlCrKN2Teo3EHFUQVNaWDlYRkhYR09JRTZWRFpKTTNIQU9HUC4u')/responses"
myobj = {"startDate":"2020-08-17T10:48:40.118Z","submitDate":"2020-08-17T10:48:40.121Z","answers":"[{\"questionId\":\"r8f09d63e6f6f42feb2f8f4f8ed3f9389\",\"answer1\":\"Hello again, World\"},{\"questionId\":\"r28fe12073dfa47399f8ce95ae679dccf\",\"answer1\":\"F\"},{\"questionId\":\"r8f9e9fedcc2e410c80bfa1e0e3ef9750\",\"answer1\":\"2020-08-26\"}]"}
x = requests.post(url, data=myobj)
print(x.status_code)  # a 2xx status indicates the request was accepted
My answers are just hard-coded into the data object, but it seems to work.
Remember to pip install requests if you don't already have it.
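To avoid hard-coding, the data object could be built from variables. The sketch below is untested against the live form and makes a few assumptions: the function names `build_payload` and `iso_utc` are my own, the mapping of question IDs to NAME/Section/date is inferred from the captured answers, and the timestamps mimic the millisecond UTC format seen in devtools.

```python
import json
from datetime import datetime, timezone

# Question IDs taken from the captured payload; the key names are just labels.
QUESTION_IDS = {
    "name": "r8f09d63e6f6f42feb2f8f4f8ed3f9389",
    "section": "r28fe12073dfa47399f8ce95ae679dccf",
    "date": "r8f9e9fedcc2e410c80bfa1e0e3ef9750",
}


def iso_utc(moment):
    """Format a timezone-aware datetime like 2020-08-17T10:48:40.118Z."""
    text = moment.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def build_payload(name, section, date, started=None, submitted=None):
    """Build the form data; `date` is a YYYY-MM-DD string as in the capture."""
    started = started or datetime.now(timezone.utc)
    submitted = submitted or datetime.now(timezone.utc)
    answers = [
        {"questionId": QUESTION_IDS["name"], "answer1": name},
        {"questionId": QUESTION_IDS["section"], "answer1": section},
        {"questionId": QUESTION_IDS["date"], "answer1": date},
    ]
    return {
        "startDate": iso_utc(started),
        "submitDate": iso_utc(submitted),
        # The captured request sends answers as a JSON string, not a list.
        "answers": json.dumps(answers, separators=(",", ":")),
    }


payload = build_payload("Hello again, World", "F", "2020-08-26")
print(payload["answers"])
```

The resulting dict can then be passed as `data=` to `requests.post` in place of `myobj`.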