I'm trying to use the following formula to retrieve the first URL off a Google search result, but I'm getting the error that Imported content is empty.
:
=IMPORTXML("https://www.google.com/search?q=cnn&num=1", "//cite")
It looks like the result URL is contained within an cite
tag in the results page, but the formula isn't working as expected. I expected this formula would return https://www.cnn.com
. Same formula with //h3
as XPath does return the title.
Any idea what I am doing wrong?
The cite
tag is generated dynamically via script, making it unreachable to formulas like IMPORTXML
.
You can see that's the case if you disable JavaScript and refresh the page. You won't see that element anymore.
The URL is also available in a div
that is sibling of the h3
element containing the title. This element is not injected dynamically via script, so it can be accessed by IMPORTXML
. Retrieve that instead:
=IMPORTXML("https://www.google.com/search?q=cnn&num=1", "//h3/following-sibling::div")