I'm trying to scrape a page that has a few buttons.
I want to select/click on the last button. Using Chrome's SelectorGadget extension, I can successfully select the last button by adding :last
at the end of my selector. But when I run the following functions in rvest,
they return: Error in onRejected(reason) : code: -32000 message: DOM Error while querying
Here is the code:
library(rvest)
page <- read_html_live("https://researchers.cedars-sinai.edu/search?by=text&type=user")
page %>%
  html_elements("span button:last")
# or
page$click(css = "span button:last")
I have tried these changes, but they don't do the job: :nth-child(1), :first-child, and :nth-last-child(1).
Also, I know XPath can solve the problem. The issue is that rvest's click() does not accept XPath yet, so I have to stick with CSS.
You can skip the browser and call the site's API directly to fetch everything.
library(tidyverse)
library(httr2)

# First request: only used to read the total number of records.
# req_body_json() attaches a JSON body, so httr2 sends the request as a POST.
first_page <- request("https://researchers.cedars-sinai.edu/api/users") %>%
  req_body_json(list(params = list(by = "text", type = "user"))) %>%
  req_perform() %>%
  resp_body_json(simplifyVector = TRUE)

n <- first_page %>%
  pluck("pagination", "total")

# Request 100 records per call at offsets 0, 100, 200, ...,
# then row-bind the pages into a single tibble.
df <- map(seq(0, n, 100),
          ~ request("https://researchers.cedars-sinai.edu/api/users") %>%
            req_body_json(list(params = list(by = "text", type = "user"),
                               pagination = list(startFrom = .x, perPage = 100))) %>%
            req_perform() %>%
            resp_body_json(simplifyVector = TRUE) %>%
            pluck("resource") %>%
            as_tibble()) %>%
  list_rbind()
# A tibble: 986 × 16
lastName overview hasThumbnail discoveryUrlId positions tags$explicit discoveryId linkedObjectsCounts$…¹
<chr> <chr> <lgl> <chr> <list> <list> <chr> <int>
1 Abdel-Hafiz "Hany Ab… TRUE Hany.Abdel-Ha… <df> <df [1 × 3]> 1513 1
2 Abdul-Haqq NA TRUE Ryan.Abdul <df> <NULL> 3636 0
3 Aboujaoude NA TRUE Elias.Aboujao… <df> <NULL> 10472 0
4 Abuav "Dr. Abu… TRUE Rachel.Abuav <df> <NULL> 4847 0
5 Accortt "Eynav A… TRUE Eynav.Accortt <df> <df [8 × 3]> 1865 8
6 Ader "The ove… TRUE Marilyn.Ader <df> <df [8 × 3]> 1237 13
7 Ahdoot NA TRUE Michael.Ahdoot <df> <df [1 × 3]> 877 10
8 Ahluwalia NA TRUE Sonu.Ahluwalia <df> <df [1 × 3]> 2958 0
9 Ahmed NA FALSE Waseem.Ahmed <df> <NULL> 18202 0
10 Ainsworth NA TRUE Richard.Ainsw… <df> <NULL> 7154 3
# ℹ 976 more rows
# ℹ abbreviated name: ¹linkedObjectsCounts$grants$all
# ℹ 13 more variables: linkedObjectsCounts$grants$favourites <int>,
# linkedObjectsCounts$teachingActivities <df[,2]>, $equipment <df[,2]>, $professionalActivities <df[,2]>,
# $publications <df[,2]>, firstName <chr>, firstNameLastName <chr>, equipmentLinkTypes <list>,
# objectId <int>, updatedWhen <chr>, hasCollaborationData <lgl>, embeddableMediaList <list>,
# customFilterOne <list>
# ℹ Use `print(n = ...)` to see more rows
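A small note on the offsets: seq(0, n, 100) works for the 986 records here, but if the total were ever an exact multiple of 100, it would add one extra request that presumably comes back empty. Below is a minimal sketch of a helper that avoids this, assuming startFrom is a zero-based row offset (which the answer's starting value of 0 suggests, but which I have not confirmed against the API). The name page_offsets is hypothetical and not part of httr2 or the original answer.

```r
# Hypothetical helper: zero-based offsets for a paginated endpoint.
# Assumes `total` is the record count reported by the API and that
# each request returns at most `per_page` rows starting at the offset.
page_offsets <- function(total, per_page = 100) {
  if (total <= 0) {
    return(numeric(0))  # nothing to fetch
  }
  # The last valid row index is total - 1, so stop the sequence there.
  seq(0, total - 1, by = per_page)
}

offsets <- page_offsets(986)
length(offsets)      # number of requests needed for 986 records
page_offsets(1000)   # ends at 900; seq(0, 1000, 100) would also include 1000
```

To use it, the call map(seq(0, n, 100), ...) could be swapped for map(page_offsets(n), ...), leaving the rest of the pipeline unchanged.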