I am trying to get the video length from a YouTube URL. Inspecting the page in a web browser shows that the duration sits in a span element with the class ytp-time-duration.
Then I use requests and BeautifulSoup to get it:
import requests
from bs4 import BeautifulSoup
url = "https://www.youtube.com/watch?v=ANYyoutubeLINK"
response = requests.get(url)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
duration_span = soup.find_all('span', class_='ytp-time-duration')
print(duration_span)
Neither soup.find_all nor soup.find finds the element. What went wrong?
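The element you are searching for doesn't exist in the HTML that requests receives.
The player adds it with JavaScript after the page loads, and requests does not execute JavaScript, so without JS rendering you will not get the information you are seeking.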
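To see why a parser comes back empty-handed, here is a minimal sketch using only the standard library. The STATIC_HTML string is a made-up, heavily simplified stand-in for what a plain HTTP request might return (it is not YouTube's real markup), and SpanClassCollector is a hypothetical helper name. The class name appears only inside a script, which a parser treats as text and never runs.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a server response: the player container is empty
# and a script would fill it in later, but only inside a real browser.
STATIC_HTML = """
<html><body>
  <div id="player"></div>
  <script>
    var span = document.createElement("span");
    span.className = "ytp-time-duration";
    span.textContent = "1:43";
    document.getElementById("player").appendChild(span);
  </script>
</body></html>
"""


class SpanClassCollector(HTMLParser):
    """Records the class attribute of every <span> start tag it parses."""

    def __init__(self):
        super().__init__()
        self.span_classes = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.span_classes.append(dict(attrs).get("class") or "")


collector = SpanClassCollector()
collector.feed(STATIC_HTML)

# Keep only spans that carry the class we are looking for.
found = [c for c in collector.span_classes if "ytp-time-duration" in c.split()]
print(found)  # [] because the script body is plain text to the parser
```

BeautifulSoup behaves the same way on such input: it parses the markup it is given and never executes scripts, which is why find and find_all return nothing.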
Use Selenium in headless mode and you will get the time.
You can parse the rendered page with BeautifulSoup or get the data directly from the WebDriver.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup
chrome_options = Options()
chrome_options.add_argument('--start-maximized')
chrome_options.add_argument("--headless")  # Run in headless mode (no GUI)
driver = webdriver.Chrome(options=chrome_options)
URL = "https://www.youtube.com/watch?v=ANYyoutubeLINK"
driver.get(URL)
# Wait for the player to add the duration element (the 10 second timeout is an arbitrary choice)
duration = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, 'ytp-time-duration'))
)
# Get the time directly from webdriver
print(f"From webdriver: {duration.text}")
# Get the time using beautifulsoup
soup = BeautifulSoup(driver.page_source, 'html.parser')
duration_span = soup.find('span', class_='ytp-time-duration')
print(f"From beautifulsoup: {duration_span.text}")
# Quit the webdriver
driver.quit()
Output:
From webdriver: 1:43
From beautifulsoup: 1:43
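If you need the length as a number rather than as text, the string can be converted to seconds. This is only a sketch: duration_to_seconds is a hypothetical helper name, and it assumes the text is in M:SS or H:MM:SS form like the output above.

```python
def duration_to_seconds(text: str) -> int:
    """Convert a duration such as '1:43' or '1:02:03' to a number of seconds."""
    parts = text.strip().split(":")
    # Accept SS, M:SS, or H:MM:SS; reject empty or non-numeric fields.
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected duration format: {text!r}")
    seconds = 0
    for part in parts:
        # Each field to the right is worth 1/60 of the field before it.
        seconds = seconds * 60 + int(part)
    return seconds


print(duration_to_seconds("1:43"))     # 103
print(duration_to_seconds("1:02:03"))  # 3723
```

An empty string raises ValueError, which is a useful signal if the element was found before the player had filled in its text.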