I'm trying to download a result file from this site (the URL is in the code below).
To do that, I use Selenium to select the dates and click Download. If I wget the link directly I get a 403 error; it looks like the download link redirects to AWS S3, so I can't just use wget.
Is there a better way to do this? (I don't have experience with AWS, so I don't know how to deal with that.)
My code:
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.coordinador.cl/operacion/documentos/registro-de-instrucciones-de-operacion-rio-sscc-energia/')

# Open the start-date picker and click the highlighted day
fini = driver.find_element(By.ID, 'energia-inicio')
fini.click()
cal_ini = driver.find_element(By.CSS_SELECTOR, 'a.ui-state-default.ui-state-highlight')
cal_ini.click()

# Open the end-date picker and click the highlighted day
ffin = driver.find_element(By.ID, 'energia-termino')
ffin.click()
cal_fin = driver.find_element(By.CSS_SELECTOR, 'a.ui-state-default.ui-state-highlight')
cal_fin.click()

# Click the download button, then give the browser time to finish the download
downloadcsv = driver.find_element(By.CSS_SELECTOR, 'a.cen_btn.cen_btn-primary.download-energia')
downloadcsv.click()
time.sleep(30)
driver.quit()
You can use wget with a browser User-Agent for the download, or use aria2 for faster downloading.
wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" -O export_energia.csv "https://www.coordinador.cl/wp-admin/admin-ajax.php?action=export_energia_csv&fecha_inicio=2024-11-14&fecha_termino=2024-12-03&hora_inicio=03:00:00&hora_termino=23:59:59"
--user-agent="...": Sets the User-Agent string to mimic a specific web browser (Google Chrome on Windows 10).
-O export_energia.csv: Saves the downloaded file as export_energia.csv.
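If you'd rather stay in Python than shell out to wget, something like the following should be roughly equivalent. It is only a minimal sketch using the standard library: build_export_url and download_export are names I made up for illustration, and I'm assuming the endpoint accepts the same query parameters as the wget command above and needs nothing beyond the browser User-Agent.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://www.coordinador.cl/wp-admin/admin-ajax.php"
# Same User-Agent string as in the wget command
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
)


def build_export_url(fecha_inicio, fecha_termino,
                     hora_inicio="03:00:00", hora_termino="23:59:59"):
    """Build the CSV export URL (hypothetical helper; dates are YYYY-MM-DD)."""
    params = {
        "action": "export_energia_csv",
        "fecha_inicio": fecha_inicio,
        "fecha_termino": fecha_termino,
        "hora_inicio": hora_inicio,
        "hora_termino": hora_termino,
    }
    # safe=":" leaves the colons in the times unescaped, as in the wget URL
    return BASE_URL + "?" + urlencode(params, safe=":")


def download_export(url, dest="export_energia.csv"):
    """Fetch url with a browser User-Agent and save the body to dest."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=60) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest
```

Calling download_export(build_export_url("2024-11-14", "2024-12-03")) should then request the same URL as the wget command and write export_energia.csv in the current directory, assuming the server responds the same way to urllib as it does to wget.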
To install aria2c, follow the instructions for your operating system:
Download the installer and extract it.
Add to PATH (optional): add the folder you extracted to the system PATH so that aria2c can be run from any directory.
Debian/Ubuntu:
sudo apt update
sudo apt install aria2
CentOS/RHEL:
sudo yum install aria2
Fedora:
sudo dnf install aria2
Command for downloading:
aria2c --header="User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" -x 16 -s 16 "https://www.coordinador.cl/wp-admin/admin-ajax.php?action=export_energia_csv&fecha_inicio=2024-11-14&fecha_termino=2024-12-03&hora_inicio=03:00:00&hora_termino=23:59:59"
--header="User-Agent: ...": Sends a User-Agent header, like wget's --user-agent option.
-x 16: Allows up to 16 connections to the server for the download.
-s 16: Splits the download into 16 segments for faster downloading.
Or just run it without -x and -s for a normal download.
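Since the question is already a Python script, the aria2c call can also be driven from Python. This is just a sketch under a few assumptions: aria2c is installed and on PATH, aria2c_command and run_aria2c are hypothetical helper names, and only the flags shown above are used.

```python
import subprocess

# Same User-Agent string as in the aria2c command above
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
)


def aria2c_command(url, connections=None):
    """Return the aria2c argument list (hypothetical helper).

    With connections=None the -x and -s flags are left out,
    which gives a normal download.
    """
    cmd = ["aria2c", "--header=User-Agent: " + USER_AGENT]
    if connections is not None:
        # -x: connections to the server, -s: number of segments
        cmd += ["-x", str(connections), "-s", str(connections)]
    cmd.append(url)
    return cmd


def run_aria2c(url, connections=None):
    """Run aria2c and raise CalledProcessError if it exits with an error."""
    subprocess.run(aria2c_command(url, connections), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the & characters in the URL.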