So I am working on a project that requires specific data from the COSMIC-2 satellite mission.
The data is distributed as compressed tar.gz archives, and there are thousands of files, so I don't want to download them all and process them one by one, due to time and storage constraints.
Instead, I would like to find a way to read data from the files directly, without having to download them first.
Maybe requests or urllib can do that?
Currently I tried:

```python
import requests
import tarfile

url = "https://sitename.com/data.tar.gz"
file = requests.get(url, stream=True)
with tarfile.open(file, "r:gz") as f:
    f.extractall()
```
You can read data from a tar.gz file over HTTP without saving it to disk in Python by using the urllib module to fetch the archive and the tarfile module to extract its contents in memory.
Here's an example of how you can do this:
```python
import urllib.request
import tarfile
import io

url = "http://example.com/your_file.tar.gz"  # Replace with the actual URL of the tar.gz file

# Fetch the tar.gz file into an in-memory buffer
response = urllib.request.urlopen(url)
tar_bytes = io.BytesIO(response.read())

# Extract the contents without writing anything to disk
with tarfile.open(fileobj=tar_bytes, mode="r:gz") as tar:
    for member in tar.getmembers():
        f = tar.extractfile(member)
        if f is not None:
            content = f.read()
            print(content.decode("utf-8"))
```
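One caveat: io.BytesIO(response.read()) buffers the entire archive in memory before extraction, which may be a problem for large archives. Below is a minimal sketch of a fully streaming variant using requests together with tarfile's stream mode "r|gz"; it assumes the server serves the archive as plain binary, and the URL and the .nc filename filter are placeholders, not the actual COSMIC-2 layout:

```python
import tarfile
import requests

url = "https://sitename.com/data.tar.gz"  # placeholder URL

# stream=True defers the download; response.raw is a file-like object
# that tarfile can read from directly.
with requests.get(url, stream=True) as response:
    response.raise_for_status()
    # "r|gz" treats the source as a non-seekable stream, decompressing
    # members on the fly instead of buffering the whole archive.
    with tarfile.open(fileobj=response.raw, mode="r|gz") as tar:
        for member in tar:
            # Keep only the files of interest; the .nc suffix is an
            # assumed filter, adjust it to the real file names.
            if not member.isfile() or not member.name.endswith(".nc"):
                continue
            f = tar.extractfile(member)
            if f is not None:
                content = f.read()  # bytes of this member only
                print(member.name, len(content))
```

Because "r|gz" reads the archive sequentially, members must be consumed in the order they appear, but nothing larger than a single member is ever held in memory, which fits the storage constraints you described.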