What I want to do: load a raster from an S3 bucket into memory and set its CRS to EPSG:4326 (it has no CRS set).
What I have so far:
from io import BytesIO

import boto3
import rasterio
from rasterio.crs import CRS

bucket = 'my bucket'
key = 'my_key'
s3 = boto3.client('s3')
file_byte_string = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

# Attempt to open the in-memory bytes in update mode and assign a CRS
with rasterio.open(BytesIO(file_byte_string), mode='r+') as ds:
    crs = CRS({"init": "epsg:4326"})
    ds.crs = crs
I found the way to structure my code in this question: "Set CRS for a file read with rasterio".
It works if I give it a path to a local file but it does not work for bytestreams.
The error I get when I use 'r+' mode:
rasterio.errors.PathError: invalid path '<_io.BytesIO object at 0x7fb4503ca4d0>'
The error I get when I use 'r' mode:
rasterio.errors.DatasetAttributeError: read-only attribute
Is there a way to load a byte stream in r+ mode so that I can set or modify the CRS?
You can achieve this if you wrap your bytes in a NamedTemporaryFile. This and some alternatives are explained in the docs.
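import boto3
import rasterio
from rasterio.crs import CRS
import tempfile

bucket = 'asdf'
key = 'asdf'
s3 = boto3.client('s3')
# Download the whole object into memory
file_byte_string = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

with tempfile.NamedTemporaryFile() as tmpfile:
    tmpfile.write(file_byte_string)
    tmpfile.flush()  # make sure the bytes are on disk before reopening by name
    # r+ mode needs a real file path, which tmpfile.name provides
    with rasterio.open(tmpfile.name, "r+") as ds:
        crs = CRS.from_epsg(4326)  # same CRS as the older {"init": "epsg:4326"} form
        ds.crs = crs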
An important limitation of this approach is that the whole file has to be downloaded from S3 into memory, as opposed to reading the file remotely in place.
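If you need the temporary-file step in more than one place, it can be factored into a small helper. This is only a sketch using the standard library: the name `bytes_as_temp_path` and the `.tif` suffix are my own assumptions (the question does not say what format the raster is in), and the demo uses placeholder bytes rather than a real raster. It closes the file before handing out the path, so the bytes are flushed to disk and the file can be reopened by name on platforms that do not allow a second open handle.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def bytes_as_temp_path(data: bytes, suffix: str = ".tif"):
    """Yield the path of a temporary file holding `data`; delete it on exit.

    `suffix` is an assumed default; adjust it to the actual raster format.
    """
    # delete=False lets us close the file first and remove it ourselves later
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(data)
        tmp.close()  # closing flushes the buffered bytes to disk
        yield tmp.name
    finally:
        tmp.close()  # harmless if already closed
        os.remove(tmp.name)


# Demo with placeholder bytes (not a real raster)
payload = b"\x00\x01placeholder-bytes"
with bytes_as_temp_path(payload) as path:
    with open(path, "rb") as fh:
        round_trip = fh.read()
    existed_inside = os.path.exists(path)
existed_after = os.path.exists(path)
```

Inside the `with` block you would call `rasterio.open(path, "r+")` and assign `ds.crs` exactly as in the answer above. Note that the edited file is deleted when the block exits, so anything you want to keep has to be read back or uploaded before then.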