
Does AWS S3 GetObject provide random access?


I can provide HTTP Range headers to AWS S3's GetObject to request a specified range of bytes of an object.

Is it truly random access, or does S3 have to process all of the object before that range before returning my requested range?

Is the range header simply reducing the bytes transferred, or does it also provide efficient random access?


Solution

  • I ran a quick test against a 2GB file in S3, executing ranged gets of 8 bytes at various offsets in the file (including the start, middle, and end). The total time was fairly consistent at about 250ms user time per run, as reported by the time command from my Mac against us-east-1. That figure covers the whole run: starting node.js, loading packages, and executing the ranged GetObject.

    Here is the original AWS SDK v2 script that I used (run multiple times with different range values each time):

    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();
    
    const params = {
      Bucket: 'mybucket',
      Key: '2gfile',
      // Range: "bytes=0-7"
      // Range: "bytes=100000-100007"
      // Range: "bytes=10000000-10000007"
      // Range: "bytes=20000000-20000007"
      // Range: "bytes=40000000-40000007"
      // Range: "bytes=180000000-180000007"
      // Range: "bytes=1800000000-1800000007"
      // Range: "bytes=1073741824-1073741831",
      Range: "bytes=2147483100-2147483107",
    };
    
    (async () => {
      try {
        await s3.getObject(params).promise();
        console.log(params.Range, 'OK');
      } catch (err) {
        console.log(err, err.stack);
        throw err;
      }
    })();
    

    I wasn't able to find a definitive statement in the AWS documentation about the expected behavior here, though I'd hope and expect it to be close to O(1) constant time.
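    One way to make "close to O(1)" concrete is to fit a straight line to latency against byte offset: a slope near zero would suggest that the cost does not grow with how deep into the object the range sits. The sketch below is only an illustration of that check, not something I ran for the numbers above. The function names and the { offset, ms } sample shape are assumptions made for this example.

```javascript
// Least-squares slope of latency (ms) against byte offset.
// samples: [{ offset, ms }, ...]  (this shape is an assumption of the sketch)
function latencySlope(samples) {
  const n = samples.length;
  if (n < 2) throw new Error('need at least two samples');

  const meanX = samples.reduce((sum, p) => sum + p.offset, 0) / n;
  const meanY = samples.reduce((sum, p) => sum + p.ms, 0) / n;

  let sxy = 0;
  let sxx = 0;
  for (const { offset, ms } of samples) {
    sxy += (offset - meanX) * (ms - meanY);
    sxx += (offset - meanX) ** 2;
  }
  if (sxx === 0) throw new Error('all offsets are identical');

  return sxy / sxx; // milliseconds per byte of offset
}

// Extra latency the fitted line predicts between the first and last byte
// of an object of the given size. Near zero is what O(1) access would look like.
function predictedSpreadMs(samples, objectSize) {
  return latencySlope(samples) * objectSize;
}
```

    Feeding it the measured (offset, latency) pairs and the object size gives a single number to look at. Network jitter will dominate small samples, so treat the result as a rough indicator only.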

    I'd encourage you to investigate further before committing to a design. And maybe update us here.

    [Update] Here are the results of a more extensive experiment (thanks very much to @VivekMaharajh). The setup was S3, Lambda, a 2GB file, and 100 reads of 100 bytes each from random parts of the file:

    [Image: results of the experiment]
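    For anyone who wants to repeat this kind of experiment, here is a rough sketch of how it could be structured with the AWS SDK v2. It is not the code that produced the results above. The helper names (rangeHeader, randomOffsets, timeReads) and the S3_BUCKET / S3_KEY environment variables are hypothetical, chosen for this example. Timing happens inside the process, so Node.js startup and package loading are excluded.

```javascript
// Build an HTTP Range header value; both ends are inclusive byte offsets.
function rangeHeader(start, length) {
  return `bytes=${start}-${start + length - 1}`;
}

// Pick `count` random start offsets so that every read fits inside the object.
function randomOffsets(objectSize, readLength, count, rng = Math.random) {
  const maxStart = objectSize - readLength;
  return Array.from({ length: count }, () => Math.floor(rng() * (maxStart + 1)));
}

// Time each read one after another; resolves to [{ offset, ms }, ...].
// `getRange` is an injected async function taking a Range header string,
// so the timing logic can be tried without AWS credentials.
async function timeReads(getRange, offsets, readLength) {
  const results = [];
  for (const offset of offsets) {
    const t0 = process.hrtime.bigint();
    await getRange(rangeHeader(offset, readLength));
    const ms = Number(process.hrtime.bigint() - t0) / 1e6;
    results.push({ offset, ms });
  }
  return results;
}

// Only talks to AWS when both (hypothetical) environment variables are set.
if (process.env.S3_BUCKET && process.env.S3_KEY) {
  const AWS = require('aws-sdk'); // SDK v2, as in the script above
  const s3 = new AWS.S3();
  const Bucket = process.env.S3_BUCKET;
  const Key = process.env.S3_KEY;

  (async () => {
    // headObject returns the object size without downloading the body.
    const { ContentLength } = await s3.headObject({ Bucket, Key }).promise();
    const offsets = randomOffsets(ContentLength, 100, 100);
    const results = await timeReads(
      (Range) => s3.getObject({ Bucket, Key, Range }).promise(),
      offsets,
      100
    );
    results.sort((a, b) => a.offset - b.offset);
    for (const { offset, ms } of results) {
      console.log(offset, ms.toFixed(1));
    }
  })().catch((err) => {
    console.error(err);
    process.exitCode = 1;
  });
}
```

    The first request in a run usually pays for connection setup, so it may be worth discarding it or issuing a warm-up read before comparing offsets.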