Tags: json, amazon-web-services, powershell, amazon-s3, select-object

How To Query S3 Objects with CLI instead of S3 Select?


I need a CLI equivalent of the console (dashboard) example linked here, but with JSON as both the input and output serialization type.

I tried running the following command in AWS CloudShell to print the JSON output to the terminal, but I get an error.

aws s3api select-object-content --bucket "my-bucket" --key jobs/test.json --expression "SELECT * FROM s3object s LIMIT 5" --expression-type 'SQL' --input-serialization "{"JSON":{"Type": "DOCUMENT"},"CompressionType": "None"}" --output-serialization "{"JSON": {Type: 'DOCUMENT'}}" /dev/stdout

Error: Error parsing parameter '--input-serialization': Invalid JSON: Expecting property name enclosed in double quotes: line 1 column 2 (char 1) JSON received: {JSON:{Type: DOCUMENT},CompressionType: None}

I see plenty of examples for the CSV format, but I cannot find the equivalent for JSON.

Thank you in advance.

Note: I am running this in AWS CloudShell, which is Linux-based.

FYI: The screenshot below shows the console (dashboard) input and output serialization settings I am trying to reproduce.

[Screenshot: console input and output serialization settings]


Solution

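  • On a Linux/macOS terminal, enclose the entire JSON string in single quotes '. Inside a double-quoted string the shell consumes the inner double quotes, which is why the CLI received {JSON:{Type: DOCUMENT},...} and rejected it as invalid JSON. In PowerShell, escape each inner double quote with a backslash (\").

    For example: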

    aws s3api select-object-content --bucket "my-bucket" --key jobs/test.json --expression "SELECT * FROM s3object s LIMIT 5" --expression-type 'SQL' --input-serialization '{"JSON":{"Type": "DOCUMENT"},"CompressionType": "NONE"}' --output-serialization '{"JSON":{"RecordDelimiter":"\n"}}' /dev/stdout
    

    Note: A single quote cannot be escaped directly inside a single-quoted shell string. If your JSON string contains one, close the string, add a backslash-escaped quote, and reopen the string: '\''.
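
The quoting behaviour behind this answer can be checked without calling AWS at all. The following is a minimal sketch for bash (the default shell in CloudShell is assumed to be bash-compatible). The variable names and the "note" field are hypothetical and only serve to show what each quoting style hands to a program.

```shell
#!/usr/bin/env bash

# Double quotes around JSON: every inner " closes or reopens the shell string,
# so the shell consumes the quotes and the program never sees them.
broken="{"JSON":{"Type": "DOCUMENT"},"CompressionType": "None"}"

# Single quotes: everything between them is passed through literally.
fixed='{"JSON":{"Type": "DOCUMENT"},"CompressionType": "NONE"}'

# A backslash followed by n stays as two literal characters inside single
# quotes; it is the JSON parser that later reads them as a newline.
output='{"JSON":{"RecordDelimiter":"\n"}}'

# A literal single quote inside single-quoted text: close the string,
# add an escaped quote, and reopen it ('\'').
with_quote='{"note":"it'\''s"}'

printf '%s\n' "$broken" "$fixed" "$output" "$with_quote"
```

The first printed line matches the "JSON received" text in the error message from the question, while the second and third are the strings the corrected command passes to --input-serialization and --output-serialization.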