I'm trying to stream a file submitted via a form directly to an Amazon S3 bucket, using aws-sdk or knox. Form handling is done with formidable.
My question is: how do I properly use formidable with aws-sdk (or knox) using each of these libraries' latest features for handling streams?
I'm aware that this question has already been asked here in different flavors.
However, I believe the answers are a bit outdated and/or off topic (e.g. CORS support, which I don't wish to use for now for various reasons) and/or, most importantly, make no reference to the latest features of either aws-sdk (see: https://github.com/aws/aws-sdk-js/issues/13#issuecomment-16085442) or knox (notably putStream() or its readableStream.pipe(req) variant, both explained in the docs).
After hours of struggling, I came to the conclusion that I needed some help (disclaimer: I'm quite a newbie with streams).
HTML form:
<form action="/uploadPicture" method="post" enctype="multipart/form-data">
  <input name="picture" type="file" accept="image/*">
  <input type="submit">
</form>
Express bodyParser middleware is configured this way:
app.use(express.bodyParser({defer: true}))
POST request handler:
uploadPicture = (req, res, next) ->
  form = new formidable.IncomingForm()
  form.parse(req)
  form.onPart = (part) ->
    if not part.filename
      # Let formidable handle all non-file parts (fields)
      form.handlePart(part)
    else
      handlePart(part, form.bytesExpected)

handlePart = (part, fileSize) ->
  # aws-sdk version
  params =
    Bucket: "mybucket"
    Key: part.filename
    ContentLength: fileSize
    Body: part # passing stream object as body parameter
  awsS3client.putObject(params, (err, data) ->
    if err
      console.log err
    else
      console.log data
  )
However, I'm getting the following error:
{ [RequestTimeout: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.]
message: 'Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.', code: 'RequestTimeout', name: 'RequestTimeout', statusCode: 400, retryable: false }
A knox version of the handlePart() function, written as follows, also fails:
handlePart = (part, fileSize) ->
  headers =
    "Content-Length": fileSize
    "Content-Type": part.mime
  knoxS3client.putStream(part, part.filename, headers, (err, res) ->
    if err
      console.log err
    else
      console.log res
  )
I also get a big res object with a 400 statusCode somewhere.
The region is configured to eu-west-1 in both cases.
Additional notes:
node 0.10.12
latest formidable from npm (1.0.14)
latest aws-sdk from npm (1.3.1)
latest knox from npm (0.8.3)
Well, according to the creator of Formidable, direct streaming to Amazon S3 is impossible:
The S3 API requires you to provide the size of new files when creating them. This information is not available for multipart/form-data files until they have been fully received. This means streaming is impossible.
Indeed, form.bytesExpected refers to the size of the whole form, and not the size of the single file.
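To make the size mismatch concrete, here is a small sketch that builds a multipart/form-data body by hand and compares its total length with the length of the file it carries. It assumes, as far as I can tell from formidable's behaviour, that bytesExpected mirrors the request's Content-Length header; the boundary string, the file name and the file contents are made up for illustration.

```javascript
// Build a tiny multipart/form-data body by hand to compare sizes.
// The boundary, file name and payload below are hypothetical.
const boundary = "----sketchBoundary";
const fileContents = Buffer.from("fake image bytes");

const body = Buffer.concat([
  // Part headers that precede the file data
  Buffer.from(
    `--${boundary}\r\n` +
      'Content-Disposition: form-data; name="picture"; filename="cat.png"\r\n' +
      "Content-Type: image/png\r\n\r\n"
  ),
  fileContents,
  // Closing boundary that follows the file data
  Buffer.from(`\r\n--${boundary}--\r\n`),
]);

// Content-Length covers the whole body (assumed to be what bytesExpected reports),
// so it is always larger than the file itself.
const bytesExpected = body.length;
const fileSize = fileContents.length;
const overhead = bytesExpected - fileSize;

console.log({ bytesExpected, fileSize, overhead });
```

The overhead comes from the boundaries and part headers, and it grows with every extra field in the form, so passing bytesExpected as the Content-Length makes S3 wait for bytes that never arrive. That is consistent with the RequestTimeout error in the question.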
The data must therefore either hit the memory or the disk on the server first before being uploaded to S3.
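As a rough sketch of the in-memory option, the part can be collected into a Buffer first, at which point its exact length is known. This assumes a recent Node.js (Promises and Readable.from are not available on 0.10), and bufferStream and buildPutParams are hypothetical helper names, not part of formidable, aws-sdk or knox. The demo stream stands in for the formidable part.

```javascript
const { Readable } = require("stream");

// Collect every chunk of a readable stream into a single Buffer.
function bufferStream(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on("data", (chunk) => chunks.push(Buffer.from(chunk)));
    stream.on("end", () => resolve(Buffer.concat(chunks)));
    stream.on("error", reject);
  });
}

// Hypothetical helper: build putObject-style parameters once the real size is known.
function buildPutParams(bucket, key, buffer) {
  return {
    Bucket: bucket,
    Key: key,
    ContentLength: buffer.length, // exact file size, not the size of the whole form
    Body: buffer,
  };
}

// Stand-in for a file part: a stream that emits two chunks.
const demo = Readable.from([Buffer.from("abc"), Buffer.from("de")]);

const ready = bufferStream(demo).then((buf) =>
  buildPutParams("mybucket", "picture.png", buf)
);

ready.then((params) => console.log(params.ContentLength));
```

The resulting object has the same shape as the params in the question, so it could presumably be handed to awsS3client.putObject unchanged. The trade-off is that the whole file sits in memory, so for large uploads writing to a temporary file on disk may be the safer choice.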