
How to escape the name parameter in the content-disposition header?


When using HTTP/1.1 to submit a document using the multipart/form-data content type, each part must contain a Content-Disposition header with the form field name given as the name parameter.

--------------------------f0261bc90f5d4215
Content-Disposition: form-data; name="field name"

<field_data>
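
For concreteness, here is a minimal sketch of how a part like the one above could be assembled. The helper name `build_part` and its parameters are hypothetical, not taken from any library. It deliberately inserts the name verbatim, which is exactly where the escaping question arises.

```python
def build_part(boundary: str, name: str, data: bytes) -> bytes:
    """Assemble one multipart/form-data part (hypothetical helper).

    The name is inserted verbatim, with no escaping at all, so a name
    containing '"' would produce a malformed header.
    """
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n'
        "\r\n"
    )
    return header.encode("utf-8") + data + b"\r\n"


if __name__ == "__main__":
    # The boundary value here is an arbitrary placeholder.
    print(build_part("boundary123", "field name", b"value").decode("utf-8"))
```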

How should the characters in the name parameter be escaped for each part?

As an example, curl seems to use a backslash to escape " and \, but otherwise passes UTF-8 data unchanged. However, I can't find where this is specified, or which set of characters must be escaped.

$ nc -l 1234 | hexdump -C&
$ curl -s -m1 http://localhost:1234 --form 'a"a=1' --form 'béb=2'
00000000  50 4f 53 54 20 2f 20 48  54 54 50 2f 31 2e 31 0d  |POST / HTTP/1.1.|
00000010  0a 48 6f 73 74 3a 20 6c  6f 63 61 6c 68 6f 73 74  |.Host: localhost|
00000020  3a 31 32 33 34 0d 0a 55  73 65 72 2d 41 67 65 6e  |:1234..User-Agen|
00000030  74 3a 20 63 75 72 6c 2f  37 2e 35 38 2e 30 0d 0a  |t: curl/7.58.0..|
00000040  41 63 63 65 70 74 3a 20  2a 2f 2a 0d 0a 43 6f 6e  |Accept: */*..Con|
00000050  74 65 6e 74 2d 4c 65 6e  67 74 68 3a 20 32 33 34  |tent-Length: 234|
00000060  0d 0a 43 6f 6e 74 65 6e  74 2d 54 79 70 65 3a 20  |..Content-Type: |
00000070  6d 75 6c 74 69 70 61 72  74 2f 66 6f 72 6d 2d 64  |multipart/form-d|
00000080  61 74 61 3b 20 62 6f 75  6e 64 61 72 79 3d 2d 2d  |ata; boundary=--|
00000090  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
000000a0  2d 2d 2d 2d 2d 2d 37 32  65 65 63 63 37 65 39 61  |------72eecc7e9a|
000000b0  65 65 66 30 31 37 0d 0a  0d 0a 2d 2d 2d 2d 2d 2d  |eef017....------|
000000c0  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
000000d0  2d 2d 2d 2d 37 32 65 65  63 63 37 65 39 61 65 65  |----72eecc7e9aee|
000000e0  66 30 31 37 0d 0a 43 6f  6e 74 65 6e 74 2d 44 69  |f017..Content-Di|
000000f0  73 70 6f 73 69 74 69 6f  6e 3a 20 66 6f 72 6d 2d  |sposition: form-|
00000100  64 61 74 61 3b 20 6e 61  6d 65 3d 22 61 5c 22 61  |data; name="a\"a|
00000110  22 0d 0a 0d 0a 31 0d 0a  2d 2d 2d 2d 2d 2d 2d 2d  |"....1..--------|
00000120  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
00000130  2d 2d 37 32 65 65 63 63  37 65 39 61 65 65 66 30  |--72eecc7e9aeef0|
00000140  31 37 0d 0a 43 6f 6e 74  65 6e 74 2d 44 69 73 70  |17..Content-Disp|
00000150  6f 73 69 74 69 6f 6e 3a  20 66 6f 72 6d 2d 64 61  |osition: form-da|
00000160  74 61 3b 20 6e 61 6d 65  3d 22 62 c3 a9 62 22 0d  |ta; name="b..b".|
00000170  0a 0d 0a 32 0d 0a 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |...2..----------|
00000180  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
00000190  37 32 65 65 63 63 37 65  39 61 65 65 66 30 31 37  |72eecc7e9aeef017|
000001a0  2d 2d 0d 0a                                       |--..|
000001a4

Solution

  • RFC 7230 (section 3.2.6) specifies how backslash escaping works for quoted strings in HTTP headers.

    A string of text is parsed as a single value if it is quoted using double-quote marks.

     quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
     qdtext         = HTAB / SP /%x21 / %x23-5B / %x5D-7E / obs-text
     obs-text       = %x80-FF
    

    The backslash octet ("\") can be used as a single-octet quoting
    mechanism within quoted-string and comment constructs. Recipients
    that process the value of a quoted-string MUST handle a quoted-pair
    as if it were replaced by the octet following the backslash.

     quoted-pair    = "\" ( HTAB / SP / VCHAR / obs-text )
    

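    This explains how backslash escaping works. Since qdtext excludes only %x22 (") and %x5C (\) from the printable range, those are the two characters that must be written as a quoted-pair. Note also that both qdtext and quoted-pair admit obs-text (octets %x80-FF) in addition to ASCII characters, which is consistent with curl passing UTF-8 bytes through unchanged.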
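
The grammar above can be sketched in Python as a pair of helpers. This is an illustration under stated assumptions, not a reference implementation: the names `quote_param` and `unquote_param` are hypothetical, the functions work on raw octets because the grammar is defined on octets, and the decoder does not validate every octet against qdtext.

```python
def quote_param(value: bytes) -> bytes:
    """Encode octets as a quoted-string (hypothetical helper)."""
    out = bytearray(b'"')
    for octet in value:
        if octet in (0x22, 0x5C):
            # DQUOTE and backslash are not qdtext, so emit a quoted-pair.
            out += b"\\"
        elif octet != 0x09 and (octet < 0x20 or octet == 0x7F):
            # Control octets other than HTAB fit neither qdtext nor quoted-pair.
            raise ValueError(f"octet {octet:#04x} cannot appear in a quoted-string")
        out.append(octet)
    out += b'"'
    return bytes(out)


def unquote_param(quoted: bytes) -> bytes:
    """Decode a quoted-string, replacing each quoted-pair by the octet
    that follows the backslash (hypothetical helper)."""
    if len(quoted) < 2 or quoted[:1] != b'"' or quoted[-1:] != b'"':
        raise ValueError("not a quoted-string")
    body = quoted[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] == 0x5C:
            i += 1  # skip the backslash and keep the next octet literally
            if i == len(body):
                raise ValueError("dangling backslash")
        out.append(body[i])
        i += 1
    return bytes(out)


if __name__ == "__main__":
    print(quote_param(b'a"a'))                 # b'"a\\"a"'
    print(quote_param("béb".encode("utf-8")))  # b'"b\xc3\xa9b"'
```

Both results match the bytes curl sent in the question: the double quote is preceded by a backslash, and the UTF-8 octets for "é" pass through as obs-text.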