Multipart Forms and Boundary Parameters
Multipart/Form-Data Example
Example Web Form
Consider the following web form...
File Upload Form HTML
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>File Upload Form</title>
</head>
<body>
    <form action="/upload.php" method="post" enctype="multipart/form-data">
        <!-- File Input -->
        <label for="fileInput">Choose a file:</label>
        <input type="file" id="fileInput" name="file" accept=".jpg, .jpeg, .png">
        <!-- Hidden Input for upload_id -->
        <input type="hidden" name="upload_id" value="your_generated_uuid_here">
        <br><br>
        <!-- Submit Button -->
        <button type="submit">Upload File</button>
    </form>
</body>
</html>
Looking at the web form source code, there are two input points:
- file -- the file uploaded by the user
- upload_id -- a hidden field for a randomized UUID to identify the transaction
The enctype="multipart/form-data" attribute of the <form> element informs us of the content type.
Client Workflow
The workflow should look something like this:
- Client chooses a file
- Client clicks the Upload File button
- Client web browser creates a raw byte stream of the input file
- Client web browser generates a unique multipart/form-data boundary
- Client web browser submits an HTTP POST request to http://domain.tld/upload.php
The resulting web request would look something like this:
Complete Client Web Request
POST /upload.php HTTP/1.1
Host: 127.0.0.1
Content-Length: 15947
Cache-Control: max-age=0
sec-ch-ua: "Chromium";v="119", "Not?A_Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
Origin: http://127.0.0.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.199 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: http://127.0.0.1/test.html
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: close

------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW PNG
FILE BYTES
GO HERE
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="upload_id"

1485fa67-4c0e-49c7-b136-75a09c61ede0
------WebKitFormBoundaryBODBNK9vWWeDNOP1--
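To make the shape of that request body concrete, here is a minimal Python sketch that assembles a similar body by hand. The helper name build_multipart and the boundary prefix ----ExampleFormBoundary are assumptions made for illustration; real browsers generate boundaries with their own schemes, and the PNG bytes are placeholders.

```python
import uuid

CRLF = "\r\n"

def build_multipart(fields, boundary):
    """Encode (name, filename, content_type, data) tuples as multipart/form-data bytes."""
    chunks = []
    for name, filename, content_type, data in fields:
        disposition = f'Content-Disposition: form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        # Each part opens with "--" followed by the boundary value.
        lines = [f"--{boundary}", disposition]
        if content_type is not None:
            lines.append(f"Content-Type: {content_type}")
        # A blank line separates the part headers from the part data.
        chunks.append((CRLF.join(lines) + CRLF + CRLF).encode("utf-8"))
        chunks.append(data + CRLF.encode("ascii"))
    # Closing delimiter: the same delimiter with two extra trailing hyphens.
    chunks.append(f"--{boundary}--{CRLF}".encode("ascii"))
    return b"".join(chunks)

# Hypothetical boundary; any value that does not occur in the data would do.
boundary = "----ExampleFormBoundary" + uuid.uuid4().hex[:16]
body = build_multipart(
    [
        ("file", "picture.png", "image/png", b"\x89PNG...placeholder bytes..."),
        ("upload_id", None, None, b"1485fa67-4c0e-49c7-b136-75a09c61ede0"),
    ],
    boundary,
)
content_type_header = f"multipart/form-data; boundary={boundary}"
```

The value of content_type_header is what would go in the Content-Type request header, and len(body) is what would go in Content-Length.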
Understanding the Request
Request Headers
All Request Headers
Host: 127.0.0.1
Content-Length: 15947
Cache-Control: max-age=0
sec-ch-ua: "Chromium";v="119", "Not?A_Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
Origin: http://127.0.0.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.199 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: http://127.0.0.1/test.html
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: close
When the user selects their file and clicks the Upload File button, the web browser generates a unique boundary which is identified in the HTTP request headers.
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
It is this boundary, ----WebKitFormBoundaryBODBNK9vWWeDNOP1, that separates the data submitted with the request.
Recall that the web form above had two input fields, file and upload_id, one visible and one hidden. Those are two distinct inputs that are separated by this boundary ID.
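On the receiving side, the first step is to pull the boundary value out of the Content-Type header. The sketch below uses Python's standard email.message.Message class as a convenient header parser; the helper name boundary_from_content_type is an assumption for illustration, and a real web framework would normally do this for you.

```python
from email.message import Message

def boundary_from_content_type(header_value):
    """Return the boundary parameter of a Content-Type header, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_boundary()

header = "multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1"
boundary = boundary_from_content_type(header)
print(boundary)
```

Note that the extracted value keeps its four leading hyphens: they are part of the boundary itself, not part of the delimiter syntax.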
Request Body
Form Fields and Values
Entire Multipart/Form-Data Body
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW PNG
FILE BYTES
GO HERE
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="upload_id"

1485fa67-4c0e-49c7-b136-75a09c61ede0
------WebKitFormBoundaryBODBNK9vWWeDNOP1--
If you look at the request body -- as pertains to this example -- you'll see three instances of the multipart form boundary ID.
First Boundary
Indicates the start of the file data
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW PNG
FILE BYTES
GO HERE
- Contains a Content-Disposition header indicating:
  - form-data from the web form falls below this boundary
  - The name of the field from the form this data comes from, file in this case
  - And, the file name
- It also contains a Content-Type header advising the nature of the raw data just below
RFC 2388
"multipart/form-data" contains a series of parts. Each part is expected to contain a content-disposition header [RFC 2183] where the disposition type is "form-data", and where the disposition contains an (additional) parameter of "name", where the value of that parameter is the original field name in the form.
Second Boundary
Indicates the start of the upload_id data
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="upload_id"

1485fa67-4c0e-49c7-b136-75a09c61ede0
- Again, contains a Content-Disposition header indicating:
  - form-data from the web form falls below this boundary
  - The name of the field from the form this data comes from, upload_id in this case
- This boundary does not contain a Content-Type header
  - This is because, when this header is not provided, it defaults to text/plain, which is fine for this case
RFC 2388
As with all multipart MIME types, each part has an optional "Content-Type", which defaults to text/plain. If the contents of a file are returned via filling out a form, then the file input is identified as the appropriate media type, if known, or "application/octet-stream".
Final Boundary
Indicates the end of the form submission data
------WebKitFormBoundaryBODBNK9vWWeDNOP1--
Section Summary
- For every field in the form, there should be a boundary with a Content-Disposition header to indicate the field name.
- There is an optional Content-Type header to indicate the type of input that was passed to the form.
- The Content-Type header defaults to text/plain if not provided.
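The three points above can be observed with a short Python sketch. It leans on the standard library's email package, which is a MIME mail parser rather than a dedicated form-data parser, so treat it as an illustration under that assumption and not as how a web server actually handles uploads. The file data is a text placeholder.

```python
from email import message_from_bytes

BOUNDARY = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"

body = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="file"; filename="picture.png"\r\n'
    "Content-Type: image/png\r\n"
    "\r\n"
    "RAW PNG FILE BYTES GO HERE\r\n"
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="upload_id"\r\n'
    "\r\n"
    "1485fa67-4c0e-49c7-b136-75a09c61ede0\r\n"
    f"--{BOUNDARY}--\r\n"
).encode("ascii")

# The email parser expects headers first, so prepend the request's Content-Type.
header = f"Content-Type: multipart/form-data; boundary={BOUNDARY}\r\n\r\n"
msg = message_from_bytes(header.encode("ascii") + body)

fields = []
for part in msg.get_payload():
    name = part.get_param("name", header="content-disposition")
    # get_content_type() falls back to text/plain when the part has no Content-Type.
    fields.append(
        (name, part.get_filename(), part.get_content_type(), part.get_payload(decode=True))
    )

for name, filename, content_type, data in fields:
    print(name, filename, content_type, len(data))
```

The upload_id part carries no Content-Type header, yet it is reported as text/plain, which matches the default described in the RFC.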
Boundaries
Purpose of Boundaries
RFC 2388
As with other multipart types, a boundary is selected that does not occur in any of the data. Each field of the form is sent, in the order defined by the sending application and form, as a part of the multipart stream. Each part identifies the INPUT name within the original form. Each part should be labelled with an appropriate content-type if the media type is known (e.g., inferred from the file extension or operating system typing information) or as "application/octet-stream".
There should be one boundary for each field of the web form
- As noted with the example web form shown above, there are two form-data boundaries and one boundary to indicate the end of the form submission
- The boundaries are ordered based on the order of the fields of the web form
- Each boundary identifies the field name from the form in the Content-Disposition header
- The Content-Type header should be used in the boundary if known
  - Can be inferred from the file extension
  - If the content type is unknown, send as a byte stream -- application/octet-stream
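The rule about inferring the content type from the file extension, with application/octet-stream as the fallback, can be sketched with Python's standard mimetypes module. The helper name part_content_type and the second file name are assumptions for illustration.

```python
import mimetypes

def part_content_type(filename):
    """Guess a part's Content-Type from its file name, falling back to a byte stream."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"

print(part_content_type("picture.png"))
print(part_content_type("mystery_upload"))
```

A guess based on the extension says nothing about the actual bytes, so a server should not trust this header on its own when validating uploads.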
Defining Boundaries
RFC 1341
The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.
The beginning of a multipart/form-data boundary is indicated by --
In our example, the delimiter ------WebKitFormBoundaryBODBNK9vWWeDNOP1 has six hyphens: the two required leading hyphens, followed by the boundary parameter value, ----WebKitFormBoundaryBODBNK9vWWeDNOP1, which itself begins with four hyphens. Hyphens are ordinary characters in a boundary value, so more than two hyphens at the start of the line is acceptable.
- As stated before, each boundary identifies a field name in a web form
- Each boundary should start with two (2) hyphens --
- All of the boundaries should use the boundary identity specified in the Content-Type request header
  - In the case of the web form example, the browser assigned the boundary ID of Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
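As a small worked example of the rule quoted from RFC 1341, the Python snippet below derives the delimiter lines from the boundary parameter value. The variable names are illustrative only.

```python
# Boundary parameter value exactly as it appears in the Content-Type header.
boundary = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"

# Delimiter between parts: two hyphens followed by the boundary value.
delimiter = "--" + boundary
# Closing delimiter: the same line with two more hyphens at the end.
closing_delimiter = delimiter + "--"

# Count the hyphens at the start of the delimiter line.
leading_hyphens = len(delimiter) - len(delimiter.lstrip("-"))

print(delimiter)
print(closing_delimiter)
print(leading_hyphens)
```

The count comes out as six: two from the delimiter syntax plus four that belong to the boundary value chosen by the browser.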
Demarcating Boundaries
RFC 1341
Note that the encapsulation boundary must occur at the beginning of a line, i.e., following a CRLF, and that that initial CRLF is considered to be part of the encapsulation boundary rather than part of the preceding part. The boundary must be followed immediately either by another CRLF and the header fields for the next part, or by two CRLFs, in which case there are no header fields for the next part (and it is therefore assumed to be of Content-Type text/plain).
POST /upload.php HTTP/1.1
...
...
Content-Length: 15947
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
...
...
__________________________________________________________________________|2 CRLF (no preamble) AND START OF FIRST BOUNDARY
------WebKitFormBoundaryBODBNK9vWWeDNOP1__________________________________|CRLF AND START OF HEADERS
Content-Disposition: form-data; name="file"; filename="picture.png"_______|REQUIRED HEADER
Content-Type: image/png___________________________________________________|OPTIONAL HEADER
__________________________________________________________________________|2 CRLF AND START OF DATA
RAW DATA
GOES HERE
- We designate the boundary with ------WebKitFormBoundaryBODBNK9vWWeDNOP1 and then provide CRLF (carriage return line feed, a.k.a. new line)
- On the new line just below the boundary ID, we should provide the Content-Disposition header
- On the line just below here, we should optionally provide a Content-Type header
  - If the Content-Type header is not provided, it is assumed to be Content-Type: text/plain
- We should provide another CRLF and pass in the form content
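Read in reverse, these rules describe how a receiver can take the body apart again. The following Python sketch is a simplified splitter written for this example: the function name split_parts is an assumption, it ignores edge cases such as folded headers or transport padding after the boundary, and it is not a substitute for a real form-data parser.

```python
CRLF = b"\r\n"

def split_parts(body, boundary):
    """Split a multipart body into (headers, data) pairs (simplified sketch)."""
    delimiter = b"--" + boundary.encode("ascii")
    # The CRLF before each delimiter belongs to the delimiter, not to the data.
    # Prepending one CRLF lets the first delimiter match the same pattern.
    segments = (CRLF + body).split(CRLF + delimiter)
    parts = []
    for segment in segments[1:]:       # segments[0] is the preamble (empty here)
        if segment.startswith(b"--"):  # closing delimiter: no more parts
            break
        # Headers end at the first blank line; everything after it is data.
        header_block, _, data = segment.partition(CRLF + CRLF)
        headers = {"content-type": "text/plain"}  # default when the header is absent
        for line in header_block.split(CRLF):
            if not line:
                continue
            name, _, value = line.partition(b":")
            headers[name.strip().lower().decode("ascii")] = value.strip().decode("utf-8")
        parts.append((headers, data))
    return parts

boundary = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"
body = (
    b"------WebKitFormBoundaryBODBNK9vWWeDNOP1\r\n"
    b'Content-Disposition: form-data; name="file"; filename="picture.png"\r\n'
    b"Content-Type: image/png\r\n"
    b"\r\n"
    b"RAW DATA\r\nGOES HERE\r\n"
    b"------WebKitFormBoundaryBODBNK9vWWeDNOP1\r\n"
    b'Content-Disposition: form-data; name="upload_id"\r\n'
    b"\r\n"
    b"1485fa67-4c0e-49c7-b136-75a09c61ede0\r\n"
    b"------WebKitFormBoundaryBODBNK9vWWeDNOP1--\r\n"
)
parts = split_parts(body, boundary)
```

Because the CRLF in front of a delimiter is treated as part of the delimiter, the data of the first part keeps its internal line break but has no trailing CRLF.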
Additional Rules for Boundaries
RFC 1341
The requirement that the encapsulation boundary begins with a CRLF implies that the body of a multipart entity must itself begin with a CRLF before the first encapsulation line -- that is, if the "preamble" area is not used, the entity headers must be followed by TWO CRLFs.
POST /upload.php HTTP/1.1
...
...
Content-Length: 15947
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
...
...
__________________________________________________________________________|2 CRLF BEFORE FIRST BOUNDARY IF NO PREAMBLE
------WebKitFormBoundaryBODBNK9vWWeDNOP1 |1 CRLF BEFORE FIRST BOUNDARY IF PREAMBLE
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png
RAW DATA
GOES HERE
RFC 1341
Encapsulation boundaries must not appear within the encapsulations, and must be no longer than 70 characters, not counting the two leading hyphens.
The encapsulation boundary following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter is identical to the previous delimiters, with the addition of two more hyphens at the end of the line
- The boundary ID in our example is ------WebKitFormBoundaryBODBNK9vWWeDNOP1
  - This delimiter line is 40 characters long. The boundary value itself is 38 characters; the maximum allowed length is 70 characters, not counting the two leading hyphens (--)
  - This boundary ID must not appear anywhere in the form data
- The final boundary ID in our example is ------WebKitFormBoundaryBODBNK9vWWeDNOP1--
  - It must match the starting form boundary and include two (2) hyphens at the end (--)
  - The two hyphens indicate that this is the end of the form data
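These two constraints, the length limit and the requirement that the boundary never occurs in the data, are easy to check in code. The Python sketch below is one possible check; the function name check_boundary is an assumption, and searching for the bare boundary value anywhere in the data is deliberately stricter than the RFC, which only forbids the full delimiter line.

```python
def check_boundary(boundary, payloads):
    """Return a list of problems found with a proposed boundary value."""
    problems = []
    # The limit applies to the value itself, not counting the two leading hyphens
    # that are added when the delimiter line is written.
    if not 1 <= len(boundary) <= 70:
        problems.append("boundary value must be 1 to 70 characters long")
    needle = boundary.encode("ascii")
    # Conservative check: reject the boundary if it occurs anywhere in any field.
    if any(needle in payload for payload in payloads):
        problems.append("boundary value appears inside the form data")
    return problems

boundary = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"
payloads = [b"RAW PNG FILE BYTES GO HERE", b"1485fa67-4c0e-49c7-b136-75a09c61ede0"]

print(len(boundary))
print(check_boundary(boundary, payloads))
```

A sender that finds a collision would simply generate a new boundary and check again before writing the request.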