Multipart Forms and Boundary Parameters

Multipart/Form-Data Example

Example Web Form

Consider the following web form...

[Image: the example web form]

File Upload Form HTML
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>File Upload Form</title>
</head>
<body>

<form action="/upload.php" method="post" enctype="multipart/form-data">
    <!-- File Input -->
    <label for="fileInput">Choose a file:</label>
    <input type="file" id="fileInput" name="file" accept=".jpg, .jpeg, .png">

    <!-- Hidden Input for upload_id -->
    <input type="hidden" name="upload_id" value="your_generated_uuid_here">

    <br><br>

    <!-- Submit Button -->
    <button type="submit">Upload File</button>
</form>

</body>
</html>

 

Looking at the web form source code, there are two input points:

  • file -- the file uploaded by the user
  • upload_id -- a hidden field for a randomized UUID to identify the transaction

The enctype="multipart/form-data" attribute of the <form> element tells the browser how to encode the submission, and therefore which Content-Type the request will carry.

Client Workflow

The workflow should look something like this:

  1. Client chooses a file
  2. Client clicks the Upload File button
  3. Client web browser reads the input file as a raw byte stream
  4. Client web browser generates a unique multipart/form-data boundary
  5. Client web browser submits an HTTP POST request to http://domain.tld/upload.php

The resulting web request would look something like this:

Complete Client Web Request
POST /upload.php HTTP/1.1
Host: 127.0.0.1
Content-Length: 15947
Cache-Control: max-age=0
sec-ch-ua: "Chromium";v="119", "Not?A_Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
Origin: http://127.0.0.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.199 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: http://127.0.0.1/test.html
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: close

------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW PNG
FILE BYTES
GO HERE
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="upload_id"

1485fa67-4c0e-49c7-b136-75a09c61ede0
------WebKitFormBoundaryBODBNK9vWWeDNOP1--
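
To make the request layout concrete, here is a minimal Python sketch of how a client could assemble a body like the one above. The helper name encode_multipart and the placeholder PNG bytes are assumptions made for illustration; real browsers and HTTP libraries also escape special characters in field names and file names, which this sketch does not attempt.

```python
CRLF = b"\r\n"

def encode_multipart(fields, boundary):
    """Serialise (name, filename, content_type, data) tuples as multipart/form-data.

    Hypothetical helper for illustration: it does no escaping of names.
    """
    delimiter = b"--" + boundary.encode("ascii")
    body = bytearray()
    for name, filename, content_type, data in fields:
        disposition = f'Content-Disposition: form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        body += delimiter + CRLF
        body += disposition.encode("utf-8") + CRLF
        if content_type is not None:
            body += f"Content-Type: {content_type}".encode("ascii") + CRLF
        body += CRLF         # blank line ends the part headers
        body += data + CRLF  # line break before the next delimiter
    body += delimiter + b"--" + CRLF  # closing delimiter
    return bytes(body)

boundary = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"
fields = [
    ("file", "picture.png", "image/png", b"\x89PNG placeholder bytes"),
    ("upload_id", None, None, b"1485fa67-4c0e-49c7-b136-75a09c61ede0"),
]
body = encode_multipart(fields, boundary)
headers = {
    "Content-Type": f"multipart/form-data; boundary={boundary}",
    "Content-Length": str(len(body)),
}
```

The Content-Length value is the byte length of the finished body, so the body has to be assembled before the headers can be completed.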

 

Understanding the Request

Request Headers

All Request Headers
Host: 127.0.0.1
Content-Length: 15947
Cache-Control: max-age=0
sec-ch-ua: "Chromium";v="119", "Not?A_Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Upgrade-Insecure-Requests: 1
Origin: http://127.0.0.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.199 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
Sec-Fetch-Site: same-origin
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Referer: http://127.0.0.1/test.html
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Connection: close

When the user selects their file and clicks the Upload File button, the web browser generates a unique boundary and declares it in the Content-Type request header.

Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1

This boundary value, ----WebKitFormBoundaryBODBNK9vWWeDNOP1, is what separates the individual pieces of data submitted with the request.

Recall that the web form above has two input fields: file (visible) and upload_id (hidden).

Each is a distinct input, and the boundary keeps them apart in the request body.
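
As a rough illustration of how a receiver might read the boundary out of the Content-Type header, the sketch below uses Python's standard email.message.Message class. The function name boundary_from_content_type is hypothetical, and a real server framework would normally do this step for you.

```python
from email.message import Message

def boundary_from_content_type(header_value):
    """Return the boundary parameter of a multipart/form-data Content-Type value."""
    msg = Message()
    msg["Content-Type"] = header_value
    if msg.get_content_type() != "multipart/form-data":
        raise ValueError("not a multipart/form-data request")
    boundary = msg.get_boundary()
    if boundary is None:
        raise ValueError("Content-Type has no boundary parameter")
    return boundary

header_value = "multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1"
boundary = boundary_from_content_type(header_value)
print(boundary)
```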

Request Body

Form Fields and Values

Entire Multipart/Form-Data Body
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW PNG
FILE BYTES
GO HERE
------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="upload_id"

1485fa67-4c0e-49c7-b136-75a09c61ede0
------WebKitFormBoundaryBODBNK9vWWeDNOP1--

The request body in this example contains three instances of the multipart form boundary ID.

First Boundary

Indicates the start of the file data

------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW PNG
FILE BYTES
GO HERE
  • Contains a Content-Disposition header indicating:
    • That this part carries form-data from the web form
    • The name of the form field this data comes from, file in this case
    • The name of the uploaded file, picture.png
  • It also contains a Content-Type header describing the raw data that follows
RFC 2388

"multipart/form-data" contains a series of parts. Each part is expected to contain a content-disposition header [RFC 2183] where the disposition type is "form-data", and where the disposition contains an (additional) parameter of "name", where the value of that parameter is the original field name in the form.

https://www.ietf.org/rfc/rfc2388.txt

Second Boundary

Indicates the start of the upload_id data

------WebKitFormBoundaryBODBNK9vWWeDNOP1
Content-Disposition: form-data; name="upload_id"

1485fa67-4c0e-49c7-b136-75a09c61ede0
  • Again, contains a Content-Disposition header indicating:
    • That this part carries form-data from the web form
    • The name of the form field this data comes from, upload_id in this case
  • This part does not contain a Content-Type header
    • When this header is omitted, the content type defaults to text/plain, which is fine for this case
RFC 2388

As with all multipart MIME types, each part has an optional "Content-Type", which defaults to text/plain. If the contents of a file are returned via filling out a form, then the file input is identified as the appropriate media type, if known, or "application/octet-stream".

https://www.ietf.org/rfc/rfc2388.txt

Final Boundary

Indicates the end of the form submission data

------WebKitFormBoundaryBODBNK9vWWeDNOP1--

Section Summary

  • For every field in the form, there should be a boundary with a Content-Disposition header to indicate the field name.
  • There is an optional Content-Type header to indicate the type of input that was passed to the form. 
  • The Content-Type header defaults to text/plain if not provided.
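
The summary above can be checked with a short sketch that parses the example body using Python's standard email package, which understands MIME multipart structure. This is only a stand-in for a real server-side form parser; the describe_parts helper and the placeholder file bytes are assumptions for illustration.

```python
from email import message_from_bytes

BOUNDARY = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"

body = (
    b"------WebKitFormBoundaryBODBNK9vWWeDNOP1\r\n"
    b'Content-Disposition: form-data; name="file"; filename="picture.png"\r\n'
    b"Content-Type: image/png\r\n"
    b"\r\n"
    b"PLACEHOLDER-BYTES\r\n"
    b"------WebKitFormBoundaryBODBNK9vWWeDNOP1\r\n"
    b'Content-Disposition: form-data; name="upload_id"\r\n'
    b"\r\n"
    b"1485fa67-4c0e-49c7-b136-75a09c61ede0\r\n"
    b"------WebKitFormBoundaryBODBNK9vWWeDNOP1--\r\n"
)

def describe_parts(body, boundary):
    """Parse a multipart/form-data body and list each part's metadata."""
    # The MIME parser needs the Content-Type header to know the boundary.
    header = f"Content-Type: multipart/form-data; boundary={boundary}\r\n\r\n"
    message = message_from_bytes(header.encode("ascii") + body)
    parts = []
    for part in message.get_payload():
        parts.append({
            "name": part.get_param("name", header="content-disposition"),
            "filename": part.get_filename(),
            # Falls back to text/plain when the part has no Content-Type.
            "content_type": part.get_content_type(),
            "data": part.get_payload(decode=True),
        })
    return parts

parts = describe_parts(body, BOUNDARY)
for part in parts:
    print(part["name"], part["filename"], part["content_type"])
```

The upload_id part has no Content-Type header of its own, so the parser reports the text/plain default for it.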

Boundaries

Purpose of Boundaries

RFC 2388

As with other multipart types, a boundary is selected that does not occur in any of the data. Each field of the form is sent, in the order defined by the sending application and form, as a part of the multipart stream.  Each part identifies the INPUT name within the original form. Each part should be labelled with an appropriate content-type if the media type is known (e.g., inferred from the file extension or operating system typing information) or as "application/octet-stream".

https://www.ietf.org/rfc/rfc2388.txt

There should be one boundary per field of the web form.

  • As noted with the example web form shown above, there are two form-data boundaries and one boundary to indicate the end of the form submission
  • The parts are sent in the order of the fields in the web form
  • Each part identifies the field name from the form in its Content-Disposition header
  • A Content-Type header should be included in the part if the media type is known
    • It can be inferred from the file extension
    • If the content type is unknown, send the data as a byte stream: application/octet-stream
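
Inferring the media type from the file extension, with the byte-stream fallback, might look like the following sketch. It uses Python's standard mimetypes module; the function name part_content_type is hypothetical, and browsers use their own lookup tables rather than this one.

```python
import mimetypes

def part_content_type(filename):
    """Guess a part's Content-Type from the file name, else fall back to a byte stream."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"

print(part_content_type("picture.png"))
print(part_content_type("upload"))  # no extension, so the fallback is used
```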

Defining Boundaries

RFC 1341

The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.

https://datatracker.ietf.org/doc/html/rfc1341

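Each boundary line in the body begins with two hyphens (--), followed by the boundary value from the Content-Type header.
In our example the boundary value itself, ----WebKitFormBoundaryBODBNK9vWWeDNOP1, already starts with four hyphens, which is why the lines in the body show six: the two required hyphens plus the four that belong to the value. Hyphens are ordinary boundary characters, so this is valid.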

  • As stated before, each boundary identifies a field name in a web form
  • Each boundary should start with two (2) hyphens --
  • All of the boundaries should use the boundary identity specified in the Content-Type request header
    • In the case of the web form example, the browser assigned the boundary ID of
      Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
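
A small sketch of how the lines in the body are derived from the header value, using the boundary from the example request. The helper leading_hyphens exists only for this illustration.

```python
boundary = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"  # value from the Content-Type header

def leading_hyphens(text):
    """Count the hyphens at the start of a string."""
    return len(text) - len(text.lstrip("-"))

delimiter = "--" + boundary          # line that starts each part
closing_delimiter = delimiter + "--"  # line that ends the whole body

print(leading_hyphens(boundary))   # 4: these belong to the boundary value
print(leading_hyphens(delimiter))  # 6: two required hyphens plus the four above
print(closing_delimiter)
```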


Demarcating Boundaries

RFC 1341

Note that the encapsulation boundary must occur at the beginning of a line, i.e., following a CRLF, and that that initial CRLF is considered to be part of the encapsulation boundary rather than part of the preceding part. The boundary must be followed immediately either by another CRLF and the header fields for the next part, or by two CRLFs, in which case there are no header fields for the next part (and it is therefore assumed to be of Content-Type text/plain).

https://datatracker.ietf.org/doc/html/rfc1341

POST /upload.php HTTP/1.1
...
...
Content-Length: 15947
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
...
...
__________________________________________________________________________|2 CRLF (no preamble) AND START OF FIRST BOUNDARY
------WebKitFormBoundaryBODBNK9vWWeDNOP1__________________________________|CRLF AND START OF HEADERS
Content-Disposition: form-data; name="file"; filename="picture.png"_______|REQUIRED HEADER
Content-Type: image/png___________________________________________________|OPTIONAL HEADER
__________________________________________________________________________|2 CRLF AND START OF DATA
RAW DATA
GOES HERE
  • We mark the start of the part with ------WebKitFormBoundaryBODBNK9vWWeDNOP1 followed by a CRLF (carriage return + line feed, i.e. a line break)
  • On the line just below the boundary, we provide the Content-Disposition header
  • On the next line, we can optionally provide a Content-Type header
    • If the Content-Type header is not provided, the part is assumed to be Content-Type: text/plain
  • We then send a blank line (a second CRLF) to end the headers, followed by the form content
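
Read from the receiving side, the same layout means a part's headers end at the first blank line. The sketch below splits one part on that blank line; split_part is a hypothetical helper, and it assumes the part carries at least one header, as form-data parts do with Content-Disposition.

```python
def split_part(raw_part):
    """Split one part into (headers dict, data) at the first blank line."""
    header_block, separator, data = raw_part.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("part has no blank line between headers and data")
    headers = {}
    for line in header_block.split(b"\r\n"):
        name, _, value = line.decode("utf-8").partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, data

# The text between two boundary lines, without the boundary lines themselves.
raw_part = (
    b'Content-Disposition: form-data; name="file"; filename="picture.png"\r\n'
    b"Content-Type: image/png\r\n"
    b"\r\n"
    b"RAW DATA\r\nGOES HERE"
)
headers, data = split_part(raw_part)
# Apply the text/plain default when the optional header is absent.
content_type = headers.get("content-type", "text/plain")
```

Only the first blank line is significant: any CRLF sequences after it belong to the data, which matters for binary uploads.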

Additional Rules for Boundaries

RFC 1341

The requirement that the encapsulation boundary begins with a CRLF implies that the body of a multipart entity must itself begin with a CRLF before the first encapsulation line -- that is, if the "preamble" area is not used, the entity headers must be followed by TWO CRLFs.

https://datatracker.ietf.org/doc/html/rfc1341

POST /upload.php HTTP/1.1
...
...
Content-Length: 15947
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryBODBNK9vWWeDNOP1
...
...
__________________________________________________________________________|2 CRLF BEFORE FIRST BOUNDARY IF NO PREAMBLE
------WebKitFormBoundaryBODBNK9vWWeDNOP1                                  |1 CRLF BEFORE FIRST BOUNDARY IF PREAMBLE
Content-Disposition: form-data; name="file"; filename="picture.png"
Content-Type: image/png

RAW DATA
GOES HERE

RFC 1341

Encapsulation boundaries must not appear within the encapsulations, and must be no longer than 70 characters, not counting the two leading hyphens.

The encapsulation boundary following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter is identical to the previous delimiters, with the addition of two more hyphens at the end of the line.

  • The boundary line in our example is ------WebKitFormBoundaryBODBNK9vWWeDNOP1
    • The line is 40 characters long. Without the two leading hyphens, the boundary value is 38 characters, well under the 70-character maximum, which does not count those two hyphens
    • This boundary must not appear anywhere in the form data
  • The final boundary line in our example is ------WebKitFormBoundaryBODBNK9vWWeDNOP1--
    • It must match the preceding boundary lines and add two (2) hyphens at the end ( -- )
    • The two trailing hyphens indicate that this is the end of the form data
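
These rules can be expressed as a simple check. The following sketch validates a candidate boundary against the length limit and the form data, and picks a random one that passes; the function names and the ----ExampleFormBoundary prefix are assumptions for illustration, not how any particular browser generates its boundaries.

```python
import secrets

MAX_BOUNDARY_LENGTH = 70  # RFC 1341 limit, not counting the two leading hyphens

def is_usable_boundary(boundary, payloads):
    """Check the length limit and that the boundary never occurs in the data."""
    if not 1 <= len(boundary) <= MAX_BOUNDARY_LENGTH:
        return False
    needle = boundary.encode("ascii")
    return all(needle not in payload for payload in payloads)

def choose_boundary(payloads):
    """Pick a random boundary, retrying in the unlikely event of a collision."""
    while True:
        candidate = "----ExampleFormBoundary" + secrets.token_hex(8)
        if is_usable_boundary(candidate, payloads):
            return candidate

payloads = [b"\x89PNG placeholder bytes", b"1485fa67-4c0e-49c7-b136-75a09c61ede0"]
example = "----WebKitFormBoundaryBODBNK9vWWeDNOP1"
print(len(example), is_usable_boundary(example, payloads))
print(choose_boundary(payloads))
```

Checking for the bare boundary value is stricter than necessary, since only a line starting with the two hyphens plus the value would be misread as a delimiter, but it matches the rule as stated above.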