Commit 5eb8184

feat: add multipart/form-data parser support
Add support for parsing multipart/form-data request bodies. The parser extracts text fields and automatically drops file fields, following the design pattern discussed in issue #88.

The implementation uses the existing read utility (no external dependencies) and follows the same architecture as other parsers in the codebase. It validates individual field sizes and supports per-field verification callbacks.

Closes #88
Addresses #258
1 parent 03f17c2 commit 5eb8184
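
To make the behaviour described in the commit message concrete, here is a minimal sketch of splitting a multipart body on its boundary, keeping text fields and dropping file fields. It is a simplified illustration and not the code from this commit: `extractTextFields`, `exampleBody`, and the boundary `XYZ` are hypothetical names, and the sketch assumes CRLF line endings and quoted field names.

```javascript
// Hypothetical, simplified illustration; not the parser added by this commit.
function extractTextFields (body, boundary) {
  const fields = {}
  const segments = body.split('--' + boundary)
  // The first segment is the preamble and the last one is the closing "--" marker.
  for (const segment of segments.slice(1, -1)) {
    const sep = segment.indexOf('\r\n\r\n')
    if (sep === -1) continue // no header/body separator: skip the part
    const headers = segment.slice(0, sep)
    // Drop the CRLF that precedes the next boundary delimiter.
    const value = segment.slice(sep + 4).replace(/\r\n$/, '')
    if (/filename\s*=/i.test(headers)) continue // file field: dropped
    const name = /\bname="([^"]+)"/i.exec(headers)
    if (name) fields[name[1]] = value
  }
  return fields
}

const exampleBody = [
  '--XYZ',
  'Content-Disposition: form-data; name="description"',
  '',
  'holiday photo',
  '--XYZ',
  'Content-Disposition: form-data; name="photo"; filename="a.png"',
  'Content-Type: image/png',
  '',
  '(binary data)',
  '--XYZ--',
  ''
].join('\r\n')

const exampleFields = extractTextFields(exampleBody, 'XYZ')
console.log(exampleFields)
```

Only the `description` field survives in `exampleFields`; the `photo` part carries a `filename` parameter and is skipped.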

4 files changed: +590 -3 lines changed

README.md

Lines changed: 61 additions & 3 deletions
@@ -19,9 +19,9 @@ and `toString` may not be a function and instead a string or other user input.
 
 [Learn about the anatomy of an HTTP transaction in Node.js](https://nodejs.org/en/learn/http/anatomy-of-an-http-transaction).
 
-_This does not handle multipart bodies_, due to their complex and typically
-large nature. For multipart bodies, you may be interested in the following
-modules:
+_This module provides basic multipart/form-data support for text fields only._
+File fields are automatically dropped. For full file upload support, you may be
+interested in the following modules:
 
 * [busboy](https://www.npmjs.com/package/busboy#readme) and
   [connect-busboy](https://www.npmjs.com/package/connect-busboy#readme)
@@ -33,6 +33,7 @@ modules:
 This module provides the following parsers:
 
 * [JSON body parser](#bodyparserjsonoptions)
+* [Multipart/form-data body parser](#bodyparsermultipartoptions)
 * [Raw body parser](#bodyparserrawoptions)
 * [Text body parser](#bodyparsertextoptions)
 * [URL-encoded form body parser](#bodyparserurlencodedoptions)
@@ -300,6 +301,54 @@ form. Defaults to `false`.
 
 The `depth` option is used to configure the maximum depth of the `qs` library when `extended` is `true`. This allows you to limit the amount of keys that are parsed and can be useful to prevent certain types of abuse. Defaults to `32`. It is recommended to keep this value as low as possible.
 
+### bodyParser.multipart([options])
+
+Returns middleware that only parses `multipart/form-data` bodies and only looks at
+requests where the `Content-Type` header matches the `type` option. This parser
+extracts text fields and automatically drops file fields. It supports automatic
+inflation of `gzip`, `br` (brotli) and `deflate` encodings.
+
+A new `body` object containing the parsed data is populated on the `request`
+object after the middleware (i.e. `req.body`). This object will contain
+key-value pairs for text fields only. File fields (fields with `filename` in
+their `Content-Disposition` header) are automatically dropped.
+
+#### Options
+
+The `multipart` function takes an optional `options` object that may contain
+any of the following keys:
+
+##### inflate
+
+When set to `true`, then deflated (compressed) bodies will be inflated; when
+`false`, deflated bodies are rejected. Defaults to `true`.
+
+##### limit
+
+Controls the maximum size of individual text fields. If this is a number, then
+the value specifies the number of bytes; if it is a string, the value is passed
+to the [bytes](https://www.npmjs.com/package/bytes) library for parsing.
+Defaults to `'100kb'`. Note: The overall body size limit is automatically set
+higher to allow multiple fields.
+
+##### type
+
+The `type` option is used to determine what media type the middleware will
+parse. This option can be a string, array of strings, or a function. If not
+a function, `type` option is passed directly to the
+[type-is](https://www.npmjs.com/package/type-is#readme) library and this can
+be an extension name (like `multipart`), a mime type (like
+`multipart/form-data`), or a mime type with a wildcard (like `multipart/*`).
+If a function, the `type` option is called as `fn(req)` and the request is parsed
+if it returns a truthy value. Defaults to `multipart/form-data`.
+
+##### verify
+
+The `verify` option, if supplied, is called as `verify(req, res, buf, encoding)`,
+where `buf` is a string containing the field value and `encoding` is the
+encoding of the request. The verification is called for each text field
+individually. The parsing can be aborted by throwing an error.
+
 ## Errors
 
 The middlewares provided by this module create errors using the
@@ -445,12 +494,21 @@ const jsonParser = bodyParser.json()
 // create application/x-www-form-urlencoded parser
 const urlencodedParser = bodyParser.urlencoded()
 
+// create multipart/form-data parser
+const multipartParser = bodyParser.multipart()
+
 // POST /login gets urlencoded bodies
 app.post('/login', urlencodedParser, function (req, res) {
   if (!req.body || !req.body.username) res.sendStatus(400)
   res.send('welcome, ' + req.body.username)
 })
 
+// POST /upload gets multipart bodies (text fields only, files are dropped)
+app.post('/upload', multipartParser, function (req, res) {
+  if (!req.body || !req.body.description) res.sendStatus(400)
+  res.send('uploaded: ' + req.body.description)
+})
+
 // POST /api/users gets JSON bodies
 app.post('/api/users', jsonParser, function (req, res) {
   if (!req.body) res.sendStatus(400)
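
The README section for `verify` describes a callback that runs once per text field and aborts parsing by throwing. The following is a minimal sketch of such a callback, assuming the behaviour in `lib/types/multipart.js` from this commit, where `buf` is the field value as a string and `res` is passed as `null`. The names `rejectControlChars` and `runVerify` are hypothetical, and `runVerify` only stands in for the parser's per-field call so the sketch runs without Express.

```javascript
// Hypothetical verify callback; the name and the rule it enforces are illustrative only.
function rejectControlChars (req, res, buf, encoding) {
  // Assumption taken from this commit: buf is a string, res is null.
  if (/[\u0000-\u0008]/.test(buf)) {
    throw new Error('control characters are not allowed in form fields')
  }
}

// Stand-in for the per-field call made by the parser.
function runVerify (verify, value) {
  try {
    verify({ headers: {} }, null, value, 'utf-8')
    return 'accepted'
  } catch (err) {
    return 'rejected: ' + err.message
  }
}

const verifyOk = runVerify(rejectControlChars, 'hello')
const verifyBad = runVerify(rejectControlChars, 'bad\u0000value')
console.log(verifyOk)
console.log(verifyBad)
```

With the parser from this commit, the callback would be passed as `bodyParser.multipart({ verify: rejectControlChars })`; according to the diff, an error thrown here is wrapped in a 403 error of type `entity.verify.failed`.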

index.js

Lines changed: 12 additions & 0 deletions
@@ -9,6 +9,7 @@
 /**
  * @typedef {Object} Parsers
  * @property {Function} json JSON parser
+ * @property {Function} multipart Multipart/form-data parser
  * @property {Function} raw Raw parser
  * @property {Function} text Text parser
  * @property {Function} urlencoded URL-encoded parser
@@ -60,6 +61,17 @@ Object.defineProperty(exports, 'urlencoded', {
   get: () => require('./lib/types/urlencoded')
 })
 
+/**
+ * Multipart/form-data parser.
+ * Only extracts text fields and drops file fields.
+ * @public
+ */
+Object.defineProperty(exports, 'multipart', {
+  configurable: true,
+  enumerable: true,
+  get: () => require('./lib/types/multipart')
+})
+
 /**
  * Create a middleware to parse json and urlencoded bodies.
  *
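
The `index.js` change exposes the parser through a getter, so the module is only loaded when `multipart` is first read. A minimal sketch of that pattern follows, with a hypothetical `loadMultipart` function standing in for `require('./lib/types/multipart')` so the example is self-contained. Unlike `require`, this stand-in has no module cache and would run again on every read.

```javascript
// Hypothetical stand-in for require('./lib/types/multipart'); counts how often it runs.
let loadCount = 0
function loadMultipart () {
  loadCount++
  return function multipart () {}
}

const parsers = {}
Object.defineProperty(parsers, 'multipart', {
  configurable: true,
  enumerable: true,
  get: () => loadMultipart()
})

// Nothing is loaded until the property is read.
const countBeforeAccess = loadCount
const firstAccess = parsers.multipart
const countAfterAccess = loadCount

console.log(countBeforeAccess, countAfterAccess, typeof firstAccess)
```

Applications that never touch `bodyParser.multipart` therefore never pay for loading the parser.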

lib/types/multipart.js

Lines changed: 198 additions & 0 deletions
@@ -0,0 +1,198 @@
+/*!
+ * body-parser
+ * Copyright(c) 2014-2015 Douglas Christopher Wilson
+ * MIT Licensed
+ */
+
+'use strict'
+
+/**
+ * Module dependencies.
+ * @private
+ */
+
+var createError = require('http-errors')
+var debug = require('debug')('body-parser:multipart')
+var read = require('../read')
+var { normalizeOptions } = require('../utils')
+
+/**
+ * Module exports.
+ */
+
+module.exports = multipart
+
+/**
+ * Create a middleware to parse multipart/form-data bodies.
+ * This parser only extracts text fields and drops file fields.
+ *
+ * @param {Object} [options]
+ * @returns {Function}
+ * @public
+ */
+function multipart (options) {
+  const normalizedOptions = normalizeOptions(options, 'multipart/form-data')
+
+  var limit = normalizedOptions.limit
+  var verify = normalizedOptions.verify
+
+  function parse (body, encoding) {
+    var req = this
+    if (!body || body.length === 0) {
+      return {}
+    }
+
+    var contentType = req.headers && req.headers['content-type']
+    if (!contentType) {
+      throw createError(400, 'missing content-type header', {
+        type: 'multipart.content-type.missing'
+      })
+    }
+
+    if (!contentType.toLowerCase().includes('multipart')) {
+      debug('non-multipart content-type in parse function - should have been skipped')
+      return undefined
+    }
+
+    var boundary = extractBoundary(contentType)
+    var bodyStr = typeof body === 'string' ? body : body.toString('utf-8')
+    var parts = bodyStr.split('--' + boundary)
+    var result = {}
+
+    for (var i = 1; i < parts.length - 1; i++) {
+      var field = parsePart(parts[i], limit, req, encoding)
+      if (field) {
+        addField(result, field.name, field.value)
+      }
+    }
+
+    return result
+  }
+
+  var readLimit = normalizedOptions.limit
+  var overallLimit = Math.max(readLimit * 100, 100 * 1024 * 1024)
+
+  const readOptions = {
+    ...normalizedOptions,
+    limit: overallLimit,
+    skipCharset: true,
+    verify: false
+  }
+
+  return function multipartParser (req, res, next) {
+    req._multipartVerify = verify
+    read(req, res, next, parse.bind(req), debug, readOptions)
+  }
+}
+
+/**
+ * Extract boundary from content-type header.
+ *
+ * @param {string} contentType
+ * @returns {string}
+ * @private
+ */
+function extractBoundary (contentType) {
+  var boundaryMatch = contentType.match(/boundary=([^;]+)/i)
+  if (!boundaryMatch) {
+    throw createError(400, 'missing boundary in content-type', {
+      type: 'multipart.boundary.missing'
+    })
+  }
+  return boundaryMatch[1].replace(/^["']|["']$/g, '')
+}
+
+/**
+ * Parse a single multipart part.
+ *
+ * @param {string} part
+ * @param {number} limit
+ * @param {Object} req
+ * @param {string} encoding
+ * @returns {Object|null}
+ * @private
+ */
+function parsePart (part, limit, req, encoding) {
+  var trimmed = part.trim()
+  if (trimmed === '--' || trimmed === '') {
+    return null
+  }
+
+  var headerEnd = trimmed.indexOf('\r\n\r\n')
+  if (headerEnd === -1) {
+    headerEnd = trimmed.indexOf('\n\n')
+    if (headerEnd === -1) {
+      debug('invalid part format')
+      return null
+    }
+    headerEnd += 2 // skip both line feeds of the bare-LF separator
+  } else {
+    headerEnd += 4
+  }
+
+  var headers = trimmed.substring(0, headerEnd)
+  var bodyContent = trimmed.substring(headerEnd).replace(/\r\n$/, '')
+
+  var contentDisposition = headers.match(/Content-Disposition:\s*([^\r\n]+)/i)
+  if (!contentDisposition) {
+    debug('missing Content-Disposition header')
+    return null
+  }
+
+  var disposition = contentDisposition[1]
+
+  if (/filename\s*=/i.test(disposition)) {
+    debug('dropping file field')
+    return null
+  }
+
+  var nameMatch = disposition.match(/name\s*=\s*"([^"]+)"|name\s*=\s*([^;,\s]+)/i)
+  if (!nameMatch) {
+    debug('missing field name')
+    return null
+  }
+
+  var fieldName = nameMatch[1] || nameMatch[2]
+
+  if (Buffer.byteLength(bodyContent) > limit) { // limit is in bytes, not characters
+    var err = createError(413, 'field size limit exceeded', {
+      type: 'entity.too.large',
+      limit: limit
+    })
+    err.expose = true
+    throw err
+  }
+
+  var fieldVerify = req._multipartVerify
+  if (fieldVerify) {
+    try {
+      fieldVerify(req, null, bodyContent, encoding || 'utf-8')
+    } catch (err) {
+      throw createError(403, err, {
+        type: err.type || 'entity.verify.failed'
+      })
+    }
+  }
+
+  return { name: fieldName, value: bodyContent }
+}
+
+/**
+ * Add field to result object, handling multiple values.
+ *
+ * @param {Object} result
+ * @param {string} name
+ * @param {string} value
+ * @private
+ */
+function addField (result, name, value) {
+  if (Object.prototype.hasOwnProperty.call(result, name)) { // own fields only; '' still counts
+    if (Array.isArray(result[name])) {
+      result[name].push(value)
+    } else {
+      result[name] = [result[name], value]
+    }
+  } else {
+    result[name] = value
+  }
+}
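
The README only says that the overall body limit is "set higher" than the per-field limit. The arithmetic in `multipart()` can be worked through with a small sketch that reuses the same `Math.max` expression. The helper name `overallLimitFor` is hypothetical, and the sketch assumes `normalizeOptions` has already converted `limit` to a number of bytes.

```javascript
// Mirrors the expression in lib/types/multipart.js; the helper name is hypothetical.
function overallLimitFor (fieldLimitBytes) {
  return Math.max(fieldLimitBytes * 100, 100 * 1024 * 1024)
}

const defaultFieldLimit = 100 * 1024 // '100kb' expressed in bytes
const defaultOverall = overallLimitFor(defaultFieldLimit)
const largeOverall = overallLimitFor(2 * 1024 * 1024)

console.log(defaultOverall) // 104857600: the 100 MiB floor applies
console.log(largeOverall) // 209715200: 100 times the 2 MiB field limit
```

With the default field limit the whole request may therefore reach 100 MiB before `read` rejects it, which is worth keeping in mind when choosing `limit`.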
