Commit 40090fe

streams + attachment size check + attachment url in body (#72)
* Created new `Read file` action instead of existing due to incorrect output metadata, old one set as deprecated
* Fix memory leak for `Read file` action and `Get New and Updated S3 Objects` trigger, also added attachment url in message body to them
* Default value for environment variable `ATTACHMENT_MAX_SIZE` increased from `10000000` (almost **10** MB) to `104857600` bytes (**100** MB)
* Implemented additional check attachments size in `Get New and Updated S3 Objects` trigger
* Get rid of vulnerabilities in dependencies
1 parent f12fd33 commit 40090fe

13 files changed: +2227 -847 lines changed

.circleci/config.yml

Lines changed: 7 additions & 7 deletions
@@ -72,23 +72,23 @@ commands:
 jobs:
   test:
     docker:
-      - image: circleci/node:14-stretch
+      - image: circleci/node:16-stretch
     steps:
       - checkout
       - node/install:
           node-version: << pipeline.parameters.node-version >>
-      # - run:
-      # name: Audit Dependencies
-      # command: npm audit --audit-level=high
+      - run:
+          name: Audit Dependencies
+          command: npm audit --audit-level=high
       - node/install-packages:
           cache-path: ./node_modules
           override-ci-command: npm install
       - run:
-          name: Running Mocha Tests
-          command: npm test
+          name: Running Mocha Unit&Integration Tests
+          command: npm test && npm run integration-test
   build:
     docker:
-      - image: circleci/node:14-stretch
+      - image: circleci/node:16-stretch
         user: root
     steps:
       - checkout

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -3,3 +3,4 @@ coverage
 .idea
 .env
 .vscode
+.nyc_output

CHANGELOG.md

Lines changed: 7 additions & 1 deletion
@@ -1,5 +1,11 @@
-# 1.4.3 (April 22, 2022)
+# 1.5.0 (May 20, 2022)
+* Created new `Read file` action instead of existing due to incorrect output metadata, old one set as deprecated
+* Fix memory leak for `Read file` action and `Get New and Updated S3 Objects` trigger, also added attachment url in message body to them
+* Default value for environment variable `ATTACHMENT_MAX_SIZE` increased from `10000000` (almost **10** MB) to `104857600` bytes (**100** MB)
+* Implemented additional check attachments size in `Get New and Updated S3 Objects` trigger
+* Get rid of vulnerabilities in dependencies

+# 1.4.3 (April 22, 2022)
 * Update `component-commons-library` to 2.0.2
 * Update `oih-standard-library` to 2.0.2
 * Update `elasticio-sailor-nodejs` to 2.6.27

README.md

Lines changed: 21 additions & 4 deletions
@@ -39,15 +39,15 @@ The component provides ability to connect to Amazon Simple Storage Service (Amaz
 [Completeness Matrix](https://docs.google.com/spreadsheets/d/1sptYGKkInnAbfRRbzLr5oOZUd3-COekQWGKpqQUYVfc/edit#gid=0)

 ### How works. SDK version
-The component is based on [AWS S3 SDK](https://aws.amazon.com/sdk-for-node-js/ 'SDK for NodeJS') version 2.683.0.
+The component is based on [AWS S3 SDK](https://aws.amazon.com/sdk-for-node-js/ 'SDK for NodeJS') version 2.1132.0.

 ## Requirements

 #### Environment variables
 Name|Mandatory|Description|Values|
 |----|---------|-----------|------|
 |`LOG_LEVEL`| false | Controls logger level | `trace`, `debug`, `info`, `warning`, `error` |
-|`ATTACHMENT_MAX_SIZE`| false | For `elastic.io` attachments configuration. Maximal possible attachment size in bytes. By default set to 1000000 and according to platform limitations CAN'T be bigger than that. | Up to `1000000` bytes|
+|`ATTACHMENT_MAX_SIZE`| false | For `elastic.io` attachments configuration. Maximal possible attachment size in bytes. By default set to `104857600` and according to platform limitations **CAN'T** be bigger than that. | Up to `104857600` bytes (100MB)|
 |`ACCESS_KEY_ID`| false | For integration-tests is required to specify this variable | |
 |`ACCESS_KEY_SECRET`| false | For integration-tests is required to specify this variable | |
 |`REGION` | false | For integration-tests is required to specify this variable | |
@@ -72,18 +72,27 @@ Triggers to get all new and updated s3 objects since last polling.

 #### List of Expected Config fields
 - **Bucket Name and folder** - name of S3 bucket to read files from
-- **Emit Behaviour**: Options are: default is `Emit Individually` emits each object in separate message, `Fetch All` emits all objects in one message
+- **Emit Behaviour**: Options are: default is `Emit Individually` emits each object in separate message, `Fetch All` emits all objects as array in one object with key `results`
 - **Start Time**: Start datetime of polling. Default min date:`-271821-04-20T00:00:00.000Z`
 - **End Time**: End datetime of polling. Default max date: `+275760-09-13T00:00:00.000Z`
-- **Enable File Attachments**: If selected, the contents of the file will be exported in addition to the file metadata.
+- **Enable File Attachments**: If selected, the contents of the file will be exported in addition to the attachment.
+

 <details>
 <summary>Output metadata</summary>

+If **Emit Behaviour** selected as `Emit Individually` - emits each object in separate message with schema below, if `Fetch All` emits all objects as array in one object with key `results`, each item regards schema below
+
+`attachmentUrl` appears only if selected **Enable File Attachments**
+
 ```json
 {
   "type": "object",
   "properties": {
+    "attachmentUrl": {
+      "type": "string",
+      "required": true
+    },
     "Key": {
       "type": "string",
       "required": true
@@ -181,6 +190,14 @@ File type resolves by it's extension. The name of attachment would be same to fi
     "filename": {
       "type": "string",
       "required": true
+    },
+    "attachmentUrl": {
+      "type": "string",
+      "required": true
+    },
+    "size": {
+      "type": "number",
+      "required": true
     }
   }
 }
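
The two **Emit Behaviour** modes described in the README diff can be pictured with a small sketch. `shapeBodies` is a hypothetical helper introduced here for illustration only; the component's actual emit logic lives in files not shown in this diff and may differ.

```javascript
// Hypothetical helper (not part of the component): turns a list of S3 object
// descriptors into the message bodies each Emit Behaviour would produce.
function shapeBodies(objects, emitBehaviour) {
  if (emitBehaviour === 'fetchAll') {
    // One message whose body wraps every object under the `results` key.
    return [{ results: objects }];
  }
  // Default (`emitIndividually`): one message body per object.
  return objects.map((object) => ({ ...object }));
}

const listed = [{ Key: 'a.csv', Size: 10 }, { Key: 'b.csv', Size: 20 }];
console.log(shapeBodies(listed, 'emitIndividually').length); // two bodies
console.log(shapeBodies(listed, 'fetchAll').length); // one body with `results`
```

Wrapping the array under a single key keeps the `Fetch All` output an object, which matches the "one object with key `results`" wording in the README.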

component.json

Lines changed: 54 additions & 38 deletions
@@ -1,6 +1,6 @@
 {
   "title": "AWS S3",
-  "version": "1.4.3",
+  "version": "1.5.0",
   "description": "Integration component that can read and write to AWS S3",
   "docsUrl": "https://github.com/elasticio/amazon-s3-component",
   "credentials": {
@@ -67,42 +67,7 @@
           "label": "Enable File Attachments"
         }
       },
-      "metadata": {
-        "out": {
-          "type": "object",
-          "properties": {
-            "Key": {
-              "type": "string",
-              "required": true
-            },
-            "LastModified": {
-              "type": "string",
-              "required": true
-            },
-            "ETag": {
-              "type": "string",
-              "required": true
-            },
-            "Size": {
-              "type": "number",
-              "required": true
-            },
-            "StorageClass": {
-              "type": "string",
-              "required": true
-            },
-            "Owner": {
-              "type": "object",
-              "properties": {
-                "ID": {
-                  "type": "string",
-                  "required": true
-                }
-              }
-            }
-          }
-        }
-      }
+      "dynamicMetadata": true
     }
   },
   "actions": {
@@ -162,9 +127,10 @@
       }
     },
     "readFile": {
+      "deprecated": true,
       "title": "Read file",
       "help": {
-        "description": "Read file from S3 bucket",
+        "description": "Action is deprecated do to incorrect output metadata. Use main 'Read file' action instead",
         "link": "/components/aws-s3/index.html#read-file"
       },
       "main": "./lib/actions/readFile.js",
@@ -203,6 +169,56 @@
         }
       }
     },
+    "readFile2": {
+      "title": "Read file",
+      "help": {
+        "description": "Read file from S3 bucket",
+        "link": "/components/aws-s3/index.html#read-file"
+      },
+      "main": "./lib/actions/readFile2.js",
+      "fields": {
+        "bucketName": {
+          "viewClass": "TextFieldView",
+          "label": "Default Bucket Name and folder",
+          "placeholder": "my-fancy-bucket",
+          "note": "Default Bucket Name and folder will override if 'Bucket Name and folder' field is set in metadata",
+          "required": false
+        }
+      },
+      "metadata": {
+        "in": {
+          "type": "object",
+          "properties": {
+            "filename": {
+              "type": "string",
+              "required": true
+            },
+            "bucketName": {
+              "title": "Bucket Name and folder",
+              "type": "string",
+              "required": false
+            }
+          }
+        },
+        "out": {
+          "type": "object",
+          "properties": {
+            "filename": {
+              "type": "string",
+              "required": true
+            },
+            "attachmentUrl": {
+              "type": "string",
+              "required": true
+            },
+            "size": {
+              "type": "number",
+              "required": true
+            }
+          }
+        }
+      }
+    },
     "getAllFilesInBucket": {
       "title": "Get filenames",
       "help": {

lib/actions/readFile.js

Lines changed: 11 additions & 18 deletions
@@ -1,11 +1,12 @@
-/* eslint-disable no-use-before-define,consistent-return,func-names,no-param-reassign */
-const { AttachmentProcessor } = require('@elastic.io/component-commons-library');
+/* eslint-disable no-use-before-define,consistent-return,func-names */
 const { messages } = require('elasticio-node');
 const convert = require('xml-js');
 const mime = require('mime-types');
 const iconv = require('iconv-lite');
 const { Client } = require('../client');
-const { REQUEST_MAX_BODY_LENGTH } = require('../parameters');
+const params = require('../parameters');
+
+const attachmentProcessor = require('../utils/attachmentProcessor');

 exports.process = async function (msg, cfg) {
   const client = new Client(this.logger, cfg);
@@ -14,11 +15,11 @@ exports.process = async function (msg, cfg) {

   const result = await client.getObject(bucketName, filename);

-  if (result.ContentLength > REQUEST_MAX_BODY_LENGTH) {
+  if (result.ContentLength > params.ATTACHMENT_MAX_SIZE) {
     this.logger.error('File %s with size %d bytes is too big for attachment usage. '
-      + 'Current attachment max size is %d bytes', filename, result.ContentLength, REQUEST_MAX_BODY_LENGTH);
+      + 'Current attachment max size is %d bytes', filename, result.ContentLength, params.ATTACHMENT_MAX_SIZE);
     throw new Error(`File ${filename} with size ${result.ContentLength} bytes is too big for attachment usage. `
-      + `Current attachment max size is ${REQUEST_MAX_BODY_LENGTH} bytes`);
+      + `Current attachment max size is ${params.ATTACHMENT_MAX_SIZE} bytes`);
   }

   const fileContent = iconv.decode(result.Body, 'iso-8859-15');
@@ -31,18 +32,10 @@ exports.process = async function (msg, cfg) {
     const xmlDoc = JSON.parse(convert.xml2json(fileContent));
     await this.emit('data', messages.newMessageWithBody(xmlDoc));
   } else {
-    const attachmentProcessor = new AttachmentProcessor();
-    const response = await attachmentProcessor.uploadAttachment(result.Body, contentType);
-    const attachmentUrl = `${response.config.url}${response.data.objectId}?storage_type=maester`;
-    msg.attachments = {
-      [filename]: {
-        url: attachmentUrl,
-        size: fileContent.length,
-        'content-type': contentType,
-      },
-    };
-    const output = messages.newMessageWithBody(msg);
-    output.attachments = msg.attachments;
+    const response = await attachmentProcessor.addAttachment.call(this, msg, filename,
+      fileContent, contentType);
+    const output = messages.newMessageWithBody(response);
+    output.attachments = response.attachments;
     return output;
   }
 };

lib/actions/readFile2.js

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+/* eslint-disable no-use-before-define,consistent-return,func-names,no-param-reassign */
+const { messages } = require('elasticio-node');
+const convert = require('xml-js');
+const mime = require('mime-types');
+const iconv = require('iconv-lite');
+const { AttachmentProcessor } = require('@elastic.io/component-commons-library');
+const { Client } = require('../client');
+const params = require('../parameters');
+
+
+exports.process = async function (msg, cfg) {
+  const client = new Client(this.logger, cfg);
+  const bucketName = msg.body.bucketName ? msg.body.bucketName : cfg.bucketName;
+  const { filename } = msg.body;
+
+  // const result = await client.getObjectReadStream(bucketName, filename);
+  const result = await client.getObjectMetadata(bucketName, filename);
+
+  if (result.ContentLength > params.ATTACHMENT_MAX_SIZE) {
+    const err = `File ${filename} with size ${result.ContentLength} bytes is too big for attachment usage. `
+      + `Current attachment max size is ${params.ATTACHMENT_MAX_SIZE} bytes`;
+    this.logger.error(err);
+    throw new Error(err);
+  }
+
+  const contentType = mime.lookup(filename);
+  this.logger.info(`File type - "${contentType}"`);
+
+  if (['application/json', 'application/xml'].includes(contentType)) {
+    const data = await client.getObject(bucketName, filename);
+    const fileContent = iconv.decode(data.Body, 'iso-8859-15');
+    let doc;
+    if (contentType === 'application/json') doc = JSON.parse(fileContent);
+    if (contentType === 'application/xml') doc = JSON.parse(convert.xml2json(fileContent));
+    await this.emit('data', messages.newMessageWithBody(doc));
+  } else {
+    const readStream = client.getObjectReadStream(bucketName, filename);
+    const results = await new AttachmentProcessor().uploadAttachment(readStream);
+    const attachmentUrl = `${results.config.url}${results.data.objectId}?storage_type=maester`;
+
+    const attachments = {
+      [filename]: {
+        url: attachmentUrl,
+        size: result.ContentLength,
+        'content-type': contentType,
+      },
+    };
+    const output = messages.newMessageWithBody({
+      filename,
+      attachmentUrl,
+      size: result.ContentLength,
+    });
+    output.attachments = attachments;
+    return output;
+  }
+};
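
The new action checks the object size from metadata first and only then streams the body to attachment storage, so an oversized file is rejected before any of its bytes are downloaded. A minimal, self-contained sketch of that pattern follows. `createFakeClient`, `fakeUpload`, and `readAsAttachment` are hypothetical stand-ins invented for this example; only the method names `getObjectMetadata` and `getObjectReadStream` are taken from the diff, and their real implementations are not shown in this commit excerpt.

```javascript
const { Readable } = require('stream');

// Hypothetical in-memory client that mimics the two calls used in readFile2.js.
function createFakeClient(objects) {
  return {
    async getObjectMetadata(bucket, key) {
      return { ContentLength: objects[key].length };
    },
    getObjectReadStream(bucket, key) {
      // Wrap the Buffer in an array so it is emitted as a single chunk.
      return Readable.from([objects[key]]);
    },
  };
}

// Hypothetical uploader: consumes the stream chunk by chunk and counts bytes,
// so the whole object is never held in memory at once.
async function fakeUpload(stream) {
  let uploadedBytes = 0;
  for await (const chunk of stream) uploadedBytes += chunk.length;
  return { uploadedBytes };
}

// Size check on metadata first, streaming transfer second.
async function readAsAttachment(client, upload, bucket, key, maxSize) {
  const meta = await client.getObjectMetadata(bucket, key);
  if (meta.ContentLength > maxSize) {
    throw new Error(`File ${key} with size ${meta.ContentLength} bytes exceeds the limit of ${maxSize} bytes`);
  }
  const uploaded = await upload(client.getObjectReadStream(bucket, key));
  return { filename: key, size: meta.ContentLength, uploaded };
}
```

Passing the client and uploader as arguments is a choice made for this sketch so that it runs without AWS credentials; the action in the diff constructs both directly.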

lib/parameters.js

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-exports.REQUEST_MAX_BODY_LENGTH = process.env.REQUEST_MAX_BODY_LENGTH ? parseInt(process.env.REQUEST_MAX_BODY_LENGTH, 10) : 104857600; // 100MB
+exports.ATTACHMENT_MAX_SIZE = process.env.ATTACHMENT_MAX_SIZE ? parseInt(process.env.ATTACHMENT_MAX_SIZE, 10) : 1024 * 1024 * 100;
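
One property of the expression above is worth noting: a non-numeric value such as `ATTACHMENT_MAX_SIZE=abc` makes `parseInt` return `NaN`, and any `size > NaN` comparison is false, so the size check would silently never fire. The sketch below shows one defensive alternative. `parseMaxSize` and `DEFAULT_ATTACHMENT_MAX_SIZE` are hypothetical names introduced here, not part of the commit.

```javascript
// Same default as the diff: 1024 * 1024 * 100 = 104857600 bytes (100 MB).
const DEFAULT_ATTACHMENT_MAX_SIZE = 1024 * 1024 * 100;

// Hypothetical helper: parses a byte limit from an environment-style string
// and falls back to the default when the value is missing or unusable.
function parseMaxSize(rawValue, fallback = DEFAULT_ATTACHMENT_MAX_SIZE) {
  if (rawValue === undefined || rawValue === '') return fallback;
  const parsed = parseInt(rawValue, 10);
  // Reject NaN (non-numeric input) and non-positive limits.
  if (Number.isNaN(parsed) || parsed <= 0) return fallback;
  return parsed;
}

const ATTACHMENT_MAX_SIZE = parseMaxSize(process.env.ATTACHMENT_MAX_SIZE);
console.log(`Attachment limit: ${ATTACHMENT_MAX_SIZE} bytes`);
```

Falling back to the default on bad input is one possible policy; failing fast with an error at startup would be an equally reasonable choice.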

lib/triggers/pollingTrigger.js

Lines changed: 27 additions & 3 deletions
@@ -1,10 +1,34 @@
 const { Client } = require('../client');
 const { AwsS3Polling } = require('../utils/pollingUtil');

-// eslint-disable-next-line func-names
-exports.process = async function (msg, cfg, snapshot) {
+async function process(_msg, cfg, snapshot) {
   const client = new Client(this.logger, cfg);

   const pollingTrigger = new AwsS3Polling(this.logger, this, client, cfg);
   await pollingTrigger.process(cfg, snapshot);
-};
+}
+
+async function getMetaModel(cfg) {
+  const metadataBase = {
+    attachmentUrl: { type: 'string', required: true },
+    Key: { type: 'string', required: true },
+    LastModified: { type: 'string', required: true },
+    ETag: { type: 'string', required: true },
+    Size: { type: 'number', required: true },
+    StorageClass: { type: 'string', required: true },
+    Owner: {
+      type: 'object',
+      properties: { ID: { type: 'string', required: true } },
+    },
+  };
+  if (cfg.emitBehaviour === 'emitIndividually') {
+    return { out: { type: 'object', properties: metadataBase } };
+  }
+  if (cfg.emitBehaviour === 'fetchAll') {
+    return { out: { type: 'object', properties: { results: { type: 'array', items: metadataBase, required: true }, required: true } } };
+  }
+  return {};
+}
+
+module.exports.getMetaModel = getMetaModel;
+module.exports.process = process;
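
In the `fetchAll` branch above, a second `required: true` sits inside `properties` as a sibling of `results`, and `items` receives the bare property map rather than an object schema. Whether the platform tolerates that shape is not stated in the commit. The sketch below shows how the same metadata might be nested under the assumption of JSON-Schema-style rules, where `items` is itself a schema. `buildOutMetadata` and `itemProperties` are hypothetical names used only in this example.

```javascript
// Property map copied from the diff's `metadataBase`.
const itemProperties = {
  attachmentUrl: { type: 'string', required: true },
  Key: { type: 'string', required: true },
  LastModified: { type: 'string', required: true },
  ETag: { type: 'string', required: true },
  Size: { type: 'number', required: true },
  StorageClass: { type: 'string', required: true },
  Owner: {
    type: 'object',
    properties: { ID: { type: 'string', required: true } },
  },
};

// Hypothetical variant of getMetaModel (assumption: `items` should be a full
// object schema and `properties` should contain only property definitions).
function buildOutMetadata(emitBehaviour) {
  const item = { type: 'object', properties: itemProperties };
  if (emitBehaviour === 'emitIndividually') {
    return { out: item };
  }
  if (emitBehaviour === 'fetchAll') {
    return {
      out: {
        type: 'object',
        properties: {
          results: { type: 'array', required: true, items: item },
        },
      },
    };
  }
  // Unknown behaviour: no metadata, mirroring the diff's final `return {}`.
  return {};
}

console.log(JSON.stringify(buildOutMetadata('fetchAll'), null, 2));
```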
