Commit e6b3f2e

Merge pull request #14 from J-Hoplin/feat/2.0.4
2.0.6
2 parents 33088d3 + 14c43ad commit e6b3f2e

9 files changed: +213 -134 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -6,4 +6,5 @@ dist/

 .idea
 src/**/*.js
+test.js

Readme.md

Lines changed: 74 additions & 72 deletions
@@ -23,7 +23,11 @@

 ## Fully changed from 2.0.0

-The internal implementation is event-based, which significantly improves the stability. In addition, instead of relying on generic-pools to manage the pool, we have solved the problem of third-party dependency and features that were incompatible with generic-pools through our own pooling. However, there are many API changes and some features are currently disabled. If you update to 2.0.0, please be aware of the migration progress and disabled features for the changes.
+The internal implementation is event-based, which significantly improves the stability. In addition, instead of relying
+on generic-pools to manage the pool, we have solved the problem of third-party dependency and features that were
+incompatible with generic-pools through our own pooling. However, there are many API changes and some features are
+currently disabled. If you update to 2.0.0, please be aware of the migration progress and disabled features for the
+changes.
 Also cluster mode client will be provided in near future.

 ### API Changes
@@ -33,82 +37,79 @@ After that you can use dispatcher to control pool manager.

 **[ Client API ]**

-- StartPuppeteerPool
-
+- PuppeteerPool
+  - `PuppeteerPool` is singleton class. You can use `PuppeteerPool.start` to initialize pool manager.
+- PuppeteerPool.start
+  - Static Method
+  - Description: Initialize pool manager. You need to call this function to start puppeteer pool. Even if you invoke
+    this function multiple times with different arguments, it will return the first initialized instance.
   - Args
-    - concurrencyLevel: number
-      - Number of context level to run tasks concurrently.
+    - concurrencyLevel
+      - Required
+      - number
     - contextMode: ContextMode
-      - ContextMode.SHARED(Default): Each session will share local storage, cookies, etc.
-      - ContextMode.ISOLATED: Each session will have its own local storage, cookies, etc.
-    - options: [Puppeteer LaunchOptions](https://pptr.dev/api/puppeteer.launchoptions)
-    - customConfigPath: string
-      - Optional. If you want to use custom config file, you can pass path to config file.
-  - Returns
-    - dispatcher: TaskDispatcher
-      - Dispatcher instance to control pool manager.
-
-- StopPuppeteerPool
-  - Args
-    - dispatcher: TaskDispatcher
-      - Dispatcher instance returned from StartPuppeteerPool
-
-**[ TaskDispatcher API ]**
-
-- dispatchTask<T>
-
+      - Required
+      - ContextMode.ISOLATED | ContextMode.SHARED
+    - options
+      - Optional
+      - [puppeteer.LaunchOptions](https://pptr.dev/api/puppeteer.launchoptions)
+    - customConfigPath
+      - Optional
+      - string (Default: `puppeteer-pool-config.json` in project root)
+  - Return
+    - `Promise<PuppeteerPool>`
+    - Returns PuppeteerPool Instance.
+- Instance<PuppeteerPool>.stop
+  - Description: Stop pool manager. It will close all sessions and terminate pool manager.
+  - Return
+    - `Promise<void>`
+- Instance<PuppeteerPool>.runTask
+  - Description: Run task in pool manager. It will return result of task.
   - Args
-    - task: RequestedTask<T>
-  - Returns
-    ```typescript
-    {
-      event: EventEmitter,
-      resultListener: Promise<unknown>
-    }
-    ```
-    - event: Check given task's state. You can listen to two event(Node.js Event) task state
-      - RUNNING: Emits when task is running
-      - DONE: Emits when task is done
-    - resultListener: Promise Object for result. resultListener is not callable. You should just await Promise to be resolve.(Same as when 'DONE' event emits)
-
-- getPoolMetrics
-  - Returns
-    ```typescript
-    {
-      memoryUsageValue: number, // Memory Usage
-      memoryUsagePercentage: number, // Memory usage percentage
-      cpuUsage: number, // CPU Usage Percentage
-    };
+    - task
+      - Required
+      - Function
+  - Return
+    - `Promise<any>`
+    - Returns result of task(Same return type with task callback return type)
+- Instance<PuppeteerPool>.getPoolMetrics
+  - Description: Get pool metrics. It will return metrics of pool manager.
+  - Return
+    ```json
+    {
+      memoryUsageValue: (Memory Usage in MegaBytes),
+      memoryUsagePercentage: (Memory Usage with percentage),
+      cpuUsage: (CPU Usage with percentage)
+    }
     ```

 ## Simple Demo

 ```typescript
-import { ContextMode, StartPuppeteerPool } from '@hoplin/puppeteer-pool';
+import { ContextMode, PuppeteerPool } from '@hoplin/puppeteer-pool';

 async function main() {
-  const instance = await StartPuppeteerPool(3, ContextMode.ISOLATED);
+  const poolInstance = await PuppeteerPool.start(5, ContextMode.ISOLATED);
+
   const urls = [
     'https://www.google.com',
     'https://www.bing.com',
     'https://www.yahoo.com',
-    'https://www.duckduckgo.com',
-    'https://www.ask.com',
   ];
-  let taskCounter = 0;
-  for (const url of urls) {
-    const taskId = `TASK_${String(++taskCounter).padStart(3, '0')}`;
-    const { event, resultListener } = await instance.dispatchTask(
-      async (page) => {
-        await page.goto(url);
-        return await page.title();
-      },
-    );
-    const result = await resultListener;
-    console.log(`[${taskId}] Result:`, result);
-    console.log('-'.repeat(50));
-  }
+
+  console.log(await poolInstance.getPoolMetrics());
+
+  const promises = urls.map((url) =>
+    poolInstance.runTask(async (page) => {
+      await page.goto(url);
+      return await page.title();
+    }),
+  );
+
+  const titles = await Promise.all(promises);
+  titles.forEach((title) => console.log(title));
 }
+
 main();
 ```

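Review note on the new demo: `Promise.all` rejects as soon as any one of its input promises rejects, so a single failing URL would discard the titles of the others. The diff does not show how the dispatcher surfaces task errors, so whether `runTask` actually rejects is an assumption here. Under that assumption, a minimal sketch of collecting per-task outcomes might look like the following. `settle`, `partitionResults`, and `fakeRunTask` are hypothetical names, and `fakeRunTask` is only a stand-in for `poolInstance.runTask`, not the library's code.

```typescript
// Outcome of one task: either a value or the reason it failed.
type Settled<T> =
  | { status: 'fulfilled'; value: T }
  | { status: 'rejected'; reason: unknown };

// Convert a promise into one that always resolves with a Settled record.
function settle<T>(promise: Promise<T>): Promise<Settled<T>> {
  return promise.then(
    (value): Settled<T> => ({ status: 'fulfilled', value }),
    (reason): Settled<T> => ({ status: 'rejected', reason }),
  );
}

// Split settled outcomes into successful values and failure reasons.
function partitionResults<T>(
  outcomes: Settled<T>[],
): { values: T[]; failures: unknown[] } {
  const values: T[] = [];
  const failures: unknown[] = [];
  for (const outcome of outcomes) {
    if (outcome.status === 'fulfilled') {
      values.push(outcome.value);
    } else {
      failures.push(outcome.reason);
    }
  }
  return { values, failures };
}

// Hypothetical stand-in for poolInstance.runTask: fails for one URL.
async function fakeRunTask(url: string): Promise<string> {
  if (url.indexOf('bing') !== -1) {
    throw new Error(`navigation failed: ${url}`);
  }
  return `title of ${url}`;
}

async function demo(): Promise<void> {
  const urls = [
    'https://www.google.com',
    'https://www.bing.com',
    'https://www.yahoo.com',
  ];
  // Every promise passed to Promise.all resolves, so nothing short-circuits.
  const outcomes = await Promise.all(urls.map((url) => settle(fakeRunTask(url))));
  const { values, failures } = partitionResults(outcomes);
  console.log(values.length, 'succeeded;', failures.length, 'failed');
}

demo();
```

On runtimes that provide it, the built-in `Promise.allSettled` returns records of the same shape and could replace the hand-written `settle` helper.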
@@ -133,20 +134,21 @@ Default config should be `puppeteer-pool-config.json` in root directory path.

 ### Default config setting

-If config file are not given or invalid path, manager will use default defined configurations. Or if you want to pass config path, you can pass path to `bootPoolManager` function as parameter.
+If config file are not given or invalid path, manager will use default defined configurations. Or if you want to pass
+config path, you can pass path to `bootPoolManager` function as parameter.

-```typescript
+```json
 {
-  session_pool: {
-    width: 1080,
-    height: 1024,
-  },
-  threshold: {
-    activate: true,
-    interval: 5,
-    cpu: 80,
-    memory: 2048,
+  "session_pool": {
+    "width": 1080,
+    "height": 1024
   },
+  "threshold": {
+    "activate": true,
+    "interval": 5,
+    "cpu": 80,
+    "memory": 2048
+  }
 }
 ```

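To make the fallback described in this README section concrete, here is a minimal sketch of typing the config and filling missing fields from the defaults listed above. The names `PoolConfig`, `PartialConfig`, and `resolveConfig` are hypothetical, and the per-field merge is an assumption: the diff only says defaults apply when the file is missing or the path is invalid, and does not show the library's loader.

```typescript
// Shape of puppeteer-pool-config.json as shown in the README defaults.
interface PoolConfig {
  session_pool: { width: number; height: number };
  threshold: {
    activate: boolean;
    interval: number;
    cpu: number;
    memory: number;
  };
}

// Default values copied from the README's JSON block.
const DEFAULT_CONFIG: PoolConfig = {
  session_pool: { width: 1080, height: 1024 },
  threshold: { activate: true, interval: 5, cpu: 80, memory: 2048 },
};

// A user-supplied config may omit any section or field (assumption).
type PartialConfig = {
  session_pool?: Partial<PoolConfig['session_pool']>;
  threshold?: Partial<PoolConfig['threshold']>;
};

// Overlay user values on the defaults, one section at a time.
function resolveConfig(user: PartialConfig = {}): PoolConfig {
  return {
    session_pool: { ...DEFAULT_CONFIG.session_pool, ...user.session_pool },
    threshold: { ...DEFAULT_CONFIG.threshold, ...user.threshold },
  };
}

// Only `cpu` is overridden; every other field keeps its default.
const resolved = resolveConfig({ threshold: { cpu: 50 } });
console.log(resolved);
```

Merging section by section keeps a partial `threshold` object from wiping out the defaults for its sibling fields, which a single top-level spread would do.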
package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "@hoplin/puppeteer-pool",
-  "version": "2.0.3",
+  "version": "2.0.6",
   "main": "dist/index.js",
   "description": "Puppeteer Pool Manager for worker server, process daemon, commands etc...",
   "repository": "https://github.com/J-Hoplin/Puppeteer-Pool.git",

src/client/rest.ts

Lines changed: 74 additions & 21 deletions
@@ -1,28 +1,81 @@
 /**
  * APIs for RESTful API mode
  */
-import { ContextMode, TaskDispatcher } from '../pool';
+import { PoolNotInitializedException } from '../error';
+import { TaskDispatcher } from '../pool/dispatcher';
+import { ContextMode } from '../pool/enum';
+import { RequestedTask } from '../types';
 import * as puppeteer from 'puppeteer';

-/**
- * Invoke this function to start a new Puppeteer Pool
- */
-export async function StartPuppeteerPool(
-  concurrencyLevel: number,
-  contextMode: ContextMode,
-  options?: puppeteer.LaunchOptions,
-  customConfigPath?: string,
-): Promise<TaskDispatcher> {
-  const instance = new TaskDispatcher();
-  await instance.init(concurrencyLevel, contextMode, options, customConfigPath);
-  return instance;
-}
+export class PuppeteerPool {
+  private static isInitialized = false;
+  private static dispatcherInstance: TaskDispatcher;
+  private static instance: PuppeteerPool;

-/**
- * Invoke this function to stop a Puppeteer Pool
- *
- * Please enroll this function in your graceful shutdown process
- */
-export async function StopPuppeteerPool(instance: TaskDispatcher) {
-  await instance.close();
+  /**
+   * Check if the instance is initialized
+   * Throw an error if the instance is not initialized
+   */
+  private checkInstanceInitalized() {
+    if (!PuppeteerPool.isInitialized) {
+      throw new PoolNotInitializedException();
+    }
+  }
+
+  /**
+   * Private constructor to make sure this class is a singleton
+   */
+  private constructor() {}
+
+  /**
+   * Invoke this function to start a new Puppeteer Pool
+   */
+  public static async start(
+    concurrencyLevel: number,
+    contextMode: ContextMode,
+    options?: puppeteer.LaunchOptions,
+    customConfigPath?: string,
+  ) {
+    if (!PuppeteerPool.isInitialized) {
+      // Initialize Task Dispatcher
+      PuppeteerPool.dispatcherInstance = new TaskDispatcher();
+      await PuppeteerPool.dispatcherInstance.init(
+        concurrencyLevel,
+        contextMode,
+        options,
+        customConfigPath,
+      );
+      // Initialize REST Client Instance
+      PuppeteerPool.instance = new PuppeteerPool();
+      // Change state to initialized
+      PuppeteerPool.isInitialized = true;
+    }
+    return PuppeteerPool.instance;
+  }
+
+  /**
+   * Invoke this function to stop a Puppeteer Pool
+   *
+   * Please enroll this function in your graceful shutdown process
+   */
+  public async stop() {
+    this.checkInstanceInitalized();
+    await PuppeteerPool.dispatcherInstance.close();
+  }
+
+  /**
+   * Invoke this function to run a task
+   */
+  public async runTask<T>(task: RequestedTask<T>) {
+    this.checkInstanceInitalized();
+    return await PuppeteerPool.dispatcherInstance.dispatchTask(task);
+  }
+
+  /**
+   * Invoke this function to get pool metrics
+   */
+  public async getPoolMetrics() {
+    this.checkInstanceInitalized();
+    return await PuppeteerPool.dispatcherInstance.getPoolMetrics();
+  }
 }

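Review note on `PuppeteerPool.start`: `isInitialized` is set to `true` only after `await PuppeteerPool.dispatcherInstance.init(...)` resolves, so two overlapping `start` calls could, it appears, both enter the branch and each create a dispatcher. One possible way to close that window is to cache the in-flight promise instead of a boolean. The sketch below illustrates the idea with hypothetical stand-ins (`FakeDispatcher`, `SketchPool`); it is not the code in this commit and does not launch any browser.

```typescript
// Hypothetical stand-in for TaskDispatcher; counts how often init runs.
class FakeDispatcher {
  static initCount = 0;
  concurrencyLevel = 0;

  async init(concurrencyLevel: number): Promise<void> {
    FakeDispatcher.initCount += 1;
    this.concurrencyLevel = concurrencyLevel;
    // Simulate asynchronous startup work.
    await Promise.resolve();
  }
}

class SketchPool {
  // Cache the in-flight promise rather than a boolean flag.
  private static starting: Promise<SketchPool> | undefined;

  private constructor(readonly dispatcher: FakeDispatcher) {}

  // Not declared `async`, so every caller receives the same promise object.
  static start(concurrencyLevel: number): Promise<SketchPool> {
    if (!SketchPool.starting) {
      SketchPool.starting = (async () => {
        const dispatcher = new FakeDispatcher();
        await dispatcher.init(concurrencyLevel);
        return new SketchPool(dispatcher);
      })();
    }
    return SketchPool.starting;
  }
}

// Two calls issued before initialization has finished.
const first = SketchPool.start(5);
const second = SketchPool.start(10);

console.log(first === second); // true: both callers share one promise
console.log(FakeDispatcher.initCount); // 1: init ran once
```

As in the committed code, later arguments are ignored in favor of the first call. A trade-off of this sketch is that a failed `init` leaves a rejected promise cached, so a real implementation would probably want to clear it on error.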
src/pool/context/context.ts

Lines changed: 3 additions & 0 deletions
@@ -36,6 +36,7 @@ export abstract class TaskContext {
     if (this.page) {
       if (this.page.url() !== 'about:blank') {
         // 'about:blank' can't access to local storage
+        // It'll lead to Chromium security error if try to access local storage in 'about:blank'
         await this.page.evaluate(() => {
           localStorage.clear();
           sessionStorage.clear();
@@ -73,6 +74,8 @@ export abstract class TaskContext {
         error: e as Error,
       };
     } finally {
+      // Clear resource after task is done
+      // This is to prevent memory leak and to prevent waste of resource
       await this.clearResource();
     }
   }
