Commit f63bba0

WIP. check build in pipeline
1 parent 5cfedbe commit f63bba0

5 files changed

+13303
-18490
lines changed

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
---
id: custom-logger
title: Custom logger
description: Use your own logging library (Winston, Pino, etc.) with Crawlee
---

import ApiLink from '@site/src/components/ApiLink';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

import WinstonSource from '!!raw-loader!./winston.ts';
import PinoSource from '!!raw-loader!./pino.ts';

Crawlee uses `@apify/log` as its default logging library, but you can replace it with any logger you prefer, such as Winston or Pino. To do so, implement a small adapter and pass it to the crawler, either per instance or globally.

## Creating an adapter

All Crawlee logging goes through the <ApiLink to="core/interface/CrawleeLogger">`CrawleeLogger`</ApiLink> interface. To plug in your own logger, extend the <ApiLink to="core/class/BaseCrawleeLogger">`BaseCrawleeLogger`</ApiLink> abstract class and implement two methods:

- **`logWithLevel(level, message, data)`**: dispatches a log message to your logging library. The `level` parameter uses <ApiLink to="core/enum/LogLevel">`LogLevel`</ApiLink> constants (`ERROR = 1`, `SOFT_FAIL = 2`, `WARNING = 3`, `INFO = 4`, `DEBUG = 5`, `PERF = 6`). Map these to your logger's native levels.
- **`createChild(options)`**: creates a child logger instance. Crawlee creates child loggers with prefixes (e.g. `CheerioCrawler`, `AutoscaledPool`, `SessionPool`) so that each internal component is easily identifiable in the output.

All other methods (`error`, `warning`, `info`, `debug`, `exception`, `perf`, etc.) are derived automatically from `logWithLevel`, so you don't need to implement them.
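
As a rough illustration of why two methods are enough, the following self-contained sketch mimics the pattern with a hypothetical `MiniBaseLogger` class. It is not Crawlee's actual `BaseCrawleeLogger` implementation: `MiniLevel` only mirrors the numeric values listed above, and `MiniBaseLogger`, `MemoryLogger`, and the simplified `createChild(prefix)` signature are assumptions made for the example.

```ts
// Hypothetical stand-in mirroring the numeric LogLevel values listed above.
const MiniLevel = { ERROR: 1, SOFT_FAIL: 2, WARNING: 3, INFO: 4, DEBUG: 5, PERF: 6 } as const;

type LogData = Record<string, unknown>;

// Illustrative base class: each convenience method funnels into logWithLevel().
abstract class MiniBaseLogger {
    protected readonly prefix: string | undefined;

    constructor(prefix?: string) {
        this.prefix = prefix;
    }

    abstract logWithLevel(level: number, message: string, data?: LogData): void;
    protected abstract createChild(prefix: string): MiniBaseLogger;

    error(message: string, data?: LogData): void {
        this.logWithLevel(MiniLevel.ERROR, message, data);
    }

    warning(message: string, data?: LogData): void {
        this.logWithLevel(MiniLevel.WARNING, message, data);
    }

    info(message: string, data?: LogData): void {
        this.logWithLevel(MiniLevel.INFO, message, data);
    }

    debug(message: string, data?: LogData): void {
        this.logWithLevel(MiniLevel.DEBUG, message, data);
    }

    child(prefix: string): MiniBaseLogger {
        return this.createChild(prefix);
    }
}

// Toy backend: records lines in memory instead of calling Winston or Pino.
class MemoryLogger extends MiniBaseLogger {
    private readonly lines: string[];

    constructor(lines: string[], prefix?: string) {
        super(prefix);
        this.lines = lines;
    }

    logWithLevel(level: number, message: string, data?: LogData): void {
        const tag = this.prefix ? `[${this.prefix}] ` : '';
        const suffix = data ? ` ${JSON.stringify(data)}` : '';
        this.lines.push(`${level} ${tag}${message}${suffix}`);
    }

    protected createChild(prefix: string): MiniBaseLogger {
        return new MemoryLogger(this.lines, prefix);
    }
}

const recorded: string[] = [];
const root = new MemoryLogger(recorded);
root.info('Starting');
root.child('SessionPool').warning('Session retired', { id: 's1' });
```

After this runs, `recorded` holds two lines, and the second one carries the `[SessionPool]` tag because the child logger was created with that prefix.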

:::info Level filtering

`logWithLevel()` is called for **every** log message, regardless of the configured level. Level filtering is the responsibility of the underlying logging library (e.g. Winston's `level` option or Pino's `level` setting), so your adapter doesn't need to check log levels: just forward everything and let the library decide what to output.

:::
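
The division of labor described in the note can be sketched without any real logging library. In this minimal example, `ThresholdBackend`, `SEVERITY`, `TO_BACKEND`, and `forward` are hypothetical names standing in for Winston or Pino and for an adapter's `logWithLevel()`; only the numeric levels come from the list above.

```ts
// Hypothetical backend with its own severity threshold, standing in for Winston or Pino.
const SEVERITY: Record<string, number> = { error: 0, warn: 1, info: 2, debug: 3 };

class ThresholdBackend {
    readonly output: string[] = [];
    private readonly threshold: string;

    constructor(threshold: string) {
        this.threshold = threshold;
    }

    log(level: string, message: string): void {
        // Filtering happens here, inside the backend, not in the adapter.
        if ((SEVERITY[level] ?? 0) <= (SEVERITY[this.threshold] ?? 0)) {
            this.output.push(`${level}: ${message}`);
        }
    }
}

// Numeric levels (values from the list above) mapped to the backend's level names.
const TO_BACKEND: Record<number, string> = {
    1: 'error',
    2: 'warn',
    3: 'warn',
    4: 'info',
    5: 'debug',
    6: 'debug',
};

// The adapter side performs no level check of its own; it forwards every call.
function forward(backend: ThresholdBackend, level: number, message: string): void {
    backend.log(TO_BACKEND[level] ?? 'info', message);
}

const backend = new ThresholdBackend('info');
forward(backend, 4, 'kept');
forward(backend, 5, 'dropped by the backend');
forward(backend, 1, 'kept as well');
```

With the threshold set to `info`, the debug message is discarded by the backend even though the adapter forwarded it.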

## Injecting the logger

There are two ways to inject a custom logger: per-crawler and globally.

### Per-crawler logger

Pass your adapter via the `logger` option in the crawler constructor. When a `logger` is provided, the crawler creates its own isolated <ApiLink to="core/class/ServiceLocator">`ServiceLocator`</ApiLink> instance, so the custom logger is used by all internal components of that crawler (autoscaling, session pool, statistics, etc.):

```ts
import { CheerioCrawler } from 'crawlee';

// `WinstonAdapter` and `winstonLogger` are defined in the full Winston example below
const crawler = new CheerioCrawler({
    logger: new WinstonAdapter(winstonLogger),
    async requestHandler({ log }) {
        // `log` is a child of your custom logger, with prefix set to the crawler class name
        log.info('Hello from my custom logger!');
    },
});
```

The same logger is available as `crawler.log` outside of the request handler, for example when setting up routes.

### Global logger via service locator

Instead of passing the logger to each crawler individually, you can set it globally via the <ApiLink to="core/variable/serviceLocator">`serviceLocator`</ApiLink>. This is useful when you run multiple crawlers and want them all to use the same logging backend:

```ts
import { serviceLocator, CheerioCrawler, PlaywrightCrawler } from 'crawlee';

// Set the logger globally; this must be done before creating any crawlers
serviceLocator.setLogger(new WinstonAdapter(winstonLogger));

// Both crawlers will use the Winston logger
const cheerioCrawler = new CheerioCrawler({ /* ... */ });
const playwrightCrawler = new PlaywrightCrawler({ /* ... */ });
```

:::warning

`serviceLocator.setLogger()` must be called **before** any crawler is created. Once a logger has been retrieved from the service locator (which happens during crawler construction), it cannot be replaced, and attempting to do so throws an error.

:::
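
The ordering constraint can be pictured with a small "set before first use" registry. This is only a sketch under stated assumptions: `OnceLocator` and `SimpleLogger` are hypothetical, and nothing here reflects how Crawlee's `ServiceLocator` is actually implemented beyond the behavior described in the warning.

```ts
// Hypothetical minimal logger shape, used only for this illustration.
interface SimpleLogger {
    info(message: string): void;
}

// Illustrative registry: the logger may be set only until it is first retrieved.
class OnceLocator {
    private logger: SimpleLogger | undefined;
    private retrieved = false;

    setLogger(logger: SimpleLogger): void {
        if (this.retrieved) {
            throw new Error('Logger was already retrieved and can no longer be replaced');
        }
        this.logger = logger;
    }

    getLogger(): SimpleLogger {
        this.retrieved = true;
        if (this.logger === undefined) {
            // Assumed fallback for the sketch: a console-based default logger.
            this.logger = { info: (message: string) => console.log(message) };
        }
        return this.logger;
    }
}

const locator = new OnceLocator();
const registered: SimpleLogger = { info: () => {} };

locator.setLogger(registered); // fine: nothing has read the logger yet
const resolved = locator.getLogger(); // simulates what crawler construction would do

let lateSetError: Error | undefined;
try {
    locator.setLogger({ info: () => {} }); // too late: the logger was already handed out
} catch (error) {
    lateSetError = error as Error;
}
```

Freezing the logger after the first read means every component that already holds a reference keeps logging to the same backend.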

## Full examples

<Tabs>
<TabItem value="winston" label="Winston" default>

<CodeBlock language="ts">{WinstonSource}</CodeBlock>

</TabItem>
<TabItem value="pino" label="Pino">

<CodeBlock language="ts">{PinoSource}</CodeBlock>

</TabItem>
</Tabs>

docs/guides/custom-logger/pino.ts

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
import { CheerioCrawler, BaseCrawleeLogger, LogLevel } from 'crawlee';
import type { CrawleeLogger, CrawleeLoggerOptions } from 'crawlee';
import pino from 'pino';

// Map Crawlee log levels to Pino levels
const CRAWLEE_TO_PINO: Record<number, string> = {
    [LogLevel.ERROR]: 'error',
    [LogLevel.SOFT_FAIL]: 'warn',
    [LogLevel.WARNING]: 'warn',
    [LogLevel.INFO]: 'info',
    [LogLevel.DEBUG]: 'debug',
    [LogLevel.PERF]: 'trace',
};

class PinoAdapter extends BaseCrawleeLogger {
    constructor(
        private logger: pino.Logger,
        options?: Partial<CrawleeLoggerOptions>,
    ) {
        super(options);
    }

    logWithLevel(level: number, message: string, data?: Record<string, unknown>): void {
        const pinoLevel = CRAWLEE_TO_PINO[level] ?? 'info';
        const prefix = this.getOptions().prefix;
        this.logger[pinoLevel as pino.Level]({ ...data, prefix }, message);
    }

    protected createChild(options: Partial<CrawleeLoggerOptions>): CrawleeLogger {
        return new PinoAdapter(this.logger.child({ prefix: options.prefix }), { ...this.getOptions(), ...options });
    }
}

// Create a Pino logger with your preferred configuration
const pinoLogger = pino({
    level: 'debug',
    transport: {
        target: 'pino-pretty',
        options: { colorize: true },
    },
});

// Pass the adapter to the crawler via the `logger` option
const crawler = new CheerioCrawler({
    logger: new PinoAdapter(pinoLogger),
    async requestHandler({ request, $, log }) {
        log.info(`Processing ${request.url}`);
        const title = $('title').text();
        log.debug('Page title extracted', { title });
    },
});

await crawler.run(['https://crawlee.dev']);
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
import { CheerioCrawler, BaseCrawleeLogger, LogLevel } from 'crawlee';
import type { CrawleeLogger, CrawleeLoggerOptions } from 'crawlee';
import winston from 'winston';

// Map Crawlee log levels to Winston levels
const CRAWLEE_TO_WINSTON: Record<number, string> = {
    [LogLevel.ERROR]: 'error',
    [LogLevel.SOFT_FAIL]: 'warn',
    [LogLevel.WARNING]: 'warn',
    [LogLevel.INFO]: 'info',
    [LogLevel.DEBUG]: 'debug',
    [LogLevel.PERF]: 'debug',
};

class WinstonAdapter extends BaseCrawleeLogger {
    constructor(
        private logger: winston.Logger,
        options?: Partial<CrawleeLoggerOptions>,
    ) {
        super(options);
    }

    logWithLevel(level: number, message: string, data?: Record<string, unknown>): void {
        const winstonLevel = CRAWLEE_TO_WINSTON[level] ?? 'info';
        this.logger.log(winstonLevel, message, {
            ...data,
            prefix: this.getOptions().prefix,
        });
    }

    protected createChild(options: Partial<CrawleeLoggerOptions>): CrawleeLogger {
        return new WinstonAdapter(this.logger.child({ prefix: options.prefix }), { ...this.getOptions(), ...options });
    }
}

// Create a Winston logger with your preferred configuration
const winstonLogger = winston.createLogger({
    level: 'debug',
    format: winston.format.combine(
        winston.format.colorize(),
        winston.format.timestamp(),
        winston.format.printf(({ level, message, timestamp, prefix }) => {
            const tag = prefix ? `[${prefix}] ` : '';
            return `${timestamp} ${level}: ${tag}${message}`;
        }),
    ),
    transports: [new winston.transports.Console()],
});

// Pass the adapter to the crawler via the `logger` option
const crawler = new CheerioCrawler({
    logger: new WinstonAdapter(winstonLogger),
    async requestHandler({ request, $, log }) {
        log.info(`Processing ${request.url}`);
        const title = $('title').text();
        log.debug('Page title extracted', { title });
    },
});

await crawler.run(['https://crawlee.dev']);

website/sidebars.js

Lines changed: 2 additions & 1 deletion
@@ -49,7 +49,8 @@ module.exports = {
             'guides/stagehand-crawler-guide',
             'guides/running-in-web-server/running-in-web-server',
             'guides/parallel-scraping/parallel-scraping-guide',
-            'guides/custom-http-client/custom-http-client'
+            'guides/custom-http-client/custom-http-client',
+            'guides/custom-logger/custom-logger'
         ],
     },
     {
