Merged
Changes from 6 commits
88 changes: 88 additions & 0 deletions docs/guides/custom-logger/custom-logger.mdx
@@ -0,0 +1,88 @@
---
id: custom-logger
title: Custom logger
description: Use your own logging library (Winston, Pino, etc.) with Crawlee
---

import ApiLink from '@site/src/components/ApiLink';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

import WinstonSource from '!!raw-loader!./winston.ts';
import PinoSource from '!!raw-loader!./pino.ts';

Crawlee uses `@apify/log` as its default logging library, but you can replace it with any logger you prefer, such as Winston or Pino. This is done by implementing a small adapter and passing it to the crawler.

## Creating an adapter

All Crawlee logging goes through the <ApiLink to="core/interface/CrawleeLogger">`CrawleeLogger`</ApiLink> interface. To plug in your own logger, extend the <ApiLink to="core/class/BaseCrawleeLogger">`BaseCrawleeLogger`</ApiLink> abstract class and implement two methods:

- **`logWithLevel(level, message, data)`** — dispatches a log message to your logging library. The `level` parameter uses <ApiLink to="core/enum/LogLevel">`LogLevel`</ApiLink> constants (`ERROR = 1`, `SOFT_FAIL = 2`, `WARNING = 3`, `INFO = 4`, `DEBUG = 5`, `PERF = 6`). Map these to your logger's native levels.
Member

Same here, can we have a description of what message and data types are (and how do these differ)?

- **`createChild(options)`** — creates a child logger instance. Crawlee creates child loggers with prefixes (e.g. `CheerioCrawler`, `AutoscaledPool`, `SessionPool`) so each internal component is easily identifiable in the output.
Member

Out of curiosity - why does the implementer need to implement this method?

Describing the fields on the options type (or adding a link to the API docs) might clarify this a little.

Contributor Author

Yes, this deserves more clarity. The main reason for `createChild` is that we use child loggers in multiple places.


All other methods (`error`, `warning`, `info`, `debug`, `exception`, `perf`, etc.) are derived automatically from `logWithLevel` — you don't need to implement them.
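
To make the division of labour concrete, here is a minimal sketch of the pattern described above. It does not import Crawlee: `SketchBaseLogger`, `SketchOptions`, `MemoryLogger`, and the `child()` entry point are hypothetical stand-ins written for this illustration, and the real `BaseCrawleeLogger` may be structured differently. The point is only that an adapter supplies `logWithLevel` and `createChild`, and everything else routes through them.

```typescript
// Illustration only: these classes are stand-ins, NOT Crawlee's implementation.
type Data = Record<string, unknown>;

interface SketchOptions {
    prefix?: string;
}

// Level numbers as listed in this guide.
const LEVEL = { ERROR: 1, WARNING: 3, INFO: 4, DEBUG: 5 } as const;

abstract class SketchBaseLogger {
    protected readonly options: SketchOptions;

    constructor(options: SketchOptions = {}) {
        this.options = options;
    }

    // The two methods an adapter has to supply.
    abstract logWithLevel(level: number, message: string, data?: Data): void;
    protected abstract createChild(options: SketchOptions): SketchBaseLogger;

    // Convenience methods are thin wrappers around logWithLevel.
    error(message: string, data?: Data): void {
        this.logWithLevel(LEVEL.ERROR, message, data);
    }
    warning(message: string, data?: Data): void {
        this.logWithLevel(LEVEL.WARNING, message, data);
    }
    info(message: string, data?: Data): void {
        this.logWithLevel(LEVEL.INFO, message, data);
    }

    // Hypothetical public entry point: merges options, then delegates to createChild.
    child(options: SketchOptions): SketchBaseLogger {
        return this.createChild({ ...this.options, ...options });
    }
}

interface Entry {
    level: number;
    prefix?: string;
    message: string;
    data?: Data;
}

// An "adapter" that records entries in memory instead of calling Winston or Pino.
class MemoryLogger extends SketchBaseLogger {
    private readonly sink: Entry[];

    constructor(sink: Entry[], options: SketchOptions = {}) {
        super(options);
        this.sink = sink;
    }

    logWithLevel(level: number, message: string, data?: Data): void {
        this.sink.push({ level, prefix: this.options.prefix, message, data });
    }

    protected createChild(options: SketchOptions): SketchBaseLogger {
        // The child shares the sink but carries its own prefix.
        return new MemoryLogger(this.sink, options);
    }
}

const entries: Entry[] = [];
const root = new MemoryLogger(entries);
root.info('Starting');
root.child({ prefix: 'SessionPool' }).warning('Session retired', { id: 'abc' });
```

In this sketch the prefix lives on the child instance, so callers never pass it to a log call directly.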

:::info Level filtering

`logWithLevel()` is called for **every** log message, regardless of the configured level. Level filtering is the responsibility of the underlying logging library (e.g. Winston's `level` option or Pino's `level` setting). This means your adapter doesn't need to check log levels — just forward everything and let the library decide what to output.

:::
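
The following sketch illustrates that split of responsibilities without depending on Winston or Pino. `TinyLibraryLogger` and `forward` are hypothetical names invented for this example; the stand-in only mimics the idea of a library-side `level` threshold and is not how either library is implemented.

```typescript
// Hypothetical stand-in for a logging library that has its own `level` setting.
const SEVERITY = { error: 0, warn: 1, info: 2, debug: 3 } as const;
type LibLevel = keyof typeof SEVERITY;

class TinyLibraryLogger {
    readonly output: string[] = [];
    private readonly level: LibLevel;

    constructor(level: LibLevel) {
        this.level = level;
    }

    log(level: LibLevel, message: string): void {
        // The library, not the adapter, drops messages below its threshold.
        if (SEVERITY[level] <= SEVERITY[this.level]) {
            this.output.push(`${level}: ${message}`);
        }
    }
}

// Crawlee level numbers as listed in this guide, mapped to the stand-in's levels.
const CRAWLEE_TO_LIB: Record<number, LibLevel> = {
    1: 'error',
    2: 'warn',
    3: 'warn',
    4: 'info',
    5: 'debug',
    6: 'debug',
};

// Adapter-side forwarding: map the level and pass the message on, with no level check.
function forward(target: TinyLibraryLogger, level: number, message: string): void {
    target.log(CRAWLEE_TO_LIB[level] ?? 'info', message);
}

const lib = new TinyLibraryLogger('info');
forward(lib, 4, 'Crawl started');
forward(lib, 5, 'Request queue details'); // dropped by the library at level "info"
forward(lib, 1, 'Request failed');
```

All three calls reach the stand-in library, and only the library decides that the debug message is not written.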

## Injecting the logger

There are two ways to inject a custom logger: per-crawler and globally.

### Per-crawler logger

Pass your adapter via the `logger` option in the crawler constructor. When a `logger` is provided, the crawler creates its own isolated <ApiLink to="core/class/ServiceLocator">`ServiceLocator`</ApiLink> instance, so the custom logger is used by all internal components of that crawler (autoscaling, session pool, statistics, etc.):

```ts
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    logger: new WinstonAdapter(winstonLogger),
    async requestHandler({ log }) {
        // `log` is a child of your custom logger, with prefix set to the crawler class name
        log.info('Hello from my custom logger!');
    },
});
```

The same logger is available as `crawler.log` outside of the request handler, for example when setting up routes.

### Global logger via service locator

Instead of passing the logger to each crawler individually, you can set it globally via the `serviceLocator`. This is useful when you run multiple crawlers and want them all to use the same logging backend:

```ts
import { serviceLocator, CheerioCrawler, PlaywrightCrawler } from 'crawlee';

// Set the logger globally — must be done before creating any crawlers
serviceLocator.setLogger(new WinstonAdapter(winstonLogger));

// Both crawlers will use the Winston logger
const cheerioCrawler = new CheerioCrawler({ /* ... */ });
const playwrightCrawler = new PlaywrightCrawler({ /* ... */ });
```

:::warning

`serviceLocator.setLogger()` must be called **before** any crawler is created. Once a logger has been retrieved from the service locator (which happens during crawler construction), it cannot be replaced — an error will be thrown.

:::
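
The ordering rule can be pictured with a small stand-in. `SketchLocator` below is a hypothetical class written for this illustration, with a made-up error message; it is not Crawlee's `ServiceLocator` and only models the behaviour stated in the warning: once the logger has been retrieved, replacing it throws.

```typescript
// Hypothetical stand-in; Crawlee's real ServiceLocator is not reproduced here.
interface NamedLogger {
    name: string;
}

class SketchLocator {
    private logger: NamedLogger | undefined;
    private retrieved = false;

    setLogger(logger: NamedLogger): void {
        if (this.retrieved) {
            // Made-up message for the sketch.
            throw new Error('Logger was already retrieved and cannot be replaced.');
        }
        this.logger = logger;
    }

    getLogger(): NamedLogger {
        // The first retrieval locks the choice in, falling back to a default logger.
        this.retrieved = true;
        if (!this.logger) {
            this.logger = { name: 'default' };
        }
        return this.logger;
    }
}

const locator = new SketchLocator();
locator.setLogger({ name: 'winston' });

// Stands in for what happens during crawler construction.
const used = locator.getLogger().name;

let lateError = '';
try {
    locator.setLogger({ name: 'pino' }); // too late: the logger is already in use
} catch (err) {
    lateError = (err as Error).message;
}
```

Under these assumptions, setting the logger first and constructing crawlers afterwards is the only order that works.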

## Full examples

<Tabs>
<TabItem value="winston" label="Winston" default>

<CodeBlock language="ts">{WinstonSource}</CodeBlock>

</TabItem>
<TabItem value="pino" label="Pino">

<CodeBlock language="ts">{PinoSource}</CodeBlock>

</TabItem>
</Tabs>
49 changes: 49 additions & 0 deletions docs/guides/custom-logger/pino.ts
@@ -0,0 +1,49 @@
import { CheerioCrawler, BaseCrawleeLogger, LogLevel } from 'crawlee';
import type { CrawleeLogger, CrawleeLoggerOptions } from 'crawlee';
import pino from 'pino';

// Map Crawlee log levels to Pino levels
const CRAWLEE_TO_PINO: Record<number, string> = {
    [LogLevel.ERROR]: 'error',
    [LogLevel.SOFT_FAIL]: 'warn',
    [LogLevel.WARNING]: 'warn',
    [LogLevel.INFO]: 'info',
    [LogLevel.DEBUG]: 'debug',
    [LogLevel.PERF]: 'trace',
};

class PinoAdapter extends BaseCrawleeLogger {
    constructor(
        private logger: pino.Logger,
        options?: Partial<CrawleeLoggerOptions>,
    ) {
        super(options);
    }

    logWithLevel(level: number, message: string, data?: Record<string, unknown>): void {
        const pinoLevel = CRAWLEE_TO_PINO[level] ?? 'info';
        const prefix = this.getOptions().prefix;
Member

It seems the prefix option requires some special handling.

Can we / is it worth mentioning this somewhere in the guide? Are there any other options fields that require special care?

Contributor Author

Good catch, it is not needed. In Crawlee we never use the prefix directly with the log function; we use it only through `createChild`, where the child instance handles it automatically.

        this.logger[pinoLevel as pino.Level]({ ...data, prefix }, message);
    }

    protected createChild(options: Partial<CrawleeLoggerOptions>): CrawleeLogger {
        return new PinoAdapter(this.logger.child({ prefix: options.prefix }), { ...this.getOptions(), ...options });
    }
}

// Create a Pino logger with your preferred configuration
const pinoLogger = pino({
    level: 'debug',
});

// Pass the adapter to the crawler via the `logger` option
const crawler = new CheerioCrawler({
    logger: new PinoAdapter(pinoLogger),
    async requestHandler({ request, $, log }) {
        log.info(`Processing ${request.url}`);
        const title = $('title').text();
        log.debug('Page title extracted', { title });
    },
});

await crawler.run(['https://crawlee.dev']);
60 changes: 60 additions & 0 deletions docs/guides/custom-logger/winston.ts
@@ -0,0 +1,60 @@
import { CheerioCrawler, BaseCrawleeLogger, LogLevel } from 'crawlee';
import type { CrawleeLogger, CrawleeLoggerOptions } from 'crawlee';
import winston from 'winston';

// Map Crawlee log levels to Winston levels
const CRAWLEE_TO_WINSTON: Record<number, string> = {
    [LogLevel.ERROR]: 'error',
    [LogLevel.SOFT_FAIL]: 'warn',
    [LogLevel.WARNING]: 'warn',
    [LogLevel.INFO]: 'info',
    [LogLevel.DEBUG]: 'debug',
    [LogLevel.PERF]: 'debug',
};

class WinstonAdapter extends BaseCrawleeLogger {
    constructor(
        private logger: winston.Logger,
        options?: Partial<CrawleeLoggerOptions>,
    ) {
        super(options);
    }

    logWithLevel(level: number, message: string, data?: Record<string, unknown>): void {
        const winstonLevel = CRAWLEE_TO_WINSTON[level] ?? 'info';
        this.logger.log(winstonLevel, message, {
            ...data,
            prefix: this.getOptions().prefix,
        });
    }

    protected createChild(options: Partial<CrawleeLoggerOptions>): CrawleeLogger {
        return new WinstonAdapter(this.logger.child({ prefix: options.prefix }), { ...this.getOptions(), ...options });
    }
}

// Create a Winston logger with your preferred configuration
const winstonLogger = winston.createLogger({
    level: 'debug',
    format: winston.format.combine(
        winston.format.colorize(),
        winston.format.timestamp(),
        winston.format.printf(({ level, message, timestamp, prefix }) => {
            const tag = prefix ? `[${prefix}] ` : '';
            return `${timestamp} ${level}: ${tag}${message}`;
        }),
    ),
    transports: [new winston.transports.Console()],
});

// Pass the adapter to the crawler via the `logger` option
const crawler = new CheerioCrawler({
    logger: new WinstonAdapter(winstonLogger),
    async requestHandler({ request, $, log }) {
        log.info(`Processing ${request.url}`);
        const title = $('title').text();
        log.debug('Page title extracted', { title });
    },
});

await crawler.run(['https://crawlee.dev']);
4 changes: 3 additions & 1 deletion docs/package.json
@@ -11,8 +11,10 @@
     },
     "dependencies": {
         "impit": "^0.7.1",
+        "pino": "^9.6.0",
         "playwright-extra": "^4.3.6",
         "puppeteer-extra": "^3.3.6",
-        "puppeteer-extra-plugin-stealth": "^2.11.2"
+        "puppeteer-extra-plugin-stealth": "^2.11.2",
+        "winston": "^3.17.0"
     }
 }