docs: Add PPE guide #416
Merged
6b5712b docs: Add PPE guide (janbuchar)
c3e0a8b Add components (janbuchar)
326ef1b Update docs/02_guides/pay_per_event.mdx (janbuchar)
d11b2bd Merge remote-tracking branch 'origin/master' into ppe-guide (janbuchar)
c92e6e3 Address review feedback (janbuchar)
d099e17 review feedback (janbuchar)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
--- | ||
id: pay-per-event | ||
title: Pay-per-event monetization | ||
description: Monetize your Actors using the pay-per-event pricing model | ||
--- | ||
|
||
import ActorChargeSource from '!!raw-loader!./code/actor_charge.py'; | ||
import ConditionalActorChargeSource from '!!raw-loader!./code/conditional_actor_charge.py'; | ||
import ApiLink from '@site/src/components/ApiLink'; | ||
import CodeBlock from '@theme/CodeBlock'; | ||
|
||
Apify provides several [pricing models](https://docs.apify.com/platform/actors/publishing/monetize) for monetizing your Actors. The most recent and most flexible one is [pay-per-event](https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event), which lets you charge your users programmatically, directly from your Actor. As the name suggests, you may charge the users each time a specific event occurs, for example when you call an external API or return a result. | ||
|
||
To use the pay-per-event pricing model, you first need to [set it up](https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event) for your Actor in the Apify console. After that, you're free to start charging for events. | ||
|
||
## Charging for events | ||
|
||
After monetization is set up in the Apify console, you can add <ApiLink to="class/Actor#charge">`Actor.charge`</ApiLink> calls to your code and start monetizing! | ||
|
||
<CodeBlock language="python"> | ||
{ActorChargeSource} | ||
</CodeBlock> | ||
|
||
Then you just push your code to Apify and that's it! The SDK will even keep track of the max total charge setting for you, so you will not provide more value than what the user chose to pay for. | ||
|
||
If you need finer control over charging, you can call <ApiLink to="class/Actor#get_charging_manager">`Actor.get_charging_manager`</ApiLink> to access the <ApiLink to="class/ChargingManager">`ChargingManager`</ApiLink>, which can provide more detailed information, for example how many events of each type can be charged before reaching the configured limit. | ||
|
||
## Transitioning from a different pricing model | ||
|
||
When you plan to start using the pay-per-event pricing model for an Actor that is already monetized with a different pricing model, your source code will need to support both pricing models during the transition period enforced by the Apify platform. Arguably the most frequent case is the transition from the pay-per-result model, which utilizes the `ACTOR_MAX_PAID_DATASET_ITEMS` environment variable to prevent returning unpaid dataset items. The following is an example of how to handle such scenarios. The key part is the <ApiLink to="class/ChargingManager#get_pricing_info">`ChargingManager.get_pricing_info`</ApiLink> method, which returns information about the current pricing model. | ||
|
||
<CodeBlock language="python"> | ||
{ConditionalActorChargeSource} | ||
</CodeBlock> | ||
|
||
## Local development | ||
|
||
It is encouraged to test your monetization code on your machine before releasing it to the public. To tell your Actor that it should work in pay-per-event mode, pass it the `ACTOR_TEST_PAY_PER_EVENT` environment variable: | ||
|
||
```shell | ||
ACTOR_TEST_PAY_PER_EVENT=true python -m youractor | ||
``` | ||
|
||
If you also wish to see all the events charged throughout the run, the Apify SDK keeps a log of charged events in a so-called charging dataset. Your charging dataset can be found under the `charging_log` name (unless you change your storage settings, this dataset is stored in `storage/datasets/charging_log/`). Please note that this log is not available when running the Actor in production on the Apify platform. | ||
|
||
Because pricing configuration is stored by the Apify platform, it is not available during local runs, so all events will have a default price of $1. |
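To make the local charging log easier to inspect, the following is a minimal sketch that tallies charged events from the files in `storage/datasets/charging_log/`. It assumes each dataset item is stored as a separate JSON file and that each record carries `event_name` and `charged_count` fields. Both the file layout and the field names are assumptions made for illustration, not something the guide confirms, and `summarize_charging_log` is a hypothetical helper, so check a real log file and adjust as needed.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_charging_log(log_dir: Path) -> Counter:
    """Sum charged event counts per event name from a local charging log.

    Assumptions (not confirmed by the guide): one JSON file per record,
    with 'event_name' and 'charged_count' keys.
    """
    totals: Counter = Counter()
    for item_path in sorted(log_dir.glob('*.json')):
        # Skip metadata files, which are not charging records.
        if item_path.name.startswith('__'):
            continue
        record = json.loads(item_path.read_text(encoding='utf-8'))
        # Treat a record without an explicit count as a single event.
        totals[record['event_name']] += int(record.get('charged_count', 1))
    return totals


if __name__ == '__main__':
    log_dir = Path('storage/datasets/charging_log')
    if log_dir.is_dir():
        for event_name, count in summarize_charging_log(log_dir).items():
            print(f'{event_name}: {count}')
```

Since every event defaults to $1 in local runs, the per-event counts are also the local dollar totals.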
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
from apify import Actor | ||
|
||
|
||
async def main() -> None: | ||
async with Actor: | ||
# highlight-start | ||
# Charge for a single occurrence of an event | ||
await Actor.charge(event_name='init') | ||
# highlight-end | ||
|
||
# Prepare some mock results | ||
result = [ | ||
{'word': 'Lorem'}, | ||
{'word': 'Ipsum'}, | ||
{'word': 'Dolor'}, | ||
{'word': 'Sit'}, | ||
{'word': 'Amet'}, | ||
] | ||
# highlight-start | ||
# Shortcut for charging for each pushed dataset item | ||
await Actor.push_data(result, 'result-item') | ||
# highlight-end | ||
|
||
# highlight-start | ||
# Or you can charge for a given number of events manually | ||
await Actor.charge( | ||
event_name='result-item', | ||
count=len(result), | ||
) | ||
# highlight-end |
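The guide states that the SDK keeps track of the max total charge setting so that an Actor does not provide more value than the user agreed to pay for. The following toy model sketches that idea in isolation. `ToyChargingBudget`, its methods, and the prices are all hypothetical and are not the Apify SDK's implementation or API. The sketch only shows how a per-event price and a total limit determine how many events can still be charged.

```python
from decimal import Decimal


class ToyChargingBudget:
    """Illustrative stand-in for a max-total-charge limit (not the Apify SDK)."""

    def __init__(self, max_total_charge_usd: Decimal, prices_usd: dict) -> None:
        # prices_usd maps event names to positive per-event prices (assumption).
        self.max_total_charge_usd = max_total_charge_usd
        self.prices_usd = prices_usd
        self.charged_counts: dict = {}

    def total_charged_usd(self) -> Decimal:
        """Total amount charged so far across all event types."""
        return sum(
            (self.prices_usd[name] * count for name, count in self.charged_counts.items()),
            Decimal(0),
        )

    def max_chargeable(self, event_name: str) -> int:
        """How many more events of this type fit under the limit."""
        remaining = self.max_total_charge_usd - self.total_charged_usd()
        if remaining <= 0:
            return 0
        return int(remaining // self.prices_usd[event_name])

    def charge(self, event_name: str, count: int = 1) -> int:
        """Charge up to `count` events and return how many were actually charged."""
        granted = min(count, self.max_chargeable(event_name))
        self.charged_counts[event_name] = self.charged_counts.get(event_name, 0) + granted
        return granted
```

In this model, a caller that gets back fewer charged events than it requested should stop producing results, which mirrors the behaviour the guide describes.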
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
from apify import Actor | ||
|
||
|
||
async def main() -> None: | ||
async with Actor: | ||
# Check the dataset because there might already be items | ||
# if the run migrated or was restarted | ||
default_dataset = await Actor.open_dataset() | ||
dataset_info = await default_dataset.get_info() | ||
charged_items = dataset_info.item_count if dataset_info else 0 | ||
|
||
# highlight-start | ||
if Actor.get_charging_manager().get_pricing_info().is_pay_per_event: | ||
# highlight-end | ||
await Actor.push_data({'hello': 'world'}, 'dataset-item') | ||
elif charged_items < (Actor.config.max_paid_dataset_items or 0): | ||
await Actor.push_data({'hello': 'world'}) | ||
charged_items += 1 |
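The branching in this sample can also be expressed as a pure function, which makes the transition logic easy to unit test without running an Actor. This is only a sketch: `PushPlan` and `plan_push` are hypothetical names introduced for illustration, and the `'dataset-item'` event name is taken from the sample. As in the sample, a missing `max_paid_dataset_items` value is treated as 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PushPlan:
    """How many items to push and which event (if any) to charge for them."""

    items_to_push: int
    charge_event: Optional[str]


def plan_push(
    *,
    is_pay_per_event: bool,
    batch_size: int,
    already_stored: int,
    max_paid_dataset_items: Optional[int],
    event_name: str = 'dataset-item',
) -> PushPlan:
    """Decide how to push a batch under either pricing model."""
    if is_pay_per_event:
        # Pay-per-event: push everything and charge the named event per item.
        return PushPlan(items_to_push=batch_size, charge_event=event_name)

    # Pay-per-result: never exceed the number of paid dataset items.
    remaining = max((max_paid_dataset_items or 0) - already_stored, 0)
    return PushPlan(items_to_push=min(batch_size, remaining), charge_event=None)
```

In an Actor, the inputs would come from the calls used in the sample: the pricing info from the charging manager, the dataset item count, and `Actor.config.max_paid_dataset_items`.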
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
import React from 'react'; | ||
import Link from '@docusaurus/Link'; | ||
// eslint-disable-next-line import/no-extraneous-dependencies | ||
import { useDocsVersion } from '@docusaurus/theme-common/internal'; | ||
import useDocusaurusContext from '@docusaurus/useDocusaurusContext'; | ||
|
||
// const pkg = require('../../../packages/crawlee/package.json'); | ||
// | ||
// const [v1, v2] = pkg.version.split('.'); | ||
// const stable = [v1, v2].join('.'); | ||
|
||
const ApiLink = ({ to, children }) => { | ||
return ( | ||
<Link to={`/reference/${to}`}>{children}</Link> | ||
); | ||
|
||
// const version = useDocsVersion(); | ||
// const { siteConfig } = useDocusaurusContext(); | ||
// | ||
// // if (siteConfig.presets[0][1].docs.disableVersioning || version.version === stable) { | ||
// if (siteConfig.presets[0][1].docs.disableVersioning) { | ||
// return ( | ||
// <Link to={`/api/${to}`}>{children}</Link> | ||
// ); | ||
// } | ||
// | ||
// return ( | ||
// <Link to={`/api/${version.version === 'current' ? 'next' : version.version}/${to}`}>{children}</Link> | ||
// ); | ||
}; | ||
|
||
export default ApiLink; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
import React from 'react'; | ||
|
||
export default function Gradients() { | ||
return ( | ||
<svg xmlns="http://www.w3.org/2000/svg" width="0" height="0" viewBox="0 0 0 0" fill="none"> | ||
<defs> | ||
<linearGradient id="gradient-1" x1="26.6667" y1="12" x2="14.2802" y2="34.5208" | ||
gradientUnits="userSpaceOnUse"> | ||
<stop offset="0%" stop-color="#9dceff"/> | ||
<stop offset="70%" stop-color="#4584b6"/> | ||
<stop offset="100%" stop-color="#4584b6"/> | ||
</linearGradient> | ||
<linearGradient id="gradient-2" x1="29.6667" y1="0" x2="-1.80874" y2="26.2295" | ||
gradientUnits="userSpaceOnUse"> | ||
<stop offset="0%" stop-color="#4584b6"/> | ||
</linearGradient> | ||
</defs> | ||
</svg> | ||
); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
import React from 'react'; | ||
import clsx from 'clsx'; | ||
import styles from './Highlights.module.css'; | ||
import Gradients from './Gradients'; | ||
|
||
const FeatureList = [ | ||
{ | ||
title: 'Python with type hints', | ||
Svg: require('../../static/img/features/runs-on-py.svg').default, | ||
description: ( | ||
<> | ||
Crawlee for Python is written in a modern way using type hints, providing code completion in your IDE | ||
and helping you catch bugs early on build time. | ||
</> | ||
), | ||
}, | ||
// { | ||
// title: 'HTTP scraping', | ||
// Svg: require('../../static/img/features/fingerprints.svg').default, | ||
// description: ( | ||
// <> | ||
// Crawlee makes HTTP requests that <a href="https://crawlee.dev/docs/guides/avoid-blocking"><b>mimic browser headers and TLS fingerprints</b></a>. | ||
// It also rotates them automatically based on data about real-world traffic. Popular HTML | ||
// parsers <b><a href="https://crawlee.dev/docs/guides/cheerio-crawler-guide">Cheerio</a> | ||
// and <a href="https://crawlee.dev/docs/guides/jsdom-crawler-guide">JSDOM</a></b> are included. | ||
// </> | ||
// ), | ||
// }, | ||
{ | ||
title: 'Headless browsers', | ||
Svg: require('../../static/img/features/works-everywhere.svg').default, | ||
description: ( | ||
<> | ||
Switch your crawlers from HTTP to a <a href="https://crawlee.dev/python/api/class/PlaywrightCrawler">headless browser</a> in 3 lines of code. | ||
Crawlee builds on top of <b>Playwright</b> and adds its own features. Chrome, Firefox and more. | ||
</> | ||
), | ||
|
||
// TODO: this is not true yet | ||
// Crawlee builds on top of <b>Playwright</b> and adds its own <b>anti-blocking features and human-like fingerprints</b>. Chrome, Firefox and more. | ||
}, | ||
{ | ||
title: 'Automatic scaling and proxy management', | ||
Svg: require('../../static/img/features/auto-scaling.svg').default, | ||
description: ( | ||
<> | ||
Crawlee automatically manages concurrency based on <a href="https://crawlee.dev/python/api/class/AutoscaledPool">available system resources</a> and | ||
<a href="https://crawlee.dev/python/api/class/ProxyConfiguration">smartly rotates proxies</a>. | ||
Proxies that often time-out, return network errors or bad HTTP codes like 401 or 403 are discarded. | ||
</> | ||
), | ||
}, | ||
// { | ||
// title: 'Queue and Storage', | ||
// Svg: require('../../static/img/features/storage.svg').default, | ||
// description: ( | ||
// <> | ||
// You can <a href="https://crawlee.dev/docs/guides/result-storage">save files, screenshots and JSON results</a> to disk with one line of code | ||
// or plug an adapter for your DB. Your URLs are <a href="https://crawlee.dev/docs/guides/request-storage">kept in a queue</a> that ensures their | ||
// uniqueness and that you don't lose progress when something fails. | ||
// </> | ||
// ), | ||
// }, | ||
// { | ||
// title: 'Helpful utils and configurability', | ||
// Svg: require('../../static/img/features/node-requests.svg').default, | ||
// description: ( | ||
// <> | ||
// Crawlee includes tools for <a href="https://crawlee.dev/api/utils/namespace/social">extracting social handles</a> or phone numbers, infinite scrolling, blocking | ||
// unwanted assets <a href="https://crawlee.dev/api/utils">and many more</a>. It works great out of the box, but also provides | ||
// <a href="https://crawlee.dev/api/basic-crawler/interface/BasicCrawlerOptions">rich configuration options</a>. | ||
// </> | ||
// ), | ||
// }, | ||
]; | ||
|
||
function Feature({ Svg, title, description }) { | ||
return ( | ||
<div className={clsx('col col--4')}> | ||
<div className="padding-horiz--md padding-bottom--md"> | ||
<div className={styles.featureIcon}> | ||
{Svg ? <Svg alt={title}/> : null} | ||
</div> | ||
<h3>{title}</h3> | ||
<p>{description}</p> | ||
</div> | ||
</div> | ||
); | ||
} | ||
|
||
export default function Highlights() { | ||
return ( | ||
<section className={styles.features}> | ||
<Gradients /> | ||
<div className="container"> | ||
<div className="row"> | ||
{FeatureList.map((props, idx) => ( | ||
<Feature key={idx} {...props} /> | ||
))} | ||
</div> | ||
</div> | ||
</section> | ||
); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
.features { | ||
display: flex; | ||
align-items: center; | ||
width: 100%; | ||
font-size: 18px; | ||
line-height: 32px; | ||
color: #41465d; | ||
} | ||
|
||
html[data-theme="dark"] .features { | ||
color: #b3b8d2; | ||
} | ||
|
||
.feature svg { | ||
height: 60px; | ||
width: 60px; | ||
} | ||
|
||
.features svg path:nth-child(1) { | ||
fill: url(#gradient-1) !important; | ||
} | ||
|
||
.features svg path:nth-child(n + 1) { | ||
fill: url(#gradient-2) !important; | ||
} | ||
|
||
html[data-theme="dark"] .featureIcon { | ||
background: #272c3d; | ||
} | ||
|
||
.featureIcon { | ||
display: flex; | ||
justify-content: center; | ||
align-items: center; | ||
margin-bottom: 24px; | ||
border-radius: 8px; | ||
background-color: #f2f3fb; | ||
width: 48px; | ||
height: 48px; | ||
} | ||
|
||
.features h3 { | ||
font-weight: 700; | ||
font-size: 18px; | ||
line-height: 32px; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
import React from 'react'; | ||
import clsx from 'clsx'; | ||
import CodeBlock from '@theme/CodeBlock'; | ||
import Link from '@docusaurus/Link'; | ||
import styles from './RunnableCodeBlock.module.css'; | ||
|
||
const EXAMPLE_RUNNERS = { | ||
playwright: '6i5QsHBMtm3hKph70', | ||
puppeteer: '7tWSD8hrYzuc9Lte7', | ||
cheerio: 'kk67IcZkKSSBTslXI', | ||
}; | ||
|
||
const RunnableCodeBlock = ({ children, actor, hash, type, ...props }) => { | ||
hash = hash ?? children.hash; | ||
|
||
if (!children.code) { | ||
throw new Error(`RunnableCodeBlock requires "code" and "hash" props | ||
Make sure you are importing the code block contents with the roa-loader.`); | ||
} | ||
|
||
if (!hash) { | ||
return ( | ||
<CodeBlock {...props}> | ||
{ children.code } | ||
</CodeBlock> | ||
); | ||
} | ||
|
||
const href = `https://console.apify.com/actors/${actor ?? EXAMPLE_RUNNERS[type ?? 'playwright']}?runConfig=${hash}&asrc=run_on_apify`; | ||
|
||
return ( | ||
<div className={clsx(styles.container, 'runnable-code-block')}> | ||
<Link href={href} className={styles.button} rel="follow"> | ||
Run on | ||
<svg width="91" height="25" viewBox="0 0 91 25" fill="none" xmlns="http://www.w3.org/2000/svg" className="apify-logo-light alignMiddle_src-theme-Footer-index-module"><path d="M3.135 2.85A3.409 3.409 0 0 0 .227 6.699l2.016 14.398 8.483-19.304-7.59 1.059Z" fill="#97D700"></path><path d="M23.604 14.847 22.811 3.78a3.414 3.414 0 0 0-3.64-3.154c-.077 0-.153.014-.228.025l-3.274.452 7.192 16.124c.54-.67.805-1.52.743-2.379Z" fill="#71C5E8"></path><path d="M5.336 24.595c.58.066 1.169-.02 1.706-.248l12.35-5.211L13.514 5.97 5.336 24.595Z" fill="#FF9013"></path><path d="M33.83 5.304h3.903l5.448 14.623h-3.494l-1.022-2.994h-5.877l-1.025 2.994h-3.384L33.83 5.304Zm-.177 9.032h4.14l-2-5.994h-.086l-2.054 5.994ZM58.842 5.304h3.302v14.623h-3.302V5.304ZM64.634 5.304h10.71v2.7h-7.4v4.101h5.962v2.632h-5.963v5.186h-3.309V5.303ZM82.116 14.38l-5.498-9.076h3.748l3.428 6.016h.085l3.599-6.016H91l-5.56 9.054v5.569h-3.324v-5.548ZM51.75 5.304h-7.292v14.623h3.3v-4.634h3.993a4.995 4.995 0 1 0 0-9.99Zm-.364 7.417h-3.628V7.875h3.627a2.423 2.423 0 0 1 0 4.846Z" className="apify-logo" fill="#000"></path></svg> | ||
</Link> | ||
<CodeBlock {...props} className={clsx(styles.codeBlock, 'code-block', props.title != null ? 'has-title' : 'no-title')}> | ||
{ children.code } | ||
</CodeBlock> | ||
</div> | ||
); | ||
}; | ||
|
||
export default RunnableCodeBlock; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
.button { | ||
display: inline-block; | ||
padding: 3px 10px; | ||
position: absolute; | ||
top: 9px; | ||
right: 9px; | ||
z-index: 1; | ||
font-size: 16px; | ||
line-height: 28px; | ||
background: var(--prism-background-color); | ||
color: var(--prism-color); | ||
border: 1px solid var(--ifm-color-emphasis-300); | ||
border-radius: var(--ifm-global-radius); | ||
opacity: 0.7; | ||
font-weight: 600; | ||
width: 155px; | ||
} | ||
|
||
@media screen and (max-width: 768px) { | ||
.button { | ||
display: none; | ||
} | ||
} | ||
|
||
.button svg { | ||
height: 20px; | ||
position: absolute; | ||
top: 7.5px; | ||
right: 0; | ||
} | ||
|
||
.button:hover { | ||
opacity: 1; | ||
color: var(--prism-color); | ||
} | ||
|
||
.container { | ||
position: relative; | ||
} |
`Actor.get_charging_manager()` - brackets
`ChargingManager.get_pricing_info` - no brackets
Could you unify it?