A powerful and completely FREE Apify actor that scrapes contact data from Apollo.io lists. Extract names, emails, phone numbers, job titles, companies, and more with just a URL and page count!
Converted from Chrome Extension to Cloud-Based Scraper!
First time here? Read START_HERE.md to choose your path!
Want to start now? Jump to QUICK_START.md for a 5-minute setup!
- Simple Input: Just provide an Apollo.io list URL and number of pages
- Completely Free: Designed to run on Apify's free tier
- Rich Data: Extract first name, last name, email, phone, title, company, and more
- Fast & Reliable: Uses Playwright for stable scraping
- Rate Limiting: Configurable delays between pages to avoid blocks
- Multiple Export Formats: Download as CSV, JSON, Excel, or HTML
- Proxy Support: Built-in Apify proxy support for better reliability
| Document | Description | Audience |
|---|---|---|
| QUICK_START.md | Get started in 5 minutes | Everyone |
| SETUP_GUIDE.md | Complete setup instructions | Beginners |
| README.md | Main documentation (this file) | Everyone |
| USAGE.md | Detailed usage & examples | Users |
| DEPLOYMENT.md | How to deploy to Apify | DevOps |
| CONTRIBUTING.md | How to contribute | Developers |
| PROJECT_SUMMARY.md | Technical overview | Developers |
| CHANGELOG.md | Version history | Everyone |
- Go to Apify: Visit apify.com and create a free account
- Create Actor: Click on "Actors" → "Create new" → "Import from Git"
- Import This Repo: Paste your repository URL
- Build & Run: Click "Build" and then "Start"

To call the actor through the API with `apify-client`:

    import { ApifyClient } from 'apify-client';

    // Authenticate with your Apify API token.
    const client = new ApifyClient({
        token: 'YOUR_APIFY_TOKEN',
    });

    const input = {
        url: "https://app.apollo.io/#/people?page=1",
        numberOfPages: 5,
        timeBetweenPages: 5,
    };

    // Start the actor and wait for the run to finish.
    const run = await client.actor("YOUR_ACTOR_ID").call(input);

    // Read the scraped contacts from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);

To run the actor locally:

- Clone this repository

      git clone <your-repo-url>
      cd apollo-data-scraper

- Install dependencies

      npm install

- Set up input: create a file `input.json`:

      {
        "url": "https://app.apollo.io/#/people?page=1",
        "numberOfPages": 5,
        "timeBetweenPages": 5
      }

- Run the actor

      npm start

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | String | Yes | - | Apollo.io list URL (must start with `https://app.apollo.io/`) |
| `numberOfPages` | Integer | Yes | 1 | Number of pages to scrape (1-100) |
| `timeBetweenPages` | Integer | No | 5 | Delay in seconds between pages (2-30) |
| `proxyConfiguration` | Object | No | `{useApifyProxy: true}` | Proxy settings for the scraper |
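
The ranges and defaults in the table can be checked before a run starts. The sketch below is one way to do that. `validateInput` is a hypothetical helper, not a function taken from main.js, and it assumes the limits are exactly the ones documented in the table.

```javascript
// Hypothetical helper: validates actor input against the documented ranges.
// Not taken from main.js; the real actor may validate differently.
const APOLLO_PREFIX = 'https://app.apollo.io/';

function validateInput(raw) {
  // Apply the documented defaults, then let the caller's values override them.
  const input = {
    numberOfPages: 1,
    timeBetweenPages: 5,
    proxyConfiguration: { useApifyProxy: true },
    ...raw,
  };
  const errors = [];

  if (typeof input.url !== 'string' || !input.url.startsWith(APOLLO_PREFIX)) {
    errors.push(`url must start with ${APOLLO_PREFIX}`);
  }
  if (!Number.isInteger(input.numberOfPages) || input.numberOfPages < 1 || input.numberOfPages > 100) {
    errors.push('numberOfPages must be an integer from 1 to 100');
  }
  if (!Number.isInteger(input.timeBetweenPages) || input.timeBetweenPages < 2 || input.timeBetweenPages > 30) {
    errors.push('timeBetweenPages must be an integer from 2 to 30');
  }
  if (errors.length > 0) {
    throw new Error(`Invalid input: ${errors.join('; ')}`);
  }
  return input;
}
```

Collecting every problem before throwing means a single failed run reports all invalid fields at once.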

Example input:

    {
      "url": "https://app.apollo.io/#/people?finderViewId=123456&page=1",
      "numberOfPages": 10,
      "timeBetweenPages": 5
    }

The actor saves data to an Apify dataset. Each contact is saved as:

    {
      "firstName": "John",
      "lastName": "Doe",
      "fullName": "John Doe",
      "email": "john.doe@company.com",
      "phone": "+1 (555) 123-4567",
      "title": "Software Engineer",
      "company": "Tech Corp"
    }

You can download the scraped data in multiple formats:

- CSV - Perfect for Excel and Google Sheets
- JSON - For developers and APIs
- Excel - Native XLSX format
- HTML - For viewing in browser
- RSS - For feed readers
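
These formats can also be requested programmatically. The sketch below builds a download URL for the Apify dataset items endpoint, which accepts a `format` query parameter. `buildExportUrl` is a hypothetical helper and `YOUR_DATASET_ID` is a placeholder; for a private dataset the request also needs your API token.

```javascript
// Maps the export formats listed above to the `format` values
// accepted by the Apify dataset items endpoint.
const EXPORT_FORMATS = new Map([
  ['CSV', 'csv'],
  ['JSON', 'json'],
  ['Excel', 'xlsx'],
  ['HTML', 'html'],
  ['RSS', 'rss'],
]);

// Hypothetical helper: returns the download URL for a dataset in one format.
function buildExportUrl(datasetId, formatName) {
  const format = EXPORT_FORMATS.get(formatName);
  if (!format) {
    throw new Error(`Unsupported export format: ${formatName}`);
  }
  const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
  url.searchParams.set('format', format);
  return url.toString();
}

// Example with a placeholder dataset ID:
console.log(buildExportUrl('YOUR_DATASET_ID', 'CSV'));
// prints https://api.apify.com/v2/datasets/YOUR_DATASET_ID/items?format=csv
```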
- Start Small: Test with 1-2 pages first to ensure your URL works
- Use Delays: Keep `timeBetweenPages` at 5+ seconds to avoid rate limiting
- Check URL: Make sure you're logged into Apollo.io and the URL is accessible
- Free Tier: Apify's free tier includes a $5/month credit, which is enough for thousands of contacts
- Proxy Usage: Enable Apify proxy for better reliability (included in free tier)
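
To illustrate the delay tip, here is a minimal sketch of a pacing loop. It is not the loop from main.js: `scrapeWithDelays` and the `scrapePage` callback are hypothetical names, and the sketch assumes each page can be scraped by a function that returns an array of contacts.

```javascript
// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical pacing loop: calls `scrapePage(pageNumber)` for each page and
// waits `timeBetweenPages` seconds between pages (but not after the last one).
async function scrapeWithDelays({ numberOfPages, timeBetweenPages }, scrapePage, wait = sleep) {
  const contacts = [];
  for (let page = 1; page <= numberOfPages; page += 1) {
    const rows = await scrapePage(page);
    contacts.push(...rows);
    if (page < numberOfPages) {
      await wait(timeBetweenPages * 1000);
    }
  }
  return contacts;
}
```

Passing the wait function as a parameter keeps the loop easy to test without real delays. A first run with `numberOfPages` set to 1 or 2 finishes quickly and confirms that the URL works.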
This actor is optimized to run on Apify's free tier:
- Free Credits: $5/month (plenty for most use cases)
- Memory: Uses minimal memory (256 MB is enough)
- Runtime: Efficient scraping to minimize compute time
- Storage: Datasets are free on Apify
Estimated Costs (on free tier):
- Scraping 100 contacts ≈ $0.01-0.02
- Scraping 1,000 contacts ≈ $0.10-0.20
- At those rates, the $5 free monthly credit covers roughly 25,000-50,000 contacts/month for FREE!
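
The monthly capacity follows from dividing the credit by the per-contact cost. The sketch below works through that arithmetic using only the per-1,000 estimates above. The helper name is hypothetical, amounts are in cents so the division stays exact, and actual costs depend on memory and run time.

```javascript
// Estimated cost of scraping 1,000 contacts, in cents ($0.10 to $0.20).
const COST_PER_1000_CENTS = { low: 10, high: 20 };
// Free monthly credit, in cents ($5).
const MONTHLY_CREDIT_CENTS = 500;

// Hypothetical helper: how many contacts a credit covers at a given rate.
function contactsPerMonth(creditCents, costPer1000Cents) {
  return Math.floor((creditCents / costPer1000Cents) * 1000);
}

console.log(contactsPerMonth(MONTHLY_CREDIT_CENTS, COST_PER_1000_CENTS.high)); // prints 25000
console.log(contactsPerMonth(MONTHLY_CREDIT_CENTS, COST_PER_1000_CENTS.low)); // prints 50000
```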
You need to be logged into Apollo.io for this scraper to work. There are two ways to handle this:
Option 1: Manual login

- Run the actor in headed mode (set `headless: false` in main.js)
- The browser will open; log into Apollo.io manually
- The scraper will then access your lists

Option 2: Cookie injection

- Log into Apollo.io in your browser
- Export your cookies using a browser extension
- Add cookie support to the actor (modify main.js to inject cookies)
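
For the cookie approach, a minimal sketch of the injection step is shown below. It assumes the extension exports cookies in an EditThisCookie-style JSON array (`name`, `value`, `domain`, `path`, `secure`, `httpOnly`, `sameSite`, `expirationDate`); other extensions may use a different shape. Both helper names are hypothetical and not part of main.js; `context.addCookies` is Playwright's `BrowserContext` method.

```javascript
// Maps extension-style sameSite values to the values Playwright expects.
const SAME_SITE = new Map([
  ['strict', 'Strict'],
  ['lax', 'Lax'],
  ['no_restriction', 'None'],
]);

// Hypothetical helper: converts one exported cookie to Playwright's cookie shape.
function toPlaywrightCookie(exported) {
  const cookie = {
    name: exported.name,
    value: exported.value,
    domain: exported.domain,
    path: exported.path || '/',
    secure: Boolean(exported.secure),
    httpOnly: Boolean(exported.httpOnly),
    // Session cookies have no expiry; Playwright uses -1 for those.
    expires: typeof exported.expirationDate === 'number' ? exported.expirationDate : -1,
  };
  const sameSite = SAME_SITE.get(String(exported.sameSite).toLowerCase());
  if (sameSite) {
    cookie.sameSite = sameSite;
  }
  return cookie;
}

// `context` is a Playwright BrowserContext. Call this before opening the list URL.
async function injectCookies(context, exportedCookies) {
  await context.addCookies(exportedCookies.map(toPlaywrightCookie));
}
```

Exported cookies grant access to your Apollo.io session, so keep them out of source control and pass them in through a secret input or environment variable.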
- Only scrape data you have permission to access
- Respect Apollo.io's Terms of Service
- Use reasonable delays between requests
- Don't overload their servers
- Don't use for spam or unauthorized purposes
- This tool is for personal/research use

    apollo-data-scraper/
    ├── actor.json          # Actor configuration
    ├── INPUT_SCHEMA.json   # Input field definitions
    ├── main.js             # Main scraping logic
    ├── package.json        # Dependencies
    ├── Dockerfile          # Docker configuration
    └── README.md           # This file

- apify (^3.1.0) - Apify SDK for actor development
- playwright (^1.40.0) - Browser automation
You can modify main.js to:
- Extract additional fields from the table
- Change the data structure
- Add custom filters
- Implement different scraping strategies
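
As an illustration of the first three points, the sketch below maps one table row to a contact object and applies a custom filter. The column order in `COLUMNS` is an assumption made for this example, not Apollo.io's actual layout, and `rowToContact` and `hasEmail` are hypothetical names rather than functions from main.js.

```javascript
// Hypothetical column order; check the real table in the browser
// before relying on these indexes.
const COLUMNS = ['fullName', 'title', 'company', 'email', 'phone'];

// Converts an array of cell texts into a contact object.
function rowToContact(cells) {
  const contact = {};
  COLUMNS.forEach((field, index) => {
    contact[field] = (cells[index] || '').trim();
  });
  // Split the full name on whitespace: first token, then the rest.
  const [firstName, ...rest] = contact.fullName.split(/\s+/);
  contact.firstName = firstName || '';
  contact.lastName = rest.join(' ');
  return contact;
}

// Example custom filter: keep only contacts that have an email address.
const hasEmail = (contact) => contact.email.includes('@');

// Made-up sample rows for illustration only.
const rows = [
  ['John Doe', 'Software Engineer', 'Tech Corp', 'john.doe@company.com', '+1 (555) 123-4567'],
  ['Jane Roe', 'Designer', 'Tech Corp', '', ''],
];
console.log(rows.map(rowToContact).filter(hasEmail));
```

Only the first sample row survives the filter, because the second has no email address. Adding a field means appending its name to `COLUMNS` at the matching column position.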
| Issue | Solution |
|---|---|
| "No table found" | Make sure you're logged into Apollo.io and the URL is valid |
| "No data scraped" | Check if the page requires authentication or has changed structure |
| Rate limiting | Increase timeBetweenPages to 10+ seconds |
| Timeout errors | Increase timeout values in main.js |
| Actor fails to build | Make sure all files are committed to your repository |
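
For the timeout row, the sketch below shows one place where timeouts could be raised. `applyTimeouts` is a hypothetical helper and its default values are arbitrary; `setDefaultTimeout` and `setDefaultNavigationTimeout` are methods of Playwright's `Page`. How main.js currently sets its timeouts is not shown in this README, so treat this as a starting point.

```javascript
// Hypothetical helper: raises Playwright timeouts on a page.
// `page` is a Playwright Page; the default values here are assumptions.
function applyTimeouts(page, { actionMs = 30000, navigationMs = 60000 } = {}) {
  // Applies to actions such as waiting for a selector.
  page.setDefaultTimeout(actionMs);
  // Applies to navigations such as page.goto() and takes priority
  // over setDefaultTimeout for them.
  page.setDefaultNavigationTimeout(navigationMs);
}
```

A call such as `applyTimeouts(page, { navigationMs: 120000 })` right after the page is created gives slow list pages more time to load.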
- README.md (you are here) - Main documentation
- QUICK_START.md - Get started in 5 minutes
- USAGE.md - Detailed usage examples and best practices
- DEPLOYMENT.md - Complete deployment guide
- CONTRIBUTING.md - How to contribute to this project
- CHANGELOG.md - Version history and updates
- PROJECT_SUMMARY.md - Technical overview
| Feature | Browser Extension | Apify Actor |
|---|---|---|
| Installation | Chrome only | Works anywhere |
| Automation | Manual clicks | Fully automated |
| Scheduling | No | Yes (free schedules) |
| API Access | No | Yes |
| Large Datasets | Slow | Fast & parallel |
| Cost | Free | Free tier available |
| Reliability | Browser dependent | Cloud-based |
See PROJECT_SUMMARY.md for a detailed comparison
Contributions are welcome! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
- Improve documentation
This project is licensed under the MIT License - see the LICENSE file for details.
- Original Chrome extension by Liveupx
- Converted to Apify Actor for cloud automation
- Built with Apify SDK and Playwright
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: your-email@example.com
- Donate: Buy Me a Coffee

Made with ❤️ for the data community
Disclaimer: This tool is for educational and research purposes. Always respect website terms of service and data privacy laws.