Home
Welcome to the official wiki of the Node-Crawler repository, your go-to resource for understanding and using the tool. Node-Crawler is a powerful, flexible, and easy-to-use web crawler with a node-based editing system built on top of Next.js and React Flow.
Here you'll find detailed documentation for each feature of the application, along with guides on extending it, to help you make the most of Node-Crawler.
Node-Crawler is a Node.js-based web application for creating custom web crawlers and manipulating the collected data. Users can construct and modify their own crawler workflows by dragging and dropping nodes onto a canvas. It also provides extensive data manipulation and transformation options for cleaning and restructuring the gathered data. The final output can be exported in various formats including JSON, CSV, and database formats.
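To make the export step more concrete, here is a minimal TypeScript sketch of turning collected records into JSON and CSV. It is an illustration under stated assumptions, not Node-Crawler's actual exporter: the `CrawlRecord` type and the `toJson`, `toCsv`, and `escapeCsvField` helpers are hypothetical names invented for this example.

```typescript
// Hypothetical shape of one record collected by a crawler workflow
// (not taken from the Node-Crawler codebase).
type CrawlRecord = Record<string, string | number | boolean | null>;

// Serialize records as pretty-printed JSON.
function toJson(records: CrawlRecord[]): string {
  return JSON.stringify(records, null, 2);
}

// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes (RFC 4180 style).
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize records as CSV. The header row is the union of all keys,
// in order of first appearance; missing or null values become empty fields.
function toCsv(records: CrawlRecord[]): string {
  const columns: string[] = [];
  for (const record of records) {
    for (const key of Object.keys(record)) {
      if (!columns.includes(key)) columns.push(key);
    }
  }
  const lines = [columns.map(escapeCsvField).join(",")];
  for (const record of records) {
    lines.push(
      columns
        .map((column) => escapeCsvField(String(record[column] ?? "")))
        .join(",")
    );
  }
  return lines.join("\n");
}

// Example data standing in for the output of a crawl.
const records: CrawlRecord[] = [
  { title: "Hello, world", price: 9.5 },
  { title: 'Say "hi"', price: 12, inStock: true },
];

console.log(toJson(records));
console.log(toCsv(records));
```

Taking the union of keys for the header keeps the CSV rectangular even when records from different pages expose different fields, which is common for crawled data.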
If you're new to Node-Crawler, I recommend starting with the installation instructions provided in the repository's README. I've made the setup process straightforward, whether you're working in a development environment or deploying for production.
Node-Crawler is built to be extensible. If the existing nodes don't meet your requirements, you can create your own. This wiki includes a detailed guide on adding a new node to the system; see Implementing a New Node.
I hope this wiki helps you get the most out of Node-Crawler. Happy crawling!
- Welcome to Node-Crawler's Wiki!
- Editor Components
- Form Components
- Configuration
- Engine, Execution, and Pipelines
  - Overview of Node Logic and Types
  - Execution Flow and Sequence of Web Crawlers
  - Handling Different Node Types and Data Extraction
  - Error Handling, Logging, and Result Management
  - Overview of Pipelines and Their Structure
  - Execution Control (Step-by-Step, Next Node, Backtrack)
  - Pipeline Activation, Deactivation, and Connection with Nodes
- Data Models
- State Management
- Utilities
- Implementing a New Node