yeshu2004/go-web-crawler

Web Crawler

A simple and efficient web crawler written in Go. It fetches web pages and follows their links to explore progressively deeper, using a breadth-first search (BFS) approach.
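To make the BFS idea concrete, here is a minimal sketch of a depth-limited breadth-first traversal. It is not the repository's code: `linkGraph`, `item`, and `crawlBFS` are hypothetical names, and an in-memory map stands in for real HTTP fetching so the example runs on its own.

```go
package main

import "fmt"

// linkGraph is a hypothetical in-memory stand-in for the web:
// each URL maps to the links found on that page.
type linkGraph map[string][]string

// item pairs a URL with the depth at which it was discovered.
type item struct {
	url   string
	depth int
}

// crawlBFS visits pages level by level starting at seed and stops
// following links once maxDepth is reached. It returns the URLs in
// the order they were visited.
func crawlBFS(g linkGraph, seed string, maxDepth int) []string {
	visited := map[string]bool{seed: true}
	queue := []item{{url: seed, depth: 0}}
	var order []string

	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		order = append(order, cur.url)

		if cur.depth >= maxDepth {
			continue // do not expand pages at the depth limit
		}
		for _, next := range g[cur.url] {
			if !visited[next] {
				visited[next] = true
				queue = append(queue, item{url: next, depth: cur.depth + 1})
			}
		}
	}
	return order
}

func main() {
	g := linkGraph{
		"https://example.com/":  {"https://example.com/a", "https://example.com/b"},
		"https://example.com/a": {"https://example.com/c"},
		"https://example.com/b": {"https://example.com/c"},
	}
	// With maxDepth 1 the seed and its direct links are visited;
	// https://example.com/c (depth 2) is not.
	for _, u := range crawlBFS(g, "https://example.com/", 1) {
		fmt.Println(u)
	}
}
```

A FIFO queue is what makes the traversal breadth-first: every page at depth n is visited before any page at depth n+1, which is also what makes a depth limit easy to enforce.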

Features

  • Multi-threaded crawling for efficiency
  • Bloom filter for duplicate URL detection
  • Customizable depth and URL filtering
  • Graceful handling of robots.txt
  • Parsing HTML and extraction of links
  • Commented code for an easier workflow
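As an illustration of the duplicate-URL feature, the sketch below shows how a Bloom filter can answer "has this URL probably been seen?" in constant memory. It is an in-memory stand-in written for this README, not the repository's implementation (which may keep its filter in Redis); the `bloom` type, its methods, and the sizes used are all assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a fixed-size Bloom filter over strings (hypothetical type).
type bloom struct {
	bits []uint64
	m    uint64 // total number of bits
	k    uint64 // number of probes per item
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes; probe i uses h1 + i*h2 (double hashing).
func (b *bloom) hashes(s string) (uint64, uint64) {
	h1 := fnv.New64a()
	h1.Write([]byte(s))
	h2 := fnv.New64()
	h2.Write([]byte(s))
	return h1.Sum64(), h2.Sum64() | 1 // force an odd, non-zero stride
}

// Add sets the k bits that correspond to s.
func (b *bloom) Add(s string) {
	h1, h2 := b.hashes(s)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		b.bits[idx/64] |= uint64(1) << (idx % 64)
	}
}

// MayContain reports false if s was definitely never added, and true
// if it was added or if its bits collide with other entries.
func (b *bloom) MayContain(s string) bool {
	h1, h2 := b.hashes(s)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		if b.bits[idx/64]&(uint64(1)<<(idx%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	seen := newBloom(1024, 3)
	for _, u := range []string{"https://example.com/a", "https://example.com/b", "https://example.com/a"} {
		if seen.MayContain(u) {
			fmt.Println("skip (probably seen):", u)
			continue
		}
		seen.Add(u)
		fmt.Println("crawl:", u)
	}
}
```

The trade-off is that a Bloom filter never misses a URL that was added, but it can occasionally report an unseen URL as seen, so a crawler using one may skip a small fraction of pages in exchange for a fixed memory footprint.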

Run

  1. Set Up Redis Stack with Docker:
    • Pull the Redis Stack image:
      docker pull redis/redis-stack:latest
    • Run the Redis Stack container:
      docker run -d -p 6379:6379 --name redis-stack redis/redis-stack:latest
    • Verify the container is running:
      docker ps

Output

[Four screenshots of the crawler's output, captured 2026-02-04 between 10:37 AM and 10:41 AM]
