🕸️ WebScraper AI Agent

An AI-powered web scraper that extracts and processes website content using Crawl4AI, LangChain, HuggingFace Embeddings, FAISS, and GROQ LLMs. It provides a simple Gradio UI where users can download the extracted text and ask questions about the scraped content.

🚀 Features

🌐 Crawl websites asynchronously with Crawl4AI
📄 Extract and chunk website text data
🧾 Download extracted content as a .txt file
🤖 Embed content using HuggingFaceEmbeddings
🔍 Perform semantic search using FAISS
💬 Answer questions using GROQ LLM (via LangChain)
🎛️ Clean and interactive UI using Gradio
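
The "extract and chunk" step above can be sketched in plain Python. This is only an illustration of overlapping character windows, not the project's actual code: the app presumably uses a LangChain text splitter, and the function name `chunk_text` and its default sizes are assumptions made for this example.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    Hypothetical helper for illustration; the sizes are arbitrary defaults.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so the
        # tail is not emitted again as a shorter duplicate chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "abcdefghij" * 3  # 30 characters standing in for scraped page text
    for piece in chunk_text(sample, chunk_size=12, overlap=4):
        print(repr(piece))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval later.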

🧰 Tech Stack

Python
LangChain
FAISS
HuggingFace Transformers
Crawl4AI
Gradio
GROQ LLM
dotenv
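
As a rough picture of what the embedding and FAISS parts of this stack do together, the sketch below ranks text chunks by cosine similarity to a question. It uses only the standard library: the word-count "embedding" is a toy stand-in for a HuggingFace model, and the linear scan is a stand-in for a FAISS index. The names `embed`, `cosine`, and `top_k` are hypothetical and do not come from the project.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (brute-force scan)."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    chunks = [
        "Pricing starts at ten dollars per month",
        "Our team is based in Berlin",
        "Contact support by email",
    ]
    print(top_k("where is the team based", chunks, k=1))
```

In the real pipeline the retrieved chunks would then be passed to the LLM as context for answering the question.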

📦 Installation

Clone the Repo

git clone https://github.com/jasoncobra3/WebScraper_AI_Agent.git
cd WebScraper_AI_Agent

Create Virtual Environment
```
 python -m venv venv
```

Activate the Virtual Environment

 # Windows:
 venv\Scripts\activate
 # macOS/Linux:
 source venv/bin/activate

Install Dependencies
```
pip install -r requirements.txt
```

🔐 Setup

Create a .env file in the project root folder containing:
```
 GROQ_API_KEY=your_groq_api_key_here
```
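
To check that the key is picked up before launching the app, something like the following can help. It is a minimal standard-library sketch of reading a `KEY=value` file; the project itself presumably relies on the dotenv package listed in the tech stack, and the helper names `load_env_file` and `get_groq_key` are assumptions made for this example.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values  # no file: nothing to load
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_key(path: str = ".env") -> str:
    """Return GROQ_API_KEY from the environment or the .env file, else raise."""
    key = os.environ.get("GROQ_API_KEY") or load_env_file(path).get("GROQ_API_KEY", "")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; add it to the .env file")
    return key


if __name__ == "__main__":
    try:
        get_groq_key()
        print("GROQ_API_KEY found")
    except RuntimeError as err:
        print(err)
```

Failing early with a clear message is usually easier to debug than an authentication error raised later from inside the LLM client.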

🚀 Run the App

Run the Script in Terminal

  python app.py

📁 Project Structure

├── app.py
├── requirements.txt
├── .env
├── .gitignore
├── README.md
└── Assets/

📸 Screenshots

🌐 Scraping Website	Semantic Search 📄

🤝 Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you’d like to change.
