Closed
Changes from 4 commits
Binary file modified .DS_Store
Contributor

to remove

Binary file not shown.
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
.env
.DS_Store
*.csv
1 change: 1 addition & 0 deletions cookbook/company-info/.env.example
Contributor

to remove

@@ -0,0 +1 @@
SGAI_API_KEY="your-api-key-here"
1 change: 1 addition & 0 deletions cookbook/company-info/scrapegraph_llama_index.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions cookbook/github-trending/.env.example
Contributor

to remove

@@ -0,0 +1 @@
SGAI_API_KEY="your-api-key-here"
1 change: 1 addition & 0 deletions cookbook/github-trending/scrapegraph_llama_index.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions cookbook/homes-forsale/.env.example
Contributor

to remove

@@ -0,0 +1 @@
SGAI_API_KEY="your-api-key-here"
1 change: 1 addition & 0 deletions cookbook/homes-forsale/scrapegraph_llama_index.ipynb

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions cookbook/wired-news/.env.example
Contributor

to remove

@@ -0,0 +1 @@
SGAI_API_KEY="your-api-key-here"
2 changes: 1 addition & 1 deletion cookbook/wired-news/scrapegraph_langgraph.ipynb
Contributor

doesn't need to be changed

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions cookbook/wired-news/scrapegraph_llama_index.ipynb

Large diffs are not rendered by default.

9 changes: 0 additions & 9 deletions scrapegraph-py/cookbook/README.md
Contributor

do not remove

This file was deleted.

Contributor

to remove

Large diffs are not rendered by default.

Contributor

to remove

Large diffs are not rendered by default.

960 changes: 960 additions & 0 deletions scrapegraph-py/cookbook/company-info/scrapegraph_llama_index.ipynb
Contributor

to remove

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions scrapegraph-py/cookbook/company-info/scrapegraph_sdk.ipynb
Contributor

to remove

Large diffs are not rendered by default.

39 changes: 39 additions & 0 deletions scrapegraph-py/cookbook/github-trending/a.py
Contributor

to remove

@@ -0,0 +1,39 @@
from llama_index.tools.scrapegraph.base import ScrapegraphToolSpec
from pydantic import BaseModel, Field
from typing import List
import os

# Initialize the ScrapegraphToolSpec
scrapegraph_tool = ScrapegraphToolSpec()

# Define the schema for a single repository
class RepositorySchema(BaseModel):
    name: str = Field(description="Name of the repository (e.g., 'owner/repo')")
    description: str = Field(description="Description of the repository")
    stars: int = Field(description="Star count of the repository")
    forks: int = Field(description="Fork count of the repository")
    today_stars: int = Field(description="Stars gained today")
    language: str = Field(description="Programming language used")

# Define the schema for a list of repositories
class ListRepositoriesSchema(BaseModel):
    repositories: List[RepositorySchema] = Field(description="List of GitHub trending repositories")

# Make the API call to scrape GitHub trending repositories
response = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract information about trending GitHub repositories",
    url="https://github.com/trending",
    api_key="sgai-cd497c94-9ac5-4259-b7b5-f3283affe481",
    schema=ListRepositoriesSchema,
)

# Get the result and print each repository
result = response["result"]
print("\nTrending Repositories:")
for repo in result["repositories"]:
    print(f"\nRepository: {repo['name']}")
    print(f"Description: {repo['description']}")
    print(f"Stars: {repo['stars']}")
    print(f"Forks: {repo['forks']}")
    print(f"Today's Stars: {repo['today_stars']}")
    print(f"Language: {repo['language']}")
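Note on the key handling: the .env.example files added in this PR define SGAI_API_KEY, whereas the script above passes the key as a string literal. A minimal sketch of reading it from the environment instead, using only the standard library, follows. The helper name load_api_key and its error message are hypothetical and not part of the PR. The sketch also assumes the variable has already been exported, since Python does not read a .env file by itself (a tool such as python-dotenv would be needed for that).

```python
import os


def load_api_key(env=None, var_name="SGAI_API_KEY"):
    """Return the API key from the environment (hypothetical helper, not in the PR)."""
    # Allow a plain dict to be passed in for testing; default to the real environment.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    # Treat the placeholder from .env.example as "not configured".
    if not key or key == "your-api-key-here":
        raise RuntimeError(f"{var_name} is not set; export it before running the script")
    return key
```

The returned value could then be passed as `api_key=load_api_key()` in the `scrapegraph_smartscraper` call.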
Contributor

to remove

Large diffs are not rendered by default.

1,042 changes: 1,042 additions & 0 deletions scrapegraph-py/cookbook/github-trending/scrapegraph_llama_index.ipynb
Contributor

to remove

Large diffs are not rendered by default.

Contributor

to remove

Large diffs are not rendered by default.

Contributor

to remove

Large diffs are not rendered by default.

803 changes: 803 additions & 0 deletions scrapegraph-py/cookbook/homes-forsale/scrapegraph_llama_index.ipynb
Contributor

to remove

Large diffs are not rendered by default.

Contributor

to remove

Large diffs are not rendered by default.

1,302 changes: 1,302 additions & 0 deletions scrapegraph-py/cookbook/research-agent/scrapegraph_langgraph_tavily.ipynb
Contributor

to remove

Large diffs are not rendered by default.

26 changes: 26 additions & 0 deletions scrapegraph-py/cookbook/trial.py
Contributor

to remove

Contributor Author

The Files changed panel is not up to date: these files are not present in the codebase (see https://github.com/ScrapeGraphAI/scrapegraph-sdk/tree/lama-index-integration/cookbook/company-info).

@@ -0,0 +1,26 @@

from llama_index.tools.scrapegraph.base import ScrapegraphToolSpec

scrapegraph_tool = ScrapegraphToolSpec()

from pydantic import BaseModel, Field

class FounderSchema(BaseModel):
    name: str = Field(description="Name of the founder")
    role: str = Field(description="Role of the founder")
    social_media: str = Field(description="Social media URL of the founder")

class ListFoundersSchema(BaseModel):
    founders: list[FounderSchema] = Field(description="List of founders")

response = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract product information",
    url="https://scrapegraphai.com/",
    api_key="sgai-cd497c94-9ac5-4259-b7b5-f3283affe481",
    schema=ListFoundersSchema,
)

result = response["result"]

for founder in result["founders"]:
    print(founder)
41 changes: 41 additions & 0 deletions scrapegraph-py/cookbook/wired-news/a.py
Contributor

to remove

@@ -0,0 +1,41 @@
from llama_index.tools.scrapegraph.base import ScrapegraphToolSpec
from pydantic import BaseModel, Field
from typing import List
import os

# Initialize the ScrapegraphToolSpec
scrapegraph_tool = ScrapegraphToolSpec()

# Schema for a single news item
class NewsItemSchema(BaseModel):
    category: str = Field(description="Category of the news (e.g., 'Health', 'Environment')")
    title: str = Field(description="Title of the news article")
    link: str = Field(description="URL to the news article")
    author: str = Field(description="Author of the news article")

# Schema containing a list of news items
class ListNewsSchema(BaseModel):
    news: List[NewsItemSchema] = Field(description="List of news articles with their details")

# Make the API call to scrape news articles
response = scrapegraph_tool.scrapegraph_smartscraper(
    prompt="Extract information about science news articles",
    url="https://www.wired.com/tag/science/",
    api_key="sgai-cd497c94-9ac5-4259-b7b5-f3283affe481",
    schema=ListNewsSchema,
)

# Get the result and print each news article
result = response["result"]
print("\nWired Science News Articles:")
for article in result["news"]:
    print(f"\nCategory: {article['category']}")
    print(f"Title: {article['title']}")
    print(f"Author: {article['author']}")
    print(f"Link: {article['link']}")

# Save to CSV (optional)
import pandas as pd
df = pd.DataFrame(result["news"])
df.to_csv("wired_news.csv", index=False)
print("\nData saved to wired_news.csv")
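The optional CSV step above depends on pandas. As a rough alternative sketch, the same rows could be serialized with the standard-library csv module. The function name news_to_csv is hypothetical, the field list is copied from NewsItemSchema, and the sample row is a made-up placeholder rather than real scraped output.

```python
import csv
import io

# Column order taken from the NewsItemSchema fields in the script above.
FIELDS = ["category", "title", "link", "author"]


def news_to_csv(articles, fields=FIELDS):
    """Serialize a list of article dicts to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for article in articles:
        # Missing keys become empty cells; keys outside `fields` are dropped.
        writer.writerow({name: article.get(name, "") for name in fields})
    return buffer.getvalue()


# Placeholder row for illustration only, not real scraped data.
sample = [
    {
        "category": "Science",
        "title": "Placeholder title",
        "link": "https://example.com/a",
        "author": "Jane Doe",
    }
]
csv_text = news_to_csv(sample)
```

In the script, `news_to_csv(result["news"])` would take the place of the DataFrame step, with the returned text written to wired_news.csv.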
Contributor

to remove

Large diffs are not rendered by default.

Contributor

to remove

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions scrapegraph-py/cookbook/wired-news/scrapegraph_sdk.ipynb
Contributor

to remove

Large diffs are not rendered by default.
