Commit 2b713b3

VinciGit00 and claude committed
tests(python): add v2.1.0 endpoint verification scripts and results
Six scripts under tests/python-v2.1.0/ that each call one SDK endpoint (scrape, extract, search, credits+health, crawl, monitor) five times against the live v2 API. Reads SGAI_API_KEY from env. Captured output in results.txt — 30/30 calls returned status=success against scrapegraph-py 2.1.0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent fb42933 commit 2b713b3

8 files changed: 235 additions & 0 deletions


tests/python-v2.1.0/README.md

Lines changed: 35 additions & 0 deletions
# Python SDK v2.1.0 — Endpoint Verification

These scripts back the `sdks/python.mdx` refresh. Each exercises one
endpoint of `scrapegraph-py>=2.1.0` five or more times against the live
v2 API and prints `status` + `elapsed_ms` per call.

## How to run

```bash
python3 -m venv venv
./venv/bin/pip install "scrapegraph-py>=2.1.0"
export SGAI_API_KEY="sgai-..."  # your key

./venv/bin/python test_scrape.py
./venv/bin/python test_extract.py
./venv/bin/python test_search.py
./venv/bin/python test_utilities.py  # credits + health
./venv/bin/python test_crawl.py
./venv/bin/python test_monitor.py
```

## Results (2026-04-21, sdk 2.1.0)

Every run was `status=success` on every call. Captured output is in
[`results.txt`](./results.txt). Summary:

| Endpoint  | Calls | Success | Notes                                             |
| --------- | ----- | ------- | ------------------------------------------------- |
| `scrape`  | 5     | 5       | 5 distinct URLs, `MarkdownFormatConfig`           |
| `extract` | 5     | 5       | 5 URLs × distinct prompts, `json_data` populated  |
| `search`  | 5     | 5       | `num_results=3`, 2–3 hits per query               |
| `credits` | 5     | 5       | `remaining` / `used` returned                     |
| `health`  | 5     | 5       | `status=ok`                                       |
| `crawl`   | 5     | 5       | `max_pages=2, max_depth=1`, polled to `completed` |
| `monitor` | 5     | 5       | create → delete lifecycle, `cron_id` returned     |

tests/python-v2.1.0/results.txt

Lines changed: 53 additions & 0 deletions
```
scrapegraph-py 2.1.0 — live endpoint verification (2026-04-21)

=== test_scrape.py ===
[scrape 1/5] https://example.com -> status=success len=168 elapsed_ms=66
[scrape 2/5] https://scrapegraphai.com -> status=success len=25939 elapsed_ms=428
[scrape 3/5] https://httpbin.org/html -> status=success len=3600 elapsed_ms=903
[scrape 4/5] https://www.iana.org/ -> status=success len=2557 elapsed_ms=822
[scrape 5/5] https://example.org -> status=success len=168 elapsed_ms=69

=== test_extract.py ===
[extract 1/5] https://example.com -> status=success elapsed_ms=342 keys=['title', 'heading']
[extract 2/5] https://scrapegraphai.com -> status=success elapsed_ms=4581 keys=['answer', 'description']
[extract 3/5] https://www.iana.org/ -> status=success elapsed_ms=391 keys=['main_purpose']
[extract 4/5] https://example.org -> status=success elapsed_ms=304 keys=['title', 'description']
[extract 5/5] https://httpbin.org/html -> status=success elapsed_ms=474 keys=['summary']

=== test_search.py ===
[search 1/5] 'best programming languages 2025' -> status=success results=2 elapsed_ms=16309
[search 2/5] 'latest AI research breakthroughs' -> status=success results=3 elapsed_ms=2155
[search 3/5] 'python web scraping libraries' -> status=success results=3 elapsed_ms=1874
[search 4/5] 'top e-commerce platforms' -> status=success results=3 elapsed_ms=2161
[search 5/5] 'climate change recent news' -> status=success results=3 elapsed_ms=2137

=== test_utilities.py (credits + health) ===
[utils 1/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 2/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 3/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 4/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 5/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok

=== test_crawl.py ===
[crawl 1/5] https://example.com id=41ef82d7 final=completed
[crawl 2/5] https://scrapegraphai.com id=5cc234e7 final=completed
[crawl 3/5] https://example.org id=a29f5267 final=completed
[crawl 4/5] https://www.iana.org/ id=8d51e6e5 final=completed
[crawl 5/5] https://httpbin.org/ id=ee8d26ca final=completed

=== test_monitor.py ===
[monitor 1/5] created id=e4606611 interval=0 * * * *
[monitor 2/5] created id=7cb417bf interval=0 * * * *
[monitor 3/5] created id=8ec56850 interval=0 * * * *
[monitor 4/5] created id=793ff7ff interval=0 * * * *
[monitor 5/5] created id=353213ae interval=0 * * * *
Cleaned up 5 monitors

--- Summary ---
scrape: 5/5 success
extract: 5/5 success
search: 5/5 success
credits: 5/5 success
health: 5/5 success
crawl: 5/5 success (all reached status=completed)
monitor: 5/5 success (create + delete lifecycle)
```
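The per-endpoint lines in the Summary block can be derived mechanically from the raw call lines above. A minimal sketch of that tally (a hypothetical helper, not part of the committed scripts):

```python
from collections import Counter

def summarize(calls):
    """Tally (endpoint, status) pairs into 'endpoint: ok/total success' lines,
    in first-seen endpoint order."""
    ok, total = Counter(), Counter()
    for endpoint, status in calls:
        total[endpoint] += 1
        if status == "success":
            ok[endpoint] += 1
    return [f"{e}: {ok[e]}/{total[e]} success" for e in total]

print("\n".join(summarize([("scrape", "success")] * 5)))  # scrape: 5/5 success
```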

tests/python-v2.1.0/test_crawl.py

Lines changed: 33 additions & 0 deletions
```python
"""Start + poll + cleanup a 2-page crawl against 5 URLs. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import MarkdownFormatConfig, ScrapeGraphAI

sgai = ScrapeGraphAI()

urls = [
    "https://example.com",
    "https://scrapegraphai.com",
    "https://example.org",
    "https://www.iana.org/",
    "https://httpbin.org/",
]

for i, u in enumerate(urls, 1):
    start = sgai.crawl.start(u, formats=[MarkdownFormatConfig()], max_pages=2, max_depth=1)
    if start.status != "success" or not start.data:
        print(f"[crawl {i}/5] start failed: {start.error}")
        continue
    cid = start.data.id
    final_status = start.data.status
    for _ in range(15):
        if final_status in ("completed", "failed", "stopped"):
            break
        time.sleep(2)
        g = sgai.crawl.get(cid)
        if g.status == "success" and g.data:
            final_status = g.data.status
        else:
            break
    print(f"[crawl {i}/5] {u} id={cid[:8]} final={final_status}")
    sgai.crawl.delete(cid)
```
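The bounded polling loop in test_crawl.py (check for a terminal state, sleep, re-fetch, give up after a fixed number of attempts) generalizes to any async job endpoint. A sketch with an injected status fetcher so it can run without the live API; the helper name and signature are illustrative, not SDK API:

```python
import time

TERMINAL = ("completed", "failed", "stopped")

def poll_until_terminal(get_status, attempts=15, delay=0.0):
    """Re-check a job's status until it reaches a terminal state or attempts run out.

    get_status() returns the current status string, or None on a failed fetch
    (which also ends the loop, mirroring the script's early `break`).
    """
    status = get_status()
    for _ in range(attempts):
        if status in TERMINAL or status is None:
            break
        time.sleep(delay)
        status = get_status()
    return status

# Fake job that reports "processing" twice, then "completed".
states = iter(["processing", "processing", "completed"])
print(poll_until_terminal(lambda: next(states)))  # completed
```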

tests/python-v2.1.0/test_extract.py

Lines changed: 21 additions & 0 deletions

```python
"""Extract against 5 URLs with scrapegraph-py>=2.1.0. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

cases = [
    ("https://example.com", "Extract the page title and main heading"),
    ("https://scrapegraphai.com", "What does this company do in one sentence?"),
    ("https://www.iana.org/", "Extract the main purpose of this organization"),
    ("https://example.org", "Extract title and description"),
    ("https://httpbin.org/html", "Summarize the page content in one line"),
]

for i, (u, p) in enumerate(cases, 1):
    res = sgai.extract(p, url=u)
    j = res.data.json_data if res.status == "success" and res.data else None
    keys = list(j.keys()) if isinstance(j, dict) else type(j).__name__
    print(f"[extract {i}/5] {u} -> status={res.status} elapsed_ms={res.elapsed_ms} keys={keys}")
    time.sleep(0.5)
```

tests/python-v2.1.0/test_monitor.py

Lines changed: 31 additions & 0 deletions

```python
"""Create + delete 5 monitors. Reads SGAI_API_KEY from env."""
from scrapegraph_py import MarkdownFormatConfig, ScrapeGraphAI

sgai = ScrapeGraphAI()

urls = [
    "https://example.com",
    "https://example.org",
    "https://www.iana.org/",
    "https://httpbin.org/html",
    "https://scrapegraphai.com",
]

created = []
for i, u in enumerate(urls, 1):
    res = sgai.monitor.create(
        u,
        "0 * * * *",
        name=f"doc-test-monitor-{i}",
        formats=[MarkdownFormatConfig()],
    )
    if res.status != "success" or not res.data:
        print(f"[monitor {i}/5] create failed: {res.error}")
        continue
    cron_id = res.data.cron_id
    created.append(cron_id)
    print(f"[monitor {i}/5] created id={cron_id[:8]} interval={res.data.interval}")

for cid in created:
    sgai.monitor.delete(cid)
print(f"Cleaned up {len(created)} monitors")
```
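test_monitor.py deletes its monitors only after the create loop finishes, so an exception mid-loop would leak the monitors created so far. A try/finally variant guarantees cleanup of whatever was created; this is a sketch with stand-in create/delete callables, not the SDK client:

```python
def run_with_cleanup(create, delete, n=5):
    """Create n resources, always deleting every one that was created,
    even if a later create() raises."""
    created = []
    try:
        for i in range(1, n + 1):
            created.append(create(f"doc-test-monitor-{i}"))
    finally:
        for rid in created:
            delete(rid)
    return len(created)

removed = []
count = run_with_cleanup(lambda name: name, removed.append)
print(f"Cleaned up {count} monitors")  # Cleaned up 5 monitors
```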

tests/python-v2.1.0/test_scrape.py

Lines changed: 24 additions & 0 deletions
```python
"""Scrape 5 distinct URLs with scrapegraph-py>=2.1.0. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import MarkdownFormatConfig, ScrapeGraphAI

sgai = ScrapeGraphAI()

urls = [
    "https://example.com",
    "https://scrapegraphai.com",
    "https://httpbin.org/html",
    "https://www.iana.org/",
    "https://example.org",
]

for i, u in enumerate(urls, 1):
    res = sgai.scrape(u, formats=[MarkdownFormatConfig()])
    md = ""
    if res.status == "success":
        md = (res.data.results.get("markdown", {}) or {}).get("data") or ""
        if isinstance(md, list):
            md = md[0] if md else ""
    print(f"[scrape {i}/5] {u} -> status={res.status} len={len(md)} elapsed_ms={res.elapsed_ms}")
    time.sleep(0.5)
```
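The fallback chain in test_scrape.py tolerates several payload shapes for the markdown result: key missing, value None, `data` a string, a list, or empty. The same defensive lookup, pulled into a pure function on a plain dict so it runs without the live API (the helper name is hypothetical):

```python
def extract_markdown(results):
    """Pull markdown text out of a results payload whose shape may vary:
    missing 'markdown' key, None value, string data, or a list of strings."""
    md = (results.get("markdown") or {}).get("data") or ""
    if isinstance(md, list):
        md = md[0] if md else ""
    return md

print(extract_markdown({"markdown": {"data": ["# Example Domain"]}}))  # # Example Domain
```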

tests/python-v2.1.0/test_search.py

Lines changed: 20 additions & 0 deletions
```python
"""Search 5 queries with scrapegraph-py>=2.1.0. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

queries = [
    "best programming languages 2025",
    "latest AI research breakthroughs",
    "python web scraping libraries",
    "top e-commerce platforms",
    "climate change recent news",
]

for i, q in enumerate(queries, 1):
    res = sgai.search(q, num_results=3)
    n = len(res.data.results) if res.status == "success" and res.data else 0
    print(f"[search {i}/5] {q!r} -> status={res.status} results={n} elapsed_ms={res.elapsed_ms}")
    time.sleep(0.5)
```

tests/python-v2.1.0/test_utilities.py

Lines changed: 18 additions & 0 deletions

```python
"""Call credits() + health() 5 times each. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

for i in range(1, 6):
    c = sgai.credits()
    h = sgai.health()
    remaining = c.data.remaining if c.status == "success" and c.data else "?"
    used = c.data.used if c.status == "success" and c.data else "?"
    h_status = h.data.status if h.status == "success" and h.data else "?"
    print(
        f"[utils {i}/5] credits.status={c.status} remaining={remaining} used={used}"
        f" | health.status={h.status} service={h_status}"
    )
    time.sleep(0.3)
```
