Commit 2b713b3

VinciGit00 and claude committed
tests(python): add v2.1.0 endpoint verification scripts and results
Six scripts under tests/python-v2.1.0/ that each call one SDK endpoint (scrape, extract, search, credits+health, crawl, monitor) five times against the live v2 API. Reads SGAI_API_KEY from env. Captured output in results.txt — 30/30 calls returned status=success against scrapegraph-py 2.1.0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent fb42933 commit 2b713b3

8 files changed: 235 additions & 0 deletions


tests/python-v2.1.0/README.md

Lines changed: 35 additions & 0 deletions
# Python SDK v2.1.0 — Endpoint Verification

These scripts back the `sdks/python.mdx` refresh. Each exercises one
endpoint of `scrapegraph-py>=2.1.0` five or more times against the live
v2 API and prints `status` + `elapsed_ms` per call.

## How to run

```bash
python3 -m venv venv
./venv/bin/pip install "scrapegraph-py>=2.1.0"
export SGAI_API_KEY="sgai-..."  # your key

./venv/bin/python test_scrape.py
./venv/bin/python test_extract.py
./venv/bin/python test_search.py
./venv/bin/python test_utilities.py  # credits + health
./venv/bin/python test_crawl.py
./venv/bin/python test_monitor.py
```

## Results (2026-04-21, sdk 2.1.0)

Every run was `status=success` on every call. Captured output is in
[`results.txt`](./results.txt). Summary:

| Endpoint  | Calls | Success | Notes                                             |
| --------- | ----- | ------- | ------------------------------------------------- |
| `scrape`  | 5     | 5       | 5 distinct URLs, `MarkdownFormatConfig`           |
| `extract` | 5     | 5       | 5 URLs × distinct prompts, `json_data` populated  |
| `search`  | 5     | 5       | `num_results=3`, 2–3 hits per query               |
| `credits` | 5     | 5       | `remaining` / `used` returned                     |
| `health`  | 5     | 5       | `status=ok`                                       |
| `crawl`   | 5     | 5       | `max_pages=2, max_depth=1`, polled to `completed` |
| `monitor` | 5     | 5       | create → delete lifecycle, `cron_id` returned     |

tests/python-v2.1.0/results.txt

Lines changed: 53 additions & 0 deletions
```
scrapegraph-py 2.1.0 — live endpoint verification (2026-04-21)

=== test_scrape.py ===
[scrape 1/5] https://example.com -> status=success len=168 elapsed_ms=66
[scrape 2/5] https://scrapegraphai.com -> status=success len=25939 elapsed_ms=428
[scrape 3/5] https://httpbin.org/html -> status=success len=3600 elapsed_ms=903
[scrape 4/5] https://www.iana.org/ -> status=success len=2557 elapsed_ms=822
[scrape 5/5] https://example.org -> status=success len=168 elapsed_ms=69

=== test_extract.py ===
[extract 1/5] https://example.com -> status=success elapsed_ms=342 keys=['title', 'heading']
[extract 2/5] https://scrapegraphai.com -> status=success elapsed_ms=4581 keys=['answer', 'description']
[extract 3/5] https://www.iana.org/ -> status=success elapsed_ms=391 keys=['main_purpose']
[extract 4/5] https://example.org -> status=success elapsed_ms=304 keys=['title', 'description']
[extract 5/5] https://httpbin.org/html -> status=success elapsed_ms=474 keys=['summary']

=== test_search.py ===
[search 1/5] 'best programming languages 2025' -> status=success results=2 elapsed_ms=16309
[search 2/5] 'latest AI research breakthroughs' -> status=success results=3 elapsed_ms=2155
[search 3/5] 'python web scraping libraries' -> status=success results=3 elapsed_ms=1874
[search 4/5] 'top e-commerce platforms' -> status=success results=3 elapsed_ms=2161
[search 5/5] 'climate change recent news' -> status=success results=3 elapsed_ms=2137

=== test_utilities.py (credits + health) ===
[utils 1/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 2/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 3/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 4/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok
[utils 5/5] credits.status=success remaining=998723 used=1663 | health.status=success service=ok

=== test_crawl.py ===
[crawl 1/5] https://example.com id=41ef82d7 final=completed
[crawl 2/5] https://scrapegraphai.com id=5cc234e7 final=completed
[crawl 3/5] https://example.org id=a29f5267 final=completed
[crawl 4/5] https://www.iana.org/ id=8d51e6e5 final=completed
[crawl 5/5] https://httpbin.org/ id=ee8d26ca final=completed

=== test_monitor.py ===
[monitor 1/5] created id=e4606611 interval=0 * * * *
[monitor 2/5] created id=7cb417bf interval=0 * * * *
[monitor 3/5] created id=8ec56850 interval=0 * * * *
[monitor 4/5] created id=793ff7ff interval=0 * * * *
[monitor 5/5] created id=353213ae interval=0 * * * *
Cleaned up 5 monitors

--- Summary ---
scrape: 5/5 success
extract: 5/5 success
search: 5/5 success
credits: 5/5 success
health: 5/5 success
crawl: 5/5 success (all reached status=completed)
monitor: 5/5 success (create + delete lifecycle)
```
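The per-endpoint lines in the Summary block can be derived mechanically from the raw call lines above. A minimal sketch of that tally (a hypothetical helper, not part of the committed scripts):

```python
from collections import Counter

def summarize(calls):
    """Tally (endpoint, status) pairs into 'endpoint: ok/total success' lines,
    in first-seen endpoint order."""
    ok, total = Counter(), Counter()
    for endpoint, status in calls:
        total[endpoint] += 1
        if status == "success":
            ok[endpoint] += 1
    return [f"{e}: {ok[e]}/{total[e]} success" for e in total]

print("\n".join(summarize([("scrape", "success")] * 5)))  # scrape: 5/5 success
```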

tests/python-v2.1.0/test_crawl.py

Lines changed: 33 additions & 0 deletions
```python
"""Start + poll + cleanup a 2-page crawl against 5 URLs. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import MarkdownFormatConfig, ScrapeGraphAI

sgai = ScrapeGraphAI()

urls = [
    "https://example.com",
    "https://scrapegraphai.com",
    "https://example.org",
    "https://www.iana.org/",
    "https://httpbin.org/",
]

for i, u in enumerate(urls, 1):
    start = sgai.crawl.start(u, formats=[MarkdownFormatConfig()], max_pages=2, max_depth=1)
    if start.status != "success" or not start.data:
        print(f"[crawl {i}/5] start failed: {start.error}")
        continue
    cid = start.data.id
    final_status = start.data.status
    for _ in range(15):
        if final_status in ("completed", "failed", "stopped"):
            break
        time.sleep(2)
        g = sgai.crawl.get(cid)
        if g.status == "success" and g.data:
            final_status = g.data.status
        else:
            break
    print(f"[crawl {i}/5] {u} id={cid[:8]} final={final_status}")
    sgai.crawl.delete(cid)
```
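The bounded polling loop in test_crawl.py (check for a terminal state, sleep, re-fetch, give up after a fixed number of attempts) generalizes to any async job endpoint. A sketch with an injected status fetcher so it can run without the live API; the helper name and signature are illustrative, not SDK API:

```python
import time

TERMINAL = ("completed", "failed", "stopped")

def poll_until_terminal(get_status, attempts=15, delay=0.0):
    """Re-check a job's status until it reaches a terminal state or attempts run out.

    get_status() returns the current status string, or None on a failed fetch
    (which also ends the loop, mirroring the script's early `break`).
    """
    status = get_status()
    for _ in range(attempts):
        if status in TERMINAL or status is None:
            break
        time.sleep(delay)
        status = get_status()
    return status

# Fake job that reports "processing" twice, then "completed".
states = iter(["processing", "processing", "completed"])
print(poll_until_terminal(lambda: next(states)))  # completed
```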

tests/python-v2.1.0/test_extract.py

Lines changed: 21 additions & 0 deletions

```python
"""Extract against 5 URLs with scrapegraph-py>=2.1.0. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

cases = [
    ("https://example.com", "Extract the page title and main heading"),
    ("https://scrapegraphai.com", "What does this company do in one sentence?"),
    ("https://www.iana.org/", "Extract the main purpose of this organization"),
    ("https://example.org", "Extract title and description"),
    ("https://httpbin.org/html", "Summarize the page content in one line"),
]

for i, (u, p) in enumerate(cases, 1):
    res = sgai.extract(p, url=u)
    j = res.data.json_data if res.status == "success" and res.data else None
    keys = list(j.keys()) if isinstance(j, dict) else type(j).__name__
    print(f"[extract {i}/5] {u} -> status={res.status} elapsed_ms={res.elapsed_ms} keys={keys}")
    time.sleep(0.5)
```

tests/python-v2.1.0/test_monitor.py

Lines changed: 31 additions & 0 deletions

```python
"""Create + delete 5 monitors. Reads SGAI_API_KEY from env."""
from scrapegraph_py import MarkdownFormatConfig, ScrapeGraphAI

sgai = ScrapeGraphAI()

urls = [
    "https://example.com",
    "https://example.org",
    "https://www.iana.org/",
    "https://httpbin.org/html",
    "https://scrapegraphai.com",
]

created = []
for i, u in enumerate(urls, 1):
    res = sgai.monitor.create(
        u,
        "0 * * * *",
        name=f"doc-test-monitor-{i}",
        formats=[MarkdownFormatConfig()],
    )
    if res.status != "success" or not res.data:
        print(f"[monitor {i}/5] create failed: {res.error}")
        continue
    cron_id = res.data.cron_id
    created.append(cron_id)
    print(f"[monitor {i}/5] created id={cron_id[:8]} interval={res.data.interval}")

for cid in created:
    sgai.monitor.delete(cid)
print(f"Cleaned up {len(created)} monitors")
```
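test_monitor.py deletes its monitors only after the create loop finishes, so an exception mid-loop would leak the monitors created so far. A try/finally variant guarantees cleanup of whatever was created; this is a sketch with stand-in create/delete callables, not the SDK client:

```python
def run_with_cleanup(create, delete, n=5):
    """Create n resources, always deleting every one that was created,
    even if a later create() raises."""
    created = []
    try:
        for i in range(1, n + 1):
            created.append(create(f"doc-test-monitor-{i}"))
    finally:
        for rid in created:
            delete(rid)
    return len(created)

removed = []
count = run_with_cleanup(lambda name: name, removed.append)
print(f"Cleaned up {count} monitors")  # Cleaned up 5 monitors
```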

tests/python-v2.1.0/test_scrape.py

Lines changed: 24 additions & 0 deletions
```python
"""Scrape 5 distinct URLs with scrapegraph-py>=2.1.0. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import MarkdownFormatConfig, ScrapeGraphAI

sgai = ScrapeGraphAI()

urls = [
    "https://example.com",
    "https://scrapegraphai.com",
    "https://httpbin.org/html",
    "https://www.iana.org/",
    "https://example.org",
]

for i, u in enumerate(urls, 1):
    res = sgai.scrape(u, formats=[MarkdownFormatConfig()])
    md = ""
    if res.status == "success":
        md = (res.data.results.get("markdown", {}) or {}).get("data") or ""
        if isinstance(md, list):
            md = md[0] if md else ""
    print(f"[scrape {i}/5] {u} -> status={res.status} len={len(md)} elapsed_ms={res.elapsed_ms}")
    time.sleep(0.5)
```
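The fallback chain in test_scrape.py tolerates several payload shapes for the markdown result: key missing, value None, `data` a string, a list, or empty. The same defensive lookup, pulled into a pure function on a plain dict so it runs without the live API (the helper name is hypothetical):

```python
def extract_markdown(results):
    """Pull markdown text out of a results payload whose shape may vary:
    missing 'markdown' key, None value, string data, or a list of strings."""
    md = (results.get("markdown") or {}).get("data") or ""
    if isinstance(md, list):
        md = md[0] if md else ""
    return md

print(extract_markdown({"markdown": {"data": ["# Example Domain"]}}))  # # Example Domain
```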

tests/python-v2.1.0/test_search.py

Lines changed: 20 additions & 0 deletions
```python
"""Search 5 queries with scrapegraph-py>=2.1.0. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

queries = [
    "best programming languages 2025",
    "latest AI research breakthroughs",
    "python web scraping libraries",
    "top e-commerce platforms",
    "climate change recent news",
]

for i, q in enumerate(queries, 1):
    res = sgai.search(q, num_results=3)
    n = len(res.data.results) if res.status == "success" and res.data else 0
    print(f"[search {i}/5] {q!r} -> status={res.status} results={n} elapsed_ms={res.elapsed_ms}")
    time.sleep(0.5)
```

tests/python-v2.1.0/test_utilities.py

Lines changed: 18 additions & 0 deletions

```python
"""Call credits() + health() 5 times each. Reads SGAI_API_KEY from env."""
import time

from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

for i in range(1, 6):
    c = sgai.credits()
    h = sgai.health()
    remaining = c.data.remaining if c.status == "success" and c.data else "?"
    used = c.data.used if c.status == "success" and c.data else "?"
    h_status = h.data.status if h.status == "success" and h.data else "?"
    print(
        f"[utils {i}/5] credits.status={c.status} remaining={remaining} used={used}"
        f" | health.status={h.status} service={h_status}"
    )
    time.sleep(0.3)
```
