v2.0.0 - Breaking Changes

@shahar-brd shahar-brd released this 01 Dec 17:51
4108b23

🚀 v2.0.0 - Complete Architecture Rewrite

⚠️ Breaking Changes - Migration Required

This is a major release with breaking changes that require code updates. Python 3.9+ is now required.

Client Initialization

# ❌ Old
from brightdata import bdclient
client = bdclient(api_token="your_token")

# ✅ New
from brightdata import BrightDataClient
client = BrightDataClient(token="your_token")

API Structure - Hierarchical Methods

# ❌ Old - Flat API
client.scrape_linkedin.profiles(url)
client.search_linkedin.jobs()
result = client.scrape(url, zone="my_zone")

# ✅ New - Hierarchical API
client.scrape.linkedin.profiles(url)
client.search.linkedin.jobs()
result = client.scrape_url(url, zone="my_zone")
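For larger codebases, the three renames above can be applied mechanically. The following is a minimal sketch, not an official migration tool: it covers only the flat-to-hierarchical forms shown in these notes, and the helper name `rewrite_line` is hypothetical. It also rewrites any `.scrape(` call regardless of the object it is called on, so review the resulting diff by hand.

```python
import re

# Old flat prefixes and their hierarchical replacements, taken from the
# before/after snippets in these notes. Only the LinkedIn forms are listed there.
_REWRITES = [
    (re.compile(r"\.scrape_linkedin\."), ".scrape.linkedin."),
    (re.compile(r"\.search_linkedin\."), ".search.linkedin."),
    # Matching the opening parenthesis limits this rule to direct calls, so
    # `.scrape.linkedin` attribute access and `.scrape_url(` are left alone.
    (re.compile(r"\.scrape\("), ".scrape_url("),
]


def rewrite_line(line: str) -> str:
    """Apply every rename rule to one line of source text."""
    for pattern, replacement in _REWRITES:
        line = pattern.sub(replacement, line)
    return line


old_source = (
    "client.scrape_linkedin.profiles(url)\n"
    "client.search_linkedin.jobs()\n"
    'result = client.scrape(url, zone="my_zone")'
)
new_source = "\n".join(rewrite_line(line) for line in old_source.splitlines())
print(new_source)
```

Running the rewrite twice gives the same result as running it once, so it should be safe to re-run over a partially migrated file.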

Platform-Specific Scraping

# ✅ New - Recommended approach
client.scrape.amazon.products(url)
client.scrape.amazon.reviews(url)
client.scrape.amazon.sellers(url)
client.scrape.linkedin.profiles(url)
client.scrape.instagram.profiles(url)
client.scrape.facebook.posts(url)

Search Operations

# ❌ Old
results = client.search(query, search_engine="google")

# ✅ New - Dedicated methods
client.search.google(query)
client.search.bing(query)
client.search.yandex(query)
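Code that picks the engine at runtime can keep a `search_engine` argument by routing to the dedicated methods through a thin wrapper. This is a sketch only: `search_with_engine` is a hypothetical helper, not part of the SDK, and the stand-in client exists so the example runs without a token. The only SDK names it relies on are the three `client.search.*` methods listed above.

```python
from types import SimpleNamespace

# Engines with a dedicated method in the v2 notes.
SUPPORTED_ENGINES = ("google", "bing", "yandex")


def search_with_engine(client, query, *, search_engine):
    """Route a v1-style call to the matching v2 dedicated method."""
    if search_engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported search engine: {search_engine!r}")
    return getattr(client.search, search_engine)(query)


# Stand-in client (hypothetical) so the sketch runs without the SDK.
fake_client = SimpleNamespace(
    search=SimpleNamespace(
        google=lambda q: ("google", q),
        bing=lambda q: ("bing", q),
        yandex=lambda q: ("yandex", q),
    )
)

demo_hit = search_with_engine(fake_client, "python sdk", search_engine="bing")
print(demo_hit)
```

Validating against an explicit tuple, rather than passing arbitrary strings to `getattr`, keeps a typo in the engine name from resolving to some unrelated attribute.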

Async Support (New)

# ✅ Sync (still supported)
client = BrightDataClient(token="...")
result = client.scrape_url(url)

# ✅ Async (recommended for performance; run inside a coroutine)
async with BrightDataClient(token="...") as client:
    result = await client.scrape_url_async(url)

# ✅ Async batch operations
import asyncio

async def scrape_multiple(urls):
    async with BrightDataClient(token="...") as client:
        tasks = [client.scrape_url_async(url) for url in urls]
        return await asyncio.gather(*tasks)
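A bare `asyncio.gather` starts every request at once, which may be more than you want for long URL lists. The sketch below caps the number of in-flight calls with `asyncio.Semaphore`. `gather_bounded` and `fake_scrape` are hypothetical names; `fake_scrape` stands in for `client.scrape_url_async` so the example runs without credentials.

```python
import asyncio


async def gather_bounded(make_call, urls, limit=5):
    """Run make_call(url) for every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(url):
        async with semaphore:
            return await make_call(url)

    # return_exceptions=True returns a failure in place of that URL's result
    # instead of raising and discarding the results that did succeed.
    return await asyncio.gather(
        *(run_one(url) for url in urls), return_exceptions=True
    )


# Stand-in for client.scrape_url_async (assumption: any coroutine taking a URL).
async def fake_scrape(url):
    await asyncio.sleep(0)
    return f"scraped:{url}"


demo_results = asyncio.run(gather_bounded(fake_scrape, ["a", "b", "c"], limit=2))
print(demo_results)
```

With the real client, `make_call` would be `client.scrape_url_async`, called inside the `async with BrightDataClient(...)` block. Results come back in the same order as `urls`.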

Manual Job Control (New)

# ✅ Fine-grained control
job = await scraper.trigger(url)
# Do other work...
status = await job.status_async()
if status == "ready":
    data = await job.fetch_async()
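The snippet above checks the status once. In practice you would usually poll until the job is ready or a deadline passes. Here is a minimal sketch that uses only the `status_async()` and `fetch_async()` methods shown in these notes. `wait_for_job` and `FakeJob` are hypothetical, and the `"running"` status string is an assumption: the notes only name `"ready"`, so failure states would need separate handling.

```python
import asyncio
import time


async def wait_for_job(job, poll_interval=2.0, timeout=300.0):
    """Poll job.status_async() until it reports "ready", then fetch the data."""
    deadline = time.monotonic() + timeout
    while True:
        status = await job.status_async()
        if status == "ready":
            return await job.fetch_async()
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not ready after {timeout}s (last status: {status})")
        await asyncio.sleep(poll_interval)


class FakeJob:
    """Stand-in job that becomes ready after a fixed number of status checks."""

    def __init__(self, polls_until_ready):
        self._remaining = polls_until_ready

    async def status_async(self):
        self._remaining -= 1
        # "running" is an assumed placeholder for any not-yet-ready status.
        return "ready" if self._remaining <= 0 else "running"

    async def fetch_async(self):
        return [{"ok": True}]


demo_data = asyncio.run(wait_for_job(FakeJob(3), poll_interval=0.01, timeout=1.0))
print(demo_data)
```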

Type-Safe Payloads (New)

# ❌ Old - untyped dicts
payload = {"url": "...", "reviews_count": 100}

# ✅ New - structured with validation
from brightdata import AmazonProductPayload
payload = AmazonProductPayload(
    url="https://amazon.com/dp/B123",
    reviews_count=100
)
result = client.scrape.amazon.products(payload)

Return Types

# ✅ New - structured objects with metadata
result = client.scrape.amazon.products(url)
print(result.data)        # Actual scraped data
print(result.timing)      # Performance metrics
print(result.cost)        # Cost tracking
print(result.snapshot_id) # Job identifier

CLI Tool (New)

# ✅ Command-line interface
brightdata scrape amazon products --url https://amazon.com/dp/B123
brightdata search google --query "python sdk"
brightdata search linkedin jobs --location "Paris"
brightdata crawler discover --url https://example.com --depth 3

Configuration Changes

# ❌ Old
client = bdclient(
    api_token="token",              # Changed parameter name
    auto_create_zones=True,          # Default changed to False
    web_unlocker_zone="sdk_unlocker", # Default changed
    serp_zone="sdk_serp",            # Default changed
    browser_zone="sdk_browser"       # Default changed
)

# ✅ New
client = BrightDataClient(
    token="token",                   # Renamed from api_token
    auto_create_zones=False,         # New default
    web_unlocker_zone="web_unlocker1", # New default name
    serp_zone="serp_api1",           # New default name
    browser_zone="browser_api1",     # New default name
    timeout=30,                      # New parameter
    rate_limit=10,                   # New parameter (optional)
    rate_period=1.0                  # New parameter
)
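If existing code relied on the old defaults, one option is to pin them explicitly when building the new client's keyword arguments. The sketch below assumes the values in the old snippet (`auto_create_zones=True`, `sdk_unlocker`, `sdk_serp`, `sdk_browser`) were the v1 defaults. `migrate_client_kwargs` is a hypothetical helper, not part of the SDK.

```python
# Values shown in the old snippet; assumed here to be the v1 defaults.
V1_DEFAULTS = {
    "auto_create_zones": True,
    "web_unlocker_zone": "sdk_unlocker",
    "serp_zone": "sdk_serp",
    "browser_zone": "sdk_browser",
}


def migrate_client_kwargs(old_kwargs):
    """Translate v1 constructor kwargs into v2 kwargs, keeping v1 behaviour."""
    new_kwargs = dict(V1_DEFAULTS)   # start from the old implicit defaults
    new_kwargs.update(old_kwargs)    # explicit caller values win
    if "api_token" in new_kwargs:    # api_token= was renamed to token=
        new_kwargs["token"] = new_kwargs.pop("api_token")
    return new_kwargs


demo_kwargs = migrate_client_kwargs({"api_token": "your_token", "serp_zone": "my_serp"})
print(demo_kwargs)
```

The result can be passed as `BrightDataClient(**demo_kwargs)`. Whether to keep the old zone names or move to the new defaults is a per-project decision, since it depends on which zones already exist in your account.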

✨ New Features

Platform Coverage

Platform            Status       Methods
Amazon              ✅ NEW       products(), reviews(), sellers()
Instagram           ✅ NEW       profiles(), posts(), comments(), reels()
Facebook            ✅ NEW       posts(), comments(), groups()
LinkedIn            ✅ Enhanced  Full scraping and search
ChatGPT             ✅ Enhanced  Improved interaction
Google/Bing/Yandex  ✅ Enhanced  Dedicated services

Performance

  • ⚡ 10x better concurrency - Event loop-based architecture
  • 🔌 Advanced connection pooling - 100 connections total, 30 per host
  • 🎯 Built-in rate limiting - Configurable request throttling

✅ Upgrade Checklist

  • Update Python to 3.9+
  • Change imports: bdclient → BrightDataClient
  • Update parameter: api_token= → token=
  • Migrate method calls to hierarchical structure
  • Handle new ScrapeResult/SearchResult return types
  • Review zone configuration defaults
  • Consider async for better performance
  • Test in staging environment
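The first checklist item can be enforced at startup or in CI with a small preflight check. This is a minimal sketch that uses only the standard library. `python_is_supported` is a hypothetical helper name, and the 3.9 floor comes from the requirement stated in these notes.

```python
import sys

# Minimum interpreter version stated in the v2.0.0 notes.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the v2 requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    raise SystemExit("brightdata v2 requires Python 3.9 or newer")
```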

📚 Resources

Full Changelog: v1.1.3...v2.0.0