
Commit 36d4fc9

Fixed Amazon Search Dataset ID and broken links in README
1 parent 928b2b4 commit 36d4fc9

2 files changed: +36 −56 lines

README.md

Lines changed: 7 additions & 40 deletions
@@ -83,11 +83,11 @@ Modern async-first Python SDK for [Bright Data](https://brightdata.com) APIs wit
 
 Perfect for data scientists! Interactive tutorials with examples:
 
-1. **[01_quickstart.ipynb](notebooks/01_quickstart.ipynb)** - Get started in 5 minutes [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/master/notebooks/01_quickstart.ipynb)
-2. **[02_pandas_integration.ipynb](notebooks/02_pandas_integration.ipynb)** - Work with DataFrames [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/master/notebooks/02_pandas_integration.ipynb)
-3. **[03_amazon_scraping.ipynb](notebooks/03_amazon_scraping.ipynb)** - Amazon deep dive [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/master/notebooks/03_amazon_scraping.ipynb)
-4. **[04_linkedin_jobs.ipynb](notebooks/04_linkedin_jobs.ipynb)** - Job market analysis [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/master/notebooks/04_linkedin_jobs.ipynb)
-5. **[05_batch_processing.ipynb](notebooks/05_batch_processing.ipynb)** - Scale to 1000s of URLs [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/master/notebooks/05_batch_processing.ipynb)
+1. **[01_quickstart.ipynb](notebooks/01_quickstart.ipynb)** - Get started in 5 minutes [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/main/notebooks/01_quickstart.ipynb)
+2. **[02_pandas_integration.ipynb](notebooks/02_pandas_integration.ipynb)** - Work with DataFrames [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/main/notebooks/02_pandas_integration.ipynb)
+3. **[03_amazon_scraping.ipynb](notebooks/03_amazon_scraping.ipynb)** - Amazon deep dive [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/main/notebooks/03_amazon_scraping.ipynb)
+4. **[04_linkedin_jobs.ipynb](notebooks/04_linkedin_jobs.ipynb)** - Job market analysis [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/main/notebooks/04_linkedin_jobs.ipynb)
+5. **[05_batch_processing.ipynb](notebooks/05_batch_processing.ipynb)** - Scale to 1000s of URLs [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/brightdata/sdk-python/blob/main/notebooks/05_batch_processing.ipynb)
 
 ---
 

@@ -1078,10 +1078,8 @@ pytest tests/ --cov=brightdata --cov-report=html
 - [All examples →](examples/)
 
 ### Documentation
-- [Quick Start Guide](docs/quickstart.md)
-- [Architecture Overview](docs/architecture.md)
 - [API Reference](docs/api-reference/)
-- [Contributing Guide](docs/contributing.md)
+- [Contributing Guidelines](https://github.com/brightdata/sdk-python/blob/main/CONTRIBUTING.md) (See upstream repo)
 
 ---
 

@@ -1140,7 +1138,7 @@ pip install -e .
 
 ## 🤝 Contributing
 
-Contributions are welcome! Please see [CONTRIBUTING.md](docs/contributing.md) for guidelines.
+Contributions are welcome! Check the [GitHub repository](https://github.com/brightdata/sdk-python) for contribution guidelines.
 
 ### Development Setup
 

@@ -1269,37 +1267,6 @@ Run the included demo to explore the SDK interactively:
 ```bash
 python demo_sdk.py
 ```
-
----
-
-## 🎯 Roadmap
-
-### ✅ Completed
-- [x] Core client with authentication
-- [x] Web Unlocker service
-- [x] Platform scrapers (Amazon, LinkedIn, ChatGPT, Facebook, Instagram)
-- [x] SERP API (Google, Bing, Yandex)
-- [x] Comprehensive test suite (502+ tests)
-- [x] .env file support via python-dotenv
-- [x] SSL error handling with helpful guidance
-- [x] Centralized constants module
-- [x] Function-level monitoring
-- [x] **Dataclass payloads with validation**
-- [x] **Jupyter notebooks for data scientists**
-- [x] **CLI tool (brightdata command)**
-- [x] **Pandas integration examples**
-- [x] **Single shared AsyncEngine (8x efficiency)**
-
-### 🚧 In Progress
-- [ ] Browser automation API
-- [ ] Web crawler API
-
-### 🔮 Future
-- [ ] Additional platforms (Reddit, Twitter/X, TikTok, YouTube)
-- [ ] Real-time data streaming
-- [ ] Advanced caching strategies
-- [ ] Prometheus metrics export
-
 ---
 
 ## 🙏 Acknowledgments

src/brightdata/scrapers/amazon/search.py

Lines changed: 29 additions & 16 deletions
@@ -35,7 +35,7 @@ class AmazonSearchScraper:
     """
 
     # Amazon dataset IDs
-    DATASET_ID_PRODUCTS_SEARCH = "gd_l7q7dkf244hwjntr0"  # Amazon Products with search
+    DATASET_ID_PRODUCTS_SEARCH = "gd_lwdb4vjm1ehb499uxs"  # Amazon Products Search (15.84M records)
 
     def __init__(self, bearer_token: str, engine: Optional[AsyncEngine] = None):
         """
@@ -125,26 +125,39 @@ async def products_async(
         conditions = self._normalize_param(condition, batch_size)
         countries = self._normalize_param(country, batch_size)
 
-        # Build payload - Amazon API requires URLs
-        # If keyword provided, build Amazon search URL internally
+        # Build payload - Amazon Products Search dataset expects keyword field
         payload = []
         for i in range(batch_size):
+            item = {}
+
             # If URL provided directly, use it
             if urls and i < len(urls):
-                item = {"url": urls[i]}
+                item["url"] = urls[i]
+                # Extract keyword from URL if possible for the keyword field
+                if "k=" in urls[i]:
+                    import urllib.parse
+                    parsed = urllib.parse.urlparse(urls[i])
+                    params = urllib.parse.parse_qs(parsed.query)
+                    item["keyword"] = params.get("k", [""])[0]
+                else:
+                    item["keyword"] = ""
             else:
-                # Build Amazon search URL from parameters
-                search_url = self._build_amazon_search_url(
-                    keyword=keywords[i] if keywords and i < len(keywords) else None,
-                    category=categories[i] if categories and i < len(categories) else None,
-                    min_price=min_prices[i] if min_prices and i < len(min_prices) else None,
-                    max_price=max_prices[i] if max_prices and i < len(max_prices) else None,
-                    condition=conditions[i] if conditions and i < len(conditions) else None,
-                    prime_eligible=prime_eligible,
-                    country=countries[i] if countries and i < len(countries) else None,
-                )
-                item = {"url": search_url}
-
+                # Send keyword directly (dataset expects this field)
+                item["keyword"] = keywords[i] if keywords and i < len(keywords) else ""
+
+                # Optionally build URL for additional context
+                if item["keyword"]:
+                    search_url = self._build_amazon_search_url(
+                        keyword=item["keyword"],
+                        category=categories[i] if categories and i < len(categories) else None,
+                        min_price=min_prices[i] if min_prices and i < len(min_prices) else None,
+                        max_price=max_prices[i] if max_prices and i < len(max_prices) else None,
+                        condition=conditions[i] if conditions and i < len(conditions) else None,
+                        prime_eligible=prime_eligible,
+                        country=countries[i] if countries and i < len(countries) else None,
+                    )
+                    item["url"] = search_url
+
             payload.append(item)
 
         return await self._execute_search(
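In short, after this change every payload item carries a `keyword` field (recovered from the URL's `k` query parameter when only a URL is given), and a search URL is still attached when one can be built. A minimal standalone sketch of that rule, where `build_search_item` is a hypothetical helper rather than an SDK function:

```python
# Minimal standalone sketch of the new payload rule: every item gets a
# "keyword" field, and a "url" is kept when the caller supplied one.
# build_search_item is a hypothetical helper, not part of the SDK.
import urllib.parse


def build_search_item(url=None, keyword=None):
    item = {}
    if url:
        item["url"] = url
        # Recover the keyword from Amazon's "k" query parameter; parse_qs
        # returns {} when the parameter is absent, so the fallback is "".
        params = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
        item["keyword"] = params.get("k", [""])[0]
    else:
        item["keyword"] = keyword or ""
    return item


print(build_search_item(url="https://www.amazon.com/s?k=mechanical+keyboard"))
# {'url': 'https://www.amazon.com/s?k=mechanical+keyboard', 'keyword': 'mechanical keyboard'}
```

Note that `parse_qs` also decodes `+` and percent-escapes, which is why the patch prefers it over naive string splitting on `k=`.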
