Commit 8a7cccc

Markdown Output formatter feature
1 parent 72fa443 commit 8a7cccc

13 files changed, with 784 additions and 35 deletions

README.md

Lines changed: 62 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -62,6 +62,7 @@ Modern async-first Python SDK for [Bright Data](https://brightdata.com) APIs wit
6262
### 🏗️ **Core Features**
6363
- 🚀 **Async-first architecture** with sync wrappers for compatibility
6464
- 🎨 **Dataclass Payloads** - Runtime validation, IDE autocomplete, helper methods
65+
- 📝 **Markdown Output** - Export results as GitHub-flavored markdown tables
6566
- 🌐 **Web scraping** via Web Unlocker proxy service
6667
- 🔍 **SERP API** - Google, Bing, Yandex search results
6768
- 📦 **Platform scrapers** - LinkedIn, Amazon, ChatGPT, Facebook, Instagram
@@ -460,10 +461,11 @@ asyncio.run(scrape_multiple())
460461
## 🆕 What's New in v2 2.0.0
461462

462463
### 🆕 **Latest Updates (December 2025)**
464+
-**Markdown Output Format** - NEW! Export results as GitHub-flavored markdown
463465
-**Amazon Search API** - NEW parameter-based product discovery with correct dataset
464466
-**LinkedIn Job Search Fixed** - Now builds URLs from keywords internally
465467
-**Trigger Interface** - Manual trigger/poll/fetch control for all platforms
466-
-**29 Sync Wrapper Fixes** - All sync methods work (scrapers + SERP API)
468+
-**30 Sync Wrapper Fixes** - ALL sync methods work (scrapers + SERP + generic)
467469
-**Batch Operations Fixed** - Returns List[ScrapeResult] correctly
468470
-**Auto-Create Zones** - Now enabled by default (was opt-in)
469471
-**Improved Zone Names** - `sdk_unlocker`, `sdk_serp`, `sdk_browser`
@@ -656,9 +658,11 @@ result.elapsed_ms() # Total time in milliseconds
656658
result.get_timing_breakdown() # Detailed timing dict
657659

658660
# Serialization
659-
result.to_dict() # Convert to dictionary
660-
result.to_json(indent=2) # JSON string
661-
result.save_to_file("result.json") # Save to file
661+
result.to_dict() # Convert to dictionary
662+
result.to_json(indent=2) # JSON string
663+
result.to_markdown() # GitHub-flavored markdown (NEW!)
664+
result.save_to_file("result.json") # Save as JSON
665+
result.save_to_file("result.md", format="markdown") # Save as markdown (NEW!)
662666
```
663667

664668
---
@@ -728,6 +732,9 @@ brightdata scrape amazon products "https://amazon.com/dp/B123" --output-format p
728732

729733
# Minimal format - Just the data, no metadata
730734
brightdata scrape amazon products "https://amazon.com/dp/B123" --output-format minimal
735+
736+
# Markdown format - GitHub-flavored tables (NEW!)
737+
brightdata scrape amazon products "https://amazon.com/dp/B123" --output-format markdown
731738
```
732739

733740
#### Generic Scraper Response Format (`--response-format`)
@@ -749,6 +756,57 @@ brightdata scrape generic "https://example.com" \
749756
--output-format pretty
750757
```
751758

759+
#### Markdown Output Format (NEW!)
760+
761+
Export results as GitHub-flavored markdown tables - perfect for reports and documentation:
762+
763+
```bash
764+
# CLI: Markdown output
765+
brightdata search google "python tutorial" --output-format markdown
766+
767+
# Save to file
768+
brightdata search google "python tutorial" \
769+
--output-format markdown \
770+
--output-file report.md
771+
```
772+
773+
**SDK: Markdown methods**
774+
775+
```python
776+
from brightdata import BrightDataClient
777+
778+
client = BrightDataClient()
779+
result = client.search.google(query="python tutorial", num_results=5)
780+
781+
# Generate markdown
782+
md = result.to_markdown()
783+
print(md)
784+
785+
# Save as markdown
786+
result.save_to_file("report.md", format="markdown")
787+
```
788+
789+
**Example Output:**
790+
791+
```markdown
792+
# Result: ✅ Success
793+
794+
## Metadata
795+
796+
| Field | Value |
797+
|-------|-------|
798+
| Cost | $0.0010 USD |
799+
| Time | 1234.56ms |
800+
801+
## Data
802+
803+
| position | title | url |
804+
|----------|-------|-----|
805+
| 1 | The Python Tutorial | https://docs.python.org/3/tutorial/ |
806+
| 2 | Python Tutorial - W3Schools | https://www.w3schools.com/python/ |
807+
| 3 | Learn Python | https://www.learnpython.org/ |
808+
```
809+
752810
---
753811

754812
## 🐼 Pandas Integration
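
The `## Data` table in the README example added by this hunk could be produced along the following lines. This is a minimal sketch, not the SDK's `MarkdownFormatter` (that file is not shown in this excerpt); `rows_to_markdown_table` is a hypothetical helper name, and the escaping rules are an assumption about what a GitHub-flavored table needs.

```python
from typing import Any, Dict, List


def rows_to_markdown_table(rows: List[Dict[str, Any]]) -> str:
    """Render a list of dicts as a GitHub-flavored markdown table (hypothetical helper)."""
    if not rows:
        return ""

    # Column order follows first appearance across all rows.
    columns: List[str] = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    def cell(value: Any) -> str:
        # A raw pipe would end the cell early and a newline would end the row.
        text = "" if value is None else str(value)
        return text.replace("|", "\\|").replace("\n", " ")

    header = "| " + " | ".join(cell(c) for c in columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = ["| " + " | ".join(cell(row.get(c)) for c in columns) + " |" for row in rows]
    return "\n".join([header, divider, *body])


table = rows_to_markdown_table(
    [
        {"position": 1, "title": "The Python Tutorial", "url": "https://docs.python.org/3/tutorial/"},
        {"position": 2, "title": "Learn Python", "url": "https://www.learnpython.org/"},
    ]
)
print(table)
```

Escaping `|` matters for scraped titles, which often contain pipes as separators.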

src/brightdata/cli/commands/scrape.py

Lines changed: 2 additions & 2 deletions
@@ -16,9 +16,9 @@
 )
 @click.option(
     "--output-format",
-    type=click.Choice(["json", "pretty", "minimal"], case_sensitive=False),
+    type=click.Choice(["json", "pretty", "minimal", "markdown"], case_sensitive=False),
     default="json",
-    help="Output format",
+    help="Output format (json, pretty, minimal, markdown)",
 )
 @click.option("--output-file", type=click.Path(), help="Save output to file")
 @click.pass_context

src/brightdata/cli/commands/search.py

Lines changed: 2 additions & 2 deletions
@@ -16,9 +16,9 @@
 )
 @click.option(
     "--output-format",
-    type=click.Choice(["json", "pretty", "minimal"], case_sensitive=False),
+    type=click.Choice(["json", "pretty", "minimal", "markdown"], case_sensitive=False),
     default="json",
-    help="Output format",
+    help="Output format (json, pretty, minimal, markdown)",
 )
 @click.option("--output-file", type=click.Path(), help="Save output to file")
 @click.pass_context

src/brightdata/cli/utils.py

Lines changed: 26 additions & 19 deletions
@@ -67,34 +67,41 @@ def create_client(api_key: Optional[str] = None, **kwargs) -> BrightDataClient:

 def format_result(result: Any, output_format: str = "json") -> str:
     """
-    Format result for output.
+    Format result for output using formatter registry.

     Args:
         result: Result object (ScrapeResult, SearchResult, etc.)
-        output_format: Output format ("json", "pretty", "minimal")
+        output_format: Output format ("json", "pretty", "minimal", "markdown")

     Returns:
         Formatted string
     """
-    if output_format == "json":
-        if hasattr(result, "to_dict"):
-            data = result.to_dict()
-        elif hasattr(result, "__dict__"):
-            from dataclasses import asdict, is_dataclass
-
-            if is_dataclass(result):
-                data = asdict(result)
+    try:
+        from ..formatters import FormatterRegistry
+
+        formatter = FormatterRegistry.get_formatter(output_format)
+        return formatter.format(result)
+    except (ValueError, ImportError):
+        # Fallback to legacy formatting for backward compatibility
+        if output_format == "json":
+            if hasattr(result, "to_dict"):
+                data = result.to_dict()
+            elif hasattr(result, "__dict__"):
+                from dataclasses import asdict, is_dataclass
+
+                if is_dataclass(result):
+                    data = asdict(result)
+                else:
+                    data = result.__dict__
             else:
-                data = result.__dict__
+                data = result
+            return json.dumps(data, indent=2, default=str)
+        elif output_format == "pretty":
+            return format_result_pretty(result)
+        elif output_format == "minimal":
+            return format_result_minimal(result)
         else:
-            data = result
-        return json.dumps(data, indent=2, default=str)
-    elif output_format == "pretty":
-        return format_result_pretty(result)
-    elif output_format == "minimal":
-        return format_result_minimal(result)
-    else:
-        return str(result)
+            return str(result)


 def format_result_pretty(result: Any) -> str:
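
In the new `format_result`, the `try` block wraps both the registry lookup and the `formatter.format(result)` call, so a `ValueError` raised while formatting would also route to the legacy fallback. The sketch below shows one possible alternative that guards only the lookup. It is an illustration, not the committed code: `_FORMATTERS`, `get_formatter`, and `legacy_format` are hypothetical stand-ins, and it assumes the real registry raises `ValueError` for unknown names, which the `except (ValueError, ImportError)` clause suggests but the excerpt does not confirm.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical stand-in for the registry; only "json" is registered here.
_FORMATTERS: Dict[str, Callable[[Any], str]] = {
    "json": lambda result: json.dumps(result, indent=2, default=str),
}


def get_formatter(name: str) -> Callable[[Any], str]:
    """Look up a formatter, raising ValueError for unknown names (assumed behavior)."""
    try:
        return _FORMATTERS[name.lower()]
    except KeyError:
        raise ValueError(f"Unknown output format: {name!r}") from None


def legacy_format(result: Any, output_format: str) -> str:
    """Stand-in for the legacy branch that the diff keeps as a fallback."""
    return str(result)


def format_result(result: Any, output_format: str = "json") -> str:
    # Only the lookup is guarded, so errors raised while formatting propagate.
    try:
        formatter = get_formatter(output_format)
    except ValueError:
        return legacy_format(result, output_format)
    return formatter(result)


print(format_result({"ok": True}))          # registry path
print(format_result({"ok": True}, "xml"))   # unknown name, legacy path
```

With this shape, a formatting failure such as a circular reference surfaces to the caller instead of being retried through the legacy code.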
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+"""Output formatters for results."""
+
+from .registry import FormatterRegistry
+from .base import BaseFormatter
+from .json_formatter import JSONFormatter
+from .pretty_formatter import PrettyFormatter
+from .minimal_formatter import MinimalFormatter
+from .markdown import MarkdownFormatter
+
+# Auto-register formatters
+FormatterRegistry.register("json", JSONFormatter)
+FormatterRegistry.register("pretty", PrettyFormatter)
+FormatterRegistry.register("minimal", MinimalFormatter)
+FormatterRegistry.register("markdown", MarkdownFormatter)
+FormatterRegistry.register("md", MarkdownFormatter)  # Alias
+
+__all__ = [
+    "FormatterRegistry",
+    "BaseFormatter",
+    "JSONFormatter",
+    "PrettyFormatter",
+    "MinimalFormatter",
+    "MarkdownFormatter",
+]
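
The `registry.py` module imported above is not part of this excerpt. The sketch below shows one way a registry with `register` and `get_formatter` class methods could work. Everything in it is an assumption: the storage, the case-insensitive lookup, and raising `ValueError` for unknown names are inferred only from how the calls are used in this diff, and `UpperFormatter` is a made-up formatter for demonstration.

```python
from typing import Any, Dict


class FormatterRegistry:
    """Hypothetical sketch of a name-to-class registry; not the SDK's implementation."""

    _formatters: Dict[str, type] = {}

    @classmethod
    def register(cls, name: str, formatter_class: type) -> None:
        # Names are stored lowercased so lookups can be case-insensitive.
        cls._formatters[name.lower()] = formatter_class

    @classmethod
    def get_formatter(cls, name: str) -> Any:
        try:
            formatter_class = cls._formatters[name.lower()]
        except KeyError:
            known = ", ".join(sorted(cls._formatters))
            raise ValueError(f"Unknown format {name!r}; expected one of: {known}") from None
        # A fresh instance per call keeps formatters free of shared state.
        return formatter_class()


class UpperFormatter:
    """Made-up formatter used only to exercise the registry."""

    def format(self, result: Any) -> str:
        return str(result).upper()

    def get_extension(self) -> str:
        return ".txt"


FormatterRegistry.register("upper", UpperFormatter)
FormatterRegistry.register("up", UpperFormatter)  # alias, in the spirit of "md"

print(FormatterRegistry.get_formatter("UP").format("done"))
```

Note that the CLI options in this commit list `markdown` but not `md` in `click.Choice`, so an alias registered this way would be reachable from Python callers rather than from `--output-format`.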

src/brightdata/formatters/base.py

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+"""Base formatter interface."""
+
+from abc import ABC, abstractmethod
+from typing import Any
+
+
+class BaseFormatter(ABC):
+    """
+    Base formatter interface using Strategy Pattern.
+
+    All formatters must implement this interface to ensure
+    consistent behavior across different output formats.
+    """
+
+    @abstractmethod
+    def format(self, result: Any) -> str:
+        """
+        Format result to string representation.
+
+        Args:
+            result: Result object (ScrapeResult, SearchResult, etc.)
+
+        Returns:
+            Formatted string representation
+        """
+        pass
+
+    @abstractmethod
+    def get_extension(self) -> str:
+        """
+        Get file extension for this format.
+
+        Returns:
+            File extension including dot (e.g., ".json", ".md")
+        """
+        pass
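
To illustrate what the abstract interface enforces, here is a small self-contained sketch. `BaseFormatter` is re-declared in shortened form so the block runs on its own; `PlainTextFormatter` and `IncompleteFormatter` are hypothetical classes that do not appear in the commit.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseFormatter(ABC):
    """Shortened copy of the interface from the diff, repeated so the sketch runs alone."""

    @abstractmethod
    def format(self, result: Any) -> str: ...

    @abstractmethod
    def get_extension(self) -> str: ...


class PlainTextFormatter(BaseFormatter):
    """Hypothetical formatter implementing both required methods."""

    def format(self, result: Any) -> str:
        return str(result)

    def get_extension(self) -> str:
        return ".txt"


class IncompleteFormatter(BaseFormatter):
    """Hypothetical formatter that forgets get_extension()."""

    def format(self, result: Any) -> str:
        return str(result)


formatter = PlainTextFormatter()
print(formatter.format({"status": "ok"}), formatter.get_extension())

# ABC refuses to instantiate a subclass with unimplemented abstract methods.
try:
    IncompleteFormatter()
except TypeError as exc:
    print(f"rejected: {exc}")
```

The check happens at instantiation time, so a formatter missing a method fails when it is first created rather than later when output is written.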
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+"""JSON output formatter."""
+
+import json
+from typing import Any
+from dataclasses import asdict, is_dataclass
+from .base import BaseFormatter
+
+
+class JSONFormatter(BaseFormatter):
+    """
+    Format results as JSON.
+
+    Provides clean, structured JSON output suitable for:
+    - API consumption
+    - Data processing
+    - Automation
+    """
+
+    def format(self, result: Any) -> str:
+        """Format result as JSON string."""
+        if hasattr(result, "to_dict"):
+            data = result.to_dict()
+        elif hasattr(result, "__dict__"):
+            if is_dataclass(result):
+                data = asdict(result)
+            else:
+                data = result.__dict__
+        else:
+            data = result
+
+        return json.dumps(data, indent=2, default=str)
+
+    def get_extension(self) -> str:
+        """Get file extension."""
+        return ".json"
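
The conversion order in `JSONFormatter.format` (`to_dict()` first, then dataclass or `__dict__`, then the object itself) and the effect of `default=str` can be checked in isolation. The sketch below copies that logic into a standalone function; `to_json`, `DemoResult`, and `WithToDict` are hypothetical names used only for this demonstration.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime, timezone
from typing import Any


def to_json(result: Any) -> str:
    """Standalone copy of the conversion order used by JSONFormatter.format."""
    if hasattr(result, "to_dict"):
        data = result.to_dict()
    elif hasattr(result, "__dict__"):
        data = asdict(result) if is_dataclass(result) else result.__dict__
    else:
        data = result
    # default=str stringifies values json cannot encode natively, such as datetime.
    return json.dumps(data, indent=2, default=str)


@dataclass
class DemoResult:
    """Hypothetical result type without a to_dict() method."""

    success: bool
    fetched_at: datetime


class WithToDict:
    """Hypothetical result type whose to_dict() takes priority over __dict__."""

    def __init__(self) -> None:
        self.internal = "hidden"

    def to_dict(self) -> dict:
        return {"public": "shown"}


print(to_json(DemoResult(True, datetime(2025, 12, 1, tzinfo=timezone.utc))))
print(to_json(WithToDict()))
print(to_json({"plain": "dict"}))  # no to_dict/__dict__, serialized as-is
```

Because `default=str` falls back to `str()`, values such as timestamps come out as strings and do not round-trip to their original types when the JSON is loaded again.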
