
Commit bdaeace

Merge pull request #187 from NotJoeMartinez/handle-bot-blocking-maybe-use-api
This pull request introduces several improvements and updates to the `yt-fts` project, focusing on troubleshooting documentation, type hinting, dependency updates, and Python compatibility. The most significant changes include adding a comprehensive troubleshooting guide for 403 errors, updating type hints across multiple functions for better code clarity, upgrading dependencies, and increasing the minimum required Python version.

### Randomizing User agent, retry method

Added yt-dlp config options to retry downloads and randomize user agent.

### Documentation Enhancements:

* Added a new file `docs/TROUBLESHOOTING_403.md` with detailed explanations of 403 errors, diagnosis tools, common solutions, advanced troubleshooting steps, and prevention tips. This includes example workflows and error message references for user guidance.

### Type Hinting Improvements:

* Updated type hints across functions in `src/yt_fts/config.py`, `src/yt_fts/db_utils.py`, and `src/yt_fts/export.py` to improve code readability and enforce stricter type checking. Examples include specifying return types (`str | None`, `list[tuple[int, str, str, str]]`) and parameter types (`channel_id: str`, `limit: int | None`). [[1]](diffhunk://#diff-6113cfecef0a78f531444ed85e58c8783626a2f7610b4e062a3ca6571989ff97L1-R8) [[2]](diffhunk://#diff-dc3d4af5ca8205a5ad2fecf0f00f98cf853640e148149333af296439c1a6cde2L13-R13) [[3]](diffhunk://#diff-03b0323cdf4466eba537f666efacf13a39ed4b56b0efdbb3da03625ce133964dL22-R22)

### Dependency Updates:

* Upgraded dependencies in `pyproject.toml`:
  - `openai` updated from `1.35.3` to `1.93.0`.
  - `chromadb` updated from `0.5.2` to `1.0.15`.

### Python Compatibility:

* Increased the minimum required Python version from `>=3.8` to `>=3.10` in `pyproject.toml` to leverage newer language features and maintain compatibility with updated dependencies.

### File Renaming:

* Renamed `yt_fts/config.py`, `yt_fts/db_utils.py`, and `yt_fts/export.py` to `src/yt_fts/config.py`, `src/yt_fts/db_utils.py`, and `src/yt_fts/export.py` respectively, reflecting a change in project structure. [[1]](diffhunk://#diff-6113cfecef0a78f531444ed85e58c8783626a2f7610b4e062a3ca6571989ff97L1-R8) [[2]](diffhunk://#diff-dc3d4af5ca8205a5ad2fecf0f00f98cf853640e148149333af296439c1a6cde2L13-R13) [[3]](diffhunk://#diff-03b0323cdf4466eba537f666efacf13a39ed4b56b0efdbb3da03625ce133964dL22-R22)
2 parents 88a6b4d + ff8ce06 commit bdaeace

18 files changed: +869 -492 lines changed

.gitignore

Lines changed: 6 additions & 1 deletion
@@ -174,4 +174,9 @@ UCYO_jab_esuFRV4b17AJtAw
 .ignore/
 tests/test_data/
 .idea
-*.sh
+*.sh
+
+# custom
+
+scratch/
+.env

docs/TROUBLESHOOTING_403.md

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
# Troubleshooting 403 Errors

## What are 403 Errors?

403 Forbidden errors occur when YouTube blocks requests from your application. This typically happens due to:

- **Rate limiting**: Too many requests in a short time period
- **Missing authentication**: No cookies or session data
- **Bot detection**: YouTube identifying automated requests
- **IP blocking**: Your IP address has been temporarily blocked
- **Geographic restrictions**: Content not available in your region

## Quick Diagnosis

Run the built-in diagnosis tool to identify the issue:

```bash
# Basic diagnosis
yt-fts diagnose

# Diagnosis with browser cookies
yt-fts diagnose --cookies-from-browser chrome

# Diagnosis with specific job count
yt-fts diagnose -j 4
```

## Common Solutions

### 1. Use Browser Cookies

The most effective solution is to use cookies from your browser:

```bash
# Use Chrome cookies
yt-fts download --cookies-from-browser chrome <channel_url>

# Use Firefox cookies
yt-fts download --cookies-from-browser firefox <channel_url>
```

**How to set up cookies:**
1. Log into YouTube in your browser (Chrome or Firefox)
2. Make sure you're logged in and can access the channel
3. Run the download command with `--cookies-from-browser`

### 2. Reduce Parallel Jobs

High parallel job counts can trigger rate limiting:

```bash
# Use fewer parallel jobs
yt-fts download -j 2 <channel_url>
yt-fts download -j 4 <channel_url>

# For very problematic channels, use just 1 job
yt-fts download -j 1 <channel_url>
```
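
The step-down from 4 jobs to 2 to 1 shown above can be expressed as a small helper. This is a sketch only: the halving rule and the name `job_count_fallbacks` are assumptions for illustration, not part of `yt-fts`.

```python
def job_count_fallbacks(start: int) -> list[int]:
    """Return job counts to try in order, halving each time until reaching 1."""
    if start < 1:
        raise ValueError("start must be at least 1")
    counts = []
    jobs = start
    while jobs > 1:
        counts.append(jobs)
        jobs //= 2  # integer halving: 8 -> 4 -> 2 -> 1
    counts.append(1)  # a single job is always the last resort
    return counts


fallbacks = job_count_fallbacks(4)
```

Each value in `fallbacks` would be passed as the `-j` argument on successive attempts, stopping at the first attempt that completes without a 403.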

### 3. Wait Between Attempts

If you're getting rate limited, wait a few minutes before trying again:

```bash
# Wait 5-10 minutes between attempts
# Then try again with reduced jobs
yt-fts download -j 2 --cookies-from-browser chrome <channel_url>
```
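
Waiting between attempts is commonly automated with exponential backoff. The sketch below assumes a caller-supplied `action` that raises a hypothetical `RateLimited` exception when it is blocked. Neither name exists in `yt-fts`, and the 300-second base delay simply mirrors the 5-minute lower bound mentioned above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by the caller's download function when blocked."""


def retry_with_backoff(
    action: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action until it succeeds, doubling the wait after each rate-limit error."""
    for attempt in range(attempts):
        try:
            return action()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 300 s, 600 s, 1200 s, ...
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the function can be exercised in tests without actually waiting.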

### 4. Check Channel Accessibility

Some channels may be:
- **Private**: Only accessible to subscribers
- **Age-restricted**: Requires login and age verification
- **Region-blocked**: Not available in your country

Try accessing the channel in your browser first to verify it's publicly accessible.

## Advanced Troubleshooting

### Test Network Connectivity

```bash
# Test basic connectivity
curl -I https://www.youtube.com

# Test with custom user agent
curl -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" https://www.youtube.com
```
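
The same connectivity check can be run from Python with the standard library. This sketch combines the two `curl` commands into one HEAD request that carries a custom user agent. The helper names `build_head_request` and `probe` are illustrative and are not part of `yt-fts`.

```python
import urllib.error
import urllib.request

BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"


def build_head_request(url: str, user_agent: str = BROWSER_UA) -> urllib.request.Request:
    """Prepare a HEAD request (headers only, like curl -I) with a custom User-Agent."""
    return urllib.request.Request(url, method="HEAD", headers={"User-Agent": user_agent})


def probe(url: str, timeout: float = 10.0) -> int:
    """Send the HEAD request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(build_head_request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 or 429 when the request is blocked


request = build_head_request("https://www.youtube.com")
```

Calling `probe("https://www.youtube.com")` performs the actual network request. Connection failures such as DNS errors raise `urllib.error.URLError` and are deliberately left unhandled here.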

### Update yt-dlp

Ensure you have the latest version of yt-dlp:

```bash
pip install --upgrade yt-dlp
```

### Check for VPN/Proxy Issues

If you're using a VPN or proxy:
1. Try disabling it temporarily
2. Switch to a different server/location
3. Use a residential IP if possible

### Monitor Rate Limits

Watch for these error patterns:
- **429 Too Many Requests**: Immediate rate limit
- **403 Forbidden**: General blocking
- **503 Service Unavailable**: Temporary server issues

## Error Message Reference

| Error | Cause | Solution |
|-------|-------|----------|
| `403 Forbidden` | General blocking | Use cookies, reduce jobs |
| `429 Too Many Requests` | Rate limiting | Wait, reduce jobs |
| `Video unavailable` | Private/restricted | Check channel access |
| `Sign in to confirm your age` | Age restriction | Use logged-in cookies |

## Prevention Tips

1. **Always use browser cookies** for consistent access
2. **Start with low job counts** (2-4) and increase gradually
3. **Monitor for errors** and adjust accordingly
4. **Don't run multiple instances** simultaneously
5. **Respect rate limits** - wait between large downloads

## Getting Help

If you're still experiencing issues:

1. Run the diagnosis tool: `yt-fts diagnose`
2. Check the error messages for specific details
3. Try the test script: `python test_403_diagnosis.py`
4. Report issues with:
   - Error messages
   - Channel URL (if public)
   - Your configuration (jobs, cookies, etc.)
   - Diagnosis output

## Example Workflow

```bash
# 1. Diagnose the issue
yt-fts diagnose --cookies-from-browser chrome

# 2. Try with cookies and low job count
yt-fts download --cookies-from-browser chrome -j 2 <channel_url>

# 3. If successful, gradually increase jobs
yt-fts download --cookies-from-browser chrome -j 4 <channel_url>

# 4. For large channels, consider breaking into smaller batches
# Download in chunks with breaks between them
```
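
Step 4 of the workflow does not say how to split a large channel into batches. The sketch below shows only the chunking logic, for a list of video IDs obtained by some other means. Whether `yt-fts` accepts individual videos or batches is not stated in this document, so the `batches` helper and the sample IDs are purely illustrative.

```python
from collections.abc import Iterator, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items; the last may be shorter."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder IDs, not real videos.
video_ids = ["vid1", "vid2", "vid3", "vid4", "vid5"]
chunks = list(batches(video_ids, 2))
```

A caller would process one chunk, pause for a few minutes, then continue with the next.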

pyproject.toml

Lines changed: 4 additions & 4 deletions
@@ -7,22 +7,22 @@ name = "yt-fts"
 version = "0.1.60"
 description = "Search all of a YouTube channel from the command line"
 readme = "README.md"
-requires-python = ">=3.8"
+requires-python = ">=3.10"
 license = { file = "LICENSE" }
 authors = [
     { name = "NotJoeMartinez", email = "notjoemartinez@protonmail.com" }
 ]
 keywords = ["youtube", "subtitles", "search"]
 classifiers = [
-    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.10",
     "License :: OSI Approved :: The Unlicense (Unlicense)",
     "Operating System :: OS Independent",
 ]
 
 dependencies = [
     "click==8.1.7",
-    "openai==1.35.3",
-    "chromadb==0.5.2",
+    "openai==1.93.0",
+    "chromadb==1.0.15",
     "requests>=2.32.2,<3",
     "rich==13.7.1",
     "sqlite-utils==3.36",

src/yt_fts/config.py

Lines changed: 5 additions & 6 deletions
@@ -1,12 +1,11 @@
-
 import sys
 import os
 
 import chromadb
 from chromadb.config import Settings
 
 
-def get_config_path():
+def get_config_path() -> str | None:
 
     platform = sys.platform
 
@@ -27,7 +26,7 @@ def get_config_path():
     return None
 
 
-def make_config_dir():
+def make_config_dir() -> str | None:
     platform = sys.platform
 
     try:
@@ -49,7 +48,7 @@ def make_config_dir():
     return None
 
 
-def get_db_path():
+def get_db_path() -> str:
     from .db_utils import make_db
     # make sure config path exists
     # if config path is none, make config path
@@ -92,7 +91,7 @@ def get_db_path():
     return "subtitles.db"
 
 
-def get_or_make_chroma_path():
+def get_or_make_chroma_path() -> str:
 
     config_path = get_config_path()
 
@@ -112,7 +111,7 @@ def get_or_make_chroma_path():
     return chroma_path
 
 
-def get_chroma_client():
+def get_chroma_client() -> chromadb.PersistentClient:
     chroma_path = get_or_make_chroma_path()
     return chromadb.PersistentClient(path=chroma_path,
                                      settings=Settings(anonymized_telemetry=False))
