Commit d0d61be

docs and version updates
1 parent bdaeace commit d0d61be

3 files changed: +154 -42 lines changed

README.md

Lines changed: 152 additions & 40 deletions
@@ -22,41 +22,118 @@ pip
 pip install yt-fts
 ```
 
-## `download`
-Download subtitles for a channel.
+## Commands
 
-Takes a channel url as an argument. Specify the number of jobs to parallelize the download with the `--jobs` flag.
+### `download`
+Download subtitles for a channel or playlist.
+
+Takes a channel or playlist URL as an argument. Specify the number of jobs to parallelize the download with the `--jobs` flag.
 Use the `--cookies-from-browser` to use cookies from your browser in the requests, will help if you're getting errors
 that request you to sign in. You can also run the `update` command several times to gradually get more videos into the database.
 
 ```bash
+# Download channel
 yt-fts download --jobs 5 "https://www.youtube.com/@3blue1brown"
 yt-fts download --cookies-from-browser firefox "https://www.youtube.com/@3blue1brown"
+
+# Download playlist
+yt-fts download --playlist "https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab"
+```
+
+**Options:**
+- `-p, --playlist`: Download all videos from a playlist
+- `-l, --language`: Language of the subtitles to download (default: en)
+- `-j, --jobs`: Number of parallel download jobs (default: 8, recommended: 4-16)
+- `--cookies-from-browser`: Browser to extract cookies from (chrome, firefox, etc.)
+
+### `diagnose`
+Diagnose 403 errors and other download issues.
+
+This command will test various aspects of the connection to YouTube and provide recommendations for fixing common issues.
+
+```bash
+yt-fts diagnose
+yt-fts diagnose --test-url "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --cookies-from-browser firefox
 ```
 
-## `list`
-List saved channels.
+**Options:**
+- `-u, --test-url`: URL to test with (default: https://www.youtube.com/watch?v=dQw4w9WgXcQ)
+- `--cookies-from-browser`: Browser to extract cookies from
+- `-j, --jobs`: Number of parallel download jobs to test with (default: 8)
+
+### `list`
+List saved channels, videos, and transcripts.
 
 The (ss) next to the channel name indicates that the channel has semantic search enabled.
 
 ```bash
+# List all channels
 yt-fts list
+
+# List videos for a specific channel
+yt-fts list --channel "3Blue1Brown"
+
+# Show transcript for a specific video
+yt-fts list --transcript "dQw4w9WgXcQ"
+
+# Show library (same as default)
+yt-fts list --library
 ```
 
+**Options:**
+- `-t, --transcript`: Show transcript for a video
+- `-c, --channel`: Show list of videos for a channel
+- `-l, --library`: Show list of channels in library
+
+### `update`
+Update subtitles for all channels in the library or a specific channel.
+
+Keep in mind some might not have subtitles enabled. This command will still attempt to download subtitles as subtitles are sometimes added later.
+
+```bash
+# Update all channels
+yt-fts update
+
+# Update specific channel
+yt-fts update --channel "3Blue1Brown" --jobs 5
 ```
-┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓
-┃ ID ┃ Name                  ┃ Count ┃ Channel ID               ┃
-┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩
-│ 1  │ ChessPage1 (ss)       │ 19    │ UCO2QPmnJFjdvJ6ch-pe27dQ │
-│ 2  │ 3Blue1Brown           │ 127   │ UCYO_jab_esuFRV4b17AJtAw │
-│ 3  │ george hotz archive   │ 410   │ UCwgKmJM4ZJQRJ-U5NjvR2dg │
-│ 4  │ The Tim Dillon Show   │ 288   │ UC4woSp8ITBoYDmjkukhEhxg │
-│ 5  │ Academy of Ideas (ss) │ 190   │ UCiRiQGCHGjDLT9FQXFW0I3A │
-└────┴───────────────────────┴───────┴──────────────────────────┘
 
+**Options:**
+- `-c, --channel`: The name or id of the channel to update
+- `-l, --language`: Language of the subtitles to download (default: en)
+- `-j, --jobs`: Number of parallel download jobs (default: 8)
+- `--cookies-from-browser`: Browser to extract cookies from
+
+### `delete`
+Delete a channel and all its data.
+
+You must provide the name or the id of the channel you want to delete. The command will ask for confirmation before performing the deletion.
+
+```bash
+yt-fts delete --channel "3Blue1Brown"
+```
+
+**Options:**
+- `-c, --channel`: The name or id of the channel to delete (required)
+
+### `export`
+Export transcripts for a channel.
+
+This command will create a directory in the current working directory with the YouTube channel id of the specified channel.
+
+```bash
+# Export to txt format (default)
+yt-fts export --channel "3Blue1Brown" --format txt
+
+# Export to vtt format
+yt-fts export --channel "3Blue1Brown" --format vtt
 ```
 
-## `search` (Full Text Search)
+**Options:**
+- `-c, --channel`: The name or id of the channel to export transcripts for (required)
+- `-f, --format`: The format to export transcripts to. Supported formats: txt, vtt (default: txt)
+
+### `search` (Full Text Search)
 Full text search for a string in saved channels.
 
 - The search string does not have to be a word for word and match
@@ -79,7 +156,13 @@ yt-fts search "[search query]" --limit "[number of results]" --channel "[channel
 yt-fts search "[search query]" --export --channel "[channel name or id]"
 ```
 
-Advanced Search Syntax:
+**Options:**
+- `-c, --channel`: The name or id of the channel to search in
+- `-v, --video-id`: The id of the video to search in
+- `-l, --limit`: Number of results to return (default: 10)
+- `-e, --export`: Export search results to a CSV file
+
+**Advanced Search Syntax:**
 
 The search string supports sqlite [Enhanced Query Syntax](https://www.sqlite.org/fts3.html#full_text_index_queries).
 which includes things like [prefix queries](https://www.sqlite.org/fts3.html#termprefix) which you can use to match parts of a word.
@@ -95,17 +178,15 @@ yt-fts search "knife OR Malibu" --channel "The Tim Dillon Show"
 yt-fts search "rea* kni* Mali*" --channel "The Tim Dillon Show"
 ```
 
-
 # Semantic Search and RAG
-You can enable semantic search for a channel by using the `mbeddings` command.
+You can enable semantic search for a channel by using the `embeddings` command.
 This requires an OpenAI API key set in the environment variable `OPENAI_API_KEY`, or
 you can pass the key with the `--openai-api-key` flag.
 
-
-## `embeddings`
+### `embeddings`
 Fetches OpenAI embeddings for specified channel
-```bash
 
+```bash
 # make sure openAI key is set
 # export OPENAI_API_KEY="[yourOpenAIKey]"
 
@@ -116,10 +197,43 @@ yt-fts embeddings --channel "3Blue1Brown"
 # but semantic search will have more text for you to read.
 yt-fts embeddings --interval 60 --channel "3Blue1Brown"
 ```
+
+**Options:**
+- `-c, --channel`: The name or id of the channel to generate embeddings for
+- `--openai-api-key`: OpenAI API key (if not provided, reads from OPENAI_API_KEY environment variable)
+- `-i, --interval`: Interval in seconds to split the transcripts into chunks (default: 30)
+
 After the embeddings are saved you will see a `(ss)` next to the channel name when you
 list channels, and you will be able to use the `vsearch` command for that channel.
 
-## `llm` (Chat Bot)
+### `vsearch` (Semantic Search)
+`vsearch` is for "Vector search". This requires that you enable semantic
+search for a channel with `embeddings`. It has the same options as
+`search` but output will be sorted by similarity to the search string and
+the default return limit is 10.
+
+```bash
+# search by channel name
+yt-fts vsearch "[search query]" --channel "[channel name or id]"
+
+# search in specific video
+yt-fts vsearch "[search query]" --video-id "[video id]"
+
+# limit results
+yt-fts vsearch "[search query]" --limit "[number of results]" --channel "[channel name or id]"
+
+# export results to csv
+yt-fts vsearch "[search query]" --export --channel "[channel name or id]"
+```
+
+**Options:**
+- `-c, --channel`: The name or id of the channel to search in
+- `-v, --video-id`: The id of the video to search in
+- `-l, --limit`: Number of results to return (default: 10)
+- `-e, --export`: Export search results to a CSV file
+- `--openai-api-key`: OpenAI API key (if not provided, reads from OPENAI_API_KEY environment variable)
+
+### `llm` (Chat Bot)
 Starts interactive chat session with `gpt-4o` OpenAI model using
 the semantic search results of your initial prompt as the context
 to answer questions. If it can't answer your question, it has a
@@ -130,7 +244,11 @@ off the conversation. The channel must have semantic search enabled.
 yt-fts llm --channel "3Blue1Brown" "How does back propagation work?"
 ```
 
-## `summarize`
+**Options:**
+- `-c, --channel`: The name or id of the channel to use (required)
+- `--openai-api-key`: OpenAI API key (if not provided, reads from OPENAI_API_KEY environment variable)
+
+### `summarize`
 Summarizes a YouTube video transcript, providing time stamped URLS.
 Requires a valid YouTube video URL or video ID as argument. If the
 trancript is not in the database it will try to scrape it.
@@ -139,7 +257,15 @@ trancript is not in the database it will try to scrape it.
 yt-fts summarize "https://www.youtube.com/watch?v=9-Jl0dxWQs8"
 # or
 yt-fts summarize "9-Jl0dxWQs8"
+
+# Use different model
+yt-fts summarize --model "gpt-3.5-turbo" "9-Jl0dxWQs8"
 ```
+
+**Options:**
+- `--model, -m`: Model to use in summary (default: gpt-4o)
+- `--openai-api-key`: OpenAI API key (if not provided, reads from OPENAI_API_KEY environment variable)
+
 output:
 ```
 In this video, 3Blue1Brown explores how large language models (LLMs) like GPT-3
@@ -154,25 +280,11 @@ might store facts within their vast...
 • Provides a refresher on transformers and explains that the video will focus
 ```
 
-## `vsearch` (Semantic Search)
-`vsearch` is for "Vector search". This requires that you enable semantic
-search for a channel with `embeddings`. It has the same options as
-`search` but output will be sorted by similarity to the search string and
-the default return limit is 10.
+### `config`
+Show config settings including database and chroma paths.
 
 ```bash
-# search by channel name
-yt-fts vsearch "[search query]" --channel "[channel name or id]"
-
-# search in specific video
-yt-fts vsearch "[search query]" --video-id "[video id]"
-
-# limit results
-yt-fts vsearch "[search query]" --limit "[number of results]" --channel "[channel name or id]"
-
-# export results to csv
-yt-fts vsearch "[search query]" --export --channel "[channel name or id]"
-
+yt-fts config
 ```
 
 ## How To

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "yt-fts"
-version = "0.1.60"
+version = "0.1.61"
 description = "Search all of a YouTube channel from the command line"
 readme = "README.md"
 requires-python = ">=3.10"

src/yt_fts/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.1.60"
+__version__ = "0.1.61"
