TLDV Conversation Extractor

A tool to extract clean conversation transcripts from TLDV HTML, available as both a web app and a Python script.

🚀 Live Demo

Try the web version: TLDV Conversation Extractor

🎯 What It Does

This tool extracts clean, readable conversation transcripts from TLDV (tl;dv, short for "too long; didn't view") meeting recordings. It takes the HTML of a TLDV transcript and converts it into plain text, one line per turn, with the speaker's name followed by what they said.

📋 Example Output

Speaker 1: Not at the moment. No. Yeah, that's
Speaker 2: It. Listen, listen. Now and Another another project too. Pretty cool. Eh
Speaker 1: Um, what are you talking about this page?

🌐 Web Version

Features

Works entirely in your browser - no data is sent to any server
Privacy-focused: all processing happens locally
Download the extracted conversation as a text file
Copy the extracted conversation to clipboard
No installation or dependencies required

How to Use the Web Version

1. Visit TLDV Conversation Extractor
2. Go to your TLDV transcript page
3. Right-click on the transcript container and select "Inspect Element"
4. Find the <div id="transcript-container"> element
5. Right-click on it and select "Copy" → "Copy outerHTML"
6. Paste the copied HTML into the tool
7. Click "Extract Conversation" to process the transcript
8. Use the "Download" or "Copy to Clipboard" buttons as needed
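If you save the copied HTML to a file before processing it, a quick check that it really contains the transcript container can catch a wrong copy early. The sketch below is not part of this repository; the class and function names are illustrative, and it uses only Python's standard library html.parser module.

```python
from html.parser import HTMLParser


class ContainerFinder(HTMLParser):
    """Records whether a <div id="transcript-container"> start tag is seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "div" and dict(attrs).get("id") == "transcript-container":
            self.found = True


def has_transcript_container(html: str) -> bool:
    """Return True if the HTML contains the transcript container div."""
    finder = ContainerFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    copied = '<div id="transcript-container"><p>...</p></div>'
    print(has_transcript_container(copied))
```

If the check returns False, the most likely cause is that a child or parent element was copied instead of the container itself.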

Sample Input

We've included a sample-input.html file in this repository that shows the expected HTML structure from TLDV. The sample input would produce this output:

Speaker 1: Hello and welcome to our meeting.
Speaker 2: Thanks for having me here today.
Speaker 1: Let's discuss the project timeline.

This sample demonstrates the HTML structure that the tool is designed to parse. You can use it to test the tool or understand what kind of HTML to copy from TLDV.
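To give a rough idea of how such HTML can be turned into "Speaker: text" lines, here is a minimal sketch. It is not the code in extract_conversation.py (which uses BeautifulSoup4); it relies on the standard library html.parser so that it runs without dependencies. The class names speaker-name and speaker-text are hypothetical placeholders, so check sample-input.html for the markup TLDV actually produces.

```python
from html.parser import HTMLParser

# Hypothetical class names: the real TLDV markup may use different ones.
SPEAKER_CLASS = "speaker-name"
TEXT_CLASS = "speaker-text"

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TurnExtractor(HTMLParser):
    """Collects [speaker, text] pairs from elements marked with the classes above."""

    def __init__(self):
        super().__init__()
        self.turns = []       # list of [speaker, text] pairs
        self._target = None   # "speaker", "text", or None
        self._depth = 0       # nesting depth inside the current target element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._target is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if SPEAKER_CLASS in classes:
            self.turns.append(["", ""])
            self._target, self._depth = "speaker", 1
        elif TEXT_CLASS in classes and self.turns:
            self._target, self._depth = "text", 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._target is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._target = None

    def handle_data(self, data):
        if self._target == "speaker":
            self.turns[-1][0] += data
        elif self._target == "text":
            self.turns[-1][1] += data


def extract_conversation(html: str) -> str:
    """Return one 'Speaker: text' line per turn, with whitespace collapsed."""
    parser = TurnExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for speaker, text in parser.turns:
        speaker = " ".join(speaker.split())
        text = " ".join(text.split())
        if text:
            lines.append(f"{speaker}: {text}")
    return "\n".join(lines)
```

Tracking the nesting depth lets inline markup inside a turn (for example a bold word) be flattened into the surrounding sentence instead of ending the turn early.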

Installation (For Developers)

Want to host your own version? You have several options:

Netlify

1. Fork this repository
2. Sign up for Netlify
3. Click "New site from Git" and select your forked repository
4. Configure build settings (leave defaults)
5. Click "Deploy site"

Cloudflare Pages

1. Fork this repository
2. Sign up for Cloudflare Pages
3. Create a new project and connect your GitHub account
4. Select your forked repository
5. Configure with these settings:
   - Build command: (leave empty)
   - Build output directory: /
6. Click "Save and Deploy"

Vercel

1. Fork this repository
2. Sign up for Vercel
3. Create a new project and import your forked repository
4. Configure with default settings
5. Click "Deploy"

💻 Python Script Version

For those who prefer a command-line tool or want to process files in batch, we provide a Python script.

Requirements

Python 3.6+
BeautifulSoup4 (pip install beautifulsoup4)

Installation

# Clone the repository
git clone https://github.com/barshy/tldv-transcript-extractor-.git

# Navigate to the directory
cd tldv-transcript-extractor-

# Install dependencies
pip install -r requirements.txt

Usage

# Process a single file
python extract_conversation.py input_file.txt [output_file.txt]

# If output file is not specified, result is printed to console
python extract_conversation.py conversation1.txt

Python Script Features

Process TLDV transcript HTML files from the command line
Save output to a file or print to console
Can be integrated into other Python projects
Suitable for batch processing multiple files
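For batch processing, one option is a small wrapper that calls the script once per file using the command-line form shown under Usage. This is a sketch rather than something shipped in the repository: the function names, the *.txt pattern, and the _conversation.txt output suffix are assumptions chosen for illustration.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(input_dir, output_dir, pattern="*.txt",
                   script="extract_conversation.py"):
    """Return one command per input file, in the documented form:
    python extract_conversation.py input_file [output_file]
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    commands = []
    for src in sorted(input_dir.glob(pattern)):
        # Output naming is an arbitrary choice made for this sketch.
        dest = output_dir / f"{src.stem}_conversation.txt"
        commands.append([sys.executable, script, str(src), str(dest)])
    return commands


def run_batch(input_dir, output_dir, **kwargs):
    """Run the extractor on every matching file; stop at the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(input_dir, output_dir, **kwargs):
        subprocess.run(cmd, check=True)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python batch_extract.py transcripts/ extracted/
    run_batch(sys.argv[1], sys.argv[2])
```

Run it from the repository root so that the relative path to extract_conversation.py resolves, or pass an absolute path via the script argument.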

🔒 Privacy

Both versions (web and Python) process all data locally. Your transcript data never leaves your computer.

💡 Technical Details

Web version: Built with vanilla HTML, CSS, and JavaScript
Python version: Uses BeautifulSoup4 for HTML parsing
No external API calls or data collection

🤝 Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

1. Fork the repository
2. Create your feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add some amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
