🚀 A comprehensive tool to analyze YouTube titles, descriptions, and thumbnails for policy compliance, demonetization risks, and content violations.
✅ Text Analysis
- Scans titles & descriptions for policy violations
- Detects hate speech, violence, harassment, sexual content, misinformation, and copyright issues
- Provides severity ratings and suggested fixes
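A minimal sketch of how a regex-based scan with severity ratings could work. The rule table, category patterns, and the `scan_text` helper are illustrative assumptions, not the project's actual rules or API.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical rule table: category -> (severity, compiled pattern).
# The real tool's term lists are not shown in this README.
RULES = {
    "violence": ("high", re.compile(r"\b(kill|murder|shoot)\w*\b", re.IGNORECASE)),
    "misinformation": ("medium", re.compile(r"\b(miracle cure|guaranteed results)\b", re.IGNORECASE)),
}


@dataclass
class Finding:
    category: str
    severity: str
    match: str


def scan_text(text: str) -> List[Finding]:
    """Return one Finding per rule match, in rule-table order."""
    findings = []
    for category, (severity, pattern) in RULES.items():
        for m in pattern.finditer(text):
            findings.append(Finding(category, severity, m.group(0)))
    return findings


if __name__ == "__main__":
    for finding in scan_text("This miracle cure will KILL your boredom"):
        print(finding)
```

Keeping the severity next to each pattern means a report can sort or filter findings without a second lookup.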
✅ Sentiment Analysis
- Uses TextBlob to determine if content is positive, negative, or neutral
✅ Image Analysis (Thumbnails)
- OCR to extract text from images
- Skin detection to flag potential nudity
- Checks for inappropriate visuals
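To illustrate the skin-detection idea, here is a dependency-free sketch that classifies pixels with a commonly cited RGB heuristic and flags an image when the skin ratio is high. The project itself uses OpenCV; the rule, the `SKIN_THRESHOLD` value, and the function names below are assumptions for illustration only.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255

SKIN_THRESHOLD = 0.4  # hypothetical cutoff; tune against real thumbnails


def is_skin(pixel: Pixel) -> bool:
    """Simple RGB rule of thumb for skin-like colors (an approximation)."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_ratio(pixels: Iterable[Pixel]) -> float:
    """Fraction of pixels classified as skin; 0.0 for an empty image."""
    total = 0
    skin = 0
    for pixel in pixels:
        total += 1
        skin += is_skin(pixel)
    return skin / total if total else 0.0


def flag_thumbnail(pixels: Iterable[Pixel]) -> bool:
    """Flag the thumbnail for manual review when skin coverage is high."""
    return skin_ratio(pixels) > SKIN_THRESHOLD
```

A color rule like this produces false positives (sand, wood, warm lighting), so a flag is best treated as a prompt for review rather than a verdict.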
✅ Machine Learning (LSTM Model)
- Predicts risk scores based on historical data
- Improves accuracy over time as user feedback is incorporated
✅ YouTube API Integration
- Analyze existing videos by ID
- Fetch metadata for deeper insights
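A rough sketch of fetching a video's title, description, and thumbnail URL from the YouTube Data API v3 `videos` endpoint using only the standard library. The function names and the `YOUTUBE_API_KEY` environment variable are assumptions; the project's own code may structure this differently.

```python
import json
import os
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_request_url(video_id: str, api_key: str) -> str:
    """Build a videos.list request URL that asks for the snippet part."""
    query = urlencode({"part": "snippet", "id": video_id, "key": api_key})
    return f"{API_URL}?{query}"


def extract_metadata(payload: dict) -> dict:
    """Pull the fields the analyzer needs out of a videos.list response."""
    items = payload.get("items", [])
    if not items:
        raise ValueError("No video found for that ID")
    snippet = items[0]["snippet"]
    thumbnails = snippet.get("thumbnails", {})
    return {
        "title": snippet.get("title", ""),
        "description": snippet.get("description", ""),
        "thumbnail_url": thumbnails.get("high", {}).get("url", ""),
    }


def fetch_metadata(video_id: str, api_key: Optional[str] = None) -> dict:
    """Call the API; falls back to a hypothetical YOUTUBE_API_KEY env var."""
    api_key = api_key or os.environ["YOUTUBE_API_KEY"]
    with urlopen(build_request_url(video_id, api_key), timeout=10) as resp:
        return extract_metadata(json.load(resp))
```

Separating URL building and response parsing from the network call makes both easy to test without an API key.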
✅ Multilingual Support
- Supports non-English content via Google Translate
✅ Historical Trend Tracking
- Tracks risk scores over time
- Visualizes compliance trends
✅ User-Friendly GUI
- Built with Tkinter
- Easy-to-use interface for non-technical users
Requirements:
- Python 3.8+
- Tesseract OCR (for image text extraction)
git clone https://github.com/Harsha-hue/YouTube_Content_Guidelines_Analyzer.git
cd YouTube_Content_Guidelines_Analyzer
pip install -r requirements.txt
python -m spacy download en_core_web_lg
- Windows: Download the Tesseract installer from GitHub
- Mac:
brew install tesseract
- Linux (Debian/Ubuntu):
sudo apt install tesseract-ocr
- Go to Google Cloud Console
- Create a project & enable YouTube Data API v3
- Generate an API key
- Replace `YOUR_API_KEY` in `main.py`
python main.py
- Enter video Title and Description
- Upload a thumbnail (optional)
- Click "Analyze"
- View detailed report
# Analyze text only
python main.py --title "Your Video Title" --desc "Your Description"
# Analyze a YouTube video by ID
python main.py --video-id "VIDEO_ID"
# Analyze non-English text
python main.py --text "Foreign Text" --lang "es"
Edit `demonetization_triggers.csv` to add or remove trigger terms.
To train the model on your own dataset:
python train_model.py --data "your_dataset.csv"
Modify `target_lang` in `translate_and_analyze()` to use a different language.
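As an illustration of how a trigger-term file such as `demonetization_triggers.csv` could be loaded and applied, the sketch below assumes a header row with `term`, `category`, and `severity` columns. That schema and both helper names are hypothetical; check the actual file before relying on them.

```python
import csv
import io
import re
from typing import Dict, List, TextIO


def load_triggers(handle: TextIO) -> List[Dict[str, str]]:
    """Read trigger rows, skipping any row with an empty term."""
    triggers = []
    for row in csv.DictReader(handle):
        term = (row.get("term") or "").strip().lower()
        if not term:
            continue
        triggers.append({
            "term": term,
            "category": (row.get("category") or "").strip(),
            "severity": (row.get("severity") or "").strip(),
        })
    return triggers


def find_triggers(text: str, triggers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the triggers whose term appears in text as a whole word or phrase."""
    hits = []
    for trigger in triggers:
        pattern = r"\b" + re.escape(trigger["term"]) + r"\b"
        if re.search(pattern, text, re.IGNORECASE):
            hits.append(trigger)
    return hits


if __name__ == "__main__":
    # Inline sample data stands in for the real CSV file.
    sample = io.StringIO("term,category,severity\ngiveaway scam,misinformation,high\n")
    print(find_triggers("Huge GIVEAWAY SCAM exposed", load_triggers(sample)))
```

To read the real file, pass `open("demonetization_triggers.csv", newline="", encoding="utf-8")` to `load_triggers` instead of the in-memory sample.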
- Text Analysis
  - Uses regex & NLP to detect violations
  - Checks sentiment (positive/negative/neutral)
- Image Analysis
  - OCR extracts text
  - OpenCV detects skin tones
- Machine Learning
  - LSTM model predicts risk scores
  - Improves with user feedback
- Historical Trends
  - Stores past analyses
  - Plots risk trends over time
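The Historical Trends step could be sketched as follows: each analysis is stored in SQLite, and a moving average of risk scores gives a series that is ready to plot. The table layout, function names, and the use of SQLite are assumptions; the README does not say how the project stores its history.

```python
import sqlite3
from typing import List, Tuple


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a history database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS analyses ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "analyzed_at TEXT NOT NULL, "
        "title TEXT NOT NULL, "
        "risk_score REAL NOT NULL)"
    )
    return conn


def record_analysis(conn: sqlite3.Connection, analyzed_at: str, title: str, risk_score: float) -> None:
    """Store one analysis result; analyzed_at is an ISO 8601 date string."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO analyses (analyzed_at, title, risk_score) VALUES (?, ?, ?)",
            (analyzed_at, title, risk_score),
        )


def risk_trend(conn: sqlite3.Connection, window: int = 3) -> List[Tuple[str, float]]:
    """Moving average of risk scores over the last `window` analyses."""
    rows = conn.execute(
        "SELECT analyzed_at, risk_score FROM analyses ORDER BY analyzed_at, id"
    ).fetchall()
    trend = []
    for i, (timestamp, _score) in enumerate(rows):
        recent = [score for _ts, score in rows[max(0, i - window + 1): i + 1]]
        trend.append((timestamp, sum(recent) / len(recent)))
    return trend
```

The list returned by `risk_trend` can be handed to any plotting library, with dates on the x-axis and the smoothed score on the y-axis.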
- Fork the repo
- Create a branch (`git checkout -b feature/new-feature`)
- Commit changes (`git commit -m "Add new feature"`)
- Push (`git push origin feature/new-feature`)
- Open a Pull Request
This project is licensed under the MIT License.
Open an Issue or reach out at harshavardhankarne@gmail.com.
🚀 Happy Analyzing! 🚀