Thank you for your interest in contributing to Pinterest Media Scraper! We welcome contributions from everyone and appreciate your help in making this Streamlit-based media downloader better for the community.
- Code of Conduct
- Getting Started
- How to Contribute
- Development Setup
- Coding Standards
- Testing Guidelines
- Commit Guidelines
- Pull Request Process
- Issue Guidelines
- Community
This project and everyone participating in it is governed by our Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to ujjwalkrai@gmail.com.
Before you begin, ensure you have the following:
- A GitHub account
- Git installed on your local machine
- Python 3.8 or higher
- Basic knowledge of web scraping and Streamlit applications
- Understanding of Pinterest's structure and media handling
- Fork the repository on GitHub
- Clone your fork locally:
git clone https://github.com/your-username/pinterest-media-scraper.git
cd pinterest-media-scraper
- Add the upstream repository:
git remote add upstream https://github.com/uikraft-hub/pinterest-media-scraper.git
- Review the project structure:
pinterest-media-scraper/
├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.md
│   │   └── feature_request.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── RELEASE_TEMPLATE.md
│   └── workflows/
│       └── ci.yml
├── .gitignore
├── assets/
│   ├── pinterest-media-scraper-banner.jpg
│   └── screenshots/
│       └── screenshot.png
├── docs/
│   ├── CHANGELOG.md
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── README.md
│   ├── SECURITY.md
│   ├── STATUS.md
│   └── USAGE.md
├── LICENSE
├── pyproject.toml
├── requirements.txt
├── src/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── downloader.py
│   │   ├── ui.py
│   │   └── utils.py
│   └── main.py
└── tests/
    └── test_downloader.py
- Install the project in development mode:
pip install -e .
We welcome several types of contributions:
- Core Functionality: Enhance media scraping capabilities and download features
- UI/UX Improvements: Improve the Streamlit interface and user experience
- Performance Optimization: Optimize scraping speed, memory usage, and concurrency
- Error Handling: Improve error handling and user feedback mechanisms
- Testing: Add comprehensive tests for scraping and download functionality
- Documentation: Improve guides, API documentation, and usage examples
- Bug Reports: Help us identify and fix scraping or download issues
- Feature Requests: Suggest new Pinterest media sources or download options
- Platform Support: Add support for new Pinterest URL formats or media types
- Check existing issues and pull requests to avoid duplicates
- For major changes or new features, please open an issue first to discuss your proposed changes
- Make sure your contribution aligns with the project's goal of providing reliable Pinterest media downloading
- Test your changes with various Pinterest URLs (pins, boards, profiles)
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
pip install -e .
- Install development dependencies:
pip install pytest pytest-cov black flake8 mypy
- Create a new branch for your feature or improvement:
git checkout -b feature/your-feature-name
# or
git checkout -b fix/scraping-issue-description
# or
git checkout -b ui/interface-improvement
- Start the Streamlit application:
streamlit run src/main.py
- Test with various Pinterest URLs:
  - Individual pins
  - Board URLs
  - Profile URLs
  - Different media types (images, videos)
- Code Formatting: Use black for consistent code formatting
- Linting: Use flake8 for code quality checks
- Type Checking: Use mypy for static type analysis
- Testing: Use pytest for running tests
- Write clean, readable, and well-documented Python code
- Follow PEP 8 style guidelines with Black formatting
- Use type hints for function parameters and return values
- Handle errors gracefully with proper exception handling
- Log important events and errors for debugging
- Respect Pinterest's robots.txt and rate limiting
- Use Black for automatic code formatting
- Maximum line length: 88 characters (Black default)
- Use meaningful variable and function names
- Add docstrings for all public functions and classes
- Follow PEP 8 naming conventions
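As a rough illustration of the style points above (type hints, a docstring, logging, and explicit error handling), here is a minimal sketch. The helper name safe_filename and its behavior are hypothetical and are not taken from the project's utils.py.

```python
import logging
import re

logger = logging.getLogger(__name__)


def safe_filename(title: str, max_length: int = 80) -> str:
    """Turn a pin title into a filesystem-friendly file name.

    Hypothetical helper, shown for illustration only.

    Raises:
        ValueError: If nothing usable remains after cleaning.
    """
    # Replace every run of characters outside letters, digits, '.', '-'
    # and '_' with one underscore, then trim leading/trailing '.' and '_'.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("._")
    cleaned = cleaned[:max_length]
    if not cleaned:
        logger.error("Could not derive a file name from %r", title)
        raise ValueError(f"no usable characters in title: {title!r}")
    return cleaned
```

The function fits on one screen, names its failure mode in the docstring, and logs before raising, so a caller in the UI layer can decide how to report the problem.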
- Separate concerns: UI logic in ui.py, scraping logic in downloader.py
- Use utility functions in utils.py for common operations
- Keep the main application entry point clean in main.py
- Implement proper error handling and user feedback
- Respect Rate Limits: Implement appropriate delays between requests
- User-Agent Headers: Use appropriate user-agent strings
- Error Handling: Handle HTTP errors, network timeouts, and parsing errors
- Content Validation: Verify downloaded content integrity
- Legal Compliance: Ensure scraping practices comply with terms of service
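Two of the points above, request spacing and content validation, can be sketched with the standard library alone. The names RateLimiter and looks_like_image are assumptions made for this example; the project's downloader may handle both differently.

```python
import time
from typing import Callable, Optional

# Leading bytes ("magic numbers") of common image formats.
IMAGE_SIGNATURES = (
    b"\xff\xd8\xff",       # JPEG
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"GIF87a",             # GIF (1987 revision)
    b"GIF89a",             # GIF (1989 revision)
)


class RateLimiter:
    """Enforce a minimum delay between consecutive requests (hypothetical)."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock  # injectable so tests need not really wait
        self._sleep = sleep
        self._last_request: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed so calls are at least min_interval apart.

        Returns the number of seconds slept.
        """
        delay = 0.0
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last_request = self._clock()
        return delay


def looks_like_image(data: bytes) -> bool:
    """Return True if data starts with a known JPEG, PNG, or GIF signature."""
    return data.startswith(IMAGE_SIGNATURES)
```

Calling limiter.wait() before each HTTP request keeps requests spaced out, and checking looks_like_image on the first bytes of a response catches cases where an error page was saved in place of media. The signature check is only a first-pass sanity test, not a full integrity check.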
- Responsive Design: Ensure UI works on different screen sizes
- Progress Feedback: Show progress for long-running operations
- Error Messages: Display clear, actionable error messages
- Input Validation: Validate Pinterest URLs before processing
- State Management: Use Streamlit session state appropriately
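For the input validation point above, a minimal sketch of a URL check might look like the following. It is written as a standalone function and only accepts pinterest.com and its subdomains; that rule is an assumption for this example, and the project's own is_valid_pinterest_url may accept other hosts such as regional domains or short links.

```python
from urllib.parse import urlparse


def is_valid_pinterest_url(url: str) -> bool:
    """Return True if url is an http(s) link on pinterest.com or a subdomain."""
    try:
        parsed = urlparse(url.strip())
        host = (parsed.hostname or "").lower()
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. broken IPv6 brackets.
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    return host == "pinterest.com" or host.endswith(".pinterest.com")
```

In the Streamlit UI, a failed check could be reported with st.error before any network request is made, which gives the user an actionable message early.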
Tests are located in the tests/ directory and should follow these patterns:
# tests/test_downloader.py
import pytest
from src.app.downloader import PinterestDownloader


def test_url_validation():
    """Test Pinterest URL validation."""
    downloader = PinterestDownloader()
    assert downloader.is_valid_pinterest_url("https://pinterest.com/pin/123456")
    assert not downloader.is_valid_pinterest_url("https://invalid-url.com")


def test_media_extraction():
    """Test media extraction from Pinterest pages."""
    # Implementation details
    pass

# Run all tests
pytest
# Run with coverage
pytest --cov=src
# Run specific test file
pytest tests/test_downloader.py
# Run tests with verbose output
pytest -v
- Write tests for all new functionality
- Test both success and failure scenarios
- Mock external API calls and network requests
- Test with various Pinterest URL formats
- Include edge cases and error conditions
- Maintain test coverage above 80%
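To show what mocking network requests can look like in practice, here is a small sketch that covers one success and one failure scenario without touching the network. The function extract_image_urls and its fetch parameter are hypothetical stand-ins, not the project's actual downloader API.

```python
import re
from typing import Callable, List
from unittest.mock import Mock


def extract_image_urls(page_url: str, fetch: Callable[[str], str]) -> List[str]:
    """Return the src of every <img> tag in the HTML that fetch(page_url) returns.

    Hypothetical function: network errors (OSError, including timeouts)
    are treated as "no media found".
    """
    try:
        html = fetch(page_url)
    except OSError:
        return []
    return re.findall(r'<img[^>]+src="([^"]+)"', html)


def test_extract_image_urls_success() -> None:
    # The mock replaces the real HTTP call and returns canned HTML.
    fake_fetch = Mock(
        return_value='<img src="https://example.com/a.jpg">'
        '<img alt="x" src="https://example.com/b.png">'
    )
    urls = extract_image_urls("https://pinterest.com/pin/123456", fake_fetch)
    assert urls == ["https://example.com/a.jpg", "https://example.com/b.png"]
    fake_fetch.assert_called_once_with("https://pinterest.com/pin/123456")


def test_extract_image_urls_network_failure() -> None:
    # side_effect makes the mock raise, simulating a timeout.
    fake_fetch = Mock(side_effect=TimeoutError("simulated timeout"))
    assert extract_image_urls("https://pinterest.com/pin/123456", fake_fetch) == []
```

Passing the fetch callable as a parameter keeps the test free of patching; where the real code imports its HTTP client directly, unittest.mock.patch would be the usual alternative.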
We follow the Conventional Commits specification for commit messages.
<type>[optional scope]: <description>
[optional body]
[optional footer(s)]
Commit types:
- feat: A new feature (scraping capability, UI improvement)
- fix: A bug fix (scraping error, UI issue, download problem)
- perf: Performance improvements (faster scraping, memory optimization)
- refactor: Code refactoring without changing functionality
- test: Adding or updating tests
- docs: Documentation only changes
- style: Code style changes (formatting, etc.)
- chore: Maintenance tasks and dependency updates
Commit scopes:
- scraper: Changes to scraping functionality
- ui: Streamlit interface changes
- downloader: Download mechanism changes
- utils: Utility functions
- tests: Test-related changes
- docs: Documentation changes
feat(scraper): add support for Pinterest video downloads
fix(ui): resolve progress bar not updating during batch downloads
perf(downloader): optimize concurrent download processing
docs: update usage examples with new Pinterest URL formats
test(scraper): add comprehensive URL validation tests
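If you want to check a commit header locally before pushing, a small validator along these lines could help. This is a sketch under assumptions: it only checks the first line against the types listed in this guide, does not restrict scope names, and is not a tool shipped with the project.

```python
import re

# Commit types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "perf", "refactor", "test", "docs", "style", "chore")

# <type>[optional scope]: <description>
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"   # optional "(scope)"
    r": (?P<description>\S.*)$"     # colon, one space, non-empty description
)


def is_conventional_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the format."""
    return HEADER_RE.match(header) is not None
```

For example, is_conventional_header("fix(ui): resolve progress bar not updating during batch downloads") passes, while a header such as "updated stuff" does not.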
- Ensure your branch is up to date with the main branch:
git fetch upstream
git rebase upstream/main
- Run the full test suite:
pytest
- Check code formatting and linting:
black src/ tests/
flake8 src/ tests/
mypy src/
- Test the application manually with various Pinterest URLs
- Update documentation if necessary
- Push your branch to your fork:
git push origin feature/your-feature-name
- Create a pull request from your fork to the main repository
- Fill out the pull request template completely
- Include testing information and sample Pinterest URLs used for testing
- Code follows the project's coding standards
- Tests pass locally and cover new functionality
- Documentation has been updated (if applicable)
- Commit messages follow conventional commit format
- Changes have been tested with various Pinterest URL types
- No breaking changes (or breaking changes are documented)
- Performance impact has been considered
- Error handling has been implemented appropriately
- Search existing issues to avoid duplicates
- Test with the latest version of the application
- Gather relevant information (URLs, error messages, screenshots)
- Try to reproduce the issue consistently
When reporting a scraping or download bug, please include:
- Bug Description: Clear and concise description of the issue
- Pinterest URL: The specific Pinterest URL that's causing issues
- Steps to Reproduce: Detailed steps to reproduce the issue
- Expected Behavior: What should happen
- Actual Behavior: What actually happens
- Error Messages: Any error messages or logs
- Environment: Operating system, Python version, browser (if relevant)
- Screenshots: UI screenshots showing the issue
When requesting a new feature, please include:
- Feature Description: Clear description of the proposed feature
- Use Case: Why is this feature needed? What problem does it solve?
- Pinterest Context: Which Pinterest features or URL types this relates to
- Proposed Implementation: Your ideas for how this could be implemented
- Examples: Examples from other scraping tools if applicable
- Priority: How important is this feature to you?
When reporting performance problems:
- Performance Issue: Description of the slowness or resource usage
- Test URLs: Pinterest URLs used for testing
- System Specs: Hardware specifications and available resources
- Measurements: Specific timing or memory usage measurements
- Comparison: Performance with different URL types or quantities
- Proposed Solutions: Any optimization ideas you might have
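To gather the timing and memory measurements mentioned above, a small helper built on the standard library may be enough. The function name measure is hypothetical, and tracemalloc only tracks memory allocated by Python itself, so treat the numbers as approximate.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple


def measure(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float, int]:
    """Run func and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        # Collect measurements and stop tracing even if func raised.
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak
```

Wrapping the call you suspect is slow, then reporting the elapsed seconds and peak bytes together with the URL used, makes a performance report much easier to reproduce.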
If you need help or have questions:
- Open an issue with the "question" label
- Email us at ujjwalkrai@gmail.com
- Check existing documentation in the docs/ folder
- Review the USAGE.md for detailed usage instructions
For development-related discussions:
- Use GitHub Discussions for general questions
- Comment on relevant issues for specific topics
- Join conversations in pull requests
- Follow our Code of Conduct in all interactions
We appreciate all contributions and maintain a contributors list to recognize everyone who has helped improve this project. Contributors will be acknowledged in our documentation and release notes.
By contributing to this project, you agree that your contributions will be licensed under the MIT License, the same license as the project. See LICENSE for details.
# Setup
git clone https://github.com/your-username/pinterest-media-scraper.git
cd pinterest-media-scraper
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -e .
# Development
git checkout -b feature/new-scraping-feature
# Make your changes
black src/ tests/
flake8 src/ tests/
pytest
git add .
git commit -m "feat(scraper): add new Pinterest board scraping capability"
git push origin feature/new-scraping-feature
# Testing
pytest
pytest --cov=src
streamlit run src/main.py
- 📧 Email: ujjwalkrai@gmail.com
- 🐛 Issues: Repository Issues
- 🔓 Security: Repository Security
- ⛏ Pull Requests: Repository Pull Requests
- 📖 Documentation: Repository Documentation
- 📃 Changelog: Repository Changelog
If you're new to contributing or need assistance:
- Start by reviewing the existing code structure
- Test the application with different Pinterest URLs
- Begin with small improvements or bug fixes
- Don't hesitate to ask questions in issues or via email
- Check the USAGE.md for application usage details
Thank you for contributing to Pinterest Media Scraper! Together, we're making Pinterest media downloading more accessible and reliable for everyone. 🎉