InstaScraper

A Python CLI tool that downloads all photos and text content from an Instagram profile. It produces two outputs: a folder of the highest-resolution photos available (posts, stories, highlights) and a consolidated JSON file containing all text content, intended for LLM voice profiling.

Requirements

Python 3.12+
instaloader>=4.15 (only dependency; pulls in requests transitively)

Setup

python3.12 -m venv venv
source venv/bin/activate
pip install instaloader

Or using the venv directly (no activate needed):

./venv/bin/python instagram_scraper.py ...

Usage

First run (prompts for password, saves session for reuse):

python instagram_scraper.py <target_username> --login <your_username>

Subsequent runs use the same command and reuse the saved session automatically, without prompting for a password:

python instagram_scraper.py <target_username> --login <your_username>

Quick test run:

python instagram_scraper.py <target_username> --login <your_username> --max-posts 5 --skip-stories --skip-highlights --skip-comments
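
The session reuse described above could be sketched roughly as follows. This is not taken from instagram_scraper.py; it is a minimal illustration that assumes the script relies on instaloader's own session helpers (`load_session_from_file`, `interactive_login`, `save_session_to_file`). The function name `ensure_session` and its return values are hypothetical.

```python
def ensure_session(loader, login_username):
    """Reuse a saved session if one exists; otherwise log in and save one.

    `loader` is expected to be an instaloader.Instaloader instance.
    Returns "reused" or "created" (hypothetical labels, for illustration).
    """
    try:
        # instaloader raises FileNotFoundError when no session file exists
        loader.load_session_from_file(login_username)
        return "reused"
    except FileNotFoundError:
        # Prompts for the password, and for a 2FA code if the account needs one
        loader.interactive_login(login_username)
        # Written to instaloader's default session location
        loader.save_session_to_file()
        return "created"
```

With instaloader installed, this would be called as `ensure_session(instaloader.Instaloader(), "your_username")`. The first call prompts for credentials; later calls should find the saved session and return without prompting.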

Options

Flag	Description
`target`	Instagram username to scrape (positional)
`--login`	Your Instagram username for authentication (required)
`--output-dir`	Base output directory (default: `instagram_scrape`)
`--max-posts`	Limit number of posts to scrape (default: all)
`--skip-stories`	Skip active story scraping
`--skip-highlights`	Skip highlight album scraping
`--skip-comments`	Skip owner comment scraping (faster)
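
For reference, the options above map naturally onto an argparse definition like the one below. This is a reconstruction from the table, not the script's actual parser; the help strings, the `build_parser` name, and the choice of `None` to mean "all posts" are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="instagram_scraper.py",
        description="Download photos and text content from an Instagram profile.",
    )
    parser.add_argument("target", help="Instagram username to scrape")
    parser.add_argument("--login", required=True,
                        help="Your Instagram username for authentication")
    parser.add_argument("--output-dir", default="instagram_scrape",
                        help="Base output directory")
    # None stands in for "no limit" in this sketch
    parser.add_argument("--max-posts", type=int, default=None,
                        help="Limit number of posts to scrape")
    parser.add_argument("--skip-stories", action="store_true",
                        help="Skip active story scraping")
    parser.add_argument("--skip-highlights", action="store_true",
                        help="Skip highlight album scraping")
    parser.add_argument("--skip-comments", action="store_true",
                        help="Skip owner comment scraping")
    return parser


if __name__ == "__main__":
    # Parse a sample command line instead of sys.argv to show the defaults
    args = build_parser().parse_args(["some_target", "--login", "some_login"])
    print(args)
```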

Output

instagram_scrape/<username>/
  <username>_content.json     # All text content for LLM ingestion
  photos/
    profile_pic.jpg
    posts/
      <shortcode>_01.jpg      # Single or carousel images
    stories/
      story_<date>_<id>.jpg
    highlights/
      <album_title>/
        <date>_<id>.jpg

The JSON file includes: bio, captions, hashtags, mentions, the target user's own comments, story/highlight captions, and an aggregated text_summary section designed for LLM voice profiling.
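
A downstream script might locate and load the output roughly as shown below. The paths follow the layout above; the helper names `output_paths` and `load_content` are hypothetical, and the assumption that `text_summary` is a top-level key is a guess, since the exact JSON schema is not documented here.

```python
import json
from pathlib import Path


def output_paths(username, output_dir="instagram_scrape"):
    """Return the expected output locations for one scraped profile."""
    root = Path(output_dir) / username
    photos = root / "photos"
    return {
        "json": root / f"{username}_content.json",
        "profile_pic": photos / "profile_pic.jpg",
        "posts": photos / "posts",
        "stories": photos / "stories",
        "highlights": photos / "highlights",
    }


def load_content(username, output_dir="instagram_scrape"):
    """Load the consolidated JSON file for a profile."""
    json_path = output_paths(username, output_dir)["json"]
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_content("some_user").get("text_summary")` would return the aggregated text section if it sits at the top level, or `None` if the file is structured differently.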

Notes

Authentication required - stories, highlights, and comments need a logged-in session
2FA supported - prompts for a code when needed; each code is single-use
Resume support - re-running skips already-downloaded photos
Ctrl+C safe - saves partial JSON before exiting
Photos only - videos are skipped
Rate limiting - large profiles (500+ posts) may take a while; don't use Instagram in another tab during scraping
Session files are saved to ~/.config/instaloader/ and contain auth tokens - don't share them
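
The resume and Ctrl+C behaviour noted above can be illustrated with a small, generic loop. This is a sketch of the general pattern, not the script's code: the `scrape_with_resume` function, its `items` and `download` parameters, and the record shape are all hypothetical.

```python
import json
from pathlib import Path


def scrape_with_resume(items, photo_dir, json_path, download):
    """Download missing photos and always write the text collected so far.

    items:    iterable of (filename, text) pairs
    download: callable that takes a destination Path and writes the file
    """
    photo_dir = Path(photo_dir)
    photo_dir.mkdir(parents=True, exist_ok=True)
    collected = []
    try:
        for filename, text in items:
            dest = photo_dir / filename
            if not dest.exists():
                # Resume support: only fetch files that are not on disk yet
                download(dest)
            collected.append({"file": filename, "text": text})
    except KeyboardInterrupt:
        # Ctrl+C: stop the loop but fall through to the save below
        print("Interrupted; saving partial results.")
    finally:
        # Runs on normal completion, on Ctrl+C, and on other errors
        Path(json_path).write_text(
            json.dumps(collected, indent=2), encoding="utf-8"
        )
    return collected
```

Checking for the destination file before downloading is what makes a re-run cheap, and writing the JSON in a `finally` block is one simple way to keep partial results when the run is interrupted.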
