ChatGPT Browser - Code Documentation

Project Overview
Architecture
Database Schema
Core Application (app.py)
Database Management
Templates and Frontend
Static Assets
API Endpoints
Data Import System — including What we exclude (for nerds)
Settings Management
View Modes
Error Handling
Security Considerations
Performance Optimizations
Development Guidelines

Project Overview

ChatGPT Browser is a Flask-based web application designed to import, store, and browse ChatGPT conversation history. The application provides two distinct viewing modes:

Nice Mode: Clean, focused view showing only the canonical conversation path
Dev Mode: Full technical view with metadata and all conversation branches

Key Features

Import ChatGPT JSON export files
Dual viewing modes (Nice/Dev)
Dark/Light theme support
Markdown rendering
Conversation tree navigation
Metadata inspection
Customizable user/assistant names

Architecture

Technology Stack

Backend: Flask 3.0.2
Database: SQLite3
Template Engine: Jinja2 3.1.3
Markdown Processing: markdown 3.5.2
Date Handling: python-dateutil 2.8.2

Project Structure

chatGPT-browser/
├── app.py                 # Main Flask application
├── init_db.py            # Database initialization script
├── schema.sql            # Database schema definition
├── requirements.txt      # Python dependencies
├── templates/            # Jinja2 HTML templates
│   ├── base.html
│   ├── index.html
│   ├── conversation.html
│   ├── nice_conversation.html
│   └── settings.html
├── static/               # Static assets
│   └── style.css
├── docs/                 # Documentation
│   └── CODE_DOCUMENTATION.md
└── diag-tools/           # Diagnostic tools (empty)

Database Schema

Core Tables

1. `conversations` Table

Stores conversation metadata and basic information.

CREATE TABLE conversations (
    id TEXT PRIMARY KEY,           -- ChatGPT conversation ID
    create_time TEXT,              -- Creation timestamp
    update_time TEXT,              -- Last update timestamp
    title TEXT                     -- Conversation title
);

2. `messages` Table

Stores individual messages within conversations.

CREATE TABLE messages (
    id TEXT PRIMARY KEY,           -- Message ID
    conversation_id TEXT,          -- Foreign key to conversations
    role TEXT,                     -- 'user' or 'assistant'
    content TEXT,                  -- JSON-encoded message content
    create_time TEXT,              -- Message creation timestamp
    update_time TEXT,              -- Message update timestamp
    parent_id TEXT,                -- Parent message ID for threading
    FOREIGN KEY (conversation_id) REFERENCES conversations(id)
);

3. `message_metadata` Table

Stores technical metadata for messages.

CREATE TABLE message_metadata (
    message_id TEXT PRIMARY KEY,   -- Foreign key to messages
    message_type TEXT,             -- Type of message
    model_slug TEXT,               -- AI model used
    citations TEXT,                -- JSON-encoded citations
    content_references TEXT,       -- JSON-encoded content references
    finish_details TEXT,           -- JSON-encoded finish details
    is_complete BOOLEAN,           -- Whether message is complete
    request_id TEXT,               -- Request identifier
    timestamp_ TEXT,               -- Technical timestamp
    message_source TEXT,           -- Message source
    serialization_metadata TEXT,   -- JSON-encoded serialization metadata
    FOREIGN KEY (message_id) REFERENCES messages(id)
);

4. `message_children` Table

Manages parent-child relationships between messages for conversation threading.

CREATE TABLE message_children (
    parent_id TEXT,                -- Parent message ID
    child_id TEXT,                 -- Child message ID
    PRIMARY KEY (parent_id, child_id),
    FOREIGN KEY (parent_id) REFERENCES messages(id),
    FOREIGN KEY (child_id) REFERENCES messages(id)
);

5. `settings` Table

Stores application configuration settings.

CREATE TABLE settings (
    key TEXT PRIMARY KEY,          -- Setting key
    value TEXT                     -- Setting value
);

Default Settings

INSERT OR IGNORE INTO settings (key, value) VALUES 
    ('user_name', 'User'),
    ('assistant_name', 'Assistant'),
    ('dev_mode', 'false'),         -- false = nice mode, true = dev mode
    ('dark_mode', 'false'),
    ('verbose_mode', 'false');

Core Application (`app.py`)

Application Initialization

app = Flask(__name__)
app.secret_key = os.urandom(24)  # Required for session management

# Initialize markdown with extensions
md = markdown.Markdown(extensions=['fenced_code', 'tables'])

Database Connection Management

def get_db():
    """Create and return a database connection with Row factory."""
    conn = sqlite3.connect('chatgpt.db')
    conn.row_factory = sqlite3.Row
    return conn

Database Initialization

The init_db() function:

Creates all necessary tables if they don't exist
Sets up foreign key relationships
Inserts default settings
Handles database schema migrations

Jinja2 Template Filters

JSON Filters

@app.template_filter('fromjson')
def fromjson(value):
    """Convert JSON string to Python object."""
    try:
        return json.loads(value)
    except:
        return []

@app.template_filter('tojson')
def tojson(value, indent=None):
    """Convert Python object to JSON string."""
    try:
        return json.dumps(value, indent=indent)
    except:
        return str(value)

DateTime Filter

@app.template_filter('datetime')
def format_datetime(timestamp):
    """Format timestamp for display."""
    try:
        if isinstance(timestamp, str):
            timestamp = float(timestamp)
        return datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')
    except (ValueError, TypeError):
        return timestamp

Markdown Filter

@app.template_filter('markdown')
def markdown_filter(text):
    """Convert markdown text to HTML."""
    if text is None:
        return ""
    return Markup(md.convert(text))

Database Management

Settings Management

def get_setting(key, default=None):
    """Retrieve a setting value from the database."""
    conn = get_db()
    setting = conn.execute('SELECT value FROM settings WHERE key = ?', (key,)).fetchone()
    conn.close()
    return setting['value'] if setting else default

def set_setting(key, value):
    """Store a setting value in the database."""
    conn = get_db()
    conn.execute('INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)', (key, value))
    conn.commit()
    conn.close()

Database Initialization Script (`init_db.py`)

def main():
    # Remove existing database if it exists
    if os.path.exists('chatgpt.db'):
        os.remove('chatgpt.db')
    
    # Initialize fresh database
    init_db()
    print("Database initialized successfully!")

Templates and Frontend

Template Hierarchy

base.html: Base template with common layout and navigation
index.html: Conversation list page
conversation.html: Full conversation view (Dev mode)
nice_conversation.html: Clean conversation view (Nice mode)
settings.html: Application settings page

Template Features

Theme Support: Dark/light mode toggle
Responsive Design: Mobile-friendly layout
Markdown Rendering: Code highlighting and formatting
Dynamic Content: Real-time settings updates
Navigation: Breadcrumb-style navigation

Static Assets

CSS Styling (`static/style.css`)

The stylesheet provides:

Theme System: Dark and light mode styles
Responsive Layout: Mobile-first design
Typography: Readable font choices and spacing
Interactive Elements: Hover effects and transitions
Code Highlighting: Syntax highlighting for code blocks

API Endpoints

Main Routes

1. Index Page (`/`)

@app.route('/')
def index():
    """Display conversation list with current settings."""

Method: GET
Purpose: Shows all conversations in chronological order
Template: index.html

2. Conversation View (`/conversation/<conversation_id>`)

@app.route('/conversation/<conversation_id>')
def conversation(conversation_id):
    """Display full conversation in dev mode."""

Method: GET
Purpose: Shows complete conversation with all metadata
Template: conversation.html
Redirects: To nice view if dev mode is disabled

3. Nice Conversation View (`/conversation/<conversation_id>/nice`)

@app.route('/conversation/<conversation_id>/nice')
def nice_conversation(conversation_id):
    """Display canonical conversation path."""

Method: GET
Purpose: Shows only the canonical conversation path
Template: nice_conversation.html

4. Full Conversation View (`/conversation/<conversation_id>/full`)

@app.route('/conversation/<conversation_id>/full')
def full_conversation(conversation_id):
    """Force dev mode for conversation view."""

Method: GET
Purpose: Temporarily enables dev mode for viewing
Behavior: Sets session override and redirects

Settings Routes

5. Settings Page (`/settings`)

@app.route('/settings')
def settings():
    """Display settings page."""

Method: GET
Purpose: Shows application settings
Template: settings.html

6. Update Names (`/update_names`)

@app.route('/update_names', methods=['POST'])
def update_names():
    """Update user and assistant display names."""

Method: POST
Purpose: Updates display names in database
Redirects: To index page

Toggle Routes

7. Toggle View Mode (`/toggle_view_mode`)

@app.route('/toggle_view_mode')
def toggle_view_mode():
    """Toggle between nice and dev modes."""

Method: GET
Purpose: Switches between nice and dev viewing modes
Response: JSON with new mode status

8. Toggle Dark Mode (`/toggle_dark_mode`)

@app.route('/toggle_dark_mode')
def toggle_dark_mode():
    """Toggle between dark and light themes."""

Method: GET
Purpose: Switches between dark and light themes
Redirects: To previous page

9. Toggle Verbose Mode (`/toggle_verbose_mode`)

@app.route('/toggle_verbose_mode')
def toggle_verbose_mode():
    """Toggle verbose mode for additional details."""

Method: GET
Purpose: Shows/hides additional technical details
Response: JSON with new verbose status

Import Route

10. Import JSON (`/import`)

@app.route('/import', methods=['POST'])
def import_json():
    """Import ChatGPT conversation data from JSON file."""

Method: POST
Purpose: Processes ChatGPT export files
File Handling: Accepts JSON file uploads
Redirects: To index page after import

Data Import System

Import Process Flow

File Validation: Check for uploaded file and valid JSON
Data Parsing: Parse ChatGPT export format
Conversation Processing: Extract conversation metadata
Message Processing: Process individual messages and relationships
Metadata Extraction: Extract technical metadata
Database Storage: Store all data in SQLite database

ChatGPT Export Format

The import system expects ChatGPT's JSON export format:

[
  {
    "id": "conversation_id",
    "create_time": "timestamp",
    "update_time": "timestamp",
    "title": "Conversation Title",
    "mapping": {
      "message_id": {
        "id": "message_id",
        "message": {
          "id": "message_id",
          "author": {"role": "user|assistant"},
          "content": {"parts": ["message content"]},
          "create_time": "timestamp",
          "update_time": "timestamp",
          "metadata": {...}
        },
        "parent": "parent_message_id",
        "children": ["child_message_ids"]
      }
    }
  }
]

Import Error Handling

File Validation: Checks for empty files and valid JSON
Conversation Processing: Skips conversations with missing IDs
Message Processing: Continues processing even if individual messages fail
Database Transactions: Uses transactions for data consistency

What we exclude (for nerds)

This section is for anyone who cares exactly what data the importer does not store.

1. Other files in the export

Only conversations.json is read. Everything else in the ChatGPT export zip is ignored: chat.html, group_chats.json, message_feedback.json, shared_conversations.json, shopping.json, user.json. So group chats, feedback, shared conversations, and user profile are not imported.

2. Whole conversations

Any conversation with a missing or empty id is skipped (logged as "Skipping conversation: missing ID").

3. Messages

Any mapping entry whose message is missing or an empty dict is skipped. That message is never inserted and its id is not added to the set of "inserted" ids for that conversation.

4. Message content

Only content.parts is stored (as JSON). Other keys on content in the export are not persisted.

5. Metadata

Only a fixed set of metadata fields is stored: message_type, model_slug, citations, content_references, finish_details, is_complete, request_id, timestamp, message_source, serialization_metadata. Any other metadata keys are dropped.

6. Parent–child links (message_children)

We insert a row (parent_id, child_id) only when both the parent and the child message were successfully inserted for that conversation. We skip a link when:

The child was not inserted (e.g. no message key, or exception during insert) — we don't add that child link.
The parent was not inserted — we skip the entire block of children for that parent, so no links from that parent are added.

Sampling real export data shows:

Excluded "child" links (parent inserted, child not): In practice 0 in sampled conversations. We are not dropping links because the child message was missing.
Excluded "parent" blocks: Typically one per conversation. These are mapping entries that have a children array but no message key (or an empty message). They are structural/synthetic nodes, for example:
- client-created-root — synthetic root; no content; its single child is the real first message of the thread (which we do import).
- Other UUIDs with no message — branch/structure-only nodes in the export with no displayable content.

So the only excluded links are from these synthetic nodes to real messages. We do not lose any user or assistant content. The app's canonical path walks parent_id on messages (leaf → root), not the children table, so display and threading still work. The script scripts/sample_excluded_children.py can be run against your export to reproduce these counts and sample excluded entries (set MAX_CONV and MAX_SAMPLES if needed).

7. Custom instructions and memories

The app does not import or display ChatGPT’s custom instructions or memories as first-class data. Reason:

The official ChatGPT export (the zip you get from Settings → Data Controls → Export) does not appear to include a dedicated file for custom instructions or memories. The zip typically contains conversations.json, chat.html, user.json, group_chats.json, message_feedback.json, shared_conversations.json, shopping.json, etc. None of these are a dedicated “custom instructions” or “memories” export.
We only read conversations.json. So even if OpenAI added a custom_instructions.json or memories.json in a future export, the current app would not load it unless we added support.
If custom instructions or memory-like text are embedded inside a conversation (e.g. as a system message or a special message type in conversations.json), they would be imported as normal messages (we store all roles, including system, and content.parts). We don’t treat them as a separate “custom instructions” or “memories” section in the UI.

So: custom instructions and memories are not exported by the system in a way we consume, and we don’t have a dedicated place to show them. Any such content that appears inside a conversation’s messages will still be in the DB and visible in the thread.

Limitation. Custom instructions and memories are important user data and are worth exporting. The app would support importing and displaying them if OpenAI included them in the data export (e.g. a custom_instructions.json or memories.json in the zip). Until then, users who want this data preserved can request that OpenAI add it to the export format (e.g. via in-product feedback or support).

Settings Management

Available Settings

Setting Key	Default Value	Description
`user_name`	"User"	Display name for user messages
`assistant_name`	"Assistant"	Display name for assistant messages
`dev_mode`	"false"	View mode (false=nice, true=dev)
`dark_mode`	"false"	Theme mode (false=light, true=dark)
`verbose_mode`	"false"	Show additional technical details

Settings Persistence

Settings are stored in SQLite database
Changes persist across application restarts
Session overrides available for temporary changes

View Modes

Nice Mode (Default)

Purpose: Clean, focused conversation viewing

Features:

Shows only canonical conversation path
Hides technical metadata
Clean, distraction-free interface
Optimized for conversation review

Implementation:

# Find canonical endpoint (message with no children)
canonical_endpoint = conn.execute('''
    SELECT m.id, m.role, m.content, m.create_time, m.parent_id
    FROM messages m
    LEFT JOIN messages child ON m.id = child.parent_id
    WHERE m.conversation_id = ? AND child.id IS NULL
    ORDER BY m.create_time DESC
    LIMIT 1
''', (conversation_id,)).fetchone()

Dev Mode

Purpose: Full technical conversation analysis

Features:

Shows all messages and branches
Displays technical metadata
Message IDs and timestamps
Conversation tree structure
Debugging information

Implementation:

# Get all messages with metadata
messages = conn.execute('''
    SELECT m.*, 
           mm.message_type, mm.model_slug, mm.citations, mm.content_references,
           mm.finish_details, mm.is_complete, mm.request_id, mm.timestamp_,
           mm.message_source, mm.serialization_metadata
    FROM messages m
    LEFT JOIN message_metadata mm ON m.id = mm.message_id
    WHERE m.conversation_id = ?
    ORDER BY m.create_time
''', (conversation_id,)).fetchall()

Error Handling

Database Errors

Connection Management: Proper connection cleanup
Transaction Rollback: Automatic rollback on errors
Graceful Degradation: Continue processing on partial failures

Import Errors

File Validation: Comprehensive file format checking
Data Validation: Skip invalid records, continue processing
Error Logging: Console output for debugging

Web Errors

404 Handling: Proper "not found" responses
400 Handling: Bad request responses for invalid data
500 Handling: Internal server error handling

Security Considerations

File Upload Security

File Type Validation: Only accepts JSON files
Content Validation: Validates JSON structure
Size Limits: Implicit size limits through Flask configuration

Session Security

Random Secret Key: Generated using os.urandom(24)
Session Management: Proper session cleanup
CSRF Protection: Form-based protection

Database Security

Parameterized Queries: Prevents SQL injection
Input Validation: Validates all user inputs
Error Information: Limited error details in production

Performance Optimizations

Database Optimizations

Indexes: Created on frequently queried columns
Connection Pooling: Efficient database connection management
Query Optimization: Optimized SQL queries for large datasets

Frontend Optimizations

CSS Minification: Optimized stylesheet delivery
Template Caching: Jinja2 template caching
Static Asset Caching: Browser caching for static files

Memory Management

Connection Cleanup: Proper database connection handling
Large File Handling: Streaming file processing
Memory-Efficient Processing: Processing large datasets in chunks

Development Guidelines

Code Style

PEP 8 Compliance: Follow Python style guidelines
Docstrings: Comprehensive function documentation
Type Hints: Consider adding type hints for better IDE support

Testing

Unit Tests: Test individual functions and components
Integration Tests: Test database operations and API endpoints
Manual Testing: Test import functionality with real data

Deployment

Production Configuration: Disable debug mode
Database Backup: Regular database backups
Logging: Implement proper logging for production

Future Enhancements

Search Functionality: Full-text search across conversations
Export Features: Export conversations in various formats
User Authentication: Multi-user support
API Endpoints: RESTful API for external integrations
Advanced Analytics: Conversation analysis and insights

Maintenance

Database Migrations: Schema versioning and migration system
Dependency Updates: Regular security and feature updates
Performance Monitoring: Monitor application performance
Error Tracking: Implement error tracking and alerting

This documentation covers the complete technical implementation of the ChatGPT Browser application. For user-facing documentation, see the main README.md file.

FilesExpand file tree

CODE_DOCUMENTATION.md

Latest commit

History

CODE_DOCUMENTATION.md

File metadata and controls

ChatGPT Browser - Code Documentation

Table of Contents

Project Overview

Key Features

Architecture

Technology Stack

Project Structure

Database Schema

Core Tables

1. conversations Table

2. messages Table

3. message_metadata Table

4. message_children Table

5. settings Table

Default Settings

Core Application (app.py)

Application Initialization

Database Connection Management

Database Initialization

Jinja2 Template Filters

JSON Filters

DateTime Filter

Markdown Filter

Database Management

Settings Management

Database Initialization Script (init_db.py)

Templates and Frontend

Template Hierarchy

Template Features

Static Assets

CSS Styling (static/style.css)

API Endpoints

Main Routes

1. Index Page (/)

2. Conversation View (/conversation/<conversation_id>)

3. Nice Conversation View (/conversation/<conversation_id>/nice)

4. Full Conversation View (/conversation/<conversation_id>/full)

Settings Routes

5. Settings Page (/settings)

6. Update Names (/update_names)

Toggle Routes

7. Toggle View Mode (/toggle_view_mode)

8. Toggle Dark Mode (/toggle_dark_mode)

9. Toggle Verbose Mode (/toggle_verbose_mode)

Import Route

10. Import JSON (/import)

Data Import System

Import Process Flow

ChatGPT Export Format

Import Error Handling

What we exclude (for nerds)

Settings Management

Available Settings

Settings Persistence

View Modes

Nice Mode (Default)

Dev Mode

Error Handling

Database Errors

Import Errors

Web Errors

Security Considerations

File Upload Security

Session Security

Database Security

Performance Optimizations

Database Optimizations

Frontend Optimizations

Memory Management

Development Guidelines

Code Style

Testing

Deployment

Future Enhancements

1. `conversations` Table

2. `messages` Table

3. `message_metadata` Table

4. `message_children` Table

5. `settings` Table

Core Application (`app.py`)

Database Initialization Script (`init_db.py`)

CSS Styling (`static/style.css`)

1. Index Page (`/`)

2. Conversation View (`/conversation/<conversation_id>`)

3. Nice Conversation View (`/conversation/<conversation_id>/nice`)

4. Full Conversation View (`/conversation/<conversation_id>/full`)

5. Settings Page (`/settings`)

6. Update Names (`/update_names`)

7. Toggle View Mode (`/toggle_view_mode`)

8. Toggle Dark Mode (`/toggle_dark_mode`)

9. Toggle Verbose Mode (`/toggle_verbose_mode`)

10. Import JSON (`/import`)