- Project Overview
- Architecture
- Database Schema
- Core Application (
app.py) - Database Management
- Templates and Frontend
- Static Assets
- API Endpoints
- Data Import System — including What we exclude (for nerds)
- Settings Management
- View Modes
- Error Handling
- Security Considerations
- Performance Optimizations
- Development Guidelines
ChatGPT Browser is a Flask-based web application designed to import, store, and browse ChatGPT conversation history. The application provides two distinct viewing modes:
- Nice Mode: Clean, focused view showing only the canonical conversation path
- Dev Mode: Full technical view with metadata and all conversation branches
- Import ChatGPT JSON export files
- Dual viewing modes (Nice/Dev)
- Dark/Light theme support
- Markdown rendering
- Conversation tree navigation
- Metadata inspection
- Customizable user/assistant names
- Backend: Flask 3.0.2
- Database: SQLite3
- Template Engine: Jinja2 3.1.3
- Markdown Processing: markdown 3.5.2
- Date Handling: python-dateutil 2.8.2
chatGPT-browser/
├── app.py # Main Flask application
├── init_db.py # Database initialization script
├── schema.sql # Database schema definition
├── requirements.txt # Python dependencies
├── templates/ # Jinja2 HTML templates
│ ├── base.html
│ ├── index.html
│ ├── conversation.html
│ ├── nice_conversation.html
│ └── settings.html
├── static/ # Static assets
│ └── style.css
├── docs/ # Documentation
│ └── CODE_DOCUMENTATION.md
└── diag-tools/ # Diagnostic tools (empty)
Stores conversation metadata and basic information.
CREATE TABLE conversations (
id TEXT PRIMARY KEY, -- ChatGPT conversation ID
create_time TEXT, -- Creation timestamp
update_time TEXT, -- Last update timestamp
title TEXT -- Conversation title
);Stores individual messages within conversations.
CREATE TABLE messages (
id TEXT PRIMARY KEY, -- Message ID
conversation_id TEXT, -- Foreign key to conversations
role TEXT, -- 'user' or 'assistant'
content TEXT, -- JSON-encoded message content
create_time TEXT, -- Message creation timestamp
update_time TEXT, -- Message update timestamp
parent_id TEXT, -- Parent message ID for threading
FOREIGN KEY (conversation_id) REFERENCES conversations(id)
);Stores technical metadata for messages.
CREATE TABLE message_metadata (
message_id TEXT PRIMARY KEY, -- Foreign key to messages
message_type TEXT, -- Type of message
model_slug TEXT, -- AI model used
citations TEXT, -- JSON-encoded citations
content_references TEXT, -- JSON-encoded content references
finish_details TEXT, -- JSON-encoded finish details
is_complete BOOLEAN, -- Whether message is complete
request_id TEXT, -- Request identifier
timestamp_ TEXT, -- Technical timestamp
message_source TEXT, -- Message source
serialization_metadata TEXT, -- JSON-encoded serialization metadata
FOREIGN KEY (message_id) REFERENCES messages(id)
);Manages parent-child relationships between messages for conversation threading.
CREATE TABLE message_children (
parent_id TEXT, -- Parent message ID
child_id TEXT, -- Child message ID
PRIMARY KEY (parent_id, child_id),
FOREIGN KEY (parent_id) REFERENCES messages(id),
FOREIGN KEY (child_id) REFERENCES messages(id)
);Stores application configuration settings.
CREATE TABLE settings (
key TEXT PRIMARY KEY, -- Setting key
value TEXT -- Setting value
);INSERT OR IGNORE INTO settings (key, value) VALUES
('user_name', 'User'),
('assistant_name', 'Assistant'),
('dev_mode', 'false'), -- false = nice mode, true = dev mode
('dark_mode', 'false'),
('verbose_mode', 'false');app = Flask(__name__)
app.secret_key = os.urandom(24) # Required for session management
# Initialize markdown with extensions
md = markdown.Markdown(extensions=['fenced_code', 'tables'])def get_db():
"""Create and return a database connection with Row factory."""
conn = sqlite3.connect('chatgpt.db')
conn.row_factory = sqlite3.Row
return connThe init_db() function:
- Creates all necessary tables if they don't exist
- Sets up foreign key relationships
- Inserts default settings
- Handles database schema migrations
@app.template_filter('fromjson')
def fromjson(value):
"""Convert JSON string to Python object."""
try:
return json.loads(value)
except:
return []
@app.template_filter('tojson')
def tojson(value, indent=None):
"""Convert Python object to JSON string."""
try:
return json.dumps(value, indent=indent)
except:
return str(value)@app.template_filter('datetime')
def format_datetime(timestamp):
"""Format timestamp for display."""
try:
if isinstance(timestamp, str):
timestamp = float(timestamp)
return datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')
except (ValueError, TypeError):
return timestamp@app.template_filter('markdown')
def markdown_filter(text):
"""Convert markdown text to HTML."""
if text is None:
return ""
return Markup(md.convert(text))def get_setting(key, default=None):
"""Retrieve a setting value from the database."""
conn = get_db()
setting = conn.execute('SELECT value FROM settings WHERE key = ?', (key,)).fetchone()
conn.close()
return setting['value'] if setting else default
def set_setting(key, value):
"""Store a setting value in the database."""
conn = get_db()
conn.execute('INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)', (key, value))
conn.commit()
conn.close()def main():
# Remove existing database if it exists
if os.path.exists('chatgpt.db'):
os.remove('chatgpt.db')
# Initialize fresh database
init_db()
print("Database initialized successfully!")base.html: Base template with common layout and navigationindex.html: Conversation list pageconversation.html: Full conversation view (Dev mode)nice_conversation.html: Clean conversation view (Nice mode)settings.html: Application settings page
- Theme Support: Dark/light mode toggle
- Responsive Design: Mobile-friendly layout
- Markdown Rendering: Code highlighting and formatting
- Dynamic Content: Real-time settings updates
- Navigation: Breadcrumb-style navigation
The stylesheet provides:
- Theme System: Dark and light mode styles
- Responsive Layout: Mobile-first design
- Typography: Readable font choices and spacing
- Interactive Elements: Hover effects and transitions
- Code Highlighting: Syntax highlighting for code blocks
@app.route('/')
def index():
"""Display conversation list with current settings."""- Method: GET
- Purpose: Shows all conversations in chronological order
- Template:
index.html
@app.route('/conversation/<conversation_id>')
def conversation(conversation_id):
"""Display full conversation in dev mode."""- Method: GET
- Purpose: Shows complete conversation with all metadata
- Template:
conversation.html - Redirects: To nice view if dev mode is disabled
@app.route('/conversation/<conversation_id>/nice')
def nice_conversation(conversation_id):
"""Display canonical conversation path."""- Method: GET
- Purpose: Shows only the canonical conversation path
- Template:
nice_conversation.html
@app.route('/conversation/<conversation_id>/full')
def full_conversation(conversation_id):
"""Force dev mode for conversation view."""- Method: GET
- Purpose: Temporarily enables dev mode for viewing
- Behavior: Sets session override and redirects
@app.route('/settings')
def settings():
"""Display settings page."""- Method: GET
- Purpose: Shows application settings
- Template:
settings.html
@app.route('/update_names', methods=['POST'])
def update_names():
"""Update user and assistant display names."""- Method: POST
- Purpose: Updates display names in database
- Redirects: To index page
@app.route('/toggle_view_mode')
def toggle_view_mode():
"""Toggle between nice and dev modes."""- Method: GET
- Purpose: Switches between nice and dev viewing modes
- Response: JSON with new mode status
@app.route('/toggle_dark_mode')
def toggle_dark_mode():
"""Toggle between dark and light themes."""- Method: GET
- Purpose: Switches between dark and light themes
- Redirects: To previous page
@app.route('/toggle_verbose_mode')
def toggle_verbose_mode():
"""Toggle verbose mode for additional details."""- Method: GET
- Purpose: Shows/hides additional technical details
- Response: JSON with new verbose status
@app.route('/import', methods=['POST'])
def import_json():
"""Import ChatGPT conversation data from JSON file."""- Method: POST
- Purpose: Processes ChatGPT export files
- File Handling: Accepts JSON file uploads
- Redirects: To index page after import
- File Validation: Check for uploaded file and valid JSON
- Data Parsing: Parse ChatGPT export format
- Conversation Processing: Extract conversation metadata
- Message Processing: Process individual messages and relationships
- Metadata Extraction: Extract technical metadata
- Database Storage: Store all data in SQLite database
The import system expects ChatGPT's JSON export format:
[
{
"id": "conversation_id",
"create_time": "timestamp",
"update_time": "timestamp",
"title": "Conversation Title",
"mapping": {
"message_id": {
"id": "message_id",
"message": {
"id": "message_id",
"author": {"role": "user|assistant"},
"content": {"parts": ["message content"]},
"create_time": "timestamp",
"update_time": "timestamp",
"metadata": {...}
},
"parent": "parent_message_id",
"children": ["child_message_ids"]
}
}
}
]- File Validation: Checks for empty files and valid JSON
- Conversation Processing: Skips conversations with missing IDs
- Message Processing: Continues processing even if individual messages fail
- Database Transactions: Uses transactions for data consistency
This section is for anyone who cares exactly what data the importer does not store.
1. Other files in the export
Only conversations.json is read. Everything else in the ChatGPT export zip is ignored: chat.html, group_chats.json, message_feedback.json, shared_conversations.json, shopping.json, user.json. So group chats, feedback, shared conversations, and user profile are not imported.
2. Whole conversations
Any conversation with a missing or empty id is skipped (logged as "Skipping conversation: missing ID").
3. Messages
Any mapping entry whose message is missing or an empty dict is skipped. That message is never inserted and its id is not added to the set of "inserted" ids for that conversation.
4. Message content
Only content.parts is stored (as JSON). Other keys on content in the export are not persisted.
5. Metadata
Only a fixed set of metadata fields is stored: message_type, model_slug, citations, content_references, finish_details, is_complete, request_id, timestamp, message_source, serialization_metadata. Any other metadata keys are dropped.
6. Parent–child links (message_children)
We insert a row (parent_id, child_id) only when both the parent and the child message were successfully inserted for that conversation. We skip a link when:
- The child was not inserted (e.g. no
messagekey, or exception during insert) — we don't add that child link. - The parent was not inserted — we skip the entire block of children for that parent, so no links from that parent are added.
Sampling real export data shows:
- Excluded "child" links (parent inserted, child not): In practice 0 in sampled conversations. We are not dropping links because the child message was missing.
- Excluded "parent" blocks: Typically one per conversation. These are mapping entries that have a
childrenarray but nomessagekey (or an empty message). They are structural/synthetic nodes, for example:client-created-root— synthetic root; no content; its single child is the real first message of the thread (which we do import).- Other UUIDs with no
message— branch/structure-only nodes in the export with no displayable content.
So the only excluded links are from these synthetic nodes to real messages. We do not lose any user or assistant content. The app's canonical path walks parent_id on messages (leaf → root), not the children table, so display and threading still work. The script scripts/sample_excluded_children.py can be run against your export to reproduce these counts and sample excluded entries (set MAX_CONV and MAX_SAMPLES if needed).
7. Custom instructions and memories
The app does not import or display ChatGPT’s custom instructions or memories as first-class data. Reason:
- The official ChatGPT export (the zip you get from Settings → Data Controls → Export) does not appear to include a dedicated file for custom instructions or memories. The zip typically contains
conversations.json,chat.html,user.json,group_chats.json,message_feedback.json,shared_conversations.json,shopping.json, etc. None of these are a dedicated “custom instructions” or “memories” export. - We only read
conversations.json. So even if OpenAI added acustom_instructions.jsonormemories.jsonin a future export, the current app would not load it unless we added support. - If custom instructions or memory-like text are embedded inside a conversation (e.g. as a system message or a special message type in
conversations.json), they would be imported as normal messages (we store all roles, includingsystem, andcontent.parts). We don’t treat them as a separate “custom instructions” or “memories” section in the UI.
So: custom instructions and memories are not exported by the system in a way we consume, and we don’t have a dedicated place to show them. Any such content that appears inside a conversation’s messages will still be in the DB and visible in the thread.
Limitation. Custom instructions and memories are important user data and are worth exporting. The app would support importing and displaying them if OpenAI included them in the data export (e.g. a custom_instructions.json or memories.json in the zip). Until then, users who want this data preserved can request that OpenAI add it to the export format (e.g. via in-product feedback or support).
| Setting Key | Default Value | Description |
|---|---|---|
user_name |
"User" | Display name for user messages |
assistant_name |
"Assistant" | Display name for assistant messages |
dev_mode |
"false" | View mode (false=nice, true=dev) |
dark_mode |
"false" | Theme mode (false=light, true=dark) |
verbose_mode |
"false" | Show additional technical details |
- Settings are stored in SQLite database
- Changes persist across application restarts
- Session overrides available for temporary changes
Purpose: Clean, focused conversation viewing
Features:
- Shows only canonical conversation path
- Hides technical metadata
- Clean, distraction-free interface
- Optimized for conversation review
Implementation:
# Find canonical endpoint (message with no children)
canonical_endpoint = conn.execute('''
SELECT m.id, m.role, m.content, m.create_time, m.parent_id
FROM messages m
LEFT JOIN messages child ON m.id = child.parent_id
WHERE m.conversation_id = ? AND child.id IS NULL
ORDER BY m.create_time DESC
LIMIT 1
''', (conversation_id,)).fetchone()Purpose: Full technical conversation analysis
Features:
- Shows all messages and branches
- Displays technical metadata
- Message IDs and timestamps
- Conversation tree structure
- Debugging information
Implementation:
# Get all messages with metadata
messages = conn.execute('''
SELECT m.*,
mm.message_type, mm.model_slug, mm.citations, mm.content_references,
mm.finish_details, mm.is_complete, mm.request_id, mm.timestamp_,
mm.message_source, mm.serialization_metadata
FROM messages m
LEFT JOIN message_metadata mm ON m.id = mm.message_id
WHERE m.conversation_id = ?
ORDER BY m.create_time
''', (conversation_id,)).fetchall()- Connection Management: Proper connection cleanup
- Transaction Rollback: Automatic rollback on errors
- Graceful Degradation: Continue processing on partial failures
- File Validation: Comprehensive file format checking
- Data Validation: Skip invalid records, continue processing
- Error Logging: Console output for debugging
- 404 Handling: Proper "not found" responses
- 400 Handling: Bad request responses for invalid data
- 500 Handling: Internal server error handling
- File Type Validation: Only accepts JSON files
- Content Validation: Validates JSON structure
- Size Limits: Implicit size limits through Flask configuration
- Random Secret Key: Generated using
os.urandom(24) - Session Management: Proper session cleanup
- CSRF Protection: Form-based protection
- Parameterized Queries: Prevents SQL injection
- Input Validation: Validates all user inputs
- Error Information: Limited error details in production
- Indexes: Created on frequently queried columns
- Connection Pooling: Efficient database connection management
- Query Optimization: Optimized SQL queries for large datasets
- CSS Minification: Optimized stylesheet delivery
- Template Caching: Jinja2 template caching
- Static Asset Caching: Browser caching for static files
- Connection Cleanup: Proper database connection handling
- Large File Handling: Streaming file processing
- Memory-Efficient Processing: Processing large datasets in chunks
- PEP 8 Compliance: Follow Python style guidelines
- Docstrings: Comprehensive function documentation
- Type Hints: Consider adding type hints for better IDE support
- Unit Tests: Test individual functions and components
- Integration Tests: Test database operations and API endpoints
- Manual Testing: Test import functionality with real data
- Production Configuration: Disable debug mode
- Database Backup: Regular database backups
- Logging: Implement proper logging for production
- Search Functionality: Full-text search across conversations
- Export Features: Export conversations in various formats
- User Authentication: Multi-user support
- API Endpoints: RESTful API for external integrations
- Advanced Analytics: Conversation analysis and insights
- Database Migrations: Schema versioning and migration system
- Dependency Updates: Regular security and feature updates
- Performance Monitoring: Monitor application performance
- Error Tracking: Implement error tracking and alerting
This documentation covers the complete technical implementation of the ChatGPT Browser application. For user-facing documentation, see the main README.md file.