A .NET 8 console application that exports Slack channel messages to structured JSONL format for data analysis and AI processing.
- Archives messages from multiple Slack channels simultaneously
- Exports structured data in JSONL format (one JSON object per line)
- Handles threaded messages with proper threading context
- Groups messages by month for organized archives
- Includes comprehensive message metadata and user information
- Built-in rate limiting and retry logic with exponential backoff
- Production-ready with dependency injection, logging, and configuration
- Supports flexible date range filtering
- 📊 GitHub Actions Setup - Run in the cloud (recommended)
- 💻 Local Setup - Run on your machine (see below)
- 🤝 Contributing Guide - Development workflow with PRs
- 🔒 Security Policy - Security best practices and reporting
- Go to Slack API
- Create a new app or use an existing one
- Go to "OAuth & Permissions"
- Add these scopes under "Bot Token Scopes":
  - channels:history - Read messages from public channels
  - users:read - Read user profile information
- Install the app to your workspace
- Copy the "Bot User OAuth Token" (starts with xoxb-)
- Open Slack in your browser
- Navigate to the channel you want to archive
- Copy the channel ID from the URL (e.g., C1234567890)
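If you have many channel URLs to process, the ID can be pulled out with a small script. The sketch below is not part of this tool; it assumes the channel ID appears as its own path segment (an uppercase "C" followed by uppercase letters and digits), and the helper name `channel_id_from_url` is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: a public channel ID is a path segment such as "C1234567890".
_CHANNEL_SEGMENT = re.compile(r"^C[A-Z0-9]{8,}$")


def channel_id_from_url(url: str) -> Optional[str]:
    """Return the first path segment that looks like a channel ID, or None."""
    for segment in urlparse(url).path.split("/"):
        if _CHANNEL_SEGMENT.match(segment):
            return segment
    return None


if __name__ == "__main__":
    # Example URL shape only; the workspace ID here is made up.
    print(channel_id_from_url("https://app.slack.com/client/T0123456789/C1234567890"))
```

If the function returns None, copy the ID from the URL by hand as described above.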
Copy appsettings.example.json to appsettings.json and configure:
{
  "Slack": {
    "Token": "xoxb-your-actual-bot-token-here",
    "Channels": [
      {
        "Id": "C1234567890",
        "Name": "general"
      },
      {
        "Id": "C0987654321",
        "Name": "random"
      }
    ]
  },
  "Archive": {
    "OutputPath": "./slack-archive"
  },
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "Microsoft.Hosting.Lifetime": "Information"
    }
  }
}

dotnet build

# Archive today's messages from all configured channels
dotnet run
# Archive messages from a specific date range
dotnet run "2024-01-01" "2024-01-31"
# Archive messages from a specific date to today
dotnet run "2024-01-01"

The application creates monthly JSONL files organized by channel:
slack-archive/
├── general/
│   ├── 2024-01.jsonl
│   ├── 2024-02.jsonl
│   └── ...
└── random/
    ├── 2024-01.jsonl
    └── ...
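To locate the file that holds a given message, the layout above can be reproduced in a few lines. This is a minimal sketch, not the tool's own code: it assumes the month is taken from the message timestamp in UTC, and the function name `monthly_file` is hypothetical.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def monthly_file(output_path: str, channel_name: str, ts: str) -> PurePosixPath:
    """Map a Slack `ts` string to <output_path>/<channel_name>/<YYYY-MM>.jsonl."""
    # Assumption: the month bucket is derived from the UTC time of the message.
    when = datetime.fromtimestamp(float(ts), tz=timezone.utc)
    return PurePosixPath(output_path) / channel_name / f"{when:%Y-%m}.jsonl"


if __name__ == "__main__":
    print(monthly_file("./slack-archive", "general", "1705123456.789"))
```

Messages sent close to midnight at a month boundary may land in a different file if the tool buckets by local time instead.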
Each line contains a complete message object with rich metadata:
{"schema_version":"1.0","channel_id":"C1234567890","channel_name":"general","ts":"1705123456.789","ts_iso":"2024-01-13T05:24:16Z","thread_id":"1705123456.789","is_root":true,"message_type":"message","user":"U1234567","user_display_name":"John Doe","actor_user":null,"text":"Good morning team! Ready for the sprint review?","mentions":null,"reply_count":2,"thread_ts":null,"type":"message","subtype":null}
{"schema_version":"1.0","channel_id":"C1234567890","channel_name":"general","ts":"1705123567.123","ts_iso":"2024-01-13T05:26:07Z","thread_id":"1705123456.789","is_root":false,"message_type":"message","user":"U7654321","user_display_name":"Jane Smith","actor_user":null,"text":"Yes! I have the demo ready.","mentions":null,"reply_count":null,"thread_ts":"1705123456.789","type":"message","subtype":null}

Each message includes:
- schema_version: Format version for compatibility
- channel_id/channel_name: Channel identification
- ts/ts_iso: Unix timestamp and ISO 8601 formatted time
- thread_id: Thread identifier (root message timestamp)
- is_root: Whether this is a root message or reply
- message_type: Categorized message type (message, member_joined_channel, etc.)
- user/user_display_name: User ID and resolved display name
- text: Message content
- mentions: Array of mentioned user IDs (if any)
- reply_count: Number of replies to this message
- thread_ts: Thread timestamp for replies
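Because every line is a standalone JSON object, an archive file can be processed with any JSON library. The sketch below shows one way a consumer might rebuild threads from the fields listed above; it relies only on `thread_id`, `is_root`, and `ts`, and the function name `group_by_thread` is illustrative rather than part of this project.

```python
import json
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, List


def group_by_thread(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group JSONL message lines by thread_id, root first, replies in time order."""
    threads: Dict[str, List[dict]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        message = json.loads(line)
        threads[message["thread_id"]].append(message)
    for messages in threads.values():
        # Decimal keeps the full precision of Slack's fractional timestamps.
        messages.sort(key=lambda m: (not m["is_root"], Decimal(m["ts"])))
    return dict(threads)


if __name__ == "__main__":
    sample = [
        '{"thread_id": "100.1", "is_root": false, "ts": "100.2", "text": "reply"}',
        '{"thread_id": "100.1", "is_root": true, "ts": "100.1", "text": "root"}',
    ]
    for thread_id, messages in group_by_thread(sample).items():
        print(thread_id, [m["text"] for m in messages])
```

To run it against a real archive, pass an open file handle, for example `group_by_thread(open("slack-archive/general/2024-01.jsonl", encoding="utf-8"))`.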
The application uses clean architecture with dependency injection:
- Program.cs: Application entry point and DI configuration
- Services/SlackClient.cs: Slack API client with retry logic and user caching
- Services/JsonWriter.cs: JSONL export functionality
- Services/ArchiveOrchestrator.cs: Coordinates parallel channel archiving
- Models/SlackMessage.cs: Message data model with computed properties
- Configuration/: Strongly-typed configuration classes
- Token: Slack Bot User OAuth Token (required)
- Channels: Array of channel configurations
  - Id: Slack channel ID (e.g., C1234567890)
  - Name: Human-readable name for file organization
- OutputPath: Output directory path (default: "./slack-archive")
- Configurable log levels for different components
- Console logging enabled by default
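A quick pre-flight check of appsettings.json can catch typos before a long run. The following is a rough sketch under stated assumptions, not a feature of the application: the rules (token prefix, at least one channel, channel IDs starting with "C") are inferred from the settings described above, and `validate_settings` is a hypothetical helper.

```python
import json
import re
from typing import List


def validate_settings(settings: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    slack = settings.get("Slack") or {}

    # The token is required and bot tokens start with "xoxb-".
    if not str(slack.get("Token", "")).startswith("xoxb-"):
        problems.append("Slack:Token should be a bot token starting with 'xoxb-'")

    channels = slack.get("Channels") or []
    if not channels:
        # Assumption: an empty channel list is a configuration mistake.
        problems.append("Slack:Channels should list at least one channel")
    for index, channel in enumerate(channels):
        # Assumption: public channel IDs look like "C1234567890".
        if not re.fullmatch(r"C[A-Z0-9]+", str(channel.get("Id", ""))):
            problems.append(f"Slack:Channels:{index}:Id does not look like a channel ID")
        if not channel.get("Name"):
            problems.append(f"Slack:Channels:{index}:Name is missing")
    return problems


if __name__ == "__main__":
    example = json.loads(
        '{"Slack": {"Token": "xoxb-example", '
        '"Channels": [{"Id": "C1234567890", "Name": "general"}]}}'
    )
    print(validate_settings(example) or "no problems found")
```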
- Rate Limiting: Respects Slack API rate limits by honoring Retry-After headers
- Exponential Backoff: Automatic retry with increasing delays
- User Caching: Reduces API calls by caching user information
- Graceful Degradation: Continues processing if individual messages fail
- Comprehensive Logging: Detailed logging for monitoring and troubleshooting
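The interaction between Retry-After handling and exponential backoff can be illustrated with a short, language-neutral sketch. The Python below is not the code in Services/SlackClient.cs; the `RateLimited` exception, the `call_with_retry` helper, and the default values are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when the API answers HTTP 429."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from the Retry-After header, if any


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on RateLimited with growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return operation()
        except RateLimited as error:
            attempt += 1
            if attempt >= max_attempts:
                raise  # out of attempts: surface the error to the caller
            backoff = base_delay * 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            # Prefer the server's hint when it provides one.
            delay = error.retry_after if error.retry_after is not None else backoff
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the requested delays instead of actually waiting.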
- Thread Preservation: Maintains complete thread context
- User Resolution: Resolves user IDs to display names
- Message Classification: Categorizes system messages and regular messages
- Mention Extraction: Identifies and extracts user mentions
- Timestamp Normalization: Provides both Unix and ISO timestamps
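Two of the steps above, timestamp normalization and mention extraction, are simple enough to sketch for anyone post-processing raw Slack data. This is an approximation under stated assumptions rather than the tool's implementation: it assumes `ts_iso` is the UTC time truncated to whole seconds, and that mentions appear in message text in Slack's `<@U1234567>` form. Both function names are hypothetical.

```python
import re
from datetime import datetime, timezone
from typing import List

# Slack encodes user mentions as <@USERID> or <@USERID|label> in message text.
_MENTION = re.compile(r"<@([UW][A-Z0-9]+)(?:\|[^>]*)?>")


def ts_to_iso(ts: str) -> str:
    """Convert a Slack ts such as "1705123456.789" to an ISO 8601 UTC string."""
    # Assumption: fractional seconds are dropped rather than rounded.
    seconds = int(ts.split(".")[0])
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def extract_mentions(text: str) -> List[str]:
    """Return mentioned user IDs in order of appearance (duplicates kept)."""
    return _MENTION.findall(text)


if __name__ == "__main__":
    print(ts_to_iso("1705123456.789"))
    print(extract_mentions("Thanks <@U1234567> and <@U7654321|jane>!"))
```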
- Data analysis of Slack communications
- AI/ML training data preparation
- Compliance and archival requirements
- Migration to other platforms
- Analytics and reporting
- Public channels only (private channels need additional permissions, such as the groups:history scope)
- File attachments referenced by URL only (not downloaded)
- Emoji reactions not captured
- Large channels process sequentially due to API rate limits