Detailed documentation for each indexer in the Manic Miners Level Indexer system, including configuration, features, and usage examples.
The Manic Miners Level Indexer supports three distinct sources for level collection, each with its own specialized indexer:
| Source | Type | Authentication | Format Versions | Update Frequency |
|---|---|---|---|---|
| Archive.org | Public Archive | None | below-v1, v1 | As uploaded |
| Discord | Community Forum | User Token | below-v1, v1 | Real-time |
| Hognose | GitHub Releases | None | v1, v2 | Per release |
The Internet Archive indexer searches Archive.org for Manic Miners levels and downloads the matching files.
- Advanced Search: Customizable search queries with date filtering
- Streaming Metadata: Efficient metadata fetching without loading full pages
- Concurrent Downloads: Parallel file downloads with configurable limits
- Resume Support: Persistent state for interrupted indexing sessions
- Bandwidth Management: Optional bandwidth limiting for downloads
- Checksum Verification: Optional file integrity checking
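The configurable limit on parallel downloads can be pictured as a small worker pool. The following is a minimal sketch of that idea, not the indexer's actual implementation; `mapWithLimit` and its parameters are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: run `worker` over `items` with at most `limit` calls in flight.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results; // same order as `items`
}
```

Under these assumptions, a limit of 5 would correspond to `mapWithLimit(files, 5, download)`, where `files` and `download` stand in for whatever the caller supplies.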
{
  "archive": {
    "enabled": true,
    "baseUrl": "https://archive.org/advancedsearch.php",
    "searchQueries": [
      "manic miners level",
      "manic miners map",
      "manic miners custom"
    ],
    "dateRange": {
      "from": "2020-01-01",
      "to": "2024-12-31"
    },
    "maxConcurrentMetadata": 10,
    "maxConcurrentDownloads": 5,
    "retryAttempts": 3,
    "downloadTimeout": 60000,
    "bandwidthLimit": 1048576,
    "skipExisting": true,
    "verifyChecksums": false
  }
}

- searchQueries: Array of search terms (combined with OR)
- dateRange: Filter results by upload date
- maxConcurrentMetadata: Parallel metadata fetches (default: 10)
- maxConcurrentDownloads: Parallel file downloads (default: 5)
- retryAttempts: Retry count for failed downloads (default: 3) with exponential backoff
- downloadTimeout: Download timeout in milliseconds (default: 60000, i.e. 60 seconds)
- bandwidthLimit: Optional download limit in bytes per second (1048576 is 1 MiB/s)
- skipExisting: Skip already indexed items (default: true)
- verifyChecksums: Verify file checksums after download
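To make "retry with exponential backoff" concrete, here is a hedged sketch. It assumes that `retryAttempts` counts retries after the first attempt, that delays double from a 1 second base, and that they are capped at 30 seconds; none of these details are confirmed by the indexer, and `backoffDelayMs` and `withRetry` are hypothetical names.

```typescript
// Delay before retry number `attempt` (1-based): baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `task` once, then retries up to `retryAttempts` more times on failure.
async function withRetry<T>(
  task: () => Promise<T>,
  retryAttempts = 3,
  baseMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retryAttempts; attempt++) {
    if (attempt > 0) await sleep(backoffDelayMs(attempt, baseMs));
    try {
      return await task();
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}
```

With the defaults assumed here, three retries would wait roughly 1, 2, and 4 seconds before giving up.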
import { InternetArchiveIndexer } from 'manic-miners-level-indexer';
const config = {
  baseUrl: 'https://archive.org/advancedsearch.php',
  searchQueries: ['manic miners'],
  maxConcurrentDownloads: 3
};
const indexer = new InternetArchiveIndexer(config, './output');
// Set progress callback
indexer.setProgressCallback((progress) => {
  console.log(`${progress.current}/${progress.total}: ${progress.message}`);
});
// Run indexing
const levels = await indexer.index();
console.log(`Indexed ${levels.length} levels from Archive.org`);

levels-archive/
├── catalog_index.json
└── 550e8400-e29b-41d4-a716-446655440000/
    ├── catalog.json
    ├── level.dat
    └── screenshot.jpg
- Title and description from item metadata
- Author information (uploader)
- Upload date and modification date
- File size and format
- Download count from Archive.org
- Original item identifier
- Collection information
The Discord indexer extracts levels shared in Discord forum channels and threads.
- Forum Thread Support: Indexes both channels and forum threads
- Automated Authentication: Browser automation with token caching
- Thread Discovery: Finds both active and archived threads
- Attachment Processing: Downloads .dat files from messages
- Metadata Extraction: Parses level information from messages
- Pagination Handling: Processes all messages in long threads
{
  "discord_community": {
    "enabled": true,
    "channels": [
      "1139908458968252457"
    ],
    "excludedThreads": ["thread_id"],
    "retryAttempts": 3,
    "downloadTimeout": 60000
  },
  "discord_archive": {
    "enabled": true,
    "channels": [
      "683985075704299520"
    ],
    "retryAttempts": 3,
    "downloadTimeout": 60000
  }
}

- channels: Array of channel IDs to index
- excludedThreads: Array of thread IDs to skip (optional)
- retryAttempts: Retry count for failed API calls and downloads (default: 3) with exponential backoff
- downloadTimeout: Download timeout in milliseconds (default: 60000ms)
Discord requires user authentication. See the Discord Authentication Guide for detailed setup.
Quick setup:
# Method 1: Environment variable
export DISCORD_TOKEN="your_discord_token"
# Method 2: .env file
echo "DISCORD_TOKEN=your_discord_token" > .env
# Method 3: Token file
echo "your_discord_token" > ~/.discord-token

import { DiscordUnifiedIndexer } from 'manic-miners-level-indexer';
const channels = [
  'https://discord.com/channels/580269696369164299/683985075704299520',
  'https://discord.com/channels/580269696369164299/1139908458968252457'
];
const indexer = new DiscordUnifiedIndexer(channels, './output');
// Set authentication options
indexer.setAuthOptions({
  token: process.env.DISCORD_TOKEN
});
// Run indexing
const levels = await indexer.index();
console.log(`Indexed ${levels.length} levels from Discord`);

- Regular Channels: Standard Discord channels
- Forum Channels: Channels with thread-based organization
- Archived Threads: Inactive threads (still accessible)
- Active Threads: Currently active discussions
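The usage example above passes full channel URLs, while the configuration lists bare channel IDs. The sketch below shows one way a URL of the form `https://discord.com/channels/<guild>/<channel>` could be split into its IDs. `parseChannelRef` and `DiscordChannelRef` are hypothetical names, and this is not necessarily how the indexer resolves its inputs.

```typescript
// Hypothetical result type for illustration.
interface DiscordChannelRef {
  guildId: string;
  channelId: string; // also the thread ID when the URL points at a thread
}

// Accepts https://discord.com/channels/<guild>/<channel> with an optional trailing message ID.
function parseChannelRef(input: string): DiscordChannelRef | null {
  const match = /^https:\/\/discord\.com\/channels\/(\d+)\/(\d+)(?:\/\d+)?\/?$/.exec(input.trim());
  return match ? { guildId: match[1], channelId: match[2] } : null;
}

const ref = parseChannelRef("https://discord.com/channels/580269696369164299/683985075704299520");
console.log(ref?.channelId);
```

IDs are kept as strings because Discord snowflakes exceed the range JavaScript numbers can represent exactly.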
levels-discord/
├── catalog_index.json
└── a7c2f8d1-3e5b-4912-8f3a-987654321098/
    ├── catalog.json
    ├── AncientCave.dat
    └── preview.png
- Level title (from filename or message)
- Author (Discord username)
- Post date and thread information
- Message content as description
- Thread URL as source
- File size and format detection
- Tags extracted from message
The Hognose indexer downloads procedurally generated levels from the Hognose GitHub repository.
- GitHub Releases API: Fetches all repository releases
- In-Memory Processing: Extracts ZIP files without temp directories
- Batch Processing: Handles multiple levels per release
- Version Detection: Identifies format versions from metadata
- Release Notes: Extracts changelog information
{
  "hognose": {
    "enabled": true,
    "githubRepo": "charredUtensil/hognose",
    "retryAttempts": 3,
    "downloadTimeout": 60000,
    "verifyChecksums": true
  }
}

- githubRepo: GitHub repository path (owner/repo)
- retryAttempts: Retry count for failed API calls and downloads (default: 3) with exponential backoff
- downloadTimeout: Download timeout in milliseconds (default: 60000ms)
- verifyChecksums: Calculate and log SHA-256 checksums for downloaded ZIP files (default: false)
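As an illustration of what calculating a SHA-256 checksum for a downloaded ZIP involves, the following sketch uses Node's built-in `node:crypto` module. `sha256Hex` is a hypothetical helper name, and the byte array stands in for a downloaded file; the indexer's own logging format is not shown here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex-encoded SHA-256 digest of an in-memory buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Stand-in for the bytes of a downloaded ZIP file.
const zipBytes = new TextEncoder().encode("abc");
console.log(`SHA-256: ${sha256Hex(zipBytes)}`);
```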
import { HognoseIndexer } from 'manic-miners-level-indexer';
const indexer = new HognoseIndexer('charredUtensil/hognose', './output');
// Set progress callback
indexer.setProgressCallback((progress) => {
  console.log(`Processing release: ${progress.message}`);
});
// Run indexing
const levels = await indexer.index();
console.log(`Indexed ${levels.length} levels from Hognose`);

- Fetches all releases via GitHub API
- Downloads ZIP files for each release
- Extracts levels in memory
- Processes each .dat file
- Generates metadata from release info
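The first two steps above can be sketched against GitHub's REST API. The endpoint and the field names (`tag_name`, `published_at`, `body`, `assets`, `browser_download_url`) come from GitHub's public "list releases" response; that the level ZIPs are attached as release assets is an assumption, and `releasesUrl` and `zipAssets` are hypothetical helpers rather than the indexer's own code.

```typescript
// Subset of the GitHub "list releases" response used here.
interface ReleaseAsset {
  name: string;
  browser_download_url: string;
}
interface Release {
  tag_name: string;
  published_at: string | null;
  body: string | null; // release notes
  assets: ReleaseAsset[];
}

// Builds the paginated releases URL for an "owner/repo" path.
function releasesUrl(repo: string, page = 1, perPage = 100): string {
  if (!/^[\w.-]+\/[\w.-]+$/.test(repo)) {
    throw new Error(`Expected "owner/repo", got "${repo}"`);
  }
  return `https://api.github.com/repos/${repo}/releases?per_page=${perPage}&page=${page}`;
}

// Assumption: level archives are the release assets whose names end in .zip.
function zipAssets(release: Release): ReleaseAsset[] {
  return release.assets.filter((asset) => asset.name.toLowerCase().endsWith(".zip"));
}

console.log(releasesUrl("charredUtensil/hognose"));
```

Fetching each page until an empty array comes back would cover all releases; the actual download and in-memory extraction are left out of this sketch.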
levels-hognose/
├── catalog_index.json
└── release-v0.11.2/
    ├── hognose-0001/
    │   ├── catalog.json
    │   └── level.dat
    ├── hognose-0002/
    │   ├── catalog.json
    │   └── level.dat
    └── ...
- Level title from filename
- Author (typically "Hognose")
- Release date and version
- GitHub release URL
- File size and format version
- Release notes excerpt
- Procedural generation seed (if available)
All indexers support progress callbacks:
indexer.setProgressCallback((progress: IndexerProgress) => {
  console.log(`[${progress.phase}] ${progress.source}: ${progress.current}/${progress.total}`);
  console.log(progress.message);
});

Progress phases:
- scraping: Discovering levels to index
- downloading: Downloading level files
- cataloging: Creating catalog entries
- indexing: Processing and validation
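For readers who want to type their own callback, the shape below is inferred only from the fields the callbacks above read and from the four phase names; the package's real `IndexerProgress` type may differ. `ProgressLike` and `formatProgress` are hypothetical names.

```typescript
// Assumed shape, inferred from the fields used in the examples above.
type ProgressPhase = "scraping" | "downloading" | "cataloging" | "indexing";

interface ProgressLike {
  phase: ProgressPhase;
  source: string;
  current: number;
  total: number;
  message: string;
}

// Formats one progress event as a single log line with a percentage.
function formatProgress(p: ProgressLike): string {
  const percent = p.total > 0 ? Math.round((p.current / p.total) * 100) : 0;
  return `[${p.phase}] ${p.source}: ${p.current}/${p.total} (${percent}%) ${p.message}`;
}

console.log(
  formatProgress({ phase: "downloading", source: "archive", current: 5, total: 20, message: "level.dat" }),
);
// prints "[downloading] archive: 5/20 (25%) level.dat"
```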
All indexers implement robust error handling:
try {
  await indexer.index();
} catch (error) {
  // A caught value is `unknown` under strict TypeScript, so narrow it before reading fields
  const { code, message } = (error ?? {}) as { code?: string; message?: string };
  if (code === 'RATE_LIMIT') {
    console.log('Rate limited, try again later');
  } else if (code === 'AUTH_FAILED') {
    console.log('Authentication failed');
  } else {
    console.log('Indexing error:', message);
  }
}

Archive and Discord indexers support resume capability:
// Indexing will resume from last successful item
const indexer = new InternetArchiveIndexer(config, './output');
await indexer.index(); // Resumes if interrupted

All indexers use intelligent format detection:
// Automatic detection based on source and file analysis
formatVersion: 'below-v1' | 'v1' | 'v2' | 'unknown'

Detection logic:
- Archive.org: Usually below-v1 format
- Discord: Mixed below-v1 and v1
- Hognose: v1 and v2 formats
- File size/structure analysis for confirmation
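The source-based part of this logic can be pictured as a lookup table. The sketch below takes its values from the format versions listed per source in the overview table; the `Source` names and `isExpectedFormat` are hypothetical, and the file size and structure analysis is not shown because the document does not describe it.

```typescript
type FormatVersion = "below-v1" | "v1" | "v2" | "unknown";
type Source = "archive" | "discord" | "hognose"; // hypothetical source keys

// Format versions each source is documented to contain.
const expectedFormats: Record<Source, readonly FormatVersion[]> = {
  archive: ["below-v1", "v1"],
  discord: ["below-v1", "v1"],
  hognose: ["v1", "v2"],
};

// True when a detected version is one the source is documented to produce.
function isExpectedFormat(source: Source, detected: FormatVersion): boolean {
  return expectedFormats[source].includes(detected);
}

console.log(isExpectedFormat("hognose", "v2"));
```

A check like this could flag surprising detections, such as a `v2` level from Archive.org, for manual review.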
{
  "archive": {
    "maxConcurrentMetadata": 20, // Increase for faster discovery
    "maxConcurrentDownloads": 10, // Balance with bandwidth
    "bandwidthLimit": 5242880 // 5 MiB/s limit (5 × 1024 × 1024 bytes)
  }
}

{
  "discord": {
    "channels": [/* limit active channels */],
    // Process channels sequentially to avoid rate limits
  }
}

{
  "hognose": {
    // Generally fast, no tuning needed
  }
}

- Storage: Use SSD for better I/O performance
- Network: Ensure stable connection for large downloads
- Memory: 4GB+ RAM recommended for large indexing runs
- Scheduling: Run during off-peak hours for better speeds
Problem: Slow metadata fetching
# Increase concurrent fetches
"maxConcurrentMetadata": 20

Problem: Downloads failing
# Increase timeout and retries
"downloadTimeout": 120000,
"retryAttempts": 5

Problem: Authentication failures
# Check token validity
npm run test:discord:auth

Problem: Missing threads
# Discord may have permission restrictions
# Ensure your account can view archived threads

Problem: GitHub API rate limit
# Authenticated requests have higher limits
export GITHUB_TOKEN="your_github_token"

// Archive.org advanced search
const config = {
  searchQueries: [
    'manic miners level AND creator:"Baraklava"',
    'manic miners map AND year:[2023 TO 2024]'
  ]
};

// Index specific Discord threads
const channels = [
  'https://discord.com/channels/.../thread-id-1',
  'https://discord.com/channels/.../thread-id-2'
];

// Post-process indexed levels
indexer.on('levelIndexed', (level: Level) => {
  // Custom processing
  console.log(`Indexed: ${level.metadata.title}`);
});