Auto Metadata Fetch System

Currently available only in the :dev version.

The Auto-Metadata Fetch System automatically enriches newly ingested books with comprehensive metadata from multiple online sources, ensuring your library has complete and accurate book information.

🌟 Overview

When books are added to your Calibre-Web Automated library, the Auto-Metadata Fetch System automatically searches multiple metadata providers to find and apply detailed information including titles, authors, descriptions, covers, publication dates, and more. This eliminates the need for manual metadata entry and ensures consistent, high-quality book information.

⚙️ How It Works

📚 Book Detection: System identifies newly ingested books with incomplete or poor metadata
🔍 Provider Search: Searches configured metadata providers in priority order
📝 Metadata Application: Applies found metadata based on administrator-configured rules
⭐ Quality Enhancement: Improves book discoverability and organization

🔧 Administrator Configuration

🔄 Enabling Auto-Metadata Fetch

⚠️ Important: Metadata fetching is controlled entirely by server administrators. Users cannot enable or disable this feature individually.

Administrators can enable the system in CWA Settings:

1. Navigate to CWA Settings
2. Check "Enable Auto-Metadata Fetch"
3. Configure Smart Metadata Application (optional)
4. Set up Metadata Provider Hierarchy
5. Save settings

🧠 Smart Metadata Application

The system offers two metadata application modes with granular field control:

🎯 Direct Replacement (Default)

⚡ Behavior: Takes metadata from the preferred provider exactly as provided
💭 Philosophy: "Just take the metadata that comes from the preferred metadata provider as is"
🎯 Use Case: When you trust your primary provider and want consistent results
📋 Result: Complete replacement of existing metadata with provider data for selected fields

🤖 Smart Application (Optional)

🧠 Behavior: Applies intelligent criteria when replacing existing metadata
💭 Philosophy: Only improve metadata when the new data is demonstrably better
🎯 Use Case: When you want to preserve good existing metadata while enhancing poor data
📏 Criteria: Applied only to selected fields:
- 📖 Titles: Only replace if new title is longer/more descriptive
- 📝 Descriptions: Only replace if new description is longer/more detailed
- 🏢 Publishers: Only replace if current publisher field is empty
- 🖼️ Covers: Only replace if new cover has higher resolution
- ✍️ Authors: Always update (typically improves consistency)
- 🏷️ Tags/Series: Always add (enhances discoverability)
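The criteria above can be read as a per-field merge rule. The following is a minimal sketch of that reading, not the CWA implementation: the `BookMetadata` class, its field names, and the use of pixel count as a stand-in for cover resolution are all assumptions made for illustration. Series handling is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class BookMetadata:
    """Hypothetical container; field names are illustrative, not CWA's schema."""
    title: str = ""
    description: str = ""
    publisher: str = ""
    authors: tuple[str, ...] = ()
    tags: frozenset[str] = frozenset()
    cover_pixels: int = 0  # width * height of the cover image


def smart_merge(current: BookMetadata, fetched: BookMetadata) -> BookMetadata:
    """Combine existing and fetched metadata using the Smart criteria."""
    return BookMetadata(
        # Title and description: replace only if the new text is longer.
        title=fetched.title if len(fetched.title) > len(current.title) else current.title,
        description=(fetched.description
                     if len(fetched.description) > len(current.description)
                     else current.description),
        # Publisher: fill in only when the current field is empty.
        publisher=current.publisher or fetched.publisher,
        # Authors: always update (assumption: keep the old value if the
        # provider returned no authors at all).
        authors=fetched.authors or current.authors,
        # Tags: always add, never remove.
        tags=current.tags | fetched.tags,
        # Cover: keep whichever image has more pixels.
        cover_pixels=max(current.cover_pixels, fetched.cover_pixels),
    )
```

Text length is only a rough proxy for "more descriptive"; a real implementation may well use different heuristics.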

✅ Selective Field Updates

New Feature: Administrators can now choose exactly which metadata fields should be updated during automatic fetching.

📝 Available Field Controls

Each metadata field can be individually enabled or disabled:

📖 Title - Book title and subtitle
✍️ Authors - Author names and contributors
📝 Description - Plot summary and book description
🏢 Publisher - Publishing house and imprint
🏷️ Tags/Genres - Subject tags and genre classifications
📚 Series - Series name and position/index
⭐ Rating - Star ratings and reviews
📅 Publication Date - Original publication date
🔢 Identifiers - ISBN, ASIN, and other book IDs
🖼️ Cover Image - Book cover artwork

🎯 Default Behavior

All fields are enabled by default - Maintains existing functionality
Unchecked fields are never modified - Preserves your existing data
Works with both Smart and Direct modes - Applies to your chosen application method
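One way to picture the field controls is as a filter applied to whatever a provider returns, before either application mode runs. The sketch below assumes fetched metadata arrives as a plain dictionary; the key names and the `filter_updates` helper are hypothetical and chosen only for this example.

```python
# Illustrative field keys, one per checkbox described above (assumed names).
ALL_FIELDS = frozenset({
    "title", "authors", "description", "publisher", "tags",
    "series", "rating", "pubdate", "identifiers", "cover",
})


def filter_updates(fetched: dict, enabled_fields=ALL_FIELDS) -> dict:
    """Keep only the fetched values whose field is enabled."""
    return {field: value for field, value in fetched.items()
            if field in enabled_fields}


fetched = {"title": "X", "authors": ["A"], "identifiers": {"isbn": "123"}}

# All fields enabled (the default): nothing is dropped.
everything = filter_updates(fetched)

# Only bibliographic fields enabled: the title is left untouched.
bibliographic = filter_updates(
    fetched, {"authors", "publisher", "identifiers", "pubdate"})
```

Because disabled fields never reach the application step, the existing values for those fields cannot be overwritten in either mode.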

💡 Use Cases

🏛️ Academic Libraries:

✅ Authors, Publishers, Identifiers, Publication Date
❌ Title, Description, Tags, Rating, Cover

Preserve manual cataloging while updating bibliographic data

📚 Fiction Collections:

✅ Description, Tags, Series, Cover, Rating
❌ Title, Authors, Publisher, Publication Date

Enhance discoverability while keeping core attribution intact

🦸 Comic Collections:

✅ Series, Cover, Tags, Rating
❌ Title, Authors, Description, Publisher

Update series information and artwork while preserving original titles

🔒 Curated Libraries:

✅ Identifiers, Publication Date
❌ All other fields

Only add missing bibliographic identifiers

📋 Metadata Provider Hierarchy

Configure the order in which providers are searched:

🖱️ Drag and Drop: Reorder providers by dragging them in the CWA Settings interface
🏆 Priority System: System tries providers from top to bottom until successful
🥇 First Success Wins: Once a provider returns usable data, the search stops

🌐 Available Providers

📚 Google Books

💪 Strengths: Comprehensive database, good for popular books, excellent cover images
🎯 Best For: Fiction, popular non-fiction, recently published books
🌍 Coverage: International, multiple languages

📖 Internet Archive (archive.org)

💪 Strengths: Extensive catalog, good for older/rare books, academic works
🎯 Best For: Classic literature, academic texts, out-of-print books
📚 Coverage: Historical and academic works

🇩🇪 Deutsche Nationalbibliothek (DNB)

💪 Strengths: Authoritative German library catalog, excellent for German-language books
🎯 Best For: German books, academic works, official publication data
🌍 Coverage: German-language publications

🦸 ComicVine

💪 Strengths: Specialized comic book database, detailed series information
🎯 Best For: Comic books, graphic novels, manga
📖 Coverage: Comics and graphic literature

🇨🇳 Douban

💪 Strengths: Chinese book database, good for Asian literature
🎯 Best For: Chinese books, Asian literature, translated works
🌏 Coverage: Chinese and East Asian publications

🏆 Recommended Provider Orders

🏠 General Purpose Libraries:

📚 Google Books (broad coverage)
📖 Internet Archive (older/rare books)
🇩🇪 DNB (German books)
🦸 ComicVine (comics)
🇨🇳 Douban (Asian literature)

🎓 Academic Libraries:

🇩🇪 DNB (authoritative data)
📖 Internet Archive (academic works)
📚 Google Books (recent publications)

🦸 Comic/Graphic Novel Collections:

🦸 ComicVine (specialized)
📚 Google Books (mainstream comics)
📖 Internet Archive (older comics)

👤 User Experience

👀 What Users See

Users benefit from auto-metadata fetch without any configuration:

✅ 📖 Complete Book Information: Books automatically have proper titles, authors, descriptions
✅ 📂 Better Organization: Consistent metadata improves browsing and searching
✅ 🔍 Enhanced Discovery: Tags and series information help find related books
✅ 🎨 Professional Appearance: High-quality covers and complete details

🚀 When Metadata is Fetched

The system fetches metadata for:

✅ Newly uploaded books with minimal metadata
✅ Books imported from external sources
✅ Books with obviously incorrect or incomplete information
❌ Books that already have complete, high-quality metadata (in Smart mode)

🎯 System Behavior

🔍 Search Process

🔧 Build Search Query: Combines existing title and author information
🌐 Provider Search: Queries providers in configured order
⭐ Result Evaluation: Analyzes returned metadata for quality and relevance
🥇 First Success: Stops searching once a provider returns usable data
📝 Metadata Application: Applies metadata according to configured mode

📊 Quality Criteria

The system evaluates metadata quality based on:

✅ Completeness: Number of filled fields
🎯 Accuracy: Relevance to search terms
📏 Detail Level: Depth of description and information
🖼️ Image Quality: Cover image resolution and clarity
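To make two of these criteria concrete, here is a rough sketch that scores completeness and accuracy for a provider result. It is an illustration under assumptions, not the scoring CWA uses: the dictionary keys, the helper names, and the equal weighting are invented for this example, and string similarity from Python's standard `difflib` module stands in for "relevance to search terms". Detail level and image quality are omitted.

```python
from difflib import SequenceMatcher


def completeness(record: dict) -> float:
    """Fraction of fields that hold a non-empty value."""
    if not record:
        return 0.0
    return sum(1 for value in record.values() if value) / len(record)


def relevance(query: str, record: dict) -> float:
    """Similarity (0 to 1) between the query and the returned title + authors."""
    candidate = " ".join([record.get("title", ""), *record.get("authors", [])])
    return SequenceMatcher(None, query.lower(), candidate.lower()).ratio()


def quality_score(query: str, record: dict) -> float:
    """Unweighted mean of both signals; the weighting is an arbitrary assumption."""
    return (completeness(record) + relevance(query, record)) / 2
```

A result with more filled fields and a closer title and author match scores higher, which is the intuition behind the criteria listed above.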

🔗 Processing Integration

Metadata fetching integrates with other CWA systems:

📧 Before Auto-Send: Metadata is fetched before books are sent to eReaders
🔄 After Auto-Convert: Metadata is applied after format conversion
📥 During Ingest: Runs as part of the book ingestion pipeline

⚙️ Technical Details

🔍 Search Algorithm

For each newly ingested book:
  If metadata_fetch_enabled:
    For each provider in hierarchy:
      🔍 Search provider with book title + author
      If results found:
        📝 Apply metadata based on application mode
        📊 Log success and stop searching
      Else:
        ⏭️ Try next provider
    If no providers returned data:
      ⚠️ Log failure, book keeps original metadata
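
The pseudocode above translates fairly directly into Python. The version below is a sketch under assumptions: providers are modelled as hypothetical `(name, search_function)` pairs, the `fetch_metadata` name is invented for this example, and a provider that raises an error is treated the same as one that finds nothing. Applying the result according to the configured mode is left to the caller.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("metadata_fetch")

# A provider is a (name, search function) pair; the function returns a
# metadata dict, or None when nothing usable was found.
Provider = tuple[str, Callable[[str], Optional[dict]]]


def fetch_metadata(title: str, author: str, providers: list[Provider],
                   enabled: bool = True) -> Optional[dict]:
    """Query providers in hierarchy order and return the first usable result."""
    if not enabled:
        return None
    query = f"{title} {author}".strip()
    for name, search in providers:
        try:
            result = search(query)
        except Exception as exc:  # assumption: an error counts as "no result"
            log.warning("Provider %s failed: %s", name, exc)
            continue
        if result:
            log.info("Metadata found via %s", name)
            return result  # first success wins; later providers are skipped
    log.warning("No provider returned data for %r; keeping original metadata", query)
    return None
```

Returning `None` signals that the book should keep its original metadata, matching the final branch of the pseudocode.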

📋 Metadata Fields

The system can fetch and apply:

📚 Core Information:

📖 Title and subtitle
✍️ Author(s) and contributors
📅 Publication date
🏢 Publisher and imprint
🔢 ISBN and other identifiers

📝 Descriptive Data:

📖 Plot summary/description
🏷️ Tags and genres
📚 Series information and position
🌍 Language and edition details

🎨 Visual Elements:

🖼️ Cover images (high resolution preferred)
🏢 Publisher logos and branding

📊 Cataloging Data:

📚 Library classifications
🏷️ Subject headings
📖 Academic citations

💾 Database Integration

💾 Metadata Storage: Integrated with Calibre's metadata database
🔍 Search Indexing: New metadata immediately improves search capabilities
📊 Version Tracking: Changes are logged for audit purposes

🔧 Troubleshooting

📚 Metadata Not Being Fetched

🔧 Check Administrator Settings:

Verify "Enable Auto-Metadata Fetch" is checked in CWA Settings
Confirm at least one provider is configured in the hierarchy
Check field selections - Ensure desired fields are enabled
Check system logs for provider connectivity issues

📝 Field-Specific Issues:

No updates happening: Check whether all relevant fields have been disabled
Partial updates only: Verify which fields are enabled in settings
Core fields not updating: Ensure Title, Authors, Description are enabled
Missing cover images: Check if Cover Image field is enabled

🌐 Provider-Specific Issues:

Network Connectivity: Ensure server can reach external metadata sources
⏱️ API Limits: Some providers have rate limits or access restrictions
🔍 Search Terms: Very obscure books may not be found by any provider

⭐ Poor Quality Results

🔍 Search Term Issues:

Books with very short or generic titles may return incorrect matches
Non-English books may not be found by English-language providers
Academic or technical books may need specialized providers

📋 Provider Selection:

Reorder providers to prioritize sources better suited to your collection
Consider disabling providers that consistently return poor results
Add provider-specific delays if rate limiting occurs

🧠 Smart Mode Behavior

🤔 Understanding Smart Criteria:

Smart mode is conservative - it only replaces data when confident the new data is better
Some fields (like authors) are always updated for consistency
Covers are replaced only if the new cover has a higher resolution

❓ When Smart Mode Doesn't Update:

Existing metadata may already be good quality
New metadata may not meet the improvement criteria
This is normal behavior - the system is preserving your existing data

💡 Best Practices

🔧 For Administrators

📋 Provider Configuration:

Order providers based on your collection's characteristics
Test with representative books before enabling system-wide
Monitor logs to identify provider success rates

🎯 Application Mode Selection:

Use 🎯 Direct Replacement for new libraries or when starting fresh
Use 🧠 Smart Application for established libraries with existing good metadata
Consider your collection's metadata quality when choosing

✅ Field Selection Strategy:

Start with all fields enabled - Test with a small subset of books first
Disable fields you've manually curated - Preserve your hard work
Enable fields that are often missing - Like descriptions, tags, series information
Consider your workflow - Match field selection to your cataloging practices

⚡ Performance Optimization:

Limit the number of active providers to reduce processing time
Configure appropriate delays between provider requests
Monitor system resources during peak ingestion periods
Enable fewer fields to reduce processing time

📚 Collection-Specific Recommendations

📖 Fiction Libraries:

Providers: Prioritize 📚 Google Books and 📖 Internet Archive
Mode: Enable 🧠 Smart Application to preserve manual curation
Fields: Focus on ✅ Description, Tags, Series, Cover, Rating
Skip: ❌ Title, Authors if you have good existing data

🎓 Academic Collections:

Providers: Lead with 🇩🇪 DNB and 📖 Internet Archive
Mode: Use 🎯 Direct Replacement for consistency
Fields: Prioritize ✅ Authors, Publisher, Identifiers, Publication Date
Skip: ❌ Descriptions, Tags that may not match academic standards

🦸 Comic Collections:

Providers: Put 🦸 ComicVine first
Mode: Use 🧠 Smart Application to preserve series organization
Fields: Focus on ✅ Series, Cover, Tags, Rating
Skip: ❌ Titles, Authors for complex numbering systems

🌍 Multilingual Libraries:

Providers: Include language-appropriate providers (🇩🇪 DNB for German, 🇨🇳 Douban for Chinese)
Mode: Test both modes with each language
Fields: Enable ✅ All fields but test quality per language
Strategy: Consider separate configurations per language section

🔒 Privacy and Legal Considerations

🌐 Data Sources

All metadata providers are public services
No personal information is transmitted to providers
Only book title and author information is used for searches

📚 Intellectual Property

Metadata fetching uses publicly available bibliographic data
Cover images are linked, not stored, respecting copyright
All provider terms of service are respected

⏱️ Rate Limiting

System respects provider API limits and usage policies
Automatic delays prevent overwhelming external services
Failed requests are logged but not indefinitely retried
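
The combination of automatic delays and a bounded number of retries can be sketched as follows. This is an illustration only: exponential backoff is one common way to space out requests and is an assumption here, not a documented CWA behavior, and the `fetch_with_backoff` name and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional


def fetch_with_backoff(request: Callable[[], Optional[dict]],
                       max_attempts: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Call `request` up to `max_attempts` times, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception:  # broad on purpose for this sketch
            if attempt == max_attempts - 1:
                return None  # give up; the caller is expected to log the failure
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return None
```

Passing `sleep` as a parameter keeps the function testable without real waiting, and the fixed `max_attempts` guarantees that a failing provider is never retried indefinitely.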

🚀 Advanced Configuration

🔧 Custom Provider Integration

System architecture supports adding new metadata providers
Contact your administrator for custom provider development
API documentation available for advanced integrations

📊 Bulk Metadata Updates

System focuses on newly ingested books
Existing books can be updated through Calibre-Web's standard metadata editing
Consider batch operations for large collection updates

🏷️ The Auto-Metadata Fetch System ensures your library maintains high-quality, complete metadata automatically.

🛠️ For technical support or advanced configuration options, please contact your system administrator.
