@prateeksinghalgit commented Oct 9, 2025

Implements Azure Blob Storage backup repository as discussed in SOLR-17949.

Description

This PR adds a new backup repository implementation for Azure Blob Storage, enabling Solr collections to be backed up to and restored from Microsoft Azure.

Key Features:

  • Full backup/restore functionality to Azure Blob Storage
  • Support for 4 authentication methods (Connection String, Account Key, SAS Token, Azure Identity)
  • Incremental backup support with versioning
  • Data integrity verification (checksum validation)
  • Compatible with Azurite emulator for local testing
  • Comprehensive documentation and 76 passing unit tests
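For readers unfamiliar with checksum validation, the idea can be sketched as follows. This is a minimal illustration, not the code in this PR: the class name `ChecksumSketch`, its method names, and the choice of CRC32 are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper: compares a checksum taken before upload with one
// computed over the bytes read back from storage.
class ChecksumSketch {

    // Streams the input through CRC32 so large files need not fit in memory.
    static long crc32Of(InputStream in) {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return crc.getValue();
    }

    // True when the restored bytes produce the checksum recorded at backup time.
    static boolean verify(long expectedChecksum, InputStream restored) {
        return crc32Of(restored) == expectedChecksum;
    }

    public static void main(String[] args) {
        byte[] original = "segment data".getBytes(StandardCharsets.UTF_8);
        long recorded = crc32Of(new ByteArrayInputStream(original));

        byte[] corrupted = original.clone();
        corrupted[0] ^= 0x01; // flip one bit to simulate corruption

        System.out.println(verify(recorded, new ByteArrayInputStream(original)));
        System.out.println(verify(recorded, new ByteArrayInputStream(corrupted)));
    }
}
```

Streaming the checksum keeps memory use constant, which matters for the large files mentioned under test coverage.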

Solution

The implementation follows Solr's BackupRepository interface pattern, similar to existing S3 and GCS repository modules:

  • BlobBackupRepository: Main class implementing Solr's BackupRepository interface
  • BlobStorageClient: Wrapper for Azure SDK, providing file operations
  • BlobIndexInput: Custom Lucene IndexInput for reading from Azure blobs
  • BlobOutputStream: Custom output stream for writing to Azure blobs
  • Authentication: Supports 4 methods via flexible configuration in solr.xml
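As a rough illustration of how one of four authentication methods might be chosen from repository configuration, consider the sketch below. The property names (`connectionString`, `accountName`, `accountKey`, `sasToken`), the precedence order, and the class `AuthSelectionSketch` are assumptions made for this example; the actual keys and resolution logic are defined by the module and its README.

```java
import java.util.Map;

// Hypothetical sketch: picks an authentication method from config properties.
// Property names and precedence are illustrative assumptions, not the PR's API.
class AuthSelectionSketch {

    enum AuthMethod { CONNECTION_STRING, ACCOUNT_KEY, SAS_TOKEN, AZURE_IDENTITY }

    static AuthMethod resolve(Map<String, String> config) {
        if (has(config, "connectionString")) {
            return AuthMethod.CONNECTION_STRING;
        }
        if (has(config, "accountName") && has(config, "accountKey")) {
            return AuthMethod.ACCOUNT_KEY;
        }
        if (has(config, "sasToken")) {
            return AuthMethod.SAS_TOKEN;
        }
        // No explicit secret configured: fall back to identity-based auth
        // (Managed Identity, Service Principal, or Azure CLI credentials).
        return AuthMethod.AZURE_IDENTITY;
    }

    // Treats missing and blank values the same way.
    private static boolean has(Map<String, String> config, String key) {
        String value = config.get(key);
        return value != null && !value.isBlank();
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of("sasToken", "sv=example")));
        System.out.println(resolve(Map.of("accountName", "myaccount")));
    }
}
```

Falling back to identity-based authentication when no secret is present is one possible design; a stricter implementation might instead reject an ambiguous or empty configuration.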

All streaming operations are compatible with Solr's ResumableInputStream for fault-tolerant transfers.
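The general idea behind a resumable read can be sketched as below. This is not Solr's ResumableInputStream and not code from this PR: `ResumableReadSketch`, `readAll`, and `flakySource` are hypothetical names, and the `openAt` function stands in for a ranged blob download that starts at a given byte offset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.LongFunction;

// Hypothetical sketch of a fault-tolerant read: on failure, reopen the source
// at the number of bytes already received instead of starting over.
class ResumableReadSketch {

    static byte[] readAll(LongFunction<InputStream> openAt, int maxRetries) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int failures = 0;
        InputStream in = openAt.apply(0L);
        try {
            while (true) {
                try {
                    int n = in.read(buf);
                    if (n == -1) {
                        return out.toByteArray();
                    }
                    out.write(buf, 0, n);
                } catch (IOException e) {
                    if (++failures > maxRetries) {
                        throw new UncheckedIOException(e);
                    }
                    closeQuietly(in);
                    in = openAt.apply((long) out.size()); // resume at current offset
                }
            }
        } finally {
            closeQuietly(in);
        }
    }

    private static void closeQuietly(InputStream in) {
        try {
            in.close();
        } catch (IOException ignored) {
            // nothing useful to do while cleaning up
        }
    }

    // Test double: serves `data` but fails once when the read position
    // reaches `failAfterBytes`, simulating a dropped connection.
    static LongFunction<InputStream> flakySource(byte[] data, int failAfterBytes) {
        boolean[] failed = {false};
        return offset -> new InputStream() {
            private int pos = (int) offset;

            @Override
            public int read() throws IOException {
                byte[] one = new byte[1];
                return read(one, 0, 1) == -1 ? -1 : (one[0] & 0xff);
            }

            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                if (!failed[0] && pos >= failAfterBytes) {
                    failed[0] = true;
                    throw new IOException("simulated connection drop");
                }
                if (pos >= data.length) {
                    return -1;
                }
                int limit = failed[0] ? data.length : Math.min(data.length, failAfterBytes);
                int n = Math.min(len, limit - pos);
                System.arraycopy(data, pos, b, off, n);
                pos += n;
                return n;
            }
        };
    }

    public static void main(String[] args) {
        byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
        byte[] got = readAll(flakySource(data, 4), 3);
        System.out.println(new String(got, StandardCharsets.US_ASCII));
    }
}
```

Resuming from the current offset avoids re-downloading bytes that already arrived, which is what makes retries affordable for large index files.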

Implementation stats:

  • 8 implementation files (1,606 LOC)
  • 8 test files (2,180 LOC)
  • All dependencies Apache 2.0 licensed

Tests

Unit Tests: 76/76 passing (100%)

./gradlew :solr:modules:blob-repository:test
# Result: BUILD SUCCESSFUL - 76 test(s)

Test Coverage:

  • Basic read/write operations
  • Large file handling (1GB+)
  • Binary data integrity
  • Concurrent operations
  • Stream lifecycle (close/resume behavior)
  • Incremental backups
  • All 4 authentication methods
  • Integration with Azurite (local emulator)
  • Integration with real Azure Blob Storage

Testing Instructions:
The module can be tested locally with the Azurite emulator (no Azure account needed) or against real Azure Blob Storage. See solr/modules/blob-repository/README.md for detailed setup instructions.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew :solr:modules:blob-repository:check (module-specific check passed).
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

This commit adds support for backing up and restoring Solr collections
to Azure Blob Storage with multiple authentication options.

Features:
- Full backup/restore functionality to Azure Blob Storage
- Support for 4 authentication methods:
  * Connection String (for development)
  * Account Name + Key (for simple production)
  * SAS Token (recommended for production)
  * Azure Identity (Managed Identity, Service Principal, Azure CLI)
- Incremental backup support with versioning
- Data integrity verification (checksum validation)
- Compatible with Azurite emulator for local testing
- Comprehensive documentation and 76 passing unit tests

Implementation:
- 8 implementation files (1,606 LOC)
- 8 test files (2,180 LOC)
- All dependencies Apache 2.0 licensed
- Follows Solr's backup repository patterns
@github-actions bot added the documentation, dependencies, tool:build, and tests labels on Oct 9, 2025