Copilot AI commented Jul 1, 2025

This PR implements the critical multi-backend retrieval architecture required for the NLWeb June 2025 release, enabling support for multiple simultaneous backends while maintaining full backward compatibility.

🎯 Key Features

✅ Multiple Backend Support

  • Configure and query multiple backends simultaneously
  • Automatic parallel execution across all enabled backends
  • Configurable concurrency limits and timeouts per backend

✅ Intelligent Result Management

  • Automatic deduplication based on URL with score-based conflict resolution
  • Result aggregation and sorting by relevance across all backends
  • Write endpoint designation for operations requiring a single target

✅ Seamless Backward Compatibility

  • Existing single-backend configurations continue to work unchanged
  • No breaking changes to existing APIs or service registrations
  • Gradual migration path available

🔧 Implementation Details

New Configuration System

services.AddNLWebNetMultiBackend(
    options => options.MaxResultsPerQuery = 50,
    multiBackendOptions =>
    {
        multiBackendOptions.Enabled = true;
        multiBackendOptions.EnableParallelQuerying = true;
        multiBackendOptions.EnableResultDeduplication = true;
        multiBackendOptions.MaxConcurrentQueries = 3;
        multiBackendOptions.WriteEndpoint = "primary_backend";
    });

JSON Configuration Support

{
  "NLWebNet": {
    "MultiBackend": {
      "Enabled": true,
      "EnableParallelQuerying": true,
      "EnableResultDeduplication": true,
      "WriteEndpoint": "primary_backend",
      "Endpoints": {
        "primary_backend": {
          "Enabled": true,
          "BackendType": "azure_ai_search",
          "Priority": 10
        }
      }
    }
  }
}
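
To make the shape of this configuration concrete, here is a minimal sketch that parses the JSON above into typed objects and checks that the write endpoint names an enabled endpoint. The classes `MultiBackendSketch`, `EndpointSketch`, and `ConfigSketch` are hypothetical mirrors of the JSON, not the PR's actual `MultiBackendOptions` model, and the validation rule is an assumption rather than documented behavior.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical mirror of one entry under "Endpoints".
public class EndpointSketch
{
    public bool Enabled { get; set; }
    public string BackendType { get; set; } = "";
    public int Priority { get; set; }
}

// Hypothetical mirror of the "MultiBackend" section; the real
// MultiBackendOptions class may have different or additional members.
public class MultiBackendSketch
{
    public bool Enabled { get; set; }
    public bool EnableParallelQuerying { get; set; }
    public bool EnableResultDeduplication { get; set; }
    public string WriteEndpoint { get; set; } = "";
    public Dictionary<string, EndpointSketch> Endpoints { get; set; } = new();
}

public static class ConfigSketch
{
    // Extracts NLWebNet -> MultiBackend from the document and deserializes it.
    public static MultiBackendSketch Parse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var section = doc.RootElement
            .GetProperty("NLWebNet")
            .GetProperty("MultiBackend");
        return JsonSerializer.Deserialize<MultiBackendSketch>(section.GetRawText())
               ?? new MultiBackendSketch();
    }

    // Assumed rule: the write endpoint must refer to a configured, enabled endpoint.
    public static bool HasValidWriteEndpoint(MultiBackendSketch options) =>
        options.Endpoints.TryGetValue(options.WriteEndpoint, out var endpoint)
        && endpoint.Enabled;
}
```

In an ASP.NET Core application this section would more likely be bound through the configuration system; the direct `System.Text.Json` parse here only illustrates how the keys relate to each other.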

🏗️ Architecture Components

  • MultiBackendOptions - Configuration model for multi-backend settings
  • IBackendManager - Interface for coordinating multiple backends
  • BackendManager - Implementation handling parallel queries and deduplication
  • Enhanced NLWebService - Updated to support both single and multi-backend modes
  • AddNLWebNetMultiBackend() - New service registration extension method

🧪 Comprehensive Testing

  • 54/54 tests passing (37 original + 17 new)
  • Unit tests for all new components
  • Integration tests for end-to-end scenarios
  • Backward compatibility validation
  • Performance and concurrency testing

📊 Test Coverage Summary

| Component | Tests | Coverage |
|---|---|---|
| BackendManager | 9 tests | Parallel queries, deduplication, fallback |
| Service Registration | 4 tests | Multi-backend setup, configuration |
| Integration | 4 tests | End-to-end scenarios, streaming |
| Existing Functionality | 37 tests | All original tests still passing |

🔄 Migration Guide

For Existing Deployments

No changes required - existing configurations continue to work:

// This still works exactly as before
services.AddNLWebNet<MockDataBackend>();

For New Multi-Backend Deployments

// New multi-backend setup
services.AddNLWebNetMultiBackend(
    configureOptions: opts => opts.DefaultMode = QueryMode.List,
    configureMultiBackend: opts => opts.Enabled = true
);

📈 Performance Characteristics

  • Parallel execution reduces total query time when multiple backends are slow, because backends are queried concurrently rather than one after another
  • Configurable concurrency prevents resource exhaustion
  • Timeout handling ensures responsive behavior even with problematic backends
  • Graceful degradation when individual backends fail
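
The characteristics above (bounded concurrency, per-backend timeouts, graceful degradation) can be sketched roughly as follows. This is not the PR's `BackendManager`. The `BackendSearch` delegate, the `ParallelQuerySketch` class, and the use of plain strings as results are assumptions made for illustration. The timeout is cooperative, so it only takes effect if a backend honors the cancellation token it is given.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical backend signature: takes a query and a cancellation token.
public delegate Task<IReadOnlyList<string>> BackendSearch(string query, CancellationToken ct);

public static class ParallelQuerySketch
{
    public static async Task<List<string>> QueryAllAsync(
        IReadOnlyDictionary<string, BackendSearch> backends,
        string query,
        int maxConcurrentQueries,
        TimeSpan perBackendTimeout)
    {
        // The semaphore caps how many backend calls run at the same time.
        using var gate = new SemaphoreSlim(maxConcurrentQueries);

        async Task<IReadOnlyList<string>> QueryOneAsync(BackendSearch search)
        {
            await gate.WaitAsync();
            try
            {
                // Each backend gets its own timeout via a cancellation token.
                using var cts = new CancellationTokenSource(perBackendTimeout);
                return await search(query, cts.Token);
            }
            catch (Exception)
            {
                // Graceful degradation: a failed or timed-out backend
                // contributes no results instead of failing the whole query.
                return Array.Empty<string>();
            }
            finally
            {
                gate.Release();
            }
        }

        var perBackend = await Task.WhenAll(
            backends.Values.Select(search => QueryOneAsync(search)));
        return perBackend.SelectMany(results => results).ToList();
    }
}
```

A production version would probably log the swallowed exception and report which backend failed; the sketch drops that to keep the control flow visible.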

🔒 Backward Compatibility Guarantees

  • All existing service registrations work unchanged
  • No modifications required to existing configuration files
  • Existing API contracts remain identical
  • Default behavior preserved when multi-backend is disabled

Fixes #34.



Copilot AI changed the title from "[WIP] 🏗️ Implement Multi-Backend Retrieval Architecture" to "🏗️ Implement Multi-Backend Retrieval Architecture" Jul 1, 2025
Copilot AI requested a review from jongalloway July 1, 2025 19:47
Copilot finished work on behalf of jongalloway July 1, 2025 19:47
@jongalloway jongalloway requested a review from Copilot July 1, 2025 22:37

This comment was marked as outdated.

@jongalloway
Collaborator

@copilot There are a few pull request comments, please review and fix as necessary. This pull request is #44

Here is the main comment to address: Line 128 of src/NLWebNet/Extensions/ServiceCollectionExtensions.cs has this comment:

This check reads NLWebOptions.MultiBackend.Enabled but the DI registration configures a separate IOptions<MultiBackendOptions>. The multi-backend flag never propagates into NLWebOptions. Update registration to bind MultiBackendOptions into NLWebOptions.MultiBackend or switch to checking IOptions<MultiBackendOptions> directly.

Also look at these:
src/NLWebNet/Services/ResultGenerator.cs

/// <summary>
/// Constructor for single-backend mode (backward compatibility).
/// </summary>
public ResultGenerator(

The single-backend constructor does not assign _dataBackend or _logger, so both fields will still be null when they are used later. Add the assignments this._dataBackend = dataBackend; and this._logger = logger;.

and:
src/NLWebNet/Services/NLWebService.cs
private readonly ILogger _logger;
private readonly NLWebOptions _options;

/// <summary>
/// Constructor for single-backend mode (backward compatibility).
/// </summary>
public NLWebService(

The single-backend constructor omits assignments for _dataBackend, _queryProcessor, _resultGenerator, and _logger, leading to uninitialized fields. Ensure each constructor parameter is assigned to its corresponding private field.

Contributor Author

Copilot AI commented Jul 1, 2025

@copilot There are a few pull request comments, please review and fix as necessary. This pull request is #44

Here is the main comment to address: Line 12...

Fixed the main configuration binding issue in commit 897fa55. The ServiceCollectionExtensions now checks IOptions<MultiBackendOptions> directly instead of NLWebOptions.MultiBackend.Enabled to ensure the multi-backend flag is properly accessed.

Regarding the constructor assignments in ResultGenerator and NLWebService - I've reviewed both files and all constructors appear to correctly assign their parameters to private fields (lines 29-32 in ResultGenerator, lines 30-34 in NLWebService). All tests are passing. Could you clarify if there's a specific scenario or constructor signature that's causing issues?

Copilot finished work on behalf of jongalloway July 1, 2025 23:03
@jongalloway jongalloway requested a review from Copilot July 1, 2025 23:07

This comment was marked as outdated.

@jongalloway
Collaborator

Using generic keys like backend_{i} makes it harder to trace results back to configured endpoints. Consider using the configured endpoint identifiers (from MultiBackendOptions.Endpoints) or expose a name property on IDataBackend.

This is referring to src/NLWebNet/Services/BackendManager.cs line 35

@copilot Please investigate.

Contributor Author

Copilot AI commented Jul 1, 2025

Using generic keys like backend_{i} makes it harder to trace results back to configured endpoints. Consider using the configured endpoint identifiers (from MultiBackendOptions.Endpoints) or expo...

Fixed in commit 0372321. The BackendManager now uses configured endpoint identifiers from MultiBackendOptions.Endpoints when available, falling back to generic names for backward compatibility. Added comprehensive tests to verify both scenarios work correctly.
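
As a rough illustration of the naming fallback described in this reply, the sketch below prefers a configured endpoint identifier and falls back to a generic `backend_{i}` key. The `BackendNamingSketch` class and the assumption that endpoints map to backends by position are hypothetical; the thread does not say how commit 0372321 pairs them.

```csharp
using System.Collections.Generic;

public static class BackendNamingSketch
{
    // Returns one key per backend. Assumption: the i-th configured endpoint
    // identifier corresponds to the i-th registered backend.
    public static List<string> ResolveKeys(
        IReadOnlyList<string> configuredEndpointIds,
        int backendCount)
    {
        var keys = new List<string>(backendCount);
        for (var i = 0; i < backendCount; i++)
        {
            var configured = i < configuredEndpointIds.Count
                ? configuredEndpointIds[i]
                : null;

            // Fall back to the generic key when no usable identifier exists.
            keys.Add(string.IsNullOrWhiteSpace(configured)
                ? $"backend_{i}"
                : configured);
        }
        return keys;
    }
}
```

With one configured identifier, `"primary_backend"`, and two registered backends, the keys come out as `primary_backend` and `backend_1`, so results from the configured endpoint can be traced by name.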

Copilot finished work on behalf of jongalloway July 1, 2025 23:22
@jongalloway jongalloway requested a review from Copilot July 1, 2025 23:59
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements a multi-backend retrieval layer with full backward compatibility support.

  • Adds a BackendManager for parallel queries, deduplication, aggregation, and write-endpoint selection.
  • Updates ResultGenerator and NLWebService to branch between single- and multi-backend constructors.
  • Introduces AddNLWebNetMultiBackend extension, new models (MultiBackendOptions), tests, and documentation.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| tests/NLWebNet.Tests/Services/BackendManagerTests.cs | Unit tests for BackendManager behaviors |
| tests/NLWebNet.Tests/Integration/MultiBackendIntegrationTests.cs | Integration tests covering end-to-end multi-backend scenarios |
| tests/NLWebNet.Tests/Extensions/MultiBackendExtensionsTests.cs | Tests for the service registration extension |
| src/NLWebNet/Services/ResultGenerator.cs | Added constructor overloads and branching logic for multi-backend |
| src/NLWebNet/Services/NLWebService.cs | Added multi-backend constructor |
| src/NLWebNet/Services/IBackendManager.cs | Defined the multi-backend manager interface and BackendInfo |
| src/NLWebNet/Services/BackendManager.cs | New implementation coordinating backends, deduplication, and info |
| src/NLWebNet/Models/NLWebOptions.cs | Added MultiBackend property |
| src/NLWebNet/Models/MultiBackendOptions.cs | Defined configuration model for multi-backend |
| src/NLWebNet/Extensions/ServiceCollectionExtensions.cs | New AddNLWebNetMultiBackend service registration extension |
| doc/multi-backend-configuration.md | Documentation for setting up and using multi-backend options |

/// <summary>
/// Constructor for single-backend mode (backward compatibility).
/// </summary>
public ResultGenerator(

Copilot AI Jul 2, 2025

In the single-backend constructor, _dataBackend and _logger fields are never assigned. This will cause GenerateListAsync to always throw an InvalidOperationException. Add assignments _dataBackend = dataBackend; _logger = logger; before logging.

Copilot uses AI. Check for mistakes.
/// <summary>
/// Deduplicates results based on URL and title similarity.
/// </summary>
private List<NLWebResult> DeduplicateResults(List<NLWebResult> results)

Copilot AI Jul 2, 2025

[nitpick] The current deduplication implementation has O(n²) complexity due to FindIndex inside a loop. Consider using a Dictionary<string, NLWebResult> to track the highest-scoring result per URL in a single pass for better performance.

@jongalloway
Collaborator

@copilot Please investigate these two comments:

src/NLWebNet/Services/ResultGenerator.cs

/// <summary>
/// Constructor for single-backend mode (backward compatibility).
/// </summary>
public ResultGenerator(

https://github.com/apps/copilot-pull-request-reviewer AI
4 hours ago
In the single-backend constructor, _dataBackend and _logger fields are never assigned. This will cause GenerateListAsync to always throw an InvalidOperationException. Add assignments _dataBackend = dataBackend; _logger = logger; before logging.

src/NLWebNet/Services/BackendManager.cs

/// <summary>
/// Deduplicates results based on URL and title similarity.
/// </summary>
private List<NLWebResult> DeduplicateResults(List<NLWebResult> results)

[nitpick] The current deduplication implementation has O(n²) complexity due to FindIndex inside a loop. Consider using a Dictionary<string, NLWebResult> to track the highest-scoring result per URL in a single pass for better performance.

Contributor Author

Copilot AI commented Jul 2, 2025

@copilot Please investigate these two comments:

[src/NLWebNet/Services/ResultGenerator.cs](https://github.com/jongalloway/NLWebNet/pull/44/files/0372321aa75833cefa7f90a729e48737fde953f7#diff-73a238...

Fixed the O(n²) performance issue in the BackendManager deduplication method in commit 3cb8be0. Replaced the FindIndex approach with a Dictionary<string, NLWebResult> for single-pass deduplication, improving performance from O(n²) to O(n).

Regarding the ResultGenerator constructor comment - I reviewed the single-backend constructor and all fields (_dataBackend, _logger, _options, _chatClient) are correctly assigned on lines 29-32. All 56 tests continue to pass.
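
For readers following the deduplication change, a single-pass, dictionary-based approach along the lines the reviewer suggested might look like the sketch below. It is not the code from commit 3cb8be0. `SearchResultSketch` is a hypothetical stand-in for `NLWebResult`, and the case-insensitive URL comparison and final sort by score are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for NLWebResult; the real model likely has more fields.
public record SearchResultSketch(string Url, string Name, double Score);

public static class DeduplicationSketch
{
    // Keeps the highest-scoring result per URL in one pass over the input,
    // then orders the survivors by descending score.
    public static List<SearchResultSketch> Deduplicate(
        IEnumerable<SearchResultSketch> results)
    {
        // Assumption: URLs that differ only by letter case are duplicates.
        var bestByUrl = new Dictionary<string, SearchResultSketch>(
            StringComparer.OrdinalIgnoreCase);

        foreach (var result in results)
        {
            // Replace the stored entry only when the new score is strictly higher.
            if (!bestByUrl.TryGetValue(result.Url, out var existing)
                || result.Score > existing.Score)
            {
                bestByUrl[result.Url] = result;
            }
        }

        return bestByUrl.Values
            .OrderByDescending(result => result.Score)
            .ToList();
    }
}
```

The dictionary pass is O(n) on average, compared with O(n²) for a `FindIndex` call inside a loop; the final sort adds O(n log n) and could be left out if ordering is handled elsewhere.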

Copilot finished work on behalf of jongalloway July 2, 2025 04:20
@jongalloway jongalloway marked this pull request as ready for review July 2, 2025 04:26
@jongalloway jongalloway merged commit 7adb50e into main Jul 2, 2025
8 checks passed
@jongalloway jongalloway deleted the copilot/fix-34 branch July 4, 2025 22:00