|
1 | 1 | # GraphRAG for .NET |
2 | 2 |
|
3 | | -This repository hosts the in-progress C#/.NET 9 migration of Microsoft's GraphRAG project. The original |
4 | | -Python implementation is still available as a git submodule under `submodules/graphrag-python` for |
5 | | -reference while the port matures. |
| 3 | +GraphRAG for .NET is a ground-up port of Microsoft’s original GraphRAG Python reference implementation to the modern .NET 9 stack. |
| 4 | +Our goal is API parity with the Python pipeline while embracing native .NET idioms (dependency injection, logging abstractions, async I/O, etc.). |
| 5 | +The upstream Python code remains available in `submodules/graphrag-python` for side-by-side reference during the migration. |
6 | 6 |
|
7 | | -## Repository layout |
| 7 | +--- |
8 | 8 |
|
9 | | -- `GraphRag.slnx` – experimental solution definition referencing every project in the workspace. |
10 | | -- `Directory.Build.props` / `Directory.Packages.props` – centralised build settings and pinned NuGet package versions. |
11 | | -- `src/GraphRag.Abstractions` – shared interfaces for pipelines, storage, vector indexes, and graph databases. |
12 | | -- `src/GraphRag.Core` – the pipeline builder, service registration helpers, and DI primitives. |
13 | | -- `src/GraphRag.Storage.*` – concrete adapters for Neo4j, Azure Cosmos DB, and PostgreSQL-backed graph storage. |
14 | | -- `tests/GraphRag.Tests.Integration` – Aspire-powered integration tests that exercise the real datastores via xUnit. |
15 | | -- `.github/workflows/dotnet-integration.yml` – GitHub Actions workflow that runs the integration tests on both Linux and Windows agents. |
| 9 | +## Repository Structure |
| 10 | + |
| 11 | +``` |
| 12 | +graphrag/ |
| 13 | +├── GraphRag.slnx # Single solution covering every project |
| 14 | +├── Directory.Build.props / Directory.Packages.props |
| 15 | +├── src/ |
| 16 | +│ ├── ManagedCode.GraphRag # Core pipeline orchestration & abstractions |
| 17 | +│ ├── ManagedCode.GraphRag.CosmosDb # Azure Cosmos DB graph adapter |
| 18 | +│ ├── ManagedCode.GraphRag.Neo4j # Neo4j adapter & bolt client integration |
| 19 | +│ └── ManagedCode.GraphRag.Postgres # Apache AGE/PostgreSQL graph store adapter |
| 20 | +├── tests/ |
| 21 | +│ └── ManagedCode.GraphRag.Tests |
| 22 | +│ ├── Integration/ # Live container-backed scenarios (Testcontainers) |
| 23 | +│ └── … unit-level suites |
| 24 | +└── submodules/ |
| 25 | + └── graphrag-python # Original Python implementation (read-only reference) |
| 26 | +``` |
| 27 | + |
| 28 | +### Key Components |
| 29 | + |
| 30 | +- **ManagedCode.GraphRag** |
| 31 | + Hosts the pipelines, workflow execution model, and shared contracts such as `IGraphStore`, `IPipelineCache`, etc. |
| 32 | + |
| 33 | +- **ManagedCode.GraphRag.Neo4j / .Postgres / .CosmosDb** |
| 34 | + Concrete graph-store adapters that satisfy the core abstractions. Each hides the backend-specific SDK plumbing and exposes `.AddXGraphStore(...)` DI helpers. |
| 35 | + |
| 36 | +- **ManagedCode.GraphRag.Tests** |
| 37 | + Our only test project. |
| 38 | + Unit tests ensure helper APIs behave deterministically. |
| 39 | + The `Integration/` folder spins up real infrastructure (Neo4j, Apache AGE/PostgreSQL, optional Cosmos) via Testcontainers—no fakes or mocks. |
| 40 | + |
| 41 | +--- |
16 | 42 |
|
17 | 43 | ## Prerequisites |
18 | 44 |
|
19 | | -- .NET SDK 9.0 preview or newer (the workflow and development container install it via `dotnet-install`). |
20 | | -- Docker Desktop or another container runtime so Aspire can launch Neo4j and PostgreSQL for the tests. |
21 | | -- (Optional) Azure Cosmos DB Emulator running locally with the `COSMOS_EMULATOR_CONNECTION_STRING` environment |
22 | | - variable populated in order to run the Cosmos integration test. |
| 45 | +| Requirement | Notes | |
| 46 | +|-------------|-------| |
| 47 | +| [.NET SDK 9.0](https://dotnet.microsoft.com/en-us/download/dotnet/9.0) | The solution targets `net9.0`; install previews where necessary. | |
| 48 | +| Docker Desktop / compatible container runtime | Required for Testcontainers-backed integration tests (Neo4j & Apache AGE/PostgreSQL). | |
| 49 | +| (Optional) Azure Cosmos DB Emulator | Set `COSMOS_EMULATOR_CONNECTION_STRING` to enable Cosmos tests; they are skipped when the env var is absent. | |
23 | 50 |
|
24 | | -## Running the tests locally |
| 51 | +--- |
25 | 52 |
|
26 | | -```bash |
27 | | -dotnet test tests/GraphRag.Tests.Integration/GraphRag.Tests.Integration.csproj --logger "console;verbosity=normal" |
28 | | -``` |
| 53 | +## Getting Started |
| 54 | + |
| 55 | +1. **Clone the repository** |
| 56 | + ```bash |
| 57 | + git clone https://github.com/<your-org>/graphrag.git |
| 58 | + cd graphrag |
| 59 | + git submodule update --init --recursive |
| 60 | + ``` |
| 61 | + |
| 62 | +2. **Restore & build** |
| 63 | + ```bash |
| 64 | + dotnet build GraphRag.slnx |
| 65 | + ``` |
| 66 | + > Repository rule: always build the solution before running tests. |
| 67 | +
|
| 68 | +3. **Run the full test suite** |
| 69 | + ```bash |
| 70 | + dotnet test GraphRag.slnx --logger "console;verbosity=minimal" |
| 71 | + ``` |
| 72 | + This command will: |
| 73 | + - Restore packages |
| 74 | + - Launch Neo4j and Apache AGE/PostgreSQL containers via Testcontainers |
| 75 | + - Execute unit + integration tests from `ManagedCode.GraphRag.Tests` |
| 76 | + - Tear down containers automatically when finished |
| 77 | + |
| 78 | +4. **Limit to a specific integration area (optional)** |
| 79 | + ```bash |
| 80 | + dotnet test tests/ManagedCode.GraphRag.Tests/ManagedCode.GraphRag.Tests.csproj \ |
| 81 | + --filter "FullyQualifiedName~PostgresGraphStoreIntegrationTests" \ |
| 82 | + --logger "console;verbosity=normal" |
| 83 | + ``` |
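
The build-before-test rule from steps 2 and 3 can be wrapped in a small shell helper. This is a minimal sketch, not something the repository ships: the function name `build_then_test` and the `DOTNET` override variable are hypothetical, and only the `dotnet build` / `dotnet test` invocations are taken from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): build the solution first,
# then run the tests, mirroring the "build before testing" rule.
# DOTNET is an assumed override hook for pointing at another dotnet executable.
DOTNET="${DOTNET:-dotnet}"

build_then_test() {
  local solution="${1:-GraphRag.slnx}"
  # Stop here if the build fails so tests never run against stale binaries.
  "$DOTNET" build "$solution" || return 1
  "$DOTNET" test "$solution" --logger "console;verbosity=minimal"
}
```

After sourcing the file, `build_then_test` runs against `GraphRag.slnx` by default; pass a different solution or project path as the first argument to narrow the run.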
| 84 | + |
| 85 | +--- |
| 86 | + |
| 87 | +## Integration Testing Strategy |
| 88 | + |
| 89 | +- **No fakes.** We removed the legacy fake Postgres store. Every graph operation in tests uses real services orchestrated by Testcontainers. |
| 90 | +- **Security coverage.** `Integration/PostgresGraphStoreIntegrationTests.cs` includes payloads that mimic SQL/Cypher injection attempts to ensure values remain literals and labels/types are strictly validated. |
| 91 | +- **Cross-backend validation.** `Integration/GraphStoreIntegrationTests.cs` exercises Postgres, Neo4j, and Cosmos (when available) through the shared `IGraphStore` abstraction. |
| 92 | +- **Workflow smoke tests.** Pipelines (e.g., `IndexingPipelineRunnerTests`) and finalization steps run end-to-end with the fixture-provisioned infrastructure. |
| 93 | + |
| 94 | +--- |
| 95 | + |
| 96 | +## Local Cosmos Testing |
| 97 | + |
| 98 | +1. Install and start the [Azure Cosmos DB Emulator](https://learn.microsoft.com/azure/cosmos-db/local-emulator). |
| 99 | +2. Export the connection string: |
| 100 | + ```bash |
| 101 | + export COSMOS_EMULATOR_CONNECTION_STRING="AccountEndpoint=https://localhost:8081/;AccountKey=…;" |
| 102 | + ``` |
| 103 | +3. Rerun `dotnet test`; Cosmos scenarios will seed databases & verify relationships without additional setup. |
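
Before rerunning the suite, it can help to confirm that the shell actually exports the variable. The check below is a sketch under stated assumptions: the function names are hypothetical, and treating an empty value the same as an unset one is an assumption rather than documented behaviour of the test fixtures.

```shell
# Hypothetical helpers (not part of the repo).
# Succeeds only when the connection string is set and non-empty.
cosmos_tests_enabled() {
  [ -n "${COSMOS_EMULATOR_CONNECTION_STRING:-}" ]
}

# Prints a one-line summary of what the test run is expected to do.
cosmos_status() {
  if cosmos_tests_enabled; then
    echo "Cosmos tests enabled"
  else
    echo "Cosmos tests skipped: COSMOS_EMULATOR_CONNECTION_STRING is not set"
  fi
}
```

Calling `cosmos_status` right before `dotnet test` gives a quick indication of whether the Cosmos scenarios should appear in the results.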
| 104 | + |
| 105 | +--- |
| 106 | + |
| 107 | +## Development Tips |
| 108 | + |
| 109 | +- **Solution layout.** Use `GraphRag.slnx` in Visual Studio/VS Code/Rider for a complete workspace view. |
| 110 | +- **Formatting / analyzers.** Run `dotnet format GraphRag.slnx` before committing to satisfy the repo analyzers. |
| 111 | +- **Coding conventions.** |
| 112 | + - `nullable` and implicit usings are enabled; keep annotations accurate. |
| 113 | + - Async methods should follow the `Async` suffix convention. |
| 114 | + - Prefer DI helpers in `ManagedCode.GraphRag` when wiring new services. |
| 115 | +- **Graph adapters.** Implement additional backends by conforming to `IGraphStore` and registering via `IServiceCollection`. |
| 116 | + |
| 117 | +--- |
| 118 | + |
| 119 | +## Contributing |
| 120 | + |
| 121 | +1. Fork and clone the repo. |
| 122 | +2. Create a feature branch from `main`. |
| 123 | +3. Follow the repository rules (build before testing; integration tests must use real containers). |
| 124 | +4. Submit a PR referencing any related issues. Include `dotnet test GraphRag.slnx` output in the PR body. |
| 125 | + |
| 126 | +See `CONTRIBUTING.md` for coding standards and PR expectations. |
| 127 | + |
| 128 | +--- |
| 129 | + |
| 130 | +## License & Credits |
| 131 | + |
| 132 | +- Licensed under the [MIT License](LICENSE). |
| 133 | +- Original Python implementation © Microsoft; see the `graphrag-python` submodule for upstream documentation and examples. |
| 134 | + |
| 135 | +--- |
29 | 136 |
|
30 | | -When the Cosmos emulator is available the test suite will automatically seed it and assert the stored relationship count. |
| 137 | +Have questions or found a bug? Open an issue or start a discussion—we’re actively evolving the .NET port and welcome feedback. 🚀 |