diff --git a/docs/config/yaml.md b/docs/config/yaml.md
Lines changed: 0 additions & 23 deletions
diff --git a/docs/index/architecture.md b/docs/index/architecture.md
Lines changed: 0 additions & 1 deletion
diff --git a/docs/index/default_dataflow.md b/docs/index/default_dataflow.md
Lines changed: 2 additions & 22 deletions
diff --git a/docs/index/methods.md b/docs/index/methods.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/index/outputs.md b/docs/index/outputs.md
Lines changed: 0 additions & 2 deletions
diff --git a/docs/visualization_guide.md b/docs/visualization_guide.md
Lines changed: 0 additions & 7 deletions
diff --git a/graphrag/config/defaults.py b/graphrag/config/defaults.py
Lines changed: 0 additions & 23 deletions
diff --git a/graphrag/config/init_content.py b/graphrag/config/init_content.py
Lines changed: 0 additions & 6 deletions
diff --git a/graphrag/config/models/embed_graph_config.py b/graphrag/config/models/embed_graph_config.py
Lines changed: 0 additions & 45 deletions
diff --git a/graphrag/config/models/graph_rag_config.py b/graphrag/config/models/graph_rag_config.py
Lines changed: 0 additions & 13 deletions
@@ -287,29 +287,6 @@ These are the settings used for Leiden hierarchical clustering of the graph to c
 - `max_length` **int** - The maximum number of output tokens per report.
 - `max_input_length` **int** - The maximum number of input tokens to use when generating reports.
 
-### embed_graph
-
-We use node2vec to embed the graph. This is primarily used for visualization, so it is not turned on by default.
-
-#### Fields
-
-- `enabled` **bool** - Whether to enable graph embeddings.
-- `dimensions` **int** - Number of vector dimensions to produce.
-- `num_walks` **int** - The node2vec number of walks.
-- `walk_length` **int** - The node2vec walk length.
-- `window_size` **int** - The node2vec window size.
-- `iterations` **int** - The node2vec number of iterations.
-- `random_seed` **int** - The node2vec random seed.
-- `strategy` **dict** - Fully override the embed graph strategy.
-
-### umap
-
-Indicates whether we should run UMAP dimensionality reduction. This is used to provide an x/y coordinate to each graph node, suitable for visualization. If this is not enabled, nodes will receive a 0/0 x/y coordinate. If this is enabled, you *must* enable graph embedding as well.
-
-#### Fields
-
-- `enabled` **bool** - Whether to enable UMAP layouts.
-
 ### snapshots
 
 #### Fields
 
@@ -23,7 +23,6 @@ stateDiagram-v2
     Chunk --> EmbedDocuments
     ExtractGraph --> GenerateReports
     ExtractGraph --> EmbedEntities
-    ExtractGraph --> EmbedGraph
 ```
 
 ### LLM Caching
 
@@ -46,8 +46,7 @@ flowchart TB
     end
     subgraph phase6[Phase 6: Network Visualization]
     graph_outputs --> graph_embed[Graph Embedding]
-    graph_embed --> umap_entities[Umap Entities]
-    umap_entities --> combine_nodes[Final Entities]
+    graph_embed --> combine_nodes[Final Entities]
     end
     subgraph phase7[Phase 7: Text Embeddings]
     textUnits --> text_embed[Text Embedding]
@@ -176,27 +175,8 @@ In this step, we link each document to the text-units that were created in the f
 
 At this point, we can export the **Documents** table into the knowledge Model.
 
-## Phase 6: Network Visualization (optional)
 
-In this phase of the workflow, we perform some steps to support network visualization of our high-dimensional vector spaces within our existing graphs. At this point there are two logical graphs at play: the _Entity-Relationship_ graph and the _Document_ graph.
-
-```mermaid
----
-title: Network Visualization Workflows
----
-flowchart LR
-    ag[Graph Table] --> ge[Node2Vec Graph Embedding] --> ne[Umap Entities] --> ng[Entities Table]
-```
-
-### Graph Embedding
-
-In this step, we generate a vector representation of our graph using the Node2Vec algorithm. This will allow us to understand the implicit structure of our graph and provide an additional vector-space in which to search for related concepts during our query phase.
-
-### Dimensionality Reduction
-
-For each of the logical graphs, we perform a UMAP dimensionality reduction to generate a 2D representation of the graph. This will allow us to visualize the graph in a 2D space and understand the relationships between the nodes in the graph. The UMAP embeddings are reduced to two dimensions as x/y coordinates.
-
-## Phase 7: Text Embedding
+## Phase 6: Text Embedding
 
 For all artifacts that require downstream vector search, we generate text embeddings as a final step. These embeddings are written directly to a configured vector store. By default we embed entity descriptions, text unit text, and community report text.
 
 
@@ -41,4 +41,4 @@ You can install it manually by running `python -m spacy download <model_name>`,
 
 ## Choosing a Method
 
-Standard GraphRAG provides a rich description of real-world entities and relationships, but is more expensive that FastGraphRAG. We estimate graph extraction to constitute roughly 75% of indexing cost. FastGraphRAG is therefore much cheaper, but the tradeoff is that the extracted graph is less directly relevant for use outside of GraphRAG, and the graph tends to be quite a bit noisier. If high fidelity entities and graph exploration are important to your use case, we recommend staying with traditional GraphRAG. If your use case is primarily aimed at summary questions using global search, FastGraphRAG provides high quality summarization at much less LLM cost.
+Standard GraphRAG provides a rich description of real-world entities and relationships, but is more expensive than FastGraphRAG. We estimate graph extraction to constitute roughly 75% of indexing cost. FastGraphRAG is therefore much cheaper, but the tradeoff is that the extracted graph is less directly relevant for use outside of GraphRAG, and the graph tends to be quite a bit noisier. If high fidelity entities and graph exploration are important to your use case, we recommend staying with traditional GraphRAG. If your use case is primarily aimed at summary questions using global search, FastGraphRAG provides high quality summarization at much less LLM cost.
@@ -82,8 +82,6 @@ List of all entities found in the data by the LM.
 | text_unit_ids | str[] | List of the text units containing the entity. |
 | frequency     | int   | Count of text units the entity was found within. |
 | degree        | int   | Node degree (connectedness) in the graph. |
-| x             | float | X position of the node for visual layouts. If graph embeddings and UMAP are not turned on, this will be 0. |
-| y             | float | Y position of the node for visual layouts. If graph embeddings and UMAP are not turned on, this will be 0. |
 
 ## relationships
 List of all entity-to-entity relationships found in the data by the LM. This is also the _edge list_ for the graph.
 
@@ -8,13 +8,6 @@ Before building an index, please review your `settings.yaml` configuration file
 snapshots:
   graphml: true
 ```
-(Optional) To support other visualization tools and exploration, additional parameters can be enabled that provide access to vector embeddings.
-```yaml
-embed_graph:
-  enabled: true # will generate node2vec embeddings for nodes
-umap:
-  enabled: true # will generate UMAP embeddings for nodes, giving the entities table an x/y position to plot
-```
 After running the indexing pipeline over your data, there will be an output folder (defined by the `storage.base_dir` setting).
 
 - **Output Folder**: Contains artifacts from the LLM’s indexing pass.
 
@@ -125,20 +125,6 @@ class DriftSearchDefaults:
     embedding_model_id: str = DEFAULT_EMBEDDING_MODEL_ID
 
 
-@dataclass
-class EmbedGraphDefaults:
-    """Default values for embedding graph."""
-
-    enabled: bool = False
-    dimensions: int = 1536
-    num_walks: int = 10
-    walk_length: int = 40
-    window_size: int = 2
-    iterations: int = 3
-    random_seed: int = 597832
-    use_lcc: bool = True
-
-
 @dataclass
 class EmbedTextDefaults:
     """Default values for embedding text."""
@@ -367,13 +353,6 @@ class SummarizeDescriptionsDefaults:
     model_id: str = DEFAULT_CHAT_MODEL_ID
 
 
-@dataclass
-class UmapDefaults:
-    """Default values for UMAP."""
-
-    enabled: bool = False
-
-
 @dataclass
 class UpdateIndexOutputDefaults(StorageDefaults):
     """Default values for update index output."""
@@ -410,7 +389,6 @@ class GraphRagConfigDefaults:
     )
     cache: CacheDefaults = field(default_factory=CacheDefaults)
     input: InputDefaults = field(default_factory=InputDefaults)
-    embed_graph: EmbedGraphDefaults = field(default_factory=EmbedGraphDefaults)
     embed_text: EmbedTextDefaults = field(default_factory=EmbedTextDefaults)
     chunks: ChunksDefaults = field(default_factory=ChunksDefaults)
     snapshots: SnapshotsDefaults = field(default_factory=SnapshotsDefaults)
@@ -427,7 +405,6 @@ class GraphRagConfigDefaults:
     extract_claims: ExtractClaimsDefaults = field(default_factory=ExtractClaimsDefaults)
     prune_graph: PruneGraphDefaults = field(default_factory=PruneGraphDefaults)
     cluster_graph: ClusterGraphDefaults = field(default_factory=ClusterGraphDefaults)
-    umap: UmapDefaults = field(default_factory=UmapDefaults)
     local_search: LocalSearchDefaults = field(default_factory=LocalSearchDefaults)
     global_search: GlobalSearchDefaults = field(default_factory=GlobalSearchDefaults)
     drift_search: DriftSearchDefaults = field(default_factory=DriftSearchDefaults)
 
@@ -130,12 +130,6 @@
   max_length: {graphrag_config_defaults.community_reports.max_length}
   max_input_length: {graphrag_config_defaults.community_reports.max_input_length}
 
-embed_graph:
-  enabled: false # if true, will generate node2vec embeddings for nodes
-
-umap:
-  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)
-
 snapshots:
   graphml: false
   embeddings: false
 
@@ -19,7 +19,6 @@
 from graphrag.config.models.cluster_graph_config import ClusterGraphConfig
 from graphrag.config.models.community_reports_config import CommunityReportsConfig
 from graphrag.config.models.drift_search_config import DRIFTSearchConfig
-from graphrag.config.models.embed_graph_config import EmbedGraphConfig
 from graphrag.config.models.extract_claims_config import ClaimExtractionConfig
 from graphrag.config.models.extract_graph_config import ExtractGraphConfig
 from graphrag.config.models.extract_graph_nlp_config import ExtractGraphNLPConfig
@@ -35,7 +34,6 @@
     SummarizeDescriptionsConfig,
 )
 from graphrag.config.models.text_embedding_config import TextEmbeddingConfig
-from graphrag.config.models.umap_config import UmapConfig
 from graphrag.config.models.vector_store_config import VectorStoreConfig
 
 
@@ -254,17 +252,6 @@ def _validate_reporting_base_dir(self) -> None:
     )
     """The community reports configuration to use."""
 
-    embed_graph: EmbedGraphConfig = Field(
-        description="Graph embedding configuration.",
-        default=EmbedGraphConfig(),
-    )
-    """Graph Embedding configuration."""
-
-    umap: UmapConfig = Field(
-        description="The UMAP configuration to use.", default=UmapConfig()
-    )
-    """The UMAP configuration to use."""
-
     snapshots: SnapshotsConfig = Field(
         description="The snapshots configuration to use.",
         default=SnapshotsConfig(),
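
Note for anyone carrying an older `settings.yaml`: the hunks above delete the `embed_graph` and `umap` sections from the docs, the defaults, the init template, and `GraphRagConfig`. The diff does not show whether leftover keys are ignored or rejected at load time, so it may be simplest to drop them. The sketch below is a hypothetical helper, not part of GraphRAG. It assumes the settings have already been parsed into a plain dictionary, and the name `strip_removed_sections` is invented for illustration.

```python
from copy import deepcopy
from typing import Any

# Top-level settings sections deleted by this diff.
REMOVED_SECTIONS = ("embed_graph", "umap")


def strip_removed_sections(
    settings: dict[str, Any],
) -> tuple[dict[str, Any], list[str]]:
    """Return a copy of `settings` without the removed sections.

    Hypothetical helper (not a GraphRAG API). The second element of the
    result lists the section names that were actually dropped, so the
    caller can log or warn about them.
    """
    cleaned = deepcopy(settings)  # leave the caller's dictionary untouched
    dropped = [name for name in REMOVED_SECTIONS if name in cleaned]
    for name in dropped:
        del cleaned[name]
    return cleaned, dropped


if __name__ == "__main__":
    # Example settings modeled on the visualization guide's old snippet.
    old_settings = {
        "snapshots": {"graphml": True},
        "embed_graph": {"enabled": True},
        "umap": {"enabled": True},
    }
    new_settings, dropped = strip_removed_sections(old_settings)
    print("dropped sections:", dropped)
    print("remaining settings:", new_settings)
```

Returning the list of dropped names, instead of silently deleting them, lets a migration script report what changed. The same hunks also remove the `x` and `y` columns from the documented entities output, so visualization code that reads those columns would need a similar check.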