
Commit 57f80e6

Add support for structured output (LLMInterfaceV2 and LLMEntityRelationExtractor) (#463)
* Add response_format param to LLMInterfaceV2
* Raise NotImplementedError for Anthropic, Cohere, Ollama and MistralAI
* Enable structured output for OpenAI llm interface
* Enable structured output for VertexAI llm interface
* Cleanups
* Ruff
* More cleanups
* Remove response_format when passed via model_params with v2 to avoid conflicts
* Fix access to GenerationConfig params for VertexAI
* Make Neo4jGraph model more strict
* Update entity relation extractor to enable use of structured output
* Code improvements
* Fix mypy issues
* Fix issue of not passing kwargs in VertexAILLM v2
* Update Changelog
* Update docs
* Make Neo4jGraph much more strict (required by OpenAI structured output)
* Convert internally pydantic to json_schema for OpenAI (avoid using beta.parse endpoint)
* Add examples
* Update changelog
* Ruff
* Remove unused import
* Update unit tests
1 parent 62bbea3 commit 57f80e6

26 files changed: +1461 −144 lines

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -8,12 +8,14 @@
- Support for Python 3.14
- Support for version 6.0.0 of the Neo4j Python driver
- Support for structured output in `OpenAILLM` and `VertexAILLM` via the `response_format` parameter. Accepts Pydantic models (which require `ConfigDict(extra="forbid")`) or JSON schemas.
- Added `use_structured_output` parameter to `LLMEntityRelationExtractor` for improved entity extraction reliability with OpenAI/VertexAI LLMs.

### Changed

- Switched project/dependency management from Poetry to uv.
- Dropped support for Python 3.9 (EOL).
- Made `Neo4jNode`, `Neo4jRelationship`, and `Neo4jGraph` stricter: the `properties` field now uses a typed `PropertyValue` (Neo4j primitives, temporal values, lists, `GeoPoint`), and mutable defaults were fixed with `Field(default_factory=...)`.
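For background, the `Field(default_factory=...)` fix above follows the standard Pydantic pattern for mutable defaults; a generic sketch, not the library's actual model definition:

    from pydantic import BaseModel, Field

    class NodeSketch(BaseModel):
        # default_factory gives each instance its own dict,
        # instead of one dict shared across all instances
        properties: dict = Field(default_factory=dict)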

## 1.11.0

docs/source/types.rst

Lines changed: 5 additions & 0 deletions
@@ -70,6 +70,11 @@ Neo4jGraph
.. autoclass:: neo4j_graphrag.experimental.components.types.Neo4jGraph

GeoPoint
========

.. autoclass:: neo4j_graphrag.experimental.components.types.GeoPoint

KGWriterModel
=============

docs/source/user_guide_kg_builder.rst

Lines changed: 20 additions & 0 deletions
@@ -932,6 +932,26 @@ It can be used in this way:
The LLM to use can be customized; the only constraint is that it obeys the :ref:`LLMInterface <llminterface>`.

Using Structured Output
-----------------------

For improved reliability and type safety with :ref:`OpenAILLM <openaillm>` or :ref:`VertexAILLM <vertexaillm>`, enable structured output mode. When `use_structured_output=True`, the extractor uses `LLMInterfaceV2` and passes the `Neo4jGraph` Pydantic model as `response_format` to `invoke()`. This ensures the LLM response conforms to the expected graph structure, with automatic type validation that reduces the need for JSON repair and error handling.

.. code:: python

    from neo4j_graphrag.experimental.components.entity_relation_extractor import (
        LLMEntityRelationExtractor,
    )
    from neo4j_graphrag.llm import OpenAILLM

    llm = OpenAILLM(model_name="gpt-4o-mini", model_params={"temperature": 0})
    extractor = LLMEntityRelationExtractor(llm=llm, use_structured_output=True)

.. note::

    Using `use_structured_output=True` with other LLM providers will raise a `ValueError`. Do not pass `response_format` in constructor parameters (`model_params` or `generation_config`); the extractor sets it automatically when calling `invoke()`.
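Running the extractor is asynchronous and takes text chunks; a minimal sketch mirroring the example shipped with this commit:

.. code:: python

    import asyncio

    from neo4j_graphrag.experimental.components.types import TextChunk, TextChunks

    chunks = TextChunks(chunks=[TextChunk(text="Alice works at Acme Corp.", index=0)])
    graph = asyncio.run(extractor.run(chunks=chunks))
    print(graph)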
Error Behaviour
---------------

docs/source/user_guide_rag.rst

Lines changed: 81 additions & 0 deletions
@@ -295,6 +295,87 @@ Here's an example using the Python Ollama client:
See :ref:`llminterface`.

Structured Output with LLMs
===========================

Structured output enables LLMs to return responses conforming to a predefined schema (a Pydantic model or a JSON schema), ensuring type-safe and consistent data structures. This is useful for extracting entities, relationships, or any structured data with automatic validation.

**V2 Interface (Recommended)**: For :ref:`OpenAILLM <openaillm>` and :ref:`VertexAILLM <vertexaillm>`, pass `response_format` as a parameter to the `invoke()` method when using the V2 interface (a list of `LLMMessage`). `response_format` accepts either a Pydantic model class or a JSON schema dictionary. Other LLM providers raise `NotImplementedError` if `response_format` is used.

**V1 Interface (Legacy)**: With the V1 interface (string input), standard JSON mode is supported for both OpenAI and VertexAI via constructor parameters only (`model_params` for OpenAI, `generation_config` for VertexAI). The `response_format` parameter in `invoke()` is not permitted with V1.

.. code:: python

    from pydantic import BaseModel, ConfigDict

    from neo4j_graphrag.llm import OpenAILLM
    from neo4j_graphrag.types import LLMMessage

    class Person(BaseModel):
        model_config = ConfigDict(extra="forbid")  # Required for OpenAI structured output

        name: str
        age: int
        occupation: str

    llm = OpenAILLM(model_name="gpt-4o-mini")

    # V2: Pass response_format to invoke()
    messages = [LLMMessage(role="user", content="Extract: John is a 30 year old engineer.")]
    response = llm.invoke(messages, response_format=Person, temperature=0)
    person = Person.model_validate_json(response.content)  # {"name": "John", "age": 30, ...}

    # V1: Use constructor parameters for standard JSON mode
    llm_v1 = OpenAILLM(
        model_name="gpt-4o-mini",
        model_params={"response_format": {"type": "json_object"}, "temperature": 0},
    )
    response_v1 = llm_v1.invoke("Extract person in JSON format: John is 30 years old.")
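Since parsing uses plain Pydantic, schema mismatches surface as `pydantic.ValidationError`; a minimal defensive sketch (generic Pydantic, nothing library-specific):

.. code:: python

    from pydantic import ValidationError

    try:
        person = Person.model_validate_json(response.content)
    except ValidationError as exc:
        # Structured output makes this rare, but it can still occur
        print(f"Response did not match the Person schema: {exc}")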
OpenAI Structured Output
------------------------

OpenAI supports Pydantic models, JSON schemas, and JSON object mode. **Important**: Pydantic models must include `ConfigDict(extra="forbid")` to generate schemas with `additionalProperties: false`, which OpenAI's strict mode requires.

.. code:: python

    from neo4j_graphrag.llm import OpenAILLM
    from neo4j_graphrag.types import LLMMessage

    llm = OpenAILLM(model_name="gpt-4o-mini")

    # JSON schema
    json_schema = {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name", "age"],
    }
    messages = [LLMMessage(role="user", content="Extract: John is 30.")]
    response = llm.invoke(messages, response_format=json_schema, temperature=0)

    # JSON object mode (no schema enforcement)
    response = llm.invoke(messages, response_format={"type": "json_object"}, temperature=0.5)
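The commit notes mention that Pydantic models are converted internally to a JSON schema for OpenAI. A sketch of the equivalent hand-built envelope, using Pydantic's standard `model_json_schema()`; that the library produces exactly this envelope is an assumption based on the JSON-schema example elsewhere in this commit:

.. code:: python

    from pydantic import BaseModel, ConfigDict

    class Person(BaseModel):
        model_config = ConfigDict(extra="forbid")  # emits "additionalProperties": false

        name: str
        age: int

    # Hand-built equivalent of the wrapper shown in this commit's examples
    wrapped_schema = {
        "type": "json_schema",
        "json_schema": {
            "name": "person_info",  # arbitrary schema label
            "schema": Person.model_json_schema(),
        },
    }
    response = llm.invoke(messages, response_format=wrapped_schema, temperature=0)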
VertexAI Structured Output
--------------------------

VertexAI uses `GenerationConfig` with `response_mime_type` and `response_schema` internally when `response_format` is passed to `invoke()`. Both Pydantic models and JSON schemas are supported. Additional `GenerationConfig` parameters (e.g., `temperature`, `max_output_tokens`) can be passed as kwargs to `invoke()`.

.. code:: python

    from pydantic import BaseModel, ConfigDict

    from neo4j_graphrag.llm import VertexAILLM
    from neo4j_graphrag.types import LLMMessage

    class Person(BaseModel):
        model_config = ConfigDict(extra="forbid")

        name: str
        age: int

    llm = VertexAILLM(model_name="gemini-1.5-pro")
    messages = [LLMMessage(role="user", content="Extract: John is 30.")]
    response = llm.invoke(messages, response_format=Person, temperature=0)
    person = Person.model_validate_json(response.content)
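The LLM interfaces also expose an asynchronous entry point; assuming `ainvoke()` accepts the same `response_format` keyword as `invoke()` (an assumption, as the async path is not shown here), usage would look like:

.. code:: python

    import asyncio

    async def extract_person() -> Person:
        # Assumes ainvoke() mirrors invoke() for response_format
        response = await llm.ainvoke(messages, response_format=Person, temperature=0)
        return Person.model_validate_json(response.content)

    person = asyncio.run(extract_person())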
Rate Limit Handling
===================

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
"""
Simple example demonstrating structured output with LLMEntityRelationExtractor.

This example shows how to use structured output for more reliable entity and
relationship extraction with automatic schema validation.

The Neo4jGraph schema is now compatible with both OpenAI and VertexAI structured
output APIs, with strict schema validation (additionalProperties: false) and
proper required field definitions.

Prerequisites:
- Google Cloud credentials configured for VertexAI
- Or OpenAI API key set in the OPENAI_API_KEY environment variable
"""

import asyncio

from dotenv import load_dotenv

from neo4j_graphrag.experimental.components.entity_relation_extractor import (
    LLMEntityRelationExtractor,
)
from neo4j_graphrag.experimental.components.types import (
    Neo4jGraph,
    TextChunk,
    TextChunks,
)
from neo4j_graphrag.llm import VertexAILLM


async def main() -> Neo4jGraph:
    """
    Demonstrates entity and relation extraction with structured output.

    With use_structured_output=True:
    - Uses LLMInterfaceV2 (list of messages)
    - Passes the Neo4jGraph Pydantic model as response_format to invoke()
    - Ensures the response conforms to the expected graph structure
    - Provides automatic type validation
    - Reduces the need for JSON repair and error handling
    """
    load_dotenv()
    # Initialize the LLM - no response_format in the constructor!
    llm = VertexAILLM(model_name="gemini-2.5-flash")

    # Alternatively, with OpenAI:
    # from neo4j_graphrag.llm import OpenAILLM
    # llm = OpenAILLM(
    #     model_name="gpt-4o-mini",
    #     model_params={"temperature": 0},
    # )

    # Enable structured output for reliable extraction
    extractor = LLMEntityRelationExtractor(
        llm=llm,
        use_structured_output=True,  # This is the key parameter!
    )

    # Sample text about a person and an organization
    sample_text = """
    Albert Einstein was a theoretical physicist who developed the theory of relativity.
    He worked at the Institute for Advanced Study in Princeton from 1933 until his death in 1955.
    """

    # Extract entities and relationships
    graph = await extractor.run(
        chunks=TextChunks(chunks=[TextChunk(text=sample_text, index=0)])
    )

    return graph


if __name__ == "__main__":
    # Run the extraction
    graph = asyncio.run(main())
    print(graph)
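As a follow-up, a sketch of inspecting the returned graph, assuming `Neo4jGraph` exposes `nodes` and `relationships` lists (field names inferred from the library's types, not shown in this diff):

# Assumes Neo4jGraph has `nodes` and `relationships` fields
for node in graph.nodes:
    print(node.label, node.properties)
for rel in graph.relationships:
    print(rel.type, rel.properties)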
Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
# Copyright (c) "Neo4j"
# Neo4j Sweden AB [https://neo4j.com]
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Simple example comparing OpenAI LLM V1 (legacy) vs V2 (structured output).

This demonstrates how V2's structured output provides type-safe, validated responses
compared to V1's prompt-based JSON extraction.

Prerequisites:
- OpenAI API key set in the OPENAI_API_KEY environment variable
"""

from dotenv import load_dotenv
from pydantic import BaseModel, ConfigDict

from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.types import LLMMessage

load_dotenv()


# Define a Pydantic model for structured output
class Movie(BaseModel):
    model_config = ConfigDict(
        extra="forbid"
    )  # Important: prevents extra properties in the response (required by OpenAI strict mode)

    title: str
    year: int
    director: str
    genre: str


# =============================================================================
# V1 (Legacy): Manual JSON mode with prompt engineering
# =============================================================================
print("=" * 60)
print("V1 Legacy: Manual JSON extraction with prompt engineering")
print("=" * 60)

# V1: Use model_params with response_format for JSON object mode
llm_v1 = OpenAILLM(
    model_name="gpt-4o-mini",
    model_params={"response_format": {"type": "json_object"}, "temperature": 0},
)

# V1 requires string input and explicit JSON instructions in the prompt
v1_prompt = """Extract movie information and respond in JSON format.
Include: title, year, director, genre.

Text: Inception was directed by Christopher Nolan in 2010. It's a science fiction thriller."""

response_v1 = llm_v1.invoke(v1_prompt)
print(f"Response: {response_v1.content}")


# =============================================================================
# V2 (New): Structured output with Pydantic model
# =============================================================================
print("\n" + "=" * 60)
print("V2: Structured output with Pydantic model")
print("=" * 60)

# V2: Use a clean LLM without constructor params
llm_v2 = OpenAILLM(model_name="gpt-4o-mini")

# V2 uses a list of LLMMessage for input
messages = [
    LLMMessage(
        role="user",
        content="Inception was directed by Christopher Nolan in 2010. It's a science fiction thriller.",
    )
]

# Pass response_format and temperature directly to invoke()
response_v2 = llm_v2.invoke(messages, response_format=Movie, temperature=0)

# Parse and validate in one step
movie = Movie.model_validate_json(response_v2.content)
print(f"Response: {response_v2.content}")
print(f"Validated: {movie!r}")


# =============================================================================
# V2: Using a JSON Schema instead of Pydantic
# =============================================================================
print("\n" + "=" * 60)
print("V2 Alternative: Structured output with JSON Schema")
print("=" * 60)

# The same LLM and messages are reused; only response_format changes.
# Define a JSON schema (equivalent to the Movie Pydantic model).
# Note: OpenAI requires JSON schemas to be wrapped in this specific format.
movie_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "movie_info",
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "year": {"type": "integer"},
                "director": {"type": "string"},
                "genre": {"type": "string"},
            },
            "required": ["title", "year", "director", "genre"],
            "additionalProperties": False,
        },
    },
}

# Pass the JSON schema as response_format
response_v2_schema = llm_v2.invoke(
    messages, response_format=movie_schema, temperature=0
)

print(f"Response: {response_v2_schema.content}")
