A system for transforming unstructured text into structured knowledge through automated graph creation, expansion, and analysis.
The Knowledge Graph Synthesis System processes text input to extract entities and relationships, builds a knowledge graph, expands it through recursive reasoning, analyzes its structure, creates abstractions, and generates theories and insights. It supports both Russian and English and works with several Large Language Model (LLM) providers.
- Text Processing: Hierarchical segmentation with contextual summarization
- Knowledge Extraction: Entity and relationship extraction with coreference resolution
- Graph Management: Creation, storage, and visualization of knowledge graphs
- Recursive Reasoning: Autonomous expansion of knowledge graphs through questioning and reasoning
- Graph Analysis: Calculation of structural metrics, community detection, and pattern identification
- Meta-Graph Creation: Abstraction of concepts into higher-level representations
- Theory Formation: Generation of theories and hypotheses with testing
- Results Generation: Production of documents, visualizations, and insights
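As a rough illustration of the extraction, graph management, and analysis steps listed above, the sketch below builds a small graph from hand-written (subject, relation, object) triples and computes degree centrality and connected components using only the standard library. The triples, the function names (`build_graph`, `degree_centrality`, `connected_components`), and the use of connected components as a simple stand-in for community detection are all assumptions made for illustration; the project's own modules may work quite differently.

```python
from collections import defaultdict, deque

# Hypothetical triples, standing in for what an extraction step might emit.
TRIPLES = [
    ("Marie Curie", "discovered", "Polonium"),
    ("Marie Curie", "discovered", "Radium"),
    ("Marie Curie", "married", "Pierre Curie"),
    ("Alan Turing", "proposed", "Turing machine"),
]


def build_graph(triples):
    """Return an undirected adjacency map: entity -> set of neighbouring entities."""
    graph = defaultdict(set)
    for subject, _relation, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return dict(graph)


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def connected_components(graph):
    """Group nodes into connected components with breadth-first search."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


graph = build_graph(TRIPLES)
centrality = degree_centrality(graph)
components = connected_components(graph)
most_central = max(centrality, key=centrality.get)
print(most_central, round(centrality[most_central], 2), len(components))
# prints: Marie Curie 0.6 2
```

Relation labels are dropped here to keep the adjacency map simple; a fuller representation would keep them as edge attributes.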
Supported LLM providers:
- Claude (Anthropic)
- GPT (OpenAI)
- Gemini (Google)
- DeepSeek
- Ollama (local models)
Prerequisites:
- Python 3.9 or higher
- Git

- Clone the repository:
    git clone https://github.com/yourusername/knowledge-graph-synthesis.git
    cd knowledge-graph-synthesis
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Create a .env file based on the example:
    cp .env.example .env
    # Edit .env with your API keys and configuration

Process a text file and generate insights:
    python src/main.py process --input text.txt --output results/ --language en --provider claude
Start the interactive Streamlit interface:
    python -m streamlit run src/frontend/app.py
Configure LLM providers in your .env file:
CLAUDE_API_KEY=your_api_key
GPT_API_KEY=your_api_key
GEMINI_API_KEY=your_api_key
DEEPSEEK_API_KEY=your_api_key
# For Ollama, no API key is needed
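To show how these variables might be consumed, here is a minimal sketch of a key lookup. It assumes the variables are already present in the process environment (for example, loaded from .env by a package such as python-dotenv). The helper `resolve_api_key` and the provider names other than `claude` are hypothetical and not taken from the project's code.

```python
import os

# Environment variable names as listed in the .env example above.
# Provider names other than "claude" are assumed for illustration.
PROVIDER_KEY_VARS = {
    "claude": "CLAUDE_API_KEY",
    "gpt": "GPT_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "ollama": None,  # local models: no API key needed
}


def resolve_api_key(provider, env=None):
    """Return the API key for `provider`, or None when no key is required.

    Hypothetical helper, not part of the project. Raises ValueError for an
    unknown provider and KeyError when a required key is missing or empty.
    """
    env = os.environ if env is None else env
    name = provider.lower()
    if name not in PROVIDER_KEY_VARS:
        raise ValueError(f"Unknown provider: {provider!r}")
    var = PROVIDER_KEY_VARS[name]
    if var is None:
        return None
    value = env.get(var, "").strip()
    if not value:
        raise KeyError(f"{var} is not set; add it to your .env file")
    return value


# An explicit mapping keeps the example independent of the real environment.
example_env = {"CLAUDE_API_KEY": "your_api_key"}
print(resolve_api_key("claude", example_env))  # prints: your_api_key
print(resolve_api_key("ollama", example_env))  # prints: None
```

Failing early on a missing key gives a clearer error than a rejected request to the provider later in the run.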
Project structure:
src/
├── config/ # Configuration management
├── text_processing/ # Text processing module
├── knowledge_extraction/ # Knowledge extraction module
├── graph_management/ # Graph management module
├── reasoning/ # Reasoning module
├── analysis/ # Graph analysis module
├── meta_graph/ # Meta-graph module
├── theory_formation/ # Theory formation module
├── results/ # Results generation module
├── llm/ # LLM provider integration
├── storage/ # Storage services
├── utils/ # Utility services
├── frontend/ # Frontend integration
└── main.py # Application entry point
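Since main.py is the application entry point, the `process` command from the usage example could be parsed roughly as follows. This is a sketch using the standard library's argparse, not the project's actual parser. The `ru` language code, the provider values other than `claude`, and the defaults are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the `process` command shown in the usage example."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    process = subparsers.add_parser("process", help="Process a text file")
    process.add_argument("--input", required=True, help="Path to the input text file")
    process.add_argument("--output", required=True, help="Directory for results")
    # Russian and English are the stated languages; the code "ru" is an assumption.
    process.add_argument("--language", choices=["en", "ru"], default="en")
    # Only "claude" appears in the documented command; the other values are assumed.
    process.add_argument(
        "--provider",
        choices=["claude", "gpt", "gemini", "deepseek", "ollama"],
        default="claude",
    )
    return parser


# Parse the same arguments as the documented command line.
args = build_parser().parse_args(
    ["process", "--input", "text.txt", "--output", "results/",
     "--language", "en", "--provider", "claude"]
)
print(args.command, args.input, args.output, args.language, args.provider)
# prints: process text.txt results/ en claude
```

Using a subcommand leaves room for further commands beside `process` without changing the existing flags.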
Run the tests:
    pytest
To contribute:
- Fork the repository
- Create a feature branch:
    git checkout -b feature/your-feature
- Commit your changes:
    git commit -am 'Add some feature'
- Push to the branch:
    git push origin feature/your-feature
- Submit a pull request
Detailed documentation is available in the docs/ directory:
- Product Requirements Document
- Application Flow
- Technology Stack
- Frontend Guidelines
- Backend Structure
- Implementation Plan
This project is licensed under the MIT License - see the LICENSE file for details.
- This project was inspired by research in knowledge graph construction and reasoning
- Thanks to the developers of all the libraries and tools used in this project