| 1 | +# csv-to-knowledge-graph |
| 2 | + |
| 3 | +<p align=center> |
| 4 | + <img width="80%" src="docs/images/banner-dark.png#gh-dark-mode-only" alt="csv-to-knowledge-graph"/> |
| 5 | + <img width="80%" src="docs/images/banner-white.png#gh-light-mode-only" alt="csv-to-knowledge-graph"/> |
| 6 | +</p> |
| 7 | + |
| 8 | +<div align=center> |
| 9 | + <h3>Create Dgraph-backed knowledge graphs from CSV files.</h3> |
| 10 | + |
| 11 | + <p>Built with <a href="https://hypermode.com/">Hypermode</a> and powered by AI.</p> |
| 12 | + |
| 13 | + <p> 👉 |
| 14 | + <a href="https://csv-to-knowledge-graph-frontend.vercel.app">Import my CSV now!</a> |
| 15 | + </p> |
| 16 | +</div> |
| 17 | + |
| 18 | +## Table of Contents |
| 19 | + |
| 20 | +- [csv-to-knowledge-graph](#csv-to-knowledge-graph) |
| 21 | + - [Table of Contents](#table-of-contents) |
| 22 | + - [Features](#features) |
| 23 | + - [Introduction](#introduction) |
| 24 | + - [What problem does this solve?](#what-problem-does-this-solve) |
| 25 | + - [How does it work?](#how-does-it-work) |
| 26 | + - [Under the Hood](#under-the-hood) |
| 27 | + - [AI-Powered Analysis](#ai-powered-analysis) |
| 28 | + - [RDF Generation](#rdf-generation) |
| 29 | + - [Query Generation](#query-generation) |
| 30 | + - [CSV-to-RDF Library](#csv-to-rdf-library) |
| 31 | + - [RDF-to-Dgraph Library](#rdf-to-dgraph-library) |
| 32 | + - [Usage](#usage) |
| 33 | + - [Powered by Hypermode](#powered-by-hypermode) |
| 34 | + |
| 35 | +> Looking to contribute? Check out the [Contributing Guide](docs/CONTRIBUTING.md) for more information on how to get started with development. |
| 36 | + |
| 37 | +## Features |
| 38 | + |
| 39 | +- 🚀 **Browser-Based CSV Processing** - Upload CSV files directly in your browser. |
| 40 | + |
| 41 | +- 🧠 **AI-Powered Graph Generation** - Auto-detect entities and relationships from CSV columns. |
| 42 | + |
| 43 | +- 🔍 **Interactive Graph Visualization** - Zoom, pan, and reposition nodes in your knowledge graph. |
| 44 | + |
| 45 | +- 🔄 **RDF Template Generation** - Create RDF templates from your graph structure. |
| 46 | + |
| 47 | +- 📝 **RDF Data Conversion** - Transform CSV to RDF with real-time progress tracking. |
| 48 | + |
| 49 | +- 🔌 **Dgraph Integration** - Connect, test, and import data to your Dgraph instance. |
| 50 | + |
| 51 | +- 💡 **DQL Query Generation** - Get auto-generated queries specific to your schema. |
| 52 | + |
| 53 | +- 🔗 **Ratel Support** - Open queries in Dgraph's Ratel UI with one click. |
| 54 | + |
| 55 | +- 🧩 **Modular Architecture** - Separate packages for CSV-to-RDF, RDF-to-Dgraph, and graph handling. |
| 56 | + |
| 57 | +## Introduction |
| 58 | + |
| 59 | +### What problem does this solve? |
| 60 | + |
| 61 | +Getting data into Dgraph is unnecessarily complicated. Currently, you need to: |
| 62 | + |
| 63 | +- Manually create schema files that correctly model your data |
| 64 | +- Learn RDF (Resource Description Framework) format and its quirks |
| 65 | +- Run complicated command-line tools with cryptic options |
| 66 | +- Write custom data-transformation scripts that often break |
| 67 | + |
| 68 | +This creates a steep learning curve that prevents many organizations from using graph databases effectively. CSV to Knowledge Graph removes these barriers by providing a simple, visual way to transform regular CSV files into graph data. |
| 69 | + |
| 70 | +### How does it work? |
| 71 | + |
| 72 | +1. **Upload your CSV**: Simply drag and drop your CSV file into the browser |
| 73 | +2. **AI analyzes your data**: Our AI examines your column names to automatically identify entities and relationships |
| 74 | +3. **Visual graph preview**: See and interact with the proposed knowledge graph structure |
| 75 | +4. **Generate RDF**: Convert your CSV data to the RDF format Dgraph requires |
| 76 | +5. **One-click import**: Connect to your Dgraph instance and import with a single click |
| 77 | + |
| 78 | +Here's a simple example, starting from this CSV file: |
| 79 | + |
| 80 | +```csv |
| 81 | +Order_ID,Customer_Name,Product_Name,Quantity,Price |
| 82 | +ORD-001,John Smith,Wireless Earbuds,1,79.99 |
| 83 | +ORD-002,Sarah Johnson,Smart Watch,1,249.99 |
| 84 | +``` |
| 85 | + |
| 86 | +The first row is transformed into the following RDF triples: |
| 87 | + |
| 88 | +```text |
| 89 | +<_:Customer_John_Smith> <dgraph.type> "Customer" . |
| 90 | +<_:Customer_John_Smith> <Customer.name> "John Smith" . |
| 91 | + |
| 92 | +<_:Order_ORD-001> <dgraph.type> "Order" . |
| 93 | +<_:Order_ORD-001> <Order.id> "ORD-001" . |
| 94 | +<_:Order_ORD-001> <PLACED_BY> <_:Customer_John_Smith> . |
| 95 | + |
| 96 | +<_:Product_Wireless_Earbuds> <dgraph.type> "Product" . |
| 97 | +<_:Product_Wireless_Earbuds> <Product.name> "Wireless Earbuds" . |
| 98 | +<_:Product_Wireless_Earbuds> <Product.price> "79.99" . |
| 99 | +<_:Product_Wireless_Earbuds> <Product.quantity> "1" . |
| 100 | + |
| 101 | +<_:Product_Wireless_Earbuds> <BELONGS_TO_ORDER> <_:Order_ORD-001> . |
| 102 | +``` |
| 103 | + |
| 104 | +<p align=center style="margin-top: 20px; margin-bottom: 20px;"> |
| 105 | + <img width="80%" src="docs/images/root-graph.png" alt="csv-to-knowledge-graph"/> |
| 106 | +</p> |
| 107 | + |
| 108 | +## Under the Hood |
| 109 | + |
| 110 | +### AI-Powered Analysis |
| 111 | + |
| 112 | +Our AI analyzes your CSV column names to intelligently identify entities, attributes, and potential relationships. It recognizes patterns like `Customer_Name` or `Order_ID` to generate a coherent graph structure that represents the real-world relationships in your data. |
| 113 | + |
| 114 | +This analysis happens entirely in your browser. The AI examines column naming patterns, value types, and contextual relationships to build a comprehensive graph model that serves as the foundation for RDF generation. |
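As a rough illustration of the naming patterns described above, the following Python sketch groups columns by their `Entity_Attribute` prefix. It is a plain string heuristic standing in for the AI analysis, not the method the application uses, and `group_columns` is a hypothetical helper introduced only for this example.

```python
from collections import defaultdict


def group_columns(columns):
    """Group 'Entity_Attribute' column names into {entity: [attribute, ...]}.

    Columns without an underscore are collected under the key None,
    because the name alone does not say which entity they belong to.
    """
    entities = defaultdict(list)
    for column in columns:
        entity, sep, attribute = column.partition("_")
        if sep:
            entities[entity].append(attribute.lower())
        else:
            entities[None].append(column.lower())
    return dict(entities)


header = ["Order_ID", "Customer_Name", "Product_Name", "Quantity", "Price"]
groups = group_columns(header)
for entity, attributes in groups.items():
    print(entity, attributes)
```

`Quantity` and `Price` carry no entity prefix, so this heuristic cannot place them. Resolving that kind of ambiguity is presumably where a model-based analysis helps.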
| 115 | + |
| 116 | +### RDF Generation |
| 117 | + |
| 118 | +The RDF generation pipeline automatically creates a template that maps CSV data to a valid RDF format compatible with Dgraph. This includes: |
| 119 | + |
| 120 | +- Generating appropriate entity types based on column groupings |
| 121 | +- Creating unambiguous relationship predicates with proper direction |
| 122 | +- Defining attribute mappings between CSV columns and RDF properties |
| 123 | +- Setting correct data types for each attribute |
| 124 | + |
| 125 | +The template is then applied to your CSV data, converting each row into a set of RDF triples that preserve the semantic structure of your information. |
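The template step can be sketched in Python as follows. The `{Column}` placeholder syntax and the helpers `node_label`, `escape_literal`, and `render_row` are assumptions made for this example; the library's real template format may differ.

```python
import csv
import io
import re

# Hypothetical template syntax: {Column} placeholders, one triple per line.
TEMPLATE = """\
<_:Customer_{Customer_Name}> <dgraph.type> "Customer" .
<_:Customer_{Customer_Name}> <Customer.name> "{Customer_Name}" .
<_:Order_{Order_ID}> <dgraph.type> "Order" .
<_:Order_{Order_ID}> <Order.id> "{Order_ID}" .
<_:Order_{Order_ID}> <PLACED_BY> <_:Customer_{Customer_Name}> .
"""

PLACEHOLDER = re.compile(r"\{(\w+)\}")
BLANK_NODE = re.compile(r"<_:[^>]*>")


def node_label(value):
    """Make a value safe for a blank-node label (spaces become underscores)."""
    return re.sub(r"[^A-Za-z0-9-]+", "_", value.strip())


def escape_literal(value):
    """Escape backslashes and double quotes for a quoted RDF literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def render_row(template, row):
    """Fill one CSV row into the template."""
    def fill_node(match):
        # Placeholders inside <_:...> are node labels, so sanitize them.
        return PLACEHOLDER.sub(lambda m: node_label(row[m.group(1)]), match.group(0))

    with_nodes = BLANK_NODE.sub(fill_node, template)
    # Every remaining placeholder sits inside a quoted literal.
    return PLACEHOLDER.sub(lambda m: escape_literal(row[m.group(1)]), with_nodes)


CSV_TEXT = """\
Order_ID,Customer_Name,Product_Name,Quantity,Price
ORD-001,John Smith,Wireless Earbuds,1,79.99
ORD-002,Sarah Johnson,Smart Watch,1,249.99
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
triples = render_row(TEMPLATE, rows[0]).splitlines()
print("\n".join(triples))
```

Blank-node labels and literals are filled separately because a label such as `Customer_John_Smith` cannot contain the space that the literal `"John Smith"` keeps.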
| 126 | + |
| 127 | +### Query Generation |
| 128 | + |
| 129 | +Once your data is imported, the application automatically generates useful Dgraph Query Language (DQL) queries customized to your specific data model. These queries are designed to: |
| 130 | + |
| 131 | +- Showcase common query patterns for your specific entity types |
| 132 | +- Demonstrate traversal across relationships in your knowledge graph |
| 133 | +- Include appropriate filters and aggregations based on your data structure |
| 134 | + |
| 135 | +Each query can be immediately executed or opened in Dgraph's Ratel UI for further exploration and modification. |
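To give a sense of what a schema-specific query might look like, the Python sketch below builds a DQL query string for one entity type and one relationship. `build_type_query` is a hypothetical helper, and the queries the application really generates may be shaped differently.

```python
def build_type_query(type_name, scalar_fields, edges=None, limit=10):
    """Return a DQL query that lists nodes of one type and follows its edges.

    edges maps a relationship predicate to the fields to fetch on the target.
    """
    edges = edges or {}
    lines = [
        "{",
        f"  {type_name.lower()}s(func: type({type_name}), first: {limit}) {{",
        "    uid",
    ]
    lines += [f"    {type_name}.{field}" for field in scalar_fields]
    for predicate, target_fields in edges.items():
        lines.append(f"    {predicate} {{")
        lines += [f"      {field}" for field in target_fields]
        lines.append("    }")
    lines += ["  }", "}"]
    return "\n".join(lines)


# Orders with the customer who placed them, using the example schema.
query = build_type_query("Order", ["id"], {"PLACED_BY": ["uid", "Customer.name"]})
print(query)
```

The resulting string can be pasted into Ratel's query panel against a database that contains the example data.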
| 136 | + |
| 137 | +### CSV-to-RDF Library |
| 138 | + |
| 139 | +The `csv-to-rdf` library processes CSV files entirely in the browser, using a template-based approach to transform tabular data into RDF triples. It features: |
| 140 | + |
| 141 | +- Memory-efficient chunking to handle large CSV files without browser crashes |
| 142 | +- Real-time progress tracking ideal for responsive UI feedback |
| 143 | +- Template-based transformation that replaces column placeholders with actual values |
| 144 | +- Support for complex entity relationships and data type conversions |
| 145 | +- Streaming processing with minimal memory footprint |
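Chunking with progress reporting can be sketched in Python as below. The library itself runs in the browser, so this only illustrates the idea; `iter_row_chunks` and its `on_progress` callback are hypothetical names.

```python
import csv
import io


def iter_row_chunks(text_stream, chunk_size, on_progress=None):
    """Yield lists of at most chunk_size row dicts from a CSV text stream.

    on_progress, if given, is called with the number of rows read so far
    each time a chunk is ready.
    """
    reader = csv.DictReader(text_stream)
    chunk, done = [], 0
    for row in reader:
        chunk.append(row)
        if len(chunk) == chunk_size:
            done += len(chunk)
            if on_progress:
                on_progress(done)
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter chunk
        done += len(chunk)
        if on_progress:
            on_progress(done)
        yield chunk


data = "Order_ID,Quantity\n" + "".join(f"ORD-{i:03d},1\n" for i in range(1, 6))
progress = []
sizes = [len(chunk) for chunk in iter_row_chunks(io.StringIO(data), 2, progress.append)]
print(sizes, progress)
```

Because the function is a generator, only one chunk of rows is held in memory at a time, which is the property the list above attributes to the library.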
| 146 | + |
| 147 | +### RDF-to-Dgraph Library |
| 148 | + |
| 149 | +The `rdf-to-dgraph` library enables direct browser-to-Dgraph communication without requiring a backend server: |
| 150 | + |
| 151 | +- Uses standard browser fetch APIs to connect directly to Dgraph's HTTP endpoints |
| 152 | +- Handles authentication, schema setup, and data import through a browser-compatible interface |
| 153 | +- Provides detailed import statistics and progress tracking |
| 154 | +- Automatically sets up appropriate relationship directives in the schema |
| 155 | +- Works around browser limitations by breaking large mutations into manageable chunks |
| 156 | +- Fetches current schema information and node type counts for query generation |
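The chunked import can be illustrated with a Python sketch that splits triples into RDF mutation bodies for Dgraph's HTTP `/mutate` endpoint. It uses `urllib` as a stand-in for the browser fetch API, only builds the requests without sending them, and assumes a local Dgraph address; `build_mutation_bodies` and `make_request` are hypothetical helpers, not the library's API.

```python
import urllib.request


def build_mutation_bodies(triples, batch_size):
    """Split N-Quad lines into RDF mutation bodies of at most batch_size lines."""
    bodies = []
    for start in range(0, len(triples), batch_size):
        batch = triples[start:start + batch_size]
        inner = "\n".join("    " + line for line in batch)
        bodies.append("{\n  set {\n" + inner + "\n  }\n}")
    return bodies


def make_request(base_url, body):
    """Build (but do not send) a POST request for one mutation body."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/mutate?commitNow=true",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/rdf"},
        method="POST",
    )


triples = [
    '<_:Order_ORD-001> <dgraph.type> "Order" .',
    '<_:Order_ORD-001> <Order.id> "ORD-001" .',
    '<_:Order_ORD-002> <dgraph.type> "Order" .',
]
bodies = build_mutation_bodies(triples, batch_size=2)
# The address below is an assumed local Dgraph Alpha instance.
requests = [make_request("http://localhost:8080", body) for body in bodies]
print(len(requests), requests[0].full_url)
```

Blank-node labels are scoped to a single mutation in Dgraph, so a real importer also has to carry the UIDs assigned in one batch over to later batches. The sketch leaves that step out.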
| 157 | + |
| 158 | +## Usage |
| 159 | + |
| 160 | +👉 [Import my CSV now!](https://csv-to-knowledge-graph-frontend.vercel.app) |
| 161 | + |
| 162 | +## Powered by Hypermode |
| 163 | + |
| 164 | +Built with ❤️ by [Hypermode](https://hypermode.com/). |