This repository holds the infrastructure-as-code scripts for deploying the Geoconnex knowledge graph. For general information on Geoconnex and a more detailed description of the tech stack, see docs.geoconnex.us.

The full Geoconnex tech stack is composed of:
- gleaner, a Go CLI program that harvests JSON-LD documents from remote sitemaps and stores them in S3
- nabu, a Go CLI program that synchronizes an S3 bucket with a graph database
- scheduler, a full-stack data platform that uses Python and Dagster to run gleaner and nabu on a schedule for each source in the Geoconnex sitemap
- an S3 bucket for storing JSON-LD data
- a graph database, such as GraphDB, for SPARQL queries

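To make the harvesting step more concrete, the following is a minimal Python sketch of the general idea behind it: read page URLs from a sitemap, then pull the embedded JSON-LD out of a page. It is not gleaner's actual implementation (gleaner is written in Go), and the function names, sample sitemap, and `example.org` URLs are all hypothetical placeholders chosen for illustration. It uses only the standard library and works on in-memory strings, so no network access or S3 bucket is needed.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample inputs; a real harvest would fetch these over HTTP.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/features/1</loc></url>
  <url><loc>https://example.org/features/2</loc></url>
</urlset>"""

SAMPLE_PAGE = (
    '<html><head><script type="application/ld+json">'
    '{"@id": "https://example.org/features/1", "@type": "Place"}'
    "</script></head><body></body></html>"
)


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


class _JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._chunks: list[str] = []
        self.documents: list[dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list[dict]:
    """Return all JSON-LD documents embedded in an HTML page."""
    parser = _JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE_SITEMAP):
        print("would harvest:", url)
    print(extract_jsonld(SAMPLE_PAGE))
```

In the real stack, the harvested documents are written to the S3 bucket and then synchronized into the graph database by nabu; this sketch stops at extraction and leaves storage out.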