MonkDB is a unified database platform that supports the following workloads:
- Timeseries workloads.
- Vector workloads.
- Document (JSON) workloads.
- Full Text Search workloads.
- Geospatial workloads.
- Blob (object) workloads.
Users can query using `psql`/`postgresql` SQL statements or our query HTTP API.
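As a minimal sketch of the `psql` route, the snippet below assembles a one-off `psql` invocation against port `5432`, the PostgreSQL port this README asks you to whitelist. The host, the user name `monkdb_user`, and the sample statement are placeholders assumed for illustration; they are not documented MonkDB defaults.

```python
import subprocess


def build_psql_command(host, user, sql, port=5432):
    """Return the argv list for a one-off psql query."""
    return ["psql", "-h", host, "-p", str(port), "-U", user, "-c", sql]


def run_query(host, user, sql, port=5432):
    """Run the statement through psql and return its stdout."""
    result = subprocess.run(
        build_psql_command(host, user, sql, port),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if psql exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    # "monkdb_user" is a hypothetical user name; substitute your own.
    print(build_psql_command("127.0.0.1", "monkdb_user", "SELECT 1;"))
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems when the SQL statement contains spaces or quotes.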
- Ensure you have spun up a cloud instance with 16 GB of RAM and at least 100 GB of SSD storage. In our test environment, we always spin up a `c5.2xlarge` instance from the AWS EC2 instance family. We recommend using its equivalent in your preferred environment.
  - Please ensure Docker Engine and `psql` are installed and active on the spun-up instance. For more information, please refer to Chapter-2 in the documentation section.
- If you prefer localhost (solely for testing), please ensure Docker Engine is installed and active.
  - Also check whether `psql` is installed on localhost. If not, please install it.
  - We have added Docker and `psql` checks to the automation script.
- Once Docker Engine is active, spin up MonkDB's Docker container from its image (after `docker pull`). The instructions are in Chapter-3. Please also follow the other instructions in Chapter-3.
  - Replace `xx.xx.xx.xxx` in config.ini with the spun-up instance's IP address. Please ensure the instance is accessible from your environment.
    - In your security groups or equivalents, please whitelist ports `4200`, `5432`, `HTTP`, and `HTTPS` for your source IP address. You must be able to connect to and communicate with the spun-up instance over these ports. If you want to log in to the spun-up instance, also whitelist the `SSH` port for ingress connections.
  - If it is a local dev environment, please use `127.0.0.1` or `localhost`. However, ensure the above note is implemented.
- Run the async timeseries simulation separately, in standalone mode, as it is based on async live streams. You may interrupt the execution using `KeyboardInterrupt` (Ctrl+C). It is not invoked from the automation scripts. Run the `python3 documentation/timeseries/timeseries_async_data.py` command from the root workspace.
- The vector simulation may take about 30 seconds on the first run and 10-12 seconds from the second run onwards, owing to the use of sentence transformers (ST) from Hugging Face. ST must be loaded every time during data embed calls. The operation would be swift if you are using Cohere, OpenAI, etc. for embedding.
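The prerequisites above (Docker and `psql` installed, the host set in config.ini, and ports `4200` and `5432` reachable) can be sketched as a small preflight check. This is an illustrative sketch, not the project's automation script: the config.ini section name `monkdb` and key `host` are hypothetical, so adjust them to match your file.

```python
import configparser
import shutil
import socket


def missing_tools(tools=("docker", "psql")):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def read_host(config_path, section="monkdb", key="host"):
    """Read the instance address from config.ini.

    The section and key names are assumptions made for this sketch.
    """
    parser = configparser.ConfigParser()
    parser.read(config_path)
    return parser.get(section, key)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open(read_host("config.ini"), 4200)` returning `False` usually points to a security-group rule or a container that is not running. Note that finding `docker` on PATH only shows it is installed, not that the engine is active.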
| Language | Status | Badge | Link (if available) |
|---|---|---|---|
| Python | Released | PyPI | |
| JS/TS | Released | NPM | |
| Rust | TODO | N/A | |
| Java | TODO | N/A | |
| Golang | TODO | N/A | |
- TS/JS examples are in this repo. They show how to use our official SDK to work with MonkDB.
- Users can leverage other PostgreSQL or ORM libraries of their respective stacks as well.
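For instance, a generic PostgreSQL driver can be pointed at port `5432`. The sketch below uses `psycopg2` as one such driver; this choice is an assumption for illustration, and any PostgreSQL client for your stack should follow a similar pattern. The user and database names are placeholders supplied by the caller, not documented MonkDB defaults.

```python
def build_dsn(host, user, dbname, port=5432):
    """Build a libpq keyword/value connection string."""
    return f"host={host} port={port} user={user} dbname={dbname}"


def fetch_one(dsn, sql):
    """Execute `sql` and return the first row of the result."""
    # Imported lazily so build_dsn works without the driver installed.
    import psycopg2  # third-party: pip install psycopg2-binary

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchone()
    finally:
        conn.close()
```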
To follow the MonkDB instructions, please browse the directories below.
- `documentation`: It has instructions on how to work with multi-model data in MonkDB. It also has simulation scripts with synthetic data. We shall segregate this by language once the other SDKs are released.
- `monkdb-sql`: It has instructions on how to use MonkDB's SQL commands and create SQL statements.
- `advanced_concepts`: It has instructions on dealing with advanced concepts in MonkDB. It is a work in progress.
- If you are working in a macOS or Linux environment, please run this shell script.
- However, if you are working in an MS Windows environment, please run this bat script.
- If you have a PowerShell environment, please use this ps1 script to execute the simulations.
Use the `chmod` command to grant execute permissions. `cd` into the directory where this shell script is present.
$ chmod +x get_started.sh
Execute the script directly using the below command.
$ ./get_started.sh
Verify that the file has executable permissions using ls -l. Please note that this is optional.
$ ls -l get_started.sh
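The same grant-then-run sequence can be sketched in Python for POSIX systems, for cases where the script is launched from another program. The helper names below are illustrative and are not part of the repository.

```python
import os
import stat
import subprocess


def make_executable(path):
    """Add execute bits for user, group, and others (similar to chmod +x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def is_executable(path):
    """Return True if the current user may execute the file."""
    return os.access(path, os.X_OK)


def run_script(path):
    """Run the script and return its exit code."""
    return subprocess.run([os.path.abspath(path)]).returncode
```

A typical call sequence would be `make_executable("get_started.sh")` followed by `run_script("get_started.sh")`.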
Batch files are executed natively by the Windows Command Prompt.
Open Command Prompt (`cmd.exe`) and navigate to the directory containing the script.
cd path\to\script\directory
get_started.bat
To run directly, double-click the `.bat` file in File Explorer.
Ensure you have sufficient permissions to execute scripts in the directory.
Open PowerShell as Administrator. Run the below command to bypass restrictions for the current session only.
powershell -ExecutionPolicy Bypass -File get_started.ps1
However, to change the execution policy permanently, open PowerShell as Administrator, check the current execution policy, and then set it to `RemoteSigned`.
Get-ExecutionPolicy
Set-ExecutionPolicy RemoteSigned
Confirm by typing `A` (Yes to All) when prompted.
Execution Policy Options:
- Restricted: No scripts are allowed (default setting on Windows client editions).
- AllSigned: Only signed scripts are allowed.
- RemoteSigned: Local scripts run without restriction, but remote scripts must be signed.
- Unrestricted: All scripts are allowed but with a warning for remote scripts.
Verify the new policy.
Get-ExecutionPolicy
Finally, run our PowerShell script.
.\get_started.ps1
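The three launch paths described above (shell script, batch file, and PowerShell script) can be summarized in a small platform-aware helper. This is a sketch only: the function name and the `use_powershell` flag are illustrative, and the commands it returns are the ones listed in this README.

```python
import platform


def launch_command(system=None, use_powershell=False):
    """Return the argv list that starts the get_started script."""
    system = system or platform.system()
    if system in ("Linux", "Darwin"):
        # Requires execute permission on the script (chmod +x).
        return ["./get_started.sh"]
    if system == "Windows":
        if use_powershell:
            # Bypasses the execution policy for this invocation only.
            return ["powershell", "-ExecutionPolicy", "Bypass",
                    "-File", "get_started.ps1"]
        return ["cmd", "/c", "get_started.bat"]
    raise ValueError(f"unsupported platform: {system}")
```

The returned list can be passed to `subprocess.run` from the directory that contains the scripts.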
MonkDB, like all software, has certain limitations. They are listed in the limitations document.