Commit 172dcef

Merge pull request #108 from dice-group/develop
Code for WCOJ update paper (#107)
2 parents f5f87f8 + b5bc7ec commit 172dcef

27 files changed, 661 additions and 216 deletions

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 cmake_minimum_required(VERSION 3.24)
-project(tentris VERSION 1.4.0
+project(tentris VERSION 1.5.0
         DESCRIPTION "Tentris - A tensor-based Triplestore.")

 include(cmake/boilerplate_init.cmake)

README.MD

Lines changed: 48 additions & 49 deletions
@@ -1,86 +1,85 @@
-# Tᴇɴᴛʀɪs: A Tensor-based Triple Store
+# Tentris: A Tensor-based Triple Store

-<p><img src = "https://tentris.dice-research.org/iswc2020/assets/img/Tentris_logo.svg" alt = "Tᴇɴᴛʀɪs Logo" width = "30%" align = "center"></p>
+Tentris is a tensor-based triplestore that natively supports worst-case optimal joins.

-Tᴇɴᴛʀɪs is a tensor-based RDF triple store with SPARQL support. It is introduced and described in:
-> [Alexander Bigerl, Felix Conrads, Charlotte Behning, Mohamed Ahmed Sherif, Muhammad Saleem and Axel-Cyrille Ngonga Ngomo (2020)
-**Tentris – A Tensor-Based Triple Store.
-** In: The Semantic Web – ISWC 2020](https://tentris.dice-research.org/iswc2020/)
+This is the research version of Tentris. A commercial version is available at https://github.com/tentris/tentris.

-and
+## Features

-> [Alexander Bigerl, Lixi Conrads, Charlotte Behning, Muhammad Saleem and Axel-Cyrille Ngonga Ngomo (2022) Hashing the Hypertrie: Space- and Time-Efficient Indexing for SPARQL in Tensors. In: The Semantic Web – ISWC 2022 Hashing the Hypertrie: Space- and Time-Efficient Indexing for SPARQL in Tensors](https://tentris.dice-research.org/iswc2022/)
+- SPARQL endpoint supporting `SELECT`, `SELECT DISTINCT` and `ASK` queries at `/sparql` and `/stream`. The `WHERE` block
+may contain basic graph patterns and OPTIONAL clauses.
+Queries must be well-designed [Kaminski et al. (2018)](https://doi.org/10.1007/s00224-017-9802-9)
+- Supports `INSERT DATA`/`DELETE DATA` updates that are synchronized with a reader-writer lock at `/update`.
+- Supports streaming results through the `/stream` endpoint for large results.
+- Tentris is a persistent triplestore that uses [metall](https://github.com/LLNL/metall) to manage its disk-based index.
+Note that previous versions were in-memory.

-## Get It
+## Download

-* download [static prebuilt binaries](https://github.com/dice-group/tentris/releases)
-and [try them out](#running-tentris)
-* [build it with docker](#docker)
+We provide pre-built binaries in the [releases](https://github.com/dice-group/tentris-research-project/releases) that run on Linux machines with fairly recent (after 2015) x86_64 CPUs.

-## Running Tᴇɴᴛʀɪs
+## Running

-<details><summary> </summary>
+Initialize Tentris' index with an empty graph with:

-#### Bulk-load Data
+```shell
+./tentris_loader -f <(echo "")
+```

-Provide an NTRIPLE or TURTLE file to build the an index. By default, the index is stored in the current directory. The
-path can be changed with the option `--storage`.
+Or initialize it by loading a turtle or ttl file:

 ```shell
 tentris_loader --file my_nt_file.nt
 ```

-#### Start HTTP endpoint
-
-To start Tᴇɴᴛʀɪs as a HTTP endpoint on port 9080 run now:
+Start the Tentris endpoint at [127.0.0.1:9080/sparql](http://127.0.0.1:9080/sparql):

 ```
 tentris_server -p 9080
 ```

-#### Query
-
-The SPARQL endpoint may now be queried locally at: `127.0.0.1:9080/sparql?query=*your query*`. You can execute queries
-with the following curl command:
+Run a query with :

 ```shell
 curl -G \
 --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o . }' \
 '127.0.0.1:9080/sparql'
 ```

-If you want to type the query in your browser, the query string must be URL encoded. You can use any online URL encoder
-like <https://meyerweb.com/eric/tools/dencoder>.
-
-The following endpoints are available:
-Available endpoints:
+Or update your data with:

-- HTTP GET `/sparql?query=` for normal queries
-- HTTP GET `/stream?query=` for queries with huge results
-- HTTP GET `/count?query=` as a workaround for count (consumes a select query)
-
-</details>
+```shell

-## Docker
+curl -X POST \
+-H 'Content-Type: application/sparql-update' \
+--data 'INSERT DATA { <http://example.org/subject> <http://example.org/predicate> "object" . }' \
+'127.0.0.1:9080/sparql'
+```

-<details><summary> </summary>
+The following endpoints are available:

-Use the [Dockerfile](./Dockerfile) to build tentris.
+- HTTP GET `/sparql` for normal SPARQL queries
+- HTTP GET `/stream` for SPARQL queries with huge results
+- HTTP POST `/update` for `INSERT DATA` and `DELETE DATA` updates

-* A docker image is available on [docker hub](https://hub.docker.com/r/dicegroup/tentris_server). Get it with
-```shell script
-docker build -f Dockerfile .
-docker pull dicegroup/tentris_server
-```
+## Build It

-</details>
+Tentris is known to build on Ubuntu 22.04 and newer.
+Building was tested with Clang 17 & 19. As C++ standard template library, only `libstdc++11` (v13) was tested. For
+details
+refer to the [Dockerfile](./Dockerfile) or GitHub actions.

-## Build It Yourself
+# Research

-<details><summary> </summary>
+- Alexander Bigerl, Liss Heidrich, Nikolaos Karalis and Axel-Cyrille Ngonga Ngomo (2025) **Efficient Updates for
+Worst-Case Optimal Join Triple Stores.** In: The Semantic Web – ISWC
+2025 | [website](https://github.com/dice-group/tentris-wcoj-with-updates/)

-Tᴇɴᴛʀɪs is known to build on Ubuntu 22.04 and newer.
-Building was tested with Clang 17 & 19. As standard library, only libstdc++11 (v13) was tested. For details
-refer to the [Dockerfile](./Dockerfile) or github actions.
+- Alexander Bigerl, Lixi Conrads, Charlotte Behning, Muhammad Saleem and Axel-Cyrille Ngonga Ngomo (2022) **Hashing the
+Hypertrie: Space- and Time-Efficient Indexing for SPARQL in Tensors.** In: The Semantic Web – ISWC
+2022 | [website](https://tentris.dice-research.org/iswc2022/)

-</details>
+- Alexander Bigerl, Felix Conrads, Charlotte Behning, Mohamed Ahmed Sherif, Muhammad Saleem and Axel-Cyrille Ngonga
+Ngomo (2020)
+**Tentris – A Tensor-Based Triple Store.**
+In: The Semantic Web – ISWC 2020 | [website](https://tentris.dice-research.org/iswc2020/)
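
Review note: the new README feature list says that `INSERT DATA`/`DELETE DATA` updates are synchronized with a reader-writer lock. The hunks shown on this page do not include that locking code, so the following is only a minimal sketch of the general pattern with `std::shared_mutex`. `ToyTripleStore` and its member functions are hypothetical names chosen for illustration and are not Tentris' API.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <shared_mutex>
#include <string>
#include <tuple>

// Hypothetical toy store (assumption for illustration, not Tentris' TripleStore).
class ToyTripleStore {
public:
    using Triple = std::tuple<std::string, std::string, std::string>;

    // Writers (think INSERT DATA) hold the lock exclusively.
    bool insert(Triple const &t) {
        std::unique_lock lock{mutex_};
        return triples_.insert(t).second;
    }

    // Writers (think DELETE DATA) hold the lock exclusively.
    bool erase(Triple const &t) {
        std::unique_lock lock{mutex_};
        return triples_.erase(t) > 0;
    }

    // Readers (think queries) share the lock and may run concurrently.
    bool contains(Triple const &t) const {
        std::shared_lock lock{mutex_};
        return triples_.count(t) > 0;
    }

    std::size_t size() const {
        std::shared_lock lock{mutex_};
        return triples_.size();
    }

private:
    mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
    std::set<Triple> triples_;
};
```

With this shape, any number of readers can proceed at once, while an update waits for running readers to finish and blocks new ones until it completes.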

conanfile.py

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ class Recipe(ConanFile):
     generators = "CMakeDeps", "CMakeToolchain"

     def requirements(self):
-        self.requires("hypertrie/0.9.6", transitive_headers=True)
+        self.requires("hypertrie/0.10.0", transitive_headers=True)
         self.requires("rdf4cpp/0.0.27.1", transitive_headers=True)
         self.requires("sparql-parser-base/0.3.6")
         self.requires("unordered_dense/4.4.0", transitive_headers=True, force=True)

libs/endpoint/CMakeLists.txt

Lines changed: 5 additions & 5 deletions
@@ -13,17 +13,17 @@ add_library(${lib}
 src/dice/endpoint/SparqlEndpoint.cpp
 src/dice/endpoint/CountEndpoint.cpp
 src/dice/endpoint/SparqlStreamingEndpoint.cpp
+src/dice/endpoint/SparqlUpdateEndpoint.cpp
 src/dice/endpoint/SparqlQueryCache.cpp
 src/dice/endpoint/Endpoint.cpp
 )
 add_library(${PROJECT_NAME}::${lib_suffix} ALIAS ${lib})

-target_include_directories(${lib}
-PUBLIC
+target_include_directories(${lib} PUBLIC
 $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/src>
 PRIVATE
 ${CMAKE_CURRENT_SOURCE_DIR}/private-include
-)
+)

 target_link_libraries(${lib} PUBLIC
 ${PROJECT_NAME}::triple-store
@@ -34,7 +34,7 @@ target_link_libraries(${lib} PUBLIC
 spdlog::spdlog
 cppitertools::cppitertools
 rapidjson
-)
+)

 include(${CMAKE_SOURCE_DIR}/cmake/install_components.cmake)
-install_component(PUBLIC ${lib_suffix} src)
+install_component(PUBLIC ${lib_suffix} src)
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+#ifndef TENTRIS_PARSESPARQLUPDATEPARAM_HPP
+#define TENTRIS_PARSESPARQLUPDATEPARAM_HPP
+
+#include <spdlog/spdlog.h>
+
+#include <restinio/helpers/http_field_parsers/content-type.hpp>
+#include <restinio/request_handler.hpp>
+#include <restinio/uri_helpers.hpp>
+
+#include <dice/sparql2tensor/UPDATEQuery.hpp>
+
+
+namespace dice::endpoint {
+
+struct update_error : std::runtime_error {
+    explicit update_error(const std::string &message)
+        : std::runtime_error(message) {}
+};
+
+inline sparql2tensor::UPDATEDATAQueryData parse_sparql_update_param(restinio::request_handle_t &req) {
+    using namespace dice::sparql2tensor;
+    using namespace restinio;
+    auto content_type = req->header().opt_value_of(http_field::content_type);
+    auto content_type_value = http_field_parsers::content_type_value_t::try_parse(*content_type);
+    if (not content_type_value.has_value() or
+        content_type_value.value().media_type.type != "application" or
+        content_type_value.value().media_type.subtype != "sparql-update") {
+        throw update_error("Expected content-type: application/sparql-update");
+    }
+    std::string sparql_update_str{req->body()};
+    try {
+        auto update_query = UPDATEDATAQueryData::parse(sparql_update_str);
+        return update_query;
+    } catch (std::exception &ex) {
+        static constexpr auto message = "Value of parameter 'update' is not parsable: ";
+        throw update_error{std::string{message} + ex.what()};
+    } catch (...) {
+        static constexpr auto message = "Unknown error";
+        throw update_error{message};
+    }
+}
+
+}// namespace dice::endpoint
+
+
+#endif//TENTRIS_PARSESPARQLUPDATEPARAM_HPP
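
Review note: in `parse_sparql_update_param`, `*content_type` is dereferenced before anything checks that the request actually carried a `Content-Type` header. `opt_value_of` appears to return an optional, so a request without that header could hit an empty optional. The sketch below shows one defensive way to structure the check using only the standard library. `is_sparql_update_content_type` is a hypothetical helper written for this note. It parses the media type by hand and does not use restinio's parser.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (assumption, not part of the commit): true only if the header
// is present and its media type is application/sparql-update.
inline bool is_sparql_update_content_type(std::optional<std::string_view> header) {
    if (!header.has_value()) {
        return false;  // missing header: reject instead of dereferencing
    }
    std::string_view value = *header;
    // Drop parameters such as "; charset=utf-8".
    value = value.substr(0, value.find(';'));
    // Trim surrounding spaces and tabs.
    auto const first = value.find_first_not_of(" \t");
    if (first == std::string_view::npos) {
        return false;
    }
    auto const last = value.find_last_not_of(" \t");
    value = value.substr(first, last - first + 1);
    // Media types are compared case-insensitively.
    std::string lowered{value};
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lowered == "application/sparql-update";
}
```

In the handler, the same idea would amount to testing the optional first and throwing `update_error` when it is empty, before calling `try_parse`.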

libs/endpoint/private-include/dice/endpoint/SparqlStreamingEndpoint.hpp

Lines changed: 7 additions & 7 deletions
@@ -5,14 +5,14 @@

 namespace dice::endpoint {

-class SPARQLStreamingEndpoint final : public Endpoint {
+class SPARQLStreamingEndpoint final : public Endpoint {

-public:
-SPARQLStreamingEndpoint(tf::Executor &executor, triple_store::TripleStore &triplestore, SparqlQueryCache &sparql_query_cache, EndpointCfg const &endpoint_cfg);
+public:
+SPARQLStreamingEndpoint(tf::Executor &executor, triple_store::TripleStore &triplestore, SparqlQueryCache &sparql_query_cache, EndpointCfg const &endpoint_cfg);

-protected:
-void handle_query(restinio::request_handle_t req, std::chrono::steady_clock::time_point timeout) override;
-};
+protected:
+void handle_query(restinio::request_handle_t req, std::chrono::steady_clock::time_point timeout) override;
+};
 }// namespace dice::endpoint

-#endif//TENTRIS_SPARQLSTREAMINGENDPOINT_HPP
+#endif//TENTRIS_SPARQLSTREAMINGENDPOINT_HPP
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#ifndef TENTRIS_SPARQLUPDATEENDPOINT_HPP
+#define TENTRIS_SPARQLUPDATEENDPOINT_HPP
+
+#include <dice/endpoint/Endpoint.hpp>
+#include <rapidjson/stringbuffer.h>
+
+namespace dice::endpoint {
+
+class SPARQLUpdateEndpoint final : public Endpoint {
+public:
+    SPARQLUpdateEndpoint(tf::Executor &executor, triple_store::TripleStore &triplestore, SparqlQueryCache &sparql_query_cache, EndpointCfg const &endpoint_cfg);
+
+protected:
+    void handle_query(restinio::request_handle_t req, std::chrono::steady_clock::time_point timeout) override;
+};
+
+}// namespace dice::endpoint
+
+#endif//TENTRIS_SPARQLUPDATEENDPOINT_HPP

libs/endpoint/src/dice/endpoint/CountEndpoint.cpp

Lines changed: 17 additions & 17 deletions
@@ -5,25 +5,25 @@

 namespace dice::endpoint {

-CountEndpoint::CountEndpoint(tf::Executor &executor,
-triple_store::TripleStore &triplestore,
-SparqlQueryCache &sparql_query_cache,
-EndpointCfg const &endpoint_cfg)
-: Endpoint(executor, triplestore, sparql_query_cache, endpoint_cfg) {}
+CountEndpoint::CountEndpoint(tf::Executor &executor,
+triple_store::TripleStore &triplestore,
+SparqlQueryCache &sparql_query_cache,
+EndpointCfg const &endpoint_cfg)
+: Endpoint(executor, triplestore, sparql_query_cache, endpoint_cfg) {}

-void CountEndpoint::handle_query(restinio::request_handle_t req, std::chrono::steady_clock::time_point timeout) {
-using namespace dice::sparql2tensor;
-using namespace restinio;
+void CountEndpoint::handle_query(restinio::request_handle_t req, std::chrono::steady_clock::time_point timeout) {
+using namespace dice::sparql2tensor;
+using namespace restinio;

-auto sparql_query = parse_sparql_query_param(req, this->sparql_query_cache_);
-if (not sparql_query)
-return;
+auto sparql_query = parse_sparql_query_param(req, this->sparql_query_cache_);
+if (not sparql_query)
+return;

-auto const count = this->triplestore_.count(*sparql_query, timeout);
+auto const count = this->triplestore_.count(*sparql_query, timeout);

-req->create_response(status_ok())
-.set_body(fmt::format("{}", count))
-.done();
-spdlog::info("HTTP response {}: counted {} results", status_ok(), count);
-}
+req->create_response(status_ok())
+.set_body(fmt::format("{}", count))
+.done();
+spdlog::info("HTTP response {}: counted {} results", status_ok(), count);
+}
 }// namespace dice::endpoint

libs/endpoint/src/dice/endpoint/HTTPServer.cpp

Lines changed: 5 additions & 0 deletions
@@ -3,6 +3,7 @@
 #include <dice/endpoint/CountEndpoint.hpp>
 #include <dice/endpoint/SparqlEndpoint.hpp>
 #include <dice/endpoint/SparqlStreamingEndpoint.hpp>
+#include <dice/endpoint/SparqlUpdateEndpoint.hpp>

 #include <csignal>
 #include <cstring>
@@ -58,6 +59,10 @@ namespace dice::endpoint {
                           SPARQLEndpoint{executor_, triplestore_, sparql_query_cache_, cfg_});
         spdlog::info(" GET /sparql?query= for normal queries");

+        router_->http_post(R"(/sparql)",
+                           SPARQLUpdateEndpoint{executor_, triplestore_, sparql_query_cache_, cfg_});
+        spdlog::info(" POST /sparql for update queries");
+
         router_->http_get(R"(/stream)",
                           SPARQLStreamingEndpoint{executor_, triplestore_, sparql_query_cache_, cfg_});
         spdlog::info(" GET /stream?query= for queries with huge results");
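
Review note: this hunk registers the update handler with `http_post` on the same `/sparql` path that already serves queries through `http_get`, so the HTTP method alone decides which endpoint runs. The sketch below illustrates that dispatch idea with a toy router keyed on the (method, path) pair. `ToyRouter`, `add` and `dispatch` are hypothetical names for illustration and do not reflect how the router used in this codebase is implemented.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy router keyed on (method, path); an assumption for illustration only.
class ToyRouter {
public:
    using Handler = std::function<std::string()>;

    // Register a handler for one method on one path.
    void add(std::string method, std::string path, Handler handler) {
        routes_[{std::move(method), std::move(path)}] = std::move(handler);
    }

    // Run the matching handler, or return "404" when no route matches.
    std::string dispatch(std::string const &method, std::string const &path) const {
        auto const it = routes_.find({method, path});
        return it == routes_.end() ? "404" : it->second();
    }

private:
    std::map<std::pair<std::string, std::string>, Handler> routes_;
};
```

Registering a "GET" and a "POST" handler for `/sparql` on such a router keeps queries and updates on one URL while routing them to separate handlers.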
