
Document API

Cedrick Lunven edited this page Jan 7, 2022 · 33 revisions

Stargate and Astra bring great innovation by allowing Apache Cassandra™ to store JSON documents like a document-oriented NoSQL database. Every document collection uses the same data model, which relies on a document shredding strategy.
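To give an intuition of what "shredding" means, here is a minimal sketch that flattens a nested document into one entry per leaf value, keyed by its path. The exact table layout used by Stargate is not described on this page, so the class name ShreddingSketch, the dot-separated path format, and the use of plain Java maps are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the general idea of shredding; NOT Stargate's actual storage code.
class ShreddingSketch {

    /** Flattens a nested document into (path -> leaf value) entries. */
    static Map<String, Object> shred(Map<String, Object> document) {
        Map<String, Object> rows = new LinkedHashMap<>();
        shred("", document, rows);
        return rows;
    }

    private static void shred(String prefix, Map<?, ?> node, Map<String, Object> rows) {
        for (Map.Entry<?, ?> entry : node.entrySet()) {
            // Assumed path format: segments joined with a dot
            String path = prefix.isEmpty() ? String.valueOf(entry.getKey()) : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map) {
                shred(path, (Map<?, ?>) entry.getValue(), rows); // recurse into nested objects
            } else {
                rows.put(path, entry.getValue());                // leaf: one row per value
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zipCode", 75000);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("firstname", "John");
        person.put("address", address);
        shred(person).forEach((path, value) -> System.out.println(path + " = " + value));
    }
}
```

Because every document is reduced to the same kind of rows, a single table shape can hold documents with different structures.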

ApiDocumentClient Initialization

Initialization of the main client objects (AstraClient and StargateClient) is detailed on the Home page. From here on, the sample code reuses those classes without initializing them again.

ApiDocumentClient is the core class for working with documents.

// Option 1. Retrieved from AstraClient
ApiDocumentClient apiDocClient1 = astraClient.apiStargateDocument();
ApiDocumentClient apiDocClient2 = astraClient.getStargateClient().apiDocument();

// Option 2. Retrieved from StargateClient
ApiDocumentClient apiDocClient3 = stargateClient.apiDocument();

// Option 3. Built from the endpoint and credentials
ApiDocumentClient apiDocClient4 = new ApiDocumentClient("http://api_endpoint", "apiToken");
ApiDocumentClient apiDocClient5 = new ApiDocumentClient("http://api_endpoint",
  new TokenProviderDefault("username", "password", "http://auth_endpoint"));

For the rest of the document apiDocClient will refer to ApiDocumentClient but the initialization code will not be duplicated.

Working with Namespaces

Namespace is the term used for keyspaces when dealing with the Document API.

DocumentApiIntegrationTest is the unit test class for this API where you can find more sample usage of the SDK.

✅. List Namespaces Names

Stream<String> namespaces = apiDocClient.namespaceNames();

✅. List Namespaces objects

Reference Api documentation

Stream<Namespace> namespaces = apiDocClient.namespaces();

The Namespace class provides the replication factor and/or the datacenter list for a namespace.

public class Keyspace {
    protected String name;
    protected Integer replicas;
    protected List<DataCenter> datacenters;
    // Getters and setters
}

✅. Find Namespace by its name

Reference Api documentation

Here the parameter ns1 is the unique identifier of the namespace.

Optional<Namespace> ns1 = apiDocClient.namespace("ns1").find();

✅. Test if Namespace exists

apiDocClient.namespace("ns1").exist();

✅. Create Namespace

🚨 As of today, in Astra, namespace and keyspace creation is only available through the DevOps API or the user interface.

// Create a namespace with a single DC dc-1
DataCenter dc1 = new DataCenter("dc-1", 1);
apiDocClient.namespace("ns1").create(dc1);

// Create a namespace providing only the replication factor
apiDocClient.namespace("ns1").createSimple(3);

✅. Delete a namespace

🚨 As of today, namespace / keyspace deletion is not available in Astra.

apiDocClient.namespace("ns1").delete();

ℹ️ Tips

You can simplify the code by assigning apiDocClient.namespace("ns1") to a NamespaceClient variable as shown below:

NamespaceClient ns1Client = astraClient.apiStargateDocument().namespace("ns1");
        
// Create if not exist
if (!ns1Client.exist()) {
  ns1Client.createSimple(3);
}
        
// Show datacenters where it lives
ns1Client.find().get().getDatacenters()
         .stream().map(DataCenter::getName)
         .forEach(System.out::println); 
        
// Delete 
ns1Client.delete();

Working with Collections

The related Api Documentation is available here

✅. List available collections in a namespace

// We can create a local variable to shorten the code.
NamespaceClient ns1Client = apiDocClient.namespace("ns1");
Stream<String> colNames   = ns1Client.collectionNames();

✅. Check if collection exists

CollectionClient col1Client = apiDocClient.namespace("ns1").collection("col1");
boolean colExist = col1Client.exist();

✅. Retrieve a collection from its name

Optional<CollectionDefinition> col1Def = apiDocClient.namespace("ns1").collection("col1").find();

✅. Create an empty collection

apiDocClient.namespace("ns1").collection("col1").create();

✅. Delete a collection

apiDocClient.namespace("ns1").collection("col1").delete();

ℹ️ Tips

You can simplify the code by assigning apiDocClient.namespace("ns1").collection("col1") to a CollectionClient variable:

CollectionClient colClient = apiDocClient.namespace("ns1").collection("col1");
colClient.exist();
//...

In the following sections, we assume that a CollectionClient has been initialized. The samples name it colClient, cp, or colPersonClient depending on the section.

Working with Documents

📘. About Document

In the Stargate Document API, each document is retrieved as a JSON payload together with its unique identifier (UUID).

{
  "data": {
    "9e14db1c-0a05-47d2-9f27-df881f7f37ab": { "p1": "v1", "p2": "v2"},
    "9e14db1c-0a05-47d2-9f27-df881f7f37ac": { "p1": "v11", "p2": "v21"},
    "9e14db1c-0a05-47d2-9f27-df881f7f37ad": { "p1": "v12", "p2": "v22"}
  }
}

This SDK provides the class Document as a wrapper to give you both documentId (unique identifier) and document (payload).

public class Document<T> {
  private String documentId;
  private T document;
  // Constructor, Getters, Setters
}
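As a small sketch of how the "data" map of a response relates to this wrapper, the snippet below turns an id-to-payload map into a list of wrappers. The nested Document class is a simplified copy made for this example, and the wrap helper is hypothetical; neither is the SDK's own code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DocumentWrapperSketch {

    // Simplified copy of the wrapper described above, for illustration only.
    static final class Document<T> {
        private final String documentId;
        private final T document;

        Document(String documentId, T document) {
            this.documentId = documentId;
            this.document = document;
        }

        String getDocumentId() { return documentId; }
        T getDocument() { return document; }
    }

    /** Hypothetical helper: turns the "data" map (id -> payload) into a list of wrappers. */
    static <T> List<Document<T>> wrap(Map<String, T> data) {
        List<Document<T>> docs = new ArrayList<>();
        data.forEach((id, payload) -> docs.add(new Document<>(id, payload)));
        return docs;
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("9e14db1c-0a05-47d2-9f27-df881f7f37ab", "{ \"p1\": \"v1\", \"p2\": \"v2\"}");
        data.put("9e14db1c-0a05-47d2-9f27-df881f7f37ac", "{ \"p1\": \"v11\", \"p2\": \"v21\"}");
        for (Document<String> doc : wrap(data)) {
            System.out.println(doc.getDocumentId() + " -> " + doc.getDocument());
        }
    }
}
```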

📘. Paging

Due to the verbose nature of the Document API, a single request can retrieve at most 20 items. As such, every request is paged. If the number of records is greater than the page size, the response contains a field called pagingState.

{
  "pagingState": "jhfekwfkwejefejwhkjewhehwrjhewjkrhewjrhewklrhewklrhewj",
  "data": {
    "9e14db1c-0a05-47d2-9f27-df881f7f37ab": { "p1": "v1", "p2": "v2"},
    "9e14db1c-0a05-47d2-9f27-df881f7f37ac": { "p1": "v11", "p2": "v21"},
    "9e14db1c-0a05-47d2-9f27-df881f7f37ad": { "p1": "v12", "p2": "v22"}
  }
}

You need to provide the pagingState value to request the next page of the same query.

Page<Document<String>> page1 = cp.findPage(query);

query.setPageState(page1.getPageState().get());
Page<Document<String>> page2 = cp.findPage(query);

🚨 The following chapters introduce findAll methods. Under the hood they do the same thing as above: fetch one page after another until the dataset is exhausted. This can be slow, so use them with caution.
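The "one page after another" loop can be sketched independently of the SDK. In the example below, the Page class, the fakeBackend function, and the fetchAll helper are hypothetical stand-ins written for this illustration; the fake backend simply uses the next list index as its paging state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; these are NOT the SDK's Page or client classes.
class PagingSketch {

    /** One page of results plus an optional token pointing at the next page. */
    static final class Page<T> {
        final List<T> results;
        final Optional<String> pageState;

        Page(List<T> results, Optional<String> pageState) {
            this.results = results;
            this.pageState = pageState;
        }
    }

    /** Fake backend: serves data in slices of pageSize, using the next index as paging state. */
    static Function<Optional<String>, Page<String>> fakeBackend(List<String> data, int pageSize) {
        return state -> {
            int from = state.map(Integer::parseInt).orElse(0);
            int to = Math.min(from + pageSize, data.size());
            Optional<String> next = to < data.size() ? Optional.of(String.valueOf(to)) : Optional.empty();
            return new Page<>(new ArrayList<>(data.subList(from, to)), next);
        };
    }

    /** Fetches page after page until no paging state is returned. */
    static <T> List<T> fetchAll(Function<Optional<String>, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        Optional<String> state = Optional.empty();
        do {
            Page<T> page = fetchPage.apply(state); // one round trip per page
            all.addAll(page.results);
            state = page.pageState;
        } while (state.isPresent());
        return all;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("d1", "d2", "d3", "d4", "d5");
        // With a page size of 2, three round trips are needed to collect the five items.
        System.out.println(fetchAll(fakeBackend(docs, 2)));
    }
}
```

Each loop iteration corresponds to one request, which is why exhausting a large collection this way can take a while.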

📘. Object Mapping

Document payloads can be deserialized as beans or left unchanged as JSON. To build the expected beans you can either rely on Jackson or implement your own custom DocumentMapper. The sample code below illustrates both.

// Retrieve data as JSON
Page<Document<String>> pageOfJsonRecords = cp.findPage();

// Retrieve data with default Jackson Mapper
Page<Document<Person>> pageOfPersonRecords1 = cp.findPage(Person.class);

// Retrieve data with a custom Mapper
Page<Document<Person>> pageOfPersonRecords2 = cp.findPage(new DocumentMapper<Person>() {
  public Person map(String record) {
     return new Person();
  }
});

✅. Search Documents in a collection with Paging

The Document API allows searching on any field of a document. A where clause is expected. In the REST API the parameter looks like: {"age": {"$gte":30},"lastname": {"$eq":"PersonAstra2"}}. In the SDK, dedicated queries and builders are provided: Query for full scans and PageableQuery for paged searches.
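To make the shape of the where clause concrete, here is a minimal sketch that assembles the JSON shown above from individual conditions. The class WhereClauseSketch and its gte, eq, and toJson methods are hypothetical and exist only for this example; in real code the SDK builders described next should be preferred.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; it is NOT part of the SDK.
class WhereClauseSketch {

    // field name -> {"$operator":value} fragment, kept in insertion order
    private final Map<String, String> conditions = new LinkedHashMap<>();

    WhereClauseSketch gte(String field, Number value) {
        conditions.put(field, "{\"$gte\":" + value + "}");
        return this;
    }

    WhereClauseSketch eq(String field, String value) {
        // Escape backslashes and double quotes so the value stays a valid JSON string
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        conditions.put(field, "{\"$eq\":\"" + escaped + "\"}");
        return this;
    }

    String toJson() {
        return conditions.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\": " + e.getValue())
            .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        String where = new WhereClauseSketch()
            .gte("age", 30)
            .eq("lastname", "PersonAstra2")
            .toJson();
        // prints {"age": {"$gte":30},"lastname": {"$eq":"PersonAstra2"}}
        System.out.println(where);
    }
}
```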

  • Build your query with PageableQuery
PageableQuery query = PageableQuery.builder()
  .selectAll()
  // alternatively select("field1", "field2", ...)
  .where("firstName").isEqualsTo("John")
  .and("lastName").isEqualsTo("Connor")
  .pageSize(3)
  //.pageState() used to get next pages
  .build();
  • Retrieve your Page<T> with findPage(PageableQuery query). If you do not provide any marshaller you get a JSON String.
Page<Document<String>> page1 = cp.findPage(query);

// Use pagingState in page1 to retrieve page2
if (page1.getPageState().isPresent()) {
  query.setPageState(page1.getPageState().get());
  Page<Document<String>> page2 = cp.findPage(query);
}
  • Retrieve your Page<T> with findPage(PageableQuery query, Class<T> class) using default Jackson Mapper
Page<Document<Person>> page1 = cp.findPage(query, Person.class);

// Use pagingState in page1 to retrieve page2
if (page1.getPageState().isPresent()) {
  query.setPageState(page1.getPageState().get());
  Page<Document<Person>> page2 = cp.findPage(query, Person.class);
}
  • Retrieve your Page<T> with findPage(PageableQuery query, DocumentMapper<T>) using your custom mapping
public static class PersonMapper implements DocumentMapper<Person> {
  @Override
  public Person map(String record) {
    Person p = new Person();
    // custom logic
    return p;
  }    
}

Page<Document<Person>> page1 = cp.findPage(query, new PersonMapper());

✅. Search Documents in a collection without Paging

  • Build your query with Query
Query query = Query.builder()
  .selectAll()
  // alternatively select("field1", "field2", ...)
  .where("firstName").isEqualsTo("John")
  .and("lastName").isEqualsTo("Connor")
  .build();
  • Retrieve Stream<T> with findAll(Query query), if you do not provide any marshaller you get a Json String.
Stream<Document<String>> result = cp.findAll(query);
  • Retrieving all documents of a collection is also possible; it is the default query
// Get all documents
Stream<Document<String>> allDocs1 = cp.findAll();

// Equivalent to 
Stream<Document<String>> allDocs2 = cp.findAll(Query.builder().build());
  • Retrieve your Stream<T> with findAll(Query query, Class<T> class) using default Jackson Mapper
Stream<Document<Person>> res1 = cp.findAll(query, Person.class);
  • Retrieve your Stream<T> with findAll(Query query, DocumentMapper<T>) using your custom mapping
public static class PersonMapper implements DocumentMapper<Person> {
  @Override
  public Person map(String record) {
    Person p = new Person();
    // custom logic
    return p;
  }    
}

Stream<Document<Person>> res2 = cp.findAll(query, new PersonMapper());

Here are the class definitions for the beans used in the samples.
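Those definitions are not reproduced on this page. A minimal sketch, inferred only from the constructor calls and JSON samples used here, could look like the following; the field names, types, and the BeansSketch demo class are assumptions, and the real beans may differ.

```java
// Hypothetical bean definitions inferred from the samples on this page; the real classes may differ.
class BeansSketch {
    public static void main(String[] args) {
        Person john = new Person("John", "Doe", 20, new Address("Paris", 75000));
        System.out.println(john.getFirstname() + " lives in " + john.getAddress().getCity());
    }
}

class Person {
    private String firstname;
    private String lastname;
    private int age;
    private Address address;

    public Person() {} // no-arg constructor, typically required by bean mappers

    public Person(String firstname, String lastname, Address address) {
        this(firstname, lastname, 0, address);
    }

    public Person(String firstname, String lastname, int age, Address address) {
        this.firstname = firstname;
        this.lastname = lastname;
        this.age = age;
        this.address = address;
    }

    public String getFirstname() { return firstname; }
    public void setFirstname(String firstname) { this.firstname = firstname; }
    public String getLastname() { return lastname; }
    public void setLastname(String lastname) { this.lastname = lastname; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
    public Address getAddress() { return address; }
    public void setAddress(Address address) { this.address = address; }
}

class Address {
    private String city;
    private int zipCode;

    public Address() {}

    public Address(String city, int zipCode) {
        this.city = city;
        this.zipCode = zipCode;
    }

    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
    public int getZipCode() { return zipCode; }
    public void setZipCode(int zipCode) { this.zipCode = zipCode; }
}
```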

✅. Get a document by id

// doc1 is the document Id in the collection
boolean docExist = colPersonClient.document("doc1").exist();

// Find returns an optional
Optional<Person> p = colPersonClient.document("doc1").find(Person.class);

✅. Create a new document with no id

// Define an object
Person john = new Person("John", "Doe", 20, new Address("Paris", 75000));

// As no id has been provided, the API will create a UUID and return it to you
String docId = colPersonClient.createNewDocument(john);

✅. Upsert a document enforcing the id

// Define an object
Person john2 = new Person("John", "Doe", 20, new Address("Paris", 75000));

// Now the id is provided (myId) and we can upsert
String docId = colPersonClient.document("myId").upsert(john2, Person.class);

✅. Delete a Document from its id

colPersonClient.document("myId").delete();

✅. Count Documents

🚨 This operation can be slow. Every query to the API is paged. The method fetches pages one after another (limiting the payload size as much as possible) for as long as more pages remain, and finally counts the results.

int docNum = colPersonClient.count();

✅. Find part of a document

The Document API allows working with nested structures in a document. You provide the path to the nested element in the URL:

http://{doc-api-endpoint}/namespaces/{namespace-id}/collections/{collection-id}/{document-id}/{document-path}

Given a JSON document with UUID e8c5021b-2c91-4015-aec6-14a16e449818:

{
  "age": 25,
  "firstname": "PersonAstra5",
  "lastname": "PersonAstra1",
  "address": {
    "city": "Paris",
    "zipCode": 75000
  }
}

You can retrieve the zipCode with: http://{doc-api-endpoint}/namespaces/ns1/collections/person/e8c5021b-2c91-4015-aec6-14a16e449818/address/zipCode
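The URL template above can be assembled mechanically. The helper below is a hypothetical sketch (the class SubDocumentUrlSketch and the placeholder endpoint http://doc-api-endpoint are assumptions) and it assumes that no path segment needs URL escaping.

```java
// Hypothetical helper for illustration; the SDK builds these URLs for you.
class SubDocumentUrlSketch {

    /** Joins endpoint, namespace, collection, document id and the nested path segments. */
    static String subDocumentUrl(String endpoint, String namespace, String collection,
                                 String documentId, String... path) {
        StringBuilder url = new StringBuilder(endpoint)
            .append("/namespaces/").append(namespace)
            .append("/collections/").append(collection)
            .append('/').append(documentId);
        for (String segment : path) {
            url.append('/').append(segment); // assumption: segments need no escaping
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(subDocumentUrl(
            "http://doc-api-endpoint", "ns1", "person",
            "e8c5021b-2c91-4015-aec6-14a16e449818", "address", "zipCode"));
    }
}
```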

The SDK provides some utility methods to work with this as well:

// Retrieve an object and marshall
Optional<Address> address = colPersonClient
   .document("e8c5021b-2c91-4015-aec6-14a16e449818")
   .findSubDocument("address", Address.class);
        
// Retrieve a scalar deeper in the tree
Optional<Integer> zipcode = colPersonClient
  .document("e8c5021b-2c91-4015-aec6-14a16e449818")
  .findSubDocument("address/zipCode", Integer.class);
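To illustrate what a path such as address/zipCode resolves to, here is a small sketch that walks a nested map one segment at a time. It works on plain Java maps rather than on the API, and the class SubDocumentPathSketch and its resolve method are hypothetical names used only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of path resolution; it does not call the Document API.
class SubDocumentPathSketch {

    /** Walks the document one path segment at a time; empty if any segment is missing. */
    static Optional<Object> resolve(Map<String, Object> document, String path) {
        Object current = document;
        for (String segment : path.split("/")) {
            if (!(current instanceof Map)) {
                return Optional.empty(); // reached a scalar before the path ended
            }
            current = ((Map<?, ?>) current).get(segment);
            if (current == null) {
                return Optional.empty(); // unknown attribute
            }
        }
        return Optional.of(current);
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zipCode", 75000);
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("age", 25);
        person.put("firstname", "PersonAstra5");
        person.put("lastname", "PersonAstra1");
        person.put("address", address);

        System.out.println(resolve(person, "address"));         // the nested object
        System.out.println(resolve(person, "address/zipCode")); // a scalar deeper in the tree
        System.out.println(resolve(person, "address/country")); // missing attribute
    }
}
```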

✅. Update a sub document

// Update an existing attribute of the JSON
colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .updateSubDocument("address", new Address("city2", 8000));

// Create a new attribute in the document
colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .updateSubDocument("secondAddress", new Address("city2", 8000));

✅. Delete part of a document

colPersonClient.document("e8c5021b-2c91-4015-aec6-14a16e449818")
               .deleteSubDocument("secondAddress");

Document Repository

📘. StargateDocumentRepository overview

If you have worked with Spring Data or Active Record before, you might already know what repositories are. They are classes that provide CRUD (create, read, update, delete) operations without you having to code anything.

It is no different here: if you provide an object for a collection, this is what is available to you:

public interface StargateDocumentRepository <DOC> {
   
   // Create
   String insert(DOC p);
   void insert(String docId, DOC doc);
   
   // Read unitary
   boolean exists(String docId);
   Optional<DOC> find(String docId);

   // Read records
   int count();
   DocumentResultPage<DOC> findPage();
   DocumentResultPage<DOC> findPage(SearchDocumentQuery query) ;
   Stream<ApiDocument<DOC>> findAll();
   Stream<ApiDocument<DOC>> findAll(SearchDocumentQuery query);

  // Update
  void save(String docId, DOC doc);

  // Delete
  void delete(String docId);
}
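To get a feel for this contract without a running Stargate instance, the operations can be mimicked with an in-memory map. The class InMemoryRepositorySketch below is a hypothetical stand-in written for this page: it is not part of the SDK, it leaves out the paged queries, and its findAll returns the payloads directly rather than ApiDocument wrappers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.stream.Stream;

// Hypothetical in-memory stand-in; it does not call Stargate and is not part of the SDK.
class InMemoryRepositorySketch<DOC> {

    private final Map<String, DOC> store = new LinkedHashMap<>();

    // Create: generate an id when none is provided
    String insert(DOC doc) {
        String docId = UUID.randomUUID().toString();
        store.put(docId, doc);
        return docId;
    }

    void insert(String docId, DOC doc) { store.put(docId, doc); }

    // Read unitary
    boolean exists(String docId) { return store.containsKey(docId); }
    Optional<DOC> find(String docId) { return Optional.ofNullable(store.get(docId)); }

    // Read records (simplified: no paging, no query)
    int count() { return store.size(); }
    Stream<DOC> findAll() { return store.values().stream(); }

    // Update
    void save(String docId, DOC doc) { store.put(docId, doc); }

    // Delete
    void delete(String docId) { store.remove(docId); }

    public static void main(String[] args) {
        InMemoryRepositorySketch<String> repo = new InMemoryRepositorySketch<>();
        String generatedId = repo.insert("{\"firstname\":\"John\"}");
        repo.save("myId", "{\"firstname\":\"Sarah\"}");
        System.out.println(repo.exists(generatedId) + ", count=" + repo.count());
        repo.delete("myId");
        System.out.println(repo.find("myId").isPresent() + ", count=" + repo.count());
    }
}
```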

✅. Initialization of repository

// Initialization (from namespaceClients)
NamespaceClient ns1Client = astraClient.apiStargateDocument().namespace("ns1");
StargateDocumentRepository<Person> personRepository1 = 
  new StargateDocumentRepository<Person>(ns1Client, Person.class);

Points to note:

  • No collection name is provided here. By default the SDK will use the class name in lower case (here person)
  • If you want to override the collection name you can annotate your bean Person with @Collection("my_collection_name")
// Initialization from CollectionClient, no ambiguity on collection name
CollectionClient colPersonClient = astraClient.apiStargateDocument()
 .namespace("ns1").collection("person");
StargateDocumentRepository<Person> personRepository2 = 
  new StargateDocumentRepository<Person>(colPersonClient, Person.class);
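The naming rule described in the points above (lower-cased class name by default, annotation value as an override) can be sketched with plain reflection. The @Collection annotation declared below is a local, hypothetical stand-in for the SDK's annotation, and collectionNameFor is an illustrative helper, not the SDK's actual resolution code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Locale;

// Hypothetical illustration of the naming rule; the SDK's own logic may differ in details.
class CollectionNameSketch {

    // Local stand-in for the SDK's @Collection annotation
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Collection {
        String value();
    }

    static class Person {}

    @Collection("my_collection_name")
    static class AnnotatedPerson {}

    /** Annotation value if present, otherwise the simple class name in lower case. */
    static String collectionNameFor(Class<?> beanClass) {
        Collection annotation = beanClass.getAnnotation(Collection.class);
        return annotation != null
            ? annotation.value()
            : beanClass.getSimpleName().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(collectionNameFor(Person.class));          // default rule
        System.out.println(collectionNameFor(AnnotatedPerson.class)); // overridden by the annotation
    }
}
```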

✅. CRUD

We assume that the repository has been initialized as described above and named personRepo.

if (!personRepo.exists("Cedrick")) {
  personRepo.save("Cedrick", new Person("Cedrick", "Lunven", new Address()));
}

// Print the first name of every person in the collection
personRepo.findAll()                     // Stream<ApiDocument<Person>>
          .map(ApiDocument::getDocument) // Stream<Person>
          .map(Person::getFirstname)     // Stream<String>
          .forEach(System.out::println);