Data Model

ER Diagram of DATS SQL Database

Table Overview

Source Document

Tracks individual files and their various processing states such as text extraction or thumbnail generation.
Contains the raw file content, HTML structure, and the technical token/sentence arrays required for precise annotation.
Monitors the background progress of automated tasks for each document.

Annotations

Acts as the bridge between users and the documents they are working on.
Serves as the central "Unit of Work," linking a specific user to a document within a project to allow independent multi-user annotation.
Stores the exact coordinates and content for highlights, bounding boxes, and sentence labels.
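
The "Unit of Work" idea can be pictured with a minimal sketch. All class, field, and function names below are hypothetical and chosen for illustration only; they are not taken from the DATS schema. The sketch assumes span annotations are stored as character offsets into the document text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpanAnnotation:
    code: str    # label applied to the span
    begin: int   # character offset, inclusive
    end: int     # character offset, exclusive
    text: str    # the highlighted content


@dataclass
class AnnotationDocument:
    # Hypothetical unit of work: one user working on one document in one project.
    user_id: int
    source_document_id: int
    project_id: int
    spans: List[SpanAnnotation] = field(default_factory=list)


def annotate(adoc: AnnotationDocument, content: str, code: str, begin: int, end: int) -> SpanAnnotation:
    """Attach a span to this user's unit of work only."""
    if not 0 <= begin < end <= len(content):
        raise ValueError("span offsets fall outside the document")
    span = SpanAnnotation(code, begin, end, content[begin:end])
    adoc.spans.append(span)
    return span


content = "The quick brown fox"
alice = AnnotationDocument(user_id=1, source_document_id=10, project_id=100)
bob = AnnotationDocument(user_id=2, source_document_id=10, project_id=100)
annotate(alice, content, "ANIMAL", 16, 19)
```

Because each user holds a separate unit of work for the same document, Alice's span does not appear in Bob's list.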

Project & User Management

Acts as the top-level container for all data; everything from documents to codes is scoped to a specific project.
Stores essential profile information such as names, emails, and secure hashed credentials.
Manages the access levels and permissions connecting users to their respective workspaces.

Code & Tag

Defines the labels available for annotation, including their hierarchy and visual properties like color.
Provides a lightweight way to organize documents (e.g., "To Review", "Phase 1") outside of the formal codebook.
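
As an illustration of a code hierarchy, the sketch below assumes each code holds an optional reference to its parent. The `Code` class, its fields, and `code_path` are hypothetical names, not the actual DATS definitions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Code:
    id: int
    name: str
    color: str                       # visual property, e.g. a hex color
    parent_id: Optional[int] = None  # None marks a top-level code


def code_path(codes: Dict[int, Code], code_id: int) -> str:
    """Walk parent links upward and return the full path of a code."""
    names = []
    seen = set()
    current = codes.get(code_id)
    while current is not None and current.id not in seen:
        seen.add(current.id)  # guard against accidental cycles
        names.append(current.name)
        current = codes.get(current.parent_id) if current.parent_id is not None else None
    return " / ".join(reversed(names))


codes = {
    1: Code(1, "Emotion", "#ff0000"),
    2: Code(2, "Joy", "#ffaa00", parent_id=1),
}
```

Here `code_path(codes, 2)` walks from "Joy" up to its parent "Emotion".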

Classifiers

Manages the entire AI-assisted labeling lifecycle from training to prediction.
Stores the definition and training state of specific machine learning models.
Tracks model performance metrics such as accuracy, F1 score, and precision.
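
For reference, the metrics named above can be computed as in the following sketch. It assumes a single positive class and plain Python lists of labels; the function name and the shape of the returned dictionary are illustrative, and how DATS itself computes or stores these values is not specified here.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


metrics = evaluate([1, 1, 0, 0], [1, 0, 1, 0])
```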

Metadata

Provides a robust system for extending document information without altering the database schema.
Defines the "schema" of custom fields (keys and types) available within a specific project.
Stores the actual values—ranging from integers and strings to dates—associated with each document.
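
The split between field definitions and field values can be sketched with an in-memory SQLite database. This is a simplified illustration under stated assumptions: the table names, column names, and the three type labels are hypothetical and are not copied from the DATS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project_metadata (          -- hypothetical: field definitions per project
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL,
    key TEXT NOT NULL,
    metatype TEXT NOT NULL CHECK (metatype IN ('INTEGER', 'STRING', 'DATE')),
    UNIQUE (project_id, key)
);
CREATE TABLE document_metadata (         -- hypothetical: one value per document and field
    document_id INTEGER NOT NULL,
    project_metadata_id INTEGER NOT NULL REFERENCES project_metadata(id),
    int_value INTEGER,
    str_value TEXT,
    date_value TEXT,
    PRIMARY KEY (document_id, project_metadata_id)
);
""")

VALUE_COLUMN = {"INTEGER": "int_value", "STRING": "str_value", "DATE": "date_value"}


def define_field(project_id, key, metatype):
    """Add a custom field to a project; no ALTER TABLE is needed."""
    cur = conn.execute(
        "INSERT INTO project_metadata (project_id, key, metatype) VALUES (?, ?, ?)",
        (project_id, key, metatype),
    )
    return cur.lastrowid


def set_value(document_id, field_id, value):
    """Store a value in the column that matches the field's declared type."""
    (metatype,) = conn.execute(
        "SELECT metatype FROM project_metadata WHERE id = ?", (field_id,)
    ).fetchone()
    column = VALUE_COLUMN[metatype]  # taken from a fixed mapping, never from user input
    conn.execute(
        f"INSERT OR REPLACE INTO document_metadata "
        f"(document_id, project_metadata_id, {column}) VALUES (?, ?, ?)",
        (document_id, field_id, value),
    )


def get_values(document_id):
    """Return all metadata of a document as a key-to-value dictionary."""
    rows = conn.execute(
        "SELECT pm.key, pm.metatype, dm.int_value, dm.str_value, dm.date_value "
        "FROM document_metadata dm "
        "JOIN project_metadata pm ON pm.id = dm.project_metadata_id "
        "WHERE dm.document_id = ?",
        (document_id,),
    )
    return {
        key: {"INTEGER": i, "STRING": s, "DATE": d}[metatype]
        for key, metatype, i, s, d in rows
    }
```

Adding a new field is an insert into the definitions table, so the schema itself stays unchanged.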

Memo

Facilitates qualitative research and detailed documentation through user-created notes.
Stores insights and observations that can be starred for importance.
Acts as a universal link, enabling a single note to be attached to any database object, such as a document or a code.
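
One way to picture such a universal link is a small handle that records the type and id of the target object. The sketch below is an assumption made for illustration; the names `ObjectHandle`, `Memo`, `ATTACHABLE_TYPES`, and `memos_for` are hypothetical, and the real tables may implement the link differently.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical set of object kinds a memo may be attached to.
ATTACHABLE_TYPES = {"source_document", "code", "tag", "project"}


@dataclass(frozen=True)
class ObjectHandle:
    object_type: str
    object_id: int

    def __post_init__(self):
        if self.object_type not in ATTACHABLE_TYPES:
            raise ValueError(f"cannot attach memos to {self.object_type!r}")


@dataclass
class Memo:
    title: str
    content: str
    attached_to: ObjectHandle
    starred: bool = False  # marks the memo as important


def memos_for(memos: List[Memo], handle: ObjectHandle) -> List[Memo]:
    """Return every memo attached to the given object."""
    return [m for m in memos if m.attached_to == handle]


memos = [
    Memo("First read", "Interview is unusually short.", ObjectHandle("source_document", 10)),
    Memo("Definition", "Use only for explicit statements.", ObjectHandle("code", 2), starred=True),
]
```

The same `Memo` type serves documents and codes alike; only the handle differs.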

Analyses

Records settings and configurations for tracking data trends over time.
Stores specific parameters for analyzing how thematic concepts evolve through the document corpus.

Cluster

Enables automated organization of large datasets through machine learning.
Defines the parameters and models used for embedding and clustering documents.
Represents the calculated groups, including metadata like top descriptive words and spatial coordinates.
Maps individual documents to these clusters and records the strength of their relationship.
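
The document-to-cluster mapping can be sketched as follows. The example assumes that the strength of the relationship is the cosine similarity between a document embedding and a cluster centroid; this is an illustrative choice, since the text above does not say how the strength is computed. All names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_documents(doc_embeddings, centroids):
    """Map each document to its most similar cluster and record the similarity."""
    assignments = []
    for doc_id, embedding in doc_embeddings.items():
        cluster_id, similarity = max(
            ((cid, cosine(embedding, centroid)) for cid, centroid in centroids.items()),
            key=lambda pair: pair[1],
        )
        assignments.append(
            {"document_id": doc_id, "cluster_id": cluster_id, "similarity": similarity}
        )
    return assignments


doc_embeddings = {1: [1.0, 0.0], 2: [0.0, 1.0]}
centroids = {"a": [1.0, 0.1], "b": [0.0, 1.0]}
assignments = assign_documents(doc_embeddings, centroids)
```

Each resulting row corresponds to one entry of the mapping: a document, its cluster, and a numeric strength.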

Summary

In summary, the Project and User tables control access; Source Document and Metadata organize the input data; Annotation and Memo capture the research work; and Analyses, Cluster, and Classifier provide tools for automated discovery and AI integration.
