book/00-introduction/05-executive-summary.md
+1 -1 (1 addition & 1 deletion)
@@ -24,7 +24,7 @@ Unlike Entity-Relationship modeling that requires translation to SQL, DataJoint
24
24
Foreign keys in DataJoint do more than enforce referential integrity—they encode computational dependencies. A computed result that references raw data will be automatically deleted if that raw data is removed, preventing stale or orphaned results. This maintains *computational validity*, not just *referential integrity*.
25
25
26
26
**Declarative Computation**
27
-
Computations are defined declaratively through `make()` methods attached to table definitions. The `populate()` operation identifies all missing results and executes computations in dependency order. Parallelization, error handling, and job distribution are handled automatically.
27
+
Computations are defined declaratively through make() methods attached to table definitions. The populate() operation identifies all missing results and executes computations in dependency order. Parallelization, error handling, and job distribution are handled automatically.
28
28
29
29
**Immutability by Design**
30
30
Computed results are immutable. Correcting upstream data requires deleting dependent results and recomputing—ensuring the database always represents a consistent computational state. This naturally provides complete provenance: every result can be traced to its source data and the exact code that produced it.
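Note on the hunk above: the two behaviors it describes (dependent results disappearing when their source data is deleted, and `populate()` computing only what is missing) can be illustrated with a minimal sketch. This is not DataJoint's API. It uses Python's standard `sqlite3` module as a stand-in, and the table names `recording` and `recording_stats`, the sample rows, and the `make` and `populate` functions below are hypothetical, chosen only for illustration. It also models a single dependency level, so ordering across several levels is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE recording (
        recording_id INTEGER PRIMARY KEY,
        signal TEXT NOT NULL
    );
    -- The computed table depends on recording through its foreign key.
    CREATE TABLE recording_stats (
        recording_id INTEGER PRIMARY KEY
            REFERENCES recording(recording_id) ON DELETE CASCADE,
        n_samples INTEGER NOT NULL
    );
""")


def make(conn, recording_id):
    """Hypothetical stand-in for a make() method: compute one result and store it."""
    (signal,) = conn.execute(
        "SELECT signal FROM recording WHERE recording_id = ?", (recording_id,)
    ).fetchone()
    n_samples = len(signal.split(","))
    conn.execute(
        "INSERT INTO recording_stats VALUES (?, ?)", (recording_id, n_samples)
    )


def populate(conn):
    """Hypothetical stand-in for populate(): run make() for every missing result."""
    missing = conn.execute("""
        SELECT recording_id FROM recording
        WHERE recording_id NOT IN (SELECT recording_id FROM recording_stats)
        ORDER BY recording_id
    """).fetchall()
    for (recording_id,) in missing:
        make(conn, recording_id)
    return [recording_id for (recording_id,) in missing]


conn.executemany(
    "INSERT INTO recording VALUES (?, ?)",
    [(1, "0.1,0.4,0.2"), (2, "0.9,0.8")],
)

first_run = populate(conn)    # computes results for both recordings
second_run = populate(conn)   # nothing is missing, so nothing is recomputed

# Deleting the source row removes the dependent computed row as well.
conn.execute("DELETE FROM recording WHERE recording_id = 1")
remaining = conn.execute(
    "SELECT recording_id, n_samples FROM recording_stats"
).fetchall()
```

In this sketch the cascade is declared in the schema, so no application code has to remember to clean up stale results. Whether DataJoint performs the cascade at the database level or in the client is not stated in the hunk and is not assumed here.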
book/20-concepts/00-databases.md
+4 -4 (4 additions & 4 deletions)
@@ -25,7 +25,7 @@ Databases are crucial for the smooth and organized operation of various entities
25
25
## Database Management Systems (DBMS)
26
26
27
27
```{card} Database Management System
28
-
A Database Management System is a software system that serves as the computational engine powering a database.
28
+
A Database Management System (DBMS) is a software system that serves as the computational engine powering a database.
29
29
It defines and enforces the structure of the data, ensuring that the organization's rules are consistently applied.
30
30
A DBMS manages data storage and efficiently executes data updates and queries while safeguarding the data's structure and integrity, particularly in environments with multiple concurrent users.
31
31
@@ -50,7 +50,7 @@ One of the most critical features distinguishing databases from simple file stor
50
50
51
51
### Authentication and Authorization
52
52
53
-
Before you can work with a database, you must **authenticate**—prove your identity with a username and password. Once authenticated, the database enforces **authorization** rules that determine what you can do:
53
+
Before you can work with a database, you must **authentication**—prove your identity with a username and password. Once authenticated, the database enforces **authorization** rules that determine what you can do:
54
54
55
55
-**Read**: View specific tables or columns
56
56
-**Write**: Add new data to certain tables
@@ -80,10 +80,10 @@ Modern databases typically separate data management from data use through distin
80
80
81
81
### Common Architectures
82
82
83
-
**Server-Client Architecture** (most common): A database server program manages all data operations, while client programs (your scripts, applications, notebooks) connect to request data or submit changes. The server enforces all rules and access permissions consistently for every client. This is like a library where the librarian (server) manages the books and enforces checkout policies, while patrons (clients) request materials.
83
+
**Server-client architecture** (most common): A database server program manages all data operations, while client programs (your scripts, applications, notebooks) connect to request data or submit changes. The server enforces all rules and access permissions consistently for every client. This is like a library where the librarian (server) manages the books and enforces checkout policies, while patrons (clients) request materials.
84
84
The two most popular open-source relational database systems, MySQL and PostgreSQL, implement a server-client architecture.
85
85
86
-
**Embedded Databases**: The database engine runs within your application itself—no separate server. This works for single-user applications like mobile apps or desktop software, but doesn't support multiple users accessing shared data simultaneously.
86
+
**Embedded databases**: The database engine runs within your application itself—no separate server. This works for single-user applications like mobile apps or desktop software, but doesn't support multiple users accessing shared data simultaneously.
87
87
SQLite is a common embedded database @10.14778/3554821.3554842.
88
88
89
89
**Distributed Databases**: Data and processing are spread across multiple servers working together. This provides high availability and can handle massive scale, but adds significant complexity. Systems like Google Spanner, Amazon DynamoDB, and CockroachDB use this approach.
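Note on the embedded case in the hunk above: a minimal sketch of what "no separate server" means in practice, using Python's standard `sqlite3` module. The `book` table and its single row are hypothetical and exist only for illustration.

```python
import sqlite3

# No server process and no network connection are involved:
# the SQLite engine runs inside this Python process.
conn = sqlite3.connect(":memory:")  # a file path here would persist the data

conn.execute("CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("INSERT INTO book VALUES (?, ?)", ("978-0", "Example Title"))

titles = [title for (title,) in conn.execute("SELECT title FROM book")]
conn.close()
```

A server-client system would replace the `connect` call with a network connection that carries a username and password, which is where the authentication and authorization rules described earlier would be enforced.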
book/20-concepts/01-models.md
+1 -1 (1 addition & 1 deletion)
@@ -244,7 +244,7 @@ Most importantly, spreadsheets provide no referential integrity. If cell B2 cont
244
244
245
245
The **relational data model**, introduced by Edgar F. Codd in 1970, revolutionized data management by organizing data into tables (relations) with well-defined relationships. This model emphasizes data integrity, consistency, and powerful query capabilities through a formal mathematical foundation.
246
246
247
-
The relational model organizes all data into tables representing mathematical relations, where each table consists of rows (representing mathematical *tuples*) and columns (often called *attributes*). Key principles include data type constraints, uniqueness enforcement through primary keys, referential integrity through foreign keys, and declarative queries. The next chapter explores these principles in depth.
247
+
The relational model organizes all data into tables representing mathematical relations, where each table consists of rows (representing mathematical *tuples*) and columns (often called *attributes*). Key principles include data type constraints, uniqueness enforcement through primary keys, referential integrity through foreign keys, and declarative query. The next chapter explores these principles in depth.
248
248
249
249
The most common way to interact with relational databases is through the Structured Query Language (SQL), a language specifically designed to define, manipulate, and query data within relational databases.
book/20-concepts/03-relational-practice.ipynb
+1 -9 (1 addition & 9 deletions)
@@ -1568,15 +1568,7 @@
1568
1568
{
1569
1569
"cell_type": "markdown",
1570
1570
"metadata": {},
1571
-
"source": [
1572
-
"## The Path Forward: Databases as Workflows\n",
1573
-
"\n",
1574
-
"**DataJoint extends relational theory by viewing the schema as a workflow specification.** It preserves all the benefits of relational databases—mathematical rigor, declarative queries, data integrity—while adding workflow semantics that make the database **workflow-aware**.\n",
1575
-
"\n",
1576
-
"**Key Insight**: The database schema structure can be identical whether using SQL or DataJoint, although DataJoint imposes some conventions. What's different is the **conceptual view**: SQL sees static entities and relationships; DataJoint sees an executable workflow, where some steps are manual and others are automatic. This workflow view enables automatic execution, provenance tracking, and computational validity—features essential for scientific computing.\n",
1577
-
"\n",
1578
-
"The next chapter introduces DataJoint's Relational Workflow Model in detail, showing how Computed tables turn your schema into an executable pipeline specification.\n"
1579
-
]
1571
+
"source": "## The Path Forward: Databases as Workflows\n\n**DataJoint extends relational theory by viewing the schema as a workflow specification.** It preserves all the benefits of relational databases—mathematical rigor, declarative queries, data integrity—while adding workflow semantics that make the database **workflow-aware**.\n\n**Key Insight**: The database schema structure can be identical whether using SQL or DataJoint, although DataJoint imposes some conventions. What's different is the **conceptual view**: SQL sees static entities and relationships; DataJoint sees an executable workflow, where some steps are manual and others are automatic. This workflow view enables automatic execution, provenance tracking, and computational validity—features essential for scientific computing.\n\nThe next chapter explores **Data Integrity**—the fundamental constraints that databases enforce to ensure data remains accurate, consistent, and reliable. Understanding these integrity concepts provides the foundation for DataJoint's Relational Workflow Model, which extends integrity guarantees to include workflow validity and computational consistency."