Commit a5e0bc7

Merge pull request #8 from dimitri-yatsenko/main
remove index
2 parents 9888c14 + cc83926 commit a5e0bc7

33 files changed: +2260 -5183 lines changed

.gitignore

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# Build output
+_build/
+
+# Node.js dependencies
+node_modules/
+
+# Python
+__pycache__/
+*.py[cod]
+.ipynb_checkpoints/
+
+# Environment
+.env
+.venv/
+venv/
+
+# IDE
+.vscode/
+.idea/
+
+# OS files
+.DS_Store
+Thumbs.db

SIMPLIFICATION_RECOMMENDATIONS.md

Lines changed: 0 additions & 184 deletions
This file was deleted.

book/00-introduction/05-executive-summary.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ Unlike Entity-Relationship modeling that requires translation to SQL, DataJoint
 Foreign keys in DataJoint do more than enforce referential integrity—they encode computational dependencies. A computed result that references raw data will be automatically deleted if that raw data is removed, preventing stale or orphaned results. This maintains *computational validity*, not just *referential integrity*.
 
 **Declarative Computation**
-Computations are defined declaratively through `make()` methods attached to table definitions. The `populate()` operation identifies all missing results and executes computations in dependency order. Parallelization, error handling, and job distribution are handled automatically.
+Computations are defined declaratively through make() methods attached to table definitions. The populate() operation identifies all missing results and executes computations in dependency order. Parallelization, error handling, and job distribution are handled automatically.
 
 **Immutability by Design**
 Computed results are immutable. Correcting upstream data requires deleting dependent results and recomputing—ensuring the database always represents a consistent computational state. This naturally provides complete provenance: every result can be traced to its source data and the exact code that produced it.

book/20-concepts/00-databases.md

Lines changed: 4 additions & 4 deletions
@@ -25,7 +25,7 @@ Databases are crucial for the smooth and organized operation of various entities
 ## Database Management Systems (DBMS)
 
 ```{card} Database Management System
-A Database Management System is a software system that serves as the computational engine powering a database.
+A Database Management System (DBMS) is a software system that serves as the computational engine powering a database.
 It defines and enforces the structure of the data, ensuring that the organization's rules are consistently applied.
 A DBMS manages data storage and efficiently executes data updates and queries while safeguarding the data's structure and integrity, particularly in environments with multiple concurrent users.
 
@@ -50,7 +50,7 @@ One of the most critical features distinguishing databases from simple file stor
 
 ### Authentication and Authorization
 
-Before you can work with a database, you must **authenticate**—prove your identity with a username and password. Once authenticated, the database enforces **authorization** rules that determine what you can do:
+Before you can work with a database, you must **authentication**—prove your identity with a username and password. Once authenticated, the database enforces **authorization** rules that determine what you can do:
 
 - **Read**: View specific tables or columns
 - **Write**: Add new data to certain tables
@@ -80,10 +80,10 @@ Modern databases typically separate data management from data use through distin
 
 ### Common Architectures
 
-**Server-Client Architecture** (most common): A database server program manages all data operations, while client programs (your scripts, applications, notebooks) connect to request data or submit changes. The server enforces all rules and access permissions consistently for every client. This is like a library where the librarian (server) manages the books and enforces checkout policies, while patrons (clients) request materials.
+**Server-client architecture** (most common): A database server program manages all data operations, while client programs (your scripts, applications, notebooks) connect to request data or submit changes. The server enforces all rules and access permissions consistently for every client. This is like a library where the librarian (server) manages the books and enforces checkout policies, while patrons (clients) request materials.
 The two most popular open-source relational database systems: MySQL and PostgreSQL implement a server-client architecture.
 
-**Embedded Databases**: The database engine runs within your application itself—no separate server. This works for single-user applications like mobile apps or desktop software, but doesn't support multiple users accessing shared data simultaneously.
+**Embedded databases**: The database engine runs within your application itself—no separate server. This works for single-user applications like mobile apps or desktop software, but doesn't support multiple users accessing shared data simultaneously.
 SQLite is a common embedded database @10.14778/3554821.3554842.
 
 **Distributed Databases**: Data and processing are spread across multiple servers working together. This provides high availability and can handle massive scale, but adds significant complexity. Systems like Google Spanner, Amazon DynamoDB, and CockroachDB use this approach.

book/20-concepts/01-models.md

Lines changed: 1 addition & 1 deletion
@@ -244,7 +244,7 @@ Most importantly, spreadsheets provide no referential integrity. If cell B2 cont
 
 The **relational data model**, introduced by Edgar F. Codd in 1970, revolutionized data management by organizing data into tables (relations) with well-defined relationships. This model emphasizes data integrity, consistency, and powerful query capabilities through a formal mathematical foundation.
 
-The relational model organizes all data into tables representing mathematical relations, where each table consists of rows (representing mathematical *tuples*) and columns (often called *attributes*). Key principles include data type constraints, uniqueness enforcement through primary keys, referential integrity through foreign keys, and declarative queries. The next chapter explores these principles in depth.
+The relational model organizes all data into tables representing mathematical relations, where each table consists of rows (representing mathematical *tuples*) and columns (often called *attributes*). Key principles include data type constraints, uniqueness enforcement through primary keys, referential integrity through foreign keys, and declarative query. The next chapter explores these principles in depth.
 
 The most common way to interact with relational databases is through the Structured Query Language (SQL), a language specifically designed to define, manipulate, and query data within relational databases.
 

book/20-concepts/03-relational-practice.ipynb

Lines changed: 1 addition & 9 deletions
@@ -1568,15 +1568,7 @@
 {
 "cell_type": "markdown",
 "metadata": {},
-"source": [
-"## The Path Forward: Databases as Workflows\n",
-"\n",
-"**DataJoint extends relational theory by viewing the schema as a workflow specification.** It preserves all the benefits of relational databases—mathematical rigor, declarative queries, data integrity—while adding workflow semantics that make the database **workflow-aware**.\n",
-"\n",
-"**Key Insight**: The database schema structure can be identical whether using SQL or DataJoint, although DataJoint imposes some conventions. What's different is the **conceptual view**: SQL sees static entities and relationships; DataJoint sees an executable workflow, where some steps are manual and others are automatic. This workflow view enables automatic execution, provenance tracking, and computational validity—features essential for scientific computing.\n",
-"\n",
-"The next chapter introduces DataJoint's Relational Workflow Model in detail, showing how Computed tables turn your schema into an executable pipeline specification.\n"
-]
+"source": "## The Path Forward: Databases as Workflows\n\n**DataJoint extends relational theory by viewing the schema as a workflow specification.** It preserves all the benefits of relational databases—mathematical rigor, declarative queries, data integrity—while adding workflow semantics that make the database **workflow-aware**.\n\n**Key Insight**: The database schema structure can be identical whether using SQL or DataJoint, although DataJoint imposes some conventions. What's different is the **conceptual view**: SQL sees static entities and relationships; DataJoint sees an executable workflow, where some steps are manual and others are automatic. This workflow view enables automatic execution, provenance tracking, and computational validity—features essential for scientific computing.\n\nThe next chapter explores **Data Integrity**—the fundamental constraints that databases enforce to ensure data remains accurate, consistent, and reliable. Understanding these integrity concepts provides the foundation for DataJoint's Relational Workflow Model, which extends integrity guarantees to include workflow validity and computational consistency."
 },
 {
 "cell_type": "code",
