Commit 826763f

committed
adds notes for ch1
1 parent 289ffed commit 826763f

File tree

3 files changed

+99
-0
lines changed

DDIA/data-system.png

180 KB

DDIA/part-1.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
1+
# Preface
2+
3+
## Preface
4+
Driving forces in database developments and distributed systems:
5+
- Companies are handling massive amounts of traffic and data, so they must build tools that handle this scale efficiently
6+
- To respond quickly to market insights, businesses must be able to test and iterate on hypotheses cheaply; thus, development cycles are short and data models must be flexible enough to keep up with this dynamic environment
7+
- Preference for, and dependence on, free and open source software (FOSS) is growing
8+
- CPU clock speeds are barely increasing, but multi-core processors are standard and networks are getting faster
9+
- Cloud infrastructure allows for distributed systems that span many machines and geographic regions
10+
- Extended downtime is increasingly unacceptable
11+
12+
Data-Intensive applications: apps where data is the primary challenge, e.g. data quantity, complexity, and changes
13+
14+
Compute-Intensive applications: apps where CPU cycles are the primary challenge
15+
16+
Goals:
17+
- Understand successful data systems by examining their algorithms, principles, and trade-offs
18+
- Be able to architect applications by choosing and combining the most appropriate tools after weighing their trade-offs
19+
20+
# Part 1: Foundations of Data Systems
21+
- Chapter 1: Examines reliability, scalability, and maintainability and methods to achieve these goals
22+
- Chapter 2: Compares data models and query languages and discusses situations where each is most appropriate
23+
- Chapter 3: Examines how storage engines are optimized for different workloads by reviewing their internals and how databases lay out data on disk
24+
- Chapter 4: Compares serialization formats and examines how they fare in scaling environments
25+
26+
## 1. Reliable, Scalable, and Maintainable Applications
27+
![Alt text](r-s-m.png)
28+
Common Application Needs:
29+
- databases: store data for finding/reading later
30+
- caches: remember the result of an operation, typically expensive, to be able to quickly return the data without having to retrieve from the database again
31+
- search indexes: allow users to search data by keyword or filter it in various ways
32+
- stream processing: deals with data in motion by sending messages to another process, where they are handled asynchronously
33+
- batch processing: collects and processes data in batches
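
The difference between the last two items can be sketched in a few lines. This is only an illustration, not how a real stream or batch framework works: the function names and the `orders` data are made up for the example.

```python
from typing import Iterable, Iterator


def batch_total(events: list[int]) -> int:
    """Batch processing: the whole input is collected first, then processed in one go."""
    return sum(events)


def stream_totals(events: Iterable[int]) -> Iterator[int]:
    """Stream processing: emit an updated result as each event arrives."""
    running = 0
    for event in events:
        running += event
        yield running


# Hypothetical input: three order amounts.
orders = [5, 3, 7]
batch_result = batch_total(orders)                  # one answer at the end
stream_results = list(stream_totals(iter(orders)))  # one answer per event
```

The batch version produces a single result once all data is available, while the stream version produces a result after every event.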
34+
35+
### Thinking About Data Systems
36+
"Data systems" is an umbrella term for tools optimized for data storage and processing.
37+
Modern applications have requirements too demanding and wide-ranging for any single tool to meet, so different tools, each performing a single task efficiently, are stitched together using application code.
38+
For example, DDIA notes that if you have "an application-managed caching layer (using Memcached or similar), or a full-text search server (such as Elasticsearch or Solr) separate from your main database, it is normally the application code’s responsibility to keep those caches and indexes in sync with the main database." The diagram below shows one such architecture.
39+
![Alt text](data-system.png)
40+
The above data system stitches together different tools using application code.
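
A minimal sketch of what "keeping the cache in sync" can look like in application code, assuming a cache-aside pattern. Plain dictionaries stand in for the main database and for Memcached, and the `CachedStore` class and its method names are hypothetical, not taken from the book.

```python
class CachedStore:
    """Cache-aside wrapper: application code keeps the cache in sync with the database."""

    def __init__(self, database: dict, cache: dict):
        self.database = database  # stand-in for the main database (source of truth)
        self.cache = cache        # stand-in for Memcached or similar
        self.db_reads = 0         # counts how often the database had to be consulted

    def read(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall back to the database and remember the result.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Write to the source of truth first, then invalidate the stale cache entry.
        self.database[key] = value
        self.cache.pop(key, None)


store = CachedStore(database={"user:1": "Ada"}, cache={})
first = store.read("user:1")    # miss: goes to the database
second = store.read("user:1")   # hit: served from the cache
store.write("user:1", "Grace")  # invalidates the cached entry
third = store.read("user:1")    # miss again: sees the new value
```

Invalidating on write (rather than updating the cache in place) is one common choice; it keeps the write path simple at the cost of one extra database read afterwards.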
41+
42+
**Questions when designing a data system:**
43+
- How do you ensure that the data remains correct and complete, even when things go wrong internally?
44+
- How do you provide consistently good performance to clients, even when parts of your system are degraded?
45+
- How do you scale to handle an increase in load?
46+
- What does a good API for the service look like?
47+
48+
**Concerns in Most Software Systems**
49+
- **Reliability:**
50+
The system should continue to work correctly (performing the correct function at the desired level of performance) even in the face of adversity (hardware or software faults, and even human error).
51+
- **Scalability:**
52+
As the system grows (in data volume, traffic volume, or complexity), there should be reasonable ways of dealing with that growth.
53+
- **Maintainability:**
54+
Over time, many different people will work on the system (engineering and operations, both maintaining current behavior and adapting the system to new use cases), and they should all be able to work on it productively.
55+
### Reliability
56+
**Software Reliability Principles**
57+
- the application performs the function the user expects
58+
- it can tolerate users making mistakes or using the software in unexpected ways
59+
- its performance is good enough under the expected load and data volume
60+
- it prevents abuse and unauthorized use
61+
62+
Systems should be resilient, or fault-tolerant: they should anticipate faults and continue to function correctly when those faults occur.
63+
64+
fault vs. failure
65+
- fault: one component of the system deviating from spec
66+
- failure: system stops providing its service or function to users
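
The distinction can be made concrete with a small sketch, assuming a service that reads from several redundant replicas. The exception classes and replica functions are hypothetical: a single replica raising an error is a fault, and it becomes a failure only when no replica can answer.

```python
class ReplicaFault(Exception):
    """One component deviating from its spec."""


class ServiceFailure(Exception):
    """The system as a whole can no longer serve the request."""


def read_with_failover(replicas, key):
    """Try each replica in turn; return the first answer plus the faults tolerated."""
    faults = []
    for replica in replicas:
        try:
            return replica(key), faults
        except ReplicaFault as fault:
            faults.append(fault)  # tolerated: the user never sees this
    raise ServiceFailure(f"all {len(replicas)} replicas faulted for key {key!r}")


def healthy_replica(key):
    return f"value-for-{key}"


def broken_replica(key):
    raise ReplicaFault("disk unreadable")


# One fault occurs, but the system still provides its service: no failure.
value, faults = read_with_failover([broken_replica, healthy_replica], "k")
```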
67+
68+
69+
#### Hardware Faults
70+
Typical Faults: Hard disks crash, RAM becomes faulty, the power grid has a blackout, someone unplugs the wrong network cable
71+
72+
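Redundancy of individual hardware components is typically the first line of defense. In cloud deployments that spread work across many machine instances, risk is diversified, and systems can be designed to tolerate the loss of entire machines.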
73+
#### Software Errors
74+
**Systematic Errors:** These are hard to anticipate and, because they are correlated across nodes, tend to cause many more system failures than uncorrelated hardware faults. e.g. "A software bug that causes every instance of an application server to crash when given a particular bad input. For example, consider the leap second on June 30, 2012, that caused many applications to hang simultaneously due to a bug in the Linux kernel"
75+
**Mitigations:** "carefully thinking about assumptions and interactions in the system; thorough testing; process isolation; allowing processes to crash and restart; measuring, monitoring, and analyzing system behavior in production."
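
As a rough illustration of "allowing processes to crash and restart" combined with basic monitoring, the sketch below restarts a task a bounded number of times and records every crash. The `supervise` and `flaky_worker` names are hypothetical, and a real supervisor would manage operating system processes rather than function calls.

```python
def supervise(task, max_restarts=3):
    """Run task; if it crashes, restart it up to max_restarts times.

    Returns the task's result and a log of the crashes seen along the way.
    """
    crashes = []
    for attempt in range(max_restarts + 1):
        try:
            return task(attempt), crashes
        except Exception as error:
            # Record the crash so it can be monitored and analyzed later.
            crashes.append(repr(error))
    raise RuntimeError(f"gave up after {max_restarts} restarts: {crashes}")


def flaky_worker(attempt):
    """Hypothetical worker that crashes on its first two runs."""
    if attempt < 2:
        raise ValueError(f"bad input on attempt {attempt}")
    return "done"


result, crashes = supervise(flaky_worker)
```

Bounding the number of restarts matters: a systematic bug that crashes on every run should surface as an error rather than loop forever.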
76+
77+
#### Human Errors
78+
Tips for reliable systems tolerant to human errors, as quoted in DDIA:
79+
- well-designed abstractions, APIs, and admin interfaces make it easy to do “the right thing” and discourage “the wrong thing.”
80+
- provide fully featured non-production sandbox environments where people can explore and experiment safely, using real data, without affecting real users
81+
- Test thoroughly at all levels, from unit tests to whole-system integration tests and manual tests
82+
- make it fast to roll back configuration changes, roll out new code gradually, and provide tools to recompute data
83+
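
One way to "roll out new code gradually" is a percentage-based feature flag. The sketch below is an assumption about how such a flag might be built, not something DDIA prescribes; the `in_rollout` function and the `new-checkout` feature name are hypothetical. It hashes each user into one of 100 stable buckets so that raising the percentage only ever adds users, and setting it to 0 acts as a fast rollback.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of 100 stable buckets."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


users = [f"user-{n}" for n in range(1000)]
enabled_at_10 = {u for u in users if in_rollout(u, "new-checkout", 10)}
enabled_at_50 = {u for u in users if in_rollout(u, "new-checkout", 50)}
```

Because the bucket depends only on the feature name and user ID, a user who sees the new code at 10% keeps seeing it at 50%, which keeps the experience consistent while the rollout widens.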
84+
#### How Important Is Reliability?
85+
### Scalability
86+
#### Describing Load
87+
#### Describing Performance
88+
#### Approaches for Coping with Load
89+
### Maintainability
90+
#### Operability: Making Life Easy for Operations
91+
#### Simplicity: Managing Complexity
92+
#### Evolvability: Making Change Easy
93+
### Summary
94+
95+
## 2. Data Models and Query Languages
96+
97+
98+
99+

DDIA/r-s-m.png

70.8 KB
