Commit 57fbc0b

committed
module 2 content expanded
1 parent ec29f4f commit 57fbc0b

26 files changed

+787
-321
lines changed

module2-databases/README.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
# Module 2: Databases
2+
3+
Last revised: 31/10/2021
4+
5+
## Summary
6+
Web applications that persist data between visits inevitably use a database. Students
7+
familiarize themselves with the relational and non-relational databases used in today’s
8+
ecosystem and their query languages: MySQL, PostgreSQL, MongoDB. Students
9+
explore the advantages and disadvantages of each technology, understanding the
10+
appropriate use cases for each one.
11+
12+
## Outline
13+
14+
- 1 [Introduction to databases [R]](../module2-databases/r1-introduction-to-databases/README.md)
15+
16+
- 1.1 [Introduction to relational databases [R]](../module2-databases/r1.1-introduction-to-relational-databases/README.md)
17+
18+
- 1.2 [Relational database structure [R]](../module2-databases/r1.2-relational-database-structure/README.md)
19+
20+
- 1.3 [Querying in SQL [R]](../module2-databases/r1.3-querying-in-sql/README.md)
21+
22+
- 1.4 [SQL queries practice [L]](../module2-databases/r1.4-sql-queries-practice/README.md)
23+
24+
- 2 [Introduction to NoSQL databases [R]](../module2-databases/r2-introduction-to-nosql-databases/README.md)
25+
26+
- 2.1 [Introduction to MongoDB [R]](../module2-databases/r2.1-introduction-to-mongodb/README.md)
27+
28+
- 2.2 [Querying in MongoDB [R]](../module2-databases/r2.2-querying-in-mongodb/README.md)
29+
30+
- 2.3 [MongoDB queries practice [L]](../module2-databases/r2.3-mongodb-queries-practice/README.md)
31+
32+
- 3 [ORM and ODM [R]](../module2-databases/r3-orm-and-odm/README.md)
33+
34+
- 3.1 [Sequelize practice [L]](../module2-databases/r3.1-sequelize-practice/README.md)
35+
36+
- 4 [Other popular databases [R]](../module2-databases/r4-other-popular-databases/README.md)
37+
38+
- 4.1 [Elasticsearch practice [R]](../module2-databases/r4.1-elasticsearch-practice/README.md)
39+
40+
- 5 [Summary [R]](../module2-databases/r5-summary/README.md)

module2-databases/assets/document-example.svg

Lines changed: 1 addition & 0 deletions

module2-databases/assets/joins.jpeg

188 KB
Lines changed: 31 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,33 @@
1-
## Introduction to Databases
2-
3-
In this module, you'll learn about **databases**. Database is a general term
4-
that refers to the data stored in a structured format. We use databases when we
5-
want to persist data for future use. For example, in a web application, when a
6-
user writes something in a text box, this data will be lost when the page is
7-
refreshed -- unless this data is persisted in a database.
8-
9-
Databases are a ubiquitous and foundational part of backend technology. One
10-
would use databases instead of local storage or cookies, for example, when the
11-
information being persisted needs to be available to any possible client. For
12-
example, information stored in [local
13-
storage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage) is not automatically synchronized
14-
across browsers. For example, if you were to store information about user
1+
# Introduction to Databases
2+
Welcome to the next module, where you'll learn about **databases**. The objectives of this lesson are:
3+
1. Getting familiar with the basic concepts of databases
4+
2. Understanding the relevance of database knowledge for backend developers
5+
6+
## What is a database?
7+
A database, in the most general sense, is an organized collection of data. It is a general term that refers to data stored in a structured format in a system where it can be easily accessed, manipulated and updated.
8+
9+
### Why do we need a database on the backend?
10+
In the previous module, we read that the database is one of the main components of the backend architecture. Data is the core of any website, application or API. Users come to a website or application looking for some kind of data: a list of restaurants on a food ordering app, a vast number of products on an e-commerce site or interesting courses on an online learning platform. As users interact with an application, they generate useful data such as ratings given to a restaurant, products added to their cart or course completion progress.
11+
12+
Data that doesn't disappear when an application stops running is referred to as being "persistent". Databases help us persist data for future or continuous use. Our APIs are incomplete without data persistence. We have already seen this while working on the assignments of module 1: we were storing data in arrays and objects, but this data is lost from memory once the app is stopped or the web page is refreshed. That is why databases are an essential component of any backend application.
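
To make the difference concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. SQLite is only a convenient stand-in here, and the `rating` table, its columns and the file name are assumptions made up for illustration. The same data is kept once in a plain list and once in a database file, and only the database copy survives a simulated restart.

```python
import os
import sqlite3
import tempfile

# Hypothetical file location; any writable path would do.
db_path = os.path.join(tempfile.mkdtemp(), "ratings.db")

# "First run" of the app: keep one copy in memory and one in a database file.
in_memory_ratings = [("Pizza Place", 5)]

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE rating (restaurant TEXT, stars INTEGER)")
conn.execute("INSERT INTO rating VALUES (?, ?)", ("Pizza Place", 5))
conn.commit()
conn.close()

# Simulate the app stopping: everything held in memory is gone.
in_memory_ratings = []

# "Second run": the list is empty, but the database file still has the row.
conn = sqlite3.connect(db_path)
persisted_ratings = conn.execute(
    "SELECT restaurant, stars FROM rating"
).fetchall()
conn.close()

print(in_memory_ratings)   # []
print(persisted_ratings)   # [('Pizza Place', 5)]
```

A real backend would talk to a database server rather than a local file, but the principle is the same: the data lives outside the memory of the running process.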
13+
14+
Data is often described as one of the most valuable commodities in today's world, and is sometimes compared to oil. Should we be worried? That is probably a discussion for another day; first, let's understand how to work with databases.
15+
16+
### Can data be stored only on the backend?
17+
Of course not! On the frontend, data can be stored using [cookies](https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies) or [web storage](https://developer.mozilla.org/en-US/docs/Web/API/Web_Storage_API). However, these methods only allow storing small amounts of data, and that data cannot be relied on long-term: cookies can expire, session storage is cleared when the browser session ends, and the user can clear browser storage at any time. Also, data stored on one client cannot be accessed by another. For example, if you were to store information about user
1518
settings in local storage, then those settings would no longer be available on a
16-
different browser -- but if you store them in a database, the user can access
17-
them from any client.
19+
different browser or device -- but if you store them in a database on the backend, the user can access them from any client.
20+
21+
For data that needs to be persisted long-term, the client-side always relies on the backend to have a database. APIs allow clients to send and receive this data from the database through the backend server.
22+
23+
### How much knowledge of databases is relevant for a backend developer?
24+
In tech companies, there are different roles and specializations, and many of them are focused on data and databases. Depending on the size, team structure and requirements of a company, there might be a Database Administrator, who is responsible for day-to-day operations on the database such as creating, updating and cleaning data records, and for ensuring that data is readily and securely available to users. There could be a Data Analyst, who is responsible for analyzing the data collected by the organization and driving strategic decisions from it. There might be a Data Scientist, who writes code to perform complex analysis on large datasets using knowledge of statistics, probability, advanced mathematics and machine learning. Some companies might even have a Database Developer, if the product they are building and maintaining is itself a database. Other roles include Database Architect, Data Modeler and Database Tester.
25+
26+
However, because databases are a ubiquitous and foundational backend technology, backend developers must have a good overall knowledge of them. You may not specialize as a data analyst or data scientist, but you must be able to work with databases, which includes data modelling, data querying and enabling APIs to connect to and communicate with the database.
27+
28+
Now that you have been introduced to databases, let's move on to the next lesson, where we will learn about the most widely used type of database: the relational database.
1829

30+
---
31+
## References
32+
- https://www.techopedia.com/6/28832/enterprise/databases/introduction-to-databases
33+
- https://learnsql.com/blog/types-of-database-jobs/
Lines changed: 70 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -1,43 +1,31 @@
1-
# Introduction to relational databases
2-
3-
One of the most common types of databases is the **relational database**. A
4-
relational database stores data in tables, and these tables may have *relations*
5-
to one another. For example, you may have a table `Customer`; each row
6-
represents a customer in your application. You may have another table
7-
`Order`, a customer can order many things, and each order will appear in this
8-
table. We will later look in more detail at how the two tables are linked.
9-
10-
Relational databases are frequently referred to as SQL databases. SQL
11-
(Structured Query Language) is a
12-
[declarative programming
13-
language](https://365datascience.com/tutorials/sql-tutorials/sql-declarative-language/)
14-
that is commonly used to query for data. Not all relational databases
15-
necessarily use SQL, but it is by far the most common, so in practice, SQL
16-
database and relational database are synomyous terms.
17-
18-
The most frequently used relational databases are PostgreSQL and MySQL. There
19-
are many offerings for SQL databases, and each may have slightly different
20-
syntax for SQL. However, if you understand the general concept of relational
21-
databases, you will be able to easily use different databases.
1+
# Introduction to Relational Databases
2+
The objectives of this lesson are:
3+
1. Understanding the relational model
4+
2. Understanding the advantages of relational databases
5+
3. Getting familiar with popular relational databases
6+
7+
## The Relational Model
8+
A database management system (DBMS) is a software package designed to define, manipulate, retrieve and manage data in a database. A DBMS generally manipulates the data itself, the data format, field names, record structure and file structure. It also defines rules to validate and manipulate this data.
9+
10+
Edgar F. Codd pioneered the relational model for databases. He formulated what are known today as Codd's twelve rules, which a database management system must satisfy to be considered relational, as in a relational database management system or RDBMS. The relational model was a radical departure from the then-dominant hierarchical model in that it focused on the ability to search a database by content rather than by following a linked navigation system. This offered the significant advantage of allowing databases to grow and store more and more data without having to change or rewrite the applications that accessed that data. The relational model is still the basis for the majority of commercial database offerings today.
2211

23-
### Understanding relational database structure
24-
As mentioned previously, data in relational databases are usually organized in a
25-
tabular format, similar to spreadsheets.
12+
If you're interested in some core CS, you can read about Codd's twelve rules [here](https://en.wikipedia.org/wiki/Codd%27s_12_rules). We'll move ahead to look at relational databases in practice.
2613

27-
* A database has many tables
28-
* A table has many rows (also known as records)
29-
* A row has many columns (also known as fields).
14+
### How do relational databases work?
15+
A relational database is essentially a group of tables. Each table is made up of rows (also known as records) and columns (also known as fields), where a row represents a data record and a column represents a data attribute or property. Tables can have relationships between them: a relationship is defined by a column in one table that references a column in another table. Every row in a table should have a primary key, which is a unique value used to reference that specific row. If a table is related to another table, it will have a foreign key, which is used to reference the related record in the related table.
3016

31-
For example, the previously mentioned `Customer` table may look something like
32-
the following. Note that this is not any particular syntax but simply a
33-
visualization of the data.
17+
For example, you may have a table `Customer` in which each row represents a customer in your application. You may have another table `Order`: a customer can place many orders, and each order will appear as a row in this table.
18+
19+
The `Customer` table may look something like the following. Note that this is not any particular syntax but simply a visualization of the data in tabular format.
3420

3521
**id**|**first\_name**|**last\_name**|**registered\_at**
3622
:-----:|:-----:|:-----:|:-----:
3723
1|Joe|Smith|2012-01-02
3824
2|Jane|Doe|2012-01-03
3925
3|Susan|Stone|2012-01-05
4026

27+
The column `id` here is the primary key for the table with a unique value for each row. Each row has the properties `first_name`, `last_name` and the date the customer `registered_at`.
28+
4129
The `Order` table may look something like so:
4230

4331
**id**|**product**|**delivered**|**customer\_id**
@@ -47,13 +35,58 @@ The `Order` table may look something like so:
4735
3|Cookies|TRUE|1
4836
4|Rice|TRUE|2
4937

50-
On the `Order` table, note that there is a field called `customer_id`. This is
51-
called a **foreign key** (or sometimes join key). This column creates a
52-
**relation** between the `Customer` and `Order` tables: it tells us to which
53-
customer the order belongs. This is the core of relational databases: expressing
54-
relations between entities.
38+
The `Order` table also has its own primary key `id` and the fields `product` and `delivered`. Note that there is a field called `customer_id`, which is
39+
a **foreign key**. This column creates a **relation** between the `Customer` and `Order` tables: it tells us to which customer the order belongs. This is the core of relational databases: expressing relations between entities.
40+
41+
This table says that Joe has ordered a Keyboard, Mouse, and Cookies. He has
42+
three orders, because there are three rows with `customer_id = 1`. Jane has ordered one item: Rice (`customer_id = 2`). Of all orders, Joe's Mouse is yet to be delivered.
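
The two tables and the relation between them can be sketched in code. The example below uses Python's standard-library `sqlite3` module with an in-memory database as a stand-in for a real RDBMS. The column types are assumptions, and so are the `id` values of the Keyboard and Mouse rows, which are inferred from the description above rather than taken from the table. Note that `ORDER` is a reserved word in SQL, so the table name has to be quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    id            INTEGER PRIMARY KEY,
    first_name    TEXT,
    last_name     TEXT,
    registered_at TEXT
);
-- ORDER is a reserved word in SQL, so the table name is quoted.
CREATE TABLE "Order" (
    id          INTEGER PRIMARY KEY,
    product     TEXT,
    delivered   INTEGER,  -- SQLite has no boolean type; 1 = TRUE, 0 = FALSE
    customer_id INTEGER REFERENCES Customer(id)  -- the foreign key
);
""")

conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", [
    (1, "Joe", "Smith", "2012-01-02"),
    (2, "Jane", "Doe", "2012-01-03"),
    (3, "Susan", "Stone", "2012-01-05"),
])
conn.executemany('INSERT INTO "Order" VALUES (?, ?, ?, ?)', [
    (1, "Keyboard", 1, 1),  # ids 1 and 2 are assumed for illustration
    (2, "Mouse", 0, 1),
    (3, "Cookies", 1, 1),
    (4, "Rice", 1, 2),
])

# Follow the foreign key from each order back to the customer who placed it.
orders_by_customer = conn.execute("""
    SELECT c.first_name, o.product
    FROM Customer AS c
    JOIN "Order" AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

print(orders_by_customer)
# [('Joe', 'Keyboard'), ('Joe', 'Mouse'), ('Joe', 'Cookies'), ('Jane', 'Rice')]
```

Susan does not appear in the result because no row in `Order` references her `id`.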
43+
44+
### SQL
45+
Structured Query Language (SQL) is the industry standard language used for the management and manipulation of data in relational databases. SQL can be used to query, insert, update and delete data. All major relational databases support SQL, and that's why relational databases are frequently referred to as SQL databases. SQL is often pronounced "sequel".
46+
47+
SQL is a [declarative programming language](https://365datascience.com/tutorials/sql-tutorials/sql-declarative-language/) that is commonly used to query for data. Most commercial RDBMS platforms have their own customized SQL implementations, but these tend to be largely compatible with standard SQL, so in practice, SQL database and relational database are synonymous terms.
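
As a rough illustration of what "declarative" means, the sketch below answers the same question ("which orders are not yet delivered?") in two ways: an imperative Python loop that spells out how to walk the data, and a SQL query that only states what is wanted. It uses the standard-library `sqlite3` module, and the `orders` table and its contents are assumptions for illustration.

```python
import sqlite3

orders = [("Keyboard", True), ("Mouse", False), ("Cookies", True), ("Rice", True)]

# Imperative: we describe HOW to find the answer, step by step.
undelivered_loop = []
for product, delivered in orders:
    if not delivered:
        undelivered_loop.append(product)

# Declarative: we describe WHAT we want and let the database decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, delivered INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(product, int(delivered)) for product, delivered in orders],
)
undelivered_sql = [
    row[0]
    for row in conn.execute("SELECT product FROM orders WHERE delivered = 0")
]

print(undelivered_loop)  # ['Mouse']
print(undelivered_sql)   # ['Mouse']
```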
48+
49+
## Advantages of RDBMS
50+
A relational database system has several advantages over other types of databases, such as:
51+
1. **Simple Model** : It does not require any complex structuring or querying processes.
52+
2. **Data Accuracy** : Multiple tables can be related to one another, so each piece of data can be stored once and referenced elsewhere, which reduces duplication.
53+
3. **Data Integrity** : Schema constraints and the relations enforced between tables help prevent records from being incomplete, isolated or unrelated, which in turn supports the ease of use, precision and stability of the data.
54+
4. **Normalization** : Normalization is the process of minimizing redundancy in a relation or set of relations, and it can be easily achieved in relational databases. This term often comes up when working with SQL databases, so you can read more about it [here](https://www.geeksforgeeks.org/normal-forms-in-dbms/).
55+
5. **High Security** : RDBMS support controlled access for different users, and as the data is divided between tables, it is possible to tag a few tables as confidential and others not.
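
The data integrity point can be shown with a small sketch, again using Python's standard-library `sqlite3` module as a stand-in. The `customer` and `purchase` tables are assumptions for illustration. With a foreign key constraint in place, the database itself rejects an order that points at a customer who does not exist. In SQLite this enforcement has to be switched on per connection with `PRAGMA foreign_keys = ON`; other systems differ in their defaults.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("""
    CREATE TABLE purchase (
        id          INTEGER PRIMARY KEY,
        product     TEXT,
        customer_id INTEGER NOT NULL REFERENCES customer(id)
    )
""")

conn.execute("INSERT INTO customer VALUES (1, 'Joe')")
conn.execute("INSERT INTO purchase VALUES (1, 'Keyboard', 1)")  # valid reference

try:
    # There is no customer with id 99, so this row would be an orphan.
    conn.execute("INSERT INTO purchase VALUES (2, 'Rice', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

purchase_count = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
print(rejected, purchase_count)  # True 1
```

Because the check lives in the database rather than in application code, every client that writes to these tables is held to the same rule.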
56+
57+
### When to use a relational database
58+
Relational databases are typically the most mature databases: they have withstood the test of time and continue to be an industry standard tool for the reliable storage of important data. This makes them the go-to choice for most applications.
59+
60+
However, it's possible that your data doesn't conform nicely to a relational schema or your schema is changing so frequently that the rigid structure of a relational database is slowing down your development. In this case, you can consider using a non-relational database instead.
61+
62+
## Popular Relational Databases
63+
The most frequently used relational databases are MySQL and PostgreSQL. There
64+
are many other offerings for SQL databases, and each may have slightly different
65+
syntax for SQL. However, if you understand the general concept of relational
66+
databases, you will be able to easily use different databases.
67+
68+
### MySQL
69+
MySQL is the most popular open source SQL database. It is typically used for web application development, and often accessed using PHP in what is called the [LAMP stack](https://en.wikipedia.org/wiki/LAMP_(software_bundle)).
70+
71+
The main advantages of MySQL are that it is easy to use, inexpensive, reliable (it has been around since 1995), and has a large community of developers who can help answer questions. With a transactional storage engine such as InnoDB, MySQL stores committed data durably, although like any database it still needs backups to protect against data loss.
72+
73+
Some of the disadvantages are that it has been known to suffer from poor performance when scaling, that open source development has lagged since Oracle took control of MySQL, and that it does not include some advanced features that developers may be used to.
74+
75+
### PostgreSQL
76+
PostgreSQL is an open source SQL database that is not controlled by any corporation. It is typically used for web application development with different server-side languages.
77+
78+
PostgreSQL shares many of the advantages of MySQL. It is easy to use, inexpensive, reliable and has a large community of developers. It also provides some additional features such as foreign key support without requiring complex configuration.
79+
80+
The main disadvantage of PostgreSQL is that sometimes it can be slower in performance than other databases such as MySQL. It is also slightly less popular than MySQL.
5581

56-
This table says that Joe has ordered a keyboard, mouse, and cookies. He has
57-
three orders, because there are three rows with `customer_id = 1`. Jane has ordered one item: rice (`customer_id = 2`).
82+
Some other popular RDBMS include SQLite, SQL Server and Oracle DB.
5883

84+
In this bootcamp we will mostly be exploring MySQL. So now that you have some more knowledge of relational databases, let's understand the structuring and querying of these databases in the next lessons.
5985

86+
---
87+
## References
88+
- https://www.techopedia.com/definition/24361/database-management-systems-dbms
89+
- https://www.techopedia.com/6/28832/enterprise/databases/introduction-to-databases
90+
- https://www.codecademy.com/articles/what-is-rdbms-sql
91+
- https://shopify.engineering/five-common-data-stores-usage
92+
- https://www.educba.com/relational-database-advantages/
