Commit d94b319

committed
database section added
1 parent c164f6d commit d94b319

3 files changed

+221
-6
lines changed

Lines changed: 186 additions & 0 deletions
@@ -0,0 +1,186 @@
1+
## Introduction to PostgreSQL
2+
3+
PostgreSQL, often simply called Postgres, is a powerful, open-source object-relational database management system (ORDBMS). It has a strong reputation for reliability, feature robustness, and performance, and it runs on all major operating systems, including Linux, macOS, and Windows.
4+
5+
PostgreSQL was first developed in 1986 at the University of California, Berkeley, as part of the POSTGRES project. It has since evolved into one of the most advanced and widely used database systems, with a strong community supporting its development.
6+
7+
8+
## Key Features of PostgreSQL
9+
10+
PostgreSQL offers a wide range of features that make it a popular choice for many applications:
11+
12+
1. **Extensive data types**: PostgreSQL supports a large variety of built-in data types and allows users to define their own custom data types. It can handle complex data types such as arrays, JSON, and geometric types[1][2].
13+
14+
2. **ACID compliance**: PostgreSQL adheres to the ACID principles (Atomicity, Consistency, Isolation, Durability), ensuring reliable and trustworthy transactions[2].
15+
16+
3. **Concurrency control**: PostgreSQL uses multi-version concurrency control (MVCC), under which each transaction works against a consistent snapshot of the data. Readers do not block writers and writers do not block readers, so multiple transactions can access the same data simultaneously[2][4].
17+
18+
4. **Advanced querying capabilities**: PostgreSQL supports complex SQL queries, subqueries, common table expressions (CTEs), recursive queries, and window functions. It also allows users to define their own functions, triggers, and stored procedures in various programming languages[2][4].
19+
20+
5. **Full-text search**: PostgreSQL provides powerful full-text search capabilities, including stemming, ranking, and phrase-search support. Text searches can be accelerated with GIN and GiST indexes[2].
21+
22+
6. **Replication and high availability**: PostgreSQL supports various replication strategies, such as asynchronous streaming, logical, and synchronous replication, providing data redundancy, fault tolerance, and high availability[2].
23+
24+
7. **Security and authentication**: PostgreSQL offers robust security features, including SSL encryption, username/password authentication, LDAP authentication, Kerberos authentication, role-based access control (RBAC), and row-level security (RLS)[2].
25+
26+
8. **Extensibility**: PostgreSQL is designed to be extensible, allowing users to add custom data types, operators, and functions to the database to expand its capabilities[1][2].
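
A few of the features listed above can be illustrated with a short sketch. The `articles` table, its columns, and the sample row are hypothetical and exist only for this illustration; the example touches array and JSONB columns and a full-text search backed by a GIN index.

```sql
-- Hypothetical table used only for this illustration
CREATE TABLE articles (
    id       SERIAL PRIMARY KEY,
    title    TEXT NOT NULL,
    tags     TEXT[],   -- array column
    metadata JSONB     -- binary JSON column
);

INSERT INTO articles (title, tags, metadata)
VALUES ('Getting started with PostgreSQL',
        ARRAY['database', 'tutorial'],
        '{"author": "jdoe", "views": 120}');

-- Array containment (@>) and JSON field extraction (->>)
SELECT title, metadata ->> 'author' AS author
FROM articles
WHERE tags @> ARRAY['tutorial']
  AND (metadata ->> 'views')::INTEGER > 100;

-- Expression index so full-text searches on the title can use GIN
CREATE INDEX articles_title_fts_idx
    ON articles USING GIN (to_tsvector('english', title));

-- Full-text search: match titles containing both (stemmed) terms
SELECT title
FROM articles
WHERE to_tsvector('english', title) @@ to_tsquery('english', 'postgresql & start');
```

The search query repeats the exact `to_tsvector('english', title)` expression used in the index definition, which is what allows the planner to consider the index.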
27+
28+
## Setting Up PostgreSQL
29+
30+
To get PostgreSQL running on your local machine, you will need to have the following tools installed:
31+
32+
1. **PostgreSQL Server**: To set up a PostgreSQL server on your local machine, follow the step-by-step instructions on the [official website](https://www.postgresql.org/download/). Once the installation is complete, you can start the server by opening the application.
33+
34+
2. **PostgreSQL Admin/Dev Tools**: Once the PostgreSQL server is installed, you can install tools to manage and interact with it. Several such tools exist, each with its own feature set, though most cover the basics. Popular choices include [pgAdmin](https://www.pgadmin.org/) and [DBeaver](https://dbeaver.io/), or you can use a terminal tool such as [psql](https://www.postgresql.org/docs/current/app-psql.html).
35+
36+
!!! Hint
    **Installation on Mac**

    PostgreSQL can be installed on macOS with Homebrew by running `brew install postgresql`. For more options, see the [official website](https://www.postgresql.org/download/macosx/).
40+
41+
## Snippets
42+
43+
Working through sample code is a good way to learn what PostgreSQL can do and how it is used in different scenarios. The snippets below are arranged in roughly increasing order of complexity:
44+
45+
### PostgreSQL Query Language
46+
47+
Let's first explore some of the most basic operations in PostgreSQL. We start with the PostgreSQL query language, which is a variant of SQL and will look very familiar if you have used SQL before.
48+
49+
#### Connecting to a PostgreSQL Database

If you are working in a terminal, start the interactive shell by running `psql`. Once inside, connect to an existing database with the `\c` meta-command:

```sql
-- Connect to an existing database from inside psql
-- (requires a role with access to that database)
\c my_database
```

Alternatively, use a graphical tool such as pgAdmin for a friendlier experience.
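
Once a connection is open, whether in psql or in a graphical tool, a quick query can confirm which database and role the session is using. The sketch below relies only on built-in functions; the column aliases are arbitrary.

```sql
-- Report the current database, the connected role, and the server version
SELECT current_database() AS database_name,
       current_user       AS role_name,
       version()          AS server_version;
```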
60+
61+
62+
#### Creating a Database
63+
64+
```sql
-- Create a new database named my_database
CREATE DATABASE my_database;
```
68+
69+
#### Creating a Table
70+
71+
```sql
-- Create a simple table of employees
CREATE TABLE employees (
    id SERIAL PRIMARY KEY,   -- auto-incrementing integer key
    name VARCHAR(50),
    position VARCHAR(50),
    salary DECIMAL
);
```
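
In practice, table definitions usually carry constraints so that invalid rows are rejected at insert time. The following is a minimal sketch on a hypothetical `contractors` table, which is not part of the walkthrough, with column names chosen purely for illustration.

```sql
-- Hypothetical table showing common column constraints
CREATE TABLE contractors (
    id          SERIAL PRIMARY KEY,
    name        VARCHAR(50) NOT NULL,                    -- value is required
    email       VARCHAR(100) UNIQUE,                     -- no duplicates allowed
    hourly_rate NUMERIC(8, 2) CHECK (hourly_rate > 0),   -- fixed precision, must be positive
    hired_on    DATE NOT NULL DEFAULT CURRENT_DATE       -- filled in when omitted
);
```

`NUMERIC(8, 2)` stores values with exactly two decimal places, which tends to suit monetary amounts better than an unconstrained `DECIMAL`.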
80+
81+
#### Inserting Data
82+
83+
```sql
-- Insert one row into the table
INSERT INTO employees (name, position, salary)
VALUES ('John Doe', 'Software Engineer', 70000);
```
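
Several rows can be inserted in a single statement, and PostgreSQL's `RETURNING` clause hands back generated values such as the `SERIAL` id. The names and salaries below are made-up sample data.

```sql
-- Insert multiple rows at once and return their generated ids
INSERT INTO employees (name, position, salary)
VALUES
    ('Jane Smith', 'Data Scientist', 82000),
    ('Raj Patel', 'Data Analyst', 58000)
RETURNING id, name;
```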
88+
89+
#### Basic Data Retrieval
90+
91+
```sql
-- Retrieve every row and column from the table
SELECT * FROM employees;
```
95+
96+
#### Data Retrieval with Conditions
97+
98+
```sql
-- Retrieve selected columns for rows that match a condition
SELECT name, position FROM employees WHERE salary > 50000;
```
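
Conditions can be combined with sorting and row limits. The sketch below uses the `employees` table from above; the filter values are arbitrary, and `ILIKE` is PostgreSQL's case-insensitive variant of `LIKE`.

```sql
-- Top five engineering salaries within a range, highest first
SELECT name, position, salary
FROM employees
WHERE position ILIKE '%engineer%'
  AND salary BETWEEN 50000 AND 100000
ORDER BY salary DESC
LIMIT 5;
```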
102+
103+
#### Updating Data
104+
105+
```sql
-- Update the salary of a specific employee
UPDATE employees SET salary = 75000 WHERE name = 'John Doe';
```
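
Because PostgreSQL is ACID compliant, an update can be wrapped in an explicit transaction and inspected before it becomes permanent. The following is a minimal sketch; the 5% raise is an arbitrary example value.

```sql
BEGIN;

-- Apply the change and look at the affected rows
UPDATE employees
SET salary = salary * 1.05
WHERE position = 'Software Engineer'
RETURNING id, name, salary;

-- Keep the change...
COMMIT;
-- ...or run ROLLBACK; instead of COMMIT to discard it
```

Until `COMMIT` runs, other sessions continue to see the old salaries.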
109+
110+
#### Deleting Data
111+
112+
```sql
-- The statements below are independent alternatives,
-- not a script to run from top to bottom.

-- Delete a single row
DELETE FROM employees WHERE id = 1;

-- Delete every row but keep the table
DELETE FROM employees;

-- Drop the table itself
DROP TABLE employees;

-- Drop several tables in one statement
DROP TABLE employees, departments;
```
125+
126+
#### Joining Tables
127+
128+
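```sql
-- Create a second table
CREATE TABLE departments (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50)
);

-- Insert a row into the new table
INSERT INTO departments (name) VALUES ('Engineering');

-- Link employees to departments through a foreign key
ALTER TABLE employees
    ADD COLUMN department_id INTEGER REFERENCES departments (id);
UPDATE employees
SET department_id = (SELECT id FROM departments WHERE name = 'Engineering')
WHERE name = 'John Doe';

-- Join the two tables on the foreign key
SELECT employees.name, departments.name AS department_name
FROM employees
JOIN departments ON employees.department_id = departments.id;
```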
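
An inner join drops employees that have no matching department. A `LEFT JOIN` keeps them. The sketch below assumes `employees` carries a `department_id` column referencing `departments (id)`, and the `'Unassigned'` label is an arbitrary placeholder.

```sql
-- List every employee, including those without a department
SELECT e.name,
       COALESCE(d.name, 'Unassigned') AS department_name
FROM employees AS e
LEFT JOIN departments AS d ON e.department_id = d.id
ORDER BY e.name;
```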
144+
145+
#### Using Aggregate Functions
146+
147+
```sql
-- Compute the average salary across all employees
SELECT AVG(salary) FROM employees;
```
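
Several aggregates can be computed in one pass over the table. This sketch uses only built-in functions on the `employees` table; the column aliases are arbitrary.

```sql
-- Summary statistics for the salary column
SELECT COUNT(*)              AS employee_count,
       MIN(salary)           AS min_salary,
       MAX(salary)           AS max_salary,
       ROUND(AVG(salary), 2) AS avg_salary
FROM employees;
```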
151+
152+
#### Complex Query with Subquery and Grouping
153+
154+
```sql
-- Find the highest salary in each department
-- (assumes employees.department_id references departments.id)
SELECT department_name, MAX(salary) AS max_salary
FROM (
    SELECT employees.name, employees.salary, departments.name AS department_name
    FROM employees
    JOIN departments ON employees.department_id = departments.id
) AS department_salaries
GROUP BY department_name;
```
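
The grouped query above returns the top salary but not who earns it. One way to recover the employee as well is a common table expression combined with a window function. The following is a sketch that assumes `employees` has a `department_id` column referencing `departments (id)`.

```sql
-- Rank employees by salary within each department, then keep the top rank
WITH ranked AS (
    SELECT e.name,
           e.salary,
           d.name AS department_name,
           RANK() OVER (PARTITION BY d.id ORDER BY e.salary DESC) AS salary_rank
    FROM employees AS e
    JOIN departments AS d ON e.department_id = d.id
)
SELECT department_name, name, salary
FROM ranked
WHERE salary_rank = 1;
```

`RANK()` assigns the same rank to ties, so a department with two equally paid top earners returns both rows.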
165+
166+
These examples cover a range of basic to more complex tasks you can perform with PostgreSQL, from establishing a connection to executing advanced queries. As you become more comfortable with these operations, you'll be able to tackle more complex scenarios and optimize your database interactions.
167+
168+
### Python Sample Code
169+
170+
...
171+
172+
## Conclusion
173+
174+
PostgreSQL's combination of features, performance, and reliability makes it a popular choice for a wide range of applications, from small projects to large-scale enterprise systems. Its open-source nature, strong community support, and continuous development ensure that PostgreSQL will remain a leading database management system for years to come.
175+
176+
## References
177+
178+
[1] https://www.geeksforgeeks.org/what-is-postgresql-introduction/
179+
180+
[2] https://www.linkedin.com/pulse/postgresql-practical-guidefeatures-advantages-brainerhub-solutions
181+
182+
[3] https://www.w3schools.com/postgresql/postgresql_intro.php
183+
184+
[4] https://www.tutorialspoint.com/postgresql/postgresql_overview.htm
185+
186+
[5] https://www.geeksforgeeks.org/postgresql-tutorial/
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
# Databases
2+
3+
## Introduction
4+
5+
Databases are an essential part of any data science project. They provide structured, organized, and secure storage for data, and they allow that data to be retrieved, manipulated, and analyzed efficiently.
6+
7+
## Relational Databases
8+
9+
Relational databases are structured using tables, with each table containing columns and rows. Each row represents a record, and each column contains a field. This structure makes it easier to query the database and find the information that is needed. Additionally, tables in a relational database can be linked to one another using keys, so a row in one table can refer to a row in another. This makes it possible to combine data from different tables and use it to generate meaningful insights. The name "relational" itself comes from the mathematical notion of a relation, which is what a table represents.
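
The idea of linking tables through keys can be sketched in SQL. The `customers` and `orders` tables below are hypothetical and serve only to show a primary key in one table being referenced by a foreign key in another.

```sql
-- Each customer has a unique primary key
CREATE TABLE customers (
    id   SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- Each order points back to exactly one customer through a foreign key
CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (id),
    total       NUMERIC(10, 2) NOT NULL
);

-- The key relationship lets the two tables be combined in a query
SELECT c.name, SUM(o.total) AS total_spent
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name;
```

The `REFERENCES` clause also makes the database reject an order whose `customer_id` does not exist in `customers`.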
10+
11+
### Comparison of Relational Databases
12+
13+
| Feature / Database | Oracle | MySQL | Microsoft SQL Server | PostgreSQL | MariaDB | SQLite | IBM Db2 | SAP HANA |
|--------------------|--------|-------|----------------------|------------|---------|--------|---------|----------|
| **License** | Commercial | Open Source | Commercial | Open Source | Open Source | Public Domain | Commercial | Commercial |
| **ACID Compliance** | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| **Scalability** | Vertical/Horizontal | Vertical | Vertical/Horizontal | Vertical/Horizontal | Vertical/Horizontal | Limited | Vertical/Horizontal | Horizontal |
| **Partitioning** | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| **Replication** | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| **JSON Support** | Yes | Limited | Yes | Yes | Yes | Yes (JSON functions) | Yes | Yes |
| **Geospatial Support** | Yes | Limited | Yes | Yes (PostGIS) | Yes | No | Yes | Yes |
| **Stored Procedures** | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| **User Base** | Large Enterprises | Web Applications | Enterprises | Diverse Applications | Web Applications | Embedded Applications | Enterprises | Enterprises |
| **Popularity Rank** | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
25+
26+
This table summarizes the key features and characteristics of some of the top relational databases, highlighting their strengths and use cases. Oracle leads in enterprise environments, while MySQL and PostgreSQL are popular for web applications and diverse workloads, respectively. Microsoft SQL Server is favored in corporate settings, and MariaDB is recognized for its compatibility with MySQL. SQLite is widely used for lightweight applications, and IBM Db2 and SAP HANA cater to enterprise needs with robust features.

mkdocs.yml

Lines changed: 9 additions & 6 deletions
@@ -126,12 +126,15 @@ nav:
126126
- 'KG Embedding Algorithms': 'network_science/kg_embedding_algorithms.md'
127127

128128
- 'Data Science Tools':
129-
- 'data_science_tools/introduction.md'
130-
- 'data_science_tools/python_snippets.md'
131-
- 'data_science_tools/python_good_practices.md'
132-
- 'data_science_tools/version_control.md'
133-
- 'data_science_tools/compute_and_ai_services.md'
134-
- 'data_science_tools/scraping_websites.md'
129+
- 'data_science_tools/introduction.md'
130+
- 'data_science_tools/python_snippets.md'
131+
- 'data_science_tools/python_good_practices.md'
132+
- 'data_science_tools/version_control.md'
133+
- 'data_science_tools/compute_and_ai_services.md'
134+
- 'data_science_tools/scraping_websites.md'
135+
- 'Database':
136+
- 'Introduction': 'data_science_tools/databases_introduction.md'
137+
- 'PostgreSQL': 'data_science_tools/database_postgresql.md'
135138

136139
- 'Machine Learning':
137140
- 'machine_learning/introduction.md'
