Commit 8daef07

Merge pull request #32 from imohitmayank/database-section
Added Database section
2 parents c164f6d + 8364053 commit 8daef07

4 files changed

+450
-7
lines changed

Lines changed: 387 additions & 0 deletions
@@ -0,0 +1,387 @@
## Introduction to PostgreSQL

PostgreSQL, often simply called Postgres, is a powerful, open-source object-relational database management system (ORDBMS) with a strong reputation for reliability, feature robustness, and performance. Development began in 1986 at the University of California, Berkeley, as the POSTGRES project. It has since evolved into one of the most advanced and widely used database systems, backed by an active community. PostgreSQL runs on all major operating systems, including Linux, macOS, and Windows.

## Key Features of PostgreSQL

PostgreSQL offers a wide range of features that make it a popular choice for many applications:

1. **Extensive data types**: PostgreSQL supports a large variety of built-in data types and allows users to define their own custom data types. It can handle complex data types such as arrays, JSON, and geometric types.

2. **ACID compliance**: PostgreSQL adheres to the ACID principles (Atomicity, Consistency, Isolation, Durability), ensuring reliable and trustworthy transactions. [More details](databases_introduction.md#acid-compliance)

3. **Concurrency control**: PostgreSQL uses multi-version concurrency control (MVCC), so readers do not block writers and writers do not block readers. This lets many transactions work with the same data concurrently.

4. **Advanced querying capabilities**: PostgreSQL supports complex SQL queries, subqueries, common table expressions (CTEs), recursive queries, and window functions. It also allows users to define their own functions, triggers, and stored procedures in various programming languages.

5. **Full-text search**: PostgreSQL provides powerful full-text search capabilities, including stemming, ranking, and phrase-searching support. Text search can be accelerated with GIN and GiST indexes.

6. **Replication and high availability**: PostgreSQL supports various replication strategies, such as asynchronous streaming, logical, and synchronous replication, providing data redundancy, fault tolerance, and high availability.

7. **Security and authentication**: PostgreSQL offers robust security features, including SSL encryption, username/password authentication, LDAP authentication, Kerberos authentication, role-based access control (RBAC), and row-level security (RLS).

## Setting Up PostgreSQL

To get PostgreSQL running on your local machine, you will need the following tools:

1. **PostgreSQL Server**: You can follow the step-by-step instructions provided on the [official website](https://www.postgresql.org/download/). Once the installation is complete, you can run the server by opening the application.

2. **PostgreSQL Query Tools**: Once the PostgreSQL server is installed, you can install tools to manage and interact with it. There are multiple choices; each has its own unique features, and all of them support the basic functionality. Popular options include [PgAdmin](https://www.pgadmin.org/) and [DBeaver](https://dbeaver.io/), or you can use a terminal tool like [psql](https://www.postgresql.org/docs/current/app-psql.html).

!!! Hint
    **Installation on Mac**

    PostgreSQL can be installed on macOS using Homebrew. Run the command `brew install postgresql`. For more details and options, follow the [official website](https://www.postgresql.org/download/macosx/).

## Learning the Basics

Practice makes perfect, so let's learn PostgreSQL through sample code. The snippets below are arranged in increasing order of complexity and are designed to help you understand various aspects of PostgreSQL.

!!! Hint
    Before we begin, note that you interact with the database using SQL; PostgreSQL implements its own dialect of the language. If you are using a terminal, you can start the interactive client by running `psql`. Once inside, you can connect to an existing database by running the following command:

    ```sql
    -- Connecting to a PostgreSQL database (`\c` is a psql meta-command, not SQL)
    -- Use a client or terminal with appropriate access credentials
    \c my_database
    ```
    Alternatively, you can use a graphical tool like PgAdmin for a better user experience.

**1. Creating a Database**

```sql
-- Creating a database. Replace `my_database` with your database name
CREATE DATABASE my_database;
```

**2. Creating a Table**

```sql
-- Creating a simple table. Replace `employees` with your table name
CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    position VARCHAR(50),
    departmentid INT,
    salary DECIMAL
);
```

!!! Hint
    [Here is a detailed list](https://www.postgresql.org/docs/current/datatype.html) of all supported data types in PostgreSQL. Note, you can also [create custom data types](https://www.postgresql.org/docs/current/sql-createtype.html).

**3. Inserting Data**

```sql
-- Inserting data into the table
-- departmentid 1 is meant to match the department created in the join example later on
INSERT INTO employees (name, position, departmentid, salary)
VALUES ('John Doe', 'Software Engineer', 1, 70000);
```

**4. Basic Data Retrieval**

```sql
-- Retrieving all data from a table
SELECT * FROM employees;

-- Limiting the number of rows returned
SELECT * FROM employees LIMIT 10;

-- Retrieving specific columns
SELECT name, position FROM employees;

-- Retrieving data in descending order
SELECT * FROM employees ORDER BY salary DESC;
```

**5. Data Retrieval with Conditions**

```sql
-- Retrieving specific data with a condition
SELECT name, position FROM employees WHERE salary > 50000;

-- Filtering on string columns
SELECT * FROM employees WHERE name LIKE '%Doe%';

-- Filtering on datetime columns (assumes an `orders` table with an `order_date` column)
-- Note: BETWEEN includes both endpoints
SELECT * FROM orders WHERE order_date BETWEEN '2022-01-01' AND '2022-02-01';

-- Filtering on datetime columns with interval (same range as above: start date plus one month)
SELECT * FROM orders WHERE order_date BETWEEN '2022-01-01' AND '2022-01-01'::date + interval '1 month';

-- To filter based on multiple conditions and values
SELECT * FROM employees WHERE name LIKE '%Doe%' AND salary > 50000
AND position IN ('Software Engineer', 'Data Scientist');
```

**6. Updating Data**

```sql
-- Updating data in the table
UPDATE employees SET salary = 75000 WHERE name = 'John Doe';
```

**7. Deleting Data**

```sql
-- Deleting a single row from the table
DELETE FROM employees WHERE id = 1;

-- Deleting all rows from the table (the table itself remains)
DELETE FROM employees;

-- Deleting the table itself (structure and data)
DROP TABLE employees;

-- Deleting multiple tables; IF EXISTS avoids an error when a table is already gone
DROP TABLE IF EXISTS employees, departments;
```

**8. Joining Tables**

```sql
-- Creating another table
-- (if you dropped `employees` in the previous step, recreate and repopulate it first)
CREATE TABLE departments (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50)
);

-- Inserting data into the new table
INSERT INTO departments (name) VALUES ('Engineering');

-- Joining two tables
SELECT employees.name, departments.name AS department_name
FROM employees
JOIN departments ON employees.departmentid = departments.id;
```

**9. Using Aggregate Functions**

```sql
-- Using an aggregate function to get the average salary
SELECT AVG(salary) FROM employees;

-- Group by a column (ex: getting the average salary by department)
SELECT departments.name AS department_name, AVG(employees.salary) AS avg_salary
FROM employees
JOIN departments ON employees.departmentid = departments.id
GROUP BY departments.name;
```

**10. Complex Query with Subquery and Grouping**

```sql
-- Finding the highest salary in each department
SELECT department_name, MAX(salary) AS max_salary
FROM (
    SELECT employees.name, employees.salary, departments.name AS department_name
    FROM employees
    JOIN departments ON employees.departmentid = departments.id
) AS department_salaries
GROUP BY department_name;
```
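
To try the join and grouping queries above without a running PostgreSQL server, the sketch below replays them against an in-memory SQLite database through Python's built-in `sqlite3` module. SQLite is only a stand-in here (an assumption made for convenience, with `INTEGER PRIMARY KEY` in place of `SERIAL`), and the sample rows are made up for illustration. The `SELECT` itself is plain SQL and should behave the same way in PostgreSQL.

```python
import sqlite3

# SQLite stands in for PostgreSQL so the example runs without a server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT,
        departmentid INTEGER,
        salary REAL
    );
    -- Hypothetical sample rows, for illustration only
    INSERT INTO departments (id, name) VALUES (1, 'Engineering'), (2, 'Research');
    INSERT INTO employees (name, departmentid, salary) VALUES
        ('John Doe', 1, 70000),
        ('Jane Roe', 1, 90000),
        ('Sam Poe', 2, 60000);
""")

# Average and highest salary per department; note the join on departmentid.
rows = conn.execute("""
    SELECT departments.name AS department_name,
           AVG(employees.salary) AS avg_salary,
           MAX(employees.salary) AS max_salary
    FROM employees
    JOIN departments ON employees.departmentid = departments.id
    GROUP BY departments.name
    ORDER BY departments.name
""").fetchall()

for department_name, avg_salary, max_salary in rows:
    print(department_name, avg_salary, max_salary)
conn.close()
```

Joining on `employees.departmentid` rather than `employees.id` is what ties each employee to the right department; with the wrong key the groups would be built from unrelated rows.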

These examples cover a range of basic to more complex tasks you can perform with PostgreSQL, from establishing a connection to executing advanced queries. As you become more comfortable with these operations, you'll be able to tackle more complex scenarios and optimize your database interactions.

## Python Sample Code

There are multiple Python packages available for PostgreSQL, such as [psycopg2](https://pypi.org/project/psycopg2/) and [asyncpg](https://pypi.org/project/asyncpg/). In this section, we will use the [asyncpg](https://pypi.org/project/asyncpg/) package, which supports asynchronous programming.

Sample code to connect to the PostgreSQL server and fetch a result is shown below:

```python linenums="1"
# import
import asyncio
import asyncpg

# the main function that connects to the PostgreSQL server,
# fetches the result and prints it
async def run():
    # connect to the PostgreSQL server
    conn = await asyncpg.connect(user='postgres', password='admin',
                                 database='mydb', host='localhost')
    try:
        # fetch the result (a list of Record objects)
        result = await conn.fetch(
            'SELECT * FROM mytbl LIMIT 1'
        )
        # print each record as a dictionary
        print([dict(record) for record in result])
    finally:
        # close the connection
        await conn.close()

if __name__ == '__main__':
    # run the code
    asyncio.run(run())
```
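
The sample above hardcodes the credentials, which is fine for a local experiment but less so for shared code. One possible alternative is to read them from environment variables and build a connection URI. The helper below is a hypothetical sketch: the name `build_dsn` and the fallback defaults are assumptions, while the variable names follow the usual `PGHOST`/`PGUSER` convention.

```python
import os
from urllib.parse import quote

def build_dsn(env=None):
    """Build a postgresql:// connection URI from PG* environment variables.

    `build_dsn` is a hypothetical helper; the defaults mirror the sample above.
    """
    env = os.environ if env is None else env
    user = quote(env.get("PGUSER", "postgres"), safe="")
    password = quote(env.get("PGPASSWORD", ""), safe="")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = quote(env.get("PGDATABASE", "mydb"), safe="")
    # Only include the password segment when one is set
    credentials = f"{user}:{password}" if password else user
    return f"postgresql://{credentials}@{host}:{port}/{database}"

# Example with an explicit mapping instead of the real environment
print(build_dsn({"PGUSER": "postgres", "PGPASSWORD": "p@ss/word", "PGDATABASE": "mydb"}))
```

The resulting string can then be passed to the connection call, for example `await asyncpg.connect(dsn=build_dsn())`. Percent-encoding the user and password keeps characters such as `@` or `/` from being misread as URI separators.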

Dynamic queries based on user input can be created by passing the variables to the `fetch` function. The modification is shown below. The query has two placeholders, `$1` and `$2`, for `id` and `limit` respectively, and the corresponding values are passed as extra arguments to `fetch`. The rest of the code remains the same.

```python linenums="1"
# fetch the result
result = await conn.fetch(
    'SELECT * FROM mytbl where id = $1 LIMIT $2',
    123, 1
)
```
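
When the set of filters itself depends on user input, the placeholder numbers have to be generated as well. The helper below is a minimal sketch of one way to do that; `build_select` and its parameters are hypothetical names, not part of asyncpg. Values always travel as parameters, while table and column names (which cannot be parameterized) are checked against an allow-list.

```python
def build_select(table, filters, limit, allowed_columns):
    """Return (query, args) with $1..$n placeholders for the given filters.

    `build_select` is a hypothetical helper. Column and table names cannot be
    passed as query parameters, so they are validated before being embedded.
    """
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    clauses, args = [], []
    for column, value in filters.items():
        if column not in allowed_columns:
            raise ValueError(f"unexpected column: {column!r}")
        args.append(value)
        clauses.append(f"{column} = ${len(args)}")
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    args.append(limit)
    query = f"SELECT * FROM {table}{where} LIMIT ${len(args)}"
    return query, args

query, args = build_select("mytbl", {"id": 123}, limit=1, allowed_columns={"id", "name"})
print(query)  # SELECT * FROM mytbl WHERE id = $1 LIMIT $2
print(args)   # [123, 1]
```

The pair can then be handed to asyncpg as `await conn.fetch(query, *args)`.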

You can use `conn.execute` to run a statement without fetching rows; it returns the command status string (for example, `INSERT 0 1`). Below is the modification needed. Note that an `INSERT ... VALUES` statement does not take a `WHERE` clause, so the `id` is supplied as a regular column value.

```python linenums="1"
# insertion example (one row)
result = await conn.execute(
    'INSERT INTO mytbl (id, code, name) VALUES ($1, $2, $3)',
    1, 123, 'mohit'
)
```

If you want to execute the statement for multiple rows, you can use `conn.executemany` instead of `conn.execute`. It takes a list of argument tuples, one per row. Below is the modification to the code shown above.

```python linenums="1"
# insertion example (multiple rows)
await conn.executemany(
    'INSERT INTO mytbl (id, code, name) VALUES ($1, $2, $3)',
    [(1, 123, 'mohit'), (2, 124, 'mayank')]
)
```

You might want to create a generic function that executes queries and retries on failure. Here is how you can do it using the `tenacity` library. The code below makes up to 4 attempts (that is, up to 3 retries) with exponential backoff between attempts.

```python linenums="1"
# import
import asyncio
import asyncpg
import functools
from tenacity import TryAgain, retry, stop_after_attempt, wait_exponential

# custom retry logging function
def custom_retry_log(retry_state, msg):
    if retry_state.attempt_number != 1:
        print(f"Retrying {retry_state.attempt_number - 1} for {msg}")

# main function
async def execute_fetch_script(script, values=(), msg=None, retry_on_failure=True):
    # create connection
    conn = await asyncpg.connect(user='postgres', password='admin',
                                 database='mydb', host='localhost')
    try:
        # bind the message to the logging callback
        log_callback = functools.partial(custom_retry_log, msg=msg)

        # retry mechanism: up to 4 attempts with exponential backoff
        @retry(wait=wait_exponential(multiplier=2, min=2, max=16),
               stop=stop_after_attempt(4),
               after=log_callback, reraise=True)
        async def retry_wrapper():
            try:
                # execute the select SQL script
                records = await conn.fetch(script, *values)
                project_records = [dict(record) for record in records]
                return project_records
            except Exception as e:
                if retry_on_failure:
                    # ask tenacity to retry
                    raise TryAgain(e)
                else:
                    print(f"Failure in {msg} - {e}")
                    return

        # db call wrapper
        return await retry_wrapper()
    except Exception as e:
        raise Exception(f"Failure in {msg} - {e}") from e
    finally:
        # close db connections
        await conn.close()


if __name__ == '__main__':
    script = 'SELECT * FROM mytbl where projectid = $1 LIMIT $2'
    values = (2, 1)
    print(asyncio.run(execute_fetch_script(script, values, "Testing Run")))
```
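
To make the retry-with-backoff idea concrete without any third-party dependency, here is a minimal standard-library sketch. It is not how `tenacity` is implemented; the function name `retry_async`, its defaults, and the simulated `flaky_query` are all illustrative assumptions. The delays are kept tiny so the demo finishes quickly.

```python
import asyncio

async def retry_async(operation, attempts=4, base_delay=0.01, max_delay=0.08):
    """Call `operation()` until it succeeds or `attempts` calls have failed.

    The wait doubles after each failure (base_delay, 2 * base_delay, ...) and
    is capped at max_delay. All names and defaults here are illustrative.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            await asyncio.sleep(delay)

# Demo: a hypothetical operation that fails twice, then succeeds
calls = {"count": 0}

async def flaky_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return [{"projectid": 2}]

result = asyncio.run(retry_async(flaky_query))
print(result, calls["count"])
```

In practice you would likely retry only on errors that are plausibly transient (dropped connections, timeouts) rather than on every exception, since a malformed query will fail the same way on each attempt.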

In all of the examples above, each statement runs in its own implicit transaction. If you want to execute multiple queries in one transaction, you can do so as shown below. The transaction is committed when the `async with` block exits normally and is rolled back automatically if an exception is raised inside it, so no manual `ROLLBACK` is needed.

```python linenums="1"
# import
import asyncio
import asyncpg

async def run():
    # create the connection
    conn = await asyncpg.connect(user='postgres', password='admin',
                                 database='mydb', host='localhost')
    try:
        # start the transaction
        async with conn.transaction():
            # Query 1 - execute the select SQL script
            records = await conn.fetch('SELECT * FROM mytbl where projectid = $1 LIMIT $2', 2, 1)

            # Query 2 - update the table
            await conn.execute('UPDATE mytbl SET name = $1 where projectid = $2', 'mohit', 2)

    # handle exception
    except Exception as e:
        # by this point the transaction has already been rolled back
        print(f"Transaction failed and was rolled back - {e}")

    finally:
        # close db connections
        await conn.close()

if __name__ == '__main__':
    asyncio.run(run())
```
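
The all-or-nothing behaviour of a transaction can be observed without a PostgreSQL server. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in (an assumption for convenience; its API differs from asyncpg's), and the `audit_log` table and `rename_project` function are hypothetical. A failure between the two statements leaves the first one undone as well.

```python
import sqlite3

def rename_project(conn, project_id, new_name, fail=False):
    """Run two statements atomically; both are rolled back if anything raises."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("UPDATE mytbl SET name = ? WHERE projectid = ?",
                         (new_name, project_id))
            if fail:
                raise RuntimeError("simulated failure before commit")
            conn.execute("INSERT INTO audit_log (projectid, note) VALUES (?, ?)",
                         (project_id, f"renamed to {new_name}"))
        return True
    except RuntimeError:
        return False

# SQLite stands in for PostgreSQL; the tables and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytbl (projectid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE audit_log (projectid INTEGER, note TEXT)")
conn.execute("INSERT INTO mytbl VALUES (2, 'old')")
conn.commit()

failed = rename_project(conn, 2, "mohit", fail=True)
name_after_failure = conn.execute("SELECT name FROM mytbl WHERE projectid = 2").fetchone()[0]
succeeded = rename_project(conn, 2, "mohit")
name_after_success = conn.execute("SELECT name FROM mytbl WHERE projectid = 2").fetchone()[0]
print(failed, name_after_failure, succeeded, name_after_success)
```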

## Snippets

Real-world problems will require much more than what we covered in the sections above. Let's cover some important queries in this section.

**Casting a column to a different data type**

```sql
-- Casting a column to a different data type
SELECT CAST(salary AS VARCHAR) FROM employees;
```

**Using JSONB column**

```sql
-- Extracting data from JSONB column
-- Suppose data column contains {"name": "John", "address": {"city": "New York", "state": "NY"}}
-- jsonb_extract_path returns a jsonb value; use jsonb_extract_path_text to get plain text
SELECT name, jsonb_extract_path(data, 'address', 'city') AS city FROM employees;
```
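
As a rough mental model, path extraction walks the nested document one key at a time and yields nothing if a key is missing. The Python sketch below mimics that behaviour on a parsed JSON document; `extract_path` is a hypothetical helper written for illustration, not a PostgreSQL or library function.

```python
import json

def extract_path(document, *path):
    """Follow `path` through nested dicts; return None if any key is missing."""
    current = document
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

data = json.loads('{"name": "John", "address": {"city": "New York", "state": "NY"}}')
city = extract_path(data, "address", "city")
print(json.dumps(city))  # JSON form, with quotes: "New York"
print(city)              # plain text form: New York
print(extract_path(data, "address", "zip"))  # None (missing key)
```

The two printed forms of `city` correspond to the difference between getting the value back as JSON and getting it back as text.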

**Truncating and extracting components from a DateTime column**

```sql
-- Grouping orders by month using a DATE column
-- Suppose in a tbl, order_date col contains info like 2022-01-01
-- DATE_TRUNC rounds each value down to the start of its month
SELECT DATE_TRUNC('month', order_date) AS month, COUNT(*) AS order_count
FROM orders
GROUP BY month
ORDER BY month;

-- Truncate to year, use: DATE_TRUNC('year', order_date)
-- Truncate to quarter, use: DATE_TRUNC('quarter', order_date)
-- Truncate to week, use: DATE_TRUNC('week', order_date)
-- Truncate to day, use: DATE_TRUNC('day', order_date)
-- For timestamp columns, 'hour', 'minute' and 'second' are also available
-- To get a numeric component instead (ex: 1 for January), use: EXTRACT(MONTH FROM order_date)
```
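
The difference between truncating a date and extracting a component from it is easy to mix up. The short Python sketch below illustrates the two ideas with the standard `datetime` module; `trunc_month` and `trunc_quarter` are hypothetical helpers that only approximate what the SQL functions do, and the sample date is made up.

```python
from datetime import date

def trunc_month(d):
    """Round a date down to the first day of its month."""
    return d.replace(day=1)

def trunc_quarter(d):
    """Round a date down to the first day of its quarter."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)

order_date = date(2022, 5, 17)
print(trunc_month(order_date))    # 2022-05-01
print(trunc_quarter(order_date))  # 2022-04-01
print(order_date.month)           # 5, the numeric component an extraction would give
```

Truncation keeps the year, so orders from May 2022 and May 2023 land in different groups, whereas grouping by the extracted month number alone would merge them.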

## Conclusion

PostgreSQL's combination of features, performance, and reliability makes it a popular choice for a wide range of applications, from small projects to large-scale enterprise systems. Its open-source nature, strong community support, and continuous development suggest that PostgreSQL will remain a leading database management system for years to come. We hope this article helped you understand the basics of PostgreSQL and piqued your interest in learning more.

## References

[1] GeeksforGeeks - [What is PostgreSQL?](https://www.geeksforgeeks.org/what-is-postgresql-introduction/) | [PostgreSQL Tutorial](https://www.geeksforgeeks.org/postgresql-tutorial/)

[2] w3schools - [PostgreSQL Tutorial](https://www.w3schools.com/postgresql/postgresql_intro.php)

[3] Tutorialspoint - [PostgreSQL Tutorial](https://www.tutorialspoint.com/postgresql/postgresql_overview.htm)
