Commit a4cbf55

Merge pull request #1036 from dinesh-2047/sql-group-having
sql-having-vs-group-by
2 parents 1d8901e + 06ff74b commit a4cbf55

2 files changed

+217
-0
lines changed

Lines changed: 216 additions & 0 deletions
@@ -0,0 +1,216 @@
---
id: sql-having-vs-group-by
title: Difference Between HAVING and GROUP BY in SQL
sidebar_label: HAVING vs GROUP BY
sidebar_position: 4
tags: [sql, having, group-by, database, relational-databases]
description: In this super beginner-friendly guide, you’ll learn the key differences between SQL’s HAVING and GROUP BY clauses, how they work together, and when to use each for powerful data analysis!
keywords: [sql, having, group by, sql tutorial, sql basics, database management, sql for beginners, sql in 2025]
---

## 📙 Welcome to HAVING vs GROUP BY!

Hey there, SQL beginner! If you’ve ever wondered how to group data and filter those groups in SQL, you’ve likely come across **GROUP BY** and **HAVING**. These clauses are powerful tools for summarizing and filtering data, but they serve different purposes and are often confused. Using a simple `students` table (with columns `id`, `name`, `age`, `marks`, and `city`), we’ll break down their differences, show how they work together, provide a handy comparison table, and include clear examples to make you a pro. Let’s dive in!

### 📘 What Are GROUP BY and HAVING?

- **GROUP BY**: Organizes rows into groups based on one or more columns and is typically used with aggregate functions (e.g., `COUNT`, `AVG`, `SUM`) to summarize data within each group.
- **HAVING**: Filters the grouped results based on conditions involving aggregate functions, acting like a `WHERE` clause but for groups rather than individual rows.

Think of `GROUP BY` as sorting your data into buckets (e.g., grouping students by city), and `HAVING` as deciding which buckets to keep (e.g., only cities with an average mark above 80). They’re often used together in SQL queries, but they have distinct roles and rules.

> **Pro Tip**: Always write `GROUP BY` before `HAVING` in a query, as SQL processes `GROUP BY` first to create groups, then applies `HAVING` to filter them!

### 📘 Detailed Differences Between GROUP BY and HAVING

To understand when and how to use `GROUP BY` and `HAVING`, let’s explore their differences in detail, followed by a comparison table summarizing the key points.

#### 1. Purpose
- **GROUP BY**:
  - Groups rows with identical values in specified columns into summary rows.
  - Used to aggregate data (e.g., calculate averages, counts) within each group.
  - Example: Group students by `city` to find the average marks per city.
- **HAVING**:
  - Filters the groups created by `GROUP BY` based on conditions involving aggregate functions.
  - Acts like a gatekeeper, keeping only the groups that meet the condition.
  - Example: Keep only cities where the average marks are above 80.

#### 2. What They Operate On
- **GROUP BY**:
  - Operates on individual rows to organize them into groups.
  - Works with raw column values (e.g., `city`, `age`) to define groups.
  - Usually paired with aggregate functions (e.g., `AVG`, `COUNT`) in the `SELECT` clause; without one, it simply returns each distinct combination of the grouped columns.
- **HAVING**:
  - Operates on the grouped results after `GROUP BY` is applied.
  - Works with aggregate functions (e.g., `AVG(marks)`, `COUNT(id)`) to filter groups.
  - Cannot reference non-aggregated columns unless they’re in the `GROUP BY` clause.

#### 3. Position in Query
- **GROUP BY**:
  - Appears after the `FROM` and `WHERE` clauses in a SQL query.
  - Precedes `HAVING` in both syntax and execution order.
  - Syntax order: `SELECT` → `FROM` → `WHERE` → `GROUP BY` → `HAVING` → `ORDER BY` → `LIMIT`.
- **HAVING**:
  - Appears immediately after `GROUP BY` in a query.
  - Applied after groups are formed, filtering the aggregated results.
  - Almost always paired with `GROUP BY`. Standard SQL does accept `HAVING` on its own, but it then treats all rows as a single group, which is rarely what you want.

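To make the written order concrete, here is a small sketch of a query that uses every clause once. It assumes the `students` table from this guide; the thresholds and the `LIMIT 5` are arbitrary choices for illustration, and `LIMIT` itself is not universal (SQL Server, for example, uses `TOP` or `OFFSET ... FETCH` instead).

```sql title="Every Clause in Written Order (illustrative sketch)"
SELECT city, AVG(marks) AS avg_marks   -- columns and aggregates to return
FROM students                          -- source table
WHERE age >= 18                        -- row-level filter
GROUP BY city                          -- one group per city
HAVING AVG(marks) > 80                 -- group-level filter
ORDER BY avg_marks DESC                -- sort the surviving groups
LIMIT 5;                               -- cap the number of rows returned
```

Swapping any two of these clauses (for example, putting `HAVING` above `GROUP BY`) should produce a syntax error.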
#### 4. Conditions They Support
- **GROUP BY**:
  - Doesn’t support conditions directly; it defines how rows are grouped.
  - Example: `GROUP BY city` groups all rows by unique city values.
- **HAVING**:
  - Supports conditions using aggregate functions (e.g., `AVG(marks) > 80`).
  - Can also include non-aggregated columns if they’re part of the `GROUP BY` clause (e.g., `city = 'Mumbai'`).
  - Example: `HAVING AVG(marks) > 80` keeps groups with high average marks.

#### 5. Comparison with WHERE
- **GROUP BY**:
  - Works with `WHERE` to filter individual rows before grouping.
  - Example: Use `WHERE age > 18` to filter students before grouping by city.
- **HAVING**:
  - Acts like `WHERE` but for groups, applied after `GROUP BY`.
  - Unlike `WHERE`, cannot test a plain column unless that column is listed in `GROUP BY`.
  - Example: Use `HAVING COUNT(id) > 2` to keep groups with more than two students.

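The split between the two filters is easiest to see when an aggregate lands in the wrong clause. The sketch below assumes the `students` table and the four sample rows used in this guide; the invalid query is left commented out so the script still runs.

```sql title="Aggregate Conditions Belong in HAVING (illustrative sketch)"
-- Invalid: WHERE is evaluated before any groups exist,
-- so it cannot see COUNT(id). Most databases reject this query.
-- SELECT city, COUNT(id) AS student_count
-- FROM students
-- WHERE COUNT(id) > 2
-- GROUP BY city;

-- Valid: the row-level condition stays in WHERE,
-- and the aggregate condition moves to HAVING.
SELECT city, COUNT(id) AS student_count
FROM students
WHERE age > 18
GROUP BY city
HAVING COUNT(id) > 2;
```

With the sample rows, every student passes `age > 18`, Mumbai ends up with three students and Delhi with one, so only Mumbai survives the `HAVING` filter.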
#### 6. Execution Order
- **GROUP BY**:
  - Executed after `FROM` and `WHERE`, grouping rows based on specified columns.
  - Part of the query execution pipeline: `FROM` → `WHERE` → `GROUP BY` → `HAVING` → `SELECT` → `ORDER BY` → `LIMIT`.
- **HAVING**:
  - Executed after `GROUP BY`, filtering the grouped results.
  - Only processes the aggregated data produced by `GROUP BY`.

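One way to internalize the pipeline is to trace a query by hand. The sketch below assumes the `students` table with the four sample rows from the Examples section of this guide; the comments describe the logical steps, not how any particular database engine actually schedules the work.

```sql title="Tracing the Logical Execution Order (illustrative sketch)"
SELECT age, COUNT(id) AS student_count
FROM students
WHERE city = 'Mumbai'
GROUP BY age
HAVING COUNT(id) >= 2
ORDER BY age;

-- Logical steps on the sample data:
-- 1. FROM students          -> 4 rows (Alice, Bob, Carol, Dave)
-- 2. WHERE city = 'Mumbai'  -> 3 rows (Alice, Bob, Dave)
-- 3. GROUP BY age           -> 2 groups: age 20 (Alice, Dave), age 22 (Bob)
-- 4. HAVING COUNT(id) >= 2  -> 1 group: age 20
-- 5. SELECT                 -> one row: age = 20, student_count = 2
-- 6. ORDER BY age           -> nothing left to reorder
```

Notice that Carol is removed by `WHERE` before grouping ever happens, while Bob's group is removed by `HAVING` after it.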
#### 7. Use Cases
- **GROUP BY**:
  - Summarizing data (e.g., average marks per city).
  - Creating reports with aggregated metrics (e.g., total students per age group).
  - Preparing data for further filtering with `HAVING`.
- **HAVING**:
  - Filtering groups based on aggregates (e.g., cities with high average marks).
  - Refining reports to show only relevant groups (e.g., groups with more than one student).
  - Combining with `GROUP BY` for advanced analysis.

#### 8. As of 2025
- Modern DBMS (e.g., SQL Server 2025, PostgreSQL 17) optimize `GROUP BY` with parallel processing for large datasets.
- `HAVING` benefits from improved query planners, allowing complex aggregate conditions with better performance.
- Some DBMS (e.g., PostgreSQL) support advanced grouping extensions like `GROUPING SETS` that work with `HAVING` for multi-level summaries.

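As a rough sketch of how `GROUPING SETS` and `HAVING` can combine, the query below counts students per city and per age in a single pass, then keeps only the groups with at least two students. It assumes the `students` table with the sample rows from this guide and a database that supports `GROUPING SETS` (PostgreSQL does; MySQL and SQLite do not).

```sql title="GROUPING SETS with HAVING (PostgreSQL-style sketch)"
SELECT city, age, COUNT(*) AS student_count
FROM students
GROUP BY GROUPING SETS ((city), (age))  -- one summary per city, one per age
HAVING COUNT(*) >= 2;                   -- applied to every group in both sets
```

On the sample data this should leave two rows: Mumbai with a count of 3 (where `age` is `NULL`) and age 20 with a count of 2 (where `city` is `NULL`). The order of the rows is not guaranteed without an `ORDER BY`.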
#### Comparison Table

Here’s a concise table summarizing the key differences between `GROUP BY` and `HAVING`:

| **Aspect** | **GROUP BY** | **HAVING** |
|---------------------------|------------------------------------------------------------------------------|---------------------------------------------------------------------------|
| **Purpose** | Groups rows with identical values in specified columns for summarization. | Filters groups based on conditions involving aggregate functions. |
| **Operates On** | Individual rows, organizing them into groups based on column values. | Grouped results after `GROUP BY`, using aggregate functions. |
| **Query Position** | After `FROM` and `WHERE`, before `HAVING`. | After `GROUP BY`, before `ORDER BY`. |
| **Conditions** | Defines groups (e.g., `GROUP BY city`); no direct conditions. | Uses aggregate conditions (e.g., `HAVING AVG(marks) > 80`). |
| **Relation to WHERE** | Works with `WHERE` to filter rows before grouping. | Acts like `WHERE` for groups, applied after grouping. |
| **Execution Order** | After `WHERE`, before `HAVING` in the query pipeline. | After `GROUP BY`, before `SELECT` in the query pipeline. |
| **Typical Use Cases** | Summarize data (e.g., average marks by city). | Filter groups (e.g., cities with average marks > 80). |
| **Dependencies** | Can be used without `HAVING`. | Usually paired with `GROUP BY`; on its own, it treats all rows as one group. |

### 📘 Examples to Illustrate Differences

Let’s use the `students` table to show how `GROUP BY` and `HAVING` work together and differ. Assume the table has the following data:

| id | name  | age | marks | city   |
|----|-------|-----|-------|--------|
| 1  | Alice | 20  | 85    | Mumbai |
| 2  | Bob   | 22  | 92    | Mumbai |
| 3  | Carol | 19  | 75    | Delhi  |
| 4  | Dave  | 20  | 88    | Mumbai |

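If you’d like to run the examples yourself, a setup script along these lines should recreate the sample data. The column types are assumptions (the guide only lists the column names), so adjust them to suit your database.

```sql title="Creating the Sample students Table (illustrative sketch)"
-- Column types are assumed; the guide only specifies the column names.
CREATE TABLE students (
    id    INT PRIMARY KEY,
    name  VARCHAR(50),
    age   INT,
    marks INT,
    city  VARCHAR(50)
);

-- The four sample rows from the table above.
INSERT INTO students (id, name, age, marks, city) VALUES
    (1, 'Alice', 20, 85, 'Mumbai'),
    (2, 'Bob',   22, 92, 'Mumbai'),
    (3, 'Carol', 19, 75, 'Delhi'),
    (4, 'Dave',  20, 88, 'Mumbai');
```

One caveat: because `marks` is an integer column here, some systems (SQL Server, for instance) return an integer from `AVG(marks)`, so the decimal places you see may differ from the outputs in this guide.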
**Examples**:
:::info
<Tabs>
<TabItem value="GROUP BY Alone" label="GROUP BY Alone">
```sql title="Using GROUP BY to Summarize Data"
SELECT city, AVG(marks) AS avg_marks
FROM students
GROUP BY city;
```
</TabItem>

<TabItem value="GROUP BY Output" label="Output">
| city   | avg_marks |
|--------|-----------|
| Mumbai | 88.33     |
| Delhi  | 75.0      |
</TabItem>

<TabItem value="GROUP BY with HAVING" label="GROUP BY with HAVING">
```sql title="Using GROUP BY and HAVING to Filter Groups"
SELECT city, AVG(marks) AS avg_marks
FROM students
GROUP BY city
HAVING AVG(marks) > 80;
```
</TabItem>

<TabItem value="HAVING Output" label="Output">
| city   | avg_marks |
|--------|-----------|
| Mumbai | 88.33     |
</TabItem>

<TabItem value="GROUP BY with WHERE and HAVING" label="WHERE and HAVING">
```sql title="Combining WHERE, GROUP BY, and HAVING"
SELECT city, COUNT(id) AS student_count
FROM students
WHERE age > 19
GROUP BY city
HAVING COUNT(id) >= 2;
```
</TabItem>

<TabItem value="WHERE and HAVING Output" label="Output">
| city   | student_count |
|--------|---------------|
| Mumbai | 3             |
</TabItem>
</Tabs>
:::

**Explanation of Examples**:
- **GROUP BY Alone**: Groups students by `city` and calculates the average marks for each city. All cities appear in the result.
- **GROUP BY with HAVING**: Adds a `HAVING` clause to filter groups, keeping only cities where the average marks exceed 80 (only Mumbai qualifies).
- **WHERE and HAVING**: Uses `WHERE` to filter individual rows (age > 19) before grouping, then `GROUP BY` to group by city, and `HAVING` to keep only groups with at least two students. Carol (age 19) is filtered out first, which leaves Mumbai with three students (Alice, Bob, and Dave) and no Delhi group at all.

### 📘 Key Rules and Best Practices

- **GROUP BY**:
  - List every non-aggregated column from the `SELECT` clause in the `GROUP BY` clause (e.g., `SELECT city, AVG(marks)` requires `GROUP BY city`).
  - Use with aggregate functions like `COUNT`, `SUM`, `AVG`, `MAX`, `MIN`.
  - Can group by multiple columns (e.g., `GROUP BY city, age`).
- **HAVING**:
  - Only use aggregate functions or columns listed in `GROUP BY` in the condition.
  - Place after `GROUP BY` in the query.
  - Use for group-level filtering, not row-level (use `WHERE` for that).
- **Combining Them**:
  - Use `WHERE` to filter rows before grouping, `GROUP BY` to create groups, and `HAVING` to filter those groups.
  - Example: Filter students by age (`WHERE`), group by city (`GROUP BY`), then keep groups with high averages (`HAVING`).

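Here is a small sketch that applies several of these rules at once: it groups by two columns, lists both of them in `SELECT`, filters rows with `WHERE`, and filters groups with `HAVING`. It assumes the `students` table and sample rows from this guide; the `marks >= 80` threshold and the `top_marks` alias are just illustrative.

```sql title="Grouping by Multiple Columns (illustrative sketch)"
SELECT city, age,                      -- both non-aggregated columns...
       COUNT(id)  AS student_count,
       MAX(marks) AS top_marks
FROM students
WHERE marks >= 80                      -- row-level filter, before grouping
GROUP BY city, age                     -- ...also appear in GROUP BY
HAVING COUNT(id) >= 2;                 -- group-level filter, after grouping
```

On the sample data, Carol is dropped by `WHERE`, the remaining rows form the groups (Mumbai, 20) and (Mumbai, 22), and only (Mumbai, 20) has two students, so the result is a single row with `student_count` 2 and `top_marks` 88.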
> **What NOT to Do**:
> - **GROUP BY**:
>   - Don’t include non-aggregated columns in `SELECT` without adding them to `GROUP BY`; it causes errors in most DBMS (e.g., PostgreSQL, or MySQL with `ONLY_FULL_GROUP_BY` enabled).
>   - Don’t use `GROUP BY` without an aggregate function unless you want unique combinations (rare).
>   - Don’t group by unnecessary columns; it increases query complexity and slows performance.
> - **HAVING**:
>   - Don’t use `HAVING` for row-level filtering; use `WHERE` instead to filter before grouping for better performance.
>   - Avoid column aliases in `HAVING` (e.g., `HAVING avg_marks > 80`). MySQL and SQLite accept them, but PostgreSQL and SQL Server do not, so use the aggregate function directly (e.g., `HAVING AVG(marks) > 80`).
>   - Don’t place `HAVING` before `GROUP BY`; it’s a syntax error.
> - **General**:
>   - Don’t skip testing with small datasets; `GROUP BY` and `HAVING` can produce unexpected results with complex queries.
>   - Don’t rely on `HAVING` without `GROUP BY` unless you really mean it; the whole table is then treated as a single group.

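For the alias tip in particular, here are two forms that should work across the major databases. This is a sketch assuming the `students` table from this guide; `city_summary` is a made-up name for the derived table.

```sql title="Filtering on an Aggregate Without Aliases in HAVING (illustrative sketch)"
-- Option 1: repeat the aggregate expression in HAVING.
SELECT city, AVG(marks) AS avg_marks
FROM students
GROUP BY city
HAVING AVG(marks) > 80;

-- Option 2: compute the aggregate in a subquery,
-- then filter on the alias with WHERE in the outer query.
SELECT city, avg_marks
FROM (
    SELECT city, AVG(marks) AS avg_marks
    FROM students
    GROUP BY city
) AS city_summary
WHERE avg_marks > 80;
```

Both queries keep only Mumbai on the sample data. Option 1 is shorter; Option 2 can be handy when the aggregate expression is long and you would rather not write it twice.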
### ✅ What You’ve Learned

You’re now a pro at understanding the differences between `GROUP BY` and `HAVING`! You’ve mastered:
- **GROUP BY**: Groups rows by columns for summarization, used with aggregates like `AVG` or `COUNT`.
- **HAVING**: Filters groups based on aggregate conditions, applied after `GROUP BY`.
- **Key Differences**: Purpose, what they operate on, query position, conditions, and more, as summarized in the comparison table.
- **Best Practices**: Use `WHERE` for row filtering, `GROUP BY` for grouping, and `HAVING` for group filtering in the correct order.

Practice these with the `students` table to create powerful summaries and reports. Follow the “What NOT to Do” tips to write efficient, error-free queries!

sidebars.ts

Lines changed: 1 addition & 0 deletions
@@ -123,6 +123,7 @@ const sidebars: SidebarsConfig = {
 "sql/SQL-basics/filtering-data",
 "sql/SQL-basics/ordering-data",
 "sql/SQL-basics/grouping-data",
+"sql/SQL-basics/sql-having-vs-group-by",
 "sql/SQL-basics/the-inequality-operator",
 "sql/SQL-basics/sql-datatypes",
 "sql/SQL-basics/primary-key-foreign-key",
