4146 add troubleshooting section #4181
---
sidebar_position: 1
slug: /tips-and-tricks/community-wisdom
sidebar_label: 'Community Wisdom'
doc_type: 'overview'
keywords: [
  'database tips',
  'community wisdom',
  'production troubleshooting',
  'performance optimization',
  'database debugging',
  'clickhouse guides',
  'real world examples',
  'database best practices',
  'meetup insights',
  'production lessons',
  'interactive tutorials',
  'database solutions'
]
title: 'ClickHouse Community Wisdom'
description: 'Learn from the ClickHouse community with real-world scenarios and lessons learned'
---

# ClickHouse Community Wisdom: Tips and Tricks from Meetups {#community-wisdom}

*These interactive guides collect lessons from hundreds of production deployments. Each runnable example uses real GitHub events data to illustrate ClickHouse patterns - practice these concepts to avoid common mistakes and accelerate your success.*

Combine this collected knowledge with our [Best Practices](/best-practices) guide for an optimal ClickHouse experience.

## Problem-Specific Quick Jumps {#problem-specific-quick-jumps}

| Issue | Document | Description |
|-------|----------|-------------|
| **Production Issue** | [Debugging Toolkit](./debugging-toolkit.md) | Copy-and-paste queries and production debugging guidance |
| **Slow Queries** | [Performance Optimization](./performance-optimization.md) | Diagnose and speed up slow queries |
| **Materialized Views** | [MV Double-Edged Sword](./materialized-views.md) | Avoid 10x storage amplification |
| **Too Many Parts** | [Too Many Parts](./too-many-parts.md) | Address the 'Too Many Parts' error and the slowdown it causes |
| **High Costs** | [Cost Optimization](./cost-optimization.md) | Reduce storage and compute costs |
| **Creative Use Cases** | [Success Stories](./creative-usecases.md) | Examples of ClickHouse in 'outside the box' use cases |

### Usage Instructions {#usage-instructions}

1. **Run the examples** - Many SQL blocks are executable
2. **Experiment freely** - Modify queries to test different patterns
3. **Adapt to your data** - Use the templates with your own table names
4. **Monitor regularly** - Implement the health-check queries as ongoing monitoring
5. **Learn progressively** - Start with the basics, then advance to optimization patterns

### Interactive Features {#interactive-features}

- **Real Data Examples**: Uses actual GitHub events from the ClickHouse playground
- **Production-Ready Templates**: Adapt the examples for your own systems
- **Progressive Difficulty**: From basic concepts to advanced optimization
- **Emergency Procedures**: Ready-to-use debugging and recovery queries

**Last Updated:** Based on community meetup insights through 2024-2025
**Contributing:** Found a mistake or have a new lesson? Community contributions welcome
---
sidebar_position: 1
slug: /community-wisdom/cost-optimization
sidebar_label: 'Cost Optimization'
doc_type: 'how-to-guide'
keywords: [
  'cost optimization',
  'storage costs',
  'partition management',
  'data retention',
  'storage analysis',
  'database optimization',
  'clickhouse cost reduction',
  'storage hot spots',
  'ttl performance',
  'disk usage',
  'compression strategies',
  'retention analysis'
]
title: 'Lessons - Cost Optimization'
description: 'Community-tested strategies for reducing ClickHouse costs, covering partition deletion, storage hot-spot analysis, and tiered data retention.'
---

# Cost Optimization: Battle-Tested Strategies {#cost-optimization}

*This guide is part of a collection of findings gained from community meetups. For more real-world solutions and insights you can [browse by specific problem](./community-wisdom.md).*
*Want to learn about creative use cases for ClickHouse? Check out the [Creative Use Cases](./creative-usecases.md) community insights guide.*

## The Partition Deletion vs TTL Discovery {#partition-vs-ttl}

**Hard-learned lesson from production:** TTL mutations are resource-intensive and slow everything down.

*"Don't try to mutate data if there isn't a world where you absolutely need to... when you mutate data ClickHouse creates a new version of the data and then it merges it with the existing data... it's resource intensive... significant performance impact"*

**Better strategy:** Delete entire partitions instead of relying on row-by-row TTL deletion.

```sql runnable editable
-- Challenge: Adjust the month thresholds (3 months, 1 month) based on your retention needs
-- Experiment: Try different partition patterns like weekly or daily instead of monthly
SELECT
    toYYYYMM(created_at) as year_month,
    count() as events,
    min(created_at) as oldest_event,
    max(created_at) as newest_event,
    formatReadableSize(count() * 200) as estimated_size,
    CASE
        WHEN toYYYYMM(created_at) < toYYYYMM(now() - INTERVAL 3 MONTH)
            THEN 'DELETE PARTITION - older than 3 months'
        WHEN toYYYYMM(created_at) < toYYYYMM(now() - INTERVAL 1 MONTH)
            THEN 'ARCHIVE CANDIDATE - 1-3 months old'
        ELSE 'KEEP - recent data'
    END as retention_strategy
FROM github.github_events
WHERE created_at >= '2023-01-01'
GROUP BY year_month
ORDER BY year_month DESC
LIMIT 12;
```
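The query above only flags partitions past retention; the drop itself is then a cheap metadata operation rather than a mutation. A minimal sketch, assuming a table partitioned by `toYYYYMM(created_at)` (the table name and partition value are illustrative, not from the playground dataset):

```sql
-- Assumes the table was created with PARTITION BY toYYYYMM(created_at).
-- Dropping a partition detaches whole parts instantly - no row-by-row rewrite.
ALTER TABLE my_events DROP PARTITION '202401';

-- Contrast with the mutation-based approach this guide recommends avoiding,
-- which rewrites every part that contains matching rows:
-- ALTER TABLE my_events DELETE WHERE created_at < '2024-02-01';
```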

## Storage Hot Spots Analysis {#storage-hot-spots}

**Find your biggest storage consumers:** Identify which columns and patterns drive your storage costs.

```sql runnable editable
-- Challenge: Replace column names with your own table's columns to find storage hot spots
-- Experiment: Try different size thresholds (50MB) and repetition factors (10, 3, 5)
SELECT
    column_name,
    total_size_mb,
    unique_values,
    repetition_factor,
    storage_efficiency,
    optimization_priority
FROM (
    SELECT
        'repo_name' as column_name,
        round(sum(length(repo_name)) / 1024 / 1024, 2) as total_size_mb,
        count(DISTINCT repo_name) as unique_values,
        round(count() / count(DISTINCT repo_name), 1) as repetition_factor,
        CASE
            WHEN count() / count(DISTINCT repo_name) > 10 THEN 'HIGH compression potential'
            WHEN count() / count(DISTINCT repo_name) > 3 THEN 'MEDIUM compression potential'
            ELSE 'LOW compression potential'
        END as storage_efficiency,
        CASE
            WHEN round(sum(length(repo_name)) / 1024 / 1024, 2) > 50 AND count() / count(DISTINCT repo_name) > 5
                THEN 'OPTIMIZE FIRST - large + repetitive'
            WHEN round(sum(length(repo_name)) / 1024 / 1024, 2) > 50
                THEN 'SIZE CONCERN - consider retention'
            ELSE 'LOW PRIORITY'
        END as optimization_priority
    FROM github.github_events
    WHERE created_at >= '2024-01-01' AND created_at < '2024-01-08'

    UNION ALL

    SELECT
        'actor_login',
        round(sum(length(actor_login)) / 1024 / 1024, 2),
        count(DISTINCT actor_login),
        round(count() / count(DISTINCT actor_login), 1),
        CASE
            WHEN count() / count(DISTINCT actor_login) > 10 THEN 'HIGH compression potential'
            WHEN count() / count(DISTINCT actor_login) > 3 THEN 'MEDIUM compression potential'
            ELSE 'LOW compression potential'
        END,
        CASE
            WHEN round(sum(length(actor_login)) / 1024 / 1024, 2) > 50 AND count() / count(DISTINCT actor_login) > 5
                THEN 'OPTIMIZE FIRST - large + repetitive'
            WHEN round(sum(length(actor_login)) / 1024 / 1024, 2) > 50
                THEN 'SIZE CONCERN - consider retention'
            ELSE 'LOW PRIORITY'
        END
    FROM github.github_events
    WHERE created_at >= '2024-01-01' AND created_at < '2024-01-08'
)
ORDER BY total_size_mb DESC;
```
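One common response to a 'HIGH compression potential' finding is dictionary encoding with `LowCardinality`. A hedged sketch with an illustrative table name - measure compressed sizes before and after on your own data rather than assuming a win:

```sql
-- Dictionary-encode a repetitive string column; this often shrinks storage
-- substantially when the repetition factor is high.
ALTER TABLE my_events MODIFY COLUMN repo_name LowCardinality(String);

-- Verify the effect per column from system tables:
SELECT
    name,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 2) AS compression_ratio
FROM system.columns
WHERE table = 'my_events'
ORDER BY data_compressed_bytes DESC;
```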
## Cost-Driven Retention Analysis {#cost-driven-retention}

**Real production strategy:** *"Once we get this kind of deletion signal... we do the row based deletion... we know what needs to be deleted and keep on tracking"*

```sql runnable editable
-- Challenge: Modify the age thresholds (7, 30, 90 days) to match your business needs
-- Experiment: Try different retention strategies for each temperature tier
SELECT
    data_temperature,
    count() as event_count,
    round(count() * 100.0 / sum(count()) OVER(), 2) as percentage_of_total,
    formatReadableSize(count() * 200) as estimated_storage_size,
    retention_strategy
FROM (
    SELECT
        CASE
            WHEN dateDiff('day', created_at, now()) <= 7 THEN 'Hot Data (0-7 days)'
            WHEN dateDiff('day', created_at, now()) <= 30 THEN 'Warm Data (8-30 days)'
            WHEN dateDiff('day', created_at, now()) <= 90 THEN 'Cool Data (31-90 days)'
            ELSE 'Cold Data (90+ days)'
        END as data_temperature,
        CASE
            WHEN dateDiff('day', created_at, now()) <= 7 THEN 'Keep all columns - high query value'
            WHEN dateDiff('day', created_at, now()) <= 30 THEN 'Consider column-based TTL for large fields'
            WHEN dateDiff('day', created_at, now()) <= 90 THEN 'Drop expensive columns, keep core data'
            ELSE 'DELETE PARTITION - storage cost > query value'
        END as retention_strategy,
        CASE
            WHEN dateDiff('day', created_at, now()) <= 7 THEN 1
            WHEN dateDiff('day', created_at, now()) <= 30 THEN 2
            WHEN dateDiff('day', created_at, now()) <= 90 THEN 3
            ELSE 4
        END as sort_order
    FROM github.github_events
    WHERE created_at >= '2023-01-01'
)
GROUP BY data_temperature, retention_strategy, sort_order
ORDER BY sort_order;
```

**The key insight:** Instead of deleting entire rows, strategically drop the expensive columns first while preserving the essential data structure for longer periods. This can save "several terabytes" as Displayce discovered.
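This column-first retention maps onto ClickHouse's column-level TTL, declared per column at table creation. A minimal sketch with illustrative table and column names - the 30-day window is an example threshold, not a recommendation:

```sql
-- Column-level TTL: the expensive payload column is cleared after 30 days,
-- while the core event row survives for long-range analytics.
CREATE TABLE events_with_column_ttl
(
    created_at DateTime,
    event_type String,
    payload String TTL created_at + INTERVAL 30 DAY
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(created_at)
ORDER BY created_at;
```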
---
sidebar_position: 1
slug: /community-wisdom/creative-use-cases
sidebar_label: 'Creative Use Cases'
doc_type: 'how-to-guide'
keywords: [
  'clickhouse creative use cases',
  'clickhouse success stories',
  'unconventional database uses',
  'clickhouse rate limiting',
  'analytics database applications',
  'clickhouse mobile analytics',
  'customer-facing analytics',
  'database innovation',
  'clickhouse real-time applications',
  'alternative database solutions',
  'breaking database conventions',
  'production success stories'
]
title: 'Lessons - Creative Use Cases'
description: 'Community stories of ClickHouse in unconventional roles, from rate limiting at Craigslist to mobile analytics and customer-facing real-time applications.'
---

# Breaking the Rules: Success Stories {#breaking-the-rules}

*This guide is part of a collection of findings gained from community meetups. For more real-world solutions and insights you can [browse by specific problem](./community-wisdom.md).*
*Need tips on debugging an issue in prod? Check out the [Debugging Toolkit](./debugging-toolkit.md) community insights guide.*

## ClickHouse as Rate Limiter (Craigslist Story) {#clickhouse-rate-limiter}

**Conventional wisdom:** Use Redis for rate limiting.

**Craigslist's breakthrough:** *"Everyone uses Redis for rate limiter implementations... Why not just do it in Redis?"*

**The problem with Redis:** *"Our experience with Redis is not like what you've seen in the movies... weird maintenance issues... we will reboot a node in a Redis cluster and some weird latency spike hits the front end"*

**Test the rate limiting logic using the ClickHouse approach:**

> **Review comment:** Also feel that if we're referencing specific companies and quoting from the videos we should probably link to the meetup videos.

```sql runnable editable
-- Challenge: Try different rate limit thresholds (100, 50) or time windows (hour vs minute)
-- Experiment: Test with different user patterns by changing the HAVING clause
SELECT
    actor_login as user_id,
    toStartOfHour(created_at) as hour,
    count() as requests_per_hour,
    CASE
        WHEN count() > 100 THEN 'RATE_LIMITED'
        WHEN count() > 50 THEN 'WARNING'
        ELSE 'ALLOWED'
    END as rate_limit_status
FROM github.github_events
WHERE created_at >= '2024-01-15'
  AND created_at < '2024-01-16'
GROUP BY actor_login, hour
HAVING count() > 10
ORDER BY requests_per_hour DESC
LIMIT 20;
```

**Results:** *"Running untouched for nearly a year without any alert"* - a dramatic improvement over the Redis infrastructure.

**Why it works:**
- Incredible write performance for access log data
- Built-in TTL for automatic cleanup
- SQL flexibility for complex rate limiting rules
- No Redis cluster maintenance headaches

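The built-in TTL is what removes the cleanup burden entirely. A minimal sketch of what such a rate-limit log could look like (schema, names, and thresholds are illustrative, not Craigslist's actual design):

```sql
-- Append-only access log; TTL expires old rows automatically, so the
-- periodic "Redis cleanup job" simply does not exist.
CREATE TABLE rate_limit_events
(
    user_id String,
    event_time DateTime DEFAULT now()
)
ENGINE = MergeTree
ORDER BY (user_id, event_time)
TTL event_time + INTERVAL 1 DAY;

-- Check a user's budget for the current hour before serving a request.
SELECT count() < 100 AS allowed
FROM rate_limit_events
WHERE user_id = 'some-user'
  AND event_time >= toStartOfHour(now());
```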
## Mobile Analytics: The 7-Eleven Success Story {#mobile-analytics}

**Conventional wisdom:** Analytics databases aren't for mobile applications.

**The reality:** *"People out in the factory floors... people out in health care facilities construction sites... they like to be able to look at reports... to sit at a computer at a desktop... is just not optimal"*

**7-Eleven's breakthrough:** Store managers using ClickHouse-powered analytics on mobile devices.

```sql runnable editable
-- Challenge: Modify this to show weekly or monthly patterns instead of daily
-- Experiment: Add different metrics like peak activity hours or user retention patterns
SELECT
    'Daily Sales Summary' as report_type,
    toDate(created_at) as date,
    count() as total_transactions,
    uniq(actor_login) as unique_customers,
    round(count() / uniq(actor_login), 1) as avg_transactions_per_customer,
    'Perfect for mobile dashboard' as mobile_optimized
FROM github.github_events
WHERE created_at >= '2024-01-01' AND created_at < '2024-01-08'
GROUP BY date
ORDER BY date DESC;
```

**The use case:** *"The person who runs a store they're going back and forth between the stock room out to the front into the register and then going between stores"*

**Success metrics:**
- Daily sales by store (corporate + franchise)
- Out-of-stock alerts in real-time
- *"Full feature capability between your phone and your desktop"*

## Customer-Facing Real-Time Applications {#customer-facing-applications}

**Conventional wisdom:** ClickHouse is for internal analytics, not customer-facing apps.

**ServiceNow's reality:** *"We offer an analytic solution both for internal needs and for customers across web mobile and chatbots"*

**The breakthrough insight:** *"It enables you to build applications that are highly responsive... customer facing applications... whether they're web apps or mobile apps"*

```sql runnable editable
-- Challenge: Try different segmentation approaches like geographic or time-based grouping
-- Experiment: Add percentage calculations or ranking functions for customer insights
SELECT
    'Customer Segmentation' as feature,
    event_type as segment,
    count() as segment_size,
    round(count() * 100.0 / sum(count()) OVER(), 1) as percentage,
    'Real-time customer insights' as value_proposition
FROM github.github_events
WHERE created_at >= '2024-01-01'
  AND created_at < '2024-01-02'
GROUP BY event_type
ORDER BY segment_size DESC;
```

**Why this breaks conventional rules:**
- **Real-time customer segmentation:** *"Give customers the ability to real-time segments the data and dynamically slicing"*
- **User expectations:** *"In 2024 we have been very much trained to expect a certain degree of responsiveness"*
- **Retention impact:** *"If that repeats often enough you're either not going to come back"*

**Success pattern:** ClickHouse's speed enables customer-facing applications with sub-second response times, challenging the notion that analytical databases are only for internal use.

### The Rule-Breaking Philosophy {#rule-breaking-philosophy}

> **Review comment:** Should this be H3 as a sub point of "Customer-Facing Real-Time Applications"?

**Common thread:** These successes came from questioning assumptions:
- *"I asked my boss like what do you think of this idea maybe I can try this with ClickHouse"* - Craigslist
- *"Mobile first actually became a big part of how we thought about this"* - Mobile analytics pioneers
- *"We wanted to give customers the ability to... slice and dice everything as much as they wanted"* - ServiceNow

**The lesson:** Sometimes the "wrong" tool for the job becomes the right tool when you understand its strengths and design around them.
> **Review comment:** We had taken to sentence casing, which is how Google does it (https://developers.google.com/style/capitalization). I'm not strongly opinionated on which style we have but it will be nice to keep it consistent.