DataTalksClub
diff --git a/.cursor/commands/seo-description.md b/.cursor/commands/seo-description.md
Lines changed: 12 additions & 0 deletions
diff --git a/.cursor/commands/seo-images.md b/.cursor/commands/seo-images.md
Lines changed: 10 additions & 0 deletions
diff --git a/.cursor/commands/seo-title.md b/.cursor/commands/seo-title.md
Lines changed: 6 additions & 0 deletions
diff --git a/_posts/2020-11-29-segmentation.md b/_posts/2020-11-29-segmentation.md
Lines changed: 43 additions & 13 deletions
diff --git a/_posts/2020-12-07-practical-guide-better-code.md b/_posts/2020-12-07-practical-guide-better-code.md
Lines changed: 47 additions & 13 deletions
@@ -0,0 +1,12 @@
+# Update Description
+
+Based on the page content, generate a SEO-optimized meta description:
+- Length: 140-155 characters
+- Formula Structure: Problem/Benefit + What's Inside + Soft CTA
+- No formatting/markup - plain text only
+- Primary keyword early (first 60 characters for search bolding)
+- Use action verbs: Learn, Discover, Master, Build, etc.
+- Benefit-focused with specific details
+- Match content accurately
+- Emotional triggers: essential, proven, expert, comprehensive, etc.
+- Compelling: Appeals to the target audience
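The mechanical items in this checklist (length, keyword position, plain text, action verb) can be checked automatically. The sketch below is one possible way to do that and is not part of the pull request: `check_description` and its constants are hypothetical names, and treating "keyword early" as "the whole keyword fits in the first 60 characters" is an assumption. Tone-related items such as emotional triggers are left to a human reviewer.

```python
import re
from typing import List

MIN_LEN, MAX_LEN = 140, 155  # length window from the checklist
KEYWORD_WINDOW = 60          # assumption: the whole keyword must fit in the first 60 characters
ACTION_VERBS = ("learn", "discover", "master", "build")  # verbs named in the checklist


def check_description(text: str, keyword: str) -> List[str]:
    """Return a list of problems; an empty list means the mechanical checks pass."""
    problems = []
    if not MIN_LEN <= len(text) <= MAX_LEN:
        problems.append(f"length {len(text)} is outside {MIN_LEN}-{MAX_LEN}")
    position = text.lower().find(keyword.lower())
    if position == -1 or position + len(keyword) > KEYWORD_WINDOW:
        problems.append(f"keyword {keyword!r} is not within the first {KEYWORD_WINDOW} characters")
    # Rough plain-text check: flag characters typical of HTML or Markdown markup.
    if re.search(r"[<>*_`#\[\]]", text):
        problems.append("contains markup characters")
    words = re.findall(r"[a-z]+", text.lower())
    if not any(verb in words for verb in ACTION_VERBS):
        problems.append("no action verb from the checklist")
    return problems
```

Calling `check_description(description, "customer segmentation")` on a front-matter description returns the list of failed checks, so it could run as a pre-commit hook or a CI step.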
@@ -0,0 +1,10 @@
+# Update Image Formatting
+
+For each image in the page, update the image formatting to be SEO-optimized:
+- Add the image attributes needed for SEO and optimize them (src, alt, title; width and height set to auto)
+- Optimize the image descriptions for SEO (alt, title, width, height)
+- Use formatting like here:
+```html
+loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;"
+```
+- If the name of the image is not descriptive, update the name of the image to be descriptive
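The posts changed in this pull request wrap each image in a `<figure>` with a `<figcaption>`. As a rough illustration of the target markup, the sketch below builds such a block from its parts. `figure_html` is a hypothetical helper, not something the command file or the site provides; the style string is copied from the command above.

```python
from html import escape

# Inline style copied from the command file above.
IMG_STYLE = "max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;"


def figure_html(src: str, alt: str, title: str, caption: str) -> str:
    """Build a <figure> block with a lazy-loaded image and a caption."""
    # escape() also escapes double quotes, so attribute values stay well-formed.
    img = (
        f'<img src="{escape(src)}" alt="{escape(alt)}" title="{escape(title)}" '
        f'loading="lazy" style="{IMG_STYLE}" />'
    )
    return f"<figure>\n{img}\n<figcaption><p>{escape(caption)}</p></figcaption>\n</figure>"
```

Escaping the text keeps quotes or angle brackets in an alt text from breaking the tag.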
@@ -0,0 +1,6 @@
+# Update Title
+
+Based on the page content, generate a SEO-optimized title:
+- Should be under 100 characters
+- Focus primarily on the main themes and topics in the page content
+- Make it clear, keyword-rich, and engaging.
@@ -1,8 +1,8 @@
 ---
 layout: post
-title: "Not A Regular RFM Analysis"
-subtitle: "Why limit to Recency, Frequency and Monetary measures during Customer Segmentation?"
-description: "Customer segmentation with limited data: learn a proven 5D RFM+ approach using k-means to segment responders/non-responders and drive targeted in-game marketing."
+title: "Customer Segmentation with RFM+ and K-Means: 7 Segments from Gaming Data"
+subtitle: "Build a 5D RFM+ framework, engineer metrics, and segment responders/non-responders with k-means to power targeted in‑game marketing"
+description: "Customer segmentation with limited data. Learn a 5D RFM+ framework, engineer metrics, and use k-means to create 7 segments—apply insights now."
 image: "images/posts/2020-11-29-segmentation/cover.jpg"
 authors: [nishantmohan]
 tags: [analytics, clustering]
@@ -35,7 +35,10 @@ So let's start, shall we!?
 
 Let's take a quick look at the available features.
 
-<img src="/images/posts/2020-11-29-segmentation/data.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/data.jpg" alt="Sample of gaming user-level dataset with purchase dates for base game, expansion packs, and downloadable content" title="User-Level Gaming Dataset Features" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Snapshot of available features used for segmentation: base game, expansions, and DLC install dates</p></figcaption>
+</figure>
 
 So the last 8 features are the names of either an expansion pack of the game or downloadable content. The dataset has 500k rows. That's good because it means we can make more segments, right!?
 
@@ -53,7 +56,10 @@ I tag the users as responders or non-responders based on whether they buy any ad
 
 Now I can begin defining my key metrics for segmenting the responders:
 
-<img src="/images/posts/2020-11-29-segmentation/recency.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/recency.jpg" alt="Recency distribution showing user activity recency across years with higher activity in 2019" title="Recency Distribution of Player Activity" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Recency metric: days since last activity, highlighting more recent engagement in 2019</p></figcaption>
+</figure>
 
 
 ### Recency
@@ -62,7 +68,10 @@ This is the number of days passed since the user was seen active on the gaming p
 
 The chart shows that more users have been active in 2019, as compared to the users in 2017.
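A minimal sketch of the recency computation, assuming each user has a last-active date and that recency is measured against a fixed reference date. The function name and the reference date are illustrative; the post does not show its own code.

```python
from datetime import date


def recency_days(last_active: date, as_of: date) -> int:
    """Days elapsed between the user's last activity and the reference date."""
    if last_active > as_of:
        raise ValueError("last_active is after the reference date")
    return (as_of - last_active).days
```

A smaller value means the user was seen more recently, so a user active on the reference date itself has a recency of 0.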
 
-<img src="/images/posts/2020-11-29-segmentation/frequency.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/frequency.jpg" alt="Frequency distribution of days played since installation, skewed toward fewer active days" title="Frequency of Gameplay Days" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Frequency metric: number of active days since install, skewed toward fewer days for most players</p></figcaption>
+</figure>
 
 ### Frequency
 
@@ -71,7 +80,10 @@ Since the day a player installed the game, how many days did he play the game?
 The chart is concentrated towards the left, meaning that most players are active for fewer days. However, it should be noted that new players have had fewer days on which they could be active, as compared to older players.
 
 
-<img src="/images/posts/2020-11-29-segmentation/monetary-value.png" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/monetary-value.png" alt="Monetary value distribution of player spending based on mapped add-on prices" title="Monetary Value of Player Spend" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Monetary value metric: spend estimated by mapping store prices to user add-on purchases</p></figcaption>
+</figure>
 
 
 ### Monetary Value
@@ -80,7 +92,10 @@ Since this information is not available in the data, I went to the game store we
 
 Most players spend less than a hundred bucks. This is expected because the base game costs 55 bucks. And the downloadable content is generally cheap!
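The price mapping can be sketched as a lookup table summed per player. Only the 55 base-game price comes from the post; the add-on names and prices below are made-up placeholders, and counting the base game in every player's total is an assumption.

```python
from typing import Dict, Iterable

BASE_GAME_PRICE = 55.0  # base-game price quoted in the post

# Hypothetical add-on names and prices, for illustration only.
ADDON_PRICES: Dict[str, float] = {"expansion_a": 30.0, "dlc_b": 5.0}


def monetary_value(purchased_addons: Iterable[str],
                   addon_prices: Dict[str, float] = ADDON_PRICES,
                   base_price: float = BASE_GAME_PRICE) -> float:
    """Estimated spend: base game plus the store price of every add-on bought."""
    return base_price + sum(addon_prices[name] for name in purchased_addons)
```

A player with no add-ons gets the base price alone, which would explain why the distribution starts around 55.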
 
-<img src="/images/posts/2020-11-29-segmentation/responses.png" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/responses.png" alt="Distribution of number of add-ons purchased per player showing most buyers purchase one" title="Responses: Add-ons Purchased per Player" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Responses metric: count of prior add-on purchases per player; most buyers purchase only one</p></figcaption>
+</figure>
 
 
 ### Responses
@@ -89,7 +104,10 @@ How many add-ons did the player buy previously? This will not be correlated with
 
 It can be seen that most people who bought any add-on bought only one.
 
-<img src="/images/posts/2020-11-29-segmentation/purchase-frequency.png" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/purchase-frequency.png" alt="Histogram of purchase intervals showing peaks near expansion launch windows" title="Purchase Frequency Over Time" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Purchase frequency metric: intervals between purchases with peaks around expansion release periods</p></figcaption>
+</figure>
 
 
 ### Purchase Frequency
@@ -104,7 +122,10 @@ While most players buy everything soon after they buy the game, we see other hig
 
 Using the 5 key metrics, I apply k-means clustering to segment the users.
 
-<img src="/images/posts/2020-11-29-segmentation/elbow.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/elbow.jpg" alt="Elbow method chart indicating optimal k around five clusters for k-means" title="Elbow Method for Optimal k" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Elbow plot suggests k=5 as a balanced choice for k-means clustering complexity and cohesion</p></figcaption>
+</figure>
 
 Looking at the chart, I select 5 as the optimum number of clusters/segments. This gives me a balance between homogeneity within clusters and complexity of the analysis.
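An elbow chart typically plots the within-cluster sum of squares (inertia) against the number of clusters. The pure-Python sketch below shows how such a curve could be computed. It is not the post's code, the function names are hypothetical, and on 500k rows a library implementation such as scikit-learn's `KMeans` would be the practical choice.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points: Sequence[Point], k: int,
                   iterations: int = 50, seed: int = 0) -> float:
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct data points.
    centroids = [list(p) for p in rng.sample(list(points), k)]
    for _ in range(iterations):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


def elbow_curve(points: Sequence[Point], max_k: int, seed: int = 0) -> List[float]:
    """Inertia for k = 1..max_k; the 'elbow' is where the drop flattens out."""
    return [kmeans_inertia(points, k, seed=seed) for k in range(1, max_k + 1)]
```

Plotting the returned list against k gives the elbow chart. Because the initialisation is random, the curve can vary slightly between seeds.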
 
@@ -113,23 +134,32 @@ Looking at the chart, I select 5 as the optimum number of clusters/segments. Thi
 
 Since these are the users who have not interacted much, we only have two measures to judge them: Recency and Frequency.
 
-<img src="/images/posts/2020-11-29-segmentation/recency-vs-frequency.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/recency-vs-frequency.jpg" alt="Scatter plot of recency versus frequency used to segment non-responders by activity threshold" title="Recency vs Frequency for Non-Responders" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Non-responder segmentation using a recency threshold to separate recently active from lapsed users</p></figcaption>
+</figure>
 
 As can be seen in the above chart, I segment such users by a threshold of 1000 days. That is, those who have been active in the last 200 days are in Cluster 6, others are in Cluster 5 (Clusters 0–4 being the responders).
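The non-responder rule amounts to a single cutoff on recency. In the sketch below the cutoff is a required parameter rather than a hard-coded value, and treating the boundary as inclusive is an assumption. The function and constant names are illustrative.

```python
LAPSED_CLUSTER = 5  # non-responders not seen within the threshold
RECENT_CLUSTER = 6  # non-responders active within the threshold


def non_responder_cluster(recency_days: int, threshold_days: int) -> int:
    """Assign a non-responder to cluster 6 (recently active) or cluster 5 (lapsed)."""
    return RECENT_CLUSTER if recency_days <= threshold_days else LAPSED_CLUSTER
```

Responders keep the labels 0 to 4 from k-means, so the two non-responder labels do not collide with them.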
 
 ## Analysis and Strategy
 
 The following table gives the means of all the features across the user segments.
 
-<img src="/images/posts/2020-11-29-segmentation/segments.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/segments.jpg" alt="Table of means for key metrics across identified customer segments" title="Segment Means Across Metrics" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Summary statistics by segment for recency, frequency, responses, monetary value, and purchase cadence</p></figcaption>
+</figure>
 
 Look at the first row. On average, players in Cluster 0 were active for nearly 15 days, bought 1.5 add-ons, were last active 477 days ago (long back), spent 65 bucks, and purchased an add-on every 33 days. Since these were active long back, they have probably forgotten about the game. So, in-game marketing may not work on them! On the other hand, email marketing might!
 
 Now look at the second row. On average, players in Cluster 1 were active for a whopping 92 days, bought nearly 3 add-ons, were active fairly recently, have spent much more than others have, but purchase relatively rarely. These could be the players who have recently bought an add-on. These are the customers who seem to be loyal. We could target them with more exciting features!
 
 The following figure gives a similar summary of each cluster/segment.
 
-<img src="/images/posts/2020-11-29-segmentation/strategy.jpg" />
+<figure>
+<img src="/images/posts/2020-11-29-segmentation/strategy.jpg" alt="Per-segment strategy summary visualization guiding targeted marketing actions" title="Per-Segment Strategy Overview" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Actionable strategy guidance for each segment to tailor in-game and email marketing</p></figcaption>
+</figure>
 
 ## Conclusion
 
 
@@ -1,7 +1,8 @@
 ---
 layout: post
-title: "A practical guide for better-looking python code"
-description: "Setting up a CI/CD pipeline using GitHub"
+title: "Python CI/CD with GitHub Actions: Pre-commit, Linters, and Pytest Guide"
+subtitle: "Step-by-step workflow to secure branches, automate linting, and run tests using GitHub Actions, pre-commit, black/isort/flake8/mypy, and pytest."
+description: "Python CI/CD with GitHub Actions: Discover branch protection, pre-commit, black, isort, flake8, mypy, and pytest to enforce essential, tested code—start now."
 image: "images/posts/2020-12-07-practical-guide-better-code/cover.jpg"
 authors: [olegpolivin]
 tags: [github, python, cicd]
@@ -34,7 +35,10 @@ I create an empty repository to illustrate how one sets up a CI/CD pipeline step
 git clone https://github.com/olegpolivin/Fizz-Buzz-CI-CD.git
 ```
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/empty-repo.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/empty-repo.png" alt="New GitHub repository with only README on main branch" title="Empty Repository on GitHub" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Starting point: an empty repo with a single README on main</p></figcaption>
+</figure>
 
 
 ### Rules for branches
@@ -43,7 +47,10 @@ git clone https://github.com/olegpolivin/Fizz-Buzz-CI-CD.git
 As usual I can work on the code, and then push to the `main` branch. That’s what I want to prohibit.
 Go to the `Settings` menu for a given repo and choose `Branches`.
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/branches.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/branches.png" alt="GitHub settings page showing Branches section for adding protection rules" title="GitHub Branches Settings" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Navigate to Settings → Branches to configure protection rules</p></figcaption>
+</figure>
 
 
 There are two ways to prevent pushing to the main branch, and you can choose between them in the `Add rule` section. They are:
@@ -58,7 +65,10 @@ However, indeed, this will prevent you from pushing to `main` branch, but you ca
 
 Click on `Add rule`, and here is the rule that I’ve added:
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/branch-protection.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/branch-protection.png" alt="Add branch protection rule modal with required status checks and include administrators" title="Add Branch Protection Rule" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Add a protection rule for main with required status checks and admin inclusion</p></figcaption>
+</figure>
 
 
 In particular, I have added:
@@ -125,7 +135,10 @@ Creating a pull request will run the script above. Pull request will always pass
 
 It is only necessary to adjust the rule in the `Settings -> Branches -> Rules` part. See what’s new:
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/branch-protection-rule.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/branch-protection-rule.png" alt="List of branch protection rules showing required check build (3.7)" title="Branch Protection Rule with Required Check" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Required check “build (3.7)” appears after configuring the workflow</p></figcaption>
+</figure>
 
 Notice that `build (3.7)` has appeared among status checks. This corresponds to the name of the job (`build`) and python version `3.7`. I made a small modification to the `README.md` file, and let’s see if I can push it now to the main branch. Here is the error I get:
 
@@ -151,13 +164,19 @@ git push origin dev
 A new branch called `dev` is created on the remote repository. What’s left is to create a pull request, and merge it to the `main` branch.
 
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/pull-request.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/pull-request.png" alt="GitHub pull request UI ready to merge after checks" title="Pull Request Flow on GitHub" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Create a PR from your feature branch to main to trigger checks</p></figcaption>
+</figure>
 
 
 It becomes possible to merge after all checks are run:
 
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/status-check-passed.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/status-check-passed.png" alt="GitHub PR showing all status checks have passed" title="Status Checks Passed" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>All required checks pass—your PR is ready to merge</p></figcaption>
+</figure>
 
 
 We would like to introduce actions or tests to be performed before the pull request is ready to be approved, so let’s provide code that will actually be checked. We will consider solving the `FizzBuzz` problem in the next section.
@@ -264,7 +283,10 @@ jobs:
 
 Let’s now try to push the solution above to the repository.
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/fail.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/fail.png" alt="GitHub Actions CI job failing due to linter or formatter issues" title="CI Job Failing Example" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Example of a failing CI run—fix issues locally and push updates</p></figcaption>
+</figure>
 
 
 And we see that it fails on the first check. When it fails it does not proceed to the next steps, but it turns out that the code above for solving the `FizzBuzz` problem will fail on every check.
@@ -360,12 +382,18 @@ After the file is created in the repository, run `pre-commit install` to install
 Here is a small test: let’s change the neat `fizzbuzz.py` code to get back to the one that does not pass the checks and see what happens. Here is a part of the result: we see where it fails. Note that the pre-commit hook modifies files for some commands (like black or isort).
 
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/pre-commit.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/pre-commit.png" alt="Terminal output showing pre-commit hooks failing on formatting and linting" title="Pre-commit Hooks Catch Issues" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Pre-commit prevents bad commits by running formatters and linters before commit</p></figcaption>
+</figure>
 
 Coming back to the neat version of `fizzbuzz.py`, the pre-commit hook test passes. That’s what it looks like in my case:
 
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/pre-commit-pass.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/pre-commit-pass.png" alt="Terminal output showing all pre-commit hooks passing" title="Pre-commit Hooks Passing" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>All hooks pass—your changes are clean and consistent</p></figcaption>
+</figure>
 
 Nice!
 
@@ -425,7 +453,10 @@ Append the code below to the `ci.yml` file:
 
 And here is the result:
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/test-pass.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/test-pass.png" alt="GitHub Actions run with pytest tests passing" title="Pytest Suite Passing in CI" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Unit tests executed by pytest pass successfully in CI</p></figcaption>
+</figure>
 
 
 But that was the case when everything is ok. We are happy.
@@ -452,7 +483,10 @@ def fizz_buzz(num: int) -> str:
 
 Great, let’s push and see that one test has failed:
 
-<img src="/images/posts/2020-12-07-practical-guide-better-code/test-fail.png" />
+<figure>
+<img src="/images/posts/2020-12-07-practical-guide-better-code/test-fail.png" alt="GitHub Actions run showing a failing pytest test case" title="Pytest Failure Caught in CI" loading="lazy" style="max-width: 100%; height: auto; border: 1px solid #ddd; border-radius: 4px;" />
+<figcaption><p>Failing test demonstrates how CI guards against regressions before merge</p></figcaption>
+</figure>
 
 
 That is, by introducing unit tests into the CI/CD pipeline we were able to catch the problem before merging the pull request into the `main` branch.
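To make the idea concrete, here is a minimal sketch of a `fizz_buzz` function with pytest-style tests. Only the signature `fizz_buzz(num: int) -> str` appears in this diff; the body, the exact return strings, and the test names are assumptions for illustration.

```python
def fizz_buzz(num: int) -> str:
    """Classic FizzBuzz: multiples of 15, 3, and 5 map to words, anything else to the number."""
    if num % 15 == 0:
        return "FizzBuzz"
    if num % 3 == 0:
        return "Fizz"
    if num % 5 == 0:
        return "Buzz"
    return str(num)


# pytest collects functions whose names start with "test_".
def test_multiples_map_to_words():
    assert fizz_buzz(3) == "Fizz"
    assert fizz_buzz(5) == "Buzz"
    assert fizz_buzz(15) == "FizzBuzz"


def test_other_numbers_are_echoed():
    assert fizz_buzz(7) == "7"
```

If the `num % 15` branch were removed, `fizz_buzz(15)` would return `"Fizz"` and the first test would fail in CI, which is the kind of regression the pipeline is meant to stop before merge.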