name: Good First Issue
about: A beginner-friendly task perfect for first-time contributors
title: '[GOOD FIRST ISSUE] Refactor DataSet class: Move large methods into smaller helper functions'
labels: 'good first issue'
assignees: ''
Welcome!
This is a beginner-friendly issue perfect for first-time contributors to the Intugle project. We've designed this task to help you get familiar with our codebase while making a meaningful contribution.
Task Description
The DataSet class in src/intugle/analysis/models.py contains several large methods (e.g., profile_columns, identify_keys, generate_glossary) that perform multiple steps in a single function. For better readability, maintainability, and testability, we want to refactor these methods by breaking them down into smaller, well-named helper functions.
Your task:
- Identify at least one large method in the DataSet class (e.g., profile_columns, identify_keys, or generate_glossary).
- Refactor it by extracting logical blocks into private helper methods (e.g., _collect_column_profiles, _build_column_profiles_df, etc.).
- Replace the in-method code with calls to these new helper methods.
- Ensure the main method remains concise and easy to read.
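As a rough illustration of the extraction described above, here is a minimal sketch on a toy class. ToyDataSet, its fields, and the helper names _column_names and _profile_one are hypothetical stand-ins invented for this example; they are not the actual Intugle DataSet API, and the real profile_columns almost certainly does more.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataSet:
    """Hypothetical stand-in for illustration; not the real intugle DataSet."""

    rows: list[dict]
    profiles: dict = field(default_factory=dict)

    def profile_columns(self) -> "ToyDataSet":
        # The public method now reads as a short list of named steps.
        names = self._column_names()
        self.profiles = {name: self._profile_one(name) for name in names}
        return self

    def _column_names(self) -> list[str]:
        # Collect column names in first-seen order.
        seen: dict[str, None] = {}
        for row in self.rows:
            for key in row:
                seen.setdefault(key, None)
        return list(seen)

    def _profile_one(self, name: str) -> dict:
        # Count total, null, and distinct values for a single column.
        values = [row.get(name) for row in self.rows]
        non_null = [v for v in values if v is not None]
        return {
            "count": len(values),
            "null_count": len(values) - len(non_null),
            "distinct_count": len(set(non_null)),
        }


ds = ToyDataSet(rows=[{"id": 1, "city": "Oslo"}, {"id": 2, "city": None}, {"id": 2}])
ds.profile_columns()
print(ds.profiles)
```

Each helper does one thing and returns a value, so it can be tested on its own without running the whole profiling step.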
Why This Matters
- Improves code readability and maintainability for all contributors.
- Makes it easier to test and debug smaller, focused functions.
- Helps new contributors understand the codebase faster.
What You'll Learn
- How to refactor large Python methods into smaller, reusable functions
- Best practices for code organization and readability
Step-by-Step Guide
Prerequisites
- Python 3.10+ installed
- Git basics (clone, commit, push, pull request)
- Read our CONTRIBUTING.md guide
Setup Instructions
- Fork and clone the repository
  git clone https://github.com/YOUR_USERNAME/data-tools.git
  cd data-tools
- Create a virtual environment
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install dependencies
  pip install -e ".[dev]"
- Create a new branch
  git checkout -b fix/issue-NUMBER-refactor-dataset-methods
Implementation Steps
- Open src/intugle/analysis/models.py and locate the DataSet class.
- Choose one of the larger methods (e.g., profile_columns, identify_keys, or generate_glossary).
- Identify logical blocks within the method that can be separated (e.g., collecting column data, building DataFrames, updating attributes).
- Move these blocks into private helper methods (prefix with _), placing them as methods of the DataSet class.
- Replace the original code in the main method with calls to these new helper methods.
- Ensure all existing tests pass and that the refactored code behaves identically.
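When choosing which method to refactor, one option is to measure method lengths instead of relying on approximate line numbers. The sketch below uses the standard-library ast module; the method_lengths helper and the SAMPLE source are made up for this example and are not part of the Intugle codebase.

```python
import ast


def method_lengths(source: str, class_name: str) -> dict[str, int]:
    """Return {method_name: line_count} for the methods defined on one class."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                item.name: item.end_lineno - item.lineno + 1
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            }
    return {}


# Toy source used only to demonstrate the helper.
SAMPLE = '''
class DataSet:
    def short(self):
        return 1

    def longer(self):
        a = 1
        b = 2
        c = a + b
        return c
'''

lengths = method_lengths(SAMPLE, "DataSet")
for name, n in sorted(lengths.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {n} lines")
```

To run it on the real file, read the source with pathlib.Path("src/intugle/analysis/models.py").read_text() from the repository root and pass it in place of SAMPLE.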
Files to Modify
- File: src/intugle/analysis/models.py
  - Change: Refactor at least one large method in the DataSet class by extracting helper functions.
  - Line(s): For example, profile_columns (around line 120-150), identify_keys (around line 180-220), or generate_glossary (around line 260-300).
Testing Your Changes
# Run tests
pytest tests/
# Or run specific test
pytest tests/test_analysis_models.py
Submitting Your Work
- Commit your changes
  git add .
  git commit -m "Refactor DataSet method(s) into smaller helper functions"
- Push to your fork
  git push origin fix/issue-NUMBER-refactor-dataset-methods
- Create a Pull Request
  - Go to the original repository
  - Click "Pull Requests" → "New Pull Request"
  - Select your branch
  - Fill out the PR template
  - Reference this issue with "Fixes #ISSUE_NUMBER"
Example Code
# Before
def profile_columns(self) -> 'DataSet':
    # ... (long method with multiple steps)
    ...

# After
def profile_columns(self) -> 'DataSet':
    self._collect_column_profiles()
    return self

def _collect_column_profiles(self):
    # ... (code moved from profile_columns)
    ...
Expected Outcome
- The chosen method in DataSet is now concise and calls one or more private helper methods.
- The helper methods are well-named and encapsulate logical sub-tasks.
- All tests pass and there is no change in functionality.
Definition of Done
- Code changes implemented
- Tests added/updated
- Tests passing locally
- Code follows project style guidelines
- No new linter warnings
- Documentation updated (if needed)
- Pull request submitted
Resources
Need Help?
Don't hesitate to ask questions! We're here to help you succeed.
- Comment below with your questions
- Join our Discord for real-time support
- Tag maintainers: @raphael-intugle
Skills You'll Use
- Python basics
- Git and GitHub
- Testing with pytest (optional)
- Other: Refactoring, code organization
Thank you for contributing to Intugle!
Tips for Success:
- Take your time and read through everything carefully
- Don't be afraid to ask questions
- Test your changes before submitting
- Have fun!