[Sub-Issue #549.2] Create S11 Module 1 - Data Labeling with Label Studio #552

@SkafteNicki

Parent Issue

Part of #549 - New learning session on data engineering

Phase

Phase 1 - Create New S11 Data Engineering Session

Description

Create the first module of the new S11 Data Engineering session. The module covers data labeling with Label Studio and teaches students how to label new data collected from deployed models.

Tasks

  • Create new session directory: s11_data_engineering/
  • Create module file: s11_data_engineering/data_labeling.md
  • Write module content covering:
    • Introduction to data labeling in the MLOps lifecycle
    • Why data labeling is important for model retraining
    • Label Studio setup and installation
    • Creating labeling projects
    • Labeling workflows and best practices
    • Exporting labeled data
    • Integration with ML pipelines
  • Create exercises for students:
    • Install and configure Label Studio
    • Create a labeling project
    • Label a sample dataset
    • Export and use labeled data
  • Create exercise_files/ directory if needed
  • Create initial S11 README.md with session overview
  • Ensure consistent formatting with other course modules
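
As a starting point for the "Export and use labeled data" exercise, here is a minimal sketch of turning a Label Studio JSON export into (text, label) pairs for a retraining pipeline. It assumes a single-choice text classification project whose tasks store the input under a `text` key. The sample data, the `load_choice_labels` helper, and the `text_key` parameter are hypothetical and only for illustration. The field names (`data`, `annotations`, `result`, `value.choices`, `was_cancelled`) reflect my understanding of the JSON export layout and should be checked against a real export before the module is written.

```python
import json

# Hypothetical export with three tasks, shaped like a Label Studio JSON export
# for a single-choice text classification project (an assumption, not real data).
SAMPLE_EXPORT = json.dumps([
    {
        "id": 1,
        "data": {"text": "great movie"},
        "annotations": [
            {"was_cancelled": False,
             "result": [{"type": "choices", "value": {"choices": ["Positive"]}}]}
        ],
    },
    {
        "id": 2,
        "data": {"text": "terrible plot"},
        "annotations": [
            {"was_cancelled": False,
             "result": [{"type": "choices", "value": {"choices": ["Negative"]}}]}
        ],
    },
    # A task nobody has labeled yet; it should be skipped.
    {"id": 3, "data": {"text": "unlabeled sample"}, "annotations": []},
])


def load_choice_labels(export_json: str, text_key: str = "text") -> list[tuple[str, str]]:
    """Return (text, label) pairs from the first usable annotation of each task."""
    pairs = []
    for task in json.loads(export_json):
        # Ignore annotations the annotator skipped or cancelled.
        annotations = [
            a for a in task.get("annotations", [])
            if not a.get("was_cancelled", False)
        ]
        if not annotations:
            continue
        # Use the first "choices" result that has at least one selected choice.
        for result in annotations[0].get("result", []):
            choices = result.get("value", {}).get("choices", [])
            if result.get("type") == "choices" and choices:
                pairs.append((task["data"][text_key], choices[0]))
                break
    return pairs


if __name__ == "__main__":
    for text, label in load_choice_labels(SAMPLE_EXPORT):
        print(f"{label}\t{text}")
```

Taking only the first annotation keeps the sketch short. A real exercise would likely need to decide how to handle tasks with several annotators, for example by majority vote.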

References

Dependencies

None - this can be worked on immediately

Acceptance Criteria

  • Module follows the same structure/format as other course modules
  • Content is clear and appropriate for the course level
  • Exercises are practical and achievable
  • Code examples are tested and working
  • Module introduces data labeling in the context of the full ML lifecycle
