Issue Description
The current machine learning model in the elimu-ai/ml-storybook-reading-level repository predicts a storybook’s reading level from metrics like total word count. To improve the model’s accuracy and allow more granular predictions, it should also incorporate text complexity metrics such as Flesch-Kincaid Grade Level and Lexile scores. These metrics account for factors like sentence length, word difficulty, and syllable count, and are widely used in educational tools to assess readability. Adding them will make the model more useful to the Java-based webapp and better serve out-of-school children by assigning reading levels more precisely.
Proposed Solution
Extend the model to calculate additional text complexity metrics (Flesch-Kincaid Grade Level and Lexile scores) and use them as features for predicting reading levels. The solution includes:
Computing Flesch-Kincaid and Lexile scores for input storybook text.
Updating the feature extraction pipeline to include these metrics.
Ensuring the updated model is exported in a Java-compatible format (e.g., PMML) for integration with the webapp.
Maintaining compatibility with the existing Python-based workflow (run_all_steps.py).
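As a starting point for discussion, here is a minimal sketch of the first two items. It uses the standard Flesch-Kincaid Grade Level formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The function names (`count_syllables`, `flesch_kincaid_grade`, `extract_features`) and the feature keys are hypothetical and not taken from the repository. The syllable counter is a rough vowel-group heuristic that assumes English text. Lexile is left out because, as far as I know, Lexile measures come from a proprietary analyzer rather than a published formula.

```python
import re

# Hypothetical helpers for illustration; none of these names come from the repository.
_VOWEL_GROUP = re.compile(r"[aeiouy]+")
_WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")
_SENTENCE_END = re.compile(r"[.!?]+")


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (English-only heuristic)."""
    word = word.lower()
    count = len(_VOWEL_GROUP.findall(word))
    # Discount a trailing silent 'e' ("make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level; returns 0.0 when the text has no words or sentences."""
    words = _WORD.findall(text)
    sentences = [s for s in _SENTENCE_END.split(text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def extract_features(text: str) -> dict:
    """Combine the existing word-count metric with the new readability score."""
    return {
        "word_count": len(_WORD.findall(text)),
        "flesch_kincaid_grade": flesch_kincaid_grade(text),
    }
```

For example, `extract_features("The cat sat on the mat.")` yields a word count of 6 and a grade of about -1.45. Very simple texts can score below zero on this scale, so the model may need the raw value rather than a clipped one.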