ISSUE-7449: Offer Suggestions to Partially Matching OCR Scans #7760
Merged: david-yz-liu merged 8 commits into master from ISSUE-7449_offer_suggestions_to_partially_matching_ocr_scans on Dec 22, 2025
Changes from 7 commits
Commits
1ae8431 ISSUE-7449: Offer suggestions to partially matching OCR scans (Naragod)
86c685f ISSUE-7749: Clean up suggestions display (Naragod)
2e125ca ISSUE-7749: Update changelog (Naragod)
0487a67 ISSUE-7449: Remove exam template id (Naragod)
cf159aa ISSUE-7449: Apply suggestion threshold and clean up code (Naragod)
96ea99c ISSUE-7449: Internationalize strings (Naragod)
8d83af2 Merge branch 'master' into ISSUE-7449_offer_suggestions_to_partially_… (david-yz-liu)
b0d2ac1 ISSUE-7449: Use max_by to take highest similarity suggestions (Naragod)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| // OCR Suggestions Styles | ||
| // Used in assign_scans view for displaying OCR match data and student suggestions | ||
|
|
||
| @import 'constants'; | ||
|
|
||
| .ocr-suggestions-container { | ||
| margin: 1em 0; | ||
| padding: 1em; | ||
| background-color: $background-support; | ||
| border: 1px solid $gridline; | ||
| border-radius: var(--radius); | ||
| max-height: 400px; | ||
| overflow-y: auto; | ||
| overflow-x: hidden; | ||
|
|
||
| code { | ||
| background-color: $disabled-area; | ||
| padding: 0.2em 0.4em; | ||
| border-radius: 3px; | ||
| } | ||
|
|
||
| .no-match { | ||
| color: $disabled-text; | ||
| font-style: italic; | ||
| } | ||
| } | ||
|
|
||
| .ocr-suggestions-list { | ||
| margin-top: 0.5em; | ||
| position: static; | ||
|
|
||
| .ui-menu-item div:hover { | ||
| background-color: $primary-three; | ||
| color: $sharp-line; | ||
| } | ||
|
|
||
| .student-info { | ||
| font-size: 1.1em; | ||
| color: $sharp-line; | ||
| font-weight: 500; | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| /** | ||
| * OCR Suggestions Module | ||
| * Handles display and interaction with OCR match data and student suggestions | ||
| * Used in the assign_scans view for exam template processing | ||
| */ | ||
|
|
||
| export function updateOcrSuggestions(ocrMatch, suggestions = []) { | ||
| const container = $("#ocr_suggestions"); | ||
| container.empty(); | ||
|
|
||
| if (!ocrMatch) { | ||
| container.hide(); | ||
| return; | ||
| } | ||
|
|
||
| container.show(); | ||
|
|
||
| // internationalization | ||
| const noId = I18n.t("exam_templates.assign_scans.no_id"); | ||
| const idNumber = I18n.t("activerecord.attributes.user.id_number"); | ||
| const userName = I18n.t("activerecord.attributes.user.user_name"); | ||
| const suggestedStudents = I18n.t("exam_templates.assign_scans.suggested_students"); | ||
| const noSimilarStudents = I18n.t("exam_templates.assign_scans.no_similar_students"); | ||
|
|
||
| const ocrDisplay = $("<p></p>"); | ||
| // Display the parsed OCR value | ||
| const parsedValue = ocrMatch.parsed_value; | ||
| const fieldType = ocrMatch.field_type === "id_number" ? idNumber : userName; | ||
| const ocrDetected = I18n.t("exam_templates.assign_scans.ocr_detected", {field_type: fieldType}); | ||
|
|
||
| ocrDisplay.append(`<strong>${ocrDetected}</strong>`); | ||
| const codeElem = $("<code></code>").text(parsedValue); | ||
| ocrDisplay.append(codeElem); | ||
| container.append(ocrDisplay); | ||
|
|
||
| if (suggestions.length === 0) { | ||
| return container.append(`<p class="no-match">${noSimilarStudents}</p>`); | ||
| } | ||
|
|
||
| // Display suggestions if available | ||
| container.append(`<p><strong>${suggestedStudents}</strong></p>`); | ||
| const list = $('<ul class="ui-menu ocr-suggestions-list"></ul>'); | ||
|
|
||
| suggestions.forEach(function (suggestion) { | ||
| const similarity = suggestion.similarity; | ||
| const item = $('<li class="ui-menu-item"></li>'); | ||
| const content = $("<div></div>"); | ||
|
|
||
| // Use .text() to safely insert user-supplied data and prevent XSS | ||
| const nameElem = $("<strong></strong>").text(suggestion.display_name); | ||
| const infoText = `${suggestion.id_number || noId} | ${suggestion.user_name}`; | ||
| const infoElem = $('<span class="student-info"></span>').text(infoText); | ||
|
|
||
| content.append(nameElem); | ||
| content.append(` (${similarity}%)`); | ||
| content.append("<br>"); | ||
| content.append(infoElem); | ||
|
|
||
| content.on("click", function () { | ||
| $("#student_id").val(suggestion.id); | ||
| $("#names").val(suggestion.display_name); | ||
| $("#names").focus(); | ||
| }); | ||
|
|
||
| item.append(content); | ||
| list.append(item); | ||
| }); | ||
|
|
||
| container.append(list); | ||
| } |
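The JavaScript above reads `id`, `display_name`, `id_number`, `user_name`, and `similarity` from each suggestion and prints the similarity followed by a percent sign, while `OcrMatchService.get_suggestions` in this PR returns `{ student:, similarity: }` pairs with similarity as a fraction between 0 and 1. The code that converts one shape into the other is not part of the diff shown here. The following is a minimal sketch of what such a mapping could look like, not the PR's actual implementation: the `Student` struct, the `suggestion_payload` helper, and the rounding to one decimal place are all assumptions made for illustration.

```ruby
require "json"

# Hypothetical stand-in for a student record. The field names mirror the
# properties the JavaScript reads; the struct itself is not from the PR.
Student = Struct.new(:id, :display_name, :id_number, :user_name, keyword_init: true)

# Hypothetical helper: turn one student and a 0..1 similarity score into
# the hash shape the front end consumes, with similarity as a percentage.
def suggestion_payload(student, similarity)
  {
    id: student.id,
    display_name: student.display_name,
    id_number: student.id_number, # may be nil; the JS falls back to a "no ID" label
    user_name: student.user_name,
    similarity: (similarity * 100).round(1) # assumed rounding, for display only
  }
end

student = Student.new(id: 42, display_name: "Ada Lovelace", id_number: nil, user_name: "lovelaa")
puts JSON.generate(suggestion_payload(student, 0.8333))
```

A `nil` `id_number` serializes to JSON `null`, which is the case the `suggestion.id_number || noId` fallback in the JavaScript handles.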
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,113 @@ | ||
| # Service for storing and retrieving OCR match data from Redis. | ||
| # Used to persist OCR parsing results for scanned exam assignments, | ||
| # enabling suggestions for manual student assignment. | ||
| class OcrMatchService | ||
| # Time-to-live for OCR match data in Redis (30 days) | ||
| TTL = 30.days.to_i | ||
|
|
||
| class << self | ||
| # Store an OCR match result in Redis | ||
| def store_match(grouping_id, parsed_value, field_type, matched: false, student_id: nil) | ||
| data = { | ||
| parsed_value: parsed_value, | ||
| field_type: field_type, | ||
| timestamp: Time.current.iso8601, | ||
| matched: matched, | ||
| matched_student_id: student_id | ||
| } | ||
|
|
||
| redis.setex(match_key(grouping_id), TTL, data.to_json) | ||
|
|
||
| # Add to unmatched set if not auto-matched | ||
| unless matched | ||
| redis.sadd(unmatched_set_key, grouping_id) | ||
| redis.expire(unmatched_set_key, TTL) | ||
| end | ||
| end | ||
|
|
||
| # Retrieve stored OCR match data for a grouping | ||
| def get_match(grouping_id) | ||
| data = redis.get(match_key(grouping_id)) | ||
| data ? JSON.parse(data, symbolize_names: true) : nil | ||
| end | ||
|
|
||
| # Get student suggestions based on stored OCR match using fuzzy matching | ||
| # Only considers students not already assigned to a grouping for this assignment | ||
| # Returns students meeting the similarity threshold (default 80%), limited to top matches (default 5) | ||
| def get_suggestions(grouping_id, course_id, threshold: 0.8, limit: 5) | ||
| match_data = get_match(grouping_id) | ||
| return [] if match_data.nil? | ||
|
|
||
| grouping = Grouping.find(grouping_id) | ||
| assignment = grouping.assignment | ||
| course = Course.find(course_id) | ||
|
|
||
| # Get students who are not assigned to any grouping for this assignment | ||
| assigned_student_ids = assignment.groupings | ||
| .joins(:student_memberships) | ||
| .pluck('memberships.role_id') | ||
| students = course.students.includes(:user).where.not(id: assigned_student_ids) | ||
|
|
||
| # Calculate similarity scores for each student | ||
| suggestions = students.filter_map do |student| | ||
| value_to_match = student_match_value(student, match_data[:field_type]) | ||
| next if value_to_match.blank? | ||
|
|
||
| similarity = string_similarity(match_data[:parsed_value], value_to_match) | ||
| next if similarity < threshold | ||
|
|
||
| { student: student, similarity: similarity } | ||
| end | ||
|
|
||
| # Sort by similarity (highest first) and limit results | ||
| suggestions.sort_by { |s| -s[:similarity] }.take(limit) | ||
| end | ||
|
|
||
| # Clear OCR match data after manual assignment | ||
| def clear_match(grouping_id) | ||
| redis.del(match_key(grouping_id)) | ||
| redis.srem(unmatched_set_key, grouping_id) | ||
| end | ||
|
|
||
| private | ||
|
|
||
| def match_key(grouping_id) | ||
| "ocr_matches:grouping:#{grouping_id}" | ||
| end | ||
|
|
||
| def unmatched_set_key | ||
| 'ocr_matches:unmatched' | ||
| end | ||
|
|
||
| def redis | ||
| Redis::Namespace.new(Rails.root.to_s, redis: Resque.redis) | ||
| end | ||
|
|
||
| # Get the value to match against based on field type | ||
| def student_match_value(student, field_type) | ||
| case field_type | ||
| when 'id_number' then student.user.id_number | ||
| when 'user_name' then student.user.user_name | ||
| end | ||
| end | ||
|
|
||
| # Calculate similarity between two strings using Levenshtein distance | ||
| # Returns a score between 0 and 1, where 1 is identical | ||
| def string_similarity(str1, str2) | ||
| return 1.0 if str1 == str2 | ||
| return 0.0 if str1.blank? || str2.blank? | ||
|
|
||
| # Normalize strings for case-insensitive comparison | ||
| s1 = str1.to_s.downcase.strip | ||
| s2 = str2.to_s.downcase.strip | ||
| return 1.0 if s1 == s2 | ||
|
|
||
| # Use the Levenshtein implementation from Ruby's bundled did_you_mean gem | ||
| distance = DidYouMean::Levenshtein.distance(s1, s2) | ||
| max_length = [s1.length, s2.length].max | ||
| return 0.0 if max_length.zero? | ||
|
|
||
| 1.0 - (distance.to_f / max_length) | ||
| end | ||
| end | ||
| end | ||
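To make the scoring concrete, here is a plain-Ruby restatement of the `string_similarity` logic together with the threshold-and-limit filtering from `get_suggestions`. It is a sketch for experimenting outside Rails, not the service itself: `similarity`, `suggest`, and the `ROSTER` values are hypothetical names and data, and ActiveSupport's `blank?` is replaced with `strip` plus `empty?`. The score is `1 - distance / max_length`, so a single misread digit in a 10-digit ID scores 0.9 and passes the default 0.8 threshold.

```ruby
require "did_you_mean" # bundled with Ruby; provides DidYouMean::Levenshtein

# Normalized Levenshtein similarity in 0..1, following the same steps as
# the service's string_similarity but without ActiveSupport.
def similarity(a, b)
  return 1.0 if a == b

  s1 = a.to_s.downcase.strip
  s2 = b.to_s.downcase.strip
  return 0.0 if s1.empty? || s2.empty?
  return 1.0 if s1 == s2

  distance = DidYouMean::Levenshtein.distance(s1, s2)
  1.0 - (distance.to_f / [s1.length, s2.length].max)
end

# Hypothetical roster of ID numbers, not taken from the PR.
ROSTER = %w[1004567890 1004567899 2104567890 9999999999].freeze

# Keep candidates at or above the threshold, best first, capped at limit.
def suggest(parsed, candidates, threshold: 0.8, limit: 5)
  candidates
    .map { |c| { value: c, similarity: similarity(parsed, c) } }
    .select { |s| s[:similarity] >= threshold }
    .sort_by { |s| -s[:similarity] }
    .take(limit)
end

# An OCR read with the last digit wrong still surfaces the near matches.
suggest("1004567898", ROSTER).each do |s|
  puts "#{s[:value]} #{(s[:similarity] * 100).round(1)}%"
end
```

Because the score is normalized by the longer string, short values are penalized more per error: one wrong character in a six-character user name scores about 0.83, and two wrong characters drop it to about 0.67, below the default threshold.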