Conversation
There was a problem hiding this comment.
Pull request overview
This PR implements a backtracking algorithm for optimizing course schedule selection at NTU. The implementation allows students to input desired courses and occupied time slots, then finds a valid combination of course indices (sections) that don't conflict.
Changes:
- Implemented core optimization algorithm using backtracking with schedule conflict detection via bitmasks
- Added comprehensive test suite covering various scenarios including edge cases
- Modified API view to handle no-solution cases by returning empty list instead of None
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| apps/optimizer/algo.py | Implements the core optimization logic including schedule parsing, bitmask operations, and backtracking solver |
| apps/optimizer/views.py | Updates response handling to return empty list when no solution exists; reorders imports |
| apps/optimizer/tests.py | Adds extensive test coverage for parsing functions, solver, optimizer, and API endpoints |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # 1. Calculate Fixed Masks (Sessions common to all indices of a course) | ||
| fixed_masks = {} | ||
| for c_id, indices in course_data.items(): | ||
| if not indices: | ||
| continue | ||
| f_mask = list(indices.values())[0] | ||
| for mask in indices.values(): | ||
| f_mask &= mask | ||
| fixed_masks[c_id] = f_mask | ||
|
|
There was a problem hiding this comment.
The fixed_masks dictionary is calculated (lines 138-145) but never used in the rest of the function. This computation involves iterating through all course indices and performing bitwise AND operations, which wastes CPU cycles. Either remove this dead code or add a comment explaining why it's being preserved (e.g., for future use or debugging).
| # 1. Calculate Fixed Masks (Sessions common to all indices of a course) | |
| fixed_masks = {} | |
| for c_id, indices in course_data.items(): | |
| if not indices: | |
| continue | |
| f_mask = list(indices.values())[0] | |
| for mask in indices.values(): | |
| f_mask &= mask | |
| fixed_masks[c_id] = f_mask |
| start_time, end_time = time_range.split('-') | ||
| start_slot = time_to_slot_index(start_time) | ||
| end_slot = time_to_slot_index(end_time) | ||
|
|
||
| # Set bits for all slots in the range (inclusive of end_slot) | ||
| day_offset = day_map[day] * 32 | ||
| for slot in range(start_slot, end_slot + 1): | ||
| if 0 <= slot < 32: | ||
| mask |= (1 << (day_offset + slot)) |
There was a problem hiding this comment.
The filtered_information_to_mask function doesn't handle the case where start_slot > end_slot, which could occur if time ranges are malformed or if a class spans midnight (though this is unlikely for most university classes). When this happens, range(start_slot, end_slot + 1) would produce an empty range, silently skipping the session without logging an error. Consider adding validation to detect and handle this edge case.
| for slot in range(start_slot, end_slot + 1): | ||
| if 0 <= slot < 32: | ||
| mask |= (1 << (day_offset + slot)) |
There was a problem hiding this comment.
The bounds check if 0 <= slot < 32 at line 80 means that if start_slot or end_slot fall outside the valid range (e.g., if a class starts before 8:00 AM or extends past midnight), those slots will be silently skipped. This could lead to incomplete schedule data where late-night or early-morning classes are not properly accounted for in conflict detection. Consider validating the slot indices earlier and logging a warning or error when time ranges are outside the supported window.
| if '-' not in time_range: | ||
| continue | ||
|
|
||
| start_time, end_time = time_range.split('-') |
There was a problem hiding this comment.
The time range parsing at line 73 uses split('-') which could fail if the time range contains multiple hyphens (e.g., "0930-1020-some-note") or could produce unexpected results. While this is unlikely given the expected format, consider using split('-', 1) to ensure only the first hyphen is used for splitting, or add validation to ensure exactly one hyphen exists in the time range.
| start_time, end_time = time_range.split('-') | |
| time_parts = time_range.split('-') | |
| if len(time_parts) != 2: | |
| # Malformed time range; skip this session | |
| continue | |
| start_time, end_time = time_parts |
| - courses: List of dicts with 'code', optional 'include', 'exclude' | ||
| - occupied: 192-char string ('O' or 'X') representing blocked time slots |
There was a problem hiding this comment.
The documentation mentions optional 'include' and 'exclude' fields in the courses list, but the algorithm doesn't use them. Either implement support for these fields to filter which indices can be selected for each course, or remove the mention of these fields from the documentation to avoid confusion.
| for course_code in wanted_courses: | ||
| try: | ||
| course = Course.objects.get(code=course_code) | ||
| except Course.DoesNotExist: | ||
| continue | ||
|
|
||
| # Common schedule for the course (applies to all indices) | ||
| common_mask = parse_schedule(course.common_schedule) if course.common_schedule else 0 | ||
|
|
||
| # Fetch indices with their schedules | ||
| indexes = CourseIndex.objects.filter(course_code=course_code).prefetch_related('schedules') | ||
| if not indexes.exists(): | ||
| continue | ||
|
|
||
| # Common schedules stored in CourseSchedule (optional) | ||
| common_schedules = CourseSchedule.objects.filter(common_schedule_for_course=course_code) | ||
| common_schedule_mask = 0 | ||
| for schedule in common_schedules: | ||
| if schedule.schedule: | ||
| common_schedule_mask |= parse_schedule(schedule.schedule) | ||
|
|
||
| course_data[course_code] = {} | ||
|
|
||
| for index in indexes: | ||
| index_id = index.index | ||
| index_mask = common_mask | common_schedule_mask | ||
|
|
||
| # Add schedules from CourseSchedule model | ||
| for schedule in index.schedules.all(): | ||
| if schedule.schedule: | ||
| index_mask |= parse_schedule(schedule.schedule) | ||
|
|
||
| # Add index-specific times from filtered_information | ||
| if index.filtered_information: | ||
| index_mask |= filtered_information_to_mask(index.filtered_information) | ||
|
|
||
| course_data[course_code][index_id] = index_mask | ||
|
|
||
| # 1. Calculate Fixed Masks (Sessions common to all indices of a course) | ||
| fixed_masks = {} | ||
| for c_id, indices in course_data.items(): | ||
| if not indices: | ||
| continue | ||
| f_mask = list(indices.values())[0] | ||
| for mask in indices.values(): | ||
| f_mask &= mask | ||
| fixed_masks[c_id] = f_mask | ||
|
|
||
| # 2. Constraint Mask is passed as parameter | ||
| # (Skip create_constraint_mask - work directly with the bitmask) | ||
|
|
||
| # 3. Build valid_indices - just filter out indices with no schedule when constraints exist | ||
| # Don't try to be clever with fixed_masks since it misses index-specific conflicts | ||
| valid_indices = {} | ||
| for c_id in course_data: | ||
| valid_indices[c_id] = [] | ||
| for idx_id, mask in course_data[c_id].items(): | ||
| # Reject indices with no schedule data when constraints are specified | ||
| if constraint_mask != 0 and mask == 0: | ||
| continue | ||
| # Check against user constraints only (not other courses - backtracker handles that) | ||
| if (mask & constraint_mask) == 0: | ||
| valid_indices[c_id].append(idx_id) | ||
|
|
||
| # 4. Solver | ||
| courses = list(valid_indices.keys()) |
There was a problem hiding this comment.
If a requested course doesn't exist or has no indices, it's silently skipped and the algorithm attempts to find a solution for the remaining courses. This could return a partial solution when a complete solution is impossible. The function should detect when any requested course cannot be satisfied (doesn't exist, has no indices, or all indices conflict with constraints) and return None immediately, rather than returning a solution for a subset of the requested courses.
| def _build_full_mask_for_assignment(assignment): | ||
| """Given an assignment list of (course_code, index_id), return dict of full masks for each course.""" | ||
| full_masks = {} | ||
| for course_code, index_id in assignment: | ||
| try: | ||
| course = Course.objects.get(code=course_code) | ||
| except Course.DoesNotExist: | ||
| full_masks[course_code] = 0 | ||
| continue | ||
| m = parse_schedule(course.common_schedule) if course.common_schedule else 0 | ||
| # include CourseSchedule entries common to course | ||
| for cs in CourseSchedule.objects.filter(common_schedule_for_course=course_code): | ||
| if cs.schedule: | ||
| m |= parse_schedule(cs.schedule) | ||
| # include index-specific schedules | ||
| try: | ||
| idx_obj = CourseIndex.objects.get(index=index_id) | ||
| except CourseIndex.DoesNotExist: | ||
| full_masks[course_code] = m | ||
| continue | ||
| for s in idx_obj.schedules.all(): | ||
| if s.schedule: | ||
| m |= parse_schedule(s.schedule) | ||
| # include filtered_information | ||
| if idx_obj.filtered_information: | ||
| m |= filtered_information_to_mask(idx_obj.filtered_information) | ||
| full_masks[course_code] = m | ||
| return full_masks |
There was a problem hiding this comment.
The _build_full_mask_for_assignment function queries the database for each course in the assignment (lines 171, 177, 182, 186) every time it's called. Since _assignment_is_valid is called for every complete assignment found by the backtracker, this results in redundant database queries. Consider caching these values or pre-computing them once, especially since the course data doesn't change during the algorithm execution. The same masks are already computed and stored in course_data during initialization.
| mask = 0 | ||
| for i, char in enumerate(occupied_str): | ||
| if char == 'X': | ||
| mask |= (1 << i) | ||
| return mask |
There was a problem hiding this comment.
The parse_schedule and occupied_str_to_mask functions are functionally identical - both convert a string of 'X' and other characters into a bitmask. Consider consolidating these into a single function to reduce code duplication. For example, occupied_str_to_mask could simply call parse_schedule after handling the empty string check.
| mask = 0 | |
| for i, char in enumerate(occupied_str): | |
| if char == 'X': | |
| mask |= (1 << i) | |
| return mask | |
| return parse_schedule(occupied_str) |
| {'code': 'MH1100', 'include': [], 'exclude': []}, | ||
| {'code': 'MH1200', 'include': [], 'exclude': []}, | ||
| {'code': 'MH1300', 'include': [], 'exclude': []}, |
There was a problem hiding this comment.
The test passes 'include' and 'exclude' fields in the input data, but these fields are not used by the optimize_index function (only 'code' is extracted at line 263 of algo.py). This test is passing data that will be silently ignored, which may give a false impression that these features are tested. Either implement support for these fields in the algorithm or remove them from this test.
| def time_to_slot_index(time_str): | ||
| """ | ||
| Convert time string like '0930' to slot index (0-31, starting from 8am). | ||
| 8:00 = slot 0, 8:30 = slot 1, 9:00 = slot 2, etc. | ||
| """ | ||
| hour = int(time_str[:2]) | ||
| minute = int(time_str[2:4]) | ||
| slot = (hour - 8) * 2 + (1 if minute >= 30 else 0) | ||
| return slot |
There was a problem hiding this comment.
The time_to_slot_index function lacks input validation and bounds checking. If the input time string is malformed (not exactly 4 digits) or represents a time outside the expected range (before 8:00 AM or after midnight), it could produce invalid slot indices or raise exceptions. Add validation to ensure the time string is properly formatted and within the expected range (8:00 AM to midnight, or whatever range is supported by the 32-slot system).
This PR implements a backtracking algorithm for the mods optimizer features, along with relevant test cases
A proof-of-concept frontend can be seen here: https://github.com/RobotHanzo/NTUModsOptimizerDemo
https://robothanzo.github.io/NTUModsOptimizerDemo/