Commit 8a3b996

Merge pull request #4 from mgb45/Handling-preferences
Add preference handling, graceful behaviour with constraint subsets.
2 parents af81598 + 983bca2 commit 8a3b996

4 files changed: +301 -134 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -35,3 +35,6 @@ var/
 .installed.cfg
 *.egg
 MANIFES
+
+# Test files
+*.xlsx

README.md

Lines changed: 54 additions & 23 deletions
@@ -1,53 +1,84 @@
 # Teamformer
 
-
-Teamformer builds student teams for you. The primary objective is to form as few teams as are needed while ensuring constraints are met, and encouraging WAM (weighted average mark/gpa) balance across teams. The system is basically a wrapper around a CP-SAT solver using Google OR-Tools.
+Teamformer builds student teams for you. The primary objective is to form as few teams as needed while ensuring constraints are met and encouraging WAM (weighted average mark/GPA) balance across teams. The system is a wrapper around a CP-SAT solver using Google OR-Tools.
 
 Constraint handling includes:
 
-✅ Each student is assigned to exactly one team
+✅ Each student is assigned to exactly one team
 
 ✅ Each team has between min and max students
 
-✅ No team has only one student of a given gender (current only M/F, if other self-report categories these are ignored and not balanced, but does not break anything)
+✅ No team has only one student of a given gender (currently only M/F; other self-reported categories are ignored for balancing, but won't break anything)
 
 ✅ The number of teams used is minimized
 
 ✅ Students are only assigned to teams in the same lab as them
 
-✅ Deviation from average WAM across class is penalised
+✅ Deviation from average WAM across the class is penalised
+
+**Student preferences are favoured (positive and negative preferences now supported!)**
 
-❌ Student preferences are favoured (Not yet implemented)
+The output is an Excel sheet with students and teams. Team numbers may not be sequential (drawn from 1\:max\_teams).
 
-The output is an excel sheet with students and teams. Team numbers may not be sequential (drawn from 1:max_teams).
+---
 
 ### Data structure
-Team former assumes data is in a spreadsheet that looks something like (fake data):
 
-| | first_name | last_name | email | gender | wam | lab |
-|---:|:-------------|:------------|:--------------------------|:---------|------:|------:|
-| 0 | Mark | Johnson | ... | M | 51.13 | 3 |
-| 1 | Donald | Walker | ... | M | 60.04 | 1 |
-| 2 | Sarah | Rhodes | ... | F | 76.57 | 1 |
-| 3 | Steven | Miller | ... | M | 54.22 | 2 |
-| 4 | Javier | Johnson | ... | M | 75.26 | 4 |
+Teamformer expects data in a spreadsheet like this (fake example):
+
+| | Student\_ID | first\_name | last\_name | email | gender | wam | lab | Prefer\_With | Prefer\_Not\_With |
+| - | ----------- | ----------- | ---------- | ----- | ------ | ---- | --- | ------------ | ----------------- |
+| 0 | S1 | Mark | Johnson | ... | M | 51.1 | 3 | S2, S3 | S4 |
+| 1 | S2 | Donald | Walker | ... | M | 60.0 | 1 | | |
+| 2 | S3 | Sarah | Rhodes | ... | F | 76.6 | 1 | S1 | S3 |
+| 3 | S4 | Steven | Miller | ... | M | 54.2 | 2 | | |
+| 4 | S5 | Javier | Johnson | ... | M | 75.3 | 4 | | |
+
+**Columns used:**
+
+* `Student_ID`
+* `gender`
+* `wam`
+* `lab`
+* `Prefer_With` (optional): comma-separated list of Student\_IDs the student wants to work with
+* `Prefer_Not_With` (optional): comma-separated list of Student\_IDs the student prefers not to work with
 
-**Only the gender, wam and lab columns are used.**
+---
 
 ### Install
 
-```
+```bash
 pip install -e .
 ```
+
+---
+
 ### Run
 
-```
-team_former --input_file=students.xlsx --sheet_name=0 --output_file=teams.xlsx --wam_weight=0.05 --min_team_size=3 --max_team_size=5 --max_solve_time=30
+```bash
+team_former --input_file=students.xlsx --sheet_name=0 --output_file=teams.xlsx --wam_weight=0.05 --pos_pref_weight=0.8 --neg_pref_weight=0.8 --min_team_size=3 --max_team_size=5 --max_solve_time=30
 ```
 
+---
+
 ### How to get a good solution
 
-Depending on your class sizes, demographics and lab distribution, you may struggle to find a feasible solution. Options to address this include:
-* Increase the max solve time, it may just be a matter of waiting a bit longer
-* Reduce or remove the wam weight penalty
-* Reduce the minimimun team size, it may be that the balance of students is infeasible.
+Depending on your class sizes, demographics, and lab distribution, you may struggle to find a feasible solution. Options to address this include:
+
+* Increase the max solve time — it may just be a matter of waiting longer
+* Reduce or remove the WAM weight penalty
+* Adjust the minimum team size — sometimes team balance is infeasible
+* Adjust positive or negative preference weights (e.g., set `pos_pref_weight=0.5` if you want preferences to have less influence)
+
+---
+
+### Preference handling
+
+When using preference columns, Teamformer will attempt to:
+
+* **Keep students together** if listed in `Prefer_With`, unless it conflicts with other constraints.
+* **Avoid assigning students together** if listed in `Prefer_Not_With`.
+
+Preferences are not strictly enforced (they are "soft" constraints), but they strongly influence the solution when weights are set high.
+
+---
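
Reviewer sketch (not part of the commit): the "soft constraint" behaviour described in the README's Preference handling section can be illustrated without the solver by scoring two fixed assignments by hand. The function `preference_score` and the `team_of` mapping are hypothetical names introduced here. The sketch assumes the objective is teams used, minus a bonus per satisfied `Prefer_With` pair, plus a penalty per violated `Prefer_Not_With` pair, and it deliberately ignores the WAM term and the hard constraints (team size, lab, gender). The default weights are the ones from the README's example command.

```python
def preference_score(team_of, pos_prefs, neg_prefs,
                     pos_pref_weight=0.8, neg_pref_weight=0.8):
    """Score a fixed assignment; lower is better (illustrative only)."""
    teams_used = len(set(team_of.values()))
    # A positive pair counts when both students share a team.
    together = sum(1 for a, b in pos_prefs if team_of[a] == team_of[b])
    # A negative pair is penalised when both students share a team.
    clashes = sum(1 for a, b in neg_prefs if team_of[a] == team_of[b])
    return teams_used - pos_pref_weight * together + neg_pref_weight * clashes


# Pairs taken from the example table: S1 prefers S2 and S3, S3 prefers S1,
# and S1 would rather not work with S4.
pos_prefs = [("S1", "S2"), ("S1", "S3"), ("S3", "S1")]
neg_prefs = [("S1", "S4")]

# Two hypothetical assignments that both use two teams.
honours_prefs = {"S1": 1, "S2": 1, "S3": 1, "S4": 2, "S5": 2}
ignores_prefs = {"S1": 1, "S4": 1, "S2": 2, "S3": 2, "S5": 2}

print(preference_score(honours_prefs, pos_prefs, neg_prefs))
print(preference_score(ignores_prefs, pos_prefs, neg_prefs))
```

Both assignments use the same number of teams, so only the preference terms separate them; raising or lowering the two weights changes how strongly that difference counts against the other objective terms.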

team_former/make_teams.py

Lines changed: 135 additions & 47 deletions
@@ -7,24 +7,67 @@
 from ortools.sat.python import cp_model
 
 
+def parse_preferences(df):
+    """Parse positive and negative preferences from the DataFrame columns."""
+    id_to_index = {row["Student_ID"]: idx for idx, row in df.iterrows()}
+
+    positive_prefs = []
+    negative_prefs = []
+
+    has_pos = "Prefer_With" in df.columns
+    has_neg = "Prefer_Not_With" in df.columns
+
+    for _, row in df.iterrows():
+        student = row["Student_ID"].strip()
+
+        # Positive preferences
+        if has_pos and pd.notna(row["Prefer_With"]) and row["Prefer_With"].strip():
+            preferred = [s.strip() for s in row["Prefer_With"].split(",") if s.strip()]
+            for target in preferred:
+                if target in id_to_index:
+                    positive_prefs.append((student, target))
+
+        # Negative preferences
+        if (
+            has_neg
+            and pd.notna(row["Prefer_Not_With"])
+            and row["Prefer_Not_With"].strip()
+        ):
+            not_preferred = [
+                s.strip() for s in row["Prefer_Not_With"].split(",") if s.strip()
+            ]
+            for target in not_preferred:
+                if target in id_to_index:
+                    negative_prefs.append((student, target))
+
+    positive_prefs = [(id_to_index[a], id_to_index[b]) for (a, b) in positive_prefs]
+    negative_prefs = [(id_to_index[a], id_to_index[b]) for (a, b) in negative_prefs]
+
+    return positive_prefs, negative_prefs
+
+
 def allocate_teams(
     *,
     input_file="students.xlsx",
     sheet_name=0,
     output_file="class_teams.xlsx",
     wam_weight=0.05,
+    pos_pref_weight=0.05,
+    neg_pref_weight=0.1,
     min_team_size=4,
     max_team_size=5,
     max_solve_time=60,
 ):
     """
-    Allocate students into balanced teams based on WAM, gender, and lab constraints.
+    Allocate students into balanced teams based on optional WAM, gender, lab, and preferences.
 
     Args:
         input_file (str): Path to the Excel file with student data.
         sheet_name (int or str): Sheet index or name.
         output_file (str): Output Excel file with team assignments.
         wam_weight (float): Weight for WAM balancing in the objective.
+        pos_pref_weight (float): Weight for positive preference balancing.
+        neg_pref_weight (float): Weight for negative preference balancing.
         min_team_size (int): Minimum number of students per team.
         max_team_size (int): Maximum number of students per team.
         max_solve_time (int): Solver timeout in seconds.
@@ -34,13 +77,25 @@ def allocate_teams(
 
     students = student_df.to_dict(orient="index")
     num_students = len(students)
-    genders = student_df["gender"]
-    wams = student_df["wam"].astype(int).values
-    lab_ids = sorted(set(student_df["lab"].astype(int).values))
-    student_labs = student_df["lab"].astype(int).values
-    global_avg_wam = sum(wams) // len(wams)
     max_teams = num_students // min_team_size
 
+    has_wam = "wam" in student_df.columns
+    has_lab = "lab" in student_df.columns
+    has_gender = "gender" in student_df.columns
+
+    if has_wam:
+        wams = student_df["wam"].astype(int).values
+        global_avg_wam = int(sum(wams) / len(wams))
+
+    if has_lab:
+        lab_ids = sorted(set(student_df["lab"].astype(int).values))
+        student_labs = student_df["lab"].astype(int).values
+
+    if has_gender:
+        genders = student_df["gender"].values
+
+    pos_preferences, neg_preferences = parse_preferences(student_df)
+
     model = cp_model.CpModel()
 
     # Variables
@@ -51,11 +106,13 @@ def allocate_teams(
     }
 
     team_used = [model.NewBoolVar(f"team_used_{team}") for team in range(max_teams)]
-    lab_team = {
-        (team, lab): model.NewBoolVar(f"team_{team}_lab_{lab}")
-        for team in range(max_teams)
-        for lab in lab_ids
-    }
+
+    if has_lab:
+        lab_team = {
+            (team, lab): model.NewBoolVar(f"team_{team}_lab_{lab}")
+            for team in range(max_teams)
+            for lab in lab_ids
+        }
 
     # Constraints
     for i in range(num_students):
@@ -67,45 +124,76 @@ def allocate_teams(
         model.Add(team_size >= min_team_size).OnlyEnforceIf(team_used[team])
         model.Add(team_size == 0).OnlyEnforceIf(team_used[team].Not())
 
-    for team in range(max_teams):
-        model.AddExactlyOne(lab_team[team, lab] for lab in lab_ids)
-
-    for i in range(num_students):
+    if has_lab:
         for team in range(max_teams):
-            model.Add(lab_team[team, student_labs[i]] == 1).OnlyEnforceIf(
-                assign[i, team]
-            )
+            model.AddExactlyOne(lab_team[team, lab] for lab in lab_ids)
 
-    for team in range(max_teams):
-        male_students = [
-            assign[i, team] for i in range(num_students) if genders[i] == "M"
-        ]
-        female_students = [
-            assign[i, team] for i in range(num_students) if genders[i] == "F"
-        ]
-        if male_students:
-            model.Add(sum(male_students) != 1)
-        if female_students:
-            model.Add(sum(female_students) != 1)
-
-    # Objective: minimize number of teams + balance WAM
+        for i in range(num_students):
+            for team in range(max_teams):
+                model.Add(lab_team[team, student_labs[i]] == 1).OnlyEnforceIf(
+                    assign[i, team]
+                )
+
+    if has_gender:
+        for team in range(max_teams):
+            male_students = [
+                assign[i, team] for i in range(num_students) if genders[i] == "M"
+            ]
+            female_students = [
+                assign[i, team] for i in range(num_students) if genders[i] == "F"
+            ]
+            if male_students:
+                model.Add(sum(male_students) != 1)
+            if female_students:
+                model.Add(sum(female_students) != 1)
+
+    # Objective terms
     squared_deviation_terms = []
-    for team in range(max_teams):
-        wam_sum = model.NewIntVar(0, 100 * max_team_size, f"wam_sum_{team}")
-        size_var = model.NewIntVar(0, max_team_size, f"team_size_{team}")
-        model.Add(size_var == sum(assign[i, team] for i in range(num_students)))
-        model.Add(
-            wam_sum == sum(wams[i] * assign[i, team] for i in range(num_students))
-        )
-        diff = model.NewIntVar(-500, 500, f"wam_diff_{team}")
-        model.Add(diff == wam_sum - size_var * global_avg_wam)
-        squared_diff = model.NewIntVar(0, 250000, f"squared_diff_{team}")
-        model.AddMultiplicationEquality(squared_diff, [diff, diff])
-        squared_deviation_terms.append(squared_diff)
-
-    model.Minimize(
-        sum(team_used) + int(wam_weight * 1000) * sum(squared_deviation_terms)
-    )
+    if has_wam:
+        for team in range(max_teams):
+            wam_sum = model.NewIntVar(0, 100 * max_team_size, f"wam_sum_{team}")
+            size_var = model.NewIntVar(0, max_team_size, f"team_size_{team}")
+            model.Add(size_var == sum(assign[i, team] for i in range(num_students)))
+            model.Add(
+                wam_sum == sum(wams[i] * assign[i, team] for i in range(num_students))
+            )
+            diff = model.NewIntVar(-500, 500, f"wam_diff_{team}")
+            model.Add(diff == wam_sum - size_var * global_avg_wam)
+            squared_diff = model.NewIntVar(0, 250000, f"squared_diff_{team}")
+            model.AddMultiplicationEquality(squared_diff, [diff, diff])
+            squared_deviation_terms.append(squared_diff)
+
+    pref_bonus_terms = []
+    for i, j in pos_preferences:
+        for team in range(max_teams):
+            together = model.NewBoolVar(f"prefer_{i}_{j}_team_{team}")
+            model.AddBoolAnd([assign[i, team], assign[j, team]]).OnlyEnforceIf(together)
+            model.AddBoolOr(
+                [assign[i, team].Not(), assign[j, team].Not()]
+            ).OnlyEnforceIf(together.Not())
+            pref_bonus_terms.append(together)
+
+    negative_terms = []
+    for i, j in neg_preferences:
+        for team in range(max_teams):
+            both = model.NewBoolVar(f"neg_pref_{i}_{j}_team_{team}")
+            model.AddBoolAnd([assign[i, team], assign[j, team]]).OnlyEnforceIf(both)
+            model.AddBoolOr(
+                [assign[i, team].Not(), assign[j, team].Not()]
+            ).OnlyEnforceIf(both.Not())
+            negative_terms.append(both)
+
+    # Objective
+    objective_terms = [sum(team_used)]
+
+    if has_wam and wam_weight > 0:
+        objective_terms.append(int(wam_weight * 1000) * sum(squared_deviation_terms))
+    if pos_pref_weight > 0:
+        objective_terms.append(-pos_pref_weight * sum(pref_bonus_terms))
+    if neg_pref_weight > 0:
+        objective_terms.append(neg_pref_weight * sum(negative_terms))
+
+    model.Minimize(sum(objective_terms))
 
     # Solve
     solver = cp_model.CpSolver()
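
Reviewer sketch (not part of the commit): the parsing step in `parse_preferences` can be tried without pandas on plain dictionaries. The names `parse_id_list` and `parse_preference_rows` are hypothetical, and rows are indexed by list position rather than by DataFrame index. Two choices here are assumptions that differ from the diff: `Student_ID` is stripped once when the lookup table is built, so the keys and the stored pairs always agree (in the diff the keys are the raw values while the row's own ID is stripped, which may fail on IDs with stray whitespace), and self-references such as the `S3`/`S3` row in the README example are skipped.

```python
def parse_id_list(cell):
    """Split a comma-separated cell into stripped, non-empty IDs."""
    if not isinstance(cell, str):
        return []  # None or NaN (a float) means no preference was given
    return [part.strip() for part in cell.split(",") if part.strip()]


def parse_preference_rows(rows):
    """Return (positive, negative) lists of (student_index, target_index)."""
    # Normalise IDs once so lookups and stored pairs use the same keys.
    id_to_index = {row["Student_ID"].strip(): idx for idx, row in enumerate(rows)}

    positive, negative = [], []
    for idx, row in enumerate(rows):
        for column, out in (("Prefer_With", positive), ("Prefer_Not_With", negative)):
            for target in parse_id_list(row.get(column)):
                target_idx = id_to_index.get(target)
                # Assumption: unknown IDs and self-references are ignored.
                if target_idx is not None and target_idx != idx:
                    out.append((idx, target_idx))
    return positive, negative


# Rows modelled on the README example table, plus some messy input.
rows = [
    {"Student_ID": "S1", "Prefer_With": "S2, S3", "Prefer_Not_With": "S4"},
    {"Student_ID": " S2 "},
    {"Student_ID": "S3", "Prefer_With": "S1", "Prefer_Not_With": "S3"},
    {"Student_ID": "S4", "Prefer_With": float("nan")},
    {"Student_ID": "S5", "Prefer_With": "S9"},
]
positive, negative = parse_preference_rows(rows)
print(positive)
print(negative)
```

Because both columns are optional, a missing key, an empty string, and a NaN cell all take the same "no preferences" path through `parse_id_list`.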
