Average GPA Graph and Sentiment Analysis Section #1120
base: dev
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -0,0 +1,38 @@` |
| | 1 | `import os` |
| | 2 | `import re` |
| | 3 | `import pandas as pd` |
| | 4 | |
| | 5 | `def extract_professor_name(professor_full):` |
| | 6 | `    return re.match(r'^[^()]+', professor_full).group().strip()` |
| | 7 | |
| | 8 | `def extract_course_mnemonic(course_full):` |
| | 9 | `    return course_full.split(' \|')[0].strip()` |
| | 10 | |
| | 11 | `def avg_sentiment_df_creator():` |
| | 12 | `    current_directory = os.path.dirname(os.path.abspath(__file__))` |
| | 13 | `    reviews_data_path = os.path.join(current_directory, "reviews_data", "reviews_data_with_sentiment.csv")` |
| | 14 | `    df = pd.read_csv(reviews_data_path)` |
| | 15 | `    df["instructor_name_only"] = df["instructor"].apply(extract_professor_name)` |
| | 16 | `    df["course_code_only"] = df["course"].apply(extract_course_mnemonic)` |
| | 17 | `    avg_sentiment_df = df.groupby(["instructor_name_only", "course_code_only"])["sentiment_score"].mean().reset_index()` |
| | 18 | `    return avg_sentiment_df` |
| | 19 | |
| | 20 | `def query_average(df, professor, course):` |
| | 21 | `    result = df[(df["instructor_name_only"] == professor) & (df["course_code_only"] == course)]` |
| | 22 | `    if result.empty:` |
| | 23 | `        print("No data found for the given professor and course.")` |
| | 24 | `    else:` |
| | 25 | `        print(f"Average sentiment score for {professor} in {course}: {result['sentiment_score'].values[0]:.2f}")` |
| | 26 | |
| | 27 | `if __name__ == "__main__":` |
| | 28 | `    avg_sentiment_df = avg_sentiment_df_creator()` |
| | 29 | `    print(avg_sentiment_df)` |
| | 30 | |
| | 31 | `    while True:` |
| | 32 | `        professor = input("Enter professor name (or type 'exit' to quit): ")` |
| | 33 | `        if professor.lower() == 'exit':` |
| | 34 | `            break` |
| | 35 | `        course = input("Enter course name (or type 'exit' to quit): ")` |
| | 36 | `        if course.lower() == 'exit':` |
| | 37 | `            break` |
| | 38 | `        query_average(avg_sentiment_df, professor, course)` |
Comment on lines +27 to +38
Contributor
Expose this logic via a proper Django management command
Because this module sits under [...]
Apply this diff to add the command scaffold and optional interactive mode:
-import os
-import re
-import pandas as pd
+import os
+import re
+import pandas as pd
+from django.core.management.base import BaseCommand
@@
-def query_average(df, professor, course):
+def query_average(df, professor, course):
     result = df[(df["instructor_name_only"] == professor) & (df["course_code_only"] == course)]
     if result.empty:
         print("No data found for the given professor and course.")
     else:
         print(f"Average sentiment score for {professor} in {course}: {result['sentiment_score'].values[0]:.2f}")
-
-if __name__ == "__main__":
-    avg_sentiment_df = avg_sentiment_df_creator()
-    print(avg_sentiment_df)
-
-    while True:
-        professor = input("Enter professor name (or type 'exit' to quit): ")
-        if professor.lower() == 'exit':
-            break
-        course = input("Enter course name (or type 'exit' to quit): ")
-        if course.lower() == 'exit':
-            break
-        query_average(avg_sentiment_df, professor, course)
+def interactive_loop(avg_sentiment_df):
+    print(avg_sentiment_df)
+    while True:
+        professor = input("Enter professor name (or type 'exit' to quit): ")
+        if professor.lower() == "exit":
+            break
+        course = input("Enter course name (or type 'exit' to quit): ")
+        if course.lower() == "exit":
+            break
+        query_average(avg_sentiment_df, professor, course)
+
+
+class Command(BaseCommand):
+    help = "Generate average sentiment scores for instructor-course pairs."
+
+    def add_arguments(self, parser):
+        parser.add_argument(
+            "--interactive",
+            action="store_true",
+            help="Print the dataframe and prompt for instructor/course lookups.",
+        )
+
+    def handle(self, *args, **options):
+        avg_sentiment_df = avg_sentiment_df_creator()
+        if options["interactive"]:
+            interactive_loop(avg_sentiment_df)
+        else:
+            self.stdout.write(avg_sentiment_df.to_string(index=False))
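Django hands `add_arguments` a parser built on the standard library's `argparse`, so the behavior of the `--interactive` flag proposed above can be checked in isolation. The following is a minimal sketch using only `argparse`; `build_parser` and `parse_options` are hypothetical helpers for illustration and are not part of the PR or of Django.

```python
import argparse
from typing import Dict, Sequence


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the parser Django passes to Command.add_arguments.
    parser = argparse.ArgumentParser(prog="manage.py <command>")
    parser.add_argument(
        "--interactive",
        action="store_true",  # flag defaults to False and becomes True when present
        help="Print the dataframe and prompt for instructor/course lookups.",
    )
    return parser


def parse_options(argv: Sequence[str]) -> Dict[str, object]:
    # handle() receives parsed arguments as a dict, so mirror that shape here.
    return vars(build_parser().parse_args(list(argv)))


if __name__ == "__main__":
    print(parse_options([]))
    print(parse_options(["--interactive"]))
```

Because `store_true` defaults to `False`, a `handle` written as in the suggestion would take the non-interactive branch unless the flag is passed explicitly.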
Handle missing instructor/course values safely
`re.match(...).group()` and `.split()` assume every CSV row holds a non-empty string. Pandas passes `NaN` (a float) for missing values, which makes `re.match` raise `TypeError` and breaks the command on real data. Guard the inputs or drop null rows before applying the regex/split so the pipeline continues even when data is incomplete.
Apply this diff to make the helpers resilient:
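The suggested diff itself is not reproduced in this excerpt. As a minimal sketch of the guard described above (not the reviewer's actual patch), the two helpers could return `None` for any value that is not a usable string. The sample input formats in the usage lines are assumptions about what the CSV columns look like.

```python
import re
from typing import Optional


def extract_professor_name(professor_full: object) -> Optional[str]:
    """Return the text before the first parenthesis, or None if unusable."""
    # Missing CSV cells reach .apply() as float NaN, so reject non-strings first.
    if not isinstance(professor_full, str):
        return None
    match = re.match(r"^[^()]+", professor_full)
    # re.match returns None for an empty string or one starting with a parenthesis.
    return match.group().strip() if match else None


def extract_course_mnemonic(course_full: object) -> Optional[str]:
    """Return the text before the first ' |' separator, or None if unusable."""
    if not isinstance(course_full, str):
        return None
    mnemonic = course_full.split(" |")[0].strip()
    # Treat an empty mnemonic as missing rather than grouping on "".
    return mnemonic or None


if __name__ == "__main__":
    # Assumed sample formats; the real column contents may differ.
    print(extract_professor_name("Jane Doe (jd1abc)"))
    print(extract_professor_name(float("nan")))
    print(extract_course_mnemonic("CS 1110 | Intro to Programming"))
```

With helpers shaped like this, rows with a missing instructor or course end up with a null grouping key. By default pandas `groupby` leaves such rows out of the result, so they would drop out of the averages instead of raising; whether that is the desired behavior for this dataset is a separate decision.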