fix: enrollment thread and temp storage issue by brandonistfan · Pull Request #1187 · thecourseforum/theCourseForum2

brandonistfan · 2025-11-02T23:11:54Z

GitHub Issues addressed

This PR closes

What I did

Stopped the leaking fetch_enrollment.py script: replaced the global session shared by the threads with thread-local sessions via threading.local(), closed all threads after updating, and added connection cleanup on interrupt. Tests and a run on 9,887 sections showed no thread or memory leaks.

Screenshots

Before
After

Testing

A brief explanation of tests done/written or how reviewers can test your work

Run tests

docker exec tcf_django python manage.py test tcf_website.tests.test_fetch_enrollment

Test on dataset

Before running, click on a few different courses and run the commands to see the stats below and ensure that the threads are being terminated.
The command below is for all sections.

docker exec tcf_django python manage.py fetch_enrollment

Verify no orphans

docker exec tcf_django bash -c "pgrep -a python"

Monitor memory

docker stats tcf_django

Questions/Discussions/Notes

Summary by CodeRabbit

New Features
- Network requests now use isolated per-thread sessions for more reliable background fetching.
Bug Fixes
- Improved interruption handling: background work cleanly stops and releases DB connections to prevent leaks.
Performance
- Reduced concurrent workers (20→5) and ensured section counts are computed before processing for steadier resource use.
Tests
- Updated tests to validate thread-local session behavior and cleanup.

coderabbitai · 2025-11-02T23:12:12Z

Walkthrough

Replaces a global requests Session with per-thread sessions via get_session() backed by thread-local storage; materializes the Section queryset before parallel processing; reduces max workers from 20 to 5; ensures executor shutdown and calls django.db.connections.close_all() on KeyboardInterrupt; updates tests to mock get_session() and verify thread-local sessions; switches a single connection.close() call to connections.close_all() in the API background update.

Changes

Cohort / File(s)	Summary
Thread-Local Session Management `tcf_website/management/commands/fetch_enrollment.py`	Adds `_session_local` thread-local storage and a public `get_session()` to create/return per-thread `requests.Session` instances with mounted `HTTPAdapter` and `Retry` configuration; replaces previous global session usage.
Fetch & Concurrency Adjustments `tcf_website/management/commands/fetch_enrollment.py`	Updates `fetch_section_data()` and caller sites to use `get_session()`; materializes the Sections queryset by calling `count()` then `list(...)` before parallel processing; reduces `ThreadPoolExecutor` `max_workers` from 20 to 5; on `KeyboardInterrupt` shuts down the executor and calls `django.db.connections.close_all()`.
Error Handling & Retry Integration `tcf_website/management/commands/fetch_enrollment.py`	Adapts existing retry/backoff logic to work with per-thread sessions and preserves request/error handling around network calls.
API: DB Connection Finalization `tcf_website/api/views.py`	Replaces import/use of a single `connection.close()` with `django.db.connections.close_all()` inside the background update thread.
Tests: Session Mocking & Thread Safety `tcf_website/tests/test_fetch_enrollment.py`	Changes tests to patch `get_session()` and use the returned mock session for `.get()`, `.raise_for_status()` and `.json()` behavior; adds `test_thread_local_sessions` asserting distinct session instances per thread; updates fetch tests accordingly.

Sequence Diagram(s)

sequenceDiagram
    participant Main as Main Thread
    participant Worker as Worker Thread
    participant get_session as get_session()
    participant TLS as _session_local (thread-local)
    participant Adapter as HTTPAdapter/Retry
    participant Remote as Remote Service

    rect rgb(220,235,255)
    Note over Worker,Main: First call in a thread
    Worker->>get_session: request session
    get_session->>TLS: check for session
    alt none present
        get_session->>Adapter: create HTTPAdapter + Retry
        get_session->>TLS: store session
    end
    TLS-->>get_session: session
    get_session-->>Worker: session
    end

    rect rgb(235,245,220)
    Note over Worker,Remote: Use session for requests
    Worker->>Remote: session.get(url)
    Remote-->>Worker: response
    Worker->>Worker: handle response / retry logic
    end

    rect rgb(245,230,230)
    Note over Main,Worker: Shutdown on KeyboardInterrupt
    Main->>Worker: KeyboardInterrupt
    Main->>Main: executor.shutdown()
    Main->>Main: django.db.connections.close_all()
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Review thread-local session initialization and adapter/retry configuration.
Verify tests correctly simulate per-thread sessions and that test_thread_local_sessions is reliable.
Check queryset materialization for memory implications on large datasets.
Confirm connections.close_all() placement inside background threads doesn't introduce race conditions.

Poem

🐰 A hop, a thread, each burrow keeps a brew,
Sessions snug and separate, not one but a few.
Five busy paws fetch, no jostle or din,
Connections tucked away when the cleanup begins.
The rabbit smiles: quiet fetches win.

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title "fix: enrollment thread and temp storage issue" is directly related to the main changes in the changeset. The PR addresses threading issues by replacing a global session with thread-local sessions and adding connection cleanup on interrupt—both core concerns mentioned in the title. The title is concise and specific enough that a teammate scanning the history would understand this fixes threading and resource cleanup issues in the enrollment functionality. While "temp storage issue" could arguably be more specific, the title adequately summarizes the primary changes without being vague or generic.
Description Check	✅ Passed	The pull request description follows the required template structure and provides substantive content in the critical sections. The "What I did" section clearly describes the changes made (thread-local sessions, connection cleanup, and memory leak prevention) with validation results from testing. The "Testing" section is comprehensive, providing specific Docker commands for unit tests, dataset validation, orphan process verification, and memory monitoring. However, two sections are incomplete: the "GitHub Issues addressed" section contains only the template placeholder without an actual issue reference, and the "Screenshots" section has "Before" and "After" labels but no actual images are included.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch enrollment-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12e0ca1 and 5ad8636.

📒 Files selected for processing (2)

tcf_website/management/commands/fetch_enrollment.py (6 hunks)
tcf_website/tests/test_fetch_enrollment.py (5 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

tcf_website/management/commands/fetch_enrollment.py (1)

tcf_website/models/models.py (1)

Section (942-992)

tcf_website/tests/test_fetch_enrollment.py (2)

tcf_website/management/commands/fetch_enrollment.py (2)

fetch_section_data (76-137)

get_session (49-65)

tcf_website/models/models.py (1)

SectionEnrollment (1055-1099)

🪛 GitHub Actions: Continuous Integration

tcf_website/management/commands/fetch_enrollment.py

[error] 168-168: Pylint: Trailing whitespace. (C0303)

tcf_website/tests/test_fetch_enrollment.py

[error] 116-116: Pylint: Line too long (101/100). (C0301)

[error] 125-125: Pylint: Line too long (105/100). (C0301)

[error] 132-132: Pylint: Line too long (104/100). (C0301)

tcf_website/management/commands/fetch_enrollment.py

tcf_website/tests/test_fetch_enrollment.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tcf_website/management/commands/fetch_enrollment.py (1)
168-168: Remove trailing whitespace on the blank line.

Pylint flags trailing whitespace on this blank line (C0303). Remove the trailing spaces to pass CI.
 print(f"Found {total_sections} sections")
-        
+
 sections = list(sections_queryset)

🧹 Nitpick comments (1)

tcf_website/tests/test_fetch_enrollment.py (1)
107-118: Enhance the test to verify concurrent access.

The test correctly uses the real get_session() implementation (not patched), but it runs threads sequentially (start-join-start-join) rather than concurrently. This verifies that different threads get different sessions, but doesn't test the primary concern: that concurrent threads can safely use thread-local sessions without contention.

Consider this enhanced version that spawns all threads before joining:
 def test_thread_local_sessions(self):
     """Test that each thread gets its own session instance."""
     sessions = []
+    threads = []
+
+    def capture_session():
+        sessions.append(get_session())
+
     for _ in range(5):
-        thread = threading.Thread(target=lambda: sessions.append(get_session()))
-        thread.start()
-        thread.join()
+        thread = threading.Thread(target=capture_session)
+        threads.append(thread)
+
+    for thread in threads:
+        thread.start()
+    for thread in threads:
+        thread.join()
+
     self.assertEqual(
         len(set(id(s) for s in sessions)),
         5,
         "Each thread should have its own session"
     )
This change also eliminates the lambda closure and tests true concurrent access.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ad8636 and f6903db.

📒 Files selected for processing (2)

tcf_website/management/commands/fetch_enrollment.py (6 hunks)
tcf_website/tests/test_fetch_enrollment.py (5 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

tcf_website/management/commands/fetch_enrollment.py (3)

tcf_website/models/models.py (1)

Section (942-992)

tcf_website/api/enrollment.py (2)

process_section (93-102)

update_enrollment_data (68-108)

tcf_website/api/views.py (1)

SectionEnrollmentViewSet (187-221)

tcf_website/tests/test_fetch_enrollment.py (1)

tcf_website/management/commands/fetch_enrollment.py (2)

fetch_section_data (76-137)

get_session (49-65)

🔇 Additional comments (13)

tcf_website/management/commands/fetch_enrollment.py (7)

10-10: LGTM!

The new imports are used appropriately: threading for thread-local storage and connections for cleanup on interrupt.

Also applies to: 17-17

27-27: LGTM!

Reducing MAX_WORKERS from 20 to 5 aligns with the PR objectives to prevent thread and memory leaks.

35-36: LGTM!

Using threading.local() is the correct pattern for thread-local storage to avoid session contention across threads.

49-66: LGTM!

The thread-local session implementation is correct. Each thread gets its own requests.Session with appropriate retry logic and connection pooling. The hasattr check ensures lazy initialization per thread.

93-93: LGTM!

Correctly obtains the thread-local session via get_session(), eliminating the shared global session that caused thread contention.

165-169: LGTM! Materializing the queryset prevents lazy evaluation issues.

Converting the queryset to a list before parallel processing is good practice and prevents each thread from independently evaluating the queryset, which could cause database connection issues.

183-187: LGTM! Proper cleanup on interrupt.

Calling connections.close_all() after shutting down the executor ensures Django database connections are cleaned up when the script is interrupted, preventing connection leaks.

tcf_website/tests/test_fetch_enrollment.py (6)

3-3: LGTM!

The threading import is used appropriately in the new thread-local session test.

8-8: LGTM!

The get_session import is necessary for testing the new thread-local session functionality.

23-28: LGTM!

The test correctly patches get_session and configures the mock session. This properly isolates the HTTP layer during testing.

48-62: LGTM!

The test correctly patches get_session and configures the mock session, consistent with the other tests.

82-88: LGTM!

The test correctly patches get_session and configures the mock session.

95-100: LGTM!

The test correctly patches get_session and configures the mock session with error behavior.

tcf_website/management/commands/fetch_enrollment.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

tcf_website/management/commands/fetch_enrollment.py (1)
49-66: Consider reducing per-thread pool sizes.

Each thread now creates its own session with pool_connections=MAX_POOL_SIZE (100) and pool_maxsize=MAX_POOL_SIZE (100). With MAX_WORKERS=5, this allocates potentially large connection pools per thread. Since each thread typically connects to a single host (sisuva.admin.virginia.edu), consider reducing these to more modest values like:
-        adapter = HTTPAdapter(
-            pool_connections=MAX_POOL_SIZE,
-            pool_maxsize=MAX_POOL_SIZE,
-            max_retries=retry_strategy,
-        )
+        adapter = HTTPAdapter(
+            pool_connections=10,
+            pool_maxsize=20,
+            max_retries=retry_strategy,
+        )
This would reduce memory overhead while maintaining adequate connection reuse.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f6903db and 50d6f71.

📒 Files selected for processing (3)

tcf_website/api/views.py (2 hunks)
tcf_website/management/commands/fetch_enrollment.py (6 hunks)
tcf_website/tests/test_fetch_enrollment.py (5 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tcf_website/tests/test_fetch_enrollment.py

🧰 Additional context used

🧬 Code graph analysis (1)

tcf_website/management/commands/fetch_enrollment.py (1)

tcf_website/models/models.py (1)

Section (942-992)

🔇 Additional comments (4)

tcf_website/api/views.py (1)

5-5: LGTM! Thread-local connection cleanup is appropriate.

The change from connection.close() to connections.close_all() ensures all thread-local database connections are properly cleaned up in the background thread, preventing connection leaks. This aligns well with the thread-local session pattern introduced in fetch_enrollment.py.

Also applies to: 200-200

tcf_website/management/commands/fetch_enrollment.py (3)

93-93: LGTM! Thread-local session usage fixes the concurrency issue.

Switching from a global session to get_session() ensures each worker thread uses its own requests session, eliminating race conditions and thread contention that could occur with a shared global session.

165-169: LGTM! Materializing the queryset is appropriate for thread safety.

Converting the queryset to a list before parallel processing ensures all section objects are fetched upfront, avoiding potential issues with lazy evaluation and database access patterns in the threaded context.

186-186: LGTM! Connection cleanup on interrupt prevents leaks.

Calling connections.close_all() after shutting down the executor ensures all thread-local database connections are properly closed when the user interrupts the process, preventing connection leaks.

fix: enrollment thread and temp storage issue

5ad8636

coderabbitai bot reviewed Nov 2, 2025

View reviewed changes

tcf_website/management/commands/fetch_enrollment.py Show resolved Hide resolved

tcf_website/tests/test_fetch_enrollment.py Outdated Show resolved Hide resolved

tcf_website/tests/test_fetch_enrollment.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Nov 2, 2025

View reviewed changes

tcf_website/management/commands/fetch_enrollment.py Show resolved Hide resolved

lint fixes

50d6f71

brandonistfan force-pushed the enrollment-fix branch from f6903db to 50d6f71 Compare November 2, 2025 23:34

coderabbitai bot reviewed Nov 2, 2025

View reviewed changes

brandonistfan force-pushed the enrollment-fix branch from d891339 to 50d6f71 Compare November 3, 2025 04:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: enrollment thread and temp storage issue#1187

fix: enrollment thread and temp storage issue#1187
brandonistfan wants to merge 2 commits intodevfrom
enrollment-fix

brandonistfan commented Nov 2, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Nov 2, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

brandonistfan commented Nov 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

GitHub Issues addressed

What I did

Screenshots

Testing

Run tests

Test on dataset

Verify no orphans

Monitor memory

Questions/Discussions/Notes

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Nov 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

brandonistfan commented Nov 2, 2025 •

edited

Loading

coderabbitai bot commented Nov 2, 2025 •

edited

Loading