
refactor: use struct and impl method for link check #241

Merged
reddevilmidzy merged 6 commits into main from refact-link-checker
Oct 5, 2025

Conversation

@reddevilmidzy
Owner

@reddevilmidzy reddevilmidzy commented Oct 4, 2025

♟️ What’s this PR about?

  • Renamed the module.
  • Previously a new client was created for every link check; link_checker is now a struct that reuses a single client.

(I had not implemented Default, but clippy kindly told me to, so I added it.)
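
The reuse pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Client` is a hypothetical stand-in for `reqwest::Client`, the `clients_built` counter and `count_valid` helper exist only for the demonstration, and the "check" merely inspects the URL scheme instead of sending a request.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many stand-in clients have been built (illustration only).
static CLIENTS_BUILT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for an HTTP client that is costly to construct.
pub struct Client;

impl Client {
    fn build() -> Client {
        CLIENTS_BUILT.fetch_add(1, Ordering::SeqCst);
        Client
    }

    /// Placeholder for a real request: only inspects the URL scheme.
    fn looks_reachable(&self, url: &str) -> bool {
        url.starts_with("http://") || url.starts_with("https://")
    }
}

/// Owns one client and reuses it for every check.
pub struct LinkChecker {
    client: Client,
}

impl LinkChecker {
    pub fn new() -> Self {
        LinkChecker { client: Client::build() }
    }

    /// Borrows `self`, so repeated calls share the same client.
    pub fn check_link(&self, url: &str) -> bool {
        self.client.looks_reachable(url)
    }
}

/// Clippy's `new_without_default` lint asks for this when `new()` takes no arguments.
impl Default for LinkChecker {
    fn default() -> Self {
        Self::new()
    }
}

/// Number of stand-in clients built so far.
pub fn clients_built() -> usize {
    CLIENTS_BUILT.load(Ordering::SeqCst)
}

/// Builds one checker, checks every URL with it, and counts the valid ones.
pub fn count_valid(urls: &[&str]) -> usize {
    let checker = LinkChecker::default();
    urls.iter().copied().filter(|u| checker.check_link(u)).count()
}
```

Calling `count_valid` on any number of URLs builds exactly one stand-in client, which is the point of moving the client into the struct.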

Summary by CodeRabbit

  • Refactor

    • Consolidated link validation into a single link-checking component with a configured timeout, retries and redirect handling for more consistent, maintainable checks.
  • Bug Fixes

    • Improved handling of redirects and specific GitHub 404 cases to reduce false positives during link checks.
  • Tests

    • Updated tests to exercise the new link-checking component and its behavior.

@github-actions github-actions bot added this to the 6차 MVP 구현 (6th MVP implementation) milestone Oct 4, 2025
@coderabbitai
Contributor

coderabbitai bot commented Oct 4, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Introduces a public LinkChecker struct encapsulating a reqwest::Client (5s timeout, no redirects) and moves the former module-level async check_link into an async method LinkChecker::check_link(&self, url). Module exports and service call sites are updated to use LinkChecker.

Changes

Cohort / File(s) | Summary
LinkChecker implementation (src/link_checker/checker.rs) | Adds public struct LinkChecker with a reqwest::Client; implements pub fn new() -> Result<Self, reqwest::Error>, impl Default (delegates to new), and pub async fn check_link(&self, url: &str) -> LinkCheckResult containing the previous check logic (retries, status/redirect handling, error propagation). Removes the prior free pub async fn check_link(url: &str) -> LinkCheckResult.
Module re-exports (src/link_checker/mod.rs) | Replaces mod link; with mod checker; and changes pub use link::{LinkCheckResult, check_link}; to pub use checker::{LinkCheckResult, LinkChecker};.
Service integration (src/link_checker/service.rs) | Replaces free-function calls with an instantiated LinkChecker (e.g., LinkChecker::default()/new()) and calls link_checker.check_link(&link.url).await; updates imports accordingly. Tests updated to instantiate LinkChecker where applicable.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller as Caller
  participant Service as link_checker::service
  participant LC as LinkChecker
  participant HTTP as reqwest::Client

  Caller->>Service: run(...)
  Service->>LC: LinkChecker::new() / default()
  loop For each link
    Service->>LC: check_link(url)
    activate LC
    LC->>HTTP: HTTP request (no redirects, 5s timeout)
    HTTP-->>LC: response / error
    LC-->>Service: LinkCheckResult
    deactivate LC
  end
  Service-->>Caller: Summary/Results
  note over LC,HTTP: Retries, redirect handling, and status classification are internal to LinkChecker

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name | Status | Explanation | Resolution
Description Check | ⚠️ Warning | The description provides a clear “What’s this PR about?” section but does not include the required “🔗 Related Issues / PRs” section from the repository’s template, making the template incomplete. | Please add a “## 🔗 Related Issues / PRs” heading and list any related issue or pull request references or explicitly state “None” if there are no related items.
Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name | Status | Explanation
Title Check | ✅ Passed | The title succinctly captures the primary refactoring by indicating the switch to a struct and implementation method for link checking, directly reflecting the core changes in the diff.
✨ Finishing touches
  • 📝 Docstrings were successfully generated. (🔄 Check again to generate docstrings again)

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38c3433 and 74f3455.

📒 Files selected for processing (1)
  • src/link_checker/checker.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: reddevilmidzy
PR: reddevilmidzy/queensac#226
File: src/main.rs:21-32
Timestamp: 2025-09-22T15:50:19.531Z
Learning: User reddevilmidzy prefers to communicate in Korean and likes to keep PR scope focused and well-defined.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/link_checker/checker.rs (1)

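10-17: Suggest adding a safe initialization path to LinkChecker::new

Because of the current build().unwrap(), a failure to create the client (for example, due to a TLS configuration problem) makes the whole run terminate with a panic. Since the struct was introduced so the client can be reused, it is more robust to provide a fallible constructor that lets the caller handle the error. Keeping new() and having it call try_new() internally also minimizes changes to existing call sites.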

-    pub fn new() -> Self {
-        let client = reqwest::Client::builder()
-            .timeout(std::time::Duration::from_secs(5))
-            .redirect(reqwest::redirect::Policy::none())
-            .build()
-            .unwrap();
-
-        LinkChecker { client }
-    }
+    pub fn try_new() -> Result<Self, reqwest::Error> {
+        let client = reqwest::Client::builder()
+            .timeout(std::time::Duration::from_secs(5))
+            .redirect(reqwest::redirect::Policy::none())
+            .build()?;
+
+        Ok(LinkChecker { client })
+    }
+
+    pub fn new() -> Self {
+        Self::try_new().expect("failed to build reqwest::Client")
+    }
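
What a caller gains from a fallible constructor can be sketched as follows. This is an illustration under stated assumptions, not this PR's code: `Client`, `BuildError`, `build_client`, `try_with_timeout`, and `checker_or_message` are hypothetical stand-ins (a zero timeout is arbitrarily treated as an invalid configuration so that construction can fail without any network or TLS setup).

```rust
use std::fmt;

/// Hypothetical error standing in for a client-construction failure.
#[derive(Debug, PartialEq)]
pub struct BuildError(String);

impl fmt::Display for BuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to build client: {}", self.0)
    }
}

impl std::error::Error for BuildError {}

/// Hypothetical stand-in for the HTTP client.
pub struct Client {
    timeout_secs: u64,
}

/// Stand-in builder: a zero timeout is treated as an invalid configuration.
fn build_client(timeout_secs: u64) -> Result<Client, BuildError> {
    if timeout_secs == 0 {
        return Err(BuildError("timeout must be greater than zero".to_string()));
    }
    Ok(Client { timeout_secs })
}

pub struct LinkChecker {
    client: Client,
}

impl LinkChecker {
    /// Fallible constructor: the underlying error is propagated with `?`
    /// instead of being unwrapped, so the caller decides how to react.
    pub fn try_with_timeout(timeout_secs: u64) -> Result<Self, BuildError> {
        let client = build_client(timeout_secs)?;
        Ok(LinkChecker { client })
    }

    pub fn timeout_secs(&self) -> u64 {
        self.client.timeout_secs
    }
}

/// Example caller: turns the construction error into a message instead of panicking.
pub fn checker_or_message(timeout_secs: u64) -> Result<LinkChecker, String> {
    LinkChecker::try_with_timeout(timeout_secs)
        .map_err(|e| format!("link checker unavailable: {e}"))
}
```

Because the original error is carried through `map_err`, the resulting message still says why construction failed.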
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 6d8d741 and c04ec47.

📒 Files selected for processing (3)
  • src/link_checker/checker.rs (3 hunks)
  • src/link_checker/mod.rs (1 hunks)
  • src/link_checker/service.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/link_checker/service.rs (1)
src/link_checker/checker.rs (1)
  • new (9-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

@coveralls-official

coveralls-official bot commented Oct 4, 2025

Pull Request Test Coverage Report for Build 18257090077

Details

  • 44 of 50 (88.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 79.67%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
src/link_checker/checker.rs | 44 | 50 | 88.0%

Totals | Coverage Status
Change from base Build 18244616633 | 0.2%
Covered Lines | 870
Relevant Lines | 1092

💛 - Coveralls

@reddevilmidzy reddevilmidzy requested a review from Copilot October 5, 2025 09:03
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the link checking functionality by introducing a struct-based approach for better client reuse and improved architecture.

  • Renamed module from link to checker and introduced LinkChecker struct
  • Replaced function-based link checking with a struct that reuses HTTP client instances
  • Added Default implementation for LinkChecker as suggested by Clippy

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
src/link_checker/mod.rs | Updated module and public API exports to use new LinkChecker struct
src/link_checker/service.rs | Modified to use instance-based link checking with LinkChecker::new()
src/link_checker/checker.rs | Complete refactor introducing LinkChecker struct with client reuse and updated tests

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/link_checker/checker.rs (2)

9-20: Consider capturing the underlying error.

The error message "failed to create Client" is generic and doesn't include the actual error from reqwest::Client::builder().build(), which would help with debugging if client creation fails.

Apply this diff to capture and propagate the underlying error:

 pub fn new() -> Result<Self, String> {
     let client = reqwest::Client::builder()
         .timeout(std::time::Duration::from_secs(5))
         .redirect(reqwest::redirect::Policy::none())
-        .build();
-
-    if let Ok(client) = client {
-        Ok(LinkChecker { client })
-    } else {
-        Err("failed to create Client".to_string())
-    }
+        .build()
+        .map_err(|e| format!("failed to create Client: {e}"))?;
+    
+    Ok(LinkChecker { client })
 }

59-63: Consider whether Default should panic.

The Default trait implementation calls expect(), which will panic if client creation fails. While reqwest::Client::builder().build() rarely fails in practice, panicking in a Default implementation can be surprising for callers who expect Default to be infallible.

Consider one of these alternatives:

  1. Document that Default::default() may panic.
  2. Use reqwest::Client::new() without custom configuration for the Default impl, and reserve the configured client for explicit new() calls. Note that Client::new() is not truly infallible either: it panics if the TLS backend cannot be initialized.
  3. Remove the Default impl and require explicit construction.

Since your use case consistently needs the timeout and redirect policy, and clippy suggested adding Default, the current approach is acceptable if you document the panic possibility or are confident client creation won't fail in your environment.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between c04ec47 and d9ad8f1.

📒 Files selected for processing (2)
  • src/link_checker/checker.rs (3 hunks)
  • src/link_checker/service.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/link_checker/service.rs (1)
src/link_checker/checker.rs (2)
  • default (60-62)
  • new (9-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (4)
src/link_checker/service.rs (1)

103-108: LGTM! Efficient refactor to reuse HTTP client.

Creating the LinkChecker instance once before the loop ensures the underlying reqwest::Client is reused across all link checks, which is more efficient than creating a new client for each check. The method call syntax is correct.

src/link_checker/checker.rs (3)

4-6: LGTM! Clean struct definition.

The struct encapsulates the reqwest::Client to enable efficient reuse across multiple link checks.


22-56: LGTM! Retry logic is sound.

The retry mechanism with 3 attempts, 1-second sleep between retries, and proper error handling ensures robustness when checking links. The method correctly uses &self to share the client across calls.


131-188: LGTM! Tests updated correctly.

All tests now instantiate LinkChecker::default() and call the method appropriately. The test coverage remains intact.

Copilot AI review requested due to automatic review settings October 5, 2025 09:30
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/link_checker/checker.rs (2)

22-56: Consider distinguishing retryable from non-retryable errors.

The retry logic currently retries all request errors uniformly. However, certain failures like DNS resolution errors, certificate validation failures, or 4xx client errors (except 429 rate limits) are unlikely to succeed on retry and waste time.

You could check error types or status codes to skip retries for non-transient failures:

pub async fn check_link(&self, url: &str) -> LinkCheckResult {
    let mut attempts = 3;
    while attempts > 0 {
        match self.client.get(url).send().await {
            Ok(res) => {
                // ... existing logic ...
            }
            Err(e) => {
                // Retry only errors that look transient; give up once the last attempt fails
                if e.is_connect() || e.is_timeout() || e.is_status() {
                    if attempts == 1 {
                        return LinkCheckResult::Invalid(format!("Request error: {e}"));
                    }
                } else {
                    // Any other error kind (e.g. builder or redirect errors) - fail immediately
                    return LinkCheckResult::Invalid(format!("Request error: {e}"));
                }
            }
        }
        attempts -= 1;
        tokio::time::sleep(tokio::time::Duration::from_secs(1)).await;
    }
    LinkCheckResult::Invalid("Max retries exceeded".to_string())
}
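
The retry policy being discussed can be isolated from the HTTP library in a small synchronous sketch. Everything here is an assumption made for illustration: the `CheckError` enum, the `check_with_retry` helper, and the closure-based `attempt` are hypothetical, and a real checker would map its HTTP library's error type onto the two classes and sleep between attempts.

```rust
/// Hypothetical error classes; a real checker would derive these
/// from the HTTP library's error type.
#[derive(Debug, Clone, PartialEq)]
pub enum CheckError {
    /// Worth retrying (for example, a timeout).
    Transient(String),
    /// Retrying is unlikely to help.
    Permanent(String),
}

/// Calls `attempt` up to `max_attempts` times, retrying only transient errors.
/// Returns the outcome together with the number of attempts actually made.
pub fn check_with_retry<F>(max_attempts: u32, mut attempt: F) -> (Result<(), String>, u32)
where
    F: FnMut() -> Result<(), CheckError>,
{
    let mut made = 0;
    while made < max_attempts {
        made += 1;
        match attempt() {
            Ok(()) => return (Ok(()), made),
            // Permanent failures end the loop on the first occurrence.
            Err(CheckError::Permanent(msg)) => return (Err(msg), made),
            // A transient failure on the last attempt is reported as-is.
            Err(CheckError::Transient(msg)) if made == max_attempts => {
                return (Err(msg), made);
            }
            // Otherwise try again; a real checker would sleep here first.
            Err(CheckError::Transient(_)) => {}
        }
    }
    (Err("no attempts were made".to_string()), made)
}
```

Returning the attempt count alongside the result makes the policy easy to test: a permanent error should report one attempt, while a persistent transient error should use all of them.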

30-39: Redirect without Location header returns Valid.

Line 39 returns LinkCheckResult::Valid when a redirection status is received but no Location header is present. A redirect response without a location header is unusual and might indicate a misconfigured server.

Consider treating this as an error or at least logging it:

             if let Some(redirect_url) = res.headers().get("location")
                 && let Ok(redirect_str) = redirect_url.to_str()
             {
                 if is_trivial_redirect(url, redirect_str) {
                     return LinkCheckResult::Valid;
                 }
                 return LinkCheckResult::Redirect(redirect_str.to_string());
             }
-            return LinkCheckResult::Valid;
+            // Redirect without Location header is unusual
+            return LinkCheckResult::Invalid(
+                format!("Redirect status {} without Location header", status)
+            );
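
The status and Location handling suggested here can be expressed as a pure function, which makes the missing-header case easy to test. This is a sketch under assumptions: the `LinkCheckResult` variants mirror the names used in this review but the repository's enum may differ, `classify` is a hypothetical helper, and this `is_trivial_redirect` is a stand-in that only ignores a trailing-slash difference (the real function's rules are not shown in this thread).

```rust
/// Mirrors the result variants named in this review; the repository's enum may differ.
#[derive(Debug, PartialEq)]
pub enum LinkCheckResult {
    Valid,
    Redirect(String),
    Invalid(String),
}

/// Hypothetical stand-in: treats a trailing-slash difference as trivial.
fn is_trivial_redirect(url: &str, target: &str) -> bool {
    url.trim_end_matches('/') == target.trim_end_matches('/')
}

/// Classifies a response from its status code and optional Location header.
pub fn classify(url: &str, status: u16, location: Option<&str>) -> LinkCheckResult {
    match (status, location) {
        (200..=299, _) => LinkCheckResult::Valid,
        (300..=399, Some(target)) if is_trivial_redirect(url, target) => LinkCheckResult::Valid,
        (300..=399, Some(target)) => LinkCheckResult::Redirect(target.to_string()),
        // A redirect status without a Location header is reported instead of accepted.
        (300..=399, None) => LinkCheckResult::Invalid(format!(
            "Redirect status {status} without Location header"
        )),
        _ => LinkCheckResult::Invalid(format!("HTTP status code: {status}")),
    }
}
```

Keeping the classification free of I/O means each branch can be covered by a unit test without a mock server.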
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d9ad8f1 and 38c3433.

📒 Files selected for processing (1)
  • src/link_checker/checker.rs (3 hunks)
🧰 Additional context used
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build

Comment on lines +40 to +41
} else if status.as_u16() == 404 && url.contains("github.com") {
    return handle_github_404(url);
Contributor

⚠️ Potential issue | 🟠 Major

URL host check is fragile.

Using url.contains("github.com") can match unintended URLs like https://notgithub.com or paths containing the substring. Consider parsing the URL and checking the host properly.

-} else if status.as_u16() == 404 && url.contains("github.com") {
+} else if status.as_u16() == 404 && is_github_url(url) {
     return handle_github_404(url);

Add a helper function:

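use url::Url;

/// Returns true only when the URL's host is github.com or one of its subdomains.
fn is_github_url(url: &str) -> bool {
    Url::parse(url)
        .ok()
        // Compare inside the closure: host_str() borrows from the parsed Url,
        // so the &str cannot be returned out of it.
        .and_then(|u| u.host_str().map(|h| h == "github.com" || h.ends_with(".github.com")))
        .unwrap_or(false)
}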
🤖 Prompt for AI Agents
In src/link_checker/checker.rs around lines 40-41, the check using
url.contains("github.com") is fragile and can match unintended strings; replace
it by parsing the URL and checking the host properly (use
Url::parse(url).ok().and_then(|u| u.host_str().map(|h| h == "github.com" ||
h.ends_with(".github.com"))).unwrap_or(false), comparing inside the closure
because host_str() borrows from the parsed Url). Add a small helper fn
is_github_url(url: &str) -> bool that implements this logic, use that helper in
place of the contains() call, and ensure the url crate is in scope and parsing
failures are handled by returning false.
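
The difference between a substring match and a host comparison can be shown with the standard library alone. In this sketch `is_github_host` is a hypothetical helper that assumes the host has already been extracted by a URL parser; the lowercase normalization is an extra precaution added here, not part of the suggestion above.

```rust
/// Returns true when `host` is github.com or one of its subdomains.
/// Assumes `host` was already extracted from the URL by a proper parser.
pub fn is_github_host(host: &str) -> bool {
    // Host names are case-insensitive, so normalize before comparing.
    let host = host.to_ascii_lowercase();
    // The leading dot keeps look-alike hosts such as "notgithub.com" from matching.
    host == "github.com" || host.ends_with(".github.com")
}
```

For comparison, `"https://notgithub.com/a".contains("github.com")` evaluates to true, while `is_github_host("notgithub.com")` and `is_github_host("github.com.evil.example")` are both false.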

@reddevilmidzy reddevilmidzy merged commit bf1d7d6 into main Oct 5, 2025
6 checks passed
@reddevilmidzy reddevilmidzy deleted the refact-link-checker branch October 5, 2025 09:52
coderabbitai bot added a commit that referenced this pull request Oct 5, 2025
Docstrings generation was requested by @reddevilmidzy.

* #241 (comment)

The following files were modified:

* `src/link_checker/checker.rs`
* `src/link_checker/service.rs`
@coderabbitai
Contributor

coderabbitai bot commented Oct 5, 2025

Note

Generated docstrings for this pull request at #243

reddevilmidzy added a commit that referenced this pull request Oct 5, 2025
Docstrings generation was requested by @reddevilmidzy.

* #241 (comment)

The following files were modified:

* `src/link_checker/checker.rs`
* `src/link_checker/service.rs`

Co-Authored-by: Redddy <areyou@redddy.com>