
v1.2.0 #705

Merged
Mzack9999 merged 37 commits into main from dev on Nov 25, 2025

Conversation

@dogancanbakir (Member) commented Nov 25, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added GreyNoise GNQL query support with pagination and rate-limiting.
    • Enhanced Hunter filtering with web status, port filtering, and time-range options.
    • Updated FOFA queries to include full-result option for expanded data retrieval.
  • Bug Fixes

    • Improved error handling and response body cleanup across query sources.
  • Documentation

    • Updated Censys configuration to use API Token and Organization ID.
    • Removed legacy ZoomEye host configuration guidance.
  • Chores

    • Bumped Go version to 1.24.0.


TonyD0g and others added 30 commits July 8, 2025 15:22
According to the latest documentation on the official FOFA website, update the API request construction, and convert some FOFA const variables into regular variables to allow customizable configuration when used as an SDK.
Use the new Censys SDK for better stability across API updates. Since Censys now returns multiple endpoints per search, we now iterate over every endpoint and create a new result. This also changes how IPs are saved in the raw response; the earlier ambiguity is gone because each endpoint has exactly one IP.
Adapt censys to new platform search
Fix(Onyphe): Handle quoted queries correctly
@dogancanbakir dogancanbakir self-assigned this Nov 25, 2025
@coderabbitai bot (Contributor) commented Nov 25, 2025

Walkthrough

This PR integrates GreyNoise GNQL as a new query source with full agent implementation and CLI support, migrates Censys from manual HTTP calls to the censys-sdk-go library with updated credentials (API token and organization ID), standardizes HTTP response cleanup error handling across multiple agents, updates documentation and environment variables, and extends configuration to support GreyNoise authentication.

Changes

  • GreyNoise Agent Implementation (sources/agent/greynoise/greynoise.go, sources/agent/greynoise/request.go, sources/agent/greynoise/response.go): new GreyNoise GNQL agent with query streaming, pagination, rate-limiting, error mapping, and data extraction; request and response types for API serialization and deserialization.
  • Censys SDK Migration (sources/agent/censys/censys.go, sources/agent/censys/response.go): replaced manual HTTP/REST with the censys-sdk-go client; updated authentication to token + organization ID; adapted pagination and result extraction to typed SDK structures; deleted legacy response types.
  • HTTP Response Cleanup Standardization (sources/agent/binaryedge/binaryedge.go, sources/agent/driftnet/driftnet.go, sources/agent/google/google.go, sources/agent/odin/odin.go, sources/agent/onyphe/onyphe.go, sources/agent/zoomeye/zoomeye.go): unified defer Close() patterns to explicitly ignore errors via anonymous functions.
  • CLI and Options Integration (runner/options.go, uncover.go): added the GreyNoise CLI flag (--greynoise/-gn), integrated it into query aggregation and engine selection logic, and registered the GreyNoise agent in the factory.
  • Configuration and Keys (sources/keys.go, sources/provider.go, sources/session.go): updated the Keys struct to replace CensysSecret with CensysOrgId and add GreyNoiseKey; extended the Provider struct with a GreyNoise field; added a GreyNoise rate-limit entry.
  • Agent-Specific Enhancements (sources/agent/fofa/fofa.go, sources/agent/hunter/hunter.go, sources/agent/hunter/request.go): FOFA gained a Full bool parameter and improved error handling with a raw-response fallback; Hunter gained new filter parameters (is_web, start_time, end_time, port_filter) and a raw-response fallback on JSON decode errors.
  • Test and Output Updates (integration-tests/integration-test.go, integration-tests/source-test.go, runner/output_writer.go, runner/runner.go): added a GreyNoise test case; standardized cleanup in test helpers; updated output logging to suppress non-verbose printing; made error handling explicit in the output writer.
  • Documentation and Dependencies (README.md, go.mod): updated Censys configuration references and environment variables; removed ZoomEye host guidance; added the censys-sdk-go dependency; bumped the Go version to 1.24.0.

Sequence Diagram

sequenceDiagram
    participant CLI as User/CLI
    participant Runner as Runner
    participant Session as Session
    participant Agent as GreyNoise Agent
    participant API as GreyNoise API
    
    CLI->>Runner: --greynoise "query" --greynoise "query2"
    activate Runner
    Runner->>Session: Create with keys
    activate Session
    
    loop For each GreyNoise query
        Runner->>Agent: Query(session, query)
        activate Agent
        Agent->>API: POST /v3/gnql (auth header, query params)
        activate API
        API-->>Agent: JSON response (GNQLItem[] with pagination)
        deactivate API
        
        loop Process items and pagination
            Agent->>Agent: Extract IPs, hosts, ports from GNQLItem
            Agent->>Agent: Emit Result per combination
            Note over Agent: Check rate limits, handle errors
            alt Rate limited (429)
                Agent-->>Runner: ErrRateLimited
            else Unauthorized (401)
                Agent-->>Runner: ErrUnauthorized
            else Success
                Agent-->>Runner: chan Result (IP, Host, Port, Raw)
            end
        end
        deactivate Agent
    end
    
    Runner->>Session: Write results
    deactivate Session
    Runner-->>CLI: Output
    deactivate Runner

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • High-complexity areas:
    • GreyNoise agent implementation: new feature with dense logic including pagination, error mapping, data extraction helpers, and concurrent streaming
    • Censys migration: substantial refactoring from manual HTTP to SDK-based approach with updated auth and response handling
    • Integration points: new GreyNoise wiring across CLI, configuration, session, and provider layers
  • Attention needed:
    • Verify GreyNoise API error handling and rate-limit retry logic correctness
    • Validate Censys SDK migration preserves query semantics and pagination behavior
    • Confirm FOFA and Hunter error fallback paths (raw response emission) behave as intended
    • Check test coverage for new GreyNoise test cases and cleanup patterns

Suggested reviewers

  • dwisiswant0
  • Mzack9999

Poem

🐰 A rabbit's rhyme for sources new

With GreyNoise queries streaming through,
And Censys SDK in place,
Error handling keeps its grace—
Response bodies close just right,
Our uncover shines ever bright! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Title check (⚠️ Warning): The title 'v1.2.0' is a version number that does not describe the actual changes in the pull request, which include the Censys API migration, the GreyNoise integration, and various refactoring updates. Resolution: use a descriptive title summarizing the main changes, such as "Add GreyNoise integration and migrate Censys to SDK", instead of just the version number.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 25.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (1 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.

@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
sources/agent/hunter/hunter.go (1)

113-121: StatusCode and PortFilter parameters are set but never used in the query URL.

The hunterRequest populates both StatusCode (line 53) and PortFilter (line 54), but the fmt.Sprintf call on line 115 only passes 7 parameters: ApiKey, base64Query, Page, PageSize, IsWeb, StartTime, and EndTime. The URL template (line 17) has no placeholders for these filters.

Per the Hunter QianXin API documentation, status_code should be added as a URL parameter (e.g., &status_code=%s). The port_filter should be embedded in the base64-encoded search query using DSL syntax (e.g., ip.port="6379"), not appended to the URL.

Fix: Update the URL template to include &status_code=%s and modify fmt.Sprintf to pass hunterRequest.StatusCode. For PortFilter, consider whether it should be part of the search query encoding or handled separately.
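A hedged sketch of that fix, with the URL template and helper name assumed rather than taken from hunter.go: the status code travels as a URL parameter, while the port filter is folded into the search expression before base64 encoding.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// Hypothetical template extending the real one with &status_code=%s.
const urlTemplate = "https://hunter.qianxin.com/openApi/search?api-key=%s&search=%s&page=%d&page_size=%d&is_web=%d&start_time=%s&end_time=%s&status_code=%s"

// buildQueryURL is illustrative, not the actual agent code: port_filter is
// embedded in the search DSL, status_code goes into the URL template.
func buildQueryURL(apiKey, query, portFilter, statusCode string, page, pageSize, isWeb int, start, end string) string {
	if portFilter != "" {
		// Port filtering belongs in the search expression, not the URL.
		query = fmt.Sprintf(`(%s) && ip.port="%s"`, query, portFilter)
	}
	b64 := base64.URLEncoding.EncodeToString([]byte(query))
	return fmt.Sprintf(urlTemplate, apiKey, b64, page, pageSize, isWeb, start, end, statusCode)
}

func main() {
	fmt.Println(buildQueryURL("key", `domain="example.com"`, "6379", "200", 1, 100, 1, "2025-01-01", "2025-11-25"))
}
```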

sources/agent/onyphe/onyphe.go (1)

119-126: Close and drain response body on non-200 status to avoid connection leaks

In this branch, the function returns an error without consuming or closing resp.Body. With the usual http.Client semantics, that can leak connections and prevent the transport from reusing them under repeated non-200 responses.

You can defensively drain and close the body before returning:

-	resp, err := session.Do(request, agent.Name())
-	if err != nil {
-		return nil, err
-	}
-
-	if resp.StatusCode != http.StatusOK {
-		return nil, fmt.Errorf("unexpected status code: %d", resp.StatusCode)
-	}
-
-	return resp, nil
+	resp, err := session.Do(request, agent.Name())
+	if err != nil {
+		return nil, err
+	}
+
+	if resp.StatusCode != http.StatusOK {
+		// Drain and close body to allow connection reuse on error responses
+		_, _ = io.Copy(io.Discard, resp.Body)
+		_ = resp.Body.Close()
+		return nil, fmt.Errorf("unexpected status code: %d", resp.StatusCode)
+	}
+
+	return resp, nil
sources/agent/google/google.go (1)

72-100: Potential resource leak: resp.Body is not closed.

The queryURL method returns an *http.Response, but resp.Body is never closed in the query method. While gzipReader.Close() is properly deferred (lines 87-89), this only closes the gzip reader wrapper—not the underlying response body.

Apply this diff to ensure the response body is closed:

 func (agent *Agent) query(session *sources.Session, googleRequest *Request, results chan sources.Result) []string {

 	resp, err := agent.queryURL(session, googleRequest)
 	if err != nil {
 		results <- sources.Result{Source: agent.Name(), Error: err}
 		return nil
 	}
+	defer func() {
+		_ = resp.Body.Close()
+	}()

 	var apiResponse Response
 	if resp.Header.Get("Content-Encoding") == "gzip" {
sources/agent/driftnet/driftnet.go (1)

100-123: Deferred closure now leaks earlier response bodies in loops

Wrapping resp.Body.Close() in a closure inside the loops introduces a subtle bug:

resp, queryError := agent.queryURL(...)
defer func() {
    _ = resp.Body.Close()
}()

Because resp is a loop-scoped variable reused on each iteration (:= only defines it once), all deferred closures capture the same resp and will run with its final value. As a result:

  • Only the last resp.Body is closed (multiple times).
  • All previous response bodies are never closed, leading to descriptor/connection leaks under load.

Previously, defer resp.Body.Close() evaluated the receiver at defer time, so each response was closed correctly.

Fix by capturing the body value as a parameter to the deferred function in both locations:

-           defer func() {
-               _ = resp.Body.Close()
-           }()
+           defer func(body io.ReadCloser) {
+               _ = body.Close()
+           }(resp.Body)

and similarly for the CIDR path:

-       defer func() {
-           _ = resp.Body.Close()
-       }()
+       defer func(body io.ReadCloser) {
+           _ = body.Close()
+       }(resp.Body)

This keeps the “ignore close error” behavior but ensures each response body is closed exactly once.

Also applies to: 217-237
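The shared-capture behavior described above can be reproduced in isolation. This self-contained sketch (hypothetical names, plain ints standing in for response bodies) contrasts closure capture with passing the value as a deferred-call argument:

```go
package main

import "fmt"

// capturedValues shows what deferred closures observe when they share a
// variable declared outside the loop versus when the value is passed as
// an argument, which is snapshotted at defer time.
func capturedValues() (shared, snapshot []int) {
	var v int
	func() {
		for _, n := range []int{1, 2, 3} {
			v = n
			defer func() { shared = append(shared, v) }() // every closure sees final v
		}
	}()
	func() {
		for _, n := range []int{1, 2, 3} {
			v = n
			defer func(x int) { snapshot = append(snapshot, x) }(v) // value fixed at defer time
		}
	}()
	return shared, snapshot
}

func main() {
	s, p := capturedValues()
	fmt.Println(s) // shared capture: all closures observe the last value
	fmt.Println(p) // per-call snapshot, run in LIFO order
}
```

The same mechanics explain why `defer func(body io.ReadCloser) { _ = body.Close() }(resp.Body)` closes each body while the closure form does not.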

README.md (1)

179-196: Env var examples align with Censys changes but omit GreyNoise

Adding CENSYS_API_TOKEN and CENSYS_ORGANIZATION_ID here matches the new env handling in sources/provider.go. However, since the code reads GREYNOISE_API_KEY, it would be good to add it to this block as well for discoverability, e.g.:

 export DRIFTNET_API_KEY=xxx
+export GREYNOISE_API_KEY=xxx
🧹 Nitpick comments (14)
runner/output_writer.go (1)

72-79: Clarify the intentional ignoring of Close errors (optional).

Using _ = fileWriter.Close() keeps the previous behavior (ignoring errors) but makes it more explicit. If this is intentionally best-effort cleanup, consider adding a short comment or logging failures so future readers and linters know the errors are deliberately ignored, not accidentally forgotten.

sources/agent/hunter/hunter.go (1)

20-27: Package-level mutable state may cause issues with concurrent usage.

These variables are shared across all agent instances. If the agent is used concurrently with different configurations, this could lead to race conditions. Consider passing these as parameters or part of the session/query configuration if concurrent usage with different settings is expected.

sources/agent/fofa/fofa.go (2)

18-19: Full flag wiring is fine but can be made more flexible and consistent

The new Full plumbing (global Full bool, &full=%t in URL, and FofaRequest.Full bool passed into fmt.Sprintf) looks coherent and should produce URLs like ...&full=true / ...&full=false.

A couple of small improvements to consider:

  • In queryURL, you currently hard-code Fields:

    fofaURL := fmt.Sprintf(URL, session.Keys.FofaKey, base64Query, Fields, fofaRequest.Page, fofaRequest.Size, fofaRequest.Full)

    but FofaRequest already has a Fields field. Using the struct field improves testability and future flexibility:

    fofaURL := fmt.Sprintf(URL, session.Keys.FofaKey, base64Query, fofaRequest.Fields, fofaRequest.Page, fofaRequest.Size, fofaRequest.Full)

  • Exposing Size, Fields, and Full as package-level vars is convenient for CLI overrides, but it also makes them mutable globals. As long as they are only set once at startup before any queries run, you are fine; if you ever need per-query control or concurrent tweaking, it would be safer to move them into sources.Query or Session instead of relying on globals.

These aren’t blockers, but tightening them up now will make the FOFA agent easier to evolve.  



Also applies to: 21-29, 50-56, 73-76, 121-127

61-64: Re-verify pagination break condition vs meaning of fofaResponse.Size

The new loop termination condition:

```go
numberOfResults += len(fofaResponse.Results)
page++
size := fofaResponse.Size
if size == 0 || numberOfResults >= query.Limit || len(fofaResponse.Results) == 0 || numberOfResults > size {
  break
}
```

mixes a cumulative counter (numberOfResults) with fofaResponse.Size. If Size represents the total number of matches, numberOfResults > size is reasonable. But if Size is per‑page size (common in APIs), numberOfResults > size will become true after the second page and prematurely stop pagination.

This behavior predates the Full change, but given you’re touching pagination logic (numberOfResults >= query.Limit), it’s a good moment to double‑check FOFA’s response schema and ensure:

  • fofaResponse.Size is indeed “total available results,” or
  • If it’s page size, you should compare against a “total” field instead (if available), or drop the numberOfResults > size clause.

Please confirm against FOFA’s API docs or a real response and adjust as needed.
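If Limit is meant as a strict cap, one way to make the loop robust regardless of what Size means is to count against the limit per result rather than per page. A hedged sketch with a stubbed fetcher (none of these names are from fofa.go, and total is assumed to be the API's total-match count):

```go
package main

import "fmt"

// paginate stops exactly at limit, even mid-page, and uses total only as a
// secondary stop condition.
func paginate(fetch func(page int) (results []string, total int), limit int) []string {
	var out []string
	for page := 1; ; page++ {
		results, total := fetch(page)
		if len(results) == 0 {
			break
		}
		for _, r := range results {
			if len(out) >= limit {
				return out
			}
			out = append(out, r)
		}
		if len(out) >= total {
			break
		}
	}
	return out
}

func main() {
	all := []string{"a", "b", "c", "d", "e"}
	// Stub fetcher returning two results per page plus the total count.
	fetch := func(page int) ([]string, int) {
		start := (page - 1) * 2
		if start >= len(all) {
			return nil, len(all)
		}
		end := start + 2
		if end > len(all) {
			end = len(all)
		}
		return all[start:end], len(all)
	}
	fmt.Println(paginate(fetch, 3)) // strict cap at 3 results
}
```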

sources/agent/onyphe/onyphe.go (2)

31-62: API key validation looks good; consider clarifying limit semantics

The new empty-key check and the overall pagination loop look correct and consistent with other agents.

One behavioral quirk that remains: maxResults is enforced only at page granularity. If query.Limit is not a multiple of the page size (10), the final page can push you over the requested limit (e.g. 90 + 20 ⇒ 110 results) before the loop terminates.

If Limit is intended as a strict cap rather than “best effort”, consider enforcing it while emitting results (or adjusting page size / stopping mid-page). Otherwise, a short comment noting that the limit is best-effort would help future readers.


10-10: Manual %22" replacement after url.QueryEscape is brittle; consider reverting or documenting

escapedQuery := url.QueryEscape(onypheRequest.Query) is the right primitive for building the q parameter. Replacing all %22 with literal " afterward:

escapedQuery = strings.ReplaceAll(escapedQuery, "%22", "\"")

introduces a few risks:

  • Queries that intentionally contain the literal substring %22 (already-encoded input) will be silently rewritten to include ", changing their meaning.
  • http.NewRequest / URL parsing may re-encode the raw " back to %22 anyway, so this line might be a no-op in practice but adds confusion.
  • Without a comment tying this to a specific Onyphe API quirk, future maintainers will have a hard time understanding or safely modifying this behavior.

Unless you’ve confirmed that Onyphe requires literal double quotes in the q parameter and fails when they’re sent as %22, I’d suggest simplifying back to the standard escaping (and dropping the strings import):

-	escapedQuery := url.QueryEscape(onypheRequest.Query)
-	escapedQuery = strings.ReplaceAll(escapedQuery, "%22", "\"")
-	urlWithQuery := fmt.Sprintf(URLTemplate, escapedQuery, onypheRequest.Page)
+	escapedQuery := url.QueryEscape(onypheRequest.Query)
+	urlWithQuery := fmt.Sprintf(URLTemplate, escapedQuery, onypheRequest.Page)

If this replacement is required to work around a real Onyphe quirk, please add a brief code comment and, ideally, a test that demonstrates the failure without it.

Also applies to: 107-110

sources/provider.go (1)

159-176: Censys & GreyNoise env var handling: OK, but consider migration messaging

LoadProviderKeysFromEnv:

  • Uses CENSYS_API_TOKEN + CENSYS_ORGANIZATION_ID, aligning with the new SDK auth model.
  • Reads GREYNOISE_API_KEY into provider.GreyNoise, matching the rest of the integration.

This does, however, silently drop support for any legacy CENSYS_API_ID / CENSYS_API_SECRET env vars. If many users depend on those, consider a transitional path (e.g., also reading old vars with a warning, or clearly documenting the breaking change in release notes).

Would you confirm whether the previous Censys env var names are explicitly deprecated in your release notes/changelog? If not, it’s worth calling that out to avoid confusion for existing users.

integration-tests/source-test.go (2)

302-302: Inconsistent cleanup pattern.

Other test cases in this file use defer func() { _ = os.RemoveAll(ConfigFile) }() but this one uses defer os.RemoveAll(ConfigFile). While functionally similar, the pattern should be consistent with the rest of the file.

-	defer os.RemoveAll(ConfigFile)
+	defer func() {
+		_ = os.RemoveAll(ConfigFile)
+	}()

304-318: Test silently passes on failures and empty results.

This test returns nil (success) even when the query fails or returns no results, which means CI will always report success regardless of actual GreyNoise functionality. While the comments explain this is due to Community vs Enterprise API key differences, consider returning an error when GREYNOISE_API_KEY is set but the query unexpectedly fails with a non-plan-related error.

sources/agent/censys/censys.go (2)

89-89: Remove commented-out code.

Dead code should be removed rather than left as comments.

		results <- sources.Result{Source: agent.Name(), Error: err}
-		// httputil.DrainResponseBody(resp)
		return nil

93-97: Variable shadowing: result is declared twice in nested scopes.

The inner result at line 97 shadows the outer result from line 93. Consider renaming one of them (e.g., sdkResult for the outer or r for the inner) to improve clarity.

-	if result := resp.ResponseEnvelopeSearchQueryResponse.Result; result != nil {
-		for _, censysResult := range result.Hits {
+	if sdkResult := resp.ResponseEnvelopeSearchQueryResponse.Result; sdkResult != nil {
+		for _, censysResult := range sdkResult.Hits {
sources/agent/greynoise/response.go (1)

47-47: Tags field uses json.RawMessage but Tag struct is defined.

The Tags field is declared as json.RawMessage (line 47), but there's a fully-defined Tag struct (lines 80-92). If the API consistently returns an array of tags, consider using []Tag instead for type safety. If the structure varies, json.RawMessage is appropriate, but then the Tag struct may be dead code.

sources/agent/greynoise/greynoise.go (2)

173-178: Ignored error from url.Parse and redundant URL construction.

The error from url.Parse(URL) is discarded (line 173). While URL is a constant and unlikely to fail, consider either handling the error or simplifying the construction since URL already contains the path.

-	path := "/v3/gnql"
-	if request.ExcludeRaw {
-		path = "/v3/gnql/metadata"
-	}
-
-	baseURL, _ := url.Parse(URL)
-	baseURL.Path = path
-	fullURL := baseURL.String()
-	if enc := params.Encode(); enc != "" {
-		fullURL = fullURL + "?" + enc
-	}
+	path := URL
+	if request.ExcludeRaw {
+		path = "https://api.greynoise.io/v3/gnql/metadata"
+	}
+	fullURL := path
+	if enc := params.Encode(); enc != "" {
+		fullURL = path + "?" + enc
+	}

Alternatively, define both URL constants at the package level.


281-288: IPv6 address handling in hostname extraction.

The port-stripping logic at lines 281-285 attempts to handle IPv6 addresses with ports (e.g., [::1]:8080), but the bracket handling is partial. Consider using net.SplitHostPort for more robust host:port parsing that handles both IPv4 and IPv6 correctly.

// More robust approach using net.SplitHostPort
if host, _, err := net.SplitHostPort(h); err == nil {
    h = host
}
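A runnable version of that suggestion (hostOnly is a hypothetical helper name): net.SplitHostPort handles bracketed IPv6, host:port, and bare hosts uniformly, falling back to the input when no port is present.

```go
package main

import (
	"fmt"
	"net"
)

// hostOnly strips an optional :port, including the [::1]:8080 bracket form.
func hostOnly(h string) string {
	if host, _, err := net.SplitHostPort(h); err == nil {
		return host
	}
	return h // no port (or unparseable): keep the original value
}

func main() {
	for _, h := range []string{"[::1]:8080", "example.com:443", "example.com", "::1"} {
		fmt.Println(h, "->", hostOnly(h))
	}
}
```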
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f9bcff and 0ac7391.

⛔ Files ignored due to path filters (7)
  • .github/workflows/build-test.yml is excluded by !**/*.yml
  • .github/workflows/provider-integration.yml is excluded by !**/*.yml
  • .github/workflows/release-binary.yml is excluded by !**/*.yml
  • .github/workflows/release-test.yml is excluded by !**/*.yml
  • go.sum is excluded by !**/*.sum
  • sources/agent/censys/example.json is excluded by !**/*.json
  • sources/agent/greynoise/example.json is excluded by !**/*.json
📒 Files selected for processing (25)
  • README.md (3 hunks)
  • go.mod (2 hunks)
  • integration-tests/integration-test.go (1 hunks)
  • integration-tests/source-test.go (12 hunks)
  • runner/options.go (7 hunks)
  • runner/output_writer.go (1 hunks)
  • runner/runner.go (0 hunks)
  • sources/agent/binaryedge/binaryedge.go (1 hunks)
  • sources/agent/censys/censys.go (3 hunks)
  • sources/agent/censys/response.go (0 hunks)
  • sources/agent/driftnet/driftnet.go (2 hunks)
  • sources/agent/fofa/fofa.go (5 hunks)
  • sources/agent/google/google.go (1 hunks)
  • sources/agent/greynoise/greynoise.go (1 hunks)
  • sources/agent/greynoise/request.go (1 hunks)
  • sources/agent/greynoise/response.go (1 hunks)
  • sources/agent/hunter/hunter.go (4 hunks)
  • sources/agent/hunter/request.go (1 hunks)
  • sources/agent/odin/odin.go (1 hunks)
  • sources/agent/onyphe/onyphe.go (4 hunks)
  • sources/agent/zoomeye/zoomeye.go (1 hunks)
  • sources/keys.go (3 hunks)
  • sources/provider.go (5 hunks)
  • sources/session.go (2 hunks)
  • uncover.go (3 hunks)
💤 Files with no reviewable changes (2)
  • sources/agent/censys/response.go
  • runner/runner.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-05-12T11:42:48.205Z
Learnt from: Xanderux
Repo: projectdiscovery/uncover PR: 672
File: sources/agent/onyphe/response.go:3-14
Timestamp: 2025-05-12T11:42:48.205Z
Learning: The Onyphe integration in Uncover deliberately captures only IP and Port information from API responses to minimize memory usage, as other fields returned by the Onyphe API are not needed for Uncover's functionality.

Applied to files:

  • README.md
🧬 Code graph analysis (8)
sources/agent/hunter/request.go (1)
sources/agent/hunter/hunter.go (1)
  • PortFilter (23-23)
sources/agent/greynoise/request.go (1)
sources/agent.go (1)
  • Query (3-6)
uncover.go (2)
sources/agent/greynoise/greynoise.go (1)
  • Agent (36-36)
sources/agent.go (1)
  • Agent (8-11)
sources/agent/hunter/hunter.go (2)
sources/agent.go (1)
  • Query (3-6)
sources/agent/hunterhow/hunterhow.go (1)
  • Size (14-14)
sources/agent/onyphe/onyphe.go (3)
sources/keys.go (1)
  • Keys (3-23)
sources/agent/onyphe/response.go (2)
  • Result (11-14)
  • OnypheResponse (3-9)
sources/agent.go (1)
  • Query (3-6)
integration-tests/source-test.go (2)
testutils/integration.go (1)
  • RunUncoverAndGetResults (10-38)
uncover.go (1)
  • New (57-113)
sources/agent/censys/censys.go (3)
sources/keys.go (1)
  • Keys (3-23)
sources/result.go (1)
  • Result (8-17)
sources/session.go (1)
  • Session (39-44)
sources/agent/fofa/fofa.go (2)
sources/keys.go (1)
  • Keys (3-23)
sources/result.go (1)
  • Result (8-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: release-test
  • GitHub Check: Test Builds (macOS-latest, 1.24.x)
  • GitHub Check: Test Builds (ubuntu-latest, 1.24.x)
  • GitHub Check: Test Builds (windows-latest, 1.24.x)
  • GitHub Check: Analyze (go)
🔇 Additional comments (32)
sources/agent/hunter/request.go (1)

12-12: LGTM!

The PortFilter field is correctly added with appropriate type and JSON tag. However, note that this field is currently not being used in the URL construction in hunter.go (see related comment there).

sources/agent/onyphe/onyphe.go (1)

74-94: Response body closing and JSON/API error handling are correct

Deferring resp.Body.Close() via an anonymous function and explicitly ignoring the close error is idiomatic here, and the JSON unmarshal + apiResponse.Error checks correctly turn parse/API errors into sources.Result{Error: ...} and stop pagination by returning nil.

No functional issues from this change.

sources/agent/zoomeye/zoomeye.go (1)

97-99: LGTM!

The deferred closure pattern is consistent with the standardized approach across other agents in this PR. The response body is properly closed after the error check.

sources/agent/binaryedge/binaryedge.go (1)

44-46: LGTM!

The deferred closure is correctly placed after the error check and follows the standardized pattern across the codebase.

sources/agent/odin/odin.go (1)

93-95: LGTM!

The deferred closure follows the standardized pattern and is correctly placed after the error check.

go.mod (1)

3-3: Go 1.24.0 is stable and available for production use.

Go 1.24 was released on February 11, 2025, making it a stable release that has been in production for several months. No action required.

sources/session.go (2)

19-36: GreyNoise default rate limit wiring looks consistent

"greynoise" entry matches the agent/engine naming and other 1 req/sec defaults. No issues from a rate limiting perspective.


120-123: Non-200 error context change is acceptable

Switching to url.QueryUnescape(request.String()) gives a more complete, human-readable representation of the request in errors. Since the unescape error is intentionally ignored and only used for logging, this is fine.

README.md (1)

133-143: Censys provider config docs match new token/orgId format

The Censys YAML examples now align with provider.go (CENSYS_API_TOKEN:CENSYS_ORGANIZATION_ID). This is consistent with the new Keys fields and SDK usage.

integration-tests/integration-test.go (1)

20-38: GreyNoise integration tests are correctly wired

Adding "greynoise": greynoiseTestcases{} to the tests map matches the naming of the new agent and follows the existing pattern for other sources.

uncover.go (2)

8-27: GreyNoise agent import is consistent with existing agent structure

Importing sources/agent/greynoise alongside the other agents is straightforward and matches the project’s layout.


59-95: GreyNoise agent selection is correctly integrated

The "greynoise" case in New appends &greynoise.Agent{} to s.Agents, and AllAgents() lists "greynoise" as supported. The engine string matches CLI (-greynoise) and the rate-limit key in sources/session.go, so everything lines up.

sources/provider.go (3)

23-40: GreyNoise provider field matches YAML tag and usage

Adding GreyNoise []string 'yaml:"greynoise"' to Provider is consistent with the configuration format used elsewhere and will deserialize cleanly from provider-config.yaml.


55-62: Key extraction for Censys and GreyNoise is coherent

  • Censys now expects token:orgId and maps cleanly into keys.CensysToken and keys.CensysOrgId.
  • GreyNoise selects a random key from provider.GreyNoise into keys.GreyNoiseKey, matching Keys and the greynoise agent’s expectations.

No issues here.

Also applies to: 120-125


179-197: HasKeys updated correctly for GreyNoise

Including len(provider.GreyNoise) > 0 in HasKeys() keeps the “any provider has keys” logic in sync with the new source.

sources/keys.go (1)

3-23: Keys struct and Empty() logic remain sound after Censys/GreyNoise changes

  • Swapping CensysSecret for CensysOrgId and adding GreyNoiseKey aligns with the new auth model and provider fields.
  • Empty() now checks CensysOrgId and GreyNoiseKey in addition to existing fields, maintaining the invariant that it’s only true when all keys are unset.

No functional issues spotted.

Also applies to: 25-45

runner/options.go (6)

49-67: Options struct cleanly extends to GreyNoise

Adding GreyNoise goflags.StringSlice alongside other per-engine slices is consistent with the existing design and keeps configuration symmetric.


75-79: CLI flags and help text for GreyNoise are consistent

  • --engine/-e help now lists greynoise, matching the internal engine string.
  • New --greynoise/-gn flag mirrors the pattern of other per-engine flags, including goflags.FileStringSliceOptions.

This should make GreyNoise discoverable and usable from the CLI.

Also applies to: 81-99


157-177: Engine defaulting logic correctly considers GreyNoise

Extending the genericutil.EqualsAll(0, ...) check to include len(options.Driftnet) and len(options.GreyNoise) ensures:

  • Shodan is only auto-selected when no engine or per-engine queries (including GreyNoise) are specified.
  • Passing -greynoise or -e greynoise prevents unintended fallback to shodan.

Looks correct.


226-250: Validation correctly treats GreyNoise as a first-class query source

Adding len(options.GreyNoise) to the “no query provided” check means -greynoise alone counts as having queries; likewise, it’s included in the “no engine specified” check. This keeps GreyNoise behavior in line with other providers.


257-278: Engine validation includes GreyNoise

The “no engine specified” validation now factors in len(options.GreyNoise), making it consistent with other engines and preventing spurious errors when only GreyNoise-specific flags are used.


303-321: GreyNoise is correctly added in appendAllQueries

appendQuery(options, "greynoise", options.GreyNoise...) ensures:

  • GreyNoise gets added to options.Engine if there are greynoise-specific queries.
  • Those queries are merged into the global options.Query set without duplication.

This matches the behavior of other engines.

sources/agent/greynoise/request.go (1)

1-8: LGTM!

Clean data structure with appropriate JSON tags. The omitempty tags on optional fields and the json:"-" on ExcludeRaw are correct for the intended serialization behavior.

integration-tests/source-test.go (1)

26-28: Cleanup pattern standardization looks good.

The change to wrap os.RemoveAll in an anonymous function with explicit error discarding (_ =) is consistently applied across all existing test cases, improving code uniformity.

sources/agent/censys/censys.go (3)

72-82: SDK query implementation looks correct.

The SDK integration properly handles pagination with PageToken and uses the helper censyssdkgo.Int64() for type conversion. On the first request, &censysRequest.Cursor will point to an empty string; verify the SDK treats an empty cursor as a request for the first page.
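The cursor-based pagination described above can be sketched generically; `fetchPage`, its `NextPageToken` field, and the two-page data are illustrative stand-ins, not the actual Censys SDK API:

```go
package main

import "fmt"

// page is a hypothetical response shape for cursor pagination:
// NextPageToken is empty on the last page.
type page struct {
	Items         []string
	NextPageToken string
}

// fetchPage simulates an API that serves two pages.
func fetchPage(cursor string) page {
	if cursor == "" {
		return page{Items: []string{"a", "b"}, NextPageToken: "p2"}
	}
	return page{Items: []string{"c"}}
}

// collectAll loops until the server stops returning a next-page token.
func collectAll() []string {
	var all []string
	cursor := ""
	for {
		p := fetchPage(cursor)
		all = append(all, p.Items...)
		if p.NextPageToken == "" {
			break
		}
		cursor = p.NextPageToken
	}
	return all
}

func main() {
	fmt.Println(collectAll()) // [a b c]
}
```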


104-106: Verify Port type compatibility—SDK type definition not accessible.

The review comment raises a valid concern. The Censys SDK port field is Int64 based on platform documentation. However, I cannot access the SDK code in the sandbox to confirm the exact Go type of host.Port in censys-sdk-go v0.19.1.

Code at lines 104–105 assigns the dereferenced host.Port to result.Port (an int) without an explicit cast. In Go this only compiles if host.Port is already *int or a compatible type; if it is *int64, as the web documentation suggests, an explicit int() conversion is required.

Please verify:

  • Run go build to confirm the code compiles without type errors
  • If compilation fails, add explicit conversion: result.Port = int(*host.Port)

93-116: The identified nil pointer dereference concern is incorrect based on how the censys-sdk-go SDK structures its response types.

The censys-sdk-go v0.19.1 SDK is Speakeasy-generated and follows the standard pattern where nullable API response fields are represented as regular Go values (not pointers), accompanied by respjson.Field metadata to distinguish null/omitted/present states. Accessing censysResult.WebpropertyV1.Resource.Endpoints cannot panic—if these fields are unset, they are zero-valued and the range loop simply doesn't execute. No nil guards are needed.

Likely an incorrect or invalid review comment.

sources/agent/greynoise/response.go (1)

1-149: Data models are well-structured for GNQL API responses.

The response types comprehensively cover the GNQL API structure with appropriate JSON tags. The use of nested anonymous structs in RawData is acceptable for deserialization-only types.

sources/agent/greynoise/greynoise.go (4)

116-129: Cartesian product of hosts and ports may cause result explosion.

When both hosts and ports are present, the code emits len(hosts) × len(ports) results per item. For items with many hosts/ports, this could significantly multiply the result count. Ensure this is the intended behavior, as a single GNQL item could produce dozens of Result entries.
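A minimal illustration of the host x port fan-out; `result` here is a simplified stand-in for the agent's Result type:

```go
package main

import "fmt"

type result struct {
	Host string
	Port int
}

// expand emits one result per (host, port) pair, as the agent does when a
// single GNQL item carries both hosts and ports.
func expand(hosts []string, ports []int) []result {
	var out []result
	for _, h := range hosts {
		for _, p := range ports {
			out = append(out, result{Host: h, Port: p})
		}
	}
	return out
}

func main() {
	rs := expand([]string{"a.example", "b.example"}, []int{80, 443, 8080})
	fmt.Println(len(rs)) // 6 = 2 hosts x 3 ports
}
```

The result count grows multiplicatively, which is why a single item with many hosts and ports can dominate the output.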


237-271: Port collection is comprehensive.

The function correctly aggregates ports from multiple sources (Scan, JA3, HASSH, and HTTP hosts) using a map for deduplication. The logic for extracting ports from host strings (lines 259-264) complements the hostname extraction approach.


194-217: Comprehensive error handling with wrapped errors.

The status code handling covers relevant error cases (401, 403, 404, 429, 5xx) with informative messages. The use of %w allows callers to use errors.Is() for error type checking, and the Retry-After hint extraction is a nice touch for rate limiting.


54-57: The review comment is incorrect; the code is already consistent with GreyNoise API limits.

The default pageSize of 1000 does not exceed API limits—it matches the GreyNoise GNQL API default of 1,000 results per page, and the clamping to 10,000 correctly implements the API maximum of 10,000 results per page. The code is already properly aligned with the documented API constraints.

Likely an incorrect or invalid review comment.
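The default-and-clamp behavior discussed can be sketched as follows; the limits 1000 and 10000 come from the review's reading of the GNQL docs:

```go
package main

import "fmt"

const (
	defaultPageSize = 1000  // GNQL default per the review
	maxPageSize     = 10000 // GNQL documented maximum per page
)

// normalizePageSize applies the default for unset values and clamps
// oversized requests to the API maximum.
func normalizePageSize(requested int) int {
	if requested <= 0 {
		return defaultPageSize
	}
	if requested > maxPageSize {
		return maxPageSize
	}
	return requested
}

func main() {
	fmt.Println(normalizePageSize(0), normalizePageSize(50000)) // 1000 10000
}
```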

go 1.24.0

require (
github.com/censys/censys-sdk-go v0.19.1

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify censys-sdk-go version.

The new censys-sdk-go dependency aligns with the Censys migration mentioned in the PR summary. Confirm v0.19.1 is the latest stable version.


🏁 Script executed:

#!/bin/bash
# Check latest version of censys-sdk-go
curl -s "https://proxy.golang.org/github.com/censys/censys-sdk-go/@v/list" | tail -5

Length of output: 39


Update censys-sdk-go to the latest stable version.

The censys-sdk-go dependency is pinned to v0.19.1, but v0.22.3 is the latest stable version available. Update to v0.22.3 or review the changelog to ensure v0.19.1 is intentionally pinned (e.g., due to compatibility constraints).

🤖 Prompt for AI Agents
In go.mod around line 6, the censys-sdk-go dependency is pinned to v0.19.1;
update the version to the latest stable (v0.22.3) unless there is a deliberate
compatibility constraint. Edit go.mod to replace v0.19.1 with v0.22.3, run `go
get github.com/censys/censys-sdk-go@v0.22.3` and `go mod tidy`, then run the
test suite and scan the changelog for breaking changes to confirm compatibility;
if incompatibilities exist, document the reason to keep the older pin.

Comment on lines +198 to 199
Required API keys can be obtained by signing up on following platform [Shodan](https://account.shodan.io/register), [Censys](https://censys.io/register), [Fofa](https://fofa.info/toLogin), [Quake](https://quake.360.net/quake/#/index), [Hunter](https://user.skyeye.qianxin.com/user/register?next=https%3A//hunter.qianxin.com/api/uLogin&fromLogin=1), [ZoomEye](https://www.zoomeye.ai), [Netlas](https://app.netlas.io/registration/), [CriminalIP](https://www.criminalip.io/register), [Publicwww](https://publicwww.com/profile/signup.html), Google [[1]](https://developers.google.com/custom-search/v1/introduction#identify_your_application_to_google_with_api_key),[[2]](https://programmablesearchengine.google.com/controlpanel/create), [Onyphe](https://search.onyphe.io/signup) and [Driftnet](https://driftnet.io/auth?state=signup).


⚠️ Potential issue | 🟡 Minor

Consider documenting GreyNoise in the “Required API keys” list

This list now includes Driftnet but not GreyNoise, even though GreyNoise is a first-class provider in the code (options, provider, keys, agent, tests). Recommend appending GreyNoise with its signup URL, e.g.:

..., Onyphe, Driftnet, and GreyNoise (or the appropriate signup page).

This keeps the README in sync with supported providers.

🤖 Prompt for AI Agents
In README.md around lines 198-199 the "Required API keys" list omits GreyNoise
even though the codebase supports it; add GreyNoise to the list with its signup
URL. Edit the sentence to append ",
[GreyNoise](https://www.greynoise.io/register)" (or the appropriate GreyNoise
signup page) after Driftnet so the README matches supported providers.

Comment on lines 90 to 103
fofaResponse := &FofaResponse{}

RespBodyByBodyBytes, _ := io.ReadAll(resp.Body)
if err := json.NewDecoder(resp.Body).Decode(fofaResponse); err != nil {
results <- sources.Result{Source: agent.Name(), Error: err}
result := sources.Result{Source: agent.Name()}
defer func(Body io.ReadCloser) {
if bodyCloseErr := Body.Close(); bodyCloseErr != nil {
gologger.Info().Msgf("response body close error : %v", bodyCloseErr)
}
}(resp.Body)
raw, _ := json.Marshal(RespBodyByBodyBytes)
result.Raw = raw
results <- result
return nil
}

⚠️ Potential issue | 🔴 Critical

Response body handling and JSON decode are currently broken and leak connections

  • io.ReadAll(resp.Body) at Line 91 consumes the entire body; the subsequent json.NewDecoder(resp.Body).Decode(fofaResponse) then reads from an already‑drained stream and will consistently return an error (typically EOF). That means you will never successfully parse a FOFA response and will always go down the error path.
  • resp.Body is only closed inside the JSON‑decode error branch. On a “successful” decode (if it ever occurred), the body would never be closed, leaking connections and eventually exhausting the HTTP client’s pool.
  • In the error path you also mangle the raw body: json.Marshal(RespBodyByBodyBytes) base64-encodes the byte slice rather than storing it verbatim, and Result.Error is left nil, so consumers can’t distinguish success from failure via the error field.

A safer pattern is:

  • Immediately defer closing the body once you know err == nil, with logging on close failure.
  • Read the body once into a []byte.
  • json.Unmarshal from that buffer.
  • On decode failure, emit a Result that carries both Raw and Error.

Suggested fix:

-	fofaResponse := &FofaResponse{}
-	RespBodyByBodyBytes, _ := io.ReadAll(resp.Body)
-	if err := json.NewDecoder(resp.Body).Decode(fofaResponse); err != nil {
-		result := sources.Result{Source: agent.Name()}
-		defer func(Body io.ReadCloser) {
-			if bodyCloseErr := Body.Close(); bodyCloseErr != nil {
-				gologger.Info().Msgf("response body close error : %v", bodyCloseErr)
-			}
-		}(resp.Body)
-		raw, _ := json.Marshal(RespBodyByBodyBytes)
-		result.Raw = raw
-		results <- result
-		return nil
-	}
+	defer func(body io.ReadCloser) {
+		if bodyCloseErr := body.Close(); bodyCloseErr != nil {
+			gologger.Info().Msgf("fofa: response body close error: %v", bodyCloseErr)
+		}
+	}(resp.Body)
+
+	fofaResponse := &FofaResponse{}
+	respBodyBytes, err := io.ReadAll(resp.Body)
+	if err != nil {
+		results <- sources.Result{Source: agent.Name(), Error: err}
+		return nil
+	}
+	if err := json.Unmarshal(respBodyBytes, fofaResponse); err != nil {
+		results <- sources.Result{
+			Source: agent.Name(),
+			Raw:    respBodyBytes,
+			Error:  err,
+		}
+		return nil
+	}

This keeps Result.Raw as the exact response body, surfaces the decode failure via Error, and ensures the body is always closed.

Comment on lines +85 to 96
RespBodyByBodyBytes, _ := io.ReadAll(resp.Body)
if err := json.NewDecoder(resp.Body).Decode(hunterResponse); err != nil {
results <- sources.Result{Source: agent.Name(), Error: err}
result := sources.Result{Source: agent.Name()}
defer func(Body io.ReadCloser) {
if bodyCloseErr := Body.Close(); bodyCloseErr != nil {
gologger.Info().Msgf("response body close error : %v", bodyCloseErr)
}
}(resp.Body)
raw, _ := json.Marshal(RespBodyByBodyBytes)
result.Raw = raw
results <- result
return nil

⚠️ Potential issue | 🔴 Critical

Critical bug: Response body is exhausted before JSON decoding.

io.ReadAll(resp.Body) consumes the entire body, so json.NewDecoder(resp.Body).Decode() will always fail with an empty reader. Additionally, on the error path, json.Marshal(RespBodyByBodyBytes) will base64-encode the byte slice rather than store the raw response.

Apply this diff to fix the body handling:

-	RespBodyByBodyBytes, _ := io.ReadAll(resp.Body)
-	if err := json.NewDecoder(resp.Body).Decode(hunterResponse); err != nil {
-		result := sources.Result{Source: agent.Name()}
-		defer func(Body io.ReadCloser) {
-			if bodyCloseErr := Body.Close(); bodyCloseErr != nil {
-				gologger.Info().Msgf("response body close error : %v", bodyCloseErr)
-			}
-		}(resp.Body)
-		raw, _ := json.Marshal(RespBodyByBodyBytes)
-		result.Raw = raw
-		results <- result
+	defer func() {
+		if err := resp.Body.Close(); err != nil {
+			gologger.Info().Msgf("response body close error: %v", err)
+		}
+	}()
+
+	bodyBytes, err := io.ReadAll(resp.Body)
+	if err != nil {
+		results <- sources.Result{Source: agent.Name(), Error: err}
+		return nil
+	}
+
+	if err := json.Unmarshal(bodyBytes, hunterResponse); err != nil {
+		result := sources.Result{Source: agent.Name()}
+		result.Raw = bodyBytes
+		results <- result
 		return nil
 	}
🤖 Prompt for AI Agents
In sources/agent/hunter/hunter.go around lines 85 to 96, the code reads
resp.Body with io.ReadAll and then tries to decode JSON from the same resp.Body
(which is already exhausted) and also json.Marshal is being used incorrectly on
the raw bytes; fix it by first reading resp.Body into a []byte variable, close
resp.Body immediately after reading, create a bytes.Reader (or bytes.NewBuffer)
from that []byte and pass it to json.NewDecoder().Decode() so decoding operates
on the saved bytes; on the error path set result.Raw = bodyBytes (the raw
[]byte) rather than json.Marshal(bodyBytes), and keep the body-close logic but
ensure it runs after the initial read.

@Mzack9999 Mzack9999 merged commit e261ee8 into main Nov 25, 2025
8 checks passed