feat: add DayDayMap search engine support #710
taielab wants to merge 2 commits into projectdiscovery:main from
Conversation
Add support for DayDayMap (每日地图) cyberspace mapping engine as a new search provider. DayDayMap is a Chinese cyberspace asset search platform.

Changes:
- Add DayDayMap agent implementation with Base64 query encoding
- Support pagination with max 10,000 results per API limit
- Add command-line flag -daydaymap / -ddm for DayDayMap queries
- Add DAYDAYMAP_API_KEY environment variable support
- Update engine list in help text and AllAgents() method
- Add gitignore entries for build artifacts

Implementation details:
- API endpoint: https://www.daydaymap.com/api/v1/raymap/search/all
- Authentication via api-key header
- Query keywords are Base64 encoded before sending
- Result parsing includes IP, port, domain, and service fields
- Full error handling and API response validation

Files changed:
- sources/agent/daydaymap/daydaymap.go (new)
- sources/agent/daydaymap/request.go (new)
- sources/agent/daydaymap/response.go (new)
- sources/keys.go (add Daydaymap field)
- sources/provider.go (add configuration support)
- uncover.go (register agent)
- runner/options.go (add CLI flags)
- .gitignore (add build artifacts)
Walkthrough: Adds multiple new data-source agents (daydaymap, github, fullhunt, ip138-spider, rapiddns-spider, sitedossier-spider, zone0), CLI/provider key support, HTTP helpers and HTML subdomain extraction, plus .gitignore updates and service registration across the codebase.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🧹 Nitpick comments (3)
sources/agent/daydaymap/response.go (1)

19-20: Consider using a typed struct for DaydaymapResult. Using map[string]interface{} requires runtime type assertions (as seen in daydaymap.go lines 128-155). A typed struct with known fields (IP, Port, Domain, Service) would provide compile-time safety and clearer documentation of the expected response format, similar to sources/agent/hunterhow/response.go.

sources/agent/daydaymap/daydaymap.go (2)
148-156: URL construction prefers IP over domain. When both a domain and an IP are available, using the domain in the URL would be more user-friendly and potentially more accurate for virtual hosts.

Consider using the domain when available:

```diff
 // Extract title for URL construction
 if service, ok := daydaymapResult["service"]; ok && service != nil {
 	serviceStr := fmt.Sprint(service)
 	if serviceStr == "https" || serviceStr == "http" {
-		result.Url = fmt.Sprintf("%s://%s", serviceStr, result.IP)
+		host := result.IP
+		if result.Host != "" {
+			host = result.Host
+		}
+		result.Url = fmt.Sprintf("%s://%s", serviceStr, host)
 		if result.Port > 0 && result.Port != 80 && result.Port != 443 {
 			result.Url = fmt.Sprintf("%s:%d", result.Url, result.Port)
 		}
 	}
 }
```
15-18: Clarify the MaxSize constant comment. The comment says "API max page size" but, based on the usage at line 73, this appears to be the maximum total results the API supports, not the page size.

```diff
 const (
 	URL     = "https://www.daydaymap.com/api/v1/raymap/search/all"
-	MaxSize = 10000 // API max page size
+	MaxSize = 10000 // API maximum total results limit
 )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .gitignore (1 hunks)
- runner/options.go (7 hunks)
- sources/agent/daydaymap/daydaymap.go (1 hunks)
- sources/agent/daydaymap/request.go (1 hunks)
- sources/agent/daydaymap/response.go (1 hunks)
- sources/keys.go (2 hunks)
- sources/provider.go (4 hunks)
- uncover.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
uncover.go (16)
- sources/agent/daydaymap/daydaymap.go (1): Agent (20-20)
- sources/agent.go (1): Agent (8-11)
- sources/agent/binaryedge/binaryedge.go (1): Agent (18-18)
- sources/agent/google/google.go (1): Agent (17-17)
- sources/agent/driftnet/driftnet.go (1): Agent (43-43)
- sources/agent/fofa/fofa.go (1): Agent (31-31)
- sources/agent/hunter/hunter.go (1): Agent (29-29)
- sources/agent/odin/odin.go (1): Agent (13-13)
- sources/agent/onyphe/onyphe.go (1): Agent (24-24)
- sources/agent/zoomeye/zoomeye.go (1): Agent (18-18)
- sources/agent/censys/censys.go (1): Agent (19-19)
- sources/agent/criminalip/criminalip.go (1): Agent (20-20)
- sources/agent/hunterhow/hunterhow.go (1): Agent (17-17)
- sources/agent/netlas/netlas.go (1): Agent (17-17)
- sources/agent/greynoise/greynoise.go (1): Agent (36-36)
- sources/agent/publicwww/publicwww.go (1): Agent (19-19)

sources/agent/daydaymap/response.go (1)
- sources/agent/hunterhow/response.go (1): List (8-12)

sources/agent/daydaymap/request.go (1)
- sources/agent/fofa/fofa.go (1): Fields (25-25)

sources/agent/daydaymap/daydaymap.go (5)
- sources/session.go (1): Session (39-44)
- sources/keys.go (1): Keys (3-24)
- sources/agent/daydaymap/request.go (1): DaydaymapRequest (4-10)
- sources/util.go (1): NewHTTPRequest (10-17)
- sources/agent/daydaymap/response.go (1): DaydaymapResponse (4-8)
🔇 Additional comments (8)
.gitignore (3)

20-20: Build artifact entries are appropriate. Adding dist/ and uncover to ignore build outputs aligns with standard Go project practices; both are expected build artifacts that should not be committed. Also applies to: 22-22

21-21: Clarify the purpose of the docs entry. Confirm whether docs is a directory containing generated documentation (e.g., from godoc or a doc build tool) that should be ignored, or whether it contains source documentation that should remain tracked.

23-23: Verify the relevance of ENScan_GO to this PR. The ENScan_GO entry appears unrelated to the DayDayMap feature being added. Confirm whether this is an intentional change or accidental scope drift.

sources/keys.go (1)

23-23: LGTM! The Daydaymap field addition and integration into the Empty() check follows the established pattern for other providers. Also applies to: 45-46

uncover.go (1)

13-13: LGTM! The import and agent registration follow the established pattern for other search engine agents. Also applies to: 96-97

sources/agent/daydaymap/request.go (1)

1-10: LGTM! The request struct is clean and follows established patterns with proper JSON tags.

runner/options.go (1)

66-66: LGTM! The CLI flag integration is consistent with other search engines, properly added to all validation and query-merging logic. Also applies to: 100-100, 326-326

sources/provider.go (1)

40-40: LGTM! The provider integration for Daydaymap follows the established pattern for single-key providers like Shodan and Quake. Also applies to: 127-129, 162-162, 202-202
The flagged code:

```go
// Check if we should stop
if numberOfResults >= query.Limit || numberOfResults >= totalResults || len(daydaymapResponse.Data.List) == 0 {
	break
}
```
Potential issue when query.Limit is 0.
When query.Limit is 0 (the default/unset value), the condition numberOfResults >= query.Limit will be true immediately after the first page (since numberOfResults will be > 0), causing early termination after fetching only the first page.
Apply this diff to handle the case when the limit is not set:

```diff
 // Check if we should stop
-if numberOfResults >= query.Limit || numberOfResults >= totalResults || len(daydaymapResponse.Data.List) == 0 {
+if (query.Limit > 0 && numberOfResults >= query.Limit) || numberOfResults >= totalResults || len(daydaymapResponse.Data.List) == 0 {
 	break
 }
```
🤖 Prompt for AI Agents
In sources/agent/daydaymap/daydaymap.go around lines 67 to 70, the loop break
condition uses numberOfResults >= query.Limit which incorrectly triggers when
query.Limit == 0; change the logic to only enforce the limit if query.Limit > 0
(e.g., check query.Limit > 0 && numberOfResults >= query.Limit) and keep the
existing checks for totalResults and empty page so the loop only terminates
early when a positive limit is reached or there are no more results.
The flagged code:

```go
func (agent *Agent) query(URL string, session *sources.Session, daydaymapRequest *DaydaymapRequest, results chan sources.Result) *DaydaymapResponse {
	resp, err := agent.queryURL(session, URL, daydaymapRequest)
	if err != nil {
		results <- sources.Result{Source: agent.Name(), Error: err}
		return nil
	}

	daydaymapResponse := &DaydaymapResponse{}
	if err := json.NewDecoder(resp.Body).Decode(daydaymapResponse); err != nil {
		results <- sources.Result{Source: agent.Name(), Error: err}
		return nil
	}
```
Resource leak: HTTP response body is never closed.
The resp.Body from queryURL is decoded but never closed, which will cause connection leaks over time.
Apply this diff to close the response body:

```diff
 func (agent *Agent) query(URL string, session *sources.Session, daydaymapRequest *DaydaymapRequest, results chan sources.Result) *DaydaymapResponse {
 	resp, err := agent.queryURL(session, URL, daydaymapRequest)
 	if err != nil {
 		results <- sources.Result{Source: agent.Name(), Error: err}
 		return nil
 	}
+	defer resp.Body.Close()
 	daydaymapResponse := &DaydaymapResponse{}
 	if err := json.NewDecoder(resp.Body).Decode(daydaymapResponse); err != nil {
```
🤖 Prompt for AI Agents
In sources/agent/daydaymap/daydaymap.go around lines 101 to 112, the HTTP
response body returned by queryURL is decoded but never closed, causing a
resource/connection leak; after verifying err == nil and before decoding, add a
defer resp.Body.Close() (or explicitly close on all return paths) so the
response body is always closed even on decode errors, ensuring the body is
released before returning results.
Add support for Zone0 API integration along with several other search engines:
- Zone0: 0.zone API integration with API key authentication
- Github: Github code/repository search provider
- ip138-spider: IP138 subdomain enumeration spider
- rapiddns-spider: RapidDNS subdomain discovery spider
- sitedossier-spider: SiteDossier domain intelligence spider
- Utility functions: ReadBody helper and MatchSubdomains for spider agents

Key changes:
- Add API key management in Keys struct (Zone0Token, GithubToken)
- Add provider configuration for all new engines in Provider struct
- Add environment variable support (ZONE0_API_KEY, GITHUB_TOKEN)
- Add command-line flags for all engines (--zone0, --github, --ip138-spider, etc.)
- Update agent registry and validation logic
- Add utility functions for HTTP body reading and subdomain extraction

All engines follow the existing agent interface pattern and support configuration via YAML config, environment variables, or CLI flags.
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
uncover.go (1)
234-236: Spider agents should be included in the anonymous provider check. The spider agents (ip138-spider, rapiddns-spider, sitedossier-spider) don't require API keys as they scrape public websites. However, hasAnyAnonymousProvider() only checks for shodan-idb, which may cause the keys check at line 141 to incorrectly fail for these agents.

```diff
 func (s *Service) hasAnyAnonymousProvider() bool {
-	return stringsutil.EqualFoldAny("shodan-idb", s.Options.Agents...)
+	return stringsutil.EqualFoldAny("shodan-idb", s.Options.Agents...) ||
+		stringsutil.EqualFoldAny("ip138-spider", s.Options.Agents...) ||
+		stringsutil.EqualFoldAny("rapiddns-spider", s.Options.Agents...) ||
+		stringsutil.EqualFoldAny("sitedossier-spider", s.Options.Agents...)
 }
```
🧹 Nitpick comments (13)
sources/util.go (2)

34-35: Consider compiling the regex once and reusing it. The regex is compiled on every function call, which can be costly when processing many HTML responses. For better performance, consider caching compiled patterns or using a sync.Pool.

47-57: Extract the duplicated deduplication logic into a helper. The anonymous functions for deduplication are identical in both branches. Consider extracting this into a named helper function to reduce code duplication and improve readability.

```diff
+func deduplicateLower(items []string) []string {
+	s := make(map[string]struct{})
+	for _, item := range items {
+		s[strings.ToLower(item)] = struct{}{}
+	}
+	result := make([]string, 0, len(s))
+	for k := range s {
+		result = append(result, k)
+	}
+	return result
+}
```

Also applies to: 61-70
sources/agent/fullhunt/fullhunt.go (2)

81-83: Duplicate status code check, already handled by session.Do. The session.Do method (lines 112-127 in session.go) already returns an error when resp.StatusCode != http.StatusOK. This duplicate check is redundant.

```diff
 resp, err := session.Do(request, agent.Name())
 if err != nil {
 	return nil, err
 }
-if resp.StatusCode != http.StatusOK {
-	return nil, fmt.Errorf("unexpected status code %d received from %s", resp.StatusCode, requestURL)
-}
 return resp, nil
```

24-26: Unused struct fullhuntRequest. The fullhuntRequest struct is defined but its Domain field is only used to format the URL. Consider simplifying by passing the domain string directly instead of wrapping it in a struct.

sources/agent/zone0/zone0.go (1)

37-40: Consider using fmt.Errorf instead of errors.New(fmt.Sprintf(...)). This is more idiomatic Go.

```diff
 if session.Keys.Zone0Token == "" {
-	return nil, errors.New(fmt.Sprintf("empty %s keys please read docs %s on how to add keys ", Source, "https://github.com/projectdiscovery/uncover?tab=readme-ov-file#provider-configuration"))
+	return nil, fmt.Errorf("empty %s keys please read docs %s on how to add keys", Source, "https://github.com/projectdiscovery/uncover?tab=readme-ov-file#provider-configuration")
 }
```

sources/agent/rapiddns-spider/rapiddns.go (2)
19-21: Unused options field. The options field is declared but never used. Consider removing it or using it if intended.

```diff
 type Agent struct {
-	options *sources.Agent
 }
```

52-79: Consider using the shared ReadBody helper from session.go. The sources.ReadBody helper (from sources/session.go lines 129-139) already handles the TLS cancellation error pattern and could simplify this code.

sources/agent/github/github.go (1)

25-28: Use fmt.Errorf instead of errors.New(fmt.Sprintf(...)). More idiomatic Go style.

```diff
 if session.Keys.GithubToken == "" {
-	return nil, errors.New(fmt.Sprintf("empty %s keys please read docs %s on how to add keys ", Source, "https://github.com/projectdiscovery/uncover?tab=readme-ov-file#provider-configuration"))
+	return nil, fmt.Errorf("empty %s keys please read docs %s on how to add keys", Source, "https://github.com/projectdiscovery/uncover?tab=readme-ov-file#provider-configuration")
 }
```

sources/agent/sitedossier-spider/sitedossier.go (1)

20-22: Unused options field. The options field is declared but never used.

```diff
 type Agent struct {
-	options *sources.Agent
 }
```

sources/agent/ip138-spider/ip138.go (4)
18-20: Remove the unused options field. The options field is never initialized or accessed anywhere in this agent implementation.

```diff
 type Agent struct {
-	options *sources.Agent
 }
```

29-40: The query.Limit field is ignored. The Query method doesn't use query.Limit to constrain results. Spider-based agents may naturally return all discovered subdomains, but consider documenting this behavior or implementing early termination if limit support is expected.

74-74: Consider handling the JSON marshaling error. Ignoring the error from json.Marshal could mask serialization issues. If marshaling fails, consider either logging the error or using a fallback:

```diff
-raw, _ := json.Marshal(result)
+raw, err := json.Marshal(result)
+if err != nil {
+	// Log or handle marshaling error
+	continue
+}
 result.Raw = raw
```

51-79: The return value sub is never used. The query method declares a named return sub []string but the caller on line 36 doesn't capture or use the return value. Consider removing the named return to simplify the signature.

```diff
-func (agent *Agent) query(URL string, session *sources.Session, request *ip138Request, results chan sources.Result) (sub []string) {
+func (agent *Agent) query(URL string, session *sources.Session, request *ip138Request, results chan sources.Result) {
```

Then remove the return statement on line 78 or change it to a bare return.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- sources/agent/fullhunt/response.json is excluded by !**/*.json
- sources/agent/zone0/example.json is excluded by !**/*.json
📒 Files selected for processing (14)
- .gitignore (1 hunks)
- runner/options.go (7 hunks)
- sources/agent/fullhunt/fullhunt.go (1 hunks)
- sources/agent/github/github.go (1 hunks)
- sources/agent/ip138-spider/ip138.go (1 hunks)
- sources/agent/rapiddns-spider/rapiddns.go (1 hunks)
- sources/agent/sitedossier-spider/sitedossier.go (1 hunks)
- sources/agent/zone0/response.go (1 hunks)
- sources/agent/zone0/zone0.go (1 hunks)
- sources/keys.go (2 hunks)
- sources/provider.go (4 hunks)
- sources/session.go (2 hunks)
- sources/util.go (2 hunks)
- uncover.go (3 hunks)
✅ Files skipped from review due to trivial changes (1)
- sources/agent/zone0/response.go
🚧 Files skipped from review as they are similar to previous changes (1)
- sources/keys.go
🧰 Additional context used
🧬 Code graph analysis (6)
uncover.go (5)
- sources/agent/github/github.go (1): Agent (19-19)
- sources/agent/ip138-spider/ip138.go (1): Agent (18-20)
- sources/agent/rapiddns-spider/rapiddns.go (1): Agent (19-21)
- sources/agent/sitedossier-spider/sitedossier.go (1): Agent (20-22)
- sources/agent/daydaymap/daydaymap.go (1): Agent (20-20)

sources/agent/rapiddns-spider/rapiddns.go (3)
- sources/agent.go (1): Query (3-6)
- sources/session.go (1): Session (42-47)
- sources/util.go (2): NewHTTPRequest (13-20), MatchSubdomains (31-71)

sources/agent/fullhunt/fullhunt.go (3)
- sources/session.go (1): Session (42-47)
- sources/keys.go (1): Keys (3-26)
- sources/util.go (1): NewHTTPRequest (13-20)

sources/agent/sitedossier-spider/sitedossier.go (3)
- sources/agent.go (1): Query (3-6)
- sources/session.go (1): Session (42-47)
- sources/util.go (2): MatchSubdomains (31-71), NewHTTPRequest (13-20)

sources/agent/ip138-spider/ip138.go (3)
- sources/agent.go (1): Query (3-6)
- sources/session.go (1): Session (42-47)
- sources/util.go (2): NewHTTPRequest (13-20), MatchSubdomains (31-71)

sources/agent/github/github.go (4)
- sources/agent.go (1): Query (3-6)
- sources/session.go (2): Session (42-47), ReadBody (130-140)
- sources/keys.go (1): Keys (3-26)
- sources/util.go (2): NewHTTPRequest (13-20), MatchSubdomains (31-71)
🔇 Additional comments (12)
.gitignore (1)
20-24: Clarify the purpose of the uncover2 and ENScan_GO entries. The additions dist/, docs, and uncover are reasonable for build artifacts. However, two entries warrant clarification:

- uncover2 (line 23): Why is there a second binary name? Is this a separate variant or transitional naming?
- ENScan_GO (line 24): This entry appears unrelated to the stated PR scope (DayDayMap agent support) and the broader scope mentioned in the summary (multiple new agents like github, fullhunt, ip138-spider, etc.). Is this intentional or an accidental inclusion?

sources/session.go (1)

130-140: LGTM! Minor note on TLS-cancel error handling. The function correctly reads the response body and handles TLS-cancel errors gracefully. The approach of suppressing TLS-cancel errors while surfacing other errors is appropriate for this use case, where partial reads during TLS cancellation shouldn't cause failures.

runner/options.go (3)

66-71: LGTM! New engine options are consistently added. The new engine fields follow the established pattern and are properly integrated into the Options struct.

105-110: LGTM! CLI flags are properly defined. The new engine flags are added consistently with existing patterns, including appropriate aliases.

351-356: LGTM! Query merging is correctly implemented. The new engines are properly integrated into the query aggregation logic.

sources/provider.go (4)

40-42: LGTM! New provider fields properly added. The Daydaymap, Github, and Zone0 fields are correctly added with appropriate YAML tags.

129-137: LGTM! Key retrieval follows existing pattern. The random key selection logic is consistent with other providers.

170-172: LGTM! Environment variable mapping is correct. The env var names follow established conventions.

212-214: The review comment is based on incorrect information. The code snippet referenced (lines 212-214 with Daydaymap, Github, and Zone0 fields) does not exist in sources/provider.go. The file contains 213 lines total, and lines 212-214 are part of the init() function handling file creation, not provider validation logic. Additionally, there is no FullHunt agent directory in sources/agent/, no FullHunt field in the Provider struct, and no FullHuntToken in the Keys struct. The review appears to reference outdated or non-existent code. Likely an incorrect or invalid review comment.

sources/agent/fullhunt/fullhunt.go (1)

45-47: File and code snippet do not exist in the repository. The file sources/agent/fullhunt/fullhunt.go does not exist in this repository. A search of the codebase shows no FullHunt agent implementation, no references to FullHuntToken, and no code matching the provided snippet. The agent directory contains only: binaryedge, censys, criminalip, driftnet, fofa, google, greynoise, hunter, hunterhow, netlas, odin, onyphe, publicwww, quake, shodan, shodanidb, and zoomeye. Likely an incorrect or invalid review comment.

uncover.go (1)

101-112: New agent registrations look consistent with existing patterns. The switch cases follow the established pattern for agent registration.

sources/agent/sitedossier-spider/sitedossier.go (1)

16-16: No action needed. The site uses only HTTP without HTTPS support, making this the correct and only protocol option for this endpoint. Likely an incorrect or invalid review comment.
The flagged imports:

```go
"github.com/wjlin0/uncover/sources"
util "github.com/wjlin0/uncover/utils"
```
Incorrect import path - should use projectdiscovery/uncover.
The import paths reference github.com/wjlin0/uncover instead of github.com/projectdiscovery/uncover. This will cause build failures in the upstream repository.
```diff
-	"github.com/wjlin0/uncover/sources"
-	util "github.com/wjlin0/uncover/utils"
+	"github.com/projectdiscovery/uncover/sources"
+	util "github.com/projectdiscovery/uncover/utils"
```
🤖 Prompt for AI Agents
In sources/agent/fullhunt/fullhunt.go around lines 7 to 8, the imports reference
github.com/wjlin0/uncover which is incorrect for the upstream repo; update those
import paths to github.com/projectdiscovery/uncover (both sources and util
import) so they match the canonical project namespace and restore successful
builds.
The flagged code:

```go
var (
	numberOfResults  int
	fullhuntResponse *response
)

fullhunt := &fullhuntRequest{
	Domain: query.Query,
}
if fullhuntResponse = agent.query(session, URL, fullhunt, results); fullhuntResponse == nil {
	return
}
numberOfResults += len(fullhuntResponse.Hosts)
```
Remove unused variable numberOfResults.
The variable numberOfResults is assigned but never used. Either remove it or implement the intended pagination/counting logic.
```diff
 go func() {
 	defer close(results)
-	var (
-		numberOfResults  int
-		fullhuntResponse *response
-	)
+	var fullhuntResponse *response
 	fullhunt := &fullhuntRequest{
 		Domain: query.Query,
 	}
 	if fullhuntResponse = agent.query(session, URL, fullhunt, results); fullhuntResponse == nil {
 		return
 	}
-	numberOfResults += len(fullhuntResponse.Hosts)
 }()
```
🤖 Prompt for AI Agents
In sources/agent/fullhunt/fullhunt.go around lines 52 to 63, the local variable
`numberOfResults` is declared and incremented but never used; either remove the
unused variable and the increment (simplest fix), or implement the intended
pagination/counting by looping through pages, accumulating host counts into
`numberOfResults`, and using it where needed (e.g., to decide whether to request
more pages or to return the total). Choose one approach and update the
surrounding code to eliminate the unused variable warning.
The flagged code:

```go
for _, host := range fullhuntResponse.Hosts {
	result := sources.Result{Source: agent.Name()}
	protocol, host, port := util.GetProtocolHostAndPort(host)
	result.Url = fmt.Sprintf("%s://%s:%d", protocol, host, port)
	result.Host = host
	result.IP = host
	result.Port = port
	raw, _ := json.Marshal(result)
	result.Raw = raw
	results <- result
```
IP field may incorrectly contain a hostname.
Setting result.IP = host is potentially incorrect when the host value is a subdomain/hostname rather than an actual IP address. Consider resolving to IP or leaving it empty if the source doesn't provide IP information.
🤖 Prompt for AI Agents
In sources/agent/fullhunt/fullhunt.go around lines 101-110, you currently set
result.IP = host which will store hostnames for non-IP hosts; instead detect
whether the returned host string is an IP (use net.ParseIP) and if so set
result.IP to it, otherwise attempt a DNS resolution (net.LookupIP) and set
result.IP to the first successful A/AAAA address, and if resolution fails leave
result.IP empty (or nil-equivalent). Ensure you do not change result.Host (keep
the original host string) and handle lookup errors without failing the loop.
The flagged code:

```go
)

const (
	URL = "https://api.github.com/search/code?q=%s&per_page=%d&page=%d&sort=indexed&access_token=%s"
```
Security: Remove token from URL query parameter.
The GitHub token is passed both in the URL (access_token=%s) and the Authorization header (line 67). Tokens in URLs can be exposed in server logs, proxy logs, and browser history. Use only the Authorization header.
```diff
 const (
-	URL = "https://api.github.com/search/code?q=%s&per_page=%d&page=%d&sort=indexed&access_token=%s"
+	URL = "https://api.github.com/search/code?q=%s&per_page=%d&page=%d&sort=indexed"
 	PerPage = 100
 	Source  = "github"
 )
```

Update queryURL accordingly:

```diff
 func (agent *Agent) queryURL(session *sources.Session, URL string, githubRequest *githubRequest) (*http.Response, error) {
-	githubURL := fmt.Sprintf(URL, githubRequest.Query, githubRequest.PerPage, githubRequest.Page, session.Keys.GithubToken)
+	githubURL := fmt.Sprintf(URL, githubRequest.Query, githubRequest.PerPage, githubRequest.Page)
 	request, err := sources.NewHTTPRequest(http.MethodGet, githubURL, nil)
```
The flagged code:

```go
func (agent *Agent) query(URL string, session *sources.Session, githubRequest *githubRequest, results chan sources.Result) []sources.Result {
	resp, err := agent.queryURL(session, URL, githubRequest)
	if err != nil {
		results <- sources.Result{Source: agent.Name(), Error: err}
		return nil
	}
	var githubResult []sources.Result
	body, err := sources.ReadBody(resp)
	if err != nil {
		return nil
	}
	subdomains := sources.MatchSubdomains(githubRequest.Query, body.String(), true)
	for _, sub := range subdomains {
		result := sources.Result{Source: agent.Name()}
		_, result.Host, result.Port = util.GetProtocolHostAndPort(sub)
		result.IP = result.Host
		raw, _ := json.Marshal(result)
		result.Raw = raw
		results <- result
	}
	return githubResult
```
githubResult is never populated — pagination is broken.
The query function declares githubResult but never appends to it, so it always returns an empty slice. This causes the pagination loop to break immediately after the first request since len(githubResponse) is always 0.
func (agent *Agent) query(URL string, session *sources.Session, githubRequest *githubRequest, results chan sources.Result) []sources.Result {
resp, err := agent.queryURL(session, URL, githubRequest)
if err != nil {
results <- sources.Result{Source: agent.Name(), Error: err}
return nil
}
var githubResult []sources.Result
body, err := sources.ReadBody(resp)
if err != nil {
return nil
}
subdomains := sources.MatchSubdomains(githubRequest.Query, body.String(), true)
for _, sub := range subdomains {
result := sources.Result{Source: agent.Name()}
_, result.Host, result.Port = util.GetProtocolHostAndPort(sub)
- result.IP = result.Host
raw, _ := json.Marshal(result)
result.Raw = raw
results <- result
+ githubResult = append(githubResult, result)
}
return githubResult
}

🤖 Prompt for AI Agents
In sources/agent/github/github.go around lines 78 to 98, the githubResult slice
is never populated so pagination stops; when you construct each Result for a
discovered subdomain, append that Result to githubResult (in addition to sending
it on the results channel) so the function returns the collected results for the
caller to continue pagination; ensure you append the same result object (or a
copied value) to githubResult before sending it on the channel and return
githubResult at the end.
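The collect-and-send pattern the fix restores can be sketched in isolation; the `collect` helper and plain string items below are illustrative stand-ins, not the actual uncover types:

```go
package main

import "fmt"

// collect sends each item on the channel AND appends it to the returned
// slice, so a caller can use len(returned) to decide whether to request
// the next page. Returning an always-empty slice breaks that check.
func collect(items []string, results chan<- string) []string {
	var page []string
	for _, it := range items {
		results <- it
		page = append(page, it)
	}
	return page
}

func main() {
	results := make(chan string, 10)
	page := collect([]string{"a.example.com", "b.example.com"}, results)
	close(results)
	fmt.Println(len(page)) // 2: non-empty, so the caller keeps paginating
	for r := range results {
		fmt.Println(r)
	}
}
```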
if strings.ContainsAny(err.Error(), "tls: user canceled") {
	shouldIgnoreErrors = true
}
if !shouldIgnoreErrors {
strings.ContainsAny is incorrect — use strings.Contains.
ContainsAny checks if any single character from the second argument exists in the first string. This will match almost any error message since characters like t, l, s, :, and space are common. This effectively disables error handling.
- if strings.ContainsAny(err.Error(), "tls: user canceled") {
+ if strings.Contains(err.Error(), "tls: user canceled") {
shouldIgnoreErrors = true
}
🤖 Prompt for AI Agents
In sources/agent/rapiddns-spider/rapiddns.go around lines 63 to 66, the check
uses strings.ContainsAny(err.Error(), "tls: user canceled") which is wrong
because ContainsAny tests for any single rune; change it to
strings.Contains(err.Error(), "tls: user canceled") so the code correctly
detects the full substring and sets shouldIgnoreErrors = true when the error
message contains that phrase, leaving the surrounding logic unchanged.
if strings.ContainsAny(err.Error(), "tls: user canceled") {
	shouldIgnoreErrors = true
}
strings.ContainsAny is incorrect — use strings.Contains.
Same issue as in rapiddns-spider: ContainsAny checks for individual characters, not the full substring. This will match almost any error message.
- if strings.ContainsAny(err.Error(), "tls: user canceled") {
+ if strings.Contains(err.Error(), "tls: user canceled") {
shouldIgnoreErrors = true
}
🤖 Prompt for AI Agents
In sources/agent/sitedossier-spider/sitedossier.go around lines 80 to 82, the
code uses strings.ContainsAny(err.Error(), "tls: user canceled") which checks
for any of the individual characters rather than the full substring; replace it
with strings.Contains(err.Error(), "tls: user canceled") so the check matches
the full message exactly (no other logic change required).
if !strings.ContainsAny(body.String(), "Show next 100 items") && len(sub) == 0 {
	return nil
}
strings.ContainsAny is incorrect — use strings.Contains.
Same issue: checking for the text "Show next 100 items" requires Contains, not ContainsAny.
- if !strings.ContainsAny(body.String(), "Show next 100 items") && len(sub) == 0 {
+ if !strings.Contains(body.String(), "Show next 100 items") && len(sub) == 0 {
return nil
}
🤖 Prompt for AI Agents
In sources/agent/sitedossier-spider/sitedossier.go around lines 104 to 106, the
condition uses strings.ContainsAny to check for the substring "Show next 100
items" which is incorrect; replace strings.ContainsAny(body.String(), "Show next
100 items") with strings.Contains(body.String(), "Show next 100 items") so the
code correctly tests for that full substring and keep the existing negation and
len(sub) check unchanged.
size := len(zone0Response.Data)
numberOfResults += size
total, _ := strconv.Atoi(zone0Response.Total)
if size == 0 || numberOfResults > query.Limit || len(zone0Response.Data) == 0 || numberOfResults > size || total > query.Limit {
	break
}
Pagination break condition has incorrect logic.
The condition numberOfResults > size compares the accumulated result count against the current page size (which is always ≤100), causing premature termination after the first page. Additionally, total > query.Limit would break if the API reports more total results than requested, even before retrieving them.
size := len(zone0Response.Data)
numberOfResults += size
total, _ := strconv.Atoi(zone0Response.Total)
- if size == 0 || numberOfResults > query.Limit || len(zone0Response.Data) == 0 || numberOfResults > size || total > query.Limit {
+ if size == 0 || numberOfResults >= query.Limit || numberOfResults >= total {
break
}
🤖 Prompt for AI Agents
In sources/agent/zone0/zone0.go around lines 61 to 67, the pagination break
condition is incorrect: remove the checks comparing accumulated results to the
current page size and removing the check that breaks when the API-reported total
exceeds the requested limit; instead, break when the current page has no results
(size == 0), or when the accumulated numberOfResults has reached or exceeded the
query.Limit, or when numberOfResults has reached or exceeded the parsed total
count from the API; ensure total is parsed safely (handle parse error) and use
>= comparisons to avoid off-by-one premature termination.
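A standalone sketch of the corrected stop condition, against a simulated paged API; `fetchPage` and the numbers are illustrative, not the zone0 endpoint:

```go
package main

import "fmt"

// fetchPage simulates an API that holds `total` results and returns at
// most pageSize of them per call (a stand-in for the search endpoint).
func fetchPage(total, pageSize, page int) []int {
	start := (page - 1) * pageSize
	if start >= total {
		return nil
	}
	end := start + pageSize
	if end > total {
		end = total
	}
	out := make([]int, 0, end-start)
	for i := start; i < end; i++ {
		out = append(out, i)
	}
	return out
}

func main() {
	const total, pageSize, limit = 250, 100, 1000
	numberOfResults := 0
	for page := 1; ; page++ {
		data := fetchPage(total, pageSize, page)
		size := len(data)
		numberOfResults += size
		// Stop on an empty page, once the caller's limit is reached,
		// or once every reported result has been consumed.
		if size == 0 || numberOfResults >= limit || numberOfResults >= total {
			break
		}
	}
	fmt.Println(numberOfResults) // 250: all three pages are fetched
}
```

With the original `numberOfResults > size` check, the loop would stop after page two at 200 results instead of draining all 250.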
func (agent *Agent) query(URL string, session *sources.Session, zone0Request *request, results chan sources.Result) *response {
	resp, err := agent.queryURL(session, URL, zone0Request)
	if err != nil {
		results <- sources.Result{Source: agent.Name(), Error: err}
		return nil
	}
	zone0Response := &response{}
	if err := json.NewDecoder(resp.Body).Decode(zone0Response); err != nil {
		results <- sources.Result{Source: agent.Name(), Error: err}
		return nil
	}
	if zone0Response.Msg != "success" {
		results <- sources.Result{Source: agent.Name(), Error: fmt.Errorf(zone0Response.Msg)}
		return nil
	}
Response body is never closed — resource leak.
The HTTP response body must be closed after reading to prevent resource exhaustion. The JSON decoder reads from resp.Body but it's never closed.
func (agent *Agent) query(URL string, session *sources.Session, zone0Request *request, results chan sources.Result) *response {
resp, err := agent.queryURL(session, URL, zone0Request)
if err != nil {
results <- sources.Result{Source: agent.Name(), Error: err}
return nil
}
+ defer resp.Body.Close()
zone0Response := &response{}
	if err := json.NewDecoder(resp.Body).Decode(zone0Response); err != nil {
🤖 Prompt for AI Agents
In sources/agent/zone0/zone0.go around lines 98 to 113, the HTTP response body
from agent.queryURL is decoded but never closed, leaking resources; after the
nil-check that confirms resp was returned (immediately after resp, err :=
agent.queryURL(... ) and err==nil), add a defer resp.Body.Close() so the body is
closed on all return paths (including JSON decode errors and non-success
messages); ensure you add the defer only when resp is non-nil to avoid panics.
Thank you for the PR. I notice there are many more changes than just the addition of DayDayMap. Was this intentional?
Apologies for that. The extra changes were unintentional (likely due to my local config/misoperation). I will clean up the PR to include only the DayDayMap changes and push an update shortly.
Add support for DayDayMap (每日地图) cyberspace mapping engine as a new search provider. DayDayMap is a Chinese cyberspace asset search platform.

Changes:

- Add DayDayMap agent implementation with Base64 query encoding
- Support pagination with max 10,000 results per API limit
- Add command-line flag -daydaymap / -ddm for DayDayMap queries
- Add DAYDAYMAP_API_KEY environment variable support
- Update engine list in help text and AllAgents() method
- Add gitignore entries for build artifacts

Implementation details:

- API endpoint: https://www.daydaymap.com/api/v1/raymap/search/all
- Authentication via api-key header
- Query keywords are Base64 encoded before sending
- Result parsing includes IP, port, domain, and service fields
- Full error handling and API response validation

Files changed:

- sources/agent/daydaymap/daydaymap.go (new)
- sources/agent/daydaymap/request.go (new)
- sources/agent/daydaymap/response.go (new)
- sources/keys.go (add Daydaymap field)
- sources/provider.go (add configuration support)
- uncover.go (register agent)
- runner/options.go (add CLI flags)
- .gitignore (add build artifacts)
Summary by CodeRabbit

- New Features
- Chores