16762 components scrapecreators #18488
…ns for fetching creator profiles and searching creators, and introduce constants and utility functions for improved functionality.
Walkthrough

Adds a scrapecreators app with HTTP helpers, platform constants, pagination, and a deep-object diff utility. Introduces search and fetch actions, a reusable polling base, a New Profile Update source that diffs profiles and emits change events, and bumps package version and dependencies.
Sequence Diagram(s)

sequenceDiagram
autonumber
actor User
participant Action as Action: Fetch Creator Profile
participant App as App: scrapecreators
participant API as ScrapeCreators API
User->>Action: Run with platform, profileId
Action->>App: fetchCreatorProfile({ platform, profileId })
App->>App: parsePath(platform), parseParams(platform, profileId)
App->>API: GET /{path} with params/headers
API-->>App: Response (profile or message)
App-->>Action: Profile data
Action-->>User: Summary + result or message
Note over Action,App: Error path throws unless response.data?.success is true
sequenceDiagram
autonumber
actor User
participant Action as Action: Search Creators
participant App as App: scrapecreators
participant API as ScrapeCreators API
User->>Action: Run with platform, query, limit?
Action->>App: paginate({ fn: searchCreators, params, platform, maxResults: limit })
loop Pagination
App->>API: GET /search with cursor/params
API-->>App: Items + next cursor
App-->>Action: yield items
end
Action-->>User: Collected items + summary
sequenceDiagram
autonumber
participant Timer as Timer
participant Source as Source: New Profile Update
participant App as App: scrapecreators
participant DB as DB
participant PD as Pipedream
Timer->>Source: run/deploy
Source->>DB: read lastProfile
Source->>App: fetchCreatorProfile({ platform, profileId })
App-->>Source: currentProfile
Source->>Source: getObjectDiff(lastProfile, currentProfile)
alt Diff not empty
Source->>PD: $emit({ profile, diff }, meta)
Source->>DB: write currentProfile
else No changes
Source-->>Timer: noop
end
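
The third diagram (read the last snapshot, fetch, diff, emit only on change, then persist) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `pollProfile`, `shallowDiff`, and the `db`, `fetchProfile`, `diffFn`, and `emit` parameters are hypothetical stand-ins for the Pipedream DB service, `fetchCreatorProfile`, `getObjectDiff`, and `$emit`.

```javascript
// Sketch of the polling flow. All names here are hypothetical stand-ins:
// `db` for the source's DB service, `fetchProfile` for app.fetchCreatorProfile,
// `diffFn` for getObjectDiff, and `emit` for this.$emit.
async function pollProfile({ db, fetchProfile, diffFn, emit }) {
  const lastProfile = db.get("lastProfile") ?? {};
  const currentProfile = await fetchProfile();
  const diff = diffFn(lastProfile, currentProfile);

  // No changes: nothing to emit or persist.
  if (Object.keys(diff).length === 0) return false;

  emit({ profile: currentProfile, diff }, { ts: Date.now() });
  db.set("lastProfile", currentProfile);
  return true;
}

// Shallow diff used only to keep this sketch self-contained.
function shallowDiff(before, after) {
  const out = {};
  for (const key of new Set([...Object.keys(before), ...Object.keys(after)])) {
    if (before[key] !== after[key]) {
      out[key] = { oldValue: before[key], newValue: after[key] };
    }
  }
  return out;
}
```

With an empty store, the first poll reports every field as changed, so under these assumptions the first run emits one event.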
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 4
🧹 Nitpick comments (10)
components/scrapecreators/sources/new-profile-update/test-event.mjs (1)
48-49: Consider normalizing collection fields
`tags` is a comma-separated string. Prefer an array (e.g., `["Channel", "Channel Name", "Channel Description"]`) to avoid downstream splitting and i18n issues.

components/scrapecreators/common/utils.mjs (1)
1-46: Harden deep-diff: handle arrays/non-plain objects and avoid false negatives
Current recursion treats any `object` as nested. Non-plain objects like `Date`, `RegExp`, `Map`, `Set`, and arrays can produce incorrect diffs (e.g., two different `Date`s appear unchanged). Also, there's no cycle protection.

Refactor to:
- Recurse only into plain objects.
- Compare arrays element-wise (or treat as modified if length/ordering differences are unacceptable).
- Treat non-plain objects as value types (strict equality or `.valueOf()` for `Date`).
- Add a visited set to prevent infinite recursion.

Patch sketch:
-export function getObjectDiff(obj1, obj2) {
-  const diff = {};
+export function getObjectDiff(obj1, obj2, _seen = new WeakSet()) {
+  const diff = {};
+  const isPlainObject = (v) => v && typeof v === "object" && v.constructor === Object;
+  const isArray = Array.isArray;
+  const isNonPlainObject = (v) =>
+    v && typeof v === "object" && !isPlainObject(v) && !isArray(v);
+
+  // Cycle guard
+  if (obj1 && typeof obj1 === "object") {
+    if (_seen.has(obj1)) return diff;
+    _seen.add(obj1);
+  }
   // Check for differences in obj1's properties
-  for (const key in obj1) {
-    if (Object.prototype.hasOwnProperty.call(obj1, key)) {
+  for (const key of Object.keys(obj1 ?? {})) {
+    if (Object.prototype.hasOwnProperty.call(obj1, key)) {
       if (!Object.prototype.hasOwnProperty.call(obj2, key)) {
         diff[key] = {
           oldValue: obj1[key],
           newValue: undefined,
           status: "deleted",
         };
-      } else if (typeof obj1[key] === "object" && obj1[key] !== null &&
-        typeof obj2[key] === "object" && obj2[key] !== null) {
-        const nestedDiff = getObjectDiff(obj1[key], obj2[key]);
+      } else if (isArray(obj1[key]) && isArray(obj2[key])) {
+        // Array diff: element-wise
+        const a1 = obj1[key], a2 = obj2[key];
+        const max = Math.max(a1.length, a2.length);
+        const arrDiff = {};
+        for (let i = 0; i < max; i++) {
+          if (i in a1 && !(i in a2)) {
+            arrDiff[i] = { oldValue: a1[i], newValue: undefined, status: "deleted" };
+          } else if (!(i in a1) && i in a2) {
+            arrDiff[i] = { oldValue: undefined, newValue: a2[i], status: "added" };
+          } else if (i in a1 && i in a2) {
+            if (isPlainObject(a1[i]) && isPlainObject(a2[i])) {
+              const nd = getObjectDiff(a1[i], a2[i], _seen);
+              if (Object.keys(nd).length) arrDiff[i] = { status: "modified", changes: nd };
+            } else if (a1[i] !== a2[i]) {
+              arrDiff[i] = { oldValue: a1[i], newValue: a2[i], status: "modified" };
+            }
+          }
+        }
+        if (Object.keys(arrDiff).length) {
+          diff[key] = { status: "modified", changes: arrDiff };
+        }
+      } else if (isPlainObject(obj1[key]) && isPlainObject(obj2[key])) {
+        const nestedDiff = getObjectDiff(obj1[key], obj2[key], _seen);
         if (Object.keys(nestedDiff).length > 0) {
           diff[key] = {
             status: "modified",
             changes: nestedDiff,
           };
         }
+      } else if (isNonPlainObject(obj1[key]) || isNonPlainObject(obj2[key])) {
+        // Compare non-plain objects as values (e.g., Date.valueOf())
+        const v1 = obj1[key] instanceof Date ? obj1[key].valueOf() : obj1[key];
+        const v2 = obj2[key] instanceof Date ? obj2[key].valueOf() : obj2[key];
+        if (v1 !== v2) {
+          diff[key] = { oldValue: obj1[key], newValue: obj2[key], status: "modified" };
+        }
       } else if (obj1[key] !== obj2[key]) {
         diff[key] = {
           oldValue: obj1[key],
           newValue: obj2[key],
           status: "modified",
         };
       }
     }
   }
   // Check for properties added in obj2
-  for (const key in obj2) {
-    if (Object.prototype.hasOwnProperty.call(obj2, key)) {
+  for (const key of Object.keys(obj2 ?? {})) {
+    if (Object.prototype.hasOwnProperty.call(obj2, key)) {
       if (!Object.prototype.hasOwnProperty.call(obj1, key)) {
         diff[key] = {
           oldValue: undefined,
           newValue: obj2[key],
           status: "added",
         };
       }
     }
   }
   return diff;
 }

Please confirm whether profile snapshots can contain Dates, Maps/Sets, or specialized objects. If they are strictly JSON (primitives/arrays/plain objects), we can simplify the array handling and skip non-plain types.
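
If the snapshots do turn out to be strictly JSON, the simplification mentioned above might look something like the sketch below. It is an illustration under that assumption, not the PR's implementation: `getJsonDiff`, `hasOwn`, and `isContainer` are hypothetical names, and the output shape (`oldValue`, `newValue`, `status`, `changes`) mirrors the patch sketch. Because JSON data cannot contain cycles, no visited set is used.

```javascript
// Sketch: deep diff for JSON-shaped data only (primitives, arrays, plain objects).
// Assumes inputs came from JSON.parse or equivalent, so there are no Dates,
// Maps, Sets, or cycles to worry about.
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
const isContainer = (v) => v !== null && typeof v === "object";

function getJsonDiff(before, after) {
  const a = before ?? {};
  const b = after ?? {};
  const diff = {};

  // Arrays are walked by index, since Object.keys() returns their indices.
  for (const key of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const oldValue = a[key];
    const newValue = b[key];

    if (!hasOwn(b, key)) {
      diff[key] = { oldValue, newValue: undefined, status: "deleted" };
    } else if (!hasOwn(a, key)) {
      diff[key] = { oldValue: undefined, newValue, status: "added" };
    } else if (
      isContainer(oldValue) && isContainer(newValue)
      && Array.isArray(oldValue) === Array.isArray(newValue)
    ) {
      // Same container kind on both sides: recurse and keep only real changes.
      const changes = getJsonDiff(oldValue, newValue);
      if (Object.keys(changes).length > 0) {
        diff[key] = { status: "modified", changes };
      }
    } else if (oldValue !== newValue) {
      diff[key] = { oldValue, newValue, status: "modified" };
    }
  }
  return diff;
}
```

Treating an array-versus-object mismatch as a plain "modified" value is a design choice made for this sketch; it avoids reporting index-level noise when a field changes type.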
components/scrapecreators/common/constants.mjs (1)
1-43: Stabilize and future-proof platform lists
- Dedupe `PLATFORMS` to avoid duplicates as groups evolve.
- Freeze exports to prevent accidental mutation at runtime.
- Optional: rename `"user/boards"` to `userBoards` to avoid slash-key edge cases.

Example:
-export const PATH_PLATFORMS = {
+export const PATH_PLATFORMS = Object.freeze({
-  "user/boards": [
+  // Consider: userBoards
+  "user/boards": [
     "pinterest",
   ],
   ...
-};
+});

-export const URL_PLATFORMS = [
+export const URL_PLATFORMS = Object.freeze([
   "linkedin",
   "facebook",
   ...PATH_PLATFORMS["empty"],
-];
+]);

-export const SEARCH_PLATFORMS = [
+export const SEARCH_PLATFORMS = Object.freeze([
   "tiktok",
   "threads",
-];
+]);

-export const HANDLE_PLATFORMS = [
+export const HANDLE_PLATFORMS = Object.freeze([
   "instagram",
   "twitter",
   "truthsocial",
   "bluesky",
   "twitch",
   "snapchat",
   ...SEARCH_PLATFORMS,
   ...PATH_PLATFORMS["user/boards"],
   ...PATH_PLATFORMS["channel"],
-];
+]);

-export const PLATFORMS = [
-  ...URL_PLATFORMS,
-  ...HANDLE_PLATFORMS,
-];
+export const PLATFORMS = Object.freeze(
+  Array.from(new Set([
+    ...URL_PLATFORMS,
+    ...HANDLE_PLATFORMS,
+  ]))
+);

components/scrapecreators/sources/common/base.mjs (1)
41-47: Use `Date.now()` and emit string ids
Minor polish:
- `Date.parse(new Date())` is wasteful; use `Date.now()`.
- Normalize event `id` to string to avoid numeric id collisions across sources.

-    this.$emit(item, {
-      id: item[fieldId],
-      summary: this.getSummary(item),
-      ts: Date.parse(new Date()),
-    });
+    this.$emit(item, {
+      id: String(item[fieldId]),
+      summary: this.getSummary(item),
+      ts: Date.now(),
+    });

components/scrapecreators/actions/search-creators/search-creators.mjs (2)
12-18: Avoid diverging platform options here (potential mismatch with app-level PLATFORMS).
This action overrides the app's platform propDefinition options with SEARCH_PLATFORMS (currently ["tiktok","threads"]). That may exclude platforms users expect (e.g., YouTube/Instagram per issue), or fall out of sync with app-level PLATFORMS.

Option A: Remove the options override and rely on app's PLATFORMS.
     platform: {
       propDefinition: [
         app,
         "platform",
       ],
-      options: SEARCH_PLATFORMS,
     },

Option B: If search truly supports a subset, update SEARCH_PLATFORMS in common/constants.mjs to the intended, documented set and add a short note in the action's description clarifying supported platforms. Based on PR objectives.
47-48: More helpful summary (include platform and count).
Small UX win: show platform and number of results.
-    $.export("$summary", `Successfully searched for **${this.query}**`);
+    $.export("$summary", `Found ${data.length} creator(s) on ${this.platform} for "${this.query}"`);

components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs (1)
25-35: Tighten the summary.
Include platform to aid debugging across multi-platform workflows.
-    const summary = `Successfully fetched creator profile for **${this.profileId}**`;
+    const summary = `Fetched ${this.platform} profile: ${this.profileId}`;

components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1)
59-62: Use `Date.now()` directly for timestamps.
`Date.parse(new Date())` yields nearly the same value (the round-trip through a string drops the milliseconds) but is noisier. Prefer `Date.now()`.
-        ts: Date.parse(new Date()),
+        ts: Date.now(),

components/scrapecreators/scrapecreators.app.mjs (2)
71-84: Simplify `parsePath` predicate (return boolean).
The current `.find()` callback returns a truthy string (the key) as the predicate result. It works, but is non-idiomatic. Return a boolean for clarity.
-    parsePath(platform) {
-      const path = Object.entries(PATH_PLATFORMS).find(([
-        key,
-        value,
-      ]) => value.includes(platform)
-        ? key
-        : null);
-
-      return path
-        ? path[0] === "empty"
-          ? ""
-          : `/${path[0]}`
-        : "/profile";
-    },
+    parsePath(platform) {
+      const entry = Object.entries(PATH_PLATFORMS)
+        .find(([, value]) => value.includes(platform));
+      return entry
+        ? (entry[0] === "empty" ? "" : `/${entry[0]}`)
+        : "/profile";
+    },
86-113: Do not mutate the caller's `params` in `paginate`; add a default for `users`.
Avoid side effects across iterations/callers and guard against missing arrays.
-    async *paginate({
-      fn, params = {}, platform, maxResults = null, ...opts
-    }) {
-      let hasMore = false;
-      let count = 0;
-      let newCursor;
-
-      do {
-        params.cursor = newCursor;
-        const {
-          cursor,
-          users,
-        } = await fn({
-          platform,
-          params,
-          ...opts,
-        });
+    async *paginate({ fn, params = {}, platform, maxResults = null, ...opts }) {
+      let hasMore = false;
+      let count = 0;
+      let newCursor;
+      const baseParams = { ...params };
+
+      do {
+        const reqParams = { ...baseParams, cursor: newCursor };
+        const { cursor, users = [] } = await fn({
+          platform,
+          params: reqParams,
+          ...opts,
+        });
         for (const d of users) {
           yield d;
           if (maxResults && ++count === maxResults) {
             return count;
           }
         }
         newCursor = cursor;
-        hasMore = users.length;
+        hasMore = users.length;
       } while (hasMore);
     },
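
The non-mutating variant can be exercised in isolation with a fake page source, as in the sketch below. `fakeSearch`, `PAGES`, and `collect` are hypothetical helpers, and the response shape (`users`, `cursor`) is taken from the patch above. The extra stop when no next cursor is returned is an addition made for this sketch, not part of the suggested patch.

```javascript
// Sketch of a cursor-based async generator that never mutates the caller's params.
async function* paginate({ fn, params = {}, platform, maxResults = null, ...opts }) {
  const baseParams = { ...params }; // private copy of the caller's params
  let cursor;
  let count = 0;

  while (true) {
    const { cursor: nextCursor, users = [] } = await fn({
      platform,
      params: { ...baseParams, cursor },
      ...opts,
    });

    for (const user of users) {
      yield user;
      if (maxResults && ++count === maxResults) return;
    }

    // Stop on an empty page, or when there is no cursor to continue from
    // (the second condition is this sketch's own guard against refetching page one).
    if (users.length === 0 || !nextCursor) return;
    cursor = nextCursor;
  }
}

// Hypothetical in-memory pages standing in for the search endpoint.
const PAGES = {
  start: { users: ["a", "b"], cursor: "p2" },
  p2: { users: ["c"], cursor: null },
};
const fakeSearch = async ({ params }) => PAGES[params.cursor ?? "start"];

// Drains an async iterable into an array.
async function collect(iterable) {
  const items = [];
  for await (const item of iterable) items.push(item);
  return items;
}
```

Calling `collect(paginate({ fn: fakeSearch, params: { query: "x" }, platform: "tiktok" }))` walks both fake pages and leaves the `params` object passed in without a `cursor` key.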
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (9)
- components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs (1 hunks)
- components/scrapecreators/actions/search-creators/search-creators.mjs (1 hunks)
- components/scrapecreators/common/constants.mjs (1 hunks)
- components/scrapecreators/common/utils.mjs (1 hunks)
- components/scrapecreators/package.json (2 hunks)
- components/scrapecreators/scrapecreators.app.mjs (1 hunks)
- components/scrapecreators/sources/common/base.mjs (1 hunks)
- components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1 hunks)
- components/scrapecreators/sources/new-profile-update/test-event.mjs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-12-12T19:23:09.039Z
Learnt from: jcortes
PR: PipedreamHQ/pipedream#14935
File: components/sailpoint/package.json:15-18
Timestamp: 2024-12-12T19:23:09.039Z
Learning: When developing Pipedream components, do not add built-in Node.js modules like `fs` to `package.json` dependencies, as they are native modules provided by the Node.js runtime.
Applied to files:
components/scrapecreators/package.json
🧬 Code graph analysis (6)
components/scrapecreators/common/utils.mjs (1)
components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1)
diff(51-51)
components/scrapecreators/actions/search-creators/search-creators.mjs (1)
components/scrapecreators/common/constants.mjs (2)
SEARCH_PLATFORMS (22-25)
components/scrapecreators/scrapecreators.app.mjs (1)
components/scrapecreators/common/constants.mjs (6)
PLATFORMS (39-42), URL_PLATFORMS (16-20), PATH_PLATFORMS (1-14)
components/scrapecreators/actions/fetch-creator-profile/fetch-creator-profile.mjs (1)
components/scrapecreators/sources/common/base.mjs (1)
response(26-26)
components/scrapecreators/sources/new-profile-update/new-profile-update.mjs (1)
components/scrapecreators/common/utils.mjs (2)
diff (2-2), getObjectDiff (1-46)
components/scrapecreators/sources/common/base.mjs (1)
components/scrapecreators/scrapecreators.app.mjs (1)
fn(94-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Ensure component commits modify component versions
- GitHub Check: Verify TypeScript components
- GitHub Check: Publish TypeScript components
- GitHub Check: pnpm publish
- GitHub Check: Lint Code Base
🔇 Additional comments (3)
components/scrapecreators/package.json (1)
15-17: @pipedream/platform version up-to-date
- The dependency version `^3.1.0` matches the latest release, so no update is needed.
- Optional: add an `"engines": { "node": ">=18" }` field to align with the Pipedream runtime.

components/scrapecreators/sources/common/base.mjs (1)
26-27: Verify and handle all possible `fn()` return shapes in base.mjs
In components/scrapecreators/sources/common/base.mjs (lines 25–27), I couldn't locate any `getFunction()` implementations. Please confirm what shape(s) `fn()` returns and update `emitEvent` to handle `{ value }`, `{ users }`, plain arrays, etc., with a clear fallback and an `Array.isArray(response)` check before iterating.

components/scrapecreators/scrapecreators.app.mjs (1)
26-30: Optional: include an `Accept: application/json` header.
The component correctly exposes `this.$auth.api_key`, so no rename is needed.
…ors.mjs Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Resolves #16762