feat(tools): add social_platforms tool for Apify social scraping #1

Open
protoss70 wants to merge 546 commits into main-synced from feat/apify-social-scrapers

@protoss70 (Collaborator) commented Feb 16, 2026

What

Adds TikTok, Instagram, YouTube, and LinkedIn scrapers as a built-in tool for the OpenClaw agent.

How

Created a new tool called social-platforms. Once you provide your Apify token, it can start scraping these websites.

How to test

Testing this is tricky because you need a VPS with Docker installed. Once you have one, follow these steps to start OpenClaw:

  1. Clone this repository and check out the feat/apify-social-scrapers branch.
  2. Change into the directory: cd openclaw
  3. Put the following tokens into the .env file:
cp .env.example .env
nano .env
  • APIFY_TOKEN (at the bottom of the file; make sure to uncomment it)
  • A model provider token (I use Anthropic). Smarter models work better for testing.
  4. Build the Docker image: docker build -t openclaw .
  5. Run the container with the following command (NOTE: it is important to keep the restart policy):
sudo docker run -d \
  --name openclaw-gateway \
  --restart unless-stopped \
  --env-file .env \
  -p 3000:3000 \
  openclaw \
  node openclaw.mjs gateway --allow-unconfigured --bind lan
  6. Configure the OpenClaw models: choose model -> select your provider -> make sure a model is selected. OpenClaw will restart once you do this, so wait about 5 seconds.
sudo docker exec -it openclaw-gateway node openclaw.mjs configure
  7. Run the following command to start chatting with your OpenClaw agent. You can ask it to list all the built-in tools it has. If the social scraper is not showing up, there is an issue with your setup. If it is showing up, you can start prompting.
sudo docker exec -it openclaw-gateway node openclaw.mjs tui

Pipeline

The pipeline fails here because we don't have all the repository secrets, but you can check out the actual PR against the openclaw repo here: openclaw#19421

@protoss70 self-assigned this Feb 16, 2026
@github-actions

⚠️ Formal models conformance drift detected

The formal models extracted constants (generated/*) do not match this openclaw PR.

This check is informational (not blocking merges yet).
See the formal-models-conformance-drift artifact for the diff.

If this change is intentional, follow up by updating the formal models repo or regenerating the extracted artifacts there.

6 similar comments

@protoss70 force-pushed the feat/apify-social-scrapers branch from 691b5ea to f31d5ad February 17, 2026 20:02
@protoss70 changed the base branch from main to main-synced February 17, 2026 20:20
@protoss70 marked this pull request as ready for review February 17, 2026 21:01
@protoss70 requested review from MQ37 and drobnikj February 17, 2026 21:01
steipete and others added 30 commits February 19, 2026 10:04
…w#20828)

Co-authored-by: mbelinky <mbelinky@users.noreply.github.com>
…penclaw#12897)

* fix(otel): complete diagnostics-otel OpenTelemetry v2 API migration

* chore(format): align otel files with updated oxfmt config

* chore(format): apply updated oxfmt spacing to otel diagnostics
* Docs: backfill changelog entries

* Docs: mark PR 20836 as merged in changelog
* fix(an-03): apply security fix

Generated by staged fix workflow.

* fix(an-03): apply security fix

Generated by staged fix workflow.

* fix(an-03): remove stale test-link artifact from patch

Remove accidental a2ui test-link artifact from the tracked diff and keep startup auth enforcement centralized in startup-auth.ts.
…addresses (openclaw#20803)

* fix(security): block plaintext WebSocket connections to non-loopback addresses

Addresses CWE-319 (Cleartext Transmission of Sensitive Information).

Previously, ws:// connections to remote hosts were allowed, exposing
both credentials and chat data to network interception. This change
blocks ALL plaintext ws:// connections to non-loopback addresses,
regardless of whether explicit credentials are configured (device
tokens may be loaded dynamically).

Security policy:
- wss:// allowed to any host
- ws:// allowed only to loopback (127.x.x.x, localhost, ::1)
- ws:// to LAN/tailnet/remote hosts now requires TLS

Changes:
- Add isSecureWebSocketUrl() validation in net.ts
- Block insecure connections in GatewayClient.start()
- Block insecure URLs in buildGatewayConnectionDetails()
- Handle malformed URLs gracefully without crashing
- Update tests to use wss:// for non-loopback URLs

Fixes openclaw#12519
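
The security policy listed above can be illustrated with a minimal sketch. This is not the actual isSecureWebSocketUrl() from net.ts; the body and the isLoopbackHost helper are assumptions written only from the three policy bullets and the note about malformed URLs.

```typescript
// Hypothetical helper (not from the PR): loopback means localhost, ::1, or 127.x.x.x.
function isLoopbackHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  // The WHATWG URL parser keeps the brackets around IPv6 literals, e.g. "[::1]".
  if (host === "localhost" || host === "[::1]") return true;
  // Any dotted-quad address in 127.0.0.0/8 counts as loopback.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host);
}

// Sketch of the described policy; the real implementation may differ.
function isSecureWebSocketUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    // Malformed URL: treat it as insecure instead of crashing.
    return false;
  }
  if (url.protocol === "wss:") return true; // TLS is allowed to any host
  if (url.protocol === "ws:") return isLoopbackHost(url.hostname); // plaintext only to loopback
  return false; // any other scheme is not a WebSocket URL
}
```

Under these assumptions, a LAN address such as ws://192.168.1.5:3000 is rejected, while ws://127.0.0.1:3000 and any wss:// URL are accepted.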

* fix(test): update gateway-chat mock to preserve net.js exports

Use importOriginal to spread actual module exports and mock only
the functions needed for testing. This ensures isSecureWebSocketUrl
and other exports remain available to the code under test.
…w#20857)

YAML 1.1 default schema silently coerces values like "on" to true and
"off" to false, which can cause unexpected behavior in frontmatter
parsing. Explicitly set schema: "core" to use YAML 1.2 rules that
only recognize true/false/null literals.
…leak (openclaw#20856)

The previous implementation returned early when buffer lengths differed,
leaking the expected secret's length via timing side-channel. Hashing both
inputs with SHA-256 before comparison ensures fixed-length buffers and
constant-time comparison regardless of input lengths.
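
A minimal sketch of the hash-then-compare approach described above, using Node's built-in crypto module. The function name safeEqualSecret is hypothetical and is not taken from the patch.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper name; the actual function in the patch may be named differently.
function safeEqualSecret(provided: string, expected: string): boolean {
  // Hashing first gives two 32-byte buffers no matter how long the inputs are,
  // so there is no early return that would reveal the expected secret's length.
  const providedDigest = createHash("sha256").update(provided, "utf8").digest();
  const expectedDigest = createHash("sha256").update(expected, "utf8").digest();
  // timingSafeEqual requires equal-length buffers, which the digests guarantee.
  return timingSafeEqual(providedDigest, expectedDigest);
}
```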
…enclaw#20854)

Command text displayed in Discord exec-approval embeds was not sanitized,
allowing crafted commands containing backticks to break out of the markdown
code block and inject arbitrary Discord formatting. This fix inserts a
zero-width space before each backtick to neutralize markdown injection.
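
The backtick neutralization described above could look roughly like the following. The function name is an assumption; only the technique (a zero-width space before each backtick) comes from the commit message.

```typescript
const ZERO_WIDTH_SPACE = "\u200b";

// Hypothetical name; inserts U+200B before every backtick so that the text
// can no longer contain a run of backticks that closes a markdown code block.
function sanitizeForCodeBlock(command: string): string {
  return command.replace(/`/g, ZERO_WIDTH_SPACE + "`");
}
```

With this sketch, an input containing three consecutive backticks no longer contains that sequence after sanitizing, because each backtick is separated from the next by an invisible character.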
…enclaw#20655)

Replace execSync (which spawns a shell) with execFileSync (which
invokes the binary directly with an argv array). This eliminates
command injection risk from interpolated arguments.

Co-authored-by: sirishacyd <sirishacyd@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
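
To illustrate the execSync to execFileSync change above, here is a small sketch. The echoArgument function is hypothetical and only demonstrates that an argv array is passed to the binary verbatim, with no shell to interpret metacharacters.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical demo function: runs the current Node binary and has it print
// its first extra argument. No shell is involved, so characters such as
// ";", "$()" or backticks in the argument are never interpreted.
function echoArgument(untrusted: string): string {
  return execFileSync(
    process.execPath,
    ["-e", "process.stdout.write(process.argv[1])", untrusted],
    { encoding: "utf8" },
  );
}
```

With execSync and string interpolation, the same input would have been handed to a shell for parsing; here it arrives in the child process as one literal argument.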
Replace Math.random() with crypto.randomBytes() for generating
temporary file names. Math.random() is predictable and can enable
TOCTOU race conditions. Also set mode 0o600 on TTS temp files.

Co-authored-by: sirishacyd <sirishacyd@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
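
A sketch of the temp-file naming change above, assuming Node's crypto and fs modules. The names randomTempName and writePrivateTempFile are hypothetical, as is the use of the "wx" flag, which the commit message does not mention.

```typescript
import { randomBytes } from "node:crypto";
import { writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: 16 bytes from the OS CSPRNG, hex-encoded (32 characters),
// instead of a guessable Math.random()-based suffix.
function randomTempName(prefix: string, extension: string): string {
  return `${prefix}-${randomBytes(16).toString("hex")}${extension}`;
}

// Hypothetical helper: writes the data to a new file readable and writable
// only by the owner (mode 0o600). The "wx" flag is an extra assumption here:
// it makes the write fail if the path already exists.
function writePrivateTempFile(prefix: string, extension: string, data: string): string {
  const filePath = join(tmpdir(), randomTempName(prefix, extension));
  writeFileSync(filePath, data, { mode: 0o600, flag: "wx" });
  return filePath;
}
```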
…enclaw#10526)

* security: add baseline security headers to gateway HTTP responses

All responses from the gateway HTTP server now include
X-Content-Type-Options: nosniff and Referrer-Policy: no-referrer.

These headers are applied early in handleRequest, before any
handler runs, ensuring coverage for every response including
error pages and 404s.

Headers that restrict framing (X-Frame-Options, CSP
frame-ancestors) are intentionally omitted at this global level
because the canvas host and A2UI handlers serve content that may
be loaded inside frames.

* fix: apply security headers before WebSocket upgrade check

Move setDefaultSecurityHeaders() above the WebSocket early-return so
the headers are set on every HTTP response path including upgrades.
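
As a rough illustration of the baseline headers described above: the function name setDefaultSecurityHeaders appears in the commit message, but its body here, the HeaderSink interface, and the RecordingResponse stand-in are assumptions made for this sketch.

```typescript
// Minimal structural type so the sketch does not depend on a specific HTTP library.
interface HeaderSink {
  setHeader(name: string, value: string): void;
}

// Sketch only: sets the two headers named in the commit message. Framing
// headers (X-Frame-Options, CSP frame-ancestors) are deliberately left out.
function setDefaultSecurityHeaders(res: HeaderSink): void {
  res.setHeader("X-Content-Type-Options", "nosniff");
  res.setHeader("Referrer-Policy", "no-referrer");
}

// Hypothetical in-memory stand-in for a response object, used for illustration.
class RecordingResponse implements HeaderSink {
  readonly headers = new Map<string, string>();

  setHeader(name: string, value: string): void {
    // Header names are case-insensitive, so store them lowercased.
    this.headers.set(name.toLowerCase(), value);
  }
}
```

Calling the function at the very top of the request handler, before any routing or upgrade check, is what ensures that error pages and 404s carry the headers too.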

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
…nclaw#20874)

Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: de69f81
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
…law#16941)

* fix(matrix): detect mentions in formatted_body matrix.to links

Many Matrix clients (including Element) send mentions using HTML links
in formatted_body instead of or in addition to the m.mentions field:

```json
{
  "formatted_body": "<a href=\"https://matrix.to/#/@bot:matrix.org\">Bot</a>: hello",
  "m.mentions": null
}
```

This change adds detection for matrix.to links in formatted_body,
supporting both plain and URL-encoded user IDs.

Changes:
- Add checkFormattedBodyMention() helper function
- Check formatted_body in resolveMentions()
- Add comprehensive test coverage

Fixes openclaw#6982
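
The detection described above might be sketched as follows. The name checkFormattedBodyMention comes from the commit message, but the signature and the regular expression are assumptions; the real helper in mentions.ts may work differently.

```typescript
// Sketch: returns true when formatted_body contains a matrix.to link whose
// target is the given user ID, in either plain or URL-encoded form.
function checkFormattedBodyMention(formattedBody: string | undefined, userId: string): boolean {
  if (!formattedBody) return false;
  // Capture everything after "https://matrix.to/#/" up to the closing quote or a "?".
  const hrefPattern = /href=["']https:\/\/matrix\.to\/#\/([^"'?]+)/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(formattedBody)) !== null) {
    let target = match[1];
    try {
      // Handles encoded IDs such as %40bot%3Amatrix.org.
      target = decodeURIComponent(target);
    } catch {
      // Invalid percent-encoding: fall back to comparing the raw text.
    }
    if (target === userId) return true;
  }
  return false;
}
```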

* Update extensions/matrix/src/matrix/monitor/mentions.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: zerone0x <zerone0x@users.noreply.github.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…penclaw#17094)

_clawdock_compose() only passed -f docker-compose.yml, ignoring the
extra compose file that docker-setup.sh generates for persistent home
volumes and custom mounts. This broke all clawdock-* commands for
setups using OPENCLAW_HOME_VOLUME.

Fixes openclaw#17083

Co-authored-by: Claude <noreply@anthropic.com>
…20853)

Reject __proto__, prototype, and constructor keys during deep-merge
to prevent prototype pollution when merging untrusted config objects.
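
A minimal sketch of a deep-merge that guards against these keys. The names are hypothetical, and this version silently skips the blocked keys; whether the actual patch skips them or throws is not stated above.

```typescript
type PlainObject = Record<string, unknown>;

const BLOCKED_KEYS = new Set(["__proto__", "prototype", "constructor"]);

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: merges source into target recursively and ignores any
// key that could reach Object.prototype through the merge.
function safeDeepMerge(target: PlainObject, source: PlainObject): PlainObject {
  for (const key of Object.keys(source)) {
    if (BLOCKED_KEYS.has(key)) continue; // drop dangerous keys at every depth
    const incoming = source[key];
    const existing = target[key];
    if (isPlainObject(incoming) && isPlainObject(existing)) {
      safeDeepMerge(existing, incoming);
    } else if (isPlainObject(incoming)) {
      // Copy nested objects so blocked keys inside them are filtered too.
      target[key] = safeDeepMerge({}, incoming);
    } else {
      target[key] = incoming;
    }
  }
  return target;
}
```

The guard matters for input such as JSON.parse('{"__proto__": {"polluted": true}}'), where "__proto__" is an ordinary own property that a naive recursive merge would follow into Object.prototype.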