why does the crawl return empty? #139

HaoGuo98 · 2025-09-24T08:28:29Z


Crawling this page (https://www.sac.net.cn/tzgg/202509/t20250919_68417.html) returns an empty result.

HaoGuo98 · 2025-09-24T08:30:26Z

dosubot[bot]
bot Sep 24, 2025

There are a few likely reasons for getting an empty result when crawling that site. WaterCrawl uses Scrapy with Playwright middleware, so it can handle JavaScript-rendered pages, but empty results can still happen if the site uses strong anti-bot measures, loads content in a way Playwright can't capture, or if there are parsing errors or network issues. The crawler is set up to ignore certain HTTP errors and uses pipelines to filter and process content, so if the page doesn't match expected patterns or is blocked, you might get nothing back. Checking the crawl logs for errors or trying to adjust Playwright/browser settings could help pinpoint the issue. You can see more about the crawler's setup and error handling in the settings.


1 reply

HaoGuo98 Sep 24, 2025
Author

curl -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36" https://www.sac.net.cn/tzgg/202508/t20250829_68327.html

I can fetch the website content with this curl command.
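For comparison outside the crawler, the same request can be reproduced with Python's standard library. This is only a rough sketch for checking whether a plain HTTP fetch from a given machine returns any content. The helper names (`build_request`, `is_empty_body`, `fetch`) are made up for illustration and are not part of WaterCrawl.

```python
import urllib.request

# Same User-Agent string and URL as the curl command above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)
URL = "https://www.sac.net.cn/tzgg/202508/t20250829_68327.html"


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Build a GET request carrying the same User-Agent header as the curl call."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def is_empty_body(body: bytes) -> bool:
    """Treat a body with no bytes, or only whitespace, as an empty result."""
    return len(body.strip()) == 0


def fetch(url: str, timeout: float = 15.0) -> tuple[int, bytes]:
    """Send the request and return (HTTP status, raw body)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.status, resp.read()
```

Calling `fetch(URL)` on the machine that runs the crawler and passing the body to `is_empty_body` shows whether that network location can reach the page at all. If curl works on one machine and this fails on the crawler host, the difference is likely the network location rather than the User-Agent.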

amirasaran · 2025-09-24T20:00:17Z

amirasaran
Sep 24, 2025
Maintainer

Hi @HaoGuo98,

I’ve checked the URL, and it’s currently blocked outside of China. To access it, you’ll need to either use a Chinese HTTP proxy or run the application on servers hosted within China.

If you’re on the WaterCrawl Cloud paid plan, you automatically have access to our Chinese proxy.

Here’s an example result generated using the Chinese proxy:
👉 https://watercrawl.dev/playground?requestId=855ec6b1-178c-43b7-b0c2-6a3292f385a6&mode=crawl
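To check independently that a proxy inside China can reach the page, a request can be routed through it with Python's standard library. This is a minimal sketch, not WaterCrawl's own proxy configuration (which is not shown in this thread). The proxy address is a hypothetical placeholder and the helper names are made up for illustration.

```python
import urllib.request

# Hypothetical placeholder: replace with a real HTTP proxy located in mainland China.
PROXY_URL = "http://proxy.example.cn:8080"
TARGET_URL = "https://www.sac.net.cn/tzgg/202509/t20250919_68417.html"


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes both HTTP and HTTPS traffic via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url: str, proxy_url: str, timeout: float = 20.0) -> bytes:
    """Fetch url through the proxy and return the raw response body."""
    opener = build_proxy_opener(proxy_url)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

With a working proxy, `fetch_via_proxy(TARGET_URL, PROXY_URL)` should return the page HTML. If it does, pointing the self-hosted crawler at the same proxy is the next thing to try.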
