[Bug]: Token count discrepancy: Local LLMs process filtered HTML while cloud LLMs process unfiltered HTML for raw HTML input #1499

@shawarr

Description

crawl4ai version

0.7.4

Expected Behavior

Bug Description
When processing raw HTML content (not URLs), there is a significant discrepancy in token usage between local LLMs (Ollama) and cloud LLMs (Groq, DeepSeek). Local LLMs appear to receive the filtered/compressed version of the HTML, while cloud LLMs receive the full unfiltered version.

Expected Behavior
Both local and cloud LLMs should process the same filtered version of HTML content when using input_format="fit_markdown" and content filtering configurations.

Current Behavior

  • Local LLM (Ollama): ~4k tokens for the same HTML content
  • Cloud LLMs (Groq/DeepSeek): 70-80k tokens for the same HTML content
  • Note: This issue only occurs with raw HTML input. URL crawling works consistently across all LLM providers.
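One way to make the discrepancy visible independently of any provider is to log an approximate token count for both the raw HTML and the filtered markdown before extraction runs. The sketch below uses the rough chars/4 heuristic; `estimate_tokens` and `compare_inputs` are illustrative helpers, not part of crawl4ai:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English-ish text.
    return max(1, len(text) // 4)

def compare_inputs(raw_html: str, fit_markdown: str) -> dict:
    # Log both counts before the LLM call so the gap is visible
    # regardless of which provider is configured.
    counts = {
        "raw_html_tokens": estimate_tokens(raw_html),
        "fit_markdown_tokens": estimate_tokens(fit_markdown),
    }
    counts["ratio"] = counts["raw_html_tokens"] / counts["fit_markdown_tokens"]
    return counts
```

If both providers were receiving `fit_markdown`, the observed usage should track `fit_markdown_tokens` (~4k here) in both cases; the cloud providers' 70-80k usage instead tracks the raw HTML count.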

Is this reproducible?

Yes

Inputs Causing the Bug

Steps to Reproduce

Code snippets

import asyncio

from crawl4ai import (
    AsyncWebCrawler,
    CacheMode,
    CrawlerRunConfig,
    DefaultMarkdownGenerator,
    LLMConfig,
    LLMExtractionStrategy,
    PruningContentFilter,
)

async def run_crawler_for_text(raw_html: str):
    try:
        extraction_strategy = LLMExtractionStrategy(
            llm_config=LLMConfig(
                provider="ollama/llama3.1:8b",
                base_url="http://localhost:11434/",
                # Swap in a cloud provider to reproduce the discrepancy:
                # provider="openai/deepseek-chat",
                # base_url="https://api.deepseek.com/v1",
                # api_token="xyz",
                temperature=0.0,
                top_p=0.0,
            ),
            instruction="""
instruction for llm
""",
            # LLM config options
            extraction_type="schema",
            extra_args={"temperature": 0.0},
            verbose=True,
            input_format="fit_markdown",
            apply_chunking=True,
            chunk_token_threshold=3200,
            force_json_response=True,
            overlap_rate=0.3,
        )
        # Filter config
        pruning_filter = PruningContentFilter(threshold_type="fixed", threshold=0.05)
        run_config = CrawlerRunConfig(
            extraction_strategy=extraction_strategy,
            markdown_generator=DefaultMarkdownGenerator(
                content_filter=pruning_filter,
                content_source="cleaned_html",
                options={"ignore_links": True, "ignore_images": True},
            ),
            exclude_external_links=False,
            exclude_all_images=False,
            exclude_social_media_links=True,
            exclude_external_images=False,
            verbose=True,
            cache_mode=CacheMode.BYPASS,
        )
        async with AsyncWebCrawler() as crawler:
            # The "raw:" prefix feeds the HTML string directly instead of fetching a URL.
            result = await crawler.arun(url="raw:" + raw_html, config=run_config)
        return result
    except Exception as exc:
        print(f"Crawl failed: {exc}")
        raise

OS

macOS

Python version

3.13.5

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response
