Bug Description
If your LLM provider returns a 429 error because the quota has been exceeded, the Cheshire Cat container fails to start. During initialization the app hits the model error and immediately shuts down with the message: "Application startup failed. Exiting."
Steps to Reproduce
- Use an LLM/Embedder API key that has already exceeded its quota.
- Run `docker compose down`.
- Run `docker compose up`.
- Check the logs.
Expected Behavior
The container should start normally even if the provider returns a 429, and it should handle the error gracefully without shutting down the entire application.
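One way to achieve this (a minimal sketch only; `probe_embedder_size` and the fallback dimension are hypothetical illustrations, not part of the Cheshire Cat codebase) would be to catch provider errors around the `embed_query("hello world")` probe that the traceback below shows in `load_memory`, and fall back instead of letting the exception abort startup:

```python
# Hypothetical sketch: tolerate provider errors (e.g. a 429 / quota-exceeded)
# when probing the embedder vector size at startup, instead of letting the
# exception kill the whole application.
def probe_embedder_size(embedder, fallback_size=768):
    """Return the embedding dimension, or a configured fallback on failure."""
    try:
        return len(embedder.embed_query("hello world"))
    except Exception as exc:  # e.g. ResourceExhausted / GoogleGenerativeAIError
        print(f"Embedder probe failed ({exc}); using fallback size {fallback_size}")
        return fallback_size

# Simulate a provider that is over quota:
class OverQuotaEmbedder:
    def embed_query(self, text):
        raise RuntimeError("429 You exceeded your current quota")

print(probe_embedder_size(OverQuotaEmbedder()))  # falls back instead of crashing
```

A fallback size alone may not be enough if the memory collections must match the real embedder dimension, but at minimum the app could start in a degraded state and surface the provider error in the admin UI rather than exiting.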
Additional context
Logs:
[notice] A new release of pip is available: 25.1.1 -> 25.3
[notice] To update, run: pip install --upgrade pip
DEBUG: Import module cat.plugins.influ_b2b.tools
DEBUG: Import module cat.plugins.influ_b2b.custom-api
DEBUG: Import module cat.plugins.cat_advanced_tools.settings
DEBUG: Import module cat.plugins.cat_advanced_tools.fast_setup
DEBUG: Executing core_plugin::factory_allowed_auth_handlers with priority 0
DEBUG: Initializing WhiteRabbit...
DEBUG: WhiteRabbit: Starting scheduler
[INFO] Scheduler started
DEBUG: WhiteRabbit: Scheduler started
DEBUG: Executing core_plugin::before_cat_bootstrap with priority 0
[DEBUG] Looking for jobs to run
DEBUG: Executing core_plugin::factory_allowed_llms with priority 0
[DEBUG] No jobs; waiting until a job is added
[DEBUG] Using AsyncIOEngine.POLLER as I/O engine
DEBUG: Executing core_plugin::factory_allowed_embedders with priority 0
ERROR: Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/langchain_google_genai/embeddings.py", line 225, in embed_documents
result = self.client.batch_embed_contents(
File "/usr/local/lib/python3.10/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 1365, in batch_embed_contents
response = rpc(
File "/usr/local/lib/python3.10/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
return wrapped_func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/google/api_core/retry/retry_unary.py", line 294, in retry_wrapped_func
return retry_target(
File "/usr/local/lib/python3.10/site-packages/google/api_core/retry/retry_unary.py", line 156, in retry_target
next_sleep = _retry_error_helper(
File "/usr/local/lib/python3.10/site-packages/google/api_core/retry/retry_base.py", line 214, in _retry_error_helper
raise final_exc from source_exc
File "/usr/local/lib/python3.10/site-packages/google/api_core/retry/retry_unary.py", line 147, in retry_target
result = target()
File "/usr/local/lib/python3.10/site-packages/google/api_core/timeout.py", line 130, in func_with_timeout
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.ResourceExhausted: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit.
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0 [links {
description: "Learn more about Gemini API quotas"
url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
, violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerDayPerProjectPerModel-FreeTier"
}
violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerMinutePerProjectPerModel-FreeTier"
}
violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerMinutePerUserPerProjectPerModel-FreeTier"
}
violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerDayPerUserPerProjectPerModel-FreeTier"
}
]
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
async with self.lifespan_context(app) as maybe_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 133, in merged_lifespan
async with original_context(app) as maybe_original_state:
File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
return await anext(self.gen)
File "/app/cat/startup.py", line 42, in lifespan
app.state.ccat = CheshireCat(cheshire_cat_api)
File "/app/cat/utils.py", line 325, in getinstance
cls.instances[class_] = class_(*args, **kwargs)
File "/app/cat/looking_glass/cheshire_cat.py", line 88, in __init__
self.load_memory()
File "/app/cat/looking_glass/cheshire_cat.py", line 290, in load_memory
embedder_size = len(self.embedder.embed_query("hello world"))
File "/usr/local/lib/python3.10/site-packages/langchain_google_genai/embeddings.py", line 254, in embed_query
return self.embed_documents(
File "/usr/local/lib/python3.10/site-packages/langchain_google_genai/embeddings.py", line 229, in embed_documents
raise GoogleGenerativeAIError(f"Error embedding content: {e}") from e
langchain_google_genai._common.GoogleGenerativeAIError: Error embedding content: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit.
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0
* Quota exceeded for metric: generativelanguage.googleapis.com/embed_content_free_tier_requests, limit: 0 [links {
description: "Learn more about Gemini API quotas"
url: "https://ai.google.dev/gemini-api/docs/rate-limits"
}
, violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerDayPerProjectPerModel-FreeTier"
}
violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerMinutePerProjectPerModel-FreeTier"
}
violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerMinutePerUserPerProjectPerModel-FreeTier"
}
violations {
quota_metric: "generativelanguage.googleapis.com/embed_content_free_tier_requests"
quota_id: "EmbedContentRequestsPerDayPerUserPerProjectPerModel-FreeTier"
}
]
ERROR: Application startup failed. Exiting.