Here are the answers to your questions based on the application's code.
No, your API key is not written into the code.
Yes, it is safe to make your GitHub project public.
Here is why:
- Environment Variables: The code reads the API key from a secure location called an "environment variable." This line in `services/geminiService.ts` is the key: `const apiKey = process.env.API_KEY;`. It tells the application to look for a variable named `API_KEY` that is provided by the server environment (like Vercel or your local computer), not by the code files.
- Build Configuration: The `vite.config.ts` file makes this environment variable available during development and build, but it never stores the actual key value in the code that gets published.
- How it Works on Vercel: When you deploy to Vercel, you set your `API_KEY` in the Vercel project's "Environment Variables" settings. Vercel securely injects this key into the running application. Anyone who looks at your GitHub code will only see `process.env.API_KEY`; they will never see your actual secret key.
This is the standard and secure way to handle secret keys, so you can confidently share your code publicly.
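As an illustration, a typical `vite.config.ts` for this pattern looks roughly like the sketch below. This is a hypothetical example assuming Vite's `loadEnv` and `define` options; the actual file in the project may differ.

```typescript
// vite.config.ts -- hypothetical sketch of the usual pattern.
import { defineConfig, loadEnv } from 'vite';

export default defineConfig(({ mode }) => {
  // Read API_KEY from the shell or a local .env file at build time.
  // The .env file is gitignored, so the key never enters the repo.
  const env = loadEnv(mode, process.cwd(), '');
  return {
    define: {
      // Replace `process.env.API_KEY` in the bundle with the value
      // supplied by the environment (e.g. Vercel's settings).
      'process.env.API_KEY': JSON.stringify(env.API_KEY),
    },
  };
});
```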
The analysis is performed by the Gemini AI model, guided by a very specific set of instructions (the "prompt") and a powerful tool.
The core of the analysis is the detailed prompt located in the services/geminiService.ts file. It essentially turns the AI into an expert toxicologist by giving it a multi-step process:
- Assume a Role: It first tells the AI: "You are an expert toxicologist and product safety analyst."
- Extract Text (OCR): It instructs the AI to find and read the full list of ingredients from the image you provide.
- Analyze & Classify: For each ingredient found, the AI must assign a risk level. The prompt defines what makes an ingredient High, Medium, or Low risk (e.g., High risk includes carcinogens and mutagens; Low risk includes irritants).
- Calculate a Score (Strict Rubric): This is the most important part for consistency. It forces the AI to do math instead of guessing a score. It must follow this exact formula:
- Start at 0.
- Add 25 points for each "High" risk ingredient.
- Add 10 points for each "Medium" risk ingredient.
- Add 2 points for each "Low" risk ingredient.
- Format the Output: It commands the AI to return the final answer in a specific JSON format so the app can easily display the results. It also has rules for cleaning up organ names to keep the UI looking good.
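The rubric above is simple enough to express as code. The sketch below is an illustrative TypeScript re-implementation; in the app itself the model performs this arithmetic inside the prompt, and the type and function names here are hypothetical.

```typescript
// Illustrative re-implementation of the prompt's scoring rubric.
type Risk = "High" | "Medium" | "Low";

// Point weights exactly as stated in the prompt.
const POINTS: Record<Risk, number> = { High: 25, Medium: 10, Low: 2 };

// Start at 0 and add the weight for each classified ingredient.
function generalScore(risks: Risk[]): number {
  return risks.reduce((sum, risk) => sum + POINTS[risk], 0);
}
```

For example, a product with one High and two Low risk ingredients would score 25 + 2 + 2 = 29.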
The application does not use a single, fixed toxicity database.
Instead, the prompt instructs Gemini to use its Google Search tool. You can see this enabled in the API call configuration:
```ts
config: {
  tools: [{ googleSearch: {} }],
  // ...
}
```

This means that for each ingredient it identifies, Gemini performs a live search on Google. It acts like an expert researcher, looking for scientific consensus across a wide variety of sources on the web, such as:
- Scientific studies and papers.
- Reports from regulatory agencies (like the FDA or EPA).
- Reputable health and chemical safety websites.
This approach is very powerful because it uses the most current information available on the internet, rather than relying on a database that might be outdated. To ensure the results are consistent and not random, the configuration is set to be very strict (temperature: 0, topK: 1, etc.), forcing the AI to choose the most logical and fact-based answer every time.
Based on the current code, the difference in scores comes from the fact that they are calculated in two completely different places using different scales:
General Score (for the whole product):
- Where: This happens in `services/geminiService.ts`.
- Method: It uses a weighted rubric designed to produce a percentage-like score (0-100).
- Points: High Risk = 25, Medium Risk = 10, Low Risk = 2.
- Example calculation: If a product has 2 Medium risk ingredients, the General Score is 10 + 10 = 20.

Organ Score (per affected organ):
- Where: This happens in `components/Results.tsx`.
- Method: It uses a simple count system to show intensity per organ.
- Points: High Risk = 3, Medium Risk = 2, Low Risk = 1.
- Example calculation: If one organ is affected by 1 Medium risk ingredient, its score is 2.
If a product has two Medium risk ingredients that affect different organs (e.g., one affects Lungs, one affects Nervous System):
- General Score: 10 (Medium 1) + 10 (Medium 2) = 20.
- Lungs Score: 2 (Medium 1) = 2.
- Nervous System Score: 2 (Medium 2) = 2.
The General Score is cumulative for the whole product and uses higher numbers to create a "danger level," while the Organ Score is isolated to that specific body part and uses lower numbers to indicate simple severity.
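The two systems can be sketched side by side. The snippet below is an illustrative reconstruction, not the app's actual code: the `Ingredient` shape and function name are hypothetical, while the real logic lives in `services/geminiService.ts` and `components/Results.tsx`.

```typescript
// Hypothetical sketch of both scoring systems.
type Risk = "High" | "Medium" | "Low";

interface Ingredient {
  name: string;
  risk: Risk;
  organs: string[]; // organs this ingredient affects
}

const GENERAL_POINTS: Record<Risk, number> = { High: 25, Medium: 10, Low: 2 };
const ORGAN_POINTS: Record<Risk, number> = { High: 3, Medium: 2, Low: 1 };

function scoreProduct(ingredients: Ingredient[]) {
  let general = 0;
  const organScores: Record<string, number> = {};
  for (const ing of ingredients) {
    general += GENERAL_POINTS[ing.risk]; // cumulative across the product
    for (const organ of ing.organs) {
      // isolated per organ, using the smaller 3/2/1 scale
      organScores[organ] = (organScores[organ] ?? 0) + ORGAN_POINTS[ing.risk];
    }
  }
  return { general, organScores };
}

// Worked example from the text: two Medium risk ingredients,
// one affecting the Lungs and one the Nervous System.
const result = scoreProduct([
  { name: "A", risk: "Medium", organs: ["Lungs"] },
  { name: "B", risk: "Medium", organs: ["Nervous System"] },
]);
// General Score 20; Lungs 2; Nervous System 2.
```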
To ensure the results are consistent and not random, the configuration is set to be very strict, forcing the AI to choose the most logical and fact-based answer every time.
The code fragments demonstrating this are located in the services/geminiService.ts file within the analyzeProductLabel function.
Here is the specific part of the code:
```ts
// From services/geminiService.ts
const response = await ai.models.generateContent({
  model: model,
  contents: {
    // ... (image and prompt here)
  },
  config: {
    tools: [{ googleSearch: {} }],
    // Strict parameters to ensure deterministic (same input = same output) results
    temperature: 0,
    topK: 1,
    topP: 0.1,
    seed: 42,
  },
});
```

Explanation of each parameter:
- `temperature: 0`: This is the most important parameter for consistency. A temperature of 0 instructs the model to always choose the token with the highest probability, eliminating randomness in its response.
- `topK: 1`: This tells the model to consider only the single most likely next word (token) at each step of generation. It works with `temperature: 0` to prevent any other options from being considered.
- `topP: 0.1`: This is another method of controlling randomness. While less critical when `temperature` is 0 and `topK` is 1, setting it to a low value further restricts the model's choices to a very small set of high-probability tokens.
- `seed: 42`: This provides a fixed starting point for the model's internal random number generator (even though randomness is already minimized). It ensures that any residual stochasticity would still be reproducible across identical requests.