hypothesis testing for custom data #8

@LYudia

I'm currently exploring how to use POPPER with a custom CSV dataset for hypothesis testing, as mentioned in the README under "Run on your own hypothesis and database". However, I'm encountering an issue where the system automatically tries to download popper_data_processed.tar.gz even when I specify a local directory and loader_type='custom'.

Here is my demo.py script:

from popper import Popper
import os
# Set API key
os.environ['ANTHROPIC_API_KEY'] = 'YOUR_API_KEY'
# Initialize Popper agent
agent = Popper(llm="claude-3-5-sonnet-20240620")
# Register custom dataset
agent.register_data(data_path='.', loader_type='custom')
# Configure agent
agent.configure(
    alpha=0.1,
    max_num_of_tests=5,
    max_retry=3,
    time_limit=2,
    aggregate_test='E-value',
    relevance_checker=True,
    use_react_agent=True
)
# Launch UI
agent.launch_UI()
# hypothesis = "x1 is a valid shadow variable"
# results = agent.validate(hypothesis=hypothesis)
# print(results)

Despite setting loader_type='custom' and pointing data_path to the current directory (.), the program still attempts to download popper_data_processed.tar.gz. I expected that using loader_type='custom' would skip any default data downloading and only use the files I provide.
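To make the setup easier to reproduce, here is a minimal sketch for listing which CSV files sit in the directory passed as data_path, along with their column names. It uses only the Python standard library; the helper name summarize_csv_files is hypothetical and is not part of POPPER, and the sketch makes no assumption about how POPPER itself reads these files.

```python
import csv
from pathlib import Path


def summarize_csv_files(data_dir):
    """Return {file name: list of column names} for each CSV directly in data_dir.

    Hypothetical helper for inspection only; it is not part of POPPER.
    """
    summary = {}
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # an empty file yields no columns
        summary[csv_path.name] = header
    return summary


if __name__ == "__main__":
    # Same directory that is passed as data_path='.' in demo.py
    for name, columns in summarize_csv_files(".").items():
        print(f"{name}: {columns}")
```

Running this from the same working directory as demo.py shows exactly which files and columns a loader pointed at '.' could see.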

Could you please clarify:
1. How should I structure and name my CSV file so that POPPER can load it correctly with loader_type='custom'?
2. Is there a way to completely disable the automatic download of the default dataset?
3. Do I need to implement a custom data loader, or is there a built-in way to handle arbitrary CSV files?
Any guidance or example code for using POPPER with a user-provided CSV would be greatly appreciated!

Thank you again for your work and for making this tool open-source.
