Hypothesis testing for custom data #8

I'm currently exploring how to use POPPER with a custom CSV dataset for hypothesis testing, as mentioned in the README under "Run on your own hypothesis and database". However, I'm encountering an issue where the system automatically tries to download popper_data_processed.tar.gz even when I specify a local directory and loader_type='custom'.

Here is my demo.py script:


```python
from popper import Popper
import os
# Set API key
os.environ['ANTHROPIC_API_KEY'] = 'YOUR_API_KEY'
# Initialize Popper agent
agent = Popper(llm="claude-3-5-sonnet-20240620")
# Register custom dataset
agent.register_data(data_path='.', loader_type='custom')
# Configure agent
agent.configure(
    alpha=0.1,
    max_num_of_tests=5,
    max_retry=3,
    time_limit=2,
    aggregate_test='E-value',
    relevance_checker=True,
    use_react_agent=True
)
# Launch UI
agent.launch_UI()
# hypothesis = "x1 is a valid shadow variable"
# results = agent.validate(hypothesis=hypothesis)
# print(results)
```


Despite setting loader_type='custom' and pointing data_path to the current directory (.), the program still attempts to download popper_data_processed.tar.gz. I expected that using loader_type='custom' would skip any default data downloading and only use the files I provide.
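To make the setup easier to reproduce, here is a minimal sketch of how the contents of the data directory can be listed before it is registered. It uses only the Python standard library; `summarize_csv_files` is a hypothetical helper written for this report and is not part of POPPER. It assumes, without confirmation from the POPPER source, that the custom loader looks at CSV files located directly under `data_path`.

```python
import csv
from pathlib import Path


def summarize_csv_files(data_path):
    """Map each CSV file directly under data_path to its header columns.

    Hypothetical helper for this report; it does not call any POPPER code.
    """
    summary = {}
    for csv_file in sorted(Path(data_path).glob("*.csv")):
        with csv_file.open(newline="") as handle:
            # The first row is treated as the header; an empty file yields [].
            header = next(csv.reader(handle), [])
        summary[csv_file.name] = header
    return summary


if __name__ == "__main__":
    # "." matches the data_path passed to register_data in demo.py.
    for name, columns in summarize_csv_files(".").items():
        print(f"{name}: {columns}")
```

Running this in the same directory as demo.py shows which CSV files and column names are present, which may help narrow down whether the download is triggered by the file layout or happens regardless of it.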

Could you please clarify:

1. How should I properly structure and name my CSV file so that POPPER can load it correctly with loader_type='custom'?
2. Is there a way to completely disable the automatic download of the default dataset?
3. Do I need to implement a custom data loader, or is there a built-in way to handle arbitrary CSV files?

Any guidance or example code for using POPPER with a user-provided CSV would be greatly appreciated!

Thank you again for your work and for making this tool open-source.
