This repository contains curated datasets in JSONL format. Each dataset follows a strict schema validation process to ensure data quality and consistency. All contributions are automatically validated through GitHub Actions.
A comprehensive dataset documenting individuals and organizations who have publicly stated that AI systems should not have legal rights or personhood equivalent to those of humans. This includes those who advocate that AI should remain a tool under human control.
Browse more datasets in the datasets/ directory.
.
├── README.md                        # This file
├── datasets/                        # All datasets live here
│   ├── example-dataset/             # Example dataset folder
│   │   ├── schema.json              # JSON Schema for validation
│   │   ├── data.jsonl               # Actual data in JSONL format
│   │   └── README.md                # Dataset-specific documentation
│   └── another-dataset/             # Another dataset
│       ├── schema.json
│       ├── data.jsonl
│       └── README.md
├── .github/
│   ├── workflows/
│   │   └── validate-datasets.yml    # GitHub Action for validation
│   └── pull_request_template.md     # PR template
└── scripts/
    └── validate.py                  # Validation script
Create a new folder under datasets/ with a descriptive name using kebab-case:
mkdir datasets/your-dataset-name
Create a schema.json file in your dataset folder. This should be a valid JSON Schema (draft-07 or later). Example:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {
      "type": "string",
      "description": "The name of the entity"
    }
  }
}
Create a data.jsonl file with your data. Each line should be a valid JSON object that conforms to your schema:
{"name": "Example 1", "optional_field": "value"}
{"name": "Example 2"}
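For reference, a JSONL file can be read one record per line with Python's standard json module. The sketch below is illustrative only: the helper name read_jsonl is hypothetical and is not taken from scripts/validate.py.

```python
import json


def read_jsonl(text):
    """Parse JSONL text into a list of dicts, one per non-blank line."""
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines such as a trailing newline
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        records.append(record)
    return records


sample = '{"name": "Example 1", "optional_field": "value"}\n{"name": "Example 2"}\n'
records = read_jsonl(sample)
print([record["name"] for record in records])  # ['Example 1', 'Example 2']
```

Reporting the line number on failure makes it easier to find the offending record in a long file.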
Create a README.md file in your dataset folder describing:
- What the dataset contains
- Data sources and collection methodology
- Any special considerations or limitations
- License information
- Update frequency (if applicable)
- Fork the repository
- Create a feature branch:
git checkout -b add-dataset-name
- Add your dataset files
- Commit with a descriptive message:
git commit -m "Add dataset-name dataset"
- Push to your fork:
git push origin add-dataset-name
- Open a Pull Request using our template
All datasets must:
- Have a valid JSON Schema (schema.json)
  - Must be valid JSON
  - Must be a valid JSON Schema (draft-07 or later)
  - Must define required fields
- Have valid JSONL data (data.jsonl)
  - Each line must be valid JSON
  - Each record must validate against the schema
  - File must not be empty
- Have documentation (README.md)
  - Must describe the dataset
  - Must include data sources
- Pass automated validation
  - GitHub Actions will automatically validate your PR
  - All checks must pass before merge
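As a rough pre-flight check, the file-level requirements above can be approximated with the Python standard library alone. This is a minimal sketch, not the logic of scripts/validate.py: the function name check_dataset is hypothetical, and it only confirms that required fields are present. Full JSON Schema validation needs a dedicated validator, which this sketch does not include.

```python
import json
import tempfile
from pathlib import Path


def check_dataset(folder):
    """Return a list of problems found in a dataset folder (empty if none)."""
    folder = Path(folder)

    # The checklist names three files; all of them must exist.
    problems = [
        f"missing {name}"
        for name in ("schema.json", "data.jsonl", "README.md")
        if not (folder / name).is_file()
    ]
    if problems:
        return problems

    try:
        schema = json.loads((folder / "schema.json").read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"schema.json is not valid JSON: {exc.msg}"]

    required = schema.get("required") if isinstance(schema, dict) else None
    if not isinstance(required, list) or not required:
        problems.append("schema.json does not define required fields")
        required = []

    text = (folder / "data.jsonl").read_text(encoding="utf-8")
    if not text.strip():
        problems.append("data.jsonl is empty")

    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"data.jsonl line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"data.jsonl line {number}: expected a JSON object")
            continue
        # Simplification: only presence of required fields is checked here.
        missing = [field for field in required if field not in record]
        if missing:
            names = ", ".join(map(str, missing))
            problems.append(f"data.jsonl line {number}: missing required field(s): {names}")

    return problems


# Demonstration on a throwaway folder whose second record lacks "name".
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "your-dataset-name"
    demo.mkdir()
    (demo / "schema.json").write_text(
        json.dumps({"type": "object", "required": ["name"]}), encoding="utf-8"
    )
    (demo / "data.jsonl").write_text(
        '{"name": "Example 1"}\n{"optional_field": "value"}\n', encoding="utf-8"
    )
    (demo / "README.md").write_text("Describes the dataset and its sources.\n", encoding="utf-8")
    for problem in check_dataset(demo):
        print(problem)
```

Returning a list of problems rather than stopping at the first one lets a contributor fix several issues in one pass.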
Before submitting a PR, you can validate your dataset locally:
# Install dependencies
pip install -r requirements.txt
# Run validation
python scripts/validate.py datasets/your-dataset-name
See datasets/example-dataset/ for a complete example with:
- Schema requiring name field
- Optional social_links, sources, and quote fields
- Sample data entries
- Complete documentation
We welcome contributions! Please:
- Follow the structure and naming conventions
- Ensure your data is properly licensed for inclusion
- Validate your dataset before submitting
- Use our PR template
- Be responsive to review feedback
You can contribute to datasets directly through GitHub's website without installing any software. Here's how:
- Navigate to the datasets folder
- Click on the dataset you want to update (e.g., ai-rights-opposition)
- Click on the data.jsonl file
- Click the pencil icon (✏️) in the top-right corner of the file view
- You'll see the file contents in an editor
Add a new line at the end of the file with your data. Here's the format:
{"name": "Person Name", "social_links": [{"platform": "Twitter", "url": "https://twitter.com/username"}], "sources": [{"title": "Article Title", "url": "https://example.com/article"}], "quote": "Optional quote here"}
Important formatting rules:
- Everything must be on ONE line
- Use double quotes " not single quotes '
- URLs must start with https:// or http://
- Don't add a comma at the end of the line
- Make sure all brackets {} and [] are properly closed
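If you have Python available, one way to avoid most of these formatting mistakes is to build the entry as a dictionary and let the standard json module serialize it. The sketch below reuses the placeholder values from the format shown above. json.dumps always emits double quotes, balanced brackets, and no trailing commas, and with default settings it writes the whole entry on one line.

```python
import json

# Placeholder values copied from the example entry; replace with real data.
entry = {
    "name": "Person Name",
    "social_links": [{"platform": "Twitter", "url": "https://twitter.com/username"}],
    "sources": [{"title": "Article Title", "url": "https://example.com/article"}],
    "quote": "Optional quote here",
}

# ensure_ascii=False keeps non-ASCII characters (accented names, for
# example) readable instead of turning them into \uXXXX escapes.
line = json.dumps(entry, ensure_ascii=False)
print(line)
```

The printed line can then be pasted at the end of data.jsonl as-is.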
Before submitting, validate your JSON entry:
- Copy your entire line
- Go to jsonlines.org/validator
- Paste your line and click "Validate JSON"
- Fix any errors it shows
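The same check can also be run offline. The following is a minimal sketch, assuming the field names from the example entry (name, social_links, sources) and the URL rule stated above. The function name check_entry is hypothetical, and a particular dataset's schema may enforce more than this.

```python
import json


def check_entry(line):
    """Return a list of problems with one JSONL entry (empty if none found)."""
    # json.loads accepts newlines between tokens, so test for them explicitly.
    if "\n" in line.strip():
        return ["entry spans more than one line"]
    try:
        entry = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at column {exc.colno}: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["entry must be a JSON object"]

    problems = []
    if "name" not in entry:
        problems.append("missing required field: name")

    # Both list fields hold objects that carry a "url" key.
    for field in ("social_links", "sources"):
        items = entry.get(field, [])
        if not isinstance(items, list):
            problems.append(f"{field}: expected a list")
            continue
        for item in items:
            url = str(item.get("url", "")) if isinstance(item, dict) else ""
            if not url.startswith(("http://", "https://")):
                problems.append(f"{field}: URL must start with http:// or https:// (got {url!r})")
    return problems


print(check_entry('{"name": "Person Name", "sources": [{"title": "Article Title", "url": "https://example.com/article"}]}'))
print(check_entry("{'name': 'Person Name'}"))
```

The first call prints an empty list. The second reports invalid JSON because the entry uses single quotes.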
- Scroll down to "Commit changes"
- Add a title like:
Add [Person Name] to dataset
- Add a description explaining why this person/org belongs in the dataset
- Select "Create a new branch" (it will suggest a name)
- Click "Propose changes"
- GitHub will take you to a "Pull Request" page
- Fill out the template that appears
- Click "Create pull request"
Our automated system will check your contribution. You'll see:
- ✅ Green checkmark: Your entry is valid and ready for review!
- ❌ Red X: There's an issue - click "Details" to see what needs fixing
If validation fails, click the pencil icon again on your branch to fix the issues.
Remember: In the JSONL file, each entry must be on a single line!
- Using single quotes - Always use double quotes: "name" not 'name'
- Line breaks - Everything must be on ONE line in JSONL format
- Trailing commas - Don't add a comma after the last item in an object or array
- Missing brackets - Ensure all {, }, [, ] are properly paired
- Invalid URLs - URLs must start with http:// or https://
- Wrong platform names - Use exact platform names from the allowed list
- Use a JSON validator before submitting to catch errors early
- Start small - Try adding just one entry first
- Check existing entries as examples of proper formatting
- Open an issue for bugs or problems
- Start a discussion for questions or suggestions
- Check existing issues before creating new ones
Each dataset may have its own license. Check the individual dataset README files for specific licensing information.