Commit ec98211

chore: some readme and docs cleanup (#56)
* update classifiers
* remove commented section for now
* update readme badges and links
* rename persons section to person sampling
1 parent 98a67da commit ec98211

4 files changed: 26 additions and 10 deletions
README.md

Lines changed: 22 additions & 5 deletions
@@ -2,7 +2,7 @@
 
 [![CI](https://github.com/NVIDIA-NeMo/DataDesigner/actions/workflows/ci.yml/badge.svg)](https://github.com/NVIDIA-NeMo/DataDesigner/actions/workflows/ci.yml)
 [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
-[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/) [![NeMo Microservices](https://img.shields.io/badge/NeMo-Microservices-76b900)](https://docs.nvidia.com/nemo/microservices/latest/index.html)
+[![Python 3.10 - 3.13](https://img.shields.io/badge/🐍_Python-3.10_|_3.11_|_3.12_|_3.13-blue.svg)](https://www.python.org/downloads/) [![NeMo Microservices](https://img.shields.io/badge/NeMo-Microservices-76b900)](https://docs.nvidia.com/nemo/microservices/latest/index.html) [![Code](https://img.shields.io/badge/Code-Documentation-8A2BE2.svg)](https://nvidia-nemo.github.io/DataDesigner/)
 
 **Generate high-quality synthetic datasets from scratch or using your own seed data.**
 
@@ -98,10 +98,12 @@ preview.display_sample_record()
 
 ### 📚 Learn more
 
-- **[Quick Start Guide](https://nvidia-nemo.github.io/DataDesigner)** – Detailed walkthrough with more examples
-- **[Tutorial Notebooks](https://nvidia-nemo.github.io/DataDesigner/notebooks/1-the-basics/)** – Step-by-step interactive tutorials
+- **[Quick Start Guide](https://nvidia-nemo.github.io/DataDesigner/quick-start/)** – Detailed walkthrough with more examples
+- **[Tutorial Notebooks](https://nvidia-nemo.github.io/DataDesigner/notebooks/intro/)** – Step-by-step interactive tutorials
 - **[Column Types](https://nvidia-nemo.github.io/DataDesigner/concepts/columns/)** – Explore samplers, LLM columns, validators, and more
+- **[Validators](https://nvidia-nemo.github.io/DataDesigner/concepts/validators/)** – Learn how to validate generated data with Python, SQL, and remote validators
 - **[Model Configuration](https://nvidia-nemo.github.io/DataDesigner/models/model-configs/)** – Configure custom models and providers
+- **[Person Sampling](https://nvidia-nemo.github.io/DataDesigner/concepts/persons/)** – Learn how to sample realistic person data with demographic attributes
 
 ### 🔧 Configure models via CLI
 
@@ -114,11 +116,26 @@ data-designer config list # View current settings
 ### 🤝 Get involved
 
 - **[Contributing Guide](https://nvidia-nemo.github.io/DataDesigner/CONTRIBUTING.md)** – Help improve Data Designer
-- **[GitHub Issues](https://github.com/NVIDIA-NeMo/DataDesigner/issues)** – Report bugs or request features
-- **[GitHub Discussions](https://github.com/NVIDIA-NeMo/DataDesigner/discussions)** – Ask questions and share ideas
+- **[GitHub Issues](https://github.com/NVIDIA-NeMo/DataDesigner/issues)** – Report bugs or make a feature request
 
 ---
 
 ## License
 
 Apache License 2.0 – see [LICENSE](LICENSE) for details.
+
+---
+
+## Citation
+
+If you use NeMo Data Designer in your research, please cite it using the following BibTeX entry:
+
+```bibtex
+@misc{nemo-data-designer,
+  author = {The NeMo Data Designer Team},
+  title = {NeMo Data Designer: A framework for generating synthetic data from scratch or based on your own seed data},
+  howpublished = {\url{https://github.com/NVIDIA-NeMo/DataDesigner}},
+  year = {2025},
+  note = {GitHub Repository},
+}
+```
Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ Person samplers accept these configuration parameters:
 * `locale`: Language and region code (optional, e.g., "en\_US", "ja\_JP", "hi\_IN", "en\_IN")
 * `city`: Filter on cities within the specified locale (optional)
 * `age_range`: Age range for filtering (default: ages above 18 only)
-* `state`: Filter on US states, only valid when locale is set to "en\_US" (optional)
+* `select_field_values`: Filter on specific field values (optional)
 
 **Synthetic Personas Configuration:**
 
mkdocs.yml

Lines changed: 1 addition & 2 deletions
@@ -10,8 +10,7 @@ nav:
   - Concepts:
     - Columns: concepts/columns.md
     - Validators: concepts/validators.md
-    - Persons: concepts/persons.md
-    # - Plugins: concepts/plugins.md
+    - Person Sampling: concepts/persons.md
   - Models:
     - Default Model Settings: models/default-model-settings.md
     - Configure with the CLI: models/configure-model-settings-with-the-cli.md

pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -4,15 +4,15 @@ dynamic = ["version"]
 description = "General framework for synthetic data generation"
 readme = "README.md"
 requires-python = ">=3.10"
+license = "Apache-2.0"
 
 classifiers = [
     "Development Status :: 4 - Beta",
     "Intended Audience :: Developers",
     "Intended Audience :: Science/Research",
     "Topic :: Scientific/Engineering :: Artificial Intelligence",
-    "Topic :: Scientific/Engineering :: Human Machine Interfaces",
     "Topic :: Software Development",
-    "License :: Other/Proprietary License",
+    "License :: OSI Approved :: Apache Software License",
     "Programming Language :: Python :: 3.10",
     "Programming Language :: Python :: 3.11",
     "Programming Language :: Python :: 3.12",
