Overview documentation #203

Closed
tim-band wants to merge 108 commits into alan-turing-institute:main from
SAFEHR-data:overview_documentation
Conversation

@tim-band
Collaborator

A new Overview page in the documentation. It describes the workflow and, through it, how privacy is achieved and where oversight can take place.

To enable diagramming, the Sphinx Mermaid plugin has been added.
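
For readers unfamiliar with the setup, enabling Mermaid in a Sphinx project usually comes down to one entry in `conf.py`. The fragment below is a minimal sketch, assuming the plugin is the `sphinxcontrib-mermaid` package (installed separately) and that the extension list starts out empty. The actual `conf.py` in this branch is not shown here and may differ.

```python
# Sketch of a Sphinx conf.py fragment (assumed layout, not the PR's actual file).

# Assumption: the project has no other extensions; a real conf.py would
# already list its own entries here.
extensions: list[str] = []

# sphinxcontrib-mermaid is loaded under this module name. It provides a
# `mermaid` directive for embedding diagrams in documentation pages.
extensions.append("sphinxcontrib.mermaid")
```

With the extension loaded, a diagram can be written in a `.. mermaid::` directive inside a reStructuredText page.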

Tim Band added 30 commits November 27, 2024 18:27
...with a small config file
Added WITHOUT TIME ZONE.
Removed sqlacodegen dependency.
including union column sets
Tim Band and others added 29 commits May 18, 2025 09:34
Version 0.15 knows that click 8.2 breaks it.
'dsn' no longer contains password.
'schema' is taken from SRC_SCHEMA directly.
Tables no longer have schema.
Also documentation fixes.
* Fixes #33, #34, #31

* documentation normal->generate

---------

Co-authored-by: Tim Band <t.b@ucl>
* Fixes #33, #34, #31

* documentation normal->generate

* Don't unnecessarily create schema

* dump-data command #38

* Initial documentation of orm.yaml

---------

Co-authored-by: Tim Band <t.b@ucl>
* Initial change of name from sqlsynthgen to datafaker

* Refactoring of interactive tests

* `remove-` commands tests replaced

---------

Co-authored-by: Tim Band <t.b@ucl>
* src-stats gain query and date

* Queries gain comments that get copied into src-stats.yaml
Fixed stories

* single letter command synonyms in configure-generators

---------

Co-authored-by: Tim Band <t.b@ucl>
* test_make updated
test_settings fixed

* Fixed main and create tests

* Tests all pass individually now. Many fixes:
* create-vocab actually runs
* create-data reports correct count of story rows
* -f can be used as well as --force
* the logger is called "datafaker" not "utils"
* row generators can fully exhaust unique constraints
* row/story generator modules can just be files (this may just have been broken for tests)
* turning on max constraint retries doesn't break create-data
* unique constraint failure does not blow up datafaker

* More test robustness

* All tests pass together!
Row generators can be instantiated objects

---------

Co-authored-by: Tim Band <t.b@ucl>
* Tests all pass individually now. Many fixes:
* create-vocab actually runs
* create-data reports correct count of story rows
* -f can be used as well as --force
* the logger is called "datafaker" not "utils"
* row generators can fully exhaust unique constraints
* row/story generator modules can just be files (this may just have been broken for tests)
* turning on max constraint retries doesn't break create-data
* unique constraint failure does not blow up datafaker

* More test robustness

* #44 create-generators figures out if it needs a stats file

* bump version to 0.2.1

---------

Co-authored-by: Tim Band <t.b@ucl>
* sampled and suppressed choice generators

* Fixed problems found trying this out for real.

---------

Co-authored-by: Tim Band <t.b@ucl>
and --spec=table-column-gen.csv

Co-authored-by: Tim Band <t.b@ucl>
Co-authored-by: Tim Band <t.b@ucl>
* test_prompts added for configure-generators
* Refactored GeneratorCmd to allow multi-column generators: Fixes #54
* configure-generators merge and unmerge commands
* multivariate normal and lognormal generator
* Added (univariate) lognormal generator
* Weighted choice generator
* null-partitioned grouped lognormal plus sampled and suppressed
* VARCHAR(N) generators truncate results
* Updated health_data documentation
* #59 Foreign Keys to ignored tables supported
Co-authored-by: Tim Band <t.b@ucl>
* configure-generators --spec now allows fallbacks and multi-column generators
* null-partitioned grouped sampled generators
* SUPPRESS_COUNT is now 7
* Automatic pre-commit fixes
* Fixed variances in tests
* precommit cleanup, NullPartitionedGrouped fix
* Moved DistributionGenerator to providers.py
Co-authored-by: Tim Band <t.b@ucl>
* Refactoring query construction
out of the generator proposer

* Removed _get_row_partition

* A bit more refactoring

* Fixed #73 Grouped generators query results overlap

* FKs to concept table named, initial implementation

* Many extra comments output with partitions

* Named column fetched from config.yaml

* configure-tables allows the setting of naming columns

---------

Co-authored-by: Tim Band <t.b@ucl>
Mermaid diagrams can be embedded in docs
Table CSS fixed
tim-band closed this Jan 19, 2026