Skip to content

consistent strategy doc strings#447

Open
psnairne wants to merge 1 commit intomainfrom
pn/consistent-strategy-doc-strings
Open

consistent strategy doc strings#447
psnairne wants to merge 1 commit intomainfrom
pn/consistent-strategy-doc-strings

Conversation

@psnairne
Copy link
Contributor

No description provided.

@psnairne psnairne linked an issue Mar 10, 2026 that may be closed by this pull request
Copy link
Collaborator

@SmartMonkey-git SmartMonkey-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add: #![deny(rustdoc::broken_intra_doc_links)] to the top of our lib.rs file.

Also add this to our ci:

doc-check:
  name: Doc Check
  runs-on: ubuntu-latest
  timeout-minutes: 15
  steps:
    - uses: actions/checkout@v4

    - uses: actions-rust-lang/setup-rust-toolchain@v1.13.0
      with:
        toolchain: stable

    - uses: Swatinem/rust-cache@v2.8.0

    - name: Check documentation
      env:
        RUSTDOCFLAGS: "-D warnings -D rustdoc::broken_intra_doc_links"
      run: cargo doc --no-deps --all-features

/// # Example
///
/// The table
/// ```text
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this is:

Suggested change
/// ```text
/// ```csv

use std::collections::{HashMap, HashSet};

#[derive(Debug)]
/// # Description
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first line of a doc string will be used in different places as a headline for this class on the doc website of crates.io.

For example:
https://docs.rs/securiety/0.2.8/securiety/curie/index.html
https://github.com/SmartMonkey-git/securiety/blob/c6eeb0b7aac212d331b0876dc6e284500ad1c2ad/src/curie.rs#L3

Can use see how the first line is used?

So, if we keep this we will see something like:
AgeToIso8601Strategy Description

So, the first line should be a single sentence summary.

/// Given a column whose cells contains ages (e.g. subject age, age of death, age of onset)
/// this strategy converts integer entries to ISO8601 durations: 47 -> P47Y
///
/// NOTE: the integers must be between 0 and 150.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would:

Suggested change
/// NOTE: the integers must be between 0 and 150.
/// ## Note
Integers must be between 0 and 150.

or maybe leave it out here, since its already explained in the error section.

#[derive(Debug)]
/// # Description
///
/// Given a column whose cells contains ages (e.g. subject age, age of death, age of onset)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, refer to our types everywhere.

Suggested change
/// Given a column whose cells contains ages (e.g. subject age, age of death, age of onset)
/// Given a column whose cells contains ages (e.g. [`Context:SubjectAge`], [`Context:AgeOfDeath`], [`Context:AgeOfOnset`])

/// and a ToString AliasMap which converts "M" to "Male" and "F" to "Female"
/// # Description
///
/// Given a collection of `ContextualiseDataframes`, this strategy will apply all the aliases
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here:

Suggested change
/// Given a collection of `ContextualiseDataframes`, this strategy will apply all the aliases
/// Given a collection of [`ContextualisedDataframe`], this strategy will apply all the aliases

Comment on lines +39 to +46
/// P001, 4
/// P002, 0
/// ```
///
/// # Errors
///
/// Errors will be thrown if:
/// - Any columns to be aliased cannot be cast to String datatype.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here: [String] or do you mean [Dtype::String]?

/// # Errors
///
/// An error will be thrown if
/// 1. A DOB is before to a date for a patient, leading to a negative age.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you use numbers on AliasMapStrategy you used dashes.

Suggested change
/// 1. A DOB is before to a date for a patient, leading to a negative age.
/// 1. A date of birth is before to a date for a patient, leading to a negative age.

or might even link the context type.

///
/// An error will be thrown if
/// 1. A DOB is before to a date for a patient, leading to a negative age.
/// 2. There exists a date which cannot be converted to an age due to missing DOB data.
Copy link
Collaborator

@SmartMonkey-git SmartMonkey-git Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. There exists a date which cannot be converted to an age due to missing DOB data

I didn't know. This will be a major annoyance, as soon as we have large datasets.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. For those situations we have a few options:

  1. Just don't apply the strategy. I decided to be strict
  2. Add a bool to the strategy so it can be strict or otherwise
  3. Make the strategy just allow situations with missing data

Probably 2 is best. But I would point out that if you are creating a cohort of Phenopackets, you probably either want just ages in a field, or just dates. To have a mix is strange. So strictness did make sense. And DOB data probably is less likely to be missing than other data. Miraculously the strategy seemed to work with the i_data.

Comment on lines +23 to +26
/// # Fields
///
/// * `hpo_bidict_lib` - This should contain BiDictLibrary for the version of HPO that you want to use.
/// * `disease_bidict_lib` - All non-HPO cells will be processed by this disease BiDictLibrary.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Fields is a common section. At least not to the standard.

Common sections

Maybe we should drop it.

Comment on lines 44 to 48
/// ```ignore
/// let sex_mapping = MappingStrategy::default_sex_mapping_strategy();
/// // Maps variations like "m", "male", "man" → "MALE"
/// // and "f", "female", "woman" → "FEMALE"
/// ```
Copy link
Collaborator

@SmartMonkey-git SmartMonkey-git Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example is not concise with the others

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consistent strategy doc strings

2 participants