@@ -216,7 +216,7 @@ The diagram also highlights the protein features (domains, repeats, etc.).
 Summarize all variant alleles
 -----------------------------
 
-We can prepare a table of all variant alleles that occurr in the cohort.
+We can prepare a table of all variant alleles that occur in the cohort.
 
 Each table row corresponds to a single allele and lists the variant key,
 the predicted effect on the transcript (*cDNA*) and protein of interest,
@@ -240,11 +240,25 @@ with one or more variant alleles (*Count*):
 Partition the cohort by genotype and phenotype
 ==============================================
 
-To test for genotype-phenotype associations, we need to divide the cohort into classes.
-In GPSEA, we always assign a cohort member into a genotype class,
-where each individual is assigned into a single class and the classes do not overlap.
-The phenotype is then used to either assign an individual into a class,
-or to calculate a numeric score or survival.
+Testing for a genotype-phenotype association uses genotype and phenotype as variables.
+In GPSEA, the variable value for an individual is computed
+either by a :class:`~gpsea.analysis.clf.Classifier`
+or by a :class:`~gpsea.analysis.pscore.PhenotypeScorer`.
+A `Classifier` assigns the individual into a class,
+whereas a `PhenotypeScorer` computes a continuous score.
+The classifiers and scorers are applied on all individuals of the cohort
+and the resulting variable distributions are then assessed by a statistical test.
+
+In GPSEA, genotype is always treated as a class
+and a genotype `Classifier` is a prerequisite for each analysis.
+However, there is much more flexibility on the phenotype part,
+where either a `Classifier` or a `PhenotypeScorer` can be used to compute the values,
+depending on the analysis goals.
+
+In this tutorial section, we first configure a `Classifier` for assigning
+the individuals into a genotype class,
+and we follow with generating classifiers for testing the presence or exclusion
+of HPO terms in the individuals.
 
 
 Partition by genotype
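The classifier/scorer distinction introduced by the added text can be sketched in plain Python. This is only a conceptual illustration under stated assumptions, not the GPSEA API: `Individual`, `classify_genotype`, `classify_phenotype`, `score_phenotype`, and `contingency_table` are hypothetical names, the "Missense"/"Other" genotype classes are invented for the example, and the HPO term IDs serve as placeholder data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Individual:
    """Hypothetical stand-in for a cohort member (not a GPSEA class)."""
    label: str
    has_missense: bool
    present: frozenset = frozenset()   # HPO term IDs observed in the individual
    excluded: frozenset = frozenset()  # HPO term IDs explicitly ruled out


def classify_genotype(ind: Individual) -> str:
    """Classifier role: assign exactly one genotype class per individual."""
    return "Missense" if ind.has_missense else "Other"


def classify_phenotype(ind: Individual, term_id: str) -> Optional[str]:
    """Classifier role: "Yes" if the term is present, "No" if excluded,
    None if the individual carries no information about the term."""
    if term_id in ind.present:
        return "Yes"
    if term_id in ind.excluded:
        return "No"
    return None


def score_phenotype(ind: Individual, term_ids: frozenset) -> float:
    """Scorer role: a numeric value, here the count of present terms of interest."""
    return float(len(ind.present & term_ids))


def contingency_table(cohort, term_id: str) -> Counter:
    """Count (genotype class, phenotype class) pairs; a statistical test
    would then be applied to these counts."""
    table = Counter()
    for ind in cohort:
        phenotype = classify_phenotype(ind, term_id)
        if phenotype is not None:  # skip individuals with no information
            table[(classify_genotype(ind), phenotype)] += 1
    return table


cohort = [
    Individual("A", True, present=frozenset({"HP:0001250"})),
    Individual("B", True, excluded=frozenset({"HP:0001250"})),
    Individual("C", False, present=frozenset({"HP:0001250", "HP:0001263"})),
    Individual("D", False),
]

table = contingency_table(cohort, "HP:0001250")
scores = [
    score_phenotype(ind, frozenset({"HP:0001250", "HP:0001263"}))
    for ind in cohort
]
```

In this sketch the genotype is always a class, while the phenotype can be either a class (`classify_phenotype`) or a number (`score_phenotype`), which mirrors the flexibility the new paragraphs describe.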