Outline of the first-draft plots, with questions about them

Most Important Questions for Ian:

Below are a few important questions for my analysis (specifically for moving on to SNPeff or Gowinda), followed by less important questions about visualization and the final plots required.

1. Filtering positions for Fst values:

I can currently filter both the model output and the selection coefficients for significance after adjusting p-values, but how should the Fst values be filtered (other than keeping every position with an Fst value)?

Note: Fst values are calculated in windows, so the comparison with the other measures is to ensure that Fst is sufficiently high in the window around each position of interest.

Should I downscale the Fst values using the Sel:Sel and Con:Con comparisons? (The plots for generation 115 are below.)

-- Method? (A previous idea was to scale by (Fst_C:C + Fst_S:S)/2.)
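One possible reading of the scaling idea above, sketched in Python for discussion only. The function name, the choice to subtract the baseline (rather than divide by it), and the floor at zero are all assumptions, not a settled method.

```python
def downscale_fst(fst_cs, fst_cc, fst_ss):
    """Subtract a within-treatment baseline from the Con:Sel Fst, per window.

    All three arguments are equal-length lists of window Fst values
    (hypothetical inputs; window order must match across the lists).
    """
    if not (len(fst_cs) == len(fst_cc) == len(fst_ss)):
        raise ValueError("window lists must have the same length")
    scaled = []
    for cs, cc, ss in zip(fst_cs, fst_cc, fst_ss):
        baseline = (cc + ss) / 2                 # (Fst_C:C + Fst_S:S) / 2
        scaled.append(max(cs - baseline, 0.0))   # floor at 0 is an assumption
    return scaled
```

Windows where the Con:Sel Fst does not exceed the within-treatment baseline come out as 0, so they would drop out under the current "Fst != 0" rule.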

2. Adjusting p-values: per chromosome or full genome?

When running p.adjust, should the adjustment be made across the full genome (all positions) or on a per-chromosome basis?

-- Currently I have to adjust per chromosome for poolseq, and I am adjusting across the full genome for the model output.
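To make the two options concrete, here is a small Python sketch of a Benjamini-Hochberg adjustment applied genome-wide versus per chromosome. `bh_adjust` is written to follow the same step-up rule as R's `p.adjust(method = "BH")`; the record layout `(chrom, pos, p)` and the function names are assumptions for illustration.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    # Walk from the largest p-value down, carrying a running minimum.
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = n - steps_from_top            # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def adjust_genome_wide(records):
    """records: list of (chrom, pos, p). One correction over all positions."""
    adj = bh_adjust([p for _, _, p in records])
    return {(chrom, pos): q for (chrom, pos, _), q in zip(records, adj)}


def adjust_per_chromosome(records):
    """records: list of (chrom, pos, p). A separate correction per chromosome."""
    by_chrom = {}
    for chrom, pos, p in records:
        by_chrom.setdefault(chrom, []).append((pos, p))
    out = {}
    for chrom, rows in by_chrom.items():
        adj = bh_adjust([p for _, p in rows])
        for (pos, _), q in zip(rows, adj):
            out[(chrom, pos)] = q
    return out
```

The two functions can return different adjusted values for the same position, because both the number of tests and the rank of each p-value change with the scope of the correction.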

3. Bonferroni vs. FDR:

FDR adjustment of p-values keeps more positions, but Bonferroni gives more visually appealing plots (see below) and a more stringent set of positions.

For plots of outputs: would Bonferroni plots be better?

For finding positions of interest: is FDR still preferred?

Or should the two be kept consistent (and if so, with which method)?
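One option that keeps both methods in a single plot is to label each position by the strictest correction it survives. The Python sketch below does this with the Bonferroni cut-off (alpha / n) and the Benjamini-Hochberg step-up cut-off; the label names and the default `alpha` of 0.05 are assumptions.

```python
def classify_positions(pvals, alpha=0.05):
    """Label each raw p-value as 'bonferroni', 'fdr_only', or 'ns'."""
    n = len(pvals)
    bonf_cut = alpha / n
    # BH step-up: find the largest ranked p-value with p_(k) <= (k / n) * alpha.
    # Every p-value at or below that one is significant under FDR.
    bh_cut = 0.0
    for k, p in enumerate(sorted(pvals), start=1):
        if p <= k / n * alpha:
            bh_cut = p
    labels = []
    for p in pvals:
        if p <= bonf_cut:
            labels.append("bonferroni")   # passes both corrections
        elif p <= bh_cut:
            labels.append("fdr_only")     # passes FDR but not Bonferroni
        else:
            labels.append("ns")
    return labels
```

Any position passing Bonferroni also passes FDR at the same alpha, so the three labels could map directly to three colours or point sizes on the -log10(p) plot.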

4. Selection Coefficient Filtering:

The current method is to keep any significant (after FDR p.adjust) selection coefficients that are unique to predation lines (i.e. no significant Selcoef for Con).

This is the average Selcoef between the two mappers (keeping the less significant p-value).

Does this method make sense?
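For reference, the filtering logic described above could look like this in Python. It is a sketch under assumptions: each mapper's output is a dict `{pos: (selcoef, adjusted_p)}`, only positions called by both mappers are kept, and the function names are hypothetical.

```python
def combine_mappers(mapper_a, mapper_b):
    """Average Selcoef across two mappers, keeping the less significant p-value.

    Only positions present in both inputs are returned (an assumption).
    """
    combined = {}
    for pos in mapper_a.keys() & mapper_b.keys():
        s_a, p_a = mapper_a[pos]
        s_b, p_b = mapper_b[pos]
        combined[pos] = ((s_a + s_b) / 2, max(p_a, p_b))  # larger p = less significant
    return combined


def unique_to_selection(sel, con, alpha=0.05):
    """Keep positions significant in selection lines but not in controls."""
    keep = {}
    for pos, (selcoef, p) in sel.items():
        con_significant = pos in con and con[pos][1] < alpha
        if p < alpha and not con_significant:
            keep[pos] = selcoef
    return keep
```

Taking the larger p-value means a position is only kept when it is significant under both mappers, which is the conservative choice.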

Plots, with specific questions for each:

Pi: Ancestral Pi for Novoalign:

Outline

The ancestral nucleotide diversity:

Questions

Necessary for all populations?

-- Have all populations (and for bowtie and novoalign mappers)

Average Pi for all mappers?

-- Calculate Pi for bwa and average across the three mappers? Or show one (or two) mappers as a representative?

Overlay for changes in diversity over time?

-- Do we want overlay plots with ~splines showing the change in diversity from Ancestor --> 115?

Fst Plots:

Outline

Average pairwise Fst between control and selection replicates

-- Average between mappers and replicates

Generation 38:

Generation 77:

Generation 115:

Questions

Downscaling (available for all generations): is it necessary, and with what method?

-- A previous idea was to scale by (Fst_C:C + Fst_S:S)/2

meanFst: Selection vs. Control: Generation 115

meanFst: Control vs. Control: Generation 115

meanFst: Selection vs. Selection: Generation 115

Cut off for positions?

-- Currently keeping anything with an Fst value for the Con:Sel_115 comparison: is there any way to filter more stringently for peaks?

-- Should I keep the top 50% of Fst values? The top 10%?
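A percentage cut-off like the ones asked about above is simple to apply once chosen. This Python sketch keeps the top fraction of windows ranked by Fst; the `(window_start, fst)` layout and the rule of always keeping at least one window are assumptions.

```python
def top_fraction(windows, fraction):
    """Keep the top `fraction` of windows by Fst, returned in genomic order.

    windows: list of (window_start, fst) tuples for one chromosome.
    fraction: e.g. 0.5 for the top 50%, 0.1 for the top 10%.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not windows:
        return []
    n_keep = max(1, int(len(windows) * fraction))   # keep at least one window
    ranked = sorted(windows, key=lambda w: w[1], reverse=True)
    return sorted(ranked[:n_keep])                  # back to position order
```

For example, `top_fraction(windows, 0.1)` would give the 10% cut and `top_fraction(windows, 0.5)` the 50% cut, so both can be compared on the same plot before deciding.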

Model Outputs

Outline

Plots for original values and FDR adjusted

Uncorrected p-values: TxG -log10(meanP-value)

FDR-corrected p-values: TxG -log10(meanP-value)

Questions:

Is Bonferroni a better visualization for the paper (much less going on)?

TxG: -log10(meanP) with Bonferroni correction for multiple comparisons

Advantage of this: can create a plot with the values of the regular plot (the first one) and the significant values coloured. This would not look good with FDR:

Poolseq outputs:

Outline

Output from the PoolSeq package: the selection coefficients that were significant for predation lines and not for controls

Ongoing, given the slow pace of Poolseq: 3L and 3R are almost completed

This is the average between two mappers (bwa and novoalign), keeping the least significant p-value.

Questions

Plot like above (and all chromosomes eventually)?
Any cut-off for selection coefficients, or just any significant selection coefficients unique to predator lines?

Trajectories and positions:

Outline

Filtered positions for:

-- p-values < 0.05 after FDR from the model output

-- the same positions among the poolseq significant selection coefficients

-- Then found any windows overlapping these positions with Fst values != 0.

Ended up with ~400 positions each for 2L and 2R
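The three filtering steps above can be written as one function. This Python sketch assumes non-overlapping Fst windows keyed by their start coordinate, each covering `[start, start + window_size)`; that window layout and all names are assumptions and would need adjusting to the real Fst output.

```python
def positions_of_interest(model_sig, poolseq_sig, fst_windows, window_size):
    """Positions significant in both analyses that fall in a window with Fst != 0.

    model_sig, poolseq_sig: iterables of positions on one chromosome.
    fst_windows: dict {window_start: fst}, windows assumed non-overlapping.
    """
    shared = set(model_sig) & set(poolseq_sig)
    kept = []
    for pos in sorted(shared):
        start = (pos // window_size) * window_size   # window containing pos
        fst = fst_windows.get(start)
        if fst is not None and fst != 0:
            kept.append(pos)
    return kept
```

Swapping the `fst != 0` test for a threshold is the one-line change needed if a stricter Fst cut-off is adopted.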

Trajectories are the mean absolute difference of each treatment from the ancestor
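As a check on the trajectory definition, here is a minimal Python sketch of the mean absolute difference from the ancestor, computed per generation for one treatment. The input layout (dicts of allele frequency keyed by position) and the function name are assumptions.

```python
def mean_abs_divergence(ancestor, generations):
    """Mean |frequency - ancestral frequency| per generation.

    ancestor: dict {pos: allele_freq} for the ancestral population.
    generations: dict {generation: {pos: allele_freq}} for one treatment.
    Only positions present in the ancestor are used.
    """
    out = {}
    for gen, freqs in sorted(generations.items()):
        diffs = [abs(f - ancestor[pos]) for pos, f in freqs.items() if pos in ancestor]
        if diffs:
            out[gen] = sum(diffs) / len(diffs)
    return out
```

Running it once for the selection lines and once for the controls gives the two curves to overlay.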

Questions:

Plots of individual positions?

-- Is it informative to select some large peaked positions that are shared and show the actual trajectories of frequencies?

Overlay the positions of interest onto the output from the model:

-- The most interesting plot would be the -log10(p) plot from the model. Should the positions also present in Poolseq and Fst be coloured and used as well?

-- Larger and coloured positions on the model output, for example:

2L with FDR:

2L with Bonferroni:
