Commit aa61ed2

Merge branch 'vnbdev' into feat-horizontal-plots
# Conflicts: # dabest/_effsize_objects.py # dabest/misc_tools.py # dabest/plot_tools.py # dabest/plotter.py # nbs/API/effsize_objects.ipynb # nbs/API/misc_tools.ipynb # nbs/API/plot_tools.ipynb # nbs/API/plotter.ipynb # nbs/tests/mpl_image_tests/baseline_images/test_01_gardner_altman_unpaired_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_02_gardner_altman_unpaired_mediandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_03_gardner_altman_unpaired_hedges_g.png # nbs/tests/mpl_image_tests/baseline_images/test_04_gardner_altman_paired_hedges_g.png # nbs/tests/mpl_image_tests/baseline_images/test_04_gardner_altman_paired_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_05_cummings_two_group_unpaired_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_06_cummings_two_group_paired_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_07_cummings_multi_group_unpaired.png # nbs/tests/mpl_image_tests/baseline_images/test_08_cummings_multi_group_paired.png # nbs/tests/mpl_image_tests/baseline_images/test_09_cummings_shared_control.png # nbs/tests/mpl_image_tests/baseline_images/test_101_gardner_altman_unpaired_propdiff.png # nbs/tests/mpl_image_tests/baseline_images/test_103_cummings_two_group_unpaired_propdiff.png # nbs/tests/mpl_image_tests/baseline_images/test_105_cummings_multi_group_unpaired_propdiff.png # nbs/tests/mpl_image_tests/baseline_images/test_106_cummings_shared_control_propdiff.png # nbs/tests/mpl_image_tests/baseline_images/test_107_cummings_multi_groups_propdiff.png # nbs/tests/mpl_image_tests/baseline_images/test_109_gardner_altman_ylabel.png # nbs/tests/mpl_image_tests/baseline_images/test_10_cummings_multi_groups.png # nbs/tests/mpl_image_tests/baseline_images/test_110_change_fig_size.png # nbs/tests/mpl_image_tests/baseline_images/test_111_change_palette_b.png # nbs/tests/mpl_image_tests/baseline_images/test_112_change_palette_c.png # nbs/tests/mpl_image_tests/baseline_images/test_113_desat.png # 
nbs/tests/mpl_image_tests/baseline_images/test_114_change_ylims.png # nbs/tests/mpl_image_tests/baseline_images/test_115_invert_ylim.png # nbs/tests/mpl_image_tests/baseline_images/test_116_ticker_gardner_altman.png # nbs/tests/mpl_image_tests/baseline_images/test_117_err_color.png # nbs/tests/mpl_image_tests/baseline_images/test_118_cummings_two_group_unpaired_meandiff_bar_width.png # nbs/tests/mpl_image_tests/baseline_images/test_119_wide_df_nan.png # nbs/tests/mpl_image_tests/baseline_images/test_11_inset_plots.png # nbs/tests/mpl_image_tests/baseline_images/test_120_long_df_nan.png # nbs/tests/mpl_image_tests/baseline_images/test_121_cohens_h_gardner_altman.png # nbs/tests/mpl_image_tests/baseline_images/test_122_cohens_h_cummings.png # nbs/tests/mpl_image_tests/baseline_images/test_123_sankey_gardner_altman.png # nbs/tests/mpl_image_tests/baseline_images/test_124_sankey_cummings.png # nbs/tests/mpl_image_tests/baseline_images/test_125_sankey_2paired_groups.png # nbs/tests/mpl_image_tests/baseline_images/test_126_sankey_2sequential_groups.png # nbs/tests/mpl_image_tests/baseline_images/test_127_sankey_multi_group_paired.png # nbs/tests/mpl_image_tests/baseline_images/test_128_sankey_transparency.png # nbs/tests/mpl_image_tests/baseline_images/test_129_zero_to_zero.png # nbs/tests/mpl_image_tests/baseline_images/test_12_gardner_altman_ylabel.png # nbs/tests/mpl_image_tests/baseline_images/test_130_zero_to_one.png # nbs/tests/mpl_image_tests/baseline_images/test_131_one_to_zero.png # nbs/tests/mpl_image_tests/baseline_images/test_132_shared_control_sankey_off.png # nbs/tests/mpl_image_tests/baseline_images/test_133_shared_control_flow_off.png # nbs/tests/mpl_image_tests/baseline_images/test_134_separate_control_sankey_off.png # nbs/tests/mpl_image_tests/baseline_images/test_135_separate_control_flow_off.png # nbs/tests/mpl_image_tests/baseline_images/test_136_style_sheets.png # nbs/tests/mpl_image_tests/baseline_images/test_13_multi_2group_color.png # 
nbs/tests/mpl_image_tests/baseline_images/test_14_gardner_altman_paired_color.png # nbs/tests/mpl_image_tests/baseline_images/test_15_change_palette_a.png # nbs/tests/mpl_image_tests/baseline_images/test_16_change_palette_b.png # nbs/tests/mpl_image_tests/baseline_images/test_17_change_palette_c.png # nbs/tests/mpl_image_tests/baseline_images/test_18_desat.png # nbs/tests/mpl_image_tests/baseline_images/test_19_dot_sizes.png # nbs/tests/mpl_image_tests/baseline_images/test_201_forest_plot_no_colorpalette.png # nbs/tests/mpl_image_tests/baseline_images/test_202_forest_plot_with_colorpalette.png # nbs/tests/mpl_image_tests/baseline_images/test_203_horizontal_forest_plot_no_colorpalette.png # nbs/tests/mpl_image_tests/baseline_images/test_204_horizontal_forest_plot_with_colorpalette.png # nbs/tests/mpl_image_tests/baseline_images/test_205_forest_mini_meta_horizontal.png # nbs/tests/mpl_image_tests/baseline_images/test_206_forest_mini_meta.png # nbs/tests/mpl_image_tests/baseline_images/test_207_gardner_altman_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_208_cummings_two_group_unpaired_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_209_cummings_shared_control_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_20_change_ylims.png # nbs/tests/mpl_image_tests/baseline_images/test_210_cummings_multi_groups_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_211_cummings_multi_2_group_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_212_cummings_unpaired_delta_delta_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_213_cummings_unpaired_mini_meta_meandiff_empty_circle.png # nbs/tests/mpl_image_tests/baseline_images/test_214_change_idx_order_custom_palette_original.png # nbs/tests/mpl_image_tests/baseline_images/test_215_change_idx_order_custom_palette_new.png # 
nbs/tests/mpl_image_tests/baseline_images/test_21_invert_ylim.png # nbs/tests/mpl_image_tests/baseline_images/test_22_ticker_gardner_altman.png # nbs/tests/mpl_image_tests/baseline_images/test_23_ticker_cumming.png # nbs/tests/mpl_image_tests/baseline_images/test_24_wide_df_nan.png # nbs/tests/mpl_image_tests/baseline_images/test_25_long_df_nan.png # nbs/tests/mpl_image_tests/baseline_images/test_26_slopegraph_kwargs.png # nbs/tests/mpl_image_tests/baseline_images/test_27_gardner_altman_reflines_kwargs.png # nbs/tests/mpl_image_tests/baseline_images/test_28_unpaired_cumming_reflines_kwargs.png # nbs/tests/mpl_image_tests/baseline_images/test_29_paired_cumming_slopegraph_reflines_kwargs.png # nbs/tests/mpl_image_tests/baseline_images/test_30_sequential_cumming_slopegraph.png # nbs/tests/mpl_image_tests/baseline_images/test_31_baseline_cumming_slopegraph.png # nbs/tests/mpl_image_tests/baseline_images/test_47_cummings_unpaired_delta_delta_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_48_cummings_sequential_delta_delta_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_49_cummings_baseline_delta_delta_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_50_delta_plot_ylabel.png # nbs/tests/mpl_image_tests/baseline_images/test_51_delta_plot_change_palette_a.png # nbs/tests/mpl_image_tests/baseline_images/test_52_delta_specified.png # nbs/tests/mpl_image_tests/baseline_images/test_53_delta_change_ylims.png # nbs/tests/mpl_image_tests/baseline_images/test_54_delta_invert_ylim.png # nbs/tests/mpl_image_tests/baseline_images/test_55_delta_median_diff.png # nbs/tests/mpl_image_tests/baseline_images/test_56_delta_cohens_d.png # nbs/tests/mpl_image_tests/baseline_images/test_57_delta_show_delta2.png # nbs/tests/mpl_image_tests/baseline_images/test_58_delta_axes_invert_ylim.png # nbs/tests/mpl_image_tests/baseline_images/test_59_delta_axes_invert_ylim_not_showing_delta2.png # 
nbs/tests/mpl_image_tests/baseline_images/test_60_cummings_unpaired_mini_meta_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_61_cummings_sequential_mini_meta_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_62_cummings_baseline_mini_meta_meandiff.png # nbs/tests/mpl_image_tests/baseline_images/test_63_mini_meta_plot_ylabel.png # nbs/tests/mpl_image_tests/baseline_images/test_64_mini_meta_plot_change_palette_a.png # nbs/tests/mpl_image_tests/baseline_images/test_65_mini_meta_dot_sizes.png # nbs/tests/mpl_image_tests/baseline_images/test_66_mini_meta_change_ylims.png # nbs/tests/mpl_image_tests/baseline_images/test_67_mini_meta_invert_ylim.png # nbs/tests/mpl_image_tests/baseline_images/test_68_mini_meta_median_diff.png # nbs/tests/mpl_image_tests/baseline_images/test_69_mini_meta_cohens_d.png # nbs/tests/mpl_image_tests/baseline_images/test_70_mini_meta_not_show.png # nbs/tests/mpl_image_tests/baseline_images/test_71_unpaired_delta_g.png # nbs/tests/mpl_image_tests/baseline_images/test_72_sequential_delta_g.png # nbs/tests/mpl_image_tests/baseline_images/test_73_baseline_delta_g.png # nbs/tests/mpl_image_tests/baseline_images/test_99_style_sheets.png # nbs/tests/mpl_image_tests/test_plot_aesthetics.py
2 parents cada3ca + 3de867f commit aa61ed2

315 files changed (+1004, -401 lines changed)

dabest/__init__.py

Lines changed: 9 additions & 1 deletion
@@ -1,6 +1,14 @@
 from ._api import load, prop_dataset
 from ._stats_tools import effsize as effsize
+from ._stats_tools import confint_2group_diff as ci_2g
 from ._effsize_objects import TwoGroupsEffectSize, PermutationTest
 from ._dabest_object import Dabest
 
-__version__ = "2024.03.29"
+
+import os
+if os.environ.get('SKIP_NUMBA_COMPILE') != '1':
+    from ._stats_tools.precompile import precompile_all, _NUMBA_COMPILED
+    if not _NUMBA_COMPILED:
+        precompile_all()
+
+__version__ = "2024.03.30"
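
Review note: the import-time guard above runs the Numba precompile step unless `SKIP_NUMBA_COMPILE` is exactly the string `'1'`. A minimal sketch of that check in isolation, assuming nothing about `precompile_all` itself; the helper name `should_precompile` is hypothetical and not part of dabest:

```python
import os


def should_precompile(environ=None):
    """Mirror the guard in dabest/__init__.py (hypothetical helper).

    Returns True unless SKIP_NUMBA_COMPILE is set to exactly "1".
    """
    env = os.environ if environ is None else environ
    return env.get("SKIP_NUMBA_COMPILE") != "1"


# Only the literal "1" disables precompilation; other values do not.
print(should_precompile({}))
print(should_precompile({"SKIP_NUMBA_COMPILE": "1"}))
print(should_precompile({"SKIP_NUMBA_COMPILE": "true"}))
```

Because the comparison is against the literal `'1'`, values such as `true` or `yes` would still trigger precompilation, so the variable has to be set before `import dabest` and with that exact value.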

dabest/_bootstrap_tools.py

Lines changed: 8 additions & 3 deletions
@@ -66,7 +66,9 @@ def __init__(
         reps: int = 5000,  # Number of bootstrap iterations to perform.
     ):
         # Turn to pandas series.
-        x1 = pd.Series(x1).dropna()
+        # x1 = pd.Series(x1).dropna()
+        x1 = x1[~np.isnan(x1)]
+
         diff = False
 
         # Initialise stat_function
@@ -89,7 +91,9 @@ def __init__(
             if x2 is None:
                 raise ValueError("Please specify x2.")
 
-            x2 = pd.Series(x2).dropna()
+            # x2 = pd.Series(x2).dropna()
+            x2 = x2[~np.isnan(x2)]
+
             if len(x1) != len(x2):
                 raise ValueError("x1 and x2 are not the same length.")
 
@@ -134,7 +138,8 @@ def __init__(
 
         elif x2 is not None and paired is None:
             diff = True
-            x2 = pd.Series(x2).dropna()
+            # x2 = pd.Series(x2).dropna()
+            x2 = x2[~np.isnan(x2)]
             # Generate statarrays for both arrays.
             ref_statarray = sns.algorithms.bootstrap(x1, **sns_bootstrap_kwargs)
             exp_statarray = sns.algorithms.bootstrap(x2, **sns_bootstrap_kwargs)
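
Review note: replacing `pd.Series(x).dropna()` with a boolean mask assumes the input is already a NumPy float array, and the mask has to come from the same array it filters. A minimal sketch of both points; `drop_nan` and `drop_nan_pairs` are hypothetical helpers, and the pairwise variant is an alternative for paired data, not what this commit does:

```python
import numpy as np


def drop_nan(values):
    """Return the non-NaN entries of `values` as a float array."""
    arr = np.asarray(values, dtype=float)  # lists would fail with a bare mask
    return arr[~np.isnan(arr)]


def drop_nan_pairs(x1, x2):
    """Keep only positions where both equal-length arrays are non-NaN."""
    a1 = np.asarray(x1, dtype=float)
    a2 = np.asarray(x2, dtype=float)
    keep = ~(np.isnan(a1) | np.isnan(a2))
    return a1[keep], a2[keep]


print(drop_nan([1.0, np.nan, 3.0]))
print(drop_nan_pairs([1.0, np.nan, 3.0], [4.0, 5.0, np.nan]))
```

Filtering each array on its own can leave paired inputs with different lengths or misaligned observations, which is why the pairwise mask is sketched alongside.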

dabest/_dabest_object.py

Lines changed: 3 additions & 3 deletions
@@ -112,7 +112,7 @@ def __init__(
         # Determine the kind of estimation plot we need to produce.
         if all([isinstance(i, (str, int, float)) for i in idx]):
             # flatten out idx.
-            all_plot_groups = pd.unique([t for t in idx]).tolist()
+            all_plot_groups = pd.Series([t for t in idx]).unique().tolist()
             if len(idx) > len(all_plot_groups):
                 err0 = "`idx` contains duplicated groups. Please remove any duplicates and try again."
                 raise ValueError(err0)
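
Review note: the duplicate check above compares the length of `idx` with the length of its de-duplicated, order-preserving form. A standard-library sketch of the same idea, using `dict.fromkeys` as a stand-in for `pd.Series(...).unique()`; the function names are hypothetical:

```python
def unique_in_order(groups):
    """De-duplicate while keeping first-seen order (dict keys are ordered)."""
    return list(dict.fromkeys(groups))


def check_no_duplicates(idx):
    """Return the unique groups, or raise if `idx` repeats a group."""
    all_plot_groups = unique_in_order(idx)
    if len(idx) > len(all_plot_groups):
        raise ValueError(
            "`idx` contains duplicated groups. "
            "Please remove any duplicates and try again."
        )
    return all_plot_groups


print(check_no_duplicates(["control", "test 1", "test 2"]))
```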
@@ -663,9 +663,9 @@ def _get_plot_data(self, x, y, all_plot_groups):
 
 
         if isinstance(plot_data[self.__xvar].dtype, pd.CategoricalDtype):
-            plot_data[self.__xvar].cat.remove_unused_categories(inplace=True)
+            plot_data[self.__xvar].cat.remove_unused_categories()
             plot_data[self.__xvar].cat.reorder_categories(
-                all_plot_groups, ordered=True, inplace=True
+                all_plot_groups, ordered=True
             )
         else:
             plot_data[self.__xvar] = pd.Categorical(
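
Review note: without `inplace=True`, the `.cat` accessor methods return a new Series and leave the original untouched, so it is worth checking whether the results in this hunk should be assigned back. A minimal sketch of the behaviour on a toy Series (the data here is made up for illustration):

```python
import pandas as pd

s = pd.Series(pd.Categorical(["b", "a", "b"], categories=["a", "b", "c"]))

# Calling the accessor without assignment does not modify `s`.
s.cat.remove_unused_categories()
before = list(s.cat.categories)  # still includes the unused "c"

# Assigning the result back is what actually changes the column.
s = s.cat.remove_unused_categories()
s = s.cat.reorder_categories(["b", "a"], ordered=True)
after = list(s.cat.categories)

print(before, after, s.cat.ordered)
```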

dabest/_delta_objects.py

Lines changed: 4 additions & 3 deletions
@@ -388,13 +388,14 @@ def __init__(self, effectsizedataframe, permutation_count,
         # compute the variances of each control group and each test group
         control_var=[]
         test_var=[]
+        grouped_data = {name: group[yvar].copy() for name, group in dat.groupby(xvar, observed=False)}
         for j, current_tuple in enumerate(idx):
             cname = current_tuple[0]
-            control = dat[dat[xvar] == cname][yvar].copy()
+            control = grouped_data[cname]
             control_var.append(np.var(control, ddof=1))
 
             tname = current_tuple[1]
-            test = dat[dat[xvar] == tname][yvar].copy()
+            test = grouped_data[tname]
             test_var.append(np.var(test, ddof=1))
         self.__control_var = np.array(control_var)
         self.__test_var = np.array(test_var)
@@ -414,7 +415,7 @@ def __init__(self, effectsizedataframe, permutation_count,
                                                       self.__bootstraps)
 
         # Compute the weighted average mean difference based on the raw data
-        self.__difference = es.weighted_delta(self.__effsizedf["difference"],
+        self.__difference = es.weighted_delta(np.array(self.__effsizedf["difference"]),
                                               self.__group_var)
 
         sorted_weighted_deltas = npsort(self.__bootstraps_weighted_delta)
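
Review note: the `grouped_data` dictionary replaces one boolean filter per lookup with a single `groupby` pass. A minimal sketch on a toy frame (column names and values are illustrative assumptions) showing that both routes return the same Series for a group that exists:

```python
import pandas as pd

dat = pd.DataFrame(
    {
        "group": ["ctrl", "ctrl", "test", "test"],
        "value": [1.0, 2.0, 3.0, 5.0],
    }
)

# One pass over the data: group name -> copy of that group's values.
grouped_data = {
    name: group["value"].copy()
    for name, group in dat.groupby("group", observed=False)
}

# The pre-commit approach: filter the whole frame for each lookup.
filtered = dat[dat["group"] == "test"]["value"].copy()

print(grouped_data["test"].equals(filtered))
```

One behavioural difference to keep in mind: a group name that is absent from the data raises `KeyError` on the dictionary lookup, whereas the boolean filter silently returns an empty Series.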

dabest/_effsize_objects.py

Lines changed: 172 additions & 22 deletions
@@ -9,6 +9,7 @@
 import pandas as pd
 import lqrt
 from scipy.stats import norm
+import numpy as np
 from numpy import array, isnan, isinf, repeat, random, isin, abs, var
 from numpy import sort as npsort
 from numpy import nan as npnan
@@ -167,6 +168,8 @@ def __init__(
         self.__pct_interval_idx = (pct_idx_low, pct_idx_high)
         self.__pct_low = sorted_bootstraps[pct_idx_low]
         self.__pct_high = sorted_bootstraps[pct_idx_high]
+
+        self._get_bootstrap_baseline_ec()
 
         self._perform_statistical_test()
 
@@ -355,12 +358,11 @@ def _perform_statistical_test(self):
             # References:
             # https://en.wikipedia.org/wiki/McNemar%27s_test
 
-            df_temp = pd.DataFrame({"control": self.__control, "test": self.__test})
-            x1 = len(df_temp[(df_temp["control"] == 0) & (df_temp["test"] == 0)])
-            x2 = len(df_temp[(df_temp["control"] == 0) & (df_temp["test"] == 1)])
-            x3 = len(df_temp[(df_temp["control"] == 1) & (df_temp["test"] == 0)])
-            x4 = len(df_temp[(df_temp["control"] == 1) & (df_temp["test"] == 1)])
-            table = [[x1, x2], [x3, x4]]
+            x1 = np.sum((self.__control == 0) & (self.__test == 0))
+            x2 = np.sum((self.__control == 0) & (self.__test == 1))
+            x3 = np.sum((self.__control == 1) & (self.__test == 0))
+            x4 = np.sum((self.__control == 1) & (self.__test == 1))
+            table = np.array([[x1, x2], [x3, x4]])
             _mcnemar = mcnemar(table, exact=True, correction=True)
             self.__pvalue_mcnemar = _mcnemar.pvalue
             self.__statistic_mcnemar = _mcnemar.statistic
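
Review note: the 2x2 table for McNemar's test is now built by summing boolean masks instead of filtering a temporary DataFrame. A minimal sketch of the counting step alone, on made-up paired binary data; `paired_binary_table` is a hypothetical name, and the call to `mcnemar` is deliberately left out:

```python
import numpy as np


def paired_binary_table(control, test):
    """Count the (control, test) outcome pairs into a 2x2 table.

    Rows index the control outcome (0, 1); columns index the test outcome.
    """
    control = np.asarray(control)
    test = np.asarray(test)
    x1 = np.sum((control == 0) & (test == 0))
    x2 = np.sum((control == 0) & (test == 1))
    x3 = np.sum((control == 1) & (test == 0))
    x4 = np.sum((control == 1) & (test == 1))
    return np.array([[x1, x2], [x3, x4]])


table = paired_binary_table([0, 0, 1, 1, 1, 0], [0, 1, 0, 1, 1, 1])
print(table)
```

The off-diagonal cells (`x2` and `x3`) are the discordant pairs that drive McNemar's test; the four cells sum to the number of pairs when every value is 0 or 1.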
@@ -435,6 +437,92 @@ def to_dict(self):
         for a in attrs:
             out[a] = getattr(self, a)
         return out
+
+    def _get_bootstrap_baseline_ec(self):
+        from ._stats_tools import confint_2group_diff as ci2g
+        from ._stats_tools import effsize as es
+
+        # Cannot use self.__is_paired because it's for baseline curve
+        is_paired = None
+
+        difference = es.two_group_difference(
+            self.__control, self.__control, is_paired, self.__effect_size
+        )
+        self.__bec_difference = difference
+
+        jackknives = ci2g.compute_meandiff_jackknife(
+            self.__control, self.__control, is_paired, self.__effect_size
+        )
+
+        acceleration_value = ci2g._calc_accel(jackknives)
+
+        bootstraps = ci2g.compute_bootstrapped_diff(
+            self.__control,
+            self.__control,
+            is_paired,
+            self.__effect_size,
+            self.__resamples,
+            self.__random_seed,
+        )
+        self.__bootstraps_baseline_ec = bootstraps
+
+        sorted_bootstraps = npsort(self.__bootstraps_baseline_ec)
+        # We don't have to consider infinities in bootstrap_baseline_ec
+
+        bias_correction = ci2g.compute_meandiff_bias_correction(
+            self.__bootstraps_baseline_ec, difference
+        )
+
+        # Compute BCa intervals.
+        bca_idx_low, bca_idx_high = ci2g.compute_interval_limits(
+            bias_correction,
+            acceleration_value,
+            self.__resamples,
+            self.__ci,
+        )
+
+        self.__bec_bca_interval_idx = (bca_idx_low, bca_idx_high)
+
+        if ~isnan(bca_idx_low) and ~isnan(bca_idx_high):
+            self.__bec_bca_low = sorted_bootstraps[bca_idx_low]
+            self.__bec_bca_high = sorted_bootstraps[bca_idx_high]
+
+            err1 = "The $lim_type limit of the interval"
+            err2 = "was in the $loc 10 values."
+            err3 = "The result for baseline curve should be considered unstable."
+            err_temp = Template(" ".join([err1, err2, err3]))
+
+            if bca_idx_low <= 10:
+                warnings.warn(
+                    err_temp.substitute(lim_type="lower", loc="bottom"), stacklevel=1
+                )
+
+            if bca_idx_high >= self.__resamples - 9:
+                warnings.warn(
+                    err_temp.substitute(lim_type="upper", loc="top"), stacklevel=1
+                )
+
+        else:
+            err1 = "The $lim_type limit of the BCa interval of baseline curve cannot be computed."
+            err2 = "It is set to the effect size itself."
+            err3 = "All bootstrap values were likely all the same."
+            err_temp = Template(" ".join([err1, err2, err3]))
+
+            if isnan(bca_idx_low):
+                self.__bec_bca_low = difference
+                warnings.warn(err_temp.substitute(lim_type="lower"), stacklevel=0)
+
+            if isnan(bca_idx_high):
+                self.__bec_bca_high = difference
+                warnings.warn(err_temp.substitute(lim_type="upper"), stacklevel=0)
+
+        # Compute percentile intervals.
+        pct_idx_low = int((self.__alpha / 2) * self.__resamples)
+        pct_idx_high = int((1 - (self.__alpha / 2)) * self.__resamples)
+
+        self.__bec_pct_interval_idx = (pct_idx_low, pct_idx_high)
+        self.__bec_pct_low = sorted_bootstraps[pct_idx_low]
+        self.__bec_pct_high = sorted_bootstraps[pct_idx_high]
 
     @property
     def difference(self):
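
Review note: `_get_bootstrap_baseline_ec` bootstraps the control group against itself and then reads interval limits off the sorted bootstrap values. The sketch below illustrates only the percentile part of that idea with the standard library. It assumes the two resamples are drawn independently, that the effect size is a mean difference, and that `alpha` equals `(100 - ci) / 100`; it does not reproduce dabest's `ci2g` routines or the BCa correction, and the function name is hypothetical.

```python
import random
from statistics import mean


def baseline_percentile_interval(control, resamples=5000, ci=95, seed=12345):
    """Percentile interval for the mean difference of control vs itself."""
    rng = random.Random(seed)
    n = len(control)
    diffs = []
    for _ in range(resamples):
        # Two independent resamples (with replacement) of the same group.
        a = [rng.choice(control) for _ in range(n)]
        b = [rng.choice(control) for _ in range(n)]
        diffs.append(mean(b) - mean(a))
    diffs.sort()

    # Same index arithmetic as the percentile block in the diff above.
    alpha = (100 - ci) / 100
    pct_idx_low = int((alpha / 2) * resamples)
    pct_idx_high = int((1 - (alpha / 2)) * resamples)
    return diffs[pct_idx_low], diffs[pct_idx_high]


control = [float(v) for v in range(1, 21)]
low, high = baseline_percentile_interval(control, resamples=500)
print(low, high)
```

Since both resamples come from the same data, the interval is expected to straddle zero; its width gives a rough sense of how much apparent "effect" sampling noise alone can produce.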
@@ -671,6 +759,54 @@ def proportional_difference(self):
             return self.__proportional_difference
         except AttributeError:
             return npnan
+
+    @property
+    def bec_difference(self):
+        return self.__bec_difference
+
+    @property
+    def bec_bootstraps(self):
+        """
+        The generated baseline error bootstraps.
+        """
+        return self.__bootstraps_baseline_ec
+
+    @property
+    def bec_bca_interval_idx(self):
+        return self.__bec_bca_interval_idx
+
+    @property
+    def bec_bca_low(self):
+        """
+        The bias-corrected and accelerated confidence interval lower limit for baseline error.
+        """
+        return self.__bec_bca_low
+
+    @property
+    def bec_bca_high(self):
+        """
+        The bias-corrected and accelerated confidence interval upper limit for baseline error.
+        """
+        return self.__bec_bca_high
+
+    @property
+    def bec_pct_interval_idx(self):
+        return self.__bec_pct_interval_idx
+
+    @property
+    def bec_pct_low(self):
+        """
+        The percentile confidence interval lower limit for baseline error.
+        """
+        return self.__bec_pct_low
+
+    @property
+    def bec_pct_high(self):
+        """
+        The percentile confidence interval upper limit for baseline error.
+        """
+        return self.__bec_pct_high
+
 
 # %% ../nbs/API/effsize_objects.ipynb 10
 class EffectSizeDataFrame(object):
@@ -725,18 +861,19 @@ def __pre_calc(self):
         out = []
         reprs = []
 
+        grouped_data = {name: group[yvar].copy() for name, group in dat.groupby(xvar, observed=False)}
         if self.__delta2:
             mixed_data = []
             for j, current_tuple in enumerate(idx):
                 if self.__is_paired != "sequential":
                     cname = current_tuple[0]
-                    control = dat[dat[xvar] == cname][yvar].copy()
+                    control = grouped_data[cname]
 
                 for ix, tname in enumerate(current_tuple[1:]):
                     if self.__is_paired == "sequential":
                         cname = current_tuple[ix]
-                        control = dat[dat[xvar] == cname][yvar].copy()
-                    test = dat[dat[xvar] == tname][yvar].copy()
+                        control = grouped_data[cname]
+                    test = grouped_data[tname]
                     mixed_data.append(control)
                     mixed_data.append(test)
             bootstraps_delta_delta = ci2g.compute_delta2_bootstrapped_diff(
@@ -752,13 +889,13 @@ def __pre_calc(self):
         for j, current_tuple in enumerate(idx):
             if self.__is_paired != "sequential":
                 cname = current_tuple[0]
-                control = dat[dat[xvar] == cname][yvar].copy()
+                control = grouped_data[cname]
 
             for ix, tname in enumerate(current_tuple[1:]):
                 if self.__is_paired == "sequential":
                     cname = current_tuple[ix]
-                    control = dat[dat[xvar] == cname][yvar].copy()
-                test = dat[dat[xvar] == tname][yvar].copy()
+                    control = grouped_data[cname]
+                test = grouped_data[tname]
 
                 result = TwoGroupsEffectSize(
                     control,
@@ -843,6 +980,14 @@ def __pre_calc(self):
             "pvalue_kruskal",
             "statistic_kruskal",
             "proportional_difference",
+            "bec_difference",
+            "bec_bootstraps",
+            "bec_bca_interval_idx",
+            "bec_bca_low",
+            "bec_bca_high",
+            "bec_pct_interval_idx",
+            "bec_pct_low",
+            "bec_pct_high",
         ]
         self.__results = out_.reindex(columns=columns_in_order)
         self.__results.dropna(axis="columns", how="all", inplace=True)
@@ -911,16 +1056,18 @@ def __calc_lqrt(self):
 
         out = []
 
+        grouped_data = {name:group[yvar].copy() for name, group in dat.groupby(xvar)}
+
         for j, current_tuple in enumerate(db_obj.idx):
             if self.__is_paired != "sequential":
                 cname = current_tuple[0]
-                control = dat[dat[xvar] == cname][yvar].copy()
+                control = grouped_data[cname]
 
             for ix, tname in enumerate(current_tuple[1:]):
                 if self.__is_paired == "sequential":
                     cname = current_tuple[ix]
-                    control = dat[dat[xvar] == cname][yvar].copy()
-                test = dat[dat[xvar] == tname][yvar].copy()
+                    control = grouped_data[cname]
+                test = grouped_data[tname]
 
                 if self.__is_paired:
                     # Refactored here in v0.3.0 for performance issues.
@@ -1043,6 +1190,9 @@ def plot(
 
         es_paired_lines=True,
         es_paired_lines_kwargs=None,
+
+        # Baseline EffectSize Curve
+        show_baseline_ec=False,
     ):
         """
         Creates an estimation plot for the effect size of interest.
@@ -1222,19 +1372,12 @@ def plot(
             Pass relevant keyword arguments. If None, the following keywords are passed:
             {"color": 'k', "marker": "^", "alpha": 0.5, "zorder": -1, "size": 3, "side": "right"}
 
-        horizontal : boolean, default False
-            Whether or not to plot the effect size plot in a horizontal format.
         horizontal_table_kwargs : dict, default None
-            Pass relevant keyword arguments to the horizontal table. If None, the following keywords are passed:
             {'show': True, 'color' : 'yellow', 'alpha' :0.2, 'fontsize' : 12, 'text_color' : 'black',
             'text_units' : None, 'control_marker' : '-', 'fontsize_label': 12, 'label': 'Δ'}
 
         gridkey_rows : list, default None
-            Provide a list of row labels for the gridkey. The supplied idx is
-            checked against the row labels to determine whether the corresponding
             cell should be populated or not.
-            This can also be set to "auto", which will attempt to auto populate the table.
-        gridkey_kwargs : dict, default None
             Pass relevant keyword arguments to the gridkey. If None, the following keywords are passed:
             { 'show_es' : True, # If True, the gridkey will show the effect size of each comparison.
             'show_Ns' :True, # If True, the gridkey will show the number of observations in each group.
@@ -1262,6 +1405,13 @@ def plot(
             Pass relevant plot keyword arguments. If None, the following keywords are passed:
             {"linestyle": "-", "linewidth": 2, "zorder": -2, "color": 'dimgray', "alpha": 1}
 
+        show_baseline_ec : boolean, default False
+            Whether or not to display the baseline error curve. The baseline error curve
+            represents the distribution of the effect size when comparing the control
+            group to itself, providing a reference for the inherent variability or noise
+            in the data. When True, this curve is plotted alongside the main effect size
+            distribution, allowing for a visual comparison of the observed effect against
+            the baseline variability.
 
         Returns
         -------
