Commit e8e672d

Merge pull request #13 from PharmCat/dev
Dev
2 parents cd2d8fc + d26332c commit e8e672d

5 files changed, with 113 additions and 43 deletions

.github/workflows/Documenter.yml

Lines changed: 1 addition & 1 deletion
Original file line number / Diff line number / Diff line change
@@ -24,7 +24,7 @@ jobs:
2424
runs-on: ubuntu-latest
2525
timeout-minutes: 30
2626
steps:
27-
- uses: actions/checkout@v2
27+
- uses: actions/checkout@v3
2828
- uses: julia-actions/julia-buildpkg@latest
2929
- uses: julia-actions/julia-docdeploy@latest
3030
env:

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
name = "MetidaStats"
22
uuid = "75cdad26-409a-4e43-8ad7-d54b4fa665a0"
33
authors = ["PharmCat <v.s.arnautov@yandex.ru>"]
4-
version = "0.2.1"
4+
version = "0.2.2"
55

66
[deps]
77

README.md

Lines changed: 66 additions & 0 deletions
@@ -11,3 +11,69 @@ Metida descriptive statistics.
1111
```
1212
import Pkg; Pkg.add(url = "https://github.com/PharmCat/MetidaStats.jl.git")
1313
```
14+
15+
## Import DataFrame
16+
17+
```
18+
data = CSV.File("somedata.csv") |> DataFrame
19+
20+
# variables to analyze
21+
vars = [:Cmax, :AUClast]
22+
23+
# sorting variables
24+
sort = [:form, :period]
25+
26+
ds = dataimport(data; vars = vars, sort = sort)
27+
```
28+
29+
## Get descriptive statistics
30+
31+
```
32+
descriptives(ds, stats = [:n, :mean, :var])
33+
```
34+
35+
## Or without dataimport step
36+
37+
```
38+
descriptives(data; vars = vars, sort = sort, stats = [:n, :mean, :var])
39+
```
40+
41+
Keywords:
42+
43+
- `skipmissing` - drop NaN and Missing values, default = true;
44+
- `skipnonpositive` - drop non-positive values (and NaN, Missing) for "log-statistics" - :geom, :geomean, :logmean, :logvar, :geocv;
45+
- `stats` - default set `stats = [:n, :mean, :sd, :se, :median, :min, :max]`;
46+
- `corrected` - use corrected var (true);
47+
- `level` - level for confidence intervals (0.95);
48+
49+
Possible values for `stats` are:
50+
51+
* :n - number of observations;
52+
* :posn - positive (non-negative) number of observations;
53+
* :mean - arithmetic mean;
54+
* :var - variance;
55+
* :bvar - variance with no correction;
56+
* :geom - geometric mean;
57+
* :logmean - arithmetic mean for log-transformed data;
58+
* :logvar - variance for log-transformed data;
59+
* :sd - standard deviation (or σ);
60+
* :se - standard error;
61+
* :cv - coefficient of variation;
62+
* :geocv - coefficient of variation for log-transformed data;
63+
* :lci - lower confidence interval;
64+
* :uci - upper confidence interval;
65+
* :lmeanci - lower confidence interval for mean;
66+
* :umeanci - upper confidence interval for mean;
67+
* :median - median;
68+
* :min - minimum;
69+
* :max - maximum;
70+
* :range - range;
71+
* :q1 - lower quartile;
72+
* :q3 - upper quartile;
73+
* :iqr - interquartile range;
74+
* :kurt - kurtosis;
75+
* :skew - skewness;
76+
* :harmmean - harmonic mean;
77+
* :ses - standard error of skewness;
78+
* :sek - standard error of kurtosis;
79+
* :sum - sum.
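
The "log-statistics" in the list above can be illustrated with a minimal sketch that uses only the `Statistics` standard library. It is not the package's implementation: the sample vector is made up, the filtering step only imitates what `skipnonpositive` is described as doing, and the `:geocv` line follows the formula given in the `descriptives` docstring (`sqrt(exp(σ²_log) - 1)`).

```julia
using Statistics

# Hypothetical sample with a zero, a negative value and a NaN.
x = [1.2, 0.0, 3.4, -0.5, 2.2, NaN, 5.1]

# Keep strictly positive, non-NaN values only (assumed behaviour of
# the `skipnonpositive` keyword; the package may differ in details).
positive = filter(v -> !isnan(v) && v > 0, x)

logx    = log.(positive)
logmean = mean(logx)                   # :logmean
logvar  = var(logx; corrected = true)  # :logvar
geom    = exp(logmean)                 # :geom, geometric mean
geocv   = sqrt(exp(logvar) - 1)        # :geocv, per the docstring formula
```

Because the geometric mean is the exponential of the mean of the logs, `geom` here equals the fourth root of the product of the four retained values.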

docs/src/index.md

Lines changed: 4 additions & 4 deletions
@@ -4,7 +4,7 @@
44
CurrentModule = MetidaStats
55
```
66

7-
Metida descriptive statistics.
7+
Metida descriptive statistics - provides tables with categorized descriptive statistics from tabular data.
88

99
*This program comes with absolutely no warranty. No liability is accepted for any loss and risk to public health resulting from use of this software.
1010

@@ -37,19 +37,19 @@ ds[1:5, :]
3737

3838
### Import:
3939

40-
```
40+
```@example dsexample
4141
di = MetidaStats.dataimport(ds, vars = [:var1, :var2], sort = [:col, :row])
4242
```
4343

4444
### Statistics:
4545

46-
```
46+
```@example dsexample
4747
des = MetidaStats.descriptives(di; skipmissing = true, skipnonpositive = true, stats = MetidaStats.STATLIST)
4848
```
4949

5050
### Make DataFrame
5151

52-
```
52+
```@example dsexample
5353
df = DataFrame(des)
5454
```
5555

src/descriptive.jl

Lines changed: 41 additions & 37 deletions
@@ -82,38 +82,41 @@ end
8282
* kwargs:
8383
- `skipmissing` - drop NaN and Missing values, default = true;
8484
- `skipnonpositive` - drop non-positive values (and NaN, Missing) for "log-statistics" - :geom, :geomean, :logmean, :logvar, :geocv;
85-
- `stats` - default set `stats = [:n, :mean, :sd, :se, :median, :min, :max]`
85+
- `stats` - default set `stats = [:n, :mean, :sd, :se, :median, :min, :max]`;
86+
- `corrected` - use corrected var (true);
87+
- `level` - level for confidence intervals (0.95);
8688
8789
Possible values for `stats` are:
90+
8891
* :n - number of observations;
89-
:posn - positive (non-negative) number of observations;
90-
:mean - arithmetic mean;
91-
:var - variance;
92-
:bvar - variance with no correction;
93-
:geom - geometric mean;
94-
:logmean - arithmetic mean for log-transformed data;
95-
:logvar - variance for log-transformed data ``σ^2_{log}``;
96-
:sd - standard deviation (or σ);
97-
:se - standard error;
98-
:cv - coefficient of variation;
99-
:geocv - coefficient of variation for log-transformed data (``CV = sqrt{exp(σ^2_{log})-1}``);
100-
:lci - lower confidence interval;
101-
:uci - upper confidence interval;
102-
:lmeanci - lower confidence interval for mean;
103-
:umeanci - lower confidence interval for mean;
104-
:median - median,;
105-
:min - minimum;
106-
:max - maximum;
107-
:range - range;
108-
:q1 - lower quartile;
109-
:q3,
110-
:iqr,
111-
:kurt,
112-
:skew,
113-
:harmmean,
114-
:ses,
115-
:sek,
116-
:sum
92+
* :posn - positive (non-negative) number of observations;
93+
* :mean - arithmetic mean;
94+
* :var - variance;
95+
* :bvar - variance with no correction;
96+
* :geom - geometric mean;
97+
* :logmean - arithmetic mean for log-transformed data;
98+
* :logvar - variance for log-transformed data ``σ^2_{log}``;
99+
* :sd - standard deviation (or σ);
100+
* :se - standard error;
101+
* :cv - coefficient of variation;
102+
* :geocv - coefficient of variation for log-transformed data (``CV = sqrt{exp(σ^2_{log})-1}``);
103+
* :lci - lower confidence interval;
104+
* :uci - upper confidence interval;
105+
* :lmeanci - lower confidence interval for mean;
106+
* :umeanci - upper confidence interval for mean;
107+
* :median - median;
108+
* :min - minimum;
109+
* :max - maximum;
110+
* :range - range;
111+
* :q1 - lower quartile;
112+
* :q3 - upper quartile;
113+
* :iqr - interquartile range;
114+
* :kurt - kurtosis;
115+
* :skew - skewness;
116+
* :harmmean - harmonic mean;
117+
* :ses - standard error of skewness;
118+
* :sek - standard error of kurtosis;
119+
* :sum - sum.
117120
118121
"""
119122
function descriptives(data, vars, sort = nothing; kwargs...)
@@ -124,6 +127,7 @@ function descriptives(data, vars, sort = nothing; kwargs...)
124127
if eltype(vars) <: Integer vars = Tables.columnnames(data)[vars] end
125128
if !isnothing(sort)
126129
vars = setdiff(vars, sort)
130+
if length(sort) == 0 sort = nothing end
127131
end
128132
descriptives(dataimport_(data, vars, sort); kwargs...)
129133
end
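
The effect of the added `length(sort) == 0` check can be sketched in isolation. The helper below is hypothetical (`normalize_selection` is not part of the package) and assumes `sort` is already given as column names; it only mirrors the normalisation steps visible in this hunk.

```julia
# Hypothetical column names, standing in for Tables.columnnames(data).
cols = [:form, :period, :Cmax, :AUClast]

# Mirrors the selection logic of the hunk above (a sketch, not package code).
function normalize_selection(cols, vars, sort)
    # Integer positions are translated to column names.
    if eltype(vars) <: Integer
        vars = cols[vars]
    end
    if !isnothing(sort)
        # A column cannot be both a sorting key and an analysed variable.
        vars = setdiff(vars, sort)
        # An empty sort vector is treated the same as no sorting at all.
        if length(sort) == 0
            sort = nothing
        end
    end
    return vars, sort
end

v, s = normalize_selection(cols, [3, 4], Symbol[])
v2, s2 = normalize_selection(cols, [:form, :Cmax], [:form])
```

With an empty `sort`, the call falls through to the unsorted path instead of passing an empty vector further down.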
@@ -211,10 +215,10 @@ function descriptives_(obsvec, kwargs, logstats, cicalk)
211215
end
212216
n_ = length(vec)
213217
if cicalk
214-
if n_ > 1 q = quantile(TDist(n_ - 1), 1 - (1-kwargs[:level])/2) end
218+
if n_ > 1 q = quantile(TDist(n_ - 1), 1 - (1 - kwargs[:level]) / 2) end # add tdist / normal option # add multiple CI ?
215219
end
216220
# skipnonpositive
217-
#logstats = makelogvec #calk logstats
221+
# logstats = makelogvec #calk logstats
218222
if logstats
219223
if kwargs[:skipnonpositive]
220224
logvec = log.(skipnonpositive(obsvec))
@@ -272,21 +276,21 @@ function descriptives_(obsvec, kwargs, logstats, cicalk)
272276
elseif s == :uci
273277
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
274278
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
275-
result[s] = result[:mean] + q*result[:sd]
279+
result[s] = result[:mean] + q * result[:sd]
276280
elseif s == :lci
277281
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
278282
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
279-
result[s] = result[:mean] - q*result[:sd]
283+
result[s] = result[:mean] - q * result[:sd]
280284
elseif s == :umeanci
281285
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
282286
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
283287
haskey(result, :se) || begin result[:se] = result[:sd] / sqrt(n_) end
284-
result[s] = result[:mean] + q*result[:se]
288+
result[s] = result[:mean] + q * result[:se]
285289
elseif s == :lmeanci
286290
haskey(result, :mean) || begin result[:mean] = sum(vec) / n_ end
287291
haskey(result, :sd) || begin result[:sd] = std(vec; corrected = kwargs[:corrected], mean = result[:mean]) end
288292
haskey(result, :se) || begin result[:se] = result[:sd] / sqrt(n_) end
289-
result[s] = result[:mean] - q*result[:se]
293+
result[s] = result[:mean] - q * result[:se]
290294
elseif s == :median
291295
result[s] = median(vec)
292296
elseif s == :min
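
For reference, the four interval statistics computed in this hunk can be written out as formulas. This is a restatement of the code above, with $\bar{x}$ for `result[:mean]`, $s$ for `result[:sd]`, $n$ for `n_`, and $q$ for the Student's t quantile obtained from `TDist(n_ - 1)`, which the code computes only when $n > 1$:

```latex
\begin{aligned}
q &= t_{\,n-1,\; 1 - (1 - \mathrm{level})/2} \\
\mathrm{lci} &= \bar{x} - q\,s, &
\mathrm{uci} &= \bar{x} + q\,s \\
\mathrm{lmeanci} &= \bar{x} - q\,\frac{s}{\sqrt{n}}, &
\mathrm{umeanci} &= \bar{x} + q\,\frac{s}{\sqrt{n}}
\end{aligned}
```

So `:lci` and `:uci` scale the quantile by the standard deviation, while `:lmeanci` and `:umeanci` scale it by the standard error.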
@@ -403,13 +407,13 @@ function MetidaBase.metida_table_(obj::DataSet{DS}; sort = nothing, stats = noth
403407
stats ⊆ STATLIST || error("Some statistics not known!")
404408
if isa(stats, Symbol) stats = [stats] end
405409
if isnothing(sort)
406-
ressetl = collect(intersect(resset, stats))
410+
ressetl = sortbyvec!(collect(intersect(resset, stats)), collect(keys(first(obj).result)))
407411
else
408412
ressetl = sortbyvec!(collect(intersect(resset, stats)), sort)
409413
end
410414
else
411415
if isnothing(sort)
412-
ressetl = collect(resset)
416+
ressetl = sortbyvec!(collect(resset), collect(keys(first(obj).result)))
413417
else
414418
ressetl = sortbyvec!(collect(resset), sort)
415419
end
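
The change in this hunk replaces a bare `collect` of a set intersection, whose order is not guaranteed, with an explicit ordering against a reference vector. The sketch below shows the general idea with a hypothetical helper, `order_by_reference`. It is not `sortbyvec!` itself, whose exact semantics are not shown in this diff.

```julia
# Hypothetical helper: order `items` by their position in `reference`;
# items that do not occur in `reference` are placed last.
function order_by_reference(items, reference)
    position = Dict(v => i for (i, v) in enumerate(reference))
    return sort(items; by = v -> get(position, v, typemax(Int)))
end

reference = [:n, :mean, :sd]              # order in which statistics were requested
found = collect(Set([:sd, :n, :mean]))    # set iteration order is arbitrary
ordered = order_by_reference(found, reference)
```

Here `ordered` comes back as `[:n, :mean, :sd]` regardless of how the set happened to iterate, which is presumably the property the commit wants for the output table columns.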
