Commit 7f0ed73

Merge pull request #201 from Joshua-Dias-Barreto/stat-comment
Added comments for statistical methods in the DataSeries class.
2 parents 71ab914 + 340450c commit 7f0ed73

1 file changed, 25 additions and 1 deletion

src/DataFrame/DataSeries.class.st

Lines changed: 25 additions & 1 deletion
@@ -228,6 +228,8 @@ DataSeries >> correlationWith: otherSeries using: aCorrelationCoefficient [

 { #category : #statistics }
 DataSeries >> crossTabulateWith: aSeries [
+	"A DataFrame is returned which is useful in quantitatively analyzing the relationship of values in one data series with the values in another data series"
+
 	| df |

 	(self size = aSeries size)
@@ -307,6 +309,8 @@ DataSeries >> first [

 { #category : #statistics }
 DataSeries >> firstQuartile [
+	"25% of the values in a set are smaller than or equal to the first Quartile of that set"
+
 	^ self quartile: 1
 ]

@@ -319,6 +323,8 @@ DataSeries >> fourth [

 { #category : #statistics }
 DataSeries >> fourthQuartile [
+	"Fourth Quartile is the maximum value in a set of values"
+
 	^ self quartile: 4
 ]

@@ -417,6 +423,8 @@ DataSeries >> initialize: aCapacity [

 { #category : #statistics }
 DataSeries >> interquartileRange [
+	"The Inter Quartile Range is the difference between the third Quartile and the first Quartile"
+
 	^ self thirdQuartile - self firstQuartile
 ]

@@ -494,6 +502,8 @@ DataSeries >> ninth [

 { #category : #statistics }
 DataSeries >> quantile: aNumber [
+	"A quantile determines how many values in a distribution are above or below a certain limit.
+	Eg: if the parameter aNumber is 85, a value from the data series is returned which is greater than or equal to 85% of the values in the data series"

 	| sortedSeries index |
 	sortedSeries := self withoutNils sorted.
@@ -506,6 +516,9 @@ DataSeries >> quantile: aNumber [

 { #category : #statistics }
 DataSeries >> quartile: aNumber [
+	"Quartiles are three values that split sorted data into four parts, each with an equal number of observations.
+	Eg: if the parameter aNumber is 3, the Third Quartile of the data series is returned"
+
 	^ self quantile: (25 * aNumber)
 ]

@@ -593,6 +606,8 @@ DataSeries >> second [

 { #category : #statistics }
 DataSeries >> secondQuartile [
+	"50% of the values in a set are smaller than or equal to the second Quartile of that set. It is also known as the median"
+
 	^ self quartile: 2
 ]

@@ -679,6 +694,8 @@ DataSeries >> sum [

 { #category : #statistics }
 DataSeries >> summary [
+	"A data series is returned which is a statistical summary of the data series. With keys as different statistical measures and values as the values returned when those statistical measures are applied on the data series."
+
 	| summary |
 	summary := self species new.
 	summary name: self name.
@@ -716,6 +733,8 @@ DataSeries >> third [

 { #category : #statistics }
 DataSeries >> thirdQuartile [
+	"75% of the values in a set are smaller than or equal to the third Quartile of that set"
+
 	^ self quartile: 3
 ]

@@ -730,12 +749,15 @@ DataSeries >> uniqueValues [

 { #category : #statistics }
 DataSeries >> valueCounts [
-
+	"Calculates the frequency of each value in the data series and returns a data series in descending order of frequencies"
+
 	^ (self groupByUniqueValuesAndAggregateUsing: #size) sortDescending
 ]

 { #category : #statistics }
 DataSeries >> valueFrequencies [
+	"Calculates the relative frequency of values in the data series. Relative frequency is the ratio of the number of times a value occurs in a set to the total number of values in the set"
+
 	| count freq |
 	count := self valueCounts.
 	freq := count / self size.
@@ -887,5 +909,7 @@ DataSeries >> withoutNils [

 { #category : #statistics }
 DataSeries >> zerothQuartile [
+	"Zeroth Quartile is the minimum value in a set of values"
+
 	^ self quartile: 0
 ]
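
To try the methods documented in this commit, the following Playground snippet is a minimal sketch. It assumes the DataFrame library is loaded in a Pharo image and that `DataSeries class >> withValues:` is available as a constructor; the sample values and variable names are hypothetical. No results are shown because the index computation inside `quantile:` is not visible in this diff.

```smalltalk
"Hypothetical sample data, not part of the commit."
| scores grades groups |
scores := DataSeries withValues: #(3 7 8 5 12 14 21 13 18).
grades := DataSeries withValues: #(#low #low #low #low #mid #mid #high #mid #high).
groups := DataSeries withValues: #(#a #b #a #b #a #b #a #b #a).

"The quartile helpers delegate to quartile:, which calls quantile: (25 * aNumber)."
scores zerothQuartile.      "documented as the minimum"
scores firstQuartile.
scores secondQuartile.      "documented as the median"
scores thirdQuartile.
scores fourthQuartile.      "documented as the maximum"
scores interquartileRange.  "thirdQuartile - firstQuartile"
scores quantile: 85.

"Frequency helpers."
grades valueCounts.         "count per value, sorted in descending order"
grades valueFrequencies.    "counts divided by the series size"

"Cross tabulation compares sizes first, so both series have nine values."
grades crossTabulateWith: groups.

"Statistical summary of the series, returned as another data series."
scores summary.
```

Evaluate each expression with "Print it" or "Inspect it" to see its result.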
