-Custom distributions can be specified in `defData` and `defDataAdd` by using the specifying "custom" for the argument *dist*. The name of the user-defined function is specified as a string in the *formula* argument. The arguments of the custom function are specified in the *variance* argument, as a comma delimited string. One important feature of the custom function is that the parameter list used to define the function must include "**n = n**", but this will not be included in the data definition.
+Custom distributions can be specified in `defData` and `defDataAdd` by setting the argument *dist* to "custom". When defining a custom distribution, you provide the name of the user-defined function as a string in the *formula* argument. The arguments of the custom function are listed in the *variance* argument, separated by commas and formatted as "**arg_1 = val_form_1, arg_2 = val_form_2, $\dots$, arg_K = val_form_K**".
+
+Here, each *arg_k* is the name of an argument passed to the customized function, where $k$ ranges from $1$ to $K$. You can use a value or a formula for each *val_form_k*. If formulas are used, the variables they reference must have been generated previously. Double-dot notation is available when specifying *val_form_k*. One important requirement of the custom function is that the parameter list used to define the function must include an argument "**n = n**"; do not include $n$ in the definition passed to `defData` or `defDataAdd`.
### Example 1
-Here is an example where we would like to generate data from a zero-inflated beta distribution. In this case, there is a user-defined function `zeroBeta` that takes on shape parameters $a$ and $b$, as well as $p_0$, the proportion of the sample that is zero:
+Here is an example where we would like to generate data from a zero-inflated beta distribution. In this case, there is a user-defined function `zeroBeta` that takes shape parameters $a$ and $b$, as well as $p_0$, the proportion of the sample that is zero. Note that the function also takes an argument $n$ that will not be specified in the data definition; $n$ represents the number of observations being generated:
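The body of `zeroBeta` is not part of this hunk. As a minimal sketch in base R, assuming each observation is set to zero independently with probability `p0` (an assumption, not necessarily the vignette's implementation), such a function could look like this:

```r
# Hypothetical sketch of a zero-inflated beta generator; the actual
# function body is not shown in this diff.
zeroBeta <- function(n, a, b, p0) {
  betas <- rbeta(n, shape1 = a, shape2 = b)   # draws from Beta(a, b)
  is_zero <- rbinom(n, size = 1, prob = p0)   # 1 marks an observation forced to zero
  betas * (1 - is_zero)
}

# The parameter values below are illustrative only.
set.seed(123)
x <- zeroBeta(n = 10000, a = 0.75, b = 0.75, p0 = 0.02)
mean(x == 0)  # should be close to p0
```

Note that `n` appears in the function's parameter list but would be left out of the *variance* string in the data definition.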
-In this case, we are generating a mixture of truncated distributions, where the limits of the truncation vary across three different groups. `rnormt` is a customized (user-defined) function that generates truncated data from a Gaussian distribution. The function requires up to four arguments (the left truncation value, the right truncation value, the distribution average and the standard deviation).
+In this second example, we are generating sets of truncated Gaussian distributions with means ranging from $-1$ to $1$. The limits of the truncation vary across three different groups. `rnormt` is a customized (user-defined) function that generates the truncated Gaussian data. In addition to `n`, the function requires four arguments (the left truncation value, the right truncation value, the distribution mean, and the standard deviation).
```{r}
-rnormt <- function(n, min, max, mu = 0, s = 1.5) {
+rnormt <- function(n, min, max, mu, s) {
  F.a <- pnorm(min, mean = mu, sd = s)
  F.b <- pnorm(max, mean = mu, sd = s)
@@ -105,7 +107,8 @@ rnormt <- function(n, min, max, mu = 0, s = 1.5) {
}
```
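The middle of `rnormt` falls between two hunks and is not shown. One common way to finish a function that starts from `F.a` and `F.b` is inverse-CDF sampling: draw uniforms between the two CDF values and map them back through the normal quantile function. The sketch below assumes that approach; the name `rnormt_sketch` is hypothetical and the code is not taken from the vignette.

```r
# Hypothetical inverse-CDF sampler for a normal distribution truncated
# to [min, max]; assumes the elided body follows this approach.
rnormt_sketch <- function(n, min, max, mu, s) {
  F.a <- pnorm(min, mean = mu, sd = s)   # CDF at the left limit
  F.b <- pnorm(max, mean = mu, sd = s)   # CDF at the right limit
  u <- runif(n, min = F.a, max = F.b)    # uniforms restricted to [F.a, F.b]
  qnorm(u, mean = mu, sd = s)            # map back to the normal scale
}

set.seed(1)
x <- rnormt_sketch(1000, min = -1, max = 1, mu = 0.5, s = 1.5)
range(x)  # stays within the truncation limits
```

Because `pnorm`, `runif`, and `qnorm` are vectorized, `min` and `max` can also be vectors of length `n`, which is what lets the limits vary by group.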
-In this example, the truncation limits vary by group membership. There are three groups. We only pass three parameters (the limits and the mean), using the default standard deviation.
+
+In this example, truncation limits vary based on group membership. Initially, three groups are created, followed by the generation of truncated values. For Group 1, truncation occurs within the range of $-1$ to $1$; for Group 2, it is $-2$ to $2$; and for Group 3, it is $-3$ to $3$. We'll generate three data sets, each with a distinct mean denoted by `M`, using the double-dot notation (`..M`) to implement these different means.
```{r}
def <-
@@ -117,26 +120,40 @@ def <-
  defData(
    varname = "tn",
    formula = "rnormt",
-   variance = "min = -limit, max = limit, mu = 0.5",
+   variance = "min = -limit, max = limit, mu = ..M, s = 1.5",
    dist = "custom"
  )
+
```
-dd <- genData(100000, def)
+The data generation requires three calls to `genData`. The output is a list of three data sets:
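The code that makes these calls is cut off in this diff. The pattern of one data set per mean can still be sketched in base R, without `genData`. Everything below is an assumption made for illustration: the means `c(-1, 0, 1)`, equal group probabilities, a sample size of 1000, and the helper `rtrunc_norm`.

```r
# Hypothetical inverse-CDF sampler for a normal truncated to [min, max]
rtrunc_norm <- function(n, min, max, mu, s) {
  u <- runif(n, pnorm(min, mu, s), pnorm(max, mu, s))
  qnorm(u, mu, s)
}

set.seed(2024)
mus <- c(-1, 0, 1)  # assumed means, one per data set

dd <- lapply(mus, function(M) {
  # group 1, 2 or 3 determines the truncation limit
  limit <- sample(1:3, size = 1000, replace = TRUE)
  data.frame(
    limit = limit,
    tn = rtrunc_norm(1000, min = -limit, max = limit, mu = M, s = 1.5)
  )
})

length(dd)  # one data set per mean
```

Here the function argument `M` plays the role that `..M` plays in the data definition: a value supplied from outside the data set rather than a previously generated column.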
vignettes/simstudy.Rmd (1 addition, 1 deletion)
@@ -231,7 +231,7 @@ The *clusterSize* distribution allocates a total sample size *N* (specified in t
#### custom
-The *custom* distribution facilitates data generation for a user-defined distribution. The name of the user-defined function is specified as a string in the *formula* argument. The arguments of the function are specified in the *variance* argument, as a comma delimited string, such as "**name_arg_1 = value/formula_1, name_arg_2 = value/formula_2, ..., name_arg_K = value/formula_K**". The *name_arg_k*, $x \in \{1,2,...,K\}$, are required to create the $K$ arguments that are passed to the customized function. The *val/form_k* represent either values or a formula that is used to generate the values for the argument; if formulas are used, variables in the formulas must have been generated previously. Double dot notation is available in specifying the *value/formula_k*. One important feature of the custom function is that the parameter list used to define the function must include "**n = n**", but this will not be included in the data definition.
+Custom distributions can be specified in `defData` and `defDataAdd` by setting the argument *dist* to "custom". When defining a custom distribution, provide the name of the user-defined function as a string in the *formula* argument. The arguments of the custom function are listed in the *variance* argument, separated by commas and formatted as "**arg_1 = val_form_1, arg_2 = val_form_2, $\dots$, arg_K = val_form_K**". Each *arg_k* is the name of an argument passed to the customized function, where $k$ ranges from $1$ to $K$. A value or a formula can be used for each *val_form_k*. If formulas are used, the variables they reference must have been generated previously. Double-dot notation is available when specifying *val_form_k*. One important requirement of the custom function is that the parameter list used to define the function must include an argument "**n = n**"; do not include $n$ in the definition passed to `defData` or `defDataAdd`.
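As a hedged illustration of the "**arg_k = val_form_k**" format, the sketch below defines a hypothetical custom function `shiftExp` (not part of simstudy) and passes it one constant (`rate = 2`) and one formula that refers to a previously generated variable (`shift = x`). It assumes a simstudy version that supports `dist = "custom"`.

```r
library(simstudy)

# Hypothetical custom generator: an exponential draw shifted by `shift`.
# `n` is in the parameter list but is NOT named in the data definition.
shiftExp <- function(n, rate, shift) {
  shift + rexp(n, rate = rate)
}

def <- defData(varname = "x", formula = 0, variance = 1, dist = "normal")
def <- defData(
  def,
  varname = "y",
  formula = "shiftExp",               # name of the user-defined function
  variance = "rate = 2, shift = x",   # constant value and a formula using x
  dist = "custom"
)

set.seed(42)
dd <- genData(500, def)
head(dd)
```

Since `x` is defined before `y`, it is available when the `shift` formula is evaluated.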