@@ -36,13 +36,25 @@ class ColumnSpecOptions(object):
 
     :param step: Step to use for range of generated value. As an alternative, you may use the `dataRange` parameter
 
-    :param numColumns: generate `n` columns numbered from 1 .. n-1 with same definition
+    :param numColumns: generate `n` columns numbered from 0 .. n-1 with same definition. If generating random column
+                       values, it is recommended to set `randomSeedMethod` to `hash_fieldname`
+                       to avoid all columns having the same value sequence.
 
     :param numFeatures: generate `n` columns numbered from 0 .. n-1 with same definition. Alias for `numColumns`
 
     :param structType: If specified as "array" and used with numColumns / numFeatures, will combine columns as array
 
-    :param random: If True, will generate random values for column value. Defaults to `False`
+    :param random: If True, will generate random values for column value. Defaults to `False`. When set to `True`,
+                   `randomSeed` and `randomSeedMethod` govern how the random values are generated.
+
+    :param randomSeed: If set, specifies the random seed for this column. This overrides the setting on the data
+                       generator object. If set to `-1`, generates non-repeatable pseudo-random values (as opposed
+                       to values derived from a fixed seed)
+
+    :param randomSeedMethod: Controls how the random values are generated from the random seed.
+                             This may have the values `fixed`, `hash_fieldname` or None.
+                             If set to `hash_fieldname`, the `randomSeed` value is ignored and a hash of the field name
+                             is used as the seed.
 
     :param baseColumn: Either the string name of the base column, or a list of columns to use to
                        control data generation. The option ``baseColumns`` is an alias for ``baseColumn``.
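
The difference between a `fixed` seed and `hash_fieldname` can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helpers `column_seed` and `sample_values`, the default seed of 42, the column names `r_0` .. `r_2`, and the use of `zlib.crc32` as the hash are all assumptions made for illustration. It only shows why a shared seed makes every generated column repeat the same sequence, while a per-field hash gives each column its own repeatable sequence.

```python
import random
import zlib


def column_seed(field_name, random_seed=42, random_seed_method="fixed"):
    """Return the seed for one column (hypothetical helper, not library code)."""
    if random_seed_method == "hash_fieldname":
        # The configured seed is ignored; the seed is derived from the field name.
        # crc32 is stable across interpreter runs, unlike the built-in hash().
        return zlib.crc32(field_name.encode("utf-8"))
    if random_seed == -1:
        # No fixed seed: random.Random(None) seeds itself from the OS,
        # so values differ from run to run.
        return None
    return random_seed


def sample_values(field_name, count=5, **seed_options):
    """Draw `count` integers for a column using its derived seed."""
    rng = random.Random(column_seed(field_name, **seed_options))
    return [rng.randint(0, 999) for _ in range(count)]


# Assumed column names, as if numColumns=3 had expanded a column named "r".
columns = ["r_0", "r_1", "r_2"]

# Same seed for every column: all columns get an identical value sequence.
fixed = {name: sample_values(name, random_seed_method="fixed") for name in columns}

# Seed derived from the field name: each column is seeded differently,
# yet each column's sequence is still repeatable.
hashed = {name: sample_values(name, random_seed_method="hash_fieldname") for name in columns}
```

With the `fixed` method, `fixed["r_0"]`, `fixed["r_1"]` and `fixed["r_2"]` are the same list, which is the situation the `numColumns` note warns about.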