Generator method for creating a single-column Spark dataframe of i.i.d. samples drawn from a hypergeometric distribution.
Arguments
- sc
A Spark connection.
- nn
Sample size (the number of samples to generate).
- m
The number of successes in the population.
- n
The number of failures in the population.
- k
The number of draws.
- num_partitions
Number of partitions in the resulting Spark dataframe (default: default parallelism of the Spark cluster).
- seed
Random seed (default: a random long integer).
- output_col
Name of the output column containing sample values (default: "x").
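
Example

The arguments above can be combined as in the following minimal sketch. It assumes a local Spark installation reachable through `spark_connect(master = "local")` and that the dplyr package is installed. The parameter values, the partition count, the seed, and the column name `successes` are illustrative choices, not recommendations.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation is available.
# Adjust `master` when connecting to a real cluster.
sc <- spark_connect(master = "local")

# Population with m = 10 successes and n = 7 failures. Each sample is the
# number of successes observed in k = 8 draws made without replacement.
samples <- sdf_rhyper(
  sc,
  nn = 1000,           # number of samples (rows) to generate
  m = 10,              # successes in the population
  n = 7,               # failures in the population
  k = 8,               # draws per sample
  num_partitions = 4,  # illustrative; omit to use the cluster default
  seed = 42,           # fixed seed so repeated runs can be compared
  output_col = "successes"
)

# Sanity check: a hypergeometric count lies between
# max(0, k - n) = 1 and min(k, m) = 8 for these parameters.
summary_tbl <- samples %>%
  summarise(
    rows = n(),
    min_successes = min(successes, na.rm = TRUE),
    max_successes = max(successes, na.rm = TRUE)
  ) %>%
  collect()

print(summary_tbl)

spark_disconnect(sc)
```

The summary is computed in Spark and only the one-row result is collected, so the check stays cheap even when `nn` is large.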
See also
Other Spark statistical routines:
sdf_rbeta(),
sdf_rbinom(),
sdf_rcauchy(),
sdf_rchisq(),
sdf_rexp(),
sdf_rgamma(),
sdf_rgeom(),
sdf_rlnorm(),
sdf_rnorm(),
sdf_rpois(),
sdf_rt(),
sdf_runif(),
sdf_rweibull()