Generator methods for creating single-column Spark dataframes comprised of i.i.d. samples from some distribution.
Arguments
- sc
A Spark connection.
- n
Sample Size (default: 1000).
- num_partitions
Number of partitions in the resulting Spark dataframe (default: default parallelism of the Spark cluster).
- seed
Random seed (default: a random long integer).
- output_col
Name of the output column containing sample values (default: "x").