You want to pick a random value where the probabilities of the values are not equal (the distribution is not even). You might be trying to randomly select a banner to display on a web page, given a set of relative weights saying how often each banner is to be displayed. Alternatively, you might want to simulate behavior according to a normal distribution (the bell curve).
If you want a random value distributed according to a specific function - e.g., the Gaussian (Normal) distribution - consult a statistics textbook to find the appropriate function or algorithm. This subroutine generates random numbers that are normally distributed, with a standard deviation of 1 and a mean of 0.
sub gaussian_rand { my ($u1, $u2); # uniformly distributed random numbers my $w; # variance, then a weight my ($g1, $g2); # gaussian-distributed numbers do { $u1 = 2 * rand() - 1; $u2 = 2 * rand() - 1; $w = $u1*$u1 + $u2*$u2; } while ( $w >= 1 ); $w = sqrt( (-2 * log($w)) / $w ); $g2 = $u1 * $w; $g1 = $u2 * $w; # return both if wanted, else just one return wantarray ? ($g1, $g2) : $g1; }
If you have a list of weights and values you want to randomly pick from, follow this two-step process: First, turn the weights into a probability distribution with weight_to_dist
below, and then use the distribution to randomly pick a value with weighted_rand
:
# weight_to_dist: takes a hash mapping key to weight and returns # a hash mapping key to probability sub weight_to_dist { my %weights = @_; my %dist = (); my $total = 0; my ($key, $weight); local $_; foreach (values %weights) { $total += $_; } while ( ($key, $weight) = each %weights ) { $dist{$key} = $weight/$total; } return %dist; } # weighted_rand: takes a hash mapping key to probability, and # returns the corresponding element sub weighted_rand { my %dist = @_; my ($key, $weight); while (1) { # to avoid floating point inaccuracies my $rand = rand; while ( ($key, $weight) = each %dist ) { return $key if ($rand -= $weight) < 0; } } }
The gaussian_rand
function implements the polar Box Muller method for turning two independent uniformly distributed random numbers between 0 and 1 (such as rand
returns) into two numbers with a mean of 0 and a standard deviation of 1 (i.e., a Gaussian distribution). To generate numbers with a different mean and standard deviation, multiply the output of gaussian_rand
by the new standard deviation, and then add the new mean:
# gaussian_rand as above $mean = 25; $sdev = 2; $salary = gaussian_rand() * $sdev + $mean; printf("You have been hired at \$%.2f\n", $salary);
The weighted_rand
function picks a random number between 0 and 1. It then uses the probabilities generated by weight_to_dist
to see which element the random number corresponds to. Because of the vagaries of floating-point representation, the accumulated errors of representation might mean we don't find an element to return. This is why we wrap the code in a while
to pick a new random number and try again.
In addition, the CPAN module Math::Random has functions to return random numbers from a variety of distributions.
The rand
function in perlfunc (1) and Chapter 3 of Programming Perl; Recipe 2.7; the documentation for the CPAN module Math::Random