7.2 A Program Using Randomization

Example 7-1 introduces randomization in the context of a simple program. It randomly combines parts of sentences to construct a story. This isn't a bioinformatics program, but I've found that it's an effective way to learn the basics of randomization. You will learn how to randomly select elements from arrays, a technique you'll apply in later examples that mutate DNA.

The example declares a few arrays filled with parts of sentences, then randomizes their assembly into complete sentences. It's a trivial children's game; yet it teaches several programming points.

Example 7-1. Children's game with random numbers
#!/usr/bin/perl
# Children's game, demonstrating primitive artificial intelligence,
#  using a random number generator to randomly select parts of sentences.

use strict;
use warnings;

# Declare the variables
my $count;
my $input;
my $number;
my $sentence;
my $story;

# Here are the arrays of parts of sentences:
my @nouns = (
'Dad',
'TV',
'Mom',
'Groucho',
'Rebecca',
'Harpo',
'Robin Hood',
'Joe and Moe',
);

my @verbs = (
'ran to',
'giggled with',
'put hot sauce into the orange juice of',
'exploded',
'dissolved',
'sang stupid songs with',
'jumped with',
);

my @prepositions = (
'at the store',
'over the rainbow',
'just for the fun of it',
'at the beach',
'before dinner',
'in New York City',
'in a dream',
'around the world',
);

# Seed the random number generator.
# time|$$ combines the current time with the current process id
# in a somewhat weak attempt to come up with a random seed.
srand(time|$$);

# This do-until loop composes six-sentence "stories"
#  until the user types "quit".
do {
    # (Re)set $story to the empty string each time through the loop
    $story = '';  

    # Make 6 sentences per story.
    for ($count = 0; $count < 6; $count++) {

        #  Notes on the following statements:
        #  1) scalar @array gives the number of elements in the array.
        #  2) rand returns a random number greater than or equal to 0
        #     and less than scalar(@array).
        #  3) int removes the fractional part of a number.
        #  4) . joins two strings together.
        $sentence   = $nouns[int(rand(scalar @nouns))]
                    . " " 
                    . $verbs[int(rand(scalar @verbs))]
                    . " "
                    . $nouns[int(rand(scalar @nouns))]
                    . " "
                    . $prepositions[int(rand(scalar @prepositions))] 
                    . '. ';

        $story .= $sentence;
    }

    # Print the story.
    print "\n",$story,"\n";

    # Get user input.
    print "\nType \"quit\" to quit, or press Enter to continue: ";

    $input = <STDIN>;

    # Exit loop at user's request
}  until($input =~ /^\s*q/i);

exit;

Here is some typical output from Example 7-1:

Joe and Moe jumped with Rebecca in New York City. Rebecca exploded Groucho
in a dream. Mom ran to Harpo over the rainbow. TV giggled with Joe and Moe
over the rainbow. Harpo exploded Joe and Moe at the beach. Robin Hood giggled
with Harpo at the beach. 

Type "quit" to quit, or press Enter to continue: 

Harpo put hot sauce into the orange juice of TV before dinner. Dad ran to
Groucho in a dream. Joe and Moe put hot sauce into the orange juice of TV
in New York City. Joe and Moe giggled with Joe and Moe over the rainbow. TV
put hot sauce into the orange juice of Mom just for the fun of it. Robin Hood
ran to Robin Hood at the beach. 

Type "quit" to quit, or press Enter to continue: quit

The structure of the example is quite simple. After enforcing the declarations of variables, and turning on warnings, with:

use strict;
use warnings;

the variables are declared, and the arrays are initialized with values.

7.2.1 Seeding the Random Number Generator

Next, the random number generator is seeded by a call to the built-in function srand. It takes one argument, the seed for the random number generator discussed earlier. As mentioned, you have to give a different seed at this step to get a different series of random numbers. Try changing this statement to something like:

srand(100);

and then run the program more than once. You'll get the same results each time.[2] The seed you're using:

time|$$ 

is a calculation that returns a different seed each time.

[2] The latest random number generators automatically change the series, so if this experiment doesn't work, you're probably using a very new random number generator. However, sometimes you want to repeat a series. Note that newer versions of Perl automatically give you a good seed if you call srand with no argument, like so: srand;

time returns the current time as a number of seconds, $$ returns the process ID of the Perl program that's running (this typically changes each time you run the program), and | means bitwise OR, which combines the bits of the two numbers (for details see the Perl documentation). There are other ways to pick a seed, but let's stick with this popular one.
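To see the effect of a fixed seed directly, here is a minimal sketch that is not part of Example 7-1. The subroutine name draw_sequence and the seed value 100 are chosen only for illustration. It seeds the generator twice with the same value and compares the numbers drawn each time:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: seed the generator with $seed, then
# return $n random integers in the range 0 to 9.
sub draw_sequence {
    my ($seed, $n) = @_;
    srand($seed);
    return map { int(rand(10)) } 1 .. $n;
}

my @first  = draw_sequence(100, 5);
my @second = draw_sequence(100, 5);

print "first:  @first\n";
print "second: @second\n";
print "The two series ", ("@first" eq "@second" ? "match" : "differ"), ".\n";
```

The actual numbers depend on your version of Perl, but within one run the two series should be identical because both start from the same seed.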

7.2.2 Control Flow

The main loop of the program is a do-until loop. These loops are handy when you want to do something (like print a little story) before taking any other action (like asking whether the user wants to continue) each time through the loop. The do-until loop first executes the statements in the block and then performs a test to determine whether it should repeat them. Note that this is the reverse of the other types of loops you've seen, which do the test first and then the block.
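The difference in test order can be sketched with a condition that is already true before either loop starts. This is an illustration only; the variable names are not from Example 7-1:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 10;   # the stopping condition ($count >= 10) is already met

# do-until: the block runs first, and the test comes afterward.
my $do_until_runs = 0;
do {
    $do_until_runs++;
} until ($count >= 10);

# while: the test comes first, so the block may never run.
my $while_runs = 0;
while ($count < 10) {
    $while_runs++;
}

print "do-until block ran $do_until_runs time(s)\n";
print "while block ran $while_runs time(s)\n";
```

The do-until block runs once even though its condition is met from the start, which is why Example 7-1 always prints at least one story before asking whether to continue.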

Since the $story variable is always being appended to, it needs to be emptied at the top of each loop. It's common to forget that variables that are increased in some way need to be reset at the correct spot, so watch for that in your programming. The clue is increasingly long strings or big numbers.
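A small sketch of this pitfall, using a fixed sentence instead of random ones so the lengths are predictable (the variable names and the sentence are illustrative assumptions, not taken from the example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $story = '';
my @lengths_without_reset;
my @lengths_with_reset;

# Forgetting the reset: each "story" still contains all the earlier ones.
for my $round (1 .. 3) {
    for my $i (1 .. 2) {
        $story .= 'Mom exploded TV. ';
    }
    push @lengths_without_reset, length $story;
}

# Resetting at the top of the loop: each story starts from scratch.
for my $round (1 .. 3) {
    $story = '';
    for my $i (1 .. 2) {
        $story .= 'Mom exploded TV. ';
    }
    push @lengths_with_reset, length $story;
}

print "without reset: @lengths_without_reset\n";
print "with reset:    @lengths_with_reset\n";
```

Without the reset the lengths grow on every pass (34, 68, 102 characters); with it they stay at 34.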

The for loop contains the main work of the program. As you've seen before, this loop initializes a counter, performs a test, and then increments the counter at the end of the block.

7.2.3 Making a Sentence

In Example 7-1, note that the statement that makes a sentence stretches out over a few lines of code. It's a bit complicated, and it's the real work of the whole program, so there are comments attached to help read it. Notice that the statement has been carefully formatted so that it's neatly laid out over its eight lines. The variable names have been well chosen, so it's clear that you're making a sentence out of a noun, a verb, a noun, and a prepositional phrase.

However, even with all that, there are rather deeply nested expressions within the square brackets that specify the array positions, and it requires a bit of scrutiny to read this code. You will see that you're building a string out of sentence parts separated by spaces and ending with a period and a space. The string is built by several applications of the dot string concatenation operator. These have been placed at the beginning of each line to clarify the overall structure of the statement.
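To see the structure of the statement without the random selection getting in the way, here is a sketch that uses fixed sentence parts (chosen arbitrarily from the arrays in Example 7-1). It also shows the built-in join function as one alternative to repeated dot operators; that alternative is not used in the example itself:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same layout as in Example 7-1, with fixed strings in place of
# the randomly selected array elements.
my $sentence = 'Harpo'
             . " "
             . 'exploded'
             . " "
             . 'TV'
             . " "
             . 'at the beach'
             . '. ';

# An alternative: join the parts with single spaces, then add the ending.
my $joined = join(" ", 'Harpo', 'exploded', 'TV', 'at the beach') . '. ';

print $sentence, "\n";
print "Both forms give the same string.\n" if $sentence eq $joined;
```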

7.2.4 Randomly Selecting an Element of an Array

Let's look closely at one of the sentence part selectors:

$verbs[int(rand(scalar @verbs))] 

These kinds of nested parentheses and brackets need to be read and evaluated from the inside out. So the expression that's most deeply nested is:

scalar @verbs

You see from the comments before the statement that the built-in function scalar returns the number of elements in an array. The array in question, @verbs, has seven elements, so this expression returns 7.

So now you have:

$verbs[int(rand(7))]

and the most deeply nested expression is now:

rand(7)

The helpful comments in the code before the statement remind you that this expression returns a (pseudo)random number greater than or equal to 0 and less than 7. This number is a floating-point number (decimal number with a fraction). Recall that an array with seven elements will number them from 0 to 6.

So now you have something like this:

$verbs[int(3.47429)] 

and you want to evaluate the expression:

int(3.47429) 

The int function discards the fractional part of a floating-point number and returns just the integer part, in this case 3.

So you've come to the final step:

$verbs[3]

which gives you the fourth element of the @verbs array ('exploded'), because array subscripts start at 0.
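If you want to convince yourself that this expression always yields a valid subscript, one way is to evaluate it many times and tally the results. The following is a rough sketch; the count of 10,000 draws and the %tally hash are arbitrary choices for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @verbs = (
    'ran to',
    'giggled with',
    'put hot sauce into the orange juice of',
    'exploded',
    'dissolved',
    'sang stupid songs with',
    'jumped with',
);

# Count how often each subscript is produced.
my %tally;
for (1 .. 10_000) {
    my $index = int(rand(scalar @verbs));
    $tally{$index}++;
}

# Every subscript seen should lie between 0 and 6.
for my $index (sort { $a <=> $b } keys %tally) {
    printf "%d  %-40s %5d\n", $index, $verbs[$index], $tally{$index};
}
```

The counts differ from run to run, but no subscript outside 0 to 6 should ever appear.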

7.2.5 Formatting

To randomly select a verb, you call a few functions:

scalar

Determines the size of the array

rand

Picks a random number in the range determined by the size of the array

int

Transforms the floating-point number rand returns into the integer value you need for an array element

Several of these function calls are combined in one line using nested parentheses. Sometimes this produces hard-to-read code, and the gentle reader may be nodding his or her head vigorously at this unflattering characterization of the author's painstaking handiwork. You could try rewriting these lines, using additional temporary variables. For instance, you can say:

my $verb_array_size = scalar @verbs;
my $random_floating_point = rand ( $verb_array_size );
my $random_integer = int $random_floating_point;
my $verb = $verbs[$random_integer];

and repeat for the other parts of speech, finally building your sentence with a statement such as:

$sentence = "$subject $verb $object $prepositional_phrase. ";

It's a matter of style. You will make these kinds of choices all the time as you program. The choice of layout in Example 7-1 was based on a tradeoff between a desire to express the overall task clearly (which won) balanced against the difficulty of reading highly nested function calls (which lost). Another reason for this layout choice is that, in the programs that follow, you'll select random elements in arrays with some regularity, so you'll get used to seeing this particular nesting of calls. In fact, perhaps you should make a little subroutine out of this kind of call if you will do the same thing many times?
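As one possible answer to that question, here is a sketch of such a subroutine. The name random_element is hypothetical, and the arrays are shortened versions of those in Example 7-1:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return one randomly chosen element
# from the list passed in as arguments.
sub random_element {
    my @array = @_;
    return $array[int(rand(scalar @array))];
}

my @nouns        = ('Dad', 'TV', 'Mom');
my @verbs        = ('ran to', 'giggled with', 'exploded');
my @prepositions = ('at the store', 'in a dream');

my $sentence = random_element(@nouns)
             . " "
             . random_element(@verbs)
             . " "
             . random_element(@nouns)
             . " "
             . random_element(@prepositions)
             . '. ';

print $sentence, "\n";
```

With the nesting tucked away inside the subroutine, the statement that builds the sentence keeps its clear overall layout and each part reads as a single call.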

Readability is the most important thing here, as it is in most code. You have to be able to read and understand code, your own as well as the code of others, and that is usually more important than trying to achieve other laudable goals such as fastest speed, smallest amount of memory used, or shortest program. It's not always important, but usually it's best to write for readability first, then go back and try to goose up the speed (or whatever) if necessary. You can even leave the more readable code in there as comments, so whoever has to read the code can still get a clear idea of the program and how you went about improving the speed (or whatever).

7.2.6 Another Way to Calculate the Random Position

Perl often has several ways to accomplish a task. The following is an alternate way to write this random number selection; it uses the same function calls but without the parentheses:

$verbs[int rand scalar @verbs]

This chaining of functions, each of which takes one argument, is common in Perl. To evaluate the expression, Perl first takes @verbs as an argument to scalar, which returns the size of the array. Then it takes that value as an argument to rand, which returns a floating-point number from 0 to less than the size of the array. It then uses that floating-point number as an argument to int, which discards the fractional part; since the number is never negative here, the result is the greatest integer less than or equal to it. In other words, it calculates the same number to be used as the subscript for the array @verbs.

Why does Perl allow this? Because such calculations are very frequent, and, in the spirit of "Let the computer do the work," Perl designer Larry Wall decided to save you (and himself) the bother of typing and matching all those parentheses.

Having gone that far, Larry decided it'd be easy to add even more. You can eliminate the scalar and the int function calls and use:

$verbs[rand @verbs]

What's going on here? Since rand already expects a scalar value, it evaluates @verbs in a scalar context, which simply returns the size of the array. Larry cleverly designed array subscripts (which, of course, are always integer values) to automatically take just the integer part of a floating-point value if it was given as a subscript; so, out with the int.
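One way to check that the three spellings agree is to reseed the generator with the same value before each one, so that rand returns the same number every time. This is only a sketch; the seed 42 and the variable names are arbitrary:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @verbs = (
    'ran to',
    'giggled with',
    'put hot sauce into the orange juice of',
    'exploded',
    'dissolved',
    'sang stupid songs with',
    'jumped with',
);

# Reseeding with the same value makes each rand call return the same number.
srand(42);
my $with_parens = $verbs[int(rand(scalar @verbs))];

srand(42);
my $without_parens = $verbs[int rand scalar @verbs];

srand(42);
my $shortest = $verbs[rand @verbs];

print "with parentheses:    $with_parens\n";
print "without parentheses: $without_parens\n";
print "shortest form:       $shortest\n";
```

All three lines should print the same verb, whichever verb the seed happens to select on your version of Perl.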


2002, O'Reilly & Associates, Inc.