Just as there are numerous ways to create references, there are also several ways to use, or dereference, a reference.
Anywhere you might ordinarily put an alphanumeric identifier as part of a variable or subroutine name, you can just replace the identifier with a simple scalar variable containing a reference of the correct type. For example:
$foo = "two humps";
$scalarref = \$foo;
$camel_model = $$scalarref;  # $camel_model is now "two humps"
Here are various dereferences:
$bar = $$scalarref;
push(@$arrayref, $filename);
$$arrayref[0] = "January";
$$hashref{"KEY"} = "VALUE";
&$coderef(1,2,3);
print $globref "output\n";
It's important to understand that we are specifically not dereferencing $arrayref[0] or $hashref{"KEY"} there. The dereferencing of the scalar variable happens before any array or hash lookups. To dereference anything more complicated than a simple scalar variable, you must use one of the two methods described below. However, "simple scalars" can include an identifier that itself uses this first method recursively. Therefore, the following prints "howdy":
$refrefref = \\\"howdy";
print $$$$refrefref;
You can think of the dollar signs as executing right to left.
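The first notation can be tried out end to end with a small script. This is only a sketch: the variables @months and %settings, and the values stored in them, are made up for illustration and are not part of the examples above.

```perl
use strict;
use warnings;

my $foo       = "two humps";
my $scalarref = \$foo;
my @months    = ("Jan", "Feb");      # hypothetical data
my $arrayref  = \@months;
my %settings;                        # hypothetical data
my $hashref   = \%settings;
my $coderef   = sub { return $_[0] + $_[1] + $_[2] };

my $camel_model = $$scalarref;       # reads $foo through the reference
push(@$arrayref, "March");           # appends to @months
$$arrayref[0]    = "January";        # sets $months[0]; no @arrayref is involved
$$hashref{"KEY"} = "VALUE";          # sets $settings{KEY}; no %hashref is involved
my $sum = &$coderef(1, 2, 3);        # calls the anonymous subroutine

my $refrefref = \\\"howdy";          # three levels of reference
my $greeting  = $$$$refrefref;       # one $ for the variable, three to dereference

print "$camel_model; @months; $settings{KEY}; $sum; $greeting\n";
```

Run as written, this should print "two humps; January Feb March; VALUE; 6; howdy".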
The second way is just like the first, except using a BLOCK instead of a variable. Anywhere you'd put an alphanumeric identifier as part of a variable or subroutine name, you can replace the identifier with a BLOCK returning a reference of the correct type. In other words, the previous examples could also be handled like this:
$bar = ${$scalarref};
push(@{$arrayref}, $filename);
${$arrayref}[0] = "January";
${$hashref}{"KEY"} = "VALUE";
&{$coderef}(1,2,3);
Admittedly, it's silly to use the braces in these simple cases, but the BLOCK can contain any arbitrary expression. In particular, it can contain subscripted expressions. In the following example, $dispatch{$index} is assumed to contain a reference to a subroutine. The example invokes the subroutine with three arguments.
&{ $dispatch{$index} }(1, 2, 3);
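To make the dispatch example concrete, here is a minimal sketch of a table that the BLOCK form could index into. The keys "add" and "max" and the subroutine bodies are assumptions chosen for illustration; only the call syntax comes from the text.

```perl
use strict;
use warnings;

# Hypothetical dispatch table: names mapped to code references.
my %dispatch = (
    add => sub { my $t = 0; $t += $_ for @_; return $t },
    max => sub { my $m = shift; for (@_) { $m = $_ if $_ > $m } return $m },
);

my $index = "add";
my $total = &{ $dispatch{$index} }(1, 2, 3);    # the block yields a code reference

$index = "max";
my $largest = &{ $dispatch{$index} }(1, 2, 3);  # same call, different subroutine

print "add: $total, max: $largest\n";
```

This should print "add: 6, max: 3". The braces are needed here because $dispatch{$index} is more than a simple scalar variable.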
For references to arrays or hashes, a third method of dereferencing the reference involves the use of the -> infix operator. This is a form of syntactic sugar that makes it easier to get at individual array or hash elements, especially when the reference expression is complicated. Each of these trios is equivalent, corresponding to the three notations we've introduced. (We've inserted some spaces to line up equivalent elements.)
$  $arrayref  [0] = "January";   #1
${ $arrayref }[0] = "January";   #2
   $arrayref->[0] = "January";   #3

$  $hashref  {KEY} = "F#major";  #1
${ $hashref }{KEY} = "F#major";  #2
   $hashref->{KEY} = "F#major";  #3
You can see from this example that the first $ is missing from the third notation. It is, however, implied, and since it is implied, the notation can only be used to reference scalar values, not slices. But just as with the second notation, you can use any expression to the left of the ->, including another dereference, because arrow operators associate left to right:
print $array[3]->{"English"}->[0];
Note that $array[3] and $array->[3] are not the same. The first is talking about the fourth element of @array, while the second one is talking about the fourth element of the (possibly anonymous) array whose reference is contained in $array.
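The difference is easy to see when a named array and a scalar reference share a name. The sketch below uses made-up element values; it also shows that a slice needs the prefix or block form, since the arrow only reaches single elements.

```perl
use strict;
use warnings;

my @array = ("a0", "a1", "a2", "a3");   # a named array
my $array = ["r0", "r1", "r2", "r3"];   # a scalar holding a reference to another array

my $from_named = $array[3];             # fourth element of @array
my $from_ref   = $array->[3];           # fourth element of the array $array refers to
my $also_ref   = $$array[3];            # first notation, same element as $array->[3]

my @slice = @{$array}[0, 1];            # slices go through @{...}, not the arrow

print "$from_named $from_ref $also_ref @slice\n";
```

This should print "a3 r3 r3 r0 r1".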
Suppose now that $array[3] is undefined. The following statement is still legal:
$array[3]->{"English"}->[0] = "January";
This is one of those cases mentioned earlier in which references spring into existence when used in an lvalue context. Supposing $array[3] to have been undefined, it's automatically defined with a hash reference so that we can look up {"English"} in it. Once that's done, $array[3]->{"English"} will automatically get defined with an array reference so that we can look up [0] in it. The final element itself is only created when you're trying to assign to it. If you were just trying to print out the value, you'd just get the undefined value out of it. (Be aware, though, that even a read-only lookup through several levels can cause the intervening references to spring into existence, as the documentation for exists points out.)
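A short script can confirm what the assignment creates. This is a sketch that starts from an empty @array, which is an assumption made for the demonstration.

```perl
use strict;
use warnings;

my @array;                                   # nothing defined yet
$array[3]->{"English"}->[0] = "January";     # lvalue use creates the missing levels

my $outer = ref $array[3];                   # the kind of reference stored in $array[3]
my $inner = ref $array[3]->{"English"};      # the kind stored under the "English" key
my $size  = scalar @array;                   # the array grew to hold index 3

print "$outer $inner $size $array[3]->{English}->[0]\n";
```

This should print "HASH ARRAY 4 January".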
One more shortcut here. The arrow is optional between brace- or bracket-enclosed subscripts, so you can shrink the above code down to:
$array[3]{"English"}[0] = "January";
Which, in the case of ordinary arrays, gives you multi-dimensional arrays just like C's arrays:
$answer[$x][$y][$z] += 42;
Well, okay, not entirely like C's arrays. For one thing, C doesn't know how to grow its arrays on demand, while Perl does. Also, there are similar constructs in the two languages that parse differently. In Perl, the following two statements do the same thing:
$listref->[2][2] = "hello";    # pretty clear
$$listref[2][2]  = "hello";    # a bit confusing
The second of these statements may disconcert the C programmer, who is accustomed to using *a[i] to mean "what's pointed to by the ith element of a". But in Perl, the five prefix dereferencers ($ @ * % &) effectively bind more tightly than the subscripting braces or brackets.[5] Therefore, it is $$listref and not $listref[$i] that is taken to be a reference to an array. If you want the C notion, you either have to write ${$listref[$i]} to force the $listref[$i] to get evaluated before the leading $ dereferencer, or you have to use the -> notation:
$listref[$i]->[$j] = "hello";
[5] But not because of operator precedence. The funny characters in Perl are not operators in that sense. The grammar simply prohibits anything more complicated than a simple variable or block from following the initial funny character, for various funny reasons.
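The following sketch pulls these points together: arrays that grow on demand, the two equivalent spellings of a nested element, and the block form needed to get the C-style reading. The index values and the string "arrow" are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my @answer;
$answer[1][2][3] += 42;                  # no sizes declared; nested arrays appear as needed

my $listref = [];                        # a scalar holding a reference to an array
$listref->[2][2] = "hello";
my $via_prefix = $$listref[2][2];        # read as ${$listref}[2][2], so the same element

my @listref;                             # an unrelated array that shares the name
my ($i, $j) = (0, 1);
$listref[$i]->[$j] = "arrow";            # element $j of the array $listref[$i] refers to
my $via_block = ${ $listref[$i] }[$j];   # block form: $listref[$i] is evaluated first

print "$answer[1][2][3] $via_prefix $via_block\n";
```

This should print "42 hello arrow". Note that $listref and @listref are separate variables, which is exactly why the two readings must be spelled differently.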
If a reference happens to be a reference to an object (a blessed thingy, that is), then there are probably methods to access the innards of the object, and you should probably stick to those methods unless you're writing the class package that defines the object's methods. (Such a package is allowed to treat the object as a mere thingy when it wants to.) In other words, be nice, and don't violate the object's encapsulation without a very good reason. Perl does not enforce encapsulation. We are not totalitarians here. We do expect some basic civility, however.
You can use the ref operator to determine what type of thingy a reference is pointing to. Think of ref as a "typeof" operator that returns true if its argument is a reference and false otherwise. The value returned depends on the type of thing referenced. Built-in types include:
REF SCALAR ARRAY HASH CODE GLOB
If you simply use a hard reference in a string context, it'll be converted to a string containing both the type and the address: SCALAR(0x1fc0e). (The reverse conversion cannot be done, since reference count information has been lost.)
You can use the bless operator to associate a referenced thingy with a package functioning as an object class. When you do this, ref will return that package name instead of the internal type. An object reference used in a string context returns a string with both the external and internal types, along with the address: MyType=HASH(0x20d10). See Chapter 5 for more details about objects.
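Here is a small sketch of ref and of reference stringification, before and after blessing. The hash contents are made up, MyType is used only as a label, and the addresses will differ from run to run, so the script does not rely on them.

```perl
use strict;
use warnings;

my @kinds = (
    ref(\"text"),        # reference to a scalar
    ref([1, 2]),         # reference to an array
    ref({ a => 1 }),     # reference to a hash
    ref(sub { 1 }),      # reference to a subroutine
    ref(\\"text"),       # reference to a reference
    ref(\*STDOUT),       # reference to a typeglob
);
my $not_a_ref = ref("plain");            # false (empty string) for a non-reference

my $thingy = { name => "Fido" };         # hypothetical data
my $before = "$thingy";                  # type and address, such as HASH(0x...)
bless $thingy, "MyType";
my $after  = "$thingy";                  # class, type, and address
my $class  = ref($thingy);               # now the package name

print join(" ", @kinds), "\n";
print "$class\n";
```

The first line printed should be "SCALAR ARRAY HASH CODE REF GLOB" and the second "MyType".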
Since the dereference syntax always indicates the kind of reference desired, a typeglob can be used the same way a reference can, despite the fact that a typeglob contains multiple thingies of various types. So ${*foo} and ${\$foo} both refer to the same scalar variable. The latter is more efficient though.
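As a rough illustration, the sketch below declares a package scalar and a package array that are both named foo (the values are invented) and reaches each through the typeglob.

```perl
use strict;
use warnings;

our $foo = "camel";          # package variables, so they live in the typeglob *foo
our @foo = (1, 2, 3);

my $via_glob = ${*foo};      # the $ picks the scalar out of the typeglob
my $via_ref  = ${\$foo};     # an ordinary scalar reference, dereferenced at once
my @list     = @{*foo};      # the @ picks the array out of the same typeglob

${*foo} = "llama";           # assigning through the typeglob changes $foo itself

print "$via_glob $via_ref @list $foo\n";
```

This should print "camel camel 1 2 3 llama". Variables declared with my are not reachable this way, since they are not stored in typeglobs.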
Here's a trick for interpolating the value of a subroutine call into a string:
print "My sub returned @{[ mysub(1,2,3) ]} that time.\n";
It works like this. At compile time, when the @{...} is seen within the double-quoted string, it's parsed as a block that will return a reference. Within the block, there are square brackets that will create a reference to an anonymous array from whatever is in the brackets. So at run-time, mysub(1,2,3) is called, and the results are loaded into an anonymous array, a reference to which is then returned within the block. That array reference is then immediately dereferenced by the surrounding @{...}, and the array value is interpolated into the double-quoted string just as an ordinary array would be. This chicanery is also useful for arbitrary expressions, such as:
print "That yields @{[ $n + 5 ]} widgets\n";
Be careful though. The inside of the square brackets is supplying a list context to its expression. In this case it doesn't matter, although it's possible that the above call to mysub() might care. When it does matter, a similar trick can be done with a scalar reference. It just isn't quite as pretty:
print "That yields ${ \($n + 5) } widgets.";
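The effect of context can be observed with a routine that reports how it was called. In this sketch, mysub is a stand-in defined just for the demonstration, and the value of $n is arbitrary. The explicit scalar is an addition of ours: without it, a subroutine call placed directly after the backslash would still be made in list context.

```perl
use strict;
use warnings;

# Hypothetical mysub that answers differently depending on its calling context.
sub mysub { return wantarray ? "list" : "scalar" }

my $n = 37;

my $in_list   = "My sub returned @{[ mysub(1,2,3) ]} that time.";
my $in_scalar = "My sub returned ${ \ scalar(mysub(1,2,3)) } that time.";
my $widgets   = "That yields ${ \($n + 5) } widgets.";

print "$in_list\n$in_scalar\n$widgets\n";
```

The three lines printed should report "list", then "scalar", then "42 widgets".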
Earlier we talked about creating anonymous subroutines with a nameless sub {}. Since anonymous subroutines have to be generated someplace within your code (in order to generate the reference that you poke into some variable), such routines can be thought of as coming into existence at run-time. (That is, they have a time of generation as well as a location of definition.) Because of this fact, anonymous subroutines can act as closures with respect to my variables, that is, with respect to variables visible lexically within the current scope. Closure is a notion out of the Lisp world that says if you define an anonymous function in a particular lexical context at a particular moment, it pretends to run in that context even when it's called outside of the context. In other words, you are guaranteed to get the same copy of a lexical variable, even though many other instances of the same lexical variable may have been created before or since. This gives you a way to pass arguments to a subroutine when you define it as well as when you call it. It's useful for setting up little bits of code to run later, such as callbacks.
You can also think of closures as a way to write a subroutine template without using eval. The lexical variables are like parameters to fill in the template.
Here's a small example of how closures work:
sub newprint {
    my $x = shift;
    return sub { my $y = shift; print "$x, $y!\n"; };
}
$h = newprint("Howdy");
$g = newprint("Greetings");

# Time passes...

&$h("world");
&$g("earthlings");
This prints:
Howdy, world!
Greetings, earthlings!
Note in particular how $x continues to refer to the value passed into newprint() despite the fact that the my $x has seemingly gone out of scope by the time the anonymous subroutine runs. That's what closures are all about.
This method only applies to my variables. Global variables work as they always worked (since they're neither created nor destroyed the way lexical variables are). By and large, closures are not something you need to trouble yourself about. When you do need them, they just sorta do what you expect.[6]
[6] Always presuming you expect the right thing, of course.
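To see that each closure really does keep its own copy of a lexical variable, consider this sketch of a counter generator. The name make_counter and the starting values are invented for the example.

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper used only for this sketch.
sub make_counter {
    my $count = shift;                 # a fresh $count on every call
    return sub { return $count++ };    # the closure keeps that particular $count alive
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

# Each closure advances its own $count and never disturbs the other's.
my @seen = ($from_one->(), $from_one->(), $from_ten->(), $from_one->());

print "@seen\n";
```

This should print "1 2 10 3": the call to the second counter does not interrupt the sequence produced by the first.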
Perl doesn't provide member pointers like C++ does, but you can get a similar effect using a closure. Suppose you want a pointer to a method for a particular object. You can remember both the object and the method as lexical variables bound to a closure:
sub get_method_ref {
    my ($self, $method) = @_;
    return sub { return $self->$method(@_) };
}

$dog_wag = get_method_ref($dog, 'wag');
&$dog_wag("tail");  # Calls $dog->wag('tail').
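For a version that can be run on its own, the sketch below supplies a tiny Dog class. The class, its constructor, the name "Rex", and the string that wag returns are all assumptions made for illustration; only get_method_ref follows the code above.

```perl
use strict;
use warnings;

package Dog;    # hypothetical class for this sketch

sub new { my ($class, $name) = @_; return bless { name => $name }, $class }
sub wag { my ($self, $what)  = @_; return "$self->{name} wags its $what" }

package main;

sub get_method_ref {
    my ($self, $method) = @_;
    # The closure remembers both the object and the method name.
    return sub { return $self->$method(@_) };
}

my $dog     = Dog->new("Rex");
my $dog_wag = get_method_ref($dog, 'wag');
my $result  = &$dog_wag("tail");     # behaves like $dog->wag("tail")

print "$result\n";
```

This should print "Rex wags its tail". Because the method is looked up by name at call time, the usual inheritance rules still apply each time the closure runs.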