dslinux/user/perl/pod Makefile.SH buildtoc checkpods.PL perl.pod perl5004delta.pod perl5005delta.pod perl561delta.pod perl56delta.pod perl570delta.pod perl571delta.pod perl572delta.pod perl573delta.pod perl581delta.pod perl582delta.pod perl583delta.pod perl584delta.pod perl585delta.pod perl586delta.pod perl587delta.pod perl588delta.pod perl58delta.pod perlapi.pod perlapio.pod perlartistic.pod perlbook.pod perlboot.pod perlbot.pod perlcall.pod perlcheat.pod perlclib.pod perlcompile.pod perldata.pod perldbmfilter.pod perldebguts.pod perldebtut.pod perldebug.pod perldiag.pod perldoc.pod perldsc.pod perlebcdic.pod perlembed.pod perlfaq.pod perlfaq1.pod perlfaq2.pod perlfaq3.pod perlfaq4.pod perlfaq5.pod perlfaq6.pod perlfaq7.pod perlfaq8.pod perlfaq9.pod perlfilter.pod perlfork.pod perlform.pod perlfunc.pod perlglossary.pod perlgpl.pod perlguts.pod perlhack.pod perlhist.pod perlintern.pod perlintro.pod perliol.pod perlipc.pod perllexwarn.pod perllocale.pod perllol.pod perlmod.pod perlmodinstall.pod perlmodlib.PL perlmodlib.pod perlmodstyle.pod perlnewmod.pod perlnumber.pod perlobj.pod perlop.pod perlopentut.pod perlothrtut.pod perlpacktut.pod perlpod.pod perlpodspec.pod perlport.pod perlre.pod perlref.pod perlreftut.pod perlrequick.pod perlreref.pod perlretut.pod perlrun.pod perlsec.pod perlstyle.pod perlsub.pod perlsyn.pod perlthrtut.pod perltie.pod perltoc.pod perltodo.pod perltooc.pod perltoot.pod perltrap.pod perlunicode.pod perluniintro.pod perlutil.pod perlvar.pod perlxs.pod perlxstut.pod pod2html.PL pod2latex.PL pod2man.PL pod2text.PL pod2usage.PL podchecker.PL podselect.PL roffitall rofftoc splitman splitpod

cayenne dslinux_cayenne at user.in-berlin.de
Mon Dec 4 18:01:46 CET 2006

Update of /cvsroot/dslinux/dslinux/user/perl/pod
In directory antilope:/tmp/cvs-serv17422/pod

Added Files:
	Makefile.SH buildtoc checkpods.PL perl.pod perl5004delta.pod 
	perl5005delta.pod perl561delta.pod perl56delta.pod 
	perl570delta.pod perl571delta.pod perl572delta.pod 
	perl573delta.pod perl581delta.pod perl582delta.pod 
	perl583delta.pod perl584delta.pod perl585delta.pod 
	perl586delta.pod perl587delta.pod perl588delta.pod 
	perl58delta.pod perlapi.pod perlapio.pod perlartistic.pod 
	perlbook.pod perlboot.pod perlbot.pod perlcall.pod 
	perlcheat.pod perlclib.pod perlcompile.pod perldata.pod 
	perldbmfilter.pod perldebguts.pod perldebtut.pod perldebug.pod 
	perldiag.pod perldoc.pod perldsc.pod perlebcdic.pod 
	perlembed.pod perlfaq.pod perlfaq1.pod perlfaq2.pod 
	perlfaq3.pod perlfaq4.pod perlfaq5.pod perlfaq6.pod 
	perlfaq7.pod perlfaq8.pod perlfaq9.pod perlfilter.pod 
	perlfork.pod perlform.pod perlfunc.pod perlglossary.pod 
	perlgpl.pod perlguts.pod perlhack.pod perlhist.pod 
	perlintern.pod perlintro.pod perliol.pod perlipc.pod 
	perllexwarn.pod perllocale.pod perllol.pod perlmod.pod 
	perlmodinstall.pod perlmodlib.PL perlmodlib.pod 
	perlmodstyle.pod perlnewmod.pod perlnumber.pod perlobj.pod 
	perlop.pod perlopentut.pod perlothrtut.pod perlpacktut.pod 
	perlpod.pod perlpodspec.pod perlport.pod perlre.pod 
	perlref.pod perlreftut.pod perlrequick.pod perlreref.pod 
	perlretut.pod perlrun.pod perlsec.pod perlstyle.pod 
	perlsub.pod perlsyn.pod perlthrtut.pod perltie.pod perltoc.pod 
	perltodo.pod perltooc.pod perltoot.pod perltrap.pod 
	perlunicode.pod perluniintro.pod perlutil.pod perlvar.pod 
	perlxs.pod perlxstut.pod pod2html.PL pod2latex.PL pod2man.PL 
	pod2text.PL pod2usage.PL podchecker.PL podselect.PL roffitall 
	rofftoc splitman splitpod 
Log Message:
Adding fresh perl source to HEAD to branch from

--- NEW FILE: perluniintro.pod ---
=head1 NAME

perluniintro - Perl Unicode introduction

=head1 DESCRIPTION

This document gives a general idea of Unicode and how to use Unicode
in Perl.

=head2 Unicode

Unicode is a character set standard which plans to codify all of the
writing systems of the world, plus many other symbols.

Unicode and ISO/IEC 10646 are coordinated standards that provide code
points for characters in almost all modern character set standards,
covering more than 30 writing systems and hundreds of languages,
including all commercially-important modern languages.  All characters
in the largest Chinese, Japanese, and Korean dictionaries are also
encoded. The standards will eventually cover almost all characters in
more than 250 writing systems and thousands of languages.
Unicode 1.0 was released in October 1991, and 4.0 in April 2003.

A Unicode I<character> is an abstract entity.  It is not bound to any
particular integer width, especially not to the C language C<char>.
Unicode is language-neutral and display-neutral: it does not encode the
language of the text and it does not define fonts or other graphical
layout details.  Unicode operates on characters and on text built from
those characters.

Unicode defines characters like C<LATIN CAPITAL LETTER A> or C<GREEK
SMALL LETTER ALPHA> and unique numbers for the characters, in this
case 0x0041 and 0x03B1, respectively.  These unique numbers are called
I<code points>.

The Unicode standard prefers using hexadecimal notation for the code
points.  If numbers like C<0x0041> are unfamiliar to you, take a peek
at a later section, L</"Hexadecimal Notation">.  The Unicode standard
uses the notation C<U+0041 LATIN CAPITAL LETTER A>, to give the
hexadecimal code point and the normative name of the character.
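
As a minimal sketch of the relationship between characters and code
points, the following uses the built-in C<chr()> and C<ord()> together
with C<sprintf()> to print code points in the C<U+HHHH> notation.  The
helper name C<u_plus> is made up for this illustration and is not part
of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: format a character's code point as U+XXXX.
sub u_plus {
    my ($char) = @_;
    return sprintf("U+%04X", ord($char));
}

my $latin_a     = chr(0x0041);   # LATIN CAPITAL LETTER A
my $greek_alpha = chr(0x03B1);   # GREEK SMALL LETTER ALPHA

print u_plus($latin_a), "\n";
print u_plus($greek_alpha), "\n";
```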

Unicode also defines various I<properties> for the characters, like
"uppercase" or "lowercase", "decimal digit", or "punctuation";
these properties are independent of the names of the characters.
Furthermore, various operations on the characters like uppercasing,
lowercasing, and collating (sorting) are defined.

A Unicode character consists either of a single code point, or a
I<base character> (like C<LATIN CAPITAL LETTER A>), followed by one or
more I<modifiers> (like C<COMBINING ACUTE ACCENT>).  This sequence of
base character and modifiers is called a I<combining character
sequence>.

Whether to call these combining character sequences "characters"
depends on your point of view. If you are a programmer, you probably
would tend towards seeing each element in the sequences as one unit,
or "character".  The whole sequence could be seen as one "character",
however, from the user's point of view, since that's probably what it
looks like in the context of the user's language.

With this "whole sequence" view of characters, the total number of
characters is open-ended. But in the programmer's "one unit is one
character" point of view, the concept of "characters" is more
deterministic.  In this document, we take that second  point of view:
one "character" is one Unicode code point, be it a base character or
a combining character.

For some combinations, there are I<precomposed> characters.
C<LATIN CAPITAL LETTER A WITH ACUTE>, for example, is defined as
a single code point.  These precomposed characters are, however,
only available for some combinations, and are mainly
meant to support round-trip conversions between Unicode and legacy
standards (like the ISO 8859).  In the general case, the composing
method is more extensible.  To support conversion between
different compositions of the characters, various I<normalization
forms> to standardize representations are also defined.
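
The difference between a precomposed character and the equivalent
combining character sequence can be sketched with the
C<Unicode::Normalize> module and its C<NFC> and C<NFD> functions.
This is only an illustration, assuming that module is installed (it
is bundled with Perl 5.8); the variable names are arbitrary.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $precomposed = "\x{00C1}";    # LATIN CAPITAL LETTER A WITH ACUTE
my $combining   = "A\x{0301}";   # LATIN CAPITAL LETTER A + COMBINING ACUTE ACCENT

# Compared code point by code point, the two spellings differ ...
my $raw_equal = $precomposed eq $combining ? 1 : 0;

# ... but they agree once both are in the same normalization form.
my $nfc_equal  = NFC($precomposed) eq NFC($combining) ? 1 : 0;
my $nfd_length = length(NFD($precomposed));

print "equal as given: $raw_equal\n";
print "equal after NFC: $nfc_equal\n";
print "code points after NFD: $nfd_length\n";
```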

Because of backward compatibility with legacy encodings, the "a unique
number for every character" idea breaks down a bit: instead, there is
"at least one number for every character".  The same character could
be represented differently in several legacy encodings.  The
converse is also not true: some code points do not have an assigned
character.  Firstly, there are unallocated code points within
otherwise used blocks.  Secondly, there are special Unicode control
characters that do not represent true characters.

A common myth about Unicode is that it would be "16-bit", that is,
Unicode is only represented as C<0x10000> (or 65536) characters from
C<0x0000> to C<0xFFFF>.  B<This is untrue.>  Since Unicode 2.0 (July
1996), Unicode has been defined all the way up to 21 bits (C<0x10FFFF>),
and since Unicode 3.1 (March 2001), characters have been defined
beyond C<0xFFFF>.  The first C<0x10000> characters are called the
I<Plane 0>, or the I<Basic Multilingual Plane> (BMP).  With Unicode
3.1, 17 (yes, seventeen) planes in all were defined--but they are
nowhere near full of defined characters, yet.
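
Since each plane holds C<0x10000> code points, the plane of a code
point can be found by shifting it right by 16 bits.  The sketch below
shows that arithmetic; the function name C<plane_of> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: each plane holds 0x10000 code points, so the
# plane number is the code point shifted right by 16 bits.
sub plane_of {
    my ($code_point) = @_;
    return $code_point >> 16;
}

printf "U+%04X is in plane %d\n", $_, plane_of($_)
    for 0x0041, 0xFFFF, 0x10000, 0x10FFFF;
```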

Another myth is that the 256-character blocks have something to
do with languages--that each block would define the characters used
by a language or a set of languages.  B<This is also untrue.>
The division into blocks exists, but it is almost completely
accidental--an artifact of how the characters have been and
still are allocated.  Instead, there is a concept called I<scripts>,
which is more useful: there is C<Latin> script, C<Greek> script, and
so on.  Scripts usually span varied parts of several blocks.
For further information see L<Unicode::UCD>.

The Unicode code points are just abstract numbers.  To input and
output these abstract numbers, the numbers must be I<encoded> or
I<serialised> somehow.  Unicode defines several I<character encoding
forms>, of which I<UTF-8> is perhaps the most popular.  UTF-8 is a
variable length encoding that encodes Unicode characters as 1 to 6
bytes (only 4 with the currently defined characters).  Other encodings
include UTF-16 and UTF-32 and their big- and little-endian variants
(UTF-8 is byte-order independent).  The ISO/IEC 10646 defines the UCS-2
and UCS-4 encoding forms.
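
To make the variable length of UTF-8 concrete, the following sketch
encodes a few code points with C<Encode::encode()> and reports how
many bytes each one takes.  The helper name C<utf8_bytes> is made up
for this example, and the sample code points are arbitrary choices.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: the UTF-8 byte string for one code point.
sub utf8_bytes {
    my ($code_point) = @_;
    return encode("UTF-8", chr($code_point));
}

for my $code_point (0x41, 0x3B1, 0x263A, 0x10000) {
    my $bytes = utf8_bytes($code_point);
    printf "U+%04X -> %d byte(s): %s\n",
        $code_point, length($bytes), unpack("H*", $bytes);
}
```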

For more information about encodings--for instance, to learn what
I<surrogates> and I<byte order marks> (BOMs) are--see L<perlunicode>.

=head2 Perl's Unicode Support

Starting from Perl 5.6.0, Perl has had the capacity to handle Unicode
natively.  Perl 5.8.0, however, is the first recommended release for
serious Unicode work.  The maintenance release 5.6.1 fixed many of the
problems of the initial Unicode implementation, but for example
regular expressions still do not work with Unicode in 5.6.1.

B<Starting from Perl 5.8.0, the use of C<use utf8> is no longer
necessary.> In earlier releases the C<utf8> pragma was used to declare
that operations in the current block or file would be Unicode-aware.
This model was found to be wrong, or at least clumsy: the "Unicodeness"
is now carried with the data, instead of being attached to the
operations.  Only one case remains where an explicit C<use utf8> is
needed: if your Perl script itself is encoded in UTF-8, you can use
UTF-8 in your identifier names, and in string and regular expression
literals, by saying C<use utf8>.  This is not the default because
scripts with legacy 8-bit data in them would break.  See L<utf8>.

=head2 Perl's Unicode Model

Perl supports both pre-5.6 strings of eight-bit native bytes, and
strings of Unicode characters.  The principle is that Perl tries to
keep its data as eight-bit bytes for as long as possible, but as soon
as Unicodeness cannot be avoided, the data is transparently upgraded
to Unicode.

Internally, Perl currently uses either the native eight-bit character
set of the platform (for example Latin-1) or UTF-8 to encode Unicode
strings.  Specifically, if all code points in
the string are C<0xFF> or less, Perl uses the native eight-bit
character set.  Otherwise, it uses UTF-8.

A user of Perl does not normally need to know nor care how Perl
happens to encode its internal strings, but it becomes relevant when
outputting Unicode strings to a stream without a PerlIO layer -- one with
the "default" encoding.  In such a case, the raw bytes used internally
(the native character set or UTF-8, as appropriate for each string)
will be used, and a "Wide character" warning will be issued if those
strings contain a character beyond 0x00FF.

For example,

      perl -e 'print "\x{DF}\n", "\x{0100}\x{DF}\n"'              

produces a fairly useless mixture of native bytes and UTF-8, as well
as a warning:

     Wide character in print at ...

To output UTF-8, use the C<:utf8> output layer.  Prepending

      binmode(STDOUT, ":utf8");

to this sample program ensures that the output is completely UTF-8,
and removes the program's warning.

You can enable automatic UTF-8-ification of your standard file
handles, default C<open()> layer, and C<@ARGV> by using either
the C<-C> command line switch or the C<PERL_UNICODE> environment
variable; see L<perlrun> for the documentation of the C<-C> switch.

Note that this means that Perl expects other software to work, too:
if Perl has been led to believe that STDIN should be UTF-8, but then
STDIN coming in from another command is not UTF-8, Perl will complain
about the malformed UTF-8.

All features that combine Unicode and I/O also require using the new
PerlIO feature.  Almost all Perl 5.8 platforms do use PerlIO, though:
you can see whether yours is by running "perl -V" and looking for
C<useperlio=define>.

=head2 Unicode and EBCDIC

Perl 5.8.0 also supports Unicode on EBCDIC platforms.  There,
Unicode support is somewhat more complex to implement since
additional conversions are needed at every step.  Some problems
remain, see L<perlebcdic> for details.

In any case, the Unicode support on EBCDIC platforms is better than
in the 5.6 series, which didn't work much at all on EBCDIC platforms.
On EBCDIC platforms, the internal Unicode encoding form is UTF-EBCDIC
instead of UTF-8.  The difference is that UTF-8 is "ASCII-safe" in
that ASCII characters encode to UTF-8 as-is, while UTF-EBCDIC is
"EBCDIC-safe".
=head2 Creating Unicode

To create Unicode characters in literals for code points above C<0xFF>,
use the C<\x{...}> notation in double-quoted strings:

    my $smiley = "\x{263a}";

Similarly, it can be used in regular expression literals:

    $smiley =~ /\x{263a}/;

At run-time you can use C<chr()>:

    my $hebrew_alef = chr(0x05d0);

See L</"Further Resources"> for how to find all these numeric codes.

Naturally, C<ord()> will do the reverse: it turns a character into
a code point.

Note that C<\x..> (no C<{}> and only two hexadecimal digits), C<\x{...}>,
and C<chr(...)> for arguments less than C<0x100> (decimal 256)
generate an eight-bit character for backward compatibility with older
Perls.  For arguments of C<0x100> or more, Unicode characters are
always produced. If you want to force the production of Unicode
characters regardless of the numeric value, use C<pack("U", ...)>
instead of C<\x..>, C<\x{...}>, or C<chr()>.

You can also use the C<charnames> pragma to invoke characters
by name in double-quoted strings:

    use charnames ':full';
    my $arabic_alef = "\N{ARABIC LETTER ALEF}";

And, as mentioned above, you can also C<pack()> numbers into Unicode
characters:

   my $georgian_an  = pack("U", 0x10a0);

Note that both C<\x{...}> and C<\N{...}> are compile-time string
constants: you cannot use variables in them.  If you want similar
run-time functionality, use C<chr()> and C<charnames::vianame()>.
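
As a small sketch of the run-time route, a character name held in a
variable can be looked up with C<charnames::vianame()>, which returns
the code point, and then turned into a character with C<chr()>.  The
name used here is just an example.

```perl
use strict;
use warnings;
use charnames ':full';

my $name       = "GREEK SMALL LETTER ALPHA";   # could come from input
my $code_point = charnames::vianame($name);    # looked up at run time
my $alpha      = chr($code_point);

printf "%s is U+%04X\n", $name, ord($alpha);
```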

If you want to force the result to Unicode characters, use the special
C<"U0"> prefix.  It consumes no arguments but forces the result to be
in Unicode characters, instead of bytes.

   my $chars = pack("U0C*", 0x80, 0x42);

Likewise, you can force the result to be bytes by using the special
C<"C0"> prefix.

=head2 Handling Unicode

Handling Unicode is for the most part transparent: just use the
strings as usual.  Functions like C<index()>, C<length()>, and
C<substr()> will work on the Unicode characters; regular expressions
will work on the Unicode characters (see L<perlunicode> and L<perlretut>).

Note that Perl considers combining character sequences to be
separate characters, so for example

    use charnames ':full';
    print length("\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}"), "\n";

will print 2, not 1.  The only exception is that regular expressions
have C<\X> for matching a combining character sequence.
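
The two ways of counting can be put side by side: C<length()> counts
code points, while matching C<\X> repeatedly counts combining
character sequences.  A minimal sketch, with an arbitrary sample
string:

```perl
use strict;
use warnings;

my $string = "A\x{0301}bc";   # "A" + COMBINING ACUTE ACCENT, then "bc"

my $code_points = length($string);
# Count the matches of \X by assigning the match list in list context.
my $sequences   = () = $string =~ /\X/g;

print "code points: $code_points\n";
print "combining character sequences: $sequences\n";
```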

Life is not quite so transparent, however, when working with legacy
encodings, I/O, and certain special cases:

=head2 Legacy Encodings

When you combine legacy data and Unicode the legacy data needs
to be upgraded to Unicode.  Normally ISO 8859-1 (or EBCDIC, if
applicable) is assumed.  You can override this assumption by
using the C<encoding> pragma, for example

    use encoding 'latin2'; # ISO 8859-2

in which case literals (string or regular expressions), C<chr()>,
and C<ord()> in your whole script are assumed to produce Unicode
characters from ISO 8859-2 code points.  Note that the matching for
encoding names is forgiving: instead of C<latin2> you could have
said C<Latin 2>, or C<iso8859-2>, or other variations.  With just

    use encoding;

the environment variable C<PERL_ENCODING> will be consulted.
If that variable isn't set, the encoding pragma will fail.

The C<Encode> module knows about many encodings and has interfaces
for doing conversions between those encodings:

    use Encode 'decode';
    $data = decode("iso-8859-3", $data); # convert from legacy to utf-8
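
As a rough illustration of what such a conversion does, the sketch
below decodes a single ISO 8859-1 byte into a Perl character and then
encodes that character as UTF-8 bytes, using C<decode()> and
C<encode()> from C<Encode>.  The sample byte is an arbitrary choice.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $latin1_bytes = "\xE9";    # e with acute accent in ISO 8859-1
my $characters   = decode("iso-8859-1", $latin1_bytes);
my $utf8_bytes   = encode("UTF-8", $characters);

printf "character: U+%04X\n", ord($characters);
printf "UTF-8 bytes: %s\n", unpack("H*", $utf8_bytes);
```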

=head2 Unicode I/O

Normally, writing out Unicode data

    print FH $some_string_with_unicode, "\n";

produces raw bytes that Perl happens to use to internally encode the
Unicode string.  Perl's internal encoding depends on the system as
well as what characters happen to be in the string at the time. If
any of the characters are at code points C<0x100> or above, you will get
a warning.  To ensure that the output is explicitly rendered in the
encoding you desire--and to avoid the warning--open the stream with
the desired encoding. Some examples:

    open FH, ">:utf8", "file";

    open FH, ">:encoding(ucs2)",      "file";
    open FH, ">:encoding(UTF-8)",     "file";
    open FH, ">:encoding(shift_jis)", "file";

and on already open streams, use C<binmode()>:

    binmode(STDOUT, ":utf8");

    binmode(STDOUT, ":encoding(ucs2)");
    binmode(STDOUT, ":encoding(UTF-8)");
    binmode(STDOUT, ":encoding(shift_jis)");

The matching of encoding names is loose: case does not matter, and
many encodings have several aliases.  Note that the C<:utf8> layer
must always be specified exactly like that; it is I<not> subject to
the loose matching of encoding names.

See L<PerlIO> for the C<:utf8> layer, L<PerlIO::encoding> and
L<Encode::PerlIO> for the C<:encoding()> layer, and
L<Encode::Supported> for many encodings supported by the C<Encode>
module.

Reading in a file that you know happens to be encoded in one of the
Unicode or legacy encodings does not magically turn the data into
Unicode in Perl's eyes.  To do that, specify the appropriate
layer when opening files

    open(my $fh,'<:utf8', 'anything');
    my $line_of_unicode = <$fh>;

    open(my $fh,'<:encoding(Big5)', 'anything');
    my $line_of_unicode = <$fh>;

The I/O layers can also be specified more flexibly with
the C<open> pragma.  See L<open>, or look at the following example.

    use open ':utf8'; # input and output default layer will be UTF-8
    open X, ">file";
    print X chr(0x100), "\n";
    close X;
    open Y, "<file";
    printf "%#x\n", ord(<Y>); # this should print 0x100
    close Y;

With the C<open> pragma you can use the C<:locale> layer

    BEGIN { $ENV{LC_ALL} = $ENV{LANG} = 'ru_RU.KOI8-R' }
    # the :locale will probe the locale environment variables like LC_ALL
    use open OUT => ':locale'; # russki parusski
    open(O, ">koi8");
    print O chr(0x430); # Unicode CYRILLIC SMALL LETTER A = KOI8-R 0xc1
    close O;
    open(I, "<koi8");
    printf "%#x\n", ord(<I>); # this should print 0xc1
    close I;

or you can also use the C<':encoding(...)'> layer

    open(my $epic,'<:encoding(iso-8859-7)','iliad.greek');
    my $line_of_unicode = <$epic>;

These methods install a transparent filter on the I/O stream that
converts data from the specified encoding when it is read in from the
stream.  The result is always Unicode.

The L<open> pragma affects all the C<open()> calls after the pragma by
setting default layers.  If you want to affect only certain
streams, use explicit layers directly in the C<open()> call.

You can switch encodings on an already opened stream by using
C<binmode()>; see L<perlfunc/binmode>.

The C<:locale> does not currently (as of Perl 5.8.0) work with
C<open()> and C<binmode()>, only with the C<open> pragma.  The
C<:utf8> and C<:encoding(...)> methods do work with all of C<open()>,
C<binmode()>, and the C<open> pragma.

Similarly, you may use these I/O layers on output streams to
automatically convert Unicode to the specified encoding when it is
written to the stream. For example, the following snippet copies the
contents of the file "text.jis" (encoded as ISO-2022-JP, aka JIS) to
the file "text.utf8", encoded as UTF-8:

    open(my $nihongo, '<:encoding(iso-2022-jp)', 'text.jis');
    open(my $unicode, '>:utf8',                  'text.utf8');
    while (<$nihongo>) { print $unicode $_ }

The naming of encodings, both by the C<open()> and by the C<open>
pragma, is similar to the C<encoding> pragma in that it allows for
flexible names: C<koi8-r> and C<KOI8R> will both be understood.

Common encodings recognized by ISO, MIME, IANA, and various other
standardisation organisations are recognised; for a more detailed
list see L<Encode::Supported>.

C<read()> reads characters and returns the number of characters.
C<seek()> and C<tell()> operate on byte counts, as do C<sysread()>
and C<sysseek()>.

Notice that because of the default behaviour of not doing any
conversion upon input if there is no default layer,
it is easy to mistakenly write code that keeps on expanding a file
by repeatedly encoding the data:

    open F, "file";
    local $/; ## read in the whole file of 8-bit characters
    $t = <F>;
    close F;
    open F, ">:utf8", "file";
    print F $t; ## convert to UTF-8 on output
    close F;

If you run this code twice, the contents of the F<file> will be twice
UTF-8 encoded.  A C<use open ':utf8'> would have avoided the bug, as
would explicitly opening the F<file> for input as UTF-8 as well.
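
The effect of encoding the same data twice can be reproduced without
touching any files.  The following sketch uses C<Encode::encode()> in
place of the C<:utf8> output layer, which is an assumption made to
keep the example self-contained; the byte counts show the data growing
on each pass.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text  = "\xE9";                    # one 8-bit character
my $once  = encode("UTF-8", $text);    # correct: two bytes
my $twice = encode("UTF-8", $once);    # bug: bytes mistaken for characters

printf "once:  %s\n", unpack("H*", $once);
printf "twice: %s\n", unpack("H*", $twice);

# Decoding the input first, as an input layer would, avoids the growth.
my $reread = decode("UTF-8", $once);
my $again  = encode("UTF-8", $reread);
printf "decoded then encoded: %s\n", unpack("H*", $again);
```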

B<NOTE>: the C<:utf8> and C<:encoding> features work only if your
Perl has been built with the new PerlIO feature (which is the default
on most systems).

=head2 Displaying Unicode As Text

Sometimes you might want to display Perl scalars containing Unicode as
simple ASCII (or EBCDIC) text.  The following subroutine converts
its argument so that Unicode characters with code points greater than
255 are displayed as C<\x{...}>, control characters (like C<\n>) are
displayed as C<\x..>, and the rest of the characters as themselves:

   sub nice_string {
       join("",
         map { $_ > 255 ?                  # if wide character...
               sprintf("\\x{%04X}", $_) :  # \x{...}
               chr($_) =~ /[[:cntrl:]]/ ?  # else if control character ...
               sprintf("\\x%02X", $_) :    # \x..
               quotemeta(chr($_))          # else quoted or as themselves
         } unpack("U*", $_[0]));           # unpack Unicode characters
   }

For example,

   nice_string("foo\x{100}bar\n")

returns the string

   'foo\x{0100}bar\x0A'

which is ready to be printed.

=head2 Special Cases

=over 4

=item *

Bit Complement Operator ~ And vec()

The bit complement operator C<~> may produce surprising results if
used on strings containing characters with ordinal values above
255. In such a case, the results are consistent with the internal
encoding of the characters, but not with much else. So don't do
that. Similarly for C<vec()>: you will be operating on the
internally-encoded bit patterns of the Unicode characters, not on
the code point values, which is very probably not what you want.

=item *

Peeking At Perl's Internal Encoding

Normal users of Perl should never care how Perl encodes any particular
Unicode string (because the normal ways to get at the contents of a
string with Unicode--via input and output--should always be via
explicitly-defined I/O layers). But if you must, there are two
ways of looking behind the scenes.

One way of peeking inside the internal encoding of Unicode characters
is to use C<unpack("C*", ...)> to get the bytes or C<unpack("H*", ...)>
to display the bytes:

    # this prints  c4 80  for the UTF-8 bytes 0xc4 0x80
    print join(" ", unpack("H*", pack("U", 0x100))), "\n";

Yet another way would be to use the Devel::Peek module:

    perl -MDevel::Peek -e 'Dump(chr(0x100))'

That shows the C<UTF8> flag in FLAGS and both the UTF-8 bytes
and Unicode characters in C<PV>.  See also later in this document
the discussion about the C<utf8::is_utf8()> function.

=back


=head2 Advanced Topics

=over 4

=item *

String Equivalence

The question of string equivalence turns somewhat complicated
in Unicode: what do you mean by "equal"?

(Is C<LATIN CAPITAL LETTER A WITH ACUTE> equal to
C<LATIN CAPITAL LETTER A>?)


The short answer is that by default Perl compares equivalence (C<eq>,
C<ne>) based only on code points of the characters.  In the above
case, the answer is no (because 0x00C1 != 0x0041).  But sometimes, any
CAPITAL LETTER As should be considered equal, or even As of any case.

The long answer is that you need to consider character normalization
and casing issues: see L<Unicode::Normalize> and Unicode Technical
Reports #15 and #21, I<Unicode Normalization Forms> and I<Case
Mappings>.  The former is at http://www.unicode.org/unicode/reports/tr15/.

As of Perl 5.8.0, the "Full" case-folding of I<Case
Mappings/SpecialCasing> is implemented.

=item *

String Collation

People like to see their strings nicely sorted--or as Unicode
parlance goes, collated.  But again, what do you mean by collate?

(Does C<LATIN CAPITAL LETTER A WITH ACUTE> come before or after
C<LATIN CAPITAL LETTER A WITH GRAVE>?)

The short answer is that by default, Perl compares strings (C<lt>,
C<le>, C<cmp>, C<ge>, C<gt>) based only on the code points of the
characters.  In the above case, the answer is "after", since
C<0x00C1> > C<0x00C0>.

The long answer is that "it depends", and a good answer cannot be
given without knowing (at the very least) the language context.
See L<Unicode::Collate> and the I<Unicode Collation Algorithm>.

=back
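
The difference between sorting by code point and collating can be
sketched as follows, assuming the C<Unicode::Collate> module is
installed and used with its default settings (no language tailoring).
The sample strings are arbitrary.

```perl
use strict;
use warnings;
use Unicode::Collate;

# U+00E1 is LATIN SMALL LETTER A WITH ACUTE.
my @words = ("b", "\x{00E1}", "a");

# cmp compares code points, so U+00E1 sorts after "b".
my @by_code_point = sort { $a cmp $b } @words;

# The default collation places the accented letter next to plain "a".
my @by_collation = Unicode::Collate->new->sort(@words);

print join(" ", map { sprintf("U+%04X", ord($_)) } @by_code_point), "\n";
print join(" ", map { sprintf("U+%04X", ord($_)) } @by_collation), "\n";
```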


=head2 Miscellaneous

=over 4

=item *

Character Ranges and Classes

Character ranges in regular expression character classes (C</[a-z]/>)
and in the C<tr///> (also known as C<y///>) operator are not magically
Unicode-aware.  What this means is that C<[A-Za-z]> will not magically
start to mean "all alphabetic letters"; not that it means that even for
8-bit characters: you should be using C</[[:alpha:]]/> in that case.

For specifying character classes like that in regular expressions,
you can use the various Unicode properties--C<\pL>, or perhaps
C<\p{Alphabetic}>, in this particular case.  You can use Unicode
code points as the end points of character ranges, but there is no
magic associated with specifying a certain range.  For further
information--there are dozens of Unicode character classes--see
L<perlunicode>.

=item *

String-To-Number Conversions

Unicode does define several other decimal--and numeric--characters
besides the familiar 0 to 9, such as the Arabic and Indic digits.
Perl does not support string-to-number conversion for digits other
than ASCII 0 to 9 (and ASCII a to f for hexadecimal).

=back
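
Both points above, ranges versus properties and the lack of
string-to-number conversion for non-ASCII digits, can be illustrated
with a short sketch.  The C<tr///> mapping shown is just one possible
workaround for a single digit range and is an assumption of this
example, not a general solution.

```perl
use strict;
use warnings;

my $alpha = "\x{03B1}";    # GREEK SMALL LETTER ALPHA
my $three = "\x{0663}";    # ARABIC-INDIC DIGIT THREE

# A literal range only covers the code points between its end points.
my $in_ascii_range = $alpha =~ /^[A-Za-z]$/  ? 1 : 0;
# Unicode properties describe the character itself.
my $is_letter      = $alpha =~ /^\p{L}$/     ? 1 : 0;
my $is_greek       = $alpha =~ /^\p{Greek}$/ ? 1 : 0;

# The digit is a decimal digit by property ...
my $is_digit = $three =~ /^\p{Nd}$/ ? 1 : 0;
# ... but only ASCII digits convert to numbers, so map it to ASCII first.
(my $ascii = $three) =~ tr/\x{0660}-\x{0669}/0-9/;
my $number = $ascii + 0;

print "in [A-Za-z]: $in_ascii_range, letter: $is_letter, Greek: $is_greek\n";
print "decimal digit: $is_digit, numeric value: $number\n";
```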


=head2 Questions With Answers

=over 4

=item *

Will My Old Scripts Break?

Very probably not.  Unless you are generating Unicode characters
somehow, old behaviour should be preserved.  About the only behaviour
that has changed and which could start generating Unicode is the old
behaviour of C<chr()> where supplying an argument more than 255
produced a character modulo 255.  C<chr(300)>, for example, was equal
to C<chr(45)> or "-" (in ASCII); now it is LATIN CAPITAL LETTER I WITH
BREVE.

=item *

How Do I Make My Scripts Work With Unicode?

Very little work should be needed since nothing changes until you
generate Unicode data.  The most important thing is getting input as
Unicode; for that, see the earlier I/O discussion.

=item *

How Do I Know Whether My String Is In Unicode?

You shouldn't care.  No, you really shouldn't.  No, really.  If you
have to care--beyond the cases described above--it means that we
didn't get the transparency of Unicode quite right.

Okay, if you insist:

    print utf8::is_utf8($string) ? 1 : 0, "\n";

But note that this doesn't mean that any of the characters in the
string are necessarily UTF-8 encoded, or that any of the characters have
code points greater than 0xFF (255) or even 0x80 (128), or that the
string has any characters at all.  All the C<is_utf8()> does is to
return the value of the internal "utf8ness" flag attached to the
C<$string>.  If the flag is off, the bytes in the scalar are interpreted
as a single byte encoding.  If the flag is on, the bytes in the scalar
are interpreted as the (multi-byte, variable-length) UTF-8 encoded code
points of the characters.  Bytes added to a UTF-8 encoded string are
automatically upgraded to UTF-8.  If mixed non-UTF-8 and UTF-8 scalars
are merged (double-quoted interpolation, explicit concatenation, and
printf/sprintf parameter substitution), the result will be UTF-8 encoded
as if copies of the byte strings were upgraded to UTF-8: for example,

    $a = "ab\x80c";
    $b = "\x{100}";
    print "$a = $b\n";

the output string will be UTF-8-encoded C<ab\x80c = \x{100}\n>, but
C<$a> will stay byte-encoded.

Sometimes you might really need to know the byte length of a string
instead of the character length. For that use either the
C<Encode::encode_utf8()> function or the C<bytes> pragma and its only
defined function C<length()>:

    my $unicode = chr(0x100);
    print length($unicode), "\n"; # will print 1
    require Encode;
    print length(Encode::encode_utf8($unicode)), "\n"; # will print 2
    use bytes;
    print length($unicode), "\n"; # will also print 2
                                  # (the 0xC4 0x80 of the UTF-8)

=item *

How Do I Detect Data That's Not Valid In a Particular Encoding?

Use the C<Encode> package to try converting it.
For example,

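    use Encode 'decode_utf8';
    # FB_CROAK makes decode_utf8() die on malformed input; with a check
    # value the argument may be modified, so work on a copy.
    my $copy = $string_of_bytes_that_I_think_is_utf8;
    if (eval { decode_utf8($copy, Encode::FB_CROAK); 1 }) {
        # valid
    } else {
        # invalid
    }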

For UTF-8 only, you can use:

    use warnings;
    @chars = unpack("U0U*", $string_of_bytes_that_I_think_is_utf8);

If invalid, a C<Malformed UTF-8 character (byte 0x##) in unpack>
warning is produced. The "U0" means "expect strictly UTF-8 encoded
Unicode".  Without that the C<unpack("U*", ...)> would accept also
data like C<chr(0xFF)>, similarly to the C<pack> as we saw earlier.

=item *

How Do I Convert Binary Data Into a Particular Encoding, Or Vice Versa?

This probably isn't as useful as you might think.
Normally, you shouldn't need to.

In one sense, what you are asking doesn't make much sense: encodings
are for characters, and binary data are not "characters", so converting
"data" into some encoding isn't meaningful unless you know in what
character set and encoding the binary data is in, in which case it's
not just binary data, now is it?

If you have a raw sequence of bytes that you know should be
interpreted via a particular encoding, you can use C<Encode>:

    use Encode 'from_to';
    from_to($data, "iso-8859-1", "utf-8"); # from latin-1 to utf-8

The call to C<from_to()> changes the bytes in C<$data>, but nothing
material about the nature of the string has changed as far as Perl is
concerned.  Both before and after the call, the string C<$data>
contains just a bunch of 8-bit bytes. As far as Perl is concerned,
the encoding of the string remains as "system-native 8-bit bytes".

You might relate this to a fictional 'Translate' module:

   use Translate;
   my $phrase = "Yes";
   Translate::from_to($phrase, 'english', 'deutsch');
   ## phrase now contains "Ja"

The contents of the string changes, but not the nature of the string.
Perl doesn't know any more after the call than before that the
contents of the string indicates the affirmative.

Back to converting data.  If you have (or want) data in your system's
native 8-bit encoding (e.g. Latin-1, EBCDIC, etc.), you can use
pack/unpack to convert to/from Unicode.

    $native_string  = pack("C*", unpack("U*", $Unicode_string));
    $Unicode_string = pack("U*", unpack("C*", $native_string));

If you have a sequence of bytes you B<know> is valid UTF-8,
but Perl doesn't know it yet, you can make Perl a believer, too:

    use Encode 'decode_utf8';
    $Unicode = decode_utf8($bytes);

You can convert well-formed UTF-8 to a sequence of bytes, but if
you just want to convert random binary data into UTF-8, you can't.
B<Any random collection of bytes isn't well-formed UTF-8>.  You can
use C<unpack("C*", $string)> for the former, and you can create
well-formed Unicode data by C<pack("U*", 0xff, ...)>.
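To make the "random bytes are not UTF-8" point concrete, here is a small
sketch that asks C<Encode> to decode strictly and reports whether that
succeeded.  The helper name C<is_well_formed_utf8> is an assumption of this
example; C<decode()> and C<FB_CROAK> come from C<Encode>.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if $bytes decodes as strict UTF-8.
sub is_well_formed_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    return eval { decode("UTF-8", $copy, FB_CROAK); 1 } ? 1 : 0;
}

# A valid two-byte sequence, then bytes that never occur in UTF-8.
print is_well_formed_utf8("\xC3\xA9") ? "well-formed\n" : "malformed\n";
print is_well_formed_utf8("\xFF\xFE") ? "well-formed\n" : "malformed\n";
```

Checking first and decoding second is one possible design; calling
C<decode()> inside an C<eval> and keeping its result works just as well.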

=item *

How Do I Display Unicode?  How Do I Input Unicode?

See http://www.alanwood.net/unicode/ and

=item *

How Does Unicode Work With Traditional Locales?

In Perl, not very well.  Avoid using locales through the C<locale>
pragma.  Use only one or the other.  But see L<perlrun> for the
description of the C<-C> switch and its environment counterpart,
C<$ENV{PERL_UNICODE}> to see how to enable various Unicode features,
for example by using locale settings.

=back


=head2 Hexadecimal Notation

The Unicode standard prefers using hexadecimal notation because
that more clearly shows the division of Unicode into blocks of 256 characters.
Hexadecimal is also simply shorter than decimal.  You can use decimal
notation, too, but learning to use hexadecimal just makes life easier
with the Unicode standard.  The C<U+HHHH> notation uses hexadecimal,
for example.

The C<0x> prefix marks a hexadecimal number; the digits are 0-9 I<and>
a-f (or A-F, case doesn't matter).  Each hexadecimal digit represents
four bits, or half a byte.  C<print 0x..., "\n"> will show a
hexadecimal number in decimal, and C<printf "%x\n", $decimal> will
show a decimal number in hexadecimal.  If you have just the
"hex digits" of a hexadecimal number, you can use the C<hex()> function.

    print 0x0009, "\n";    # 9
    print 0x000a, "\n";    # 10
    print 0x000f, "\n";    # 15
    print 0x0010, "\n";    # 16
    print 0x0011, "\n";    # 17
    print 0x0100, "\n";    # 256

    print 0x0041, "\n";    # 65

    printf "%x\n",  65;    # 41
    printf "%#x\n", 65;    # 0x41

    print hex("41"), "\n"; # 65
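The same functions are enough to move between code point numbers and the
C<U+HHHH> notation.  The two helper names below are invented for this
sketch; only C<sprintf> and C<hex> are doing the work.

```perl
use strict;
use warnings;

# Hypothetical helper: format a code point number as U+HHHH.
sub u_plus {
    my ($code_point) = @_;
    return sprintf "U+%04X", $code_point;
}

# Hypothetical helper: parse U+HHHH back into a number (undef if malformed).
sub from_u_plus {
    my ($text) = @_;
    return unless $text =~ /\AU\+([0-9A-Fa-f]{4,6})\z/;
    return hex $1;
}

print u_plus(65), "\n";             # U+0041
print u_plus(0x263A), "\n";         # U+263A
print from_u_plus("U+00FF"), "\n";  # 255
```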

=head2 Further Resources

=over 4

=item *

Unicode Consortium


=item *

Unicode FAQ


=item *

Unicode Glossary


=item *

Unicode Useful Resources


=item *

Unicode and Multilingual Support in HTML, Fonts, Web Browsers and Other Applications


=item *

UTF-8 and Unicode FAQ for Unix/Linux


=item *

Legacy Character Sets


=item *

The Unicode support files live within the Perl installation in the


in Perl 5.8.0 or newer, and 


in the Perl 5.6 series.  (The renaming to F<lib/unicore> was done to
avoid naming conflicts with lib/Unicode in case-insensitive filesystems.)
The main Unicode data file is F<UnicodeData.txt> (or F<Unicode.301> in
Perl 5.6.1.)  You can find the C<$Config{installprivlib}> by

    perl "-V:installprivlib"

You can explore the Unicode data files using
the C<Unicode::UCD> module.

=back
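As a brief sketch of what C<Unicode::UCD> offers, the following looks up
the official name of a code point with C<charinfo()>.  The wrapper name
C<char_name> is an assumption of this example.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Hypothetical wrapper: the Unicode name of a code point, or nothing
# if the code point is not assigned.
sub char_name {
    my ($code_point) = @_;
    my $info = charinfo($code_point) or return;
    return $info->{name};
}

print char_name(0x41), "\n";      # LATIN CAPITAL LETTER A
print char_name(0x263A), "\n";    # WHITE SMILING FACE
```

The hash returned by C<charinfo()> has further fields, such as C<category>
and C<script>; see the module's own documentation for the full list.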



If you cannot upgrade your Perl to 5.8.0 or later, you can still
do some Unicode processing by using the modules C<Unicode::String>,
C<Unicode::Map8>, and C<Unicode::Map>, available from CPAN.
If you have the GNU recode installed, you can also use the
Perl front-end C<Convert::Recode> for character conversions.

The following are fast conversions from ISO 8859-1 (Latin-1) bytes
to UTF-8 bytes and back; the code works even with older Perl 5 versions.

    # ISO 8859-1 to UTF-8

    # UTF-8 to ISO 8859-1
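The conversion code itself did not survive in this copy.  What follows is a
reconstruction under stated assumptions, not necessarily the author's
original lines: it relies only on the fact that every Latin-1 byte from
0x80 to 0xFF maps to a two-byte UTF-8 sequence whose lead byte is 0xC2 or
0xC3.  The sub names are invented for this sketch.

```perl
use strict;
use warnings;

# ISO 8859-1 bytes to UTF-8 bytes: each byte >= 0x80 becomes two bytes.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    $bytes =~ s/([\x80-\xFF])/chr(0xC0 | (ord($1) >> 6)) . chr(0x80 | (ord($1) & 0x3F))/eg;
    return $bytes;
}

# UTF-8 bytes back to ISO 8859-1.  Only lead bytes 0xC2 and 0xC3 are
# handled, because only those encode characters below 0x100.
sub utf8_to_latin1 {
    my ($bytes) = @_;
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/chr(((ord($1) << 6) & 0xC0) | (ord($2) & 0x3F))/eg;
    return $bytes;
}

my $utf8 = latin1_to_utf8("caf\xE9");
printf "%d bytes after conversion, %d after converting back\n",
    length($utf8), length(utf8_to_latin1($utf8));
```

Both subs treat their argument as a plain byte string and leave any other
sequences untouched, so they are not a substitute for C<Encode> when the
input may contain characters outside Latin-1.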

=head1 SEE ALSO

L<perlunicode>, L<Encode>, L<encoding>, L<open>, L<utf8>, L<bytes>,
L<perlretut>, L<perlrun>, L<Unicode::Collate>, L<Unicode::Normalize>,


Thanks to the kind readers of the perl5-porters at perl.org,
perl-unicode at perl.org, linux-utf8 at nl.linux.org, and unicore at unicode.org
mailing lists for their valuable feedback.


Copyright 2001-2002 Jarkko Hietaniemi E<lt>jhi at iki.fiE<gt>

This document may be distributed under the same terms as Perl itself.

--- NEW FILE: perl581delta.pod ---
=head1 NAME

perl581delta - what is new for perl v5.8.1


This document describes differences between the 5.8.0 release and
the 5.8.1 release.

If you are upgrading from an earlier release such as 5.6.1, first read
L<perl58delta>, which describes differences between 5.6.0 and 5.8.0.

In case you are wondering about 5.6.1, it was bug-fix-wise rather
identical to the development release 5.7.1.  Confused?  This timeline
hopefully helps a bit: it lists the new major releases, their maintenance
releases, and the development releases.

          New     Maintenance  Development
[...1063 lines suppressed...]
information at http://www.perl.com/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: perl573delta.pod ---
=head1 NAME

perl573delta - what's new for perl v5.7.3


This document describes differences between the 5.7.2 release and the
5.7.3 release.  

(To view the differences between the 5.6.0 release and the 5.7.0
release, see L<perl570delta>.  To view the differences between the
5.7.0 release and the 5.7.1 release, see L<perl571delta>.  To view
the differences between the 5.7.1 release and the 5.7.2 release,
see L<perl572delta>.)

=head1 Changes

This is just a selected list of some of the more notable changes.
The numbers refer to the Perl repository change numbers; see
L<Changes58> (or L<Changes> in Perl 5.8.1).  In addition to these
changes, lots of work took place in integrating threads, PerlIO, and
Unicode; general code cleanup; and last but not least porting to
non-UNIX lands such as Win32, VMS, Cygwin, DJGPP, VOS, MacOS Classic,

=over 4

=item 11362

add LC_MESSAGES to POSIX :locale_h export tag

=item 11371

add DEL to [:cntrl:]

=item 11375

make h2ph understand constants like 1234L and 5678LL

=item 11405

Win32: fix bugs in handling of the virtualized environment

=item 11410

fix a bug in the security taint checking of open()

=item 11423

make perl fork() safe even on platforms that don't have pthread_atfork()

=item 11459

make switching optimization and debugging levels during Perl builds
easier via the OPTIMIZE environment variable

=item 11475

make split()'s unused captures undef, not ''

=item 11485

Search::Dict: allow transforming lines before comparing 

=item 11490

allow installing extra modules or bundles when building Perl

=item 11516

add -Wall in cflags when compiling with gcc to weed out dubious
C practices

=item 11541

pluggable optimizer

=item 11549

WinCE: integrate the port

=item 11589

Win32: 4-arg select was broken

=item 11594

introduce the perlivp utility for verifying the Perl installation
(IVP = Installation Verification Procedure)

=item 11623

rename lib/unicode to lib/unicore to avoid case-insensitivity problems
with lib/Unicode

=item 11631

remove Time::Piece

=item 11643

document that use utf8 is not the right way most of the time

=item 11656

allow building perl with -DUSE_UTF8_SCRIPTS which makes UTF-8
the default script encoding (not the default since that would
break all scripts having legacy eight-bit data in them)

=item 11725

division preserving 64-bit integers

=item 11743

document the coderef-in-@INC feature

=item 11794

modulo (%) preserving 64-bit integers

=item 11825

update to Unicode 3.1.1

=item 11865

add the \[$@%&*] prototype support

=item 11874

oct() and hex() in glorious 64 bit

=item 11877

Class::Struct: allow recursive classes

=item 11993

fix unpack U to be the reverse of pack U

=item 12056

VMS: waitpid enhancements

=item 12180

unpack("Z*Z*", pack("Z*Z*", ..)) was broken

=item 12243

Devel::Peek: display UTF-8 SVs also as \x{...}

=item 12288

Data::Dumper: option to sort hashes

=item 12542

add perlpodspec

=item 12652

threadsafe DynaLoader, re, Opcode, File::Glob, and B

=item 12756

support BeOS better

=item 12874

read-only hashes (user-level interface is Hash::Util)

=item 13162

add Devel::PPPort

=item 13179

add the sort pragma

=item 13326

VMS: fix perl -P

=item 13358

add perlpacktut

=item 13452

SUPER-UX: add hints file

=item 13575

Win32: non-blocking waitpid(-1,WNOHANG)

=item 13684

introduce the -t option for gentler taint checking

=item 14694

add the if pragma

=item 14832

implement IV/UV/NV/long double un/packing with j/J/F/D

=item 14854

document the new taint behaviour of exec LIST and system LIST

=back
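To give one of the entries above some shape, here is a minimal sketch of
the read-only (restricted) hashes from change 12874, using the
C<Hash::Util> interface.  The helper C<try_add_key> and the sample data are
assumptions of this example.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys unlock_keys);

# Hypothetical helper: try to store under $key and say what happened.
sub try_add_key {
    my ($hash, $key) = @_;
    return eval { $hash->{$key} = 1; 1 } ? "added" : "refused";
}

my %config = (host => "localhost", port => 8080);
lock_keys(%config);                           # the set of keys is now fixed

print try_add_key(\%config, "prot"), "\n";    # a misspelled key is refused
$config{port} = 9090;                         # existing keys stay writable
print "port is $config{port}\n";
```

C<lock_keys()> restricts only the key set; C<Hash::Util> has separate
functions for making values read-only as well.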


=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.com/, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Jarkko Hietaniemi <F<jhi at iki.fi>>, with many contributions
from The Perl Porters and Perl Users submitting feedback and patches.

Send omissions or corrections to <F<perlbug at perl.org>>.


--- NEW FILE: perlmod.pod ---
=head1 NAME

perlmod - Perl modules (packages and symbol tables)


=head2 Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

Perl provides a mechanism for alternative namespaces to protect
packages from stomping on each other's variables.  In fact, there's
really no such thing as a global variable in Perl.  The package
statement declares the compilation unit as being in the given
namespace.  The scope of the package declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my() and
local() operators).  Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that if unqualified,
default to the main package instead of the current one as described
below.  A package statement affects only dynamic variables--including
those you've used local() on--but I<not> lexical variables created
with my().  Typically it would be the first declaration in a file
included by the C<do>, C<require>, or C<use> operators.  You can
switch into a package in more than one place; it merely influences
which symbol table is used by the compiler for the rest of that
block.  You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>.  If the package name is null, the
C<main> package is assumed.  That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros.  It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on.  Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.
X<::> X<'>

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>.  This implies nothing about the order of
name lookups, however.  There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down.  For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>.  C<INNER> refers to a totally
separate global package.

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table.  All other symbols are kept in package
C<main>, including all punctuation variables, like $_.  In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in ones.  If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.
X<variable, punctuation> 

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
However, variables and functions named with a single C<_>, such as
$_ and C<sub _>, are still forced into the package C<main>.  See also
L<perlvar/"Technical Note on the Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled.  (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package.  Qualify the signal handler
name if you wish to have a signal handler in a package.)  For an
example, examine F<perl5db.pl> in the Perl library.  It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug.  At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from).  See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variable names.
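One way it can be done, shown here only as a sketch, is through a symbolic
reference with C<strict 'refs'> switched off for the smallest possible
scope.  The package C<Demo::Settings> and its C<get> function are made up
for this example.

```perl
use strict;
use warnings;

package Demo::Settings;

our $colour = "blue";

# Hypothetical accessor: look up a package variable by name at run time.
sub get {
    my ($name) = @_;
    no strict 'refs';                       # symbolic reference, on purpose
    return ${ __PACKAGE__ . "::$name" };
}

package main;

print Demo::Settings::get("colour"), "\n";  # blue
```

A hash of settings is usually the cleaner design; this form is mainly of
interest when the variables already exist as package globals.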

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

The symbol table for a package happens to be stored in the hash of that
name with two colons appended.  The main symbol table's name is thus
C<%main::>, or C<%::> for short.  Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation.  In fact, the following have the same
effect, though the first is more efficient because it does the symbol
table lookups at compile time:

    local *main::foo    = *main::bar;
    local $main::{foo}  = $main::{bar};

(Be sure to note the B<vast> difference between the second line above
and C<local $main::foo = $main::bar>. The former is accessing the hash
C<%main::>, which is the symbol table of package C<main>. The latter is
simply assigning scalar C<$bar> in package C<main> to scalar C<$foo> of
the same package.)

You can use this to print out all the variables in a package, for
instance.  The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.
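A minimal sketch of that idea follows.  It lists the names found in one
package's symbol table; C<Demo::Stash> and C<symbol_names> are assumptions
of this example, and the exact list may include extra entries that Perl
itself creates.

```perl
use strict;
use warnings;

package Demo::Stash;

our $alpha = 1;
our @beta  = (1, 2);
sub gamma { return 3 }

package main;

# Hypothetical helper: the names present in a package's symbol table.
sub symbol_names {
    my ($package) = @_;
    no strict 'refs';                  # the stash is reached by name
    return sort keys %{ $package . "::" };
}

print join(", ", symbol_names("Demo::Stash")), "\n";
```

Each name covers every slot of its typeglob, so C<alpha> is listed once
whether the package has C<$alpha>, C<@alpha>, or both.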

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>.  If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

Which makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays.  Tricky, eh?

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

C<*foo = *bar> makes the typeglobs themselves synonymous while
C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs
refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

Would print '1', because C<$foo> holds a reference to the I<original>
C<$bar> -- the one that was stuffed away by C<local()> and which will be
restored when the block ends. Because variables are accessed through the
typeglob, you can use C<*foo = *bar> to create an alias which can be
localized. (But be aware that this means you can't have a separate
C<@foo> and C<@bar>, etc.)
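The two kinds of alias can be compared side by side.  In this sketch,
which runs under C<strict> thanks to the C<our> declarations, C<$foo> is
aliased through a reference and C<$baz> through the whole typeglob; the
variable C<$baz> and the sub name are assumptions of the example.

```perl
use strict;
use warnings;

our ($bar, $foo, $baz);

$bar = 1;
*foo = \$bar;    # alias only the scalar slot
*baz = *bar;     # alias the whole typeglob

# Hypothetical helper: what each alias sees while $bar is localized.
sub values_inside_local {
    local $bar = 2;
    return ($foo, $baz);
}

my ($via_ref, $via_glob) = values_inside_local();
print "\$foo saw $via_ref, \$baz saw $via_glob\n";
```

C<$foo> keeps seeing the original value, while C<$baz> follows the
localized one, which is the difference the paragraph above describes.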

What makes all of this important is that the Exporter module uses glob
aliasing as the import/export mechanism. Whether or not you can properly
localize a variable that has been exported from a module depends on how
it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name
(C<$Package::FOO>) where you need a local value, or by overriding it
by saying C<*FOO = *Package::FOO> in your script.

The C<*x = \$y> mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing.  It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();			# can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
	local *hashsym = shift;
	# now use %hashsym normally, and you
	# will affect the caller's %another_hash
	my %nhash = (); # do what you want
	return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob.  This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables

Another use of symbol tables is for making "constant" scalars.
X<constant> X<scalar, constant>

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time.  A constant subroutine is one prototyped
to take no arguments and to return a constant expression.  See
L<perlsub> for details on these.  The C<use constant> pragma is a
convenient shorthand for these.
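The difference between the two kinds of constant can be sketched as
follows.  The sub name C<try_assign_pi> and the constant C<E> are
assumptions of this example.

```perl
use strict;
use warnings;

our $PI;
*PI = \3.14159265358979;                # $PI now aliases a read-only literal

use constant E => 2.718281828459045;    # a constant subroutine

# Hypothetical helper: attempt the assignment and report the outcome.
sub try_assign_pi {
    return eval { $PI = 3; 1 } ? "modified" : "read-only";
}

print try_assign_pi(), "\n";
printf "%.2f %.2f\n", $PI, E;
```

The aliased scalar interpolates into strings like any other variable,
whereas C<E> is a bareword sub call and does not.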

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from.  This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo.  See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy.  You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

	package Some_package;
	sub foo { ... }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
	print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).


Four specially named code blocks are executed at the beginning and at the end
of a running Perl program.  These are the C<BEGIN>, C<CHECK>, C<INIT>, and
C<END> blocks.

These code blocks can be prefixed with C<sub> to give the appearance of a
subroutine (although this is not considered good style).  One should note
that these code blocks don't really exist as named subroutines (despite
their appearance). The thing that gives this away is the fact that you can
have B<more than one> of these code blocks in a program, and they will get
B<all> executed at the appropriate moment.  So you can't execute any of
these code blocks by name.

A C<BEGIN> code block is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file (or
string) is parsed.  You may have multiple C<BEGIN> blocks within a file (or
eval'ed string) -- they will execute in order of definition.  Because a C<BEGIN>
code block executes immediately, it can pull in definitions of subroutines
and such from other files in time to be visible to the rest of the compile
and run time.  Once a C<BEGIN> has run, it is immediately undefined and any
code it used is returned to Perl's memory pool.

It should be noted that C<BEGIN> code blocks B<are> executed inside string
C<eval()>'s.  The C<CHECK> and C<INIT> code blocks are B<not> executed inside
a string eval, which e.g. can be a problem in a mod_perl environment.
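The first half of that statement is easy to observe.  In this sketch the
C<BEGIN> block inside the evaluated string fires while the string is being
compiled, before any of its ordinary statements run; the array C<@events>
and the sub name are assumptions of the example.

```perl
use strict;
use warnings;

our @events;

# Hypothetical helper: record the order in which the parts of a
# string eval execute.
sub phases_in_string_eval {
    @events = ();
    eval q{
        push @events, "run time";
        BEGIN { push @events, "BEGIN" }
        1;
    } or die $@;
    return @events;
}

print join(" -> ", phases_in_string_eval()), "\n";   # BEGIN -> run time
```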

An C<END> code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
is being exited, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).)  You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO).  C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.

Note that C<END> code blocks are B<not> executed at the end of a string
C<eval()>: if any C<END> code blocks are created in a string C<eval()>,
they will be executed just as any other C<END> code block of that package
in LIFO order just before the interpreter is being exited.

Inside an C<END> code block, C<$?> contains the value that the program is
going to pass to C<exit()>.  You can modify C<$?> to change the exit
value of the program.  Beware of changing C<$?> by accident (e.g. by
running something via C<system>).

C<CHECK> and C<INIT> code blocks are useful to catch the transition between
the compilation phase and the execution phase of the main program.

C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends
and before the run time begins, in LIFO order.  C<CHECK> code blocks are used
in the Perl compiler suite to save the compiled state of the program.

C<INIT> blocks are run just before the Perl runtime begins execution, in
"first in, first out" (FIFO) order. For example, the code generators
documented in L<perlcc> make use of C<INIT> blocks to initialize and
resolve pointers to XSUBs.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

The B<begincheck> program makes it all clear, eventually:


  # begincheck

  print         " 8. Ordinary code runs at runtime.\n";

  END { print   "14.   So this is the end of the tale.\n" }
  INIT { print  " 5. INIT blocks run FIFO just before runtime.\n" }
  CHECK { print " 4.   So this is the fourth line.\n" }

  print         " 9.   It runs in order, of course.\n";

  BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
  END { print   "13.   Read perlmod for the rest of the story.\n" }
  CHECK { print " 3. CHECK blocks run LIFO at compilation's end.\n" }
  INIT { print  " 6.   Run this again, using Perl's -c switch.\n" }

  print         "10.   This is anti-obfuscated code.\n";

  END { print   "12. END blocks run LIFO at quitting time.\n" }
  BEGIN { print " 2.   So this line comes out second.\n" }
  INIT { print  " 7.   You'll see the difference right away.\n" }

  print         "11.   It merely _looks_ like it should be confusing.\n";


=head2 Perl Classes
X<class> X<@ISA>

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods.  Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perltoot> and L<perlobj>.
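As a minimal sketch of those two sentences, the following defines one
package that acts as a class and a second that inherits from it through
C<@ISA>.  The class names and methods are invented for the example.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub sound { return "..." }

sub speak {
    my ($self) = @_;
    return $self->{name} . " says " . $self->sound;
}

package Dog;

our @ISA = ("Animal");      # inherit new() and speak() from Animal

sub sound { return "Woof" } # override just this one method

package main;

print Dog->new(name => "Rex")->speak, "\n";   # Rex says Woof
```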

=head2 Perl Modules

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file.  It is specifically
designed to be reusable by other modules or programs.  It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it, or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything.  Or it can do a little of both.

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        use Exporter   ();
        our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

        # set the version for version checking
        $VERSION     = 1.00;
        # if using RCS/CVS, this may be preferred
        $VERSION = sprintf "%d.%03d", q$Revision: 1.2 $ =~ /(\d+)/g;

        @ISA         = qw(Exporter);
        @EXPORT      = qw(&func1 &func2 &func4);
        %EXPORT_TAGS = ( );     # eg: TAG => [ qw!name1 name2! ],

        # your exported package globals go here,
        # as well as any optionally exported functions
        @EXPORT_OK   = qw($Var1 %Hashit &func3);
    }
    our @EXPORT_OK;

    # exported package globals go here
    our $Var1;
    our %Hashit;

    # non-exported package globals go here
    our @more;
    our $stuff;

    # initialize package globals, first exported ones
    $Var1   = '';
    %Hashit = ();

    # then the others (which are still accessible as $Some::Module::stuff)
    $stuff  = '';
    @more   = ();

    # all file-scoped lexicals must be created before
    # the functions below that use them.

    # file-private lexicals go here
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as &$priv_func;  it cannot be prototyped.
    my $priv_func = sub {
        # stuff goes here.
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      {}    # no prototype
    sub func2()    {}    # proto'd void
    sub func3($$)  {}    # proto'd to 2 scalars

    # this one isn't exported, but could be called!
    sub func4(\%)  {}    # proto'd to 1 hash ref

    END { }       # module clean-up code here (global destructor)


    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications.  See L<Exporter> and the L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require Module; import Module; }

or

    BEGIN { require Module; import Module LIST; }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require Module; }

All Perl module files have the extension F<.pm>.  The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files.  Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways.  In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/".   The second
case does not, and would have to be specified literally.  The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)
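When the module name is only known at run time, the string form is the one
to use, and the translation from C<::> to a directory separator has to be
done by hand.  The helper C<try_load> below is an assumption of this
sketch, as is the deliberately nonexistent module name.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module named in a string; true on success.
sub try_load {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Some::Module -> Some/Module.pm
    return eval { require $file; 1 } ? 1 : 0;
}

print try_load("File::Spec")             ? "loaded\n" : "missing\n";
print try_load("No::Such::Module::Here") ? "loaded\n" : "missing\n";
```

Wrapping the C<require> in C<eval> turns a missing module into a false
return value instead of a fatal error.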

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled.  This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file.  This will not work if you use C<require>
instead of C<use>.  With C<require> you can get into this problem:

    require Cwd;		# make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;			# import names from Cwd::
    $here = getcwd();

    require Cwd;	    	# make Cwd:: accessible
    $here = getcwd(); 		# oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution.  An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module.  In that case, it's easy to use C<require> instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>.  But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems.  Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module.  If so, these will be entirely transparent to the user of
the module.  It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.  For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head2 Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

Since 5.6.0, Perl has had support for a new type of threads called
interpreter threads (ithreads). These threads can be used explicitly
and implicitly.

Ithreads work by cloning the data tree so that no data is shared
between different threads. These threads can be used by using the C<threads>
module or by doing fork() on win32 (fake fork() support). When a
thread is cloned all Perl data is cloned, however non-Perl data cannot
be cloned automatically.  Perl after 5.7.2 has support for the C<CLONE>
special subroutine.  In C<CLONE> you can do whatever you need to do,
for example handling the cloning of non-Perl data, if necessary.
C<CLONE> will be called once as a class method for every package that has it
defined (or inherits it).  It will be called in the context of the new thread,
so all modifications are made in the new area.  Currently CLONE is called with
no parameters other than the invocant package name, but code should not assume
that this will remain unchanged, as it is likely that in future extra parameters
will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().
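
One way to do that bookkeeping is sketched below.  The class name
C<My::Tracked> and the helpers C<_remember> and C<live_count> are
assumptions made for the example, and the body of C<CLONE> only
rebuilds the registry; whatever fix-up your non-Perl data needs would
go there.

```perl
package My::Tracked;    # hypothetical class name
use strict;
use warnings;
use Scalar::Util qw(weaken refaddr);

my %registry;           # address => weak reference to each live object

sub _remember {
    my ($obj) = @_;
    my $addr = refaddr($obj);
    $registry{$addr} = $obj;
    weaken($registry{$addr});       # the registry must not keep objects alive
}

sub new {
    my ($class, %args) = @_;
    my $self = bless {%args}, $class;
    _remember($self);
    return $self;
}

sub DESTROY {
    my ($self) = @_;
    delete $registry{ refaddr($self) };
}

# Runs in the new thread.  The cloned objects live at new addresses,
# so the registry is rebuilt; per-object fix-up would happen here too.
sub CLONE {
    my @live = grep { defined $_ } values %registry;
    %registry = ();
    _remember($_) for @live;
}

sub live_count { return scalar keys %registry }
```

The weak references matter: without C<weaken()> the registry itself
would hold every object alive and C<DESTROY> would never run.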

Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine.
Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is
called just before cloning starts, and in the context of the parent
thread. If it returns a true value, then no objects of that class will
be cloned; or rather, they will be copied as unblessed, undef values.
This provides a simple mechanism for making a module threadsafe; just add
C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will
now only be called once per object. Of course, if the child thread needs
to make use of the objects, then a more sophisticated approach is
needed.

Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other
than the invocant package name, although that may change. Similarly, to
allow for future expansion, the return value should be a single C<0> or
C<1> value.

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perltoot> and L<perltooc> for an in-depth tutorial on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.

--- NEW FILE: perldebug.pod ---
=head1 NAME
X<debug> X<debugger>

perldebug - Perl debugging

=head1 DESCRIPTION

First of all, have you tried using the B<-w> switch?

If you're new to the Perl debugger, you may prefer to read
L<perldebtut>, which is a tutorial introduction to the debugger.

=head1 The Perl Debugger

If you invoke Perl with the B<-d> switch, your script runs under the
Perl source debugger.  This works like an interactive Perl
environment, prompting for debugger commands that let you examine
source code, set breakpoints, get stack backtraces, change the values of
[...1106 lines suppressed...]
have to type the path or C<which $scriptname>.

  $ perl -Sd foo.pl

=head1 BUGS

You cannot get stack frame information or in any fashion debug functions
that were not compiled by Perl, such as those from C or C++ extensions.

If you alter your @_ arguments in a subroutine (such as with C<shift>
or C<pop>), the stack backtrace will not show the original values.

The debugger does not currently work in conjunction with the B<-W>
command-line switch, because it itself is not free of warnings.

If you're in a slow syscall (like C<wait>ing, C<accept>ing, or C<read>ing
from your keyboard or a socket) and haven't set up your own C<$SIG{INT}>
handler, then you won't be able to CTRL-C your way back to the debugger,
because the debugger's own C<$SIG{INT}> handler doesn't understand that
it needs to raise an exception to longjmp(3) out of slow syscalls.

--- NEW FILE: perlintro.pod ---
=head1 NAME

perlintro -- a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl
programming language, along with pointers to further documentation.  It
is intended as a "bootstrap" guide for those who are new to the
language, and provides just enough information for you to be able to
read other people's Perl and understand roughly what it's doing, or
write your own simple scripts.

This introductory document does not aim to be complete.  It does not
even aim to be entirely accurate.  In some cases perfection has been
sacrificed in the goal of getting the general idea across.  You are
I<strongly> advised to follow this introduction with more information
from the full Perl manual, the table of contents to which can be found
in L<perltoc>.

Throughout this document you'll see references to other parts of the 
Perl documentation.  You can read that documentation using the C<perldoc>
command or whatever method you're using to read this document.

=head2 What is Perl?

Perl is a general-purpose programming language originally developed for 
text manipulation and now used for a wide range of tasks including 
system administration, web development, network programming, GUI 
development, and more.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal).  Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and 
no doubt other places.  From this we can determine that Perl is different 
things to different people, but that lots of people think it's at least
worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

    perl progname.pl

Alternatively, put this as the first line of your script:

    #!/usr/bin/env perl

... and run the script as C</path/to/script.pl>.  Of course, it'll need
to be executable first, so C<chmod 755 script.pl> (under Unix).

For more information, including instructions for other platforms such as
Windows and Mac OS, read L<perlrun>.

=head2 Basic syntax overview

A Perl script or program consists of one or more statements.  These
statements are simply written in the script in a straightforward
fashion.  There is no need to have a C<main()> function or anything of
that kind.

Perl statements end in a semi-colon:

    print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

    # This is a comment

Whitespace is irrelevant:

    print
        "Hello, world"
        ;

... except inside quoted strings:

    # this would print with a linebreak in the middle
    print "Hello
    world";

Double quotes or single quotes may be used around literal strings:

    print "Hello, world";
    print 'Hello, world';

However, only double quotes "interpolate" variables and special
characters such as newlines (C<\n>):

    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

    print 42;

You can use parentheses for functions' arguments or omit them
according to your personal taste.  They are only required 
occasionally to clarify issues of precedence.

    print("Hello, world\n");
    print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

    my $animal = "camel";
    my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl 
will automatically convert between them as required.  There is no need 
to pre-declare your variable types.

Scalar values can be used in various ways:

    print $animal;
    print "The animal is $animal\n";
    print "The square of $answer is ", $answer * $answer, "\n";

There are a number of "magic" scalars with names that look like
punctuation or line noise.  These special variables are used for all
kinds of purposes, and are documented in L<perlvar>.  The only one you
need to know about for now is C<$_> which is the "default variable".
It's used as the default argument to a number of functions in Perl, and
it's set implicitly by certain looping constructs.  

    print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed.  Here's how you get at elements in an array:

    print $animals[0];              # prints "camel"
    print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element 
of an array:

    print $mixed[$#mixed];       # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there 
are in an array.  Don't bother.  As it happens, using C<@array> where Perl
expects to find a scalar value ("in scalar context") will give you the number
of elements in the array:

    if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because 
we're getting just a single value out of the array -- you ask for a scalar, 
you get a scalar.

To get multiple values from an array:

    @animals[0,1];                  # gives ("camel", "llama");
    @animals[0..2];                 # gives ("camel", "llama", "owl");
    @animals[1..$#animals];         # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

    my @sorted    = sort @animals;
    my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command
line arguments to your script) and C<@_> (the arguments passed to a
subroutine).  These are documented in L<perlvar>.

=item Hashes

A hash represents a set of key/value pairs:

    my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more
nicely:

    my %fruit_color = (
        apple  => "red",
        banana => "yellow",
    );

To get at hash elements:

    $fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with C<keys()> and
C<values()>.

    my @fruits = keys %fruit_color;
    my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys
and loop through them.

Just like special scalars and arrays, there are also special hashes.  
The most well known of these is C<%ENV> which contains environment
variables.  Read all about it (and other special variables) in
L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.
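
Three of the points made above (the default variable C<$_>, an array
used in scalar context, and sorting hash keys before looping) fit into
one short script.  The variable names C<$count> and C<@shouted> are
just for illustration.

```perl
use strict;
use warnings;

my @animals     = ("camel", "llama", "owl");
my %fruit_color = (apple => "red", banana => "yellow");

# An array in scalar context yields its element count.
my $count = @animals;
print "There are $count animals\n";          # prints "There are 3 animals"

# foreach sets $_ implicitly; uc uses $_ when given no argument.
my @shouted;
foreach (@animals) {
    push @shouted, uc;
}
print "@shouted\n";                          # prints "CAMEL LLAMA OWL"

# Hashes are unordered, so sort the keys for predictable output.
foreach my $fruit (sort keys %fruit_color) {
    print "$fruit is $fruit_color{$fruit}\n";
}
```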

More complex data types can be constructed using references, which allow
you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data
type. So by storing a reference as the value of an array or hash
element, you can easily create lists and hashes within lists and    
hashes. The following example shows a 2 level hash of hash
structure using anonymous hash references.

    my $variables = {
        scalar  =>  {
                     description => "single item",
                     sigil => '$',
                    },
        array   =>  {
                     description => "ordered list of items",
                     sigil => '@',
                    },
        hash    =>  {
                     description => "key/value pairs",
                     sigil => '%',
                    },
    };

    print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";
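
Walking the whole structure works the same way as looking up one
element.  The following is only a sketch; C<@report> and C<$info> are
names chosen for the example.

```perl
use strict;
use warnings;

my $variables = {
    scalar => { description => "single item",           sigil => '$' },
    array  => { description => "ordered list of items", sigil => '@' },
    hash   => { description => "key/value pairs",       sigil => '%' },
};

# %$variables dereferences the outer hash reference; each value is
# itself a hash reference, reached with the -> operator.
my @report;
foreach my $type (sort keys %$variables) {
    my $info = $variables->{$type};
    push @report, "$info->{sigil} $type: $info->{description}";
}
print "$_\n" for @report;
```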

Exhaustive information on the topic of references can be found in
L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.

=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

    my $var = "value";

The C<my> is actually not required; you could just use:

    $var = "value";

However, the above usage will create global variables throughout your
program, which is bad programming practice.  C<my> creates lexically
scoped variables instead.  The variables are scoped to the block
(i.e. a bunch of statements surrounded by curly-braces) in which they
are defined.

    my $a = "foo";
    if ($some_condition) {
        my $b = "bar";
        print $a;           # prints "foo"
        print $b;           # prints "bar"
    }
    print $a;               # prints "foo"
    print $b;               # prints nothing; $b has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of
your Perl scripts means that the interpreter will pick up certain common 
programming errors.  For instance, in the example above, the final
C<print $b> would cause a compile-time error and prevent you from
running the program.  Using C<strict> is highly recommended.
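
To see the effect without aborting your own script, you can compile a
small snippet at run time with a string C<eval> and inspect the error.
This is only a demonstration device; in real programs you simply put
C<use strict;> at the top of the file.

```perl
use warnings;

# Compile a snippet at run time so the failure can be inspected.
my $ok = eval q{
    use strict;
    $undeclared = "value";      # no "my": strict 'vars' rejects this
    1;
};
my $error = $@;

print "strict rejected the snippet:\n$error" unless $ok;
```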

=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs except for
case/switch (but if you really want it, there is a Switch module in Perl
5.8 and newer, and on CPAN. See the section on modules, below, for more
information about modules and CPAN).

The conditions can be any Perl expression.  See the list of operators in
the next section for information on comparison and boolean logic operators, 
which are commonly used in conditional statements.

=over 4

=item if

    if ( condition ) {
        ...
    } elsif ( other condition ) {
        ...
    } else {
        ...
    }

There's also a negated version of it:

    unless ( condition ) {
        ...
    }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one
line in the block.  However, there is a clever way of making your one-line
conditional blocks more English-like:

    # the traditional way
    if ($zippy) {
        print "Yow!";
    }

    # the Perlish post-condition way
    print "Yow!" if $zippy;
    print "We have no bananas" unless $bananas;

=item while

    while ( condition ) {
        ...
    }

There's also a negated version, for the same reason we have C<unless>:

    until ( condition ) {
        ...
    }

You can also use C<while> in a post-condition:

    print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

    for ($i=0; $i <= $max; $i++) {
        ...
    }

The C style for loop is rarely needed in Perl since Perl provides
the more friendly list scanning C<foreach> loop.

=item foreach

    foreach (@array) {
        print "This element is $_\n";
    }

    # you don't have to use the default $_ either...
    foreach my $key (keys %hash) {
        print "The value of $key is $hash{$key}\n";
    }

=back

For more detail on looping constructs (and some that weren't mentioned in
this overview) see L<perlsyn>.

=head2 Builtin operators and functions

Perl comes with a wide selection of builtin functions.  Some of the ones
we've already seen include C<print>, C<sort> and C<reverse>.  A list of
them is given at the start of L<perlfunc> and you can easily read 
about any given function by using C<perldoc -f I<functionname>>.

Perl operators are documented in full in L<perlop>, but here are a few
of the most common ones:

=over 4

=item Arithmetic

    +   addition
    -   subtraction
    *   multiplication
    /   division

=item Numeric comparison

    ==  equality
    !=  inequality
    <   less than
    >   greater than
    <=  less than or equal
    >=  greater than or equal

=item String comparison

    eq  equality
    ne  inequality
    lt  less than
    gt  greater than
    le  less than or equal
    ge  greater than or equal

(Why do we have separate numeric and string comparisons?  Because we don't 
have special variable types, and Perl needs to know whether to sort 
numerically (where 99 is less than 100) or alphabetically (where 100 comes
before 99).)

=item Boolean logic

    &&  and
    ||  or
    !   not

(C<and>, C<or> and C<not> aren't just in the above table as descriptions 
of the operators -- they're also supported as operators in their own
right.  They're more readable than the C-style operators, but have 
different precedence to C<&&> and friends.  Check L<perlop> for more
detail.)

=item Miscellaneous

    =   assignment
    .   string concatenation
    x   string multiplication
    ..  range operator (creates a list of numbers)

=back

Many operators can be combined with a C<=> as follows:

    $a += 1;        # same as $a = $a + 1
    $a -= 1;        # same as $a = $a - 1
    $a .= "\n";     # same as $a = $a . "\n";
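
The difference between numeric and string comparison is easiest to see
when sorting.  The sketch below uses C<< <=> >> and C<cmp>, the
three-way counterparts of the comparison operators listed above; they
are described in L<perlop> but do not appear in the tables here.

```perl
use strict;
use warnings;

my @values = (100, 99, 9);

# <=> compares as numbers, cmp compares as strings.
my @numeric = sort { $a <=> $b } @values;
my @string  = sort { $a cmp $b } @values;

print "numeric: @numeric\n";    # prints "numeric: 9 99 100"
print "string:  @string\n";     # prints "string:  100 9 99"

# The same value can be equal as a number and different as a string.
print "equal as numbers\n"  if "1.0" == 1;
print "differ as strings\n" if "1.0" ne "1";
```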

=head2 Files and I/O

You can open a file for input or output using the C<open()> function.
It's documented in extravagant detail in L<perlfunc> and L<perlopentut>, 
but in short:

    open(INFILE,  "input.txt")   or die "Can't open input.txt: $!";
    open(OUTFILE, ">output.txt") or die "Can't open output.txt: $!";
    open(LOGFILE, ">>my.log")    or die "Can't open logfile: $!";

You can read from an open filehandle using the C<< <> >> operator.  In
scalar context it reads a single line from the filehandle, and in list
context it reads the whole file in, assigning each line to an element of
the list:

    my $line  = <INFILE>;
    my @lines = <INFILE>;

Reading in the whole file at one time is called slurping. It can
be useful but it may be a memory hog. Most text file processing
can be done a line at a time with Perl's looping constructs.

The C<< <> >> operator is most often seen in a C<while> loop:

    while (<INFILE>) {     # assigns each line in turn to $_ 
        print "Just read in this line: $_";
    }

We've already seen how to print to standard output using C<print()>.
However, C<print()> can also take an optional first argument specifying
which filehandle to print to:

    print STDERR "This is your final warning.\n";
    print OUTFILE $record;
    print LOGFILE $logmessage;

When you're done with your filehandles, you should C<close()> them
(though to be honest, Perl will clean up after you if you forget):

    close INFILE;
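
Here is a self-contained sketch of line-at-a-time processing.  It is a
variation on the calls shown above, not the same form: it uses the
three-argument C<open()> with a lexical filehandle, and it reads from
a string (by passing a reference to a scalar as the file name) so that
no file on disk is needed.  C<$text> and C<@kept> are names invented
for the example.

```perl
use strict;
use warnings;

my $text = "first line\n\nthird line\n";

# A reference to a scalar as the "file name" reads from that string.
open(my $in, '<', \$text) or die "Can't open in-memory file: $!";

my @kept;
while (my $line = <$in>) {      # one line per iteration
    chomp $line;                # remove the trailing newline
    next if $line eq "";        # skip blank lines
    push @kept, $line;
}
close $in;

print scalar(@kept), " non-blank lines\n";    # prints "2 non-blank lines"
```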

=head2 Regular expressions

Perl's regular expression support is both broad and deep, and is the
subject of lengthy documentation in L<perlrequick>, L<perlretut>, and
elsewhere.  However, in short:

=over 4

=item Simple matching

    if (/foo/)       { ... }  # true if $_ contains "foo"
    if ($a =~ /foo/) { ... }  # true if $a contains "foo"

The C<//> matching operator is documented in L<perlop>.  It operates on
C<$_> by default, or can be bound to another variable using the C<=~>
binding operator (also documented in L<perlop>).

=item Simple substitution

    s/foo/bar/;               # replaces foo with bar in $_
    $a =~ s/foo/bar/;         # replaces foo with bar in $a
    $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar in $a

The C<s///> substitution operator is documented in L<perlop>.

=item More complex regular expressions

You don't just have to match on fixed strings.  In fact, you can match
on just about anything you could dream of by using more complex regular
expressions.  These are documented at great length in L<perlre>, but for
the meantime, here's a quick cheat sheet:

    .                   a single character
    \s                  a whitespace character (space, tab, newline)
    \S                  non-whitespace character
    \d                  a digit (0-9)
    \D                  a non-digit
    \w                  a word character (a-z, A-Z, 0-9, _)
    \W                  a non-word character
    [aeiou]             matches a single character in the given set
    [^aeiou]            matches a single character outside the given set
    (foo|bar|baz)       matches any of the alternatives specified

    ^                   start of string
    $                   end of string

Quantifiers can be used to specify how many of the previous thing you 
want to match on, where "thing" means either a literal character, one 
of the metacharacters listed above, or a group of characters or 
metacharacters in parentheses.

    *                   zero or more of the previous thing
    +                   one or more of the previous thing
    ?                   zero or one of the previous thing
    {3}                 matches exactly 3 of the previous thing
    {3,6}               matches between 3 and 6 of the previous thing
    {3,}                matches 3 or more of the previous thing

Some brief examples:

    /^\d+/              string starts with one or more digits
    /^$/                nothing in the string (start and end are adjacent)
    /(\d\s){3}/         three digits, each followed by a whitespace
                        character (eg "3 4 5 ")
    /(a.)+/             matches a string in which every odd-numbered letter 
                        is a (eg "abacadaf")

    # This loop reads from STDIN, and prints non-blank lines:
    while (<>) {
        next if /^$/;
        print;
    }

=item Parentheses for capturing

As well as grouping, parentheses serve a second purpose.  They can be 
used to capture the results of parts of the regexp match for later use.
The results end up in C<$1>, C<$2> and so on.

    # a cheap and nasty way to break an email address up into parts

    if ($email =~ /([^@]+)@(.+)/) {
        print "Username is $1\n";
        print "Hostname is $2\n";
    }

=item Other regexp features

Perl regexps also support backreferences, lookaheads, and all kinds of
other complex details.  Read all about them in L<perlrequick>,
L<perlretut>, and L<perlre>.

=back

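The pieces from the regular expression section (character classes,
quantifiers, anchors and capturing parentheses) combine as in the
sketch below.  The sample addresses are invented, and the pattern is
still a cheap approximation, not a real address validator.

```perl
use strict;
use warnings;

my @addresses = ('alice@example.org', 'not an address', 'a.b@host.example');

my @parsed;
foreach my $email (@addresses) {
    # [^\@\s]+  one or more characters that are neither "@" nor whitespace
    # [\w.]+    one or more word characters or dots
    if ($email =~ /^([^\@\s]+)\@([\w.]+)$/) {
        push @parsed, "$1 at $2";
    }
}
print "$_\n" for @parsed;
```

Only the two strings that contain an C<@> between a user part and a
host part end up in C<@parsed>.
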
=head2 Writing subroutines

Writing subroutines is easy:

    sub logger {
        my $logmessage = shift;
        print LOGFILE $logmessage;
    }

What's that C<shift>?  Well, the arguments to a subroutine are available
to us as a special array called C<@_> (see L<perlvar> for more on that).
The default argument to the C<shift> function just happens to be C<@_>.
So C<my $logmessage = shift;> shifts the first item off the list of
arguments and assigns it to C<$logmessage>. 

We can manipulate C<@_> in other ways too:

    my ($logmessage, $priority) = @_;       # common
    my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

    sub square {
        my $num = shift;
        my $result = $num * $num;
        return $result;
    }

For more information on writing subroutines, see L<perlsub>.
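
Putting the two styles of argument handling together, here is a small
runnable sketch.  The subroutine C<describe> is made up for the
example.

```perl
use strict;
use warnings;

# Unpack several arguments from @_ with one list assignment.
sub describe {
    my ($name, $value) = @_;
    return "$name is $value";
}

# Take a single argument with shift.
sub square {
    my $num = shift;
    return $num * $num;
}

my $message = describe("the square of 7", square(7));
print "$message\n";             # prints "the square of 7 is 49"
```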

=head2 OO Perl

OO Perl is relatively simple and is implemented using references which
know what sort of object they are based on Perl's concept of packages.
However, OO Perl is largely beyond the scope of this document.  
Read L<perlboot>, L<perltoot>, L<perltooc> and L<perlobj>.

As a beginning Perl programmer, your most common use of OO Perl will be
in using third-party modules, which are documented below.

=head2 Using Perl modules

Perl modules provide a range of features to help you avoid reinventing
the wheel, and can be downloaded from CPAN ( http://www.cpan.org/ ).  A
number of popular modules are included with the Perl distribution
itself.

Categories of modules range from text manipulation to network protocols
to database integration to graphics.  A categorized list of modules is
also available from CPAN.

To learn how to install modules you download from CPAN, read
L<perlmodinstall>.

To learn how to use a particular module, use C<perldoc I<Module::Name>>.
Typically you will want to C<use I<Module::Name>>, which will then give
you access to exported functions or an OO interface to the module.
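
As an illustration, C<List::Util> is one module that ships with Perl
and exports functions on request.  The choice of module and of the
three functions is ours; any module with an export list works the same
way.

```perl
use strict;
use warnings;
use List::Util qw(sum max first);   # import selected functions by name

my @numbers = (23, 42, 69);

my $total     = sum(@numbers);
my $largest   = max(@numbers);
my $first_big = first { $_ > 30 } @numbers;   # first element over 30

print "sum: $total, max: $largest, first over 30: $first_big\n";
# prints "sum: 134, max: 69, first over 30: 42"
```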

L<perlfaq> contains questions and answers related to many common
tasks, and often provides suggestions for good CPAN modules to use.

L<perlmod> describes Perl modules in general.  L<perlmodlib> lists the
modules which came with your Perl installation.

If you feel the urge to write Perl modules, L<perlnewmod> will give you
good advice.

=head1 AUTHOR

Kirrily "Skud" Robert <skud at cpan.org>

--- NEW FILE: perlunicode.pod ---
=head1 NAME

perlunicode - Unicode support in Perl

=head1 DESCRIPTION

=head2 Important Caveats

Unicode support is an extensive requirement. While Perl does not
implement the Unicode standard or the accompanying technical reports
from cover to cover, Perl does support many Unicode features.

=over 4

=item Input and Output Layers

Perl knows when a filehandle uses Perl's internal Unicode encodings
(UTF-8, or UTF-EBCDIC if in EBCDIC) if the filehandle is opened with
the ":utf8" layer.  Other encodings can be converted to Perl's
[...1462 lines suppressed...]

=item *

A large scalar that you know can only contain ASCII

Scalars that contain only ASCII and are marked as UTF-8 are sometimes
a drag to your program. If you recognize such a situation, just remove
the UTF-8 flag:

  utf8::downgrade($val) if $] > 5.007;

=back

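The effect of C<utf8::downgrade> on the flag can be observed with
C<utf8::is_utf8>.  In this sketch the flag is first switched on
artificially with C<utf8::upgrade>, which is an assumption of the
demonstration; in real code the flag would have been set by some
earlier operation.

```perl
use strict;
use warnings;

my $val = "plain ASCII text";

utf8::upgrade($val);            # force the UTF-8 flag on, for demonstration
my $before = utf8::is_utf8($val) ? 1 : 0;

utf8::downgrade($val);          # safe here: the string is pure ASCII
my $after = utf8::is_utf8($val) ? 1 : 0;

print "flag before: $before, after: $after\n";
# prints "flag before: 1, after: 0"
```
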
=head1 SEE ALSO

L<perluniintro>, L<encoding>, L<Encode>, L<open>, L<utf8>, L<bytes>,
L<perlretut>, L<perlvar/"${^UNICODE}">


--- NEW FILE: perllexwarn.pod ---
=head1 NAME
X<warning, lexical> X<warnings> X<warning>

perllexwarn - Perl Lexical Warnings

=head1 DESCRIPTION

The C<use warnings> pragma is a replacement for both the command line
flag B<-w> and the equivalent Perl variable, C<$^W>.

The pragma works just like the existing "strict" pragma.
This means that the scope of the warning pragma is limited to the
enclosing block. It also means that the pragma setting will not
leak across files (via C<use>, C<require> or C<do>). This allows
authors to independently define the degree of warning checks that will
be applied to their module.

By default, optional warnings are disabled, so any legacy code that
doesn't attempt to control the warnings will work unchanged.

All warnings are enabled in a block by either of these:

    use warnings;
    use warnings 'all';

Similarly all warnings are disabled in a block by either of these:

    no warnings;
    no warnings 'all';

For example, consider the code below:

    use warnings;
    my @a;
    {
        no warnings;
        my $b = @a[0];
    }
    my $c = @a[0];

The code in the enclosing block has warnings enabled, but the inner
block has them disabled. In this case that means the assignment to the
scalar C<$c> will trip the C<"Scalar value @a[0] better written as $a[0]">
warning, but the assignment to the scalar C<$b> will not.
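
The same scoping can be observed at run time by collecting warnings
with a C<$SIG{__WARN__}> handler.  This sketch swaps in a different
warning (C<"Use of uninitialized value">) because it is raised while
the program runs; C<@caught> is a name made up for the example.

```perl
use strict;
use warnings;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };   # collect warnings

my $undef;
{
    no warnings;
    my $quiet = "value: " . $undef;     # inside the block: silent
}
my $loud = "value: " . $undef;          # outside the block: warns

print scalar(@caught), " warning caught: $caught[0]";
```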

=head2 Default Warnings and Optional Warnings

Before the introduction of lexical warnings, Perl had two classes of
warnings: mandatory and optional. 

As its name suggests, if your code tripped a mandatory warning, you
would get a warning whether you wanted it or not.
For example, the code below would always produce an C<"isn't numeric">
warning about the "2:".

    my $a = "2:" + 3;

With the introduction of lexical warnings, mandatory warnings now become
I<default> warnings. The difference is that although the previously
mandatory warnings are still enabled by default, they can then be
subsequently enabled or disabled with the lexical warning pragma. For
example, in the code below, an C<"isn't numeric"> warning will only
be reported for the C<$a> variable.

    my $a = "2:" + 3;
    no warnings;
    my $b = "2:" + 3;

Note that neither the B<-w> flag nor the C<$^W> variable can be used to
disable/enable default warnings. They are still mandatory in this case.

=head2 What's wrong with B<-w> and C<$^W>

Although very useful, the big problem with using B<-w> on the command
line to enable warnings is that it is all or nothing. Take the typical
scenario when you are writing a Perl program. Parts of the code you
will write yourself, but it's very likely that you will make use of
pre-written Perl modules. If you use the B<-w> flag in this case, you
end up enabling warnings in pieces of code that you haven't written.

Similarly, using C<$^W> to either disable or enable blocks of code is
fundamentally flawed. For a start, say you want to disable warnings in
a block of code. You might expect this to be enough to do the trick:

     {
         local ($^W) = 0;
         my $a =+ 2;
         my $b; chop $b;
     }

When this code is run with the B<-w> flag, a warning will be produced
for the C<$a> line -- C<"Reversed += operator">.

The problem is that Perl has both compile-time and run-time warnings. To
disable compile-time warnings you need to rewrite the code like this:

     {
         BEGIN { $^W = 0 }
         my $a =+ 2;
         my $b; chop $b;
     }

The other big problem with C<$^W> is the way you can inadvertently
change the warning setting in unexpected places in your code. For example,
when the code below is run (without the B<-w> flag), the second call
to C<doit> will trip a C<"Use of uninitialized value"> warning, whereas
the first will not.

    sub doit
    {
        my $b; chop $b;
    }

    doit();

    {
        local ($^W) = 1;
        doit()
    }

This is a side-effect of C<$^W> being dynamically scoped.

Lexical warnings get around these limitations by allowing finer control
over where warnings can or can't be tripped.

=head2 Controlling Warnings from the Command Line

There are three Command Line flags that can be used to control when
warnings are (or aren't) produced:

=over 5

=item B<-w>

This is the existing flag. If the lexical warnings pragma is B<not>
used in any of your code, or any of the modules that you use, this flag
will enable warnings everywhere. See L<Backward Compatibility> for
details of how this flag interacts with lexical warnings.

=item B<-W>

If the B<-W> flag is used on the command line, it will enable all warnings
throughout the program regardless of whether warnings were disabled
locally using C<no warnings> or C<$^W =0>. This includes all files that get
included via C<use>, C<require> or C<do>.
Think of it as the Perl equivalent of the "lint" command.

=item B<-X>

Does the exact opposite to the B<-W> flag, i.e. it disables all warnings.

=back

=head2 Backward Compatibility

If you are used to working with a version of Perl prior to the
introduction of lexically scoped warnings, or have code that uses both
lexical warnings and C<$^W>, this section will describe how they interact.

How Lexical Warnings interact with B<-w>/C<$^W>:

=over 5

=item 1.

If none of the three command line flags (B<-w>, B<-W> or B<-X>) that
control warnings is used and neither C<$^W> nor the C<warnings> pragma
is used, then default warnings will be enabled and optional warnings
disabled.
This means that legacy code that doesn't attempt to control the warnings
will work unchanged.

=item 2.

The B<-w> flag just sets the global C<$^W> variable as in 5.005 -- this
means that any legacy code that currently relies on manipulating C<$^W>
to control warning behavior will still work as is. 

=item 3.

Apart from now being a boolean, the C<$^W> variable operates in exactly
the same horrible uncontrolled global way, except that it cannot
disable/enable default warnings.

=item 4.

If a piece of code is under the control of the C<warnings> pragma,
both the C<$^W> variable and the B<-w> flag will be ignored for the
scope of the lexical warning.

=item 5.

The only way to override a lexical warnings setting is with the B<-W>
or B<-X> command line flags.

=back

The combined effect of 3 & 4 is that it will allow code which uses
the C<warnings> pragma to control the warning behavior of $^W-type
code (using a C<local $^W=0>) if it really wants to, but not vice-versa.

=head2 Category Hierarchy
X<warning, categories>

A hierarchy of "categories" has been defined to allow groups of warnings
to be enabled/disabled in isolation.

The current hierarchy is:

  all -+
       +- closure
       +- deprecated
       +- exiting
       +- glob
       +- io -----------+
       |                |
       |                +- closed
       |                |
       |                +- exec
       |                |
       |                +- layer
       |                |
       |                +- newline
       |                |
       |                +- pipe
       |                |
       |                +- unopened
       +- misc
       +- numeric
       +- once
       +- overflow
       +- pack
       +- portable
       +- recursion
       +- redefine
       +- regexp
       +- severe -------+
       |                |
       |                +- debugging
       |                |
       |                +- inplace
       |                |
       |                +- internal
       |                |
       |                +- malloc
       +- signal
       +- substr
       +- syntax -------+
       |                |
       |                +- ambiguous
       |                |
       |                +- bareword
       |                |
       |                +- digit
       |                |
       |                +- parenthesis
       |                |
       |                +- precedence
       |                |
       |                +- printf
       |                |
       |                +- prototype
       |                |
       |                +- qw
       |                |
       |                +- reserved
       |                |
       |                +- semicolon
       +- taint
       +- threads
       +- uninitialized
       +- unpack
       +- untie
       +- utf8
       +- void
       +- y2k

Just like the "strict" pragma, any of these categories can be combined:

    use warnings qw(void redefine);
    no warnings qw(io syntax untie);

Also like the "strict" pragma, if there is more than one instance of the
C<warnings> pragma in a given scope the cumulative effect is additive. 

    use warnings qw(void); # only "void" warnings enabled
    use warnings qw(io);   # only "void" & "io" warnings enabled
    no warnings qw(void);  # only "io" warnings enabled
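
As a rough illustration of how the pragma follows lexical scope, the
sketch below counts the warnings that reach a C<$SIG{__WARN__}> handler.
The variable names are invented for the example; only the
C<uninitialized> category from the hierarchy above is used.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be counted.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $undef;    # deliberately left undefined

{
    no warnings 'uninitialized';       # switched off for this block only
    my $quiet = "value: " . $undef;    # no warning here
}

my $loud = "value: " . $undef;         # the outer scope still warns

print scalar(@caught), " warning(s) caught\n";
```

Only the concatenation outside the inner block should be reported,
because C<no warnings> stops applying at the closing brace.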

To determine which category a specific warning has been assigned to, see
L<perldiag>.

Note: In Perl 5.6.1, the lexical warnings category "deprecated" was a
sub-category of the "syntax" category. It is now a top-level category
in its own right.

=head2 Fatal Warnings
X<warning, fatal>

The presence of the word "FATAL" in the category list will escalate any
warnings detected from the categories specified in the lexical scope
into fatal errors. In the code below, the use of C<time>, C<length>
and C<join> can all produce a C<"Useless use of xxx in void context">
warning.

    use warnings;

    time;

    {
        use warnings FATAL => qw(void);
        length "abc";
    }

    join "", 1,2,3;

    print "done\n";

When run it produces this output:

    Useless use of time in void context at fatal line 3.
    Useless use of length in void context at fatal line 7.

The scope where C<length> is used has escalated the C<void> warnings
category into a fatal error, so the program terminates as soon as it
encounters the warning.

To explicitly turn off a "FATAL" warning you just disable the warning
it is associated with.  So, for example, to disable the "void" warning
in the example above, either of these will do the trick:

    no warnings qw(void);
    no warnings FATAL => qw(void);

If you want to downgrade a warning that has been escalated into a fatal
error back to a normal warning, you can use the "NONFATAL" keyword. For
example, the code below will promote all warnings into fatal errors,
except for those in the "syntax" category.

    use warnings FATAL => 'all', NONFATAL => 'syntax';
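
Because an escalated warning behaves like C<die>, it can be trapped with
C<eval>. The following is a minimal sketch rather than a recommended
pattern; the C<numeric> category comes from the hierarchy above, and the
variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "abc";    # not a number

my $ok = eval {
    use warnings FATAL => 'numeric';
    my $n = $text + 1;    # the "isn't numeric" warning dies in this block
    1;
};

my $error = $ok ? '' : $@;
print "caught: $error" unless $ok;
```

Outside the C<eval> block the C<numeric> category is an ordinary
warning again, since the C<FATAL> setting is lexically scoped.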

=head2 Reporting Warnings from a Module
X<warning, reporting> X<warning, registering>

The C<warnings> pragma provides a number of functions that are useful for
module authors. These are used when you want to report a module-specific
warning to a calling module that has enabled warnings via the C<warnings>
pragma.
Consider the module C<MyMod::Abc> below.

    package MyMod::Abc;

    use warnings::register;

    sub open {
        my $path = shift;
        if ($path !~ m#^/#) {
            warnings::warn("changing relative path to /var/abc")
                if warnings::enabled();
            $path = "/var/abc/$path";
        }
    }

    1;


The call to C<warnings::register> will create a new warnings category
called "MyMod::Abc", i.e. the new category name matches the current
package name. The C<open> function in the module will display a warning
message if it gets given a relative path as a parameter. This warning
will only be displayed if the code that uses C<MyMod::Abc> has actually
enabled it with the C<warnings> pragma, as shown below.

    use MyMod::Abc;
    use warnings 'MyMod::Abc';

It is also possible to test whether the pre-defined warnings categories are
set in the calling module with the C<warnings::enabled> function. Consider
this snippet of code:

    package MyMod::Abc;

    sub open {
        warnings::warnif("deprecated",
                         "open is deprecated, use new instead");
        new(@_);
    }

    sub new
    ...
    1;

The function C<open> has been deprecated, so code has been included to
display a warning message whenever the calling module has (at least) the
"deprecated" warnings category enabled. Something like this, say.

    use warnings 'deprecated';
    use MyMod::Abc;

Either the C<warnings::warn> or C<warnings::warnif> function should be
used to actually display the warnings message. This is because they can
make use of the feature that allows warnings to be escalated into fatal
errors. So in this case

    use MyMod::Abc;
    use warnings FATAL => 'MyMod::Abc';

the C<warnings::warnif> function will detect this and die after
displaying the warning message.
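
The pieces described above can be tried in a single file. This is only a
sketch: C<My::Greeter> and its C<greet> function are hypothetical names,
and the package is defined inline instead of being loaded with C<use>.

```perl
use strict;
use warnings;

{
    package My::Greeter;          # hypothetical module, defined inline
    use warnings::register;       # creates the "My::Greeter" category

    sub greet {
        my ($name) = @_;
        # warnif() consults the warnings in effect where greet() was called
        warnings::warnif("greet called without a name")
            unless defined $name;
        return "Hello, " . (defined $name ? $name : "stranger");
    }
}

# Collect warnings so the effect of each scope can be inspected.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

{
    no warnings 'My::Greeter';
    My::Greeter::greet();         # category off in this scope: silent
}
{
    use warnings 'My::Greeter';
    My::Greeter::greet();         # category on in this scope: warns
}

print scalar(@caught), " warning(s): @caught";
```

The module never decides by itself whether to warn; the decision comes
from the scope of the caller.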

The three warnings functions, C<warnings::warn>, C<warnings::warnif>
and C<warnings::enabled> can optionally take an object reference in place
of a category name. In this case the functions will use the class name
of the object as the warnings category.

Consider this example:

    package Original;

    no warnings;
    use warnings::register;

    sub new
    {
        my $class = shift;
        bless [], $class;
    }

    sub check
    {
        my $self = shift;
        my $value = shift;

        if ($value % 2 && warnings::enabled($self))
          { warnings::warn($self, "Odd numbers are unsafe") }
    }

    sub doit
    {
        my $self = shift;
        my $value = shift;
        $self->check($value);
        # ...
    }

    1;

    package Derived;

    use warnings::register;
    use Original;
    our @ISA = qw( Original );
    sub new
    {
        my $class = shift;
        bless [], $class;
    }

    1;


The code below makes use of both modules, but it only enables warnings from
C<Derived>.

    use Original;
    use Derived;
    use warnings 'Derived';
    my $a = new Original;
    $a->doit(1);
    my $b = new Derived;
    $b->doit(1);

When this code is run only the C<Derived> object, C<$b>, will generate
a warning. 

    Odd numbers are unsafe at main.pl line 7

Notice also that the warning is reported at the line where the object is first
used.

=head1 TODO

    The debugger saves and restores C<$^W> at runtime. I haven't checked
    whether the debugger will still work with the lexical warnings
    patch applied.

    I *think* I've got diagnostics to work with the lexical warnings
    patch, but there were design decisions made in diagnostics to work
    around the limitations of C<$^W>. Now that those limitations are gone,
    the module should be revisited.

  document calling the warnings::* functions from XS

=head1 SEE ALSO

L<warnings>, L<perldiag>.

=head1 AUTHOR

Paul Marquess

--- NEW FILE: perlfaq.pod ---
=head1 NAME

perlfaq - frequently asked questions about Perl


=head1 DESCRIPTION

The perlfaq comprises several documents that answer the most commonly
asked questions about Perl and Perl programming. It's divided by topic
into nine major sections outlined in this document.

=head2 Where to get the perlfaq

The perlfaq comes with the standard Perl distribution, so if you have Perl
you should have the perlfaq. You should also have the C<perldoc> tool
that lets you read the L<perlfaq>:

	$ perldoc perlfaq

Besides your local system, you can find the perlfaq on the web, including
[...1371 lines suppressed...]

=item *

How do I find out my hostname, domainname, or IP address?

=item *

How do I fetch a news article or the active newsgroups?

=item *

How do I fetch/put an FTP file?

=item *

How can I do RPC in Perl?


--- NEW FILE: perlfaq6.pod ---
=head1 NAME

perlfaq6 - Regular Expressions ($Revision: 1.2 $, $Date: 2006-12-04 17:01:33 $)


=head1 DESCRIPTION

This section is surprisingly small because the rest of the FAQ is
littered with answers involving regular expressions.  For example,
decoding a URL and checking whether something is a number are handled
with regular expressions, but those answers are found elsewhere in
this document (in L<perlfaq9>: "How do I decode or create those %-encodings
on the web" and L<perlfaq4>: "How do I determine whether a scalar is
a number/whole/integer/float", to be precise).

=head2 How can I hope to use regular expressions without creating illegible and unmaintainable code?
X<regex, legibility> X<regexp, legibility>
X<regular expression, legibility> X</x>

Three techniques can make regular expressions maintainable and
understandable.

=over 4

=item Comments Outside the Regex

Describe what you're doing and how you're doing it, using normal Perl
comments.

    # turn the line into the first word, a colon, and the
    # number of characters on the rest of the line
    s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg;

=item Comments Inside the Regex

The C</x> modifier causes whitespace to be ignored in a regex pattern
(except in a character class), and also allows you to use normal
comments there, too.  As you can imagine, whitespace and comments help
a lot.

C</x> lets you turn this:

    s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs;

into this:

    s{ <                    # opening angle bracket
        (?:                 # Non-backreffing grouping paren
             [^>'"] *       # 0 or more things that are neither > nor ' nor "
                |           #    or else
             ".*?"          # a section between double quotes (stingy match)
                |           #    or else
             '.*?'          # a section between single quotes (stingy match)
        ) +                 #   all occurring one or more times
       >                    # closing angle bracket
    }{}gsx;                 # replace with nothing, i.e. delete

It's still not quite so clear as prose, but it is very useful for
describing the meaning of each part of the pattern.

=item Different Delimiters

While we normally think of patterns as being delimited with C</>
characters, they can be delimited by almost any character.  L<perlre>
describes this.  For example, the C<s///> above uses braces as
delimiters.  Selecting another delimiter can avoid quoting the
delimiter within the pattern:

    s/\/usr\/local/\/usr\/share/g;	# bad delimiter choice
    s#/usr/local#/usr/share#g;		# better

=back


=head2 I'm having trouble matching over more than one line.  What's wrong?
X<regex, multiline> X<regexp, multiline> X<regular expression, multiline>

Either you don't have more than one line in the string you're looking
at (probably), or else you aren't using the correct modifier(s) on
your pattern (possibly).

There are many ways to get multiline data into a string.  If you want
it to happen automatically while reading input, you'll want to set $/
(probably to '' for paragraphs or C<undef> for the whole file) to
allow you to read more than one line at a time.

Read L<perlre> to help you decide which of C</s> and C</m> (or both)
you might want to use: C</s> allows dot to include newline, and C</m>
allows caret and dollar to match next to a newline, not just at the
end of the string.  You do need to make sure that you've actually
got a multiline string in there.
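
A small sketch of the difference between the two modifiers, using a
made-up two-line string:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /s the dot stops at a newline; with /s it may cross it.
my $plain  = $text =~ /first.*second/  ? 1 : 0;    # 0
my $with_s = $text =~ /first.*second/s ? 1 : 0;    # 1

# Without /m the caret matches only at the start of the string;
# with /m it also matches just after each embedded newline.
my @no_m   = $text =~ /^(\w+)/g;     # ("first")
my @with_m = $text =~ /^(\w+)/mg;    # ("first", "second")

print "plain=$plain s=$with_s no_m=@no_m with_m=@with_m\n";
```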

For example, this program detects duplicate words, even when they span
line breaks (but not paragraph ones).  For this example, we don't need
C</s> because we aren't using dot in a regular expression that we want
to cross line boundaries.  Neither do we need C</m> because we aren't
wanting caret or dollar to match at any point inside the record next
to newlines.  But it's imperative that $/ be set to something other
than the default, or else we won't actually ever have a multiline
record read in.

    $/ = '';            # read in whole paragraphs, not just one line
    while ( <> ) {
        while ( /\b([\w'-]+)(\s+\1)+\b/gi ) {   # word starts alpha
            print "Duplicate $1 at paragraph $.\n";
        }
    }

Here's code that finds sentences that begin with "From " (which would
be mangled by many mailers):

    $/ = '';            # read in whole paragraphs, not just one line
    while ( <> ) {
        while ( /^From /gm ) { # /m makes ^ match next to \n
            print "leading from in paragraph $.\n";
        }
    }

Here's code that finds everything between START and END in a paragraph:

    undef $/;           # read in whole file, not just one line or paragraph
    while ( <> ) {
        while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries
            print "$1\n";
        }
    }

=head2 How can I pull out lines between two patterns that are themselves on different lines?

You can use Perl's somewhat exotic C<..> operator (documented in
L<perlop>):

    perl -ne 'print if /START/ .. /END/' file1 file2 ...

If you wanted text and not lines, you would use

    perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...

But if you want nested occurrences of C<START> through C<END>, you'll
run up against the problem described in the question in this section
on matching balanced text.
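
To see what the C<..> operator selects without needing input files, here
is a sketch that runs it over an in-memory list. The list contents are
made up for the example.

```perl
use strict;
use warnings;

my @lines = ("junk", "START", "keep 1", "keep 2", "END", "more junk");

my @picked;
for (@lines) {
    # In scalar context, .. is the flip-flop operator: it turns true when
    # the left pattern matches and stays true until the right one matches.
    push @picked, $_ if /^START$/ .. /^END$/;
}

print join(", ", @picked), "\n";    # START, keep 1, keep 2, END
```

Note that the delimiting lines themselves are included in the result.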

Here's another example of using C<..>:

    while (<>) {
        $in_header =   1  .. /^$/;
        $in_body   = /^$/ .. eof;
        # now choose between them
    } continue {
        close ARGV if eof;      # reset $. at the end of each file
    }

=head2 I put a regular expression into $/ but it didn't work. What's wrong?
X<$/, regexes in> X<$INPUT_RECORD_SEPARATOR, regexes in>
X<$RS, regexes in>

Up to Perl 5.8.0, $/ has to be a string.  This may change in 5.10,
but don't get your hopes up. Until then, you can use these examples
if you really need to do this.

If you have File::Stream, this is easy.

    use File::Stream;
    my $stream = File::Stream->new(
        $filehandle,
        separator => qr/\s*,\s*/,
    );

    print "$_\n" while <$stream>;

If you don't have File::Stream, you have to do a little more work.

You can use the four argument form of sysread to continually add to
a buffer.  After you add to the buffer, you check if you have a
complete line (using your regular expression).

    local $_ = "";
    while( sysread FH, $_, 8192, length ) {
        while( s/^((?s).*?)your_pattern// ) {
            my $record = $1;
            # do stuff here.
        }
    }

You can do the same thing with foreach and a match using the
c flag and the \G anchor, if you do not mind your entire file
being in memory at the end.

    local $_ = "";
    while( sysread FH, $_, 8192, length ) {
        foreach my $record ( m/\G((?s).*?)your_pattern/gc ) {
            # do stuff here.
        }
        substr( $_, 0, pos ) = "" if pos;
    }

=head2 How do I substitute case insensitively on the LHS while preserving case on the RHS?
X<replace, case preserving> X<substitute, case preserving>
X<substitution, case preserving> X<s, case preserving>

Here's a lovely Perlish solution by Larry Rosler.  It exploits
properties of bitwise xor on ASCII strings.

    $_= "this is a TEsT case";

    $old = 'test';
    $new = 'success';

    s{(\Q$old\E)}
     { uc $new | (uc $1 ^ $1) .
        (uc(substr $1, -1) ^ substr $1, -1) x
            (length($new) - length $1)
     }egi;

    print;


And here it is as a subroutine, modeled after the above:

    sub preserve_case($$) {
        my ($old, $new) = @_;
        my $mask = uc $old ^ $old;

        uc $new | $mask .
            substr($mask, -1) x (length($new) - length($old))
    }

    $a = "this is a TEsT case";
    $a =~ s/(test)/preserve_case($1, "success")/egi;
    print "$a\n";

This prints:

    this is a SUcCESS case
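
To see why the xor trick works, it may help to look at the mask by
itself. In ASCII an upper-case letter and its lower-case form differ
only in bit 0x20, so xor-ing a string with its upper-cased copy leaves
0x20 wherever the original was lower case and 0x00 elsewhere. The sketch
below uses invented variable names and plain ASCII input.

```perl
use strict;
use warnings;

my $old  = "TEsT";
my $mask = uc($old) ^ $old;            # string xor, byte by byte

# Hex dump of the mask: only the third byte carries the 0x20 case bit.
my $hex = unpack("H*", $mask);         # "00002000"

# Or-ing the mask onto an upper-cased word restores the original case.
my $restored = uc("test") | $mask;     # "TEsT"

print "$hex $restored\n";
```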

As an alternative, to keep the case of the replacement word if it is
longer than the original, you can use this code, by Jeff Pinyan:

  sub preserve_case {
    my ($from, $to) = @_;
    my ($lf, $lt) = map length, @_;

    if ($lt < $lf) { $from = substr $from, 0, $lt }
    else { $from .= substr $to, $lf }

    return uc $to | ($from ^ uc $from);
  }

This changes the sentence to "this is a SUcCess case."

Just to show that C programmers can write C in any programming language,
if you prefer a more C-like solution, the following script makes the
substitution have the same case, letter by letter, as the original.
(It also happens to run about 240% slower than the Perlish solution runs.)
If the substitution has more characters than the string being substituted,
the case of the last character is used for the rest of the substitution.

    # Original by Nathan Torkington, massaged by Jeffrey Friedl
    sub preserve_case($$)
    {
        my ($old, $new) = @_;
        my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc
        my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new));
        my ($len) = $oldlen < $newlen ? $oldlen : $newlen;

        for ($i = 0; $i < $len; $i++) {
            if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) {
                $state = 0;
            } elsif (lc $c eq $c) {
                substr($new, $i, 1) = lc(substr($new, $i, 1));
                $state = 1;
            } else {
                substr($new, $i, 1) = uc(substr($new, $i, 1));
                $state = 2;
            }
        }
        # finish up with any remaining new (for when new is longer than old)
        if ($newlen > $oldlen) {
            if ($state == 1) {
                substr($new, $oldlen) = lc(substr($new, $oldlen));
            } elsif ($state == 2) {
                substr($new, $oldlen) = uc(substr($new, $oldlen));
            }
        }
        return $new;
    }

=head2 How can I make C<\w> match national character sets?

Put C<use locale;> in your script.  The \w character class is taken
from the current locale.

See L<perllocale> for details.

=head2 How can I match a locale-smart version of C</[a-zA-Z]/>?

You can use the POSIX character class syntax C</[[:alpha:]]/>
documented in L<perlre>.

No matter which locale you are in, the alphabetic characters are
the characters in \w without the digits and the underscore.
As a regex, that looks like C</[^\W\d_]/>.  Its complement,
the non-alphabetics, is then everything in \W along with
the digits and the underscore, or C</[\W\d_]/>.
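
A quick sketch of the double-negative class at work, run without
C<use locale> on a made-up word list:

```perl
use strict;
use warnings;

my @candidates = qw(abc a1 a_b XYZ 42);

# [^\W\d_] reads as: a word character that is neither a digit
# nor an underscore, i.e. an alphabetic character.
my @alphabetic = grep { /^[^\W\d_]+$/ } @candidates;

print "@alphabetic\n";    # abc XYZ
```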

=head2 How can I quote a variable to use in a regex?
X<regex, escaping> X<regexp, escaping> X<regular expression, escaping>

The Perl parser will expand $variable and @variable references in
regular expressions unless the delimiter is a single quote.  Remember,
too, that the right-hand side of a C<s///> substitution is considered
a double-quoted string (see L<perlop> for more details).  Remember
also that any regex special characters will be acted on unless you
precede the substitution with \Q.  Here's an example:

    $string = "Placido P. Octopus";
    $regex  = "P.";

    $string =~ s/$regex/Polyp/;
    # $string is now "Polypacido P. Octopus"

Because C<.> is special in regular expressions, and can match any
single character, the regex C<P.> here has matched the C<Pl> in the
original string.

To escape the special meaning of C<.>, we use C<\Q>:

    $string = "Placido P. Octopus";
    $regex  = "P.";

    $string =~ s/\Q$regex/Polyp/;
    # $string is now "Placido Polyp Octopus"

The use of C<\Q> causes the C<.> in the regex to be treated as a
regular character, so that C<P.> matches a C<P> followed by a dot.
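
The built-in C<quotemeta> function performs the same escaping as C<\Q>,
which makes it easy to inspect what the regex engine ends up seeing.
A minimal sketch, with variable names chosen for the example:

```perl
use strict;
use warnings;

my $regex  = "P.";
my $quoted = quotemeta $regex;    # the dot gains a backslash: P\.

# \Q ... \E applies the same quoting inside the pattern itself.
(my $fixed = "Placido P. Octopus") =~ s/\Q$regex\E/Polyp/;

print "$quoted\n$fixed\n";
```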

=head2 What is C</o> really for?

Using a variable in a regular expression match forces a re-evaluation
(and perhaps recompilation) each time the regular expression is
encountered.  The C</o> modifier locks in the regex the first time
it's used.  This always happens in a constant regular expression, and
in fact, the pattern was compiled into the internal format at the same
time your entire program was.

Use of C</o> is irrelevant unless variable interpolation is used in
the pattern, and if so, the regex engine will neither know nor care
whether the variables change after the pattern is evaluated the I<very
first> time.

C</o> is often used to gain an extra measure of efficiency by not
performing subsequent evaluations when you know it won't matter
(because you know the variables won't change), or more rarely, when
you don't want the regex to notice if they do.

For example, here's a "paragrep" program:

    $/ = '';  # paragraph mode
    $pat = shift;
    while (<>) {
        print if /$pat/o;
    }

=head2 How do I use a regular expression to strip C style comments from a file?

While this actually can be done, it's much harder than you'd think.
For example, this one-liner

    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

will work in many but not all cases.  You see, it's too simple-minded for
certain kinds of C programs, in particular, those with what appear to be
comments in quoted strings.  For that, you'd need something like this,
created by Jeffrey Friedl and later modified by Fred Curtis.

    $/ = undef;
    $_ = <>;
    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;

This could, of course, be more legibly written with the C</x> modifier, adding
whitespace and comments.  Here it is expanded, courtesy of Fred Curtis.

    s{
       /\*         ##  Start of /* ... */ comment
       [^*]*\*+    ##  Non-* followed by 1-or-more *'s
       (
         [^/*][^*]*\*+
       )*          ##  0-or-more things which don't start with /
                   ##    but do end with '*'
       /           ##  End of /* ... */ comment

     |         ##     OR  various things which aren't comments:

       (
         "           ##  Start of " ... " string
         (
           \\.           ##  Escaped char
         |               ##    OR
           [^"\\]        ##  Non "\
         )*
         "           ##  End of " ... " string

       |         ##     OR

         '           ##  Start of ' ... ' string
         (
           \\.           ##  Escaped char
         |               ##    OR
           [^'\\]        ##  Non '\
         )*
         '           ##  End of ' ... ' string

       |         ##     OR

         .           ##  Any other char
         [^/"'\\]*   ##  Chars which don't start a comment, string or escape
       )
     }{defined $2 ? $2 : ""}gxse;

A slight modification also removes C++ comments:

    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
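
To check how the C-only pattern treats comment-like text inside a string
literal, it can be applied to a small in-memory sample instead of a
file. The C fragment below is invented for the test; the pattern is the
one-line form given earlier in this answer.

```perl
use strict;
use warnings;

my $c_source = <<'END_C';
int x = 1; /* set x */
char *s = "/* not a comment */";
END_C

# A real comment matches the first alternative and leaves $2 undefined,
# so it is replaced by nothing; everything else is captured in $2 and
# put back unchanged.
(my $stripped = $c_source) =~
    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;

print $stripped;
```

The real comment should disappear while the quoted text survives intact.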

=head2 Can I use Perl regular expressions to match balanced text?
X<regex, matching balanced text> X<regexp, matching balanced text>
X<regular expression, matching balanced text>

Historically, Perl regular expressions were not capable of matching
balanced text.  As of more recent versions of perl including 5.6.1
experimental features have been added that make it possible to do this.
Look at the documentation for the (??{ }) construct in recent perlre manual
pages to see an example of matching balanced parentheses.  Be sure to take
special notice of the  warnings present in the manual before making use
of this feature.
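
For orientation, here is a sketch along the lines of the balanced
parentheses example in L<perlre>. It assumes a perl recent enough to
support C<(??{ })>, and the variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $balanced;
$balanced = qr{
    \(
    (?:
        (?> [^()]+ )          # a run of other characters, no backtracking
      |
        (??{ $balanced })     # a nested, balanced group
    )*
    \)
}x;

my ($group) = "f(a(b)c) + g" =~ /($balanced)/;

print "$group\n";
```

The pattern refers to itself through C<$balanced>, which is why the
variable has to be declared before the C<qr//> is assigned to it.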

CPAN contains many modules that can be useful for matching text
depending on the context.  Damian Conway provides some useful
patterns in Regexp::Common.  The module Text::Balanced provides a
general solution to this problem.

One of the common applications of balanced text matching is working
with XML and HTML.  There are many modules available that support
these needs.  Two examples are HTML::Parser and XML::Parser. There
are many others.

An elaborate subroutine (for 7-bit ASCII only) to pull out balanced
and possibly nested single chars, like C<`> and C<'>, C<{> and C<}>,
or C<(> and C<)> can be found in
http://www.cpan.org/authors/id/TOMC/scripts/pull_quotes.gz .

The C::Scan module from CPAN also contains such subs for internal use,
but they are undocumented.

=head2 What does it mean that regexes are greedy?  How can I get around it?
X<greedy> X<greediness>

Most people mean that greedy regexes match as much as they can.
Technically speaking, it's actually the quantifiers (C<?>, C<*>, C<+>,
C<{}>) that are greedy rather than the whole pattern; Perl prefers local
greed and immediate gratification to overall greed.  To get non-greedy
versions of the same quantifiers, use (C<??>, C<*?>, C<+?>, C<{}?>).

An example:

        $s1 = $s2 = "I am very very cold";
        $s1 =~ s/ve.*y //;      # I am cold
        $s2 =~ s/ve.*?y //;     # I am very cold

Notice how the second substitution stopped matching as soon as it
encountered "y ".  The C<*?> quantifier effectively tells the regular
expression engine to find a match as quickly as possible and pass
control on to whatever is next in line, like you would if you were
playing hot potato.
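
The same contrast shows up when pulling the first tag out of a snippet
of markup. A minimal sketch, with a made-up input string:

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ runs to the end of the string, then backs off to the last ">".
my ($greedy) = $html =~ /<(.+)>/;     # b>bold</b> and <i>italic</i

# Non-greedy: .+? stops at the first ">" that lets the match succeed.
my ($stingy) = $html =~ /<(.+?)>/;    # b

print "$greedy\n$stingy\n";
```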

=head2 How do I process each word on each line?

Use the split function:

    while (<>) {
        foreach $word ( split ) {
            # do something with $word here
        }
    }

Note that this isn't really a word in the English sense; it's just
chunks of consecutive non-whitespace characters.

To work with only alphanumeric sequences (including underscores), you
might consider

    while (<>) {
        foreach $word (m/(\w+)/g) {
            # do something with $word here
        }
    }

=head2 How can I print out a word-frequency or line-frequency summary?

To do this, you have to parse out each word in the input stream.  We'll
pretend that by word you mean chunk of alphabetics, hyphens, or
apostrophes, rather than the non-whitespace chunk idea of a word given
in the previous question:

    while (<>) {
        while ( /(\b[^\W_\d][\w'-]+\b)/g ) {   # misses "`sheep'"
            $seen{$1}++;
        }
    }
    while ( ($word, $count) = each %seen ) {
        print "$count $word\n";
    }

If you wanted to do the same thing for lines, you wouldn't need a
regular expression:

    while (<>) {
        $seen{$_}++;
    }
    while ( ($line, $count) = each %seen ) {
        print "$count $line";
    }

If you want these output in a sorted order, see L<perlfaq4>: "How do I
sort a hash (optionally by value instead of key)?".
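
As a sketch of the sorted variant, the following counts words in a
made-up string and lists them by descending count, breaking ties
alphabetically. It uses the simpler C<\w+> notion of a word.

```perl
use strict;
use warnings;

my $text = "the cat and the hat and the bat";

my %seen;
$seen{$_}++ for $text =~ /(\w+)/g;

# Highest count first; equal counts fall back to alphabetical order.
my @report = map  { "$seen{$_} $_" }
             sort { $seen{$b} <=> $seen{$a} or $a cmp $b }
             keys %seen;

print "$_\n" for @report;
```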

=head2 How can I do approximate matching?
X<match, approximate> X<matching, approximate>

See the module String::Approx available from CPAN.

=head2 How do I efficiently match many regular expressions at once?
X<regex, efficiency> X<regexp, efficiency>
X<regular expression, efficiency>

( contributed by brian d foy )

Avoid asking Perl to compile a regular expression every time
you want to match it.  In this example, perl must recompile
the regular expression for every iteration of the foreach()
loop since it has no way to know what $pattern will be.

    @patterns = qw( foo bar baz );

    LINE: while( <> ) {
        foreach $pattern ( @patterns ) {
            if( /\b$pattern\b/i ) {
                print;
                next LINE;
            }
        }
    }

The qr// operator showed up in perl 5.005.  It compiles a
regular expression, but doesn't apply it.  When you use the
pre-compiled version of the regex, perl does less work. In
this example, I inserted a map() to turn each pattern into
its pre-compiled form.  The match now uses the compiled pattern
directly; the rest of the script is the same, but faster.

    @patterns = map { qr/\b$_\b/i } qw( foo bar baz );

    LINE: while( <> ) {
        foreach $pattern ( @patterns ) {
            if( /$pattern/ ) {
                print;
                next LINE;
            }
        }
    }

In some cases, you may be able to make several patterns into
a single regular expression.  Beware of situations that require
backtracking though.

    $regex = join '|', qw( foo bar baz );

    LINE: while( <> ) {
        print if /\b(?:$regex)\b/i;
    }

For more details on regular expression efficiency, see Mastering
Regular Expressions by Jeffrey Friedl.  He explains how regular
expression engines work and why some patterns are surprisingly
inefficient.  Once you understand how perl applies regular
expressions, you can tune them for individual situations.
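
The precompiled approach can be tried on in-memory data as well. In this
sketch the sample lines are invented, and C<\Q...\E> is added on the
assumption that the words should be matched literally.

```perl
use strict;
use warnings;

# Compile each word once, up front.
my @patterns = map { qr/\b\Q$_\E\b/i } qw( foo bar baz );

my @lines = ("A Foo here", "nothing", "bazaar", "the BAZ");

my @hits;
LINE: for my $line (@lines) {
    for my $pattern (@patterns) {
        if ($line =~ $pattern) {     # no recompilation inside the loop
            push @hits, $line;
            next LINE;               # one hit per line is enough
        }
    }
}

print "$_\n" for @hits;
```

"bazaar" is not reported because C<\b> requires a word boundary right
after "baz".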

=head2 Why don't word-boundary searches with C<\b> work for me?

(contributed by brian d foy)

Ensure that you know what \b really does: it's the boundary between a
word character, \w, and something that isn't a word character. That
thing that isn't a word character might be \W, but it can also be the
start or end of the string.

It's not (not!) the boundary between whitespace and non-whitespace,
and it's not the stuff between words we use to create sentences.

In regex speak, a word boundary (\b) is a "zero width assertion",
meaning that it doesn't represent a character in the string, but a
condition at a certain position.
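
One way to make the "zero width" idea visible is to insert a marker at
every word boundary. The substitution below consumes no characters; it
only adds a C<|> at each position where C<\b> holds. The sample string
is made up for the illustration.

```perl
use strict;
use warnings;

(my $marked = "don't stop") =~ s/\b/|/g;

print "$marked\n";    # |don|'|t| |stop|
```

The apostrophe and the space each sit between two boundaries, since
neither is a word character.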

For the regular expression, /\bPerl\b/, there has to be a word
boundary before the "P" and after the "l".  As long as something other
than a word character precedes the "P" and follows the "l", the
pattern will match. These strings match /\bPerl\b/.

	"Perl"    # no word char before P or after l
	"Perl "   # same as previous (space is not a word char)
	"'Perl'"  # the ' char is not a word char
	"Perl's"  # no word char before P, non-word char after "l"

These strings do not match /\bPerl\b/.

	"Perl_"   # _ is a word char!
	"Perler"  # no word char before P, but one after l

You don't have to use \b to match words though.  You can look for
non-word characters surrounded by word characters.  These strings
match the pattern /\b'\b/.

	"don't"   # the ' char is surrounded by "n" and "t"
	"qep'a'"  # the ' char is surrounded by "p" and "a"

These strings do not match /\b'\b/.

	"foo'"    # there is no word char after non-word '

You can also use the complement of \b, \B, to specify that there
should not be a word boundary.

In the pattern /\Bam\B/, there must be a word character before the "a"
and after the "m". These strings match /\Bam\B/:

	"llama"   # "am" surrounded by word chars
	"Samuel"  # same

These strings do not match /\Bam\B/

	"Sam"      # no word boundary before "a", but one after "m"
	"I am Sam" # "am" surrounded by non-word chars

=head2 Why does using $&, $`, or $' slow my program down?

(contributed by Anno Siegel)

Once Perl sees that you need one of these variables anywhere in the
program, it provides them on each and every pattern match. That means
that on every pattern match the entire string will be copied, part of it
to $`, part to $&, and part to $'. Thus the penalty is most severe with
long strings and patterns that match often. Avoid $&, $', and $` if you
can, but if you can't, once you've used them at all, use them at will
because you've already paid the price. Remember that some algorithms
really appreciate them. As of the 5.005 release, the $& variable is no
longer "expensive" the way the other two are.

Since Perl 5.6.1 the special variables @- and @+ can functionally replace
$`, $& and $'.  These arrays contain the offsets of the beginning and end
of each match (see L<perlvar> for the full story), so they give you
essentially the same information, but without the risk of excessive
string copying.
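
A sketch of the replacements, following the equivalents documented in
L<perlvar>; the sample string and variable names are made up.

```perl
use strict;
use warnings;

my $string = "The quick brown fox";
my ($before, $matched, $after);

if ($string =~ /quick/) {
    # $-[0] is the offset where the match starts, $+[0] where it ends.
    $before  = substr($string, 0, $-[0]);                # like $`
    $matched = substr($string, $-[0], $+[0] - $-[0]);    # like $&
    $after   = substr($string, $+[0]);                   # like $'
}

print "[$before][$matched][$after]\n";
```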

=head2 What good is C<\G> in a regular expression?

You use the C<\G> anchor to start the next match on the same
string where the last match left off.  The regular
expression engine cannot skip over any characters to find
the next match with this anchor, so C<\G> is similar to the
beginning of string anchor, C<^>.  The C<\G> anchor is typically
used with the C<g> flag.  It uses the value of pos()
as the position to start the next match.  As the match
operator makes successive matches, it updates pos() with the
position of the next character past the last match (or the
first character of the next match, depending on how you like
to look at it). Each string has its own pos() value.

Suppose you want to match all consecutive pairs of digits
in a string like "1122a44" and stop matching when you
encounter non-digits.  You want to match C<11> and C<22> but
the letter C<a> shows up between C<22> and C<44> and you want
to stop at C<a>. Simply matching pairs of digits skips over
the C<a> and still matches C<44>.

	$_ = "1122a44";
	my @pairs = m/(\d\d)/g;   # qw( 11 22 44 )

If you use the \G anchor, you force the match after C<22> to
start with the C<a>.  The regular expression cannot match
there since it does not find a digit, so the next match
fails and the match operator returns the pairs it already
found.
	$_ = "1122a44";
	my @pairs = m/\G(\d\d)/g; # qw( 11 22 )

You can also use the C<\G> anchor in scalar context. You
still need the C<g> flag.

	$_ = "1122a44";
	while( m/\G(\d\d)/g ) {
		print "Found $1\n";
	}

After the match fails at the letter C<a>, perl resets pos()
and the next match on the same string starts at the beginning.

	$_ = "1122a44";
	while( m/\G(\d\d)/g ) {
		print "Found $1\n";
	}

	print "Found $1 after while" if m/(\d\d)/g; # finds "11"

You can disable pos() resets on fail with the C<c> flag.
Subsequent matches start where the last successful match
ended (the value of pos()) even if a match on the same
string has failed in the meantime. In this case, the match
after the while() loop starts at the C<a> (where the last
match stopped), and since it does not use any anchor it can
skip over the C<a> to find "44".

	$_ = "1122a44";
	while( m/\G(\d\d)/gc ) {
		print "Found $1\n";
	}

	print "Found $1 after while" if m/(\d\d)/g; # finds "44"

Typically you use the C<\G> anchor with the C<c> flag
when you want to try a different match if one fails,
such as in a tokenizer. Jeffrey Friedl offers this example
which works in 5.004 or later.

    while (<>) {
      PARSER: {
           m/ \G( \d+\b    )/gcx   && do { print "number: $1\n";  redo; };
           m/ \G( \w+      )/gcx   && do { print "word:   $1\n";  redo; };
           m/ \G( \s+      )/gcx   && do { print "space:  $1\n";  redo; };
           m/ \G( [^\w\d]+ )/gcx   && do { print "other:  $1\n";  redo; };
      }
    }

For each line, the PARSER loop first tries to match a series
of digits followed by a word boundary.  This match has to
start at the place the last match left off (or the beginning
of the string on the first match). Since C<m/ \G( \d+\b
)/gcx> uses the C<c> flag, if the string does not match that
regular expression, perl does not reset pos() and the next
match starts at the same position to try a different
pattern.

=head2 Are Perl regexes DFAs or NFAs?  Are they POSIX compliant?

While it's true that Perl's regular expressions resemble the DFAs
(deterministic finite automata) of the egrep(1) program, they are in
fact implemented as NFAs (non-deterministic finite automata) to allow
backtracking and backreferencing.  And they aren't POSIX-style either,
because those guarantee worst-case behavior for all cases.  (It seems
that some people prefer guarantees of consistency, even when what's
guaranteed is slowness.)  See the book "Mastering Regular Expressions"
(from O'Reilly) by Jeffrey Friedl for all the details you could ever
hope to know on these matters (a full citation appears in
L<perlfaq2>).

=head2 What's wrong with using grep in a void context?

The problem is that grep builds a return list, regardless of the context.
This means you're making Perl go to the trouble of building a list that
you then just throw away. If the list is large, you waste both time and space.
If your intent is to iterate over the list, then use a for loop for this
purpose.

In perls older than 5.8.1, map suffers from this problem as well.
But since 5.8.1, this has been fixed, and map is context aware - in void
context, no lists are constructed.
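
As a minimal sketch of the advice above, the following compares the
discouraged void-context C<grep> with a C<for> loop.  The C<@words>
data and the variable names are assumptions made up for illustration.

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);   # sample data, assumed for illustration
my @seen;

# Discouraged: grep in void context, used only for its side effect.
#   grep { push @seen, uc } @words;

# Preferred: a for loop states the intent to iterate.
push @seen, uc for @words;

# grep earns its keep when you actually use the list it returns.
my @with_an = grep { /an/ } @words;

print "@seen\n";      # APPLE BANANA CHERRY
print "@with_an\n";   # banana
```

The loop form builds no throwaway list and tells the reader that only
the side effect matters.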

=head2 How can I match strings with multibyte characters?
X<regex, and multibyte characters> X<regexp, and multibyte characters>
X<regular expression, and multibyte characters>

Starting from Perl 5.6 Perl has had some level of multibyte character
support.  Perl 5.8 or later is recommended.  Supported multibyte
character repertoires include Unicode, and legacy encodings
through the Encode module.  See L<perluniintro>, L<perlunicode>,
and L<Encode>.

If you are stuck with older Perls, you can do Unicode with the
C<Unicode::String> module, and character conversions using the
C<Unicode::Map8> and C<Unicode::Map> modules.  If you are using
Japanese encodings, you might try using the jperl 5.005_03.

Finally, the following set of approaches was offered by Jeffrey
Friedl, whose article in issue #5 of The Perl Journal talks about
this very matter.

Let's suppose you have some weird Martian encoding where pairs of
ASCII uppercase letters encode single Martian letters (i.e. the two
bytes "CV" make a single Martian letter, as do the two bytes "SG",
"VS", "XX", etc.). Other bytes represent single characters, just like
ASCII.

So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the
nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.

Now, say you want to search for the single character C</GX/>. Perl
doesn't know about Martian, so it'll find the two bytes "GX" in the "I
am CVSGXX!"  string, even though that character isn't there: it just
looks like it is because "SG" is next to "XX", but there's no real
"GX".  This is a big problem.

Here are a few ways, all painful, to deal with it:

   $martian =~ s/([A-Z][A-Z])/ $1 /g; # Make sure adjacent "martian"
                                      # bytes are no longer adjacent.
   print "found GX!\n" if $martian =~ /GX/;

Or like this:

   @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g;
   # above is conceptually similar to:     @chars = $text =~ m/(.)/g;
   foreach $char (@chars) {
       # separate statements, so that print runs before last leaves the loop
       if ($char eq 'GX') { print "found GX!\n"; last; }
   }

Or like this:

   while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) {  # \G probably unneeded
       if ($1 eq 'GX') { print "found GX!\n"; last; }
   }

Here's another, slightly less painful, way to do it from Benjamin
Goldberg, who uses a zero-width negative look-behind assertion.

	print "found GX!\n" if	$martian =~ m/
		   (?<![A-Z])
		   (?:[A-Z][A-Z])*?
		   GX
		/x;

This succeeds if the "martian" character GX is in the string, and fails
otherwise.  If you don't like using (?<!), a zero-width negative
look-behind assertion, you can replace (?<![A-Z]) with (?:^|[^A-Z]).

It does have the drawback of putting the wrong thing in $-[0] and $+[0],
but this usually can be worked around.
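
The look-behind approach can be wrapped in a small subroutine, roughly
as sketched here.  The name C<has_martian_letter> and the second sample
string are hypothetical, introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the two-byte Martian letter $letter
# occurs in $text aligned on a pair boundary.
sub has_martian_letter {
    my ($text, $letter) = @_;
    return $text =~ m/
        (?<![A-Z])          # start where a run of uppercase bytes begins
        (?:[A-Z][A-Z])*?    # skip whole Martian letters, two bytes at a time
        \Q$letter\E         # the letter we want, on a pair boundary
    /x ? 1 : 0;
}

print has_martian_letter("I am CVSGXX!", "GX") ? "found\n" : "not found\n";
print has_martian_letter("I am CVGXSG!", "GX") ? "found\n" : "not found\n";
```

The first call reports "not found" because the bytes "GX" straddle the
letters "SG" and "XX"; the second reports "found" because "GX" follows
the complete letter "CV".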

=head2 How do I match a pattern that is supplied by the user?

Well, if it's really a pattern, then just use

    chomp($pattern = <STDIN>);
    if ($line =~ /$pattern/) { }

Alternatively, since you have no guarantee that your user entered
a valid regular expression, trap the exception this way:

    if (eval { $line =~ /$pattern/ }) { }

If all you really want is to search for a string, not a pattern,
then you should either use the index() function, which is made for
string searching, or, if you can't be disabused of using a pattern
match on a non-pattern, then be sure to use C<\Q>...C<\E>, documented
in L<perlre>.

    chomp($pattern = <STDIN>);

    open (FILE, $input) or die "Couldn't open input $input: $!; aborting";
    while (<FILE>) {
	print if /\Q$pattern\E/;
    }
    close FILE;
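
For the plain string search mentioned above, a sketch using C<index()>
might look like this.  The subroutine name C<lines_containing> and the
sample lines are assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that contain $needle literally.
sub lines_containing {
    my ($needle, @lines) = @_;
    # index() returns -1 when the substring is absent, so regex
    # metacharacters in $needle need no escaping at all.
    return grep { index($_, $needle) >= 0 } @lines;
}

my @hits = lines_containing('a.b*', "xa.b*y\n", "aXbbb\n");
print @hits;    # prints only the first line
```

Treated as a pattern, C<a.b*> would also match the second line, which
is the kind of surprise C<index()> avoids.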


=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit.  A simple comment in the code giving
credit would be courteous but is not required.

--- NEW FILE: perlhack.pod ---
=head1 NAME

perlhack - How to hack at the Perl internals

=head1 DESCRIPTION


This document attempts to explain how Perl development takes place,
and ends with some suggestions for people wanting to become bona fide
porters.

The perl5-porters mailing list is where the Perl standard distribution
is maintained and developed.  The list can get anywhere from 10 to 150
messages a day, depending on the heatedness of the debate.  Most days
there are two or three patches, extensions, features, or bugs being
discussed at a time.

A searchable archive of the list is at either:

[...2721 lines suppressed...]
debugger. Play, poke, investigate, fiddle! You'll probably get to
understand not just your chosen area but a much wider range of F<perl>'s
activity as well, and probably sooner than you'd think.


=over 3

=item I<The Road goes ever on and on, down from the door where it began.>

=back


If you can do these things, you've started on the long road to Perl porting. 
Thanks for wanting to help make Perl better - and happy hacking!

=head1 AUTHOR

This document was written by Nathan Torkington, and is maintained by
the perl5-porters mailing list.

--- NEW FILE: perltooc.pod ---
=head1 NAME

perltooc - Tom's OO Tutorial for Class Data in Perl

=head1 DESCRIPTION


When designing an object class, you are sometimes faced with the situation
of wanting common state shared by all objects of that class.
Such I<class attributes> act somewhat like global variables for the entire
class, but unlike program-wide globals, class attributes have meaning only to
the class itself.

Here are a few examples where class attributes might come in handy:

=over 4

=item *

to keep a count of the objects you've created, or how many are
[...1303 lines suppressed...]

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit.  A simple comment in the code giving
credit would be courteous but is not required.


Russ Allbery, Jon Orwant, Randy Ray, Larry Rosler, Nat Torkington,
and Stephen Warren all contributed suggestions and corrections to this
piece.  Thanks especially to Damian Conway for his ideas and feedback,
and without whose indirect prodding I might never have taken the time
to show others how much Perl has to offer in the way of objects once
you start thinking outside the tiny little box that today's "popular"
object-oriented languages enforce.

=head1 HISTORY

Last edit: Sun Feb  4 20:50:28 EST 2001

--- NEW FILE: perlnewmod.pod ---
=head1 NAME

perlnewmod - preparing a new module for distribution

=head1 DESCRIPTION


This document gives you some suggestions about how to go about writing
Perl modules, preparing them for distribution, and making them available
via CPAN.

One of the things that makes Perl really powerful is the fact that Perl
hackers tend to want to share the solutions to problems they've faced,
so you and I don't have to battle with the same problem again.

The main way they do this is by abstracting the solution into a Perl
module. If you don't know what one of these is, the rest of this
document isn't going to be much use to you. You're also missing out on
an awful lot of useful code; consider having a look at L<perlmod>,
L<perlmodlib> and L<perlmodinstall> before coming back here.

When you've found that there isn't a module available for what you're
trying to do, and you've had to write the code yourself, consider
packaging up the solution into a module and uploading it to CPAN so that
others can benefit.

=head2 Warning

We're going to primarily concentrate on Perl-only modules here, rather
than XS modules. XS modules serve a rather different purpose, and
you should consider different things before distributing them - the
popularity of the library you are gluing, the portability to other
operating systems, and so on. However, the notes on preparing the Perl
side of the module and packaging and distributing it will apply equally
well to an XS module as a pure-Perl one.

=head2 What should I make into a module?

You should make a module out of any code that you think is going to be
useful to others. Anything that's likely to fill a hole in the communal
library and which someone else can slot directly into their program. Any
part of your code which you can isolate and extract and plug into
something else is a likely candidate.

Let's take an example. Suppose you're reading in data from a local
format into a hash-of-hashes in Perl, turning that into a tree, walking
the tree and then piping each node to an Acme Transmogrifier Server.

Now, quite a few people have the Acme Transmogrifier, and you've had to
write something to talk the protocol from scratch - you'd almost
certainly want to make that into a module. The level at which you pitch
it is up to you: you might want protocol-level modules analogous to
L<Net::SMTP|Net::SMTP> which then talk to higher level modules analogous
to L<Mail::Send|Mail::Send>. The choice is yours, but you do want to get
a module out for that server protocol.

Nobody else on the planet is going to talk your local data format, so we
can ignore that. But what about the thing in the middle? Building tree
structures from Perl variables and then traversing them is a nice,
general problem, and if nobody's already written a module that does
that, you might want to modularise that code too.

So hopefully you've now got a few ideas about what's good to modularise.
Let's now see how it's done.

=head2 Step-by-step: Preparing the ground

Before we even start scraping out the code, there are a few things we'll
want to do in advance.

=over 3

=item Look around

Dig into a bunch of modules to see how they're written. I'd suggest
starting with L<Text::Tabs|Text::Tabs>, since it's in the standard
library and is nice and simple, and then looking at something a little
more complex like L<File::Copy|File::Copy>.  For object oriented
code, C<WWW::Mechanize> or the C<Email::*> modules provide some good
examples.

These should give you an overall feel for how modules are laid out and
written.

=item Check it's new

There are a lot of modules on CPAN, and it's easy to miss one that's
similar to what you're planning on contributing. Have a good plough
through L<http://search.cpan.org> and make sure you're not the one
reinventing the wheel!

=item Discuss the need

You might love it. You might feel that everyone else needs it. But there
might not actually be any real demand for it out there. If you're unsure
about the demand your module will have, consider sending out feelers
on the C<comp.lang.perl.modules> newsgroup, or as a last resort, ask the
modules list at C<modules at perl.org>. Remember that this is a closed list
with a very long turn-around time - be prepared to wait a good while for
a response from them.

=item Choose a name

Perl modules included on CPAN have a naming hierarchy you should try to
fit in with. See L<perlmodlib> for more details on how this works, and
browse around CPAN and the modules list to get a feel of it. At the very
least, remember this: modules should be title capitalised (This::Thing),
fit in with a category, and explain their purpose succinctly.

=item Check again

While you're doing that, make really sure you haven't missed a module
similar to the one you're about to write.

When you've got your name sorted out and you're sure that your module is
wanted and not currently available, it's time to start coding.

=back


=head2 Step-by-step: Making the module

=over 3

=item Start with F<module-starter> or F<h2xs>

The F<module-starter> utility is distributed as part of the
L<Module::Starter|Module::Starter> CPAN package.  It creates a directory
with stubs of all the necessary files to start a new module, according
to recent "best practice" for module development, and is invoked from
the command line, thus:

    module-starter --module=Foo::Bar \
       --author="Your Name" --email=yourname@cpan.org

If you do not wish to install the L<Module::Starter|Module::Starter>
package from CPAN, F<h2xs> is an older tool, originally intended for the
development of XS modules, which comes packaged with the Perl
distribution.

A typical invocation of L<h2xs|h2xs> for a pure Perl module is:

    h2xs -AX --skip-exporter --use-new-tests -n Foo::Bar 

The C<-A> omits the Autoloader code, C<-X> omits XS elements,
C<--skip-exporter> omits the Exporter code, C<--use-new-tests> sets up a
modern testing environment, and C<-n> specifies the name of the module.

=item Use L<strict|strict> and L<warnings|warnings>

A module's code has to be warning and strict-clean, since you can't
guarantee the conditions that it'll be used under. Besides, you wouldn't
want to distribute code that wasn't warning or strict-clean anyway,
right?

=item Use L<Carp|Carp>

The L<Carp|Carp> module allows you to present your error messages from
the caller's perspective; this gives you a way to signal a problem with
the caller and not your module. For instance, if you say this:

    warn "No hostname given";

the user will see something like this:

    No hostname given at /usr/local/lib/perl5/site_perl/5.6.0/Net/Acme.pm
    line 123.

which looks like your module is doing something wrong. Instead, you want
to put the blame on the user, and say this:

    No hostname given at bad_code, line 10.

You do this by using L<Carp|Carp> and replacing your C<warn>s with
C<carp>s. If you need to C<die>, say C<croak> instead. However, keep
C<warn> and C<die> in place for your sanity checks - where it really is
your module at fault.

=item Use L<Exporter|Exporter> - wisely!

L<Exporter|Exporter> gives you a standard way of exporting symbols and
subroutines from your module into the caller's namespace. For instance,
saying C<use Net::Acme qw(&frob)> would import the C<frob> subroutine.

The package variable C<@EXPORT> will determine which symbols will get
exported when the caller simply says C<use Net::Acme> - you will hardly
ever want to put anything in there. C<@EXPORT_OK>, on the other hand,
specifies which symbols you're willing to export. If you do want to
export a bunch of symbols, use the C<%EXPORT_TAGS> and define a standard
export set - look at L<Exporter> for more details.

=item Use L<plain old documentation|perlpod>

The work isn't over until the paperwork is done, and you're going to
need to put in some time writing some documentation for your module.
C<module-starter> or C<h2xs> will provide a stub for you to fill in; if
you're not sure about the format, look at L<perlpod> for an
introduction. Provide a good synopsis of how your module is used in
code, a description, and then notes on the syntax and function of the
individual subroutines or methods. Use Perl comments for developer notes
and POD for end-user notes.

=item Write tests

You're encouraged to create self-tests for your module to ensure it's
working as intended on the myriad platforms Perl supports; if you upload
your module to CPAN, a host of testers will build your module and send
you the results of the tests. Again, C<module-starter> and C<h2xs>
provide a test framework which you can extend - you should do something
more than just checking your module will compile.
L<Test::Simple|Test::Simple> and L<Test::More|Test::More> are good
places to start when writing a test suite.

=item Write the README

If you're uploading to CPAN, the automated gremlins will extract the
README file and place that in your CPAN directory. It'll also appear in
the main F<by-module> and F<by-category> directories if you make it onto
the modules list. It's a good idea to put here what the module actually
does in detail, and the user-visible changes since the last release.

=back
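
The advice in the steps above on L<strict>, L<warnings>, L<Carp> and
L<Exporter> can be pulled together in one small sketch.  The package
name C<Net::Acme::Sketch>, the body of C<frob> and the host name are
hypothetical, and the C<use Exporter 'import'> form assumes a
reasonably recent Exporter.

```perl
package Net::Acme::Sketch;    # hypothetical name, for illustration only

use strict;
use warnings;
use Carp qw(carp croak);
use Exporter 'import';        # assumes an Exporter that can export import()

our $VERSION     = '0.01';
our @EXPORT      = ();                      # export nothing by default
our @EXPORT_OK   = qw(frob);                # available on request
our %EXPORT_TAGS = ( all => [ @EXPORT_OK ] );

# frob($hostname): hypothetical routine standing in for real work.
sub frob {
    my ($hostname) = @_;
    # Blame the caller, not this module, for a bad argument.
    croak "No hostname given" unless defined $hostname;
    carp "Hostname '$hostname' has no dot in it" unless $hostname =~ /\./;
    return "frobbed $hostname";
}

1;

package main;

# In a separate file this would be: use Net::Acme::Sketch qw(frob);
Net::Acme::Sketch->import('frob');
print frob('acme.example.com'), "\n";
```

Keeping C<@EXPORT> empty means callers only get C<frob> when they ask
for it, and C<croak> reports a missing host name at the caller's line
rather than inside the module.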


=head2 Step-by-step: Distributing your module

=over 3

=item Get a CPAN user ID

Every developer publishing modules on CPAN needs a CPAN ID.  Visit
C<http://pause.perl.org/>, select "Request PAUSE Account", and wait for
your request to be approved by the PAUSE administrators.

=item C<perl Makefile.PL; make test; make dist>

Once again, C<module-starter> or C<h2xs> has done all the work for you.
They produce the standard C<Makefile.PL> you see when you download and
install modules, and this produces a Makefile with a C<dist> target.

Once you've ensured that your module passes its own tests - always a
good thing to make sure - you can C<make dist>, and the Makefile will
hopefully produce you a nice tarball of your module, ready for upload.

=item Upload the tarball

The email you got when you received your CPAN ID will tell you how to
log in to PAUSE, the Perl Authors Upload SErver. From the menus there,
you can upload your module to CPAN.

=item Announce to the modules list

Once uploaded, it'll sit unnoticed in your author directory. If you want
it connected to the rest of the CPAN, you'll need to go to "Register
Namespace" on PAUSE.  Once registered, your module will appear in the
by-module and by-category listings on CPAN.

=item Announce to clpa

If you have a burning desire to tell the world about your release, post
an announcement to the moderated C<comp.lang.perl.announce> newsgroup.

=item Fix bugs!

Once you start accumulating users, they'll send you bug reports. If
you're lucky, they'll even send you patches. Welcome to the joys of
maintaining a software project...

=back


=head1 AUTHOR

Simon Cozens, C<simon at cpan.org>

Updated by Kirrily "Skud" Robert, C<skud at cpan.org>

=head1 SEE ALSO

L<perlmod>, L<perlmodlib>, L<perlmodinstall>, L<h2xs>, L<strict>,
L<Carp>, L<Exporter>, L<perlpod>, L<Test::Simple>, L<Test::More>,
L<ExtUtils::MakeMaker>, L<Module::Build>, L<Module::Starter>,
http://www.cpan.org/ , Ken Williams' tutorial on building your own
module at http://mathforum.org/~ken/perl_modules.html

--- NEW FILE: checkpods.PL ---

use Config;
use File::Basename qw(&basename &dirname);
use Cwd;

# List explicitly here the variables you want Configure to
# generate.  Metaconfig only looks for shell variables, so you
# have to mention them as if they were shell variables, not
# %Config entries.  Thus you write
#  $startperl
# to ensure Configure will look for $Config{startperl}.

# This forces PL files to create target in same directory as PL file.
# This is so that make depend always knows where to find PL derivatives.
$origdir = cwd;
chdir dirname($0);
$file = basename($0, '.PL');
$file .= '.com' if $^O eq 'VMS';

open OUT,">$file" or die "Can't create $file: $!";

print "Extracting $file (with variable substitutions)\n";

# In this section, perl variables will be expanded during extraction.
# You can use $Config{...} to use Configure variables.

print OUT <<"!GROK!THIS!";
$Config{startperl}
    eval 'exec $Config{perlpath} -S \$0 \${1+"\$@"}'
	if \$running_under_some_shell;
!GROK!THIS!

# In the following, perl variables are not expanded during extraction.

print OUT <<'!NO!SUBS!';
# From roderick at gate.netThu Sep  5 17:19:30 1996
# Date: Thu, 05 Sep 1996 00:11:22 -0400
# From: Roderick Schertler <roderick at gate.net>
# To: perl5-porters at africa.nicoh.com
# Subject: POD lines with only spaces
# There are some places in the documentation where a POD directive is
# ignored because the line before it contains whitespace (and so the
# directive doesn't start a paragraph).  This patch adds a way to check
# for these to the pod Makefile (though it isn't made part of the build
# process, which would be a good idea), and fixes those places where the
# problem currently exists.
#  Version 1.00  Original.
#  Version 1.01  Andy Dougherty <doughera at lafayette.edu>
#    Trivial modifications to output format for easier auto-parsing
#    Broke it out as a separate function to avoid nasty
#	Make/Shell/Perl quoting problems, and also to make it easier
#	to grow.  Someone will probably want to rewrite in terms of
#	some sort of Pod::Checker module.  Or something.  Consider this
#	a placeholder for the future.
#  Version 1.02  Roderick Schertler <roderick at argon.org>
#	Check for pod directives following any kind of unempty line, not
#	just lines of whitespace.

@directive = qw(head1 head2 item over back cut pod for begin end);
@directive{@directive} = (1) x @directive;

$exit = $last_unempty = 0;
while (<>) {
    chomp;
    if (/^=(\S+)/ && $directive{$1} && $last_unempty) {
	printf "%s: line %5d, no blank line preceding directive =%s\n",
		$ARGV, $., $1;
	$exit = 1;
    }
    $last_unempty = ($_ ne '');
    if (eof) {
	close(ARGV);		# reset $. for the next file
	$last_unempty = 0;
    }
}
exit $exit
!NO!SUBS!

close OUT or die "Can't close $file: $!";
chmod 0755, $file or die "Can't reset permissions for $file: $!\n";
exec("$Config{'eunicefix'} $file") if $Config{'eunicefix'} ne ':';
chdir $origdir;

--- NEW FILE: perl56delta.pod ---
=head1 NAME

perl56delta - what's new for perl v5.6.0

=head1 DESCRIPTION


This document describes differences between the 5.005 release and the 5.6.0
release.

=head1 Core Enhancements

=head2 Interpreter cloning, threads, and concurrency

Perl 5.6.0 introduces the beginnings of support for running multiple
interpreters concurrently in different threads.  In conjunction with
the perl_clone() API call, which can be used to selectively duplicate
the state of any given interpreter, it is possible to compile a
piece of code once in an interpreter, clone that interpreter
one or more times, and run all the resulting interpreters in distinct
[...2983 lines suppressed...]
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Gurusamy Sarathy <F<gsar at activestate.com>>, with many
contributions from The Perl Porters.

Send omissions or corrections to <F<perlbug at perl.org>>.


--- NEW FILE: perl572delta.pod ---
=head1 NAME

perl572delta - what's new for perl v5.7.2

=head1 DESCRIPTION


This document describes differences between the 5.7.1 release and the
5.7.2 release.  

(To view the differences between the 5.6.0 release and the 5.7.0
release, see L<perl570delta>.  To view the differences between the
5.7.0 release and the 5.7.1 release, see L<perl571delta>.)

=head1 Security Vulnerability Closed

(This change was already made in 5.7.0 but bears repeating here.)

A security vulnerability affecting all Perl versions prior to 5.6.1
was found in August 2000.  The vulnerability does not affect default
installations and as far as is known affects only the Linux platform.

You should upgrade your Perl to 5.6.1 as soon as possible.  Patches
for earlier releases exist but using the patches requires full
recompilation from the source code anyway, so 5.6.1 is your best
choice.

See http://www.cpan.org/src/5.0/sperl-2000-08-05/sperl-2000-08-05.txt
for more information.

=head1 Incompatible Changes

=head2 64-bit platforms and malloc

If your pointers are 64 bits wide, the Perl malloc is no longer
used because it simply does not work with 8-byte pointers.  Also,
the system malloc on such platforms is usually much better optimized
for such large memory models than the Perl malloc.

=head2 AIX Dynaloading

In AIX releases 4.3 and newer, dynaloading now uses the native
dlopen interface of AIX instead of the old emulated interface.  This
change will probably break backward compatibility with compiled
modules.  The change was made to make Perl more compliant with other
applications like modperl which are using the AIX native interface.

=head2 Socket Extension Dynamic in VMS

The Socket extension is now dynamically loaded instead of being
statically built in.  This may or may not be a problem with ancient
TCP/IP stacks of VMS: we do not know since we weren't able to test
Perl in such configurations.

=head2 Different Definition of the Unicode Character Classes \p{In...}

As suggested by the Unicode consortium, the Unicode character classes
now prefer I<scripts> as opposed to I<blocks> (as defined by Unicode)
when the C<\p{In....}> and the C<\P{In....}> regular expression
constructs are used in Perl.  This has changed the definition of some of
those character classes.

The difference between scripts and blocks is that scripts are the
glyphs used by a language or a group of languages, while the blocks
are more artificial groupings of 256 characters based on the Unicode
numbering.

In general this change results in more inclusive Unicode character
classes, but changes to the other direction also do take place:
for example while the script C<Latin> includes all the Latin
characters and their various diacritic-adorned versions, it
does not include the various punctuation or digits (since they
are not solely C<Latin>).

Changes in the character class semantics may have happened if a script
and a block happen to have the same name, for example C<Hebrew>.
In such cases the script wins and C<\p{InHebrew}> now means the script
definition of Hebrew.  The block definition is still available,
though, by appending C<Block> to the name: C<\p{InHebrewBlock}> means
what C<\p{InHebrew}> meant in perl 5.6.0.  For the full list
of affected character classes, see L<perlunicode/Blocks>.

=head2 Deprecations

The current user-visible implementation of pseudo-hashes (the weird
use of the first array element) is deprecated starting from Perl 5.8.0
and will be removed in Perl 5.10.0, and the feature will be
implemented differently.  Not only is the current interface rather
ugly, but the current implementation slows down normal array and hash
use quite noticeably. The C<fields> pragma interface will remain
available.

The syntaxes C<< @a->[...] >> and  C<< @h->{...} >> have now been deprecated.

The suidperl is also considered to be too much of a risk to continue
maintaining, and the suidperl code is likely to be removed in a future
release.

The C<package;> syntax (C<package> without an argument) has been
deprecated.  Its semantics were never that clear and its
implementation even less so.  If you have used that feature to
disallow all but fully qualified variables, use C<use strict;> instead.

The behavior of chdir(undef) and chdir('') matching that of chdir()
has been deprecated.  In future versions, chdir(undef) and chdir('')
will simply fail.

=head1 Core Enhancements

In general a lot of fixing has happened in the area of Perl's
understanding of numbers, both integer and floating point.  Since in
many systems the standard number parsing functions like C<strtoul()>
and C<atof()> seem to have bugs, Perl tries to work around their
deficiencies.  This results hopefully in more accurate numbers.

=over 4

=item *

The rules for allowing underscores (underbars) in numeric constants
have been relaxed and simplified: now you can have an underscore
B<between digits>.

=item *

GMAGIC (right-hand side magic) could in many cases such as string
concatenation be invoked too many times.

=item *

Lexicals I: lexicals outside an eval "" weren't resolved
correctly inside a subroutine definition inside the eval "" if they
were not already referenced in the top level of the eval""ed code.

=item *

Lexicals II: lexicals leaked at file scope into subroutines that
were declared before the lexicals.

=item *

Lvalue subroutines can now return C<undef> in list context.

=item *

The C<op_clear> and C<op_null> are now exported.

=item *

A new special regular expression variable has been introduced:
C<$^N>, which contains the most-recently closed group (submatch).

=item *

L<utime> now supports C<utime undef, undef, @files> to change the
file timestamps to the current time.

=item *

The Perl parser has been stress tested using both random input and
Markov chain input.

=item *

C<eval "v200"> now works.

=item *

VMS now works under PerlIO.

=item *

END blocks are now run even if you exit/die in a BEGIN block.
The execution of END blocks is now controlled by
PL_exit_flags & PERL_EXIT_DESTRUCT_END. This enables the new
behaviour for perl embedders. This will default in 5.10. See
L<perlembed>.

=back
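
Two of the enhancements listed above, underscores in numeric constants
and the C<$^N> variable, can be tried with a short sketch like the one
below.  The date string and variable names are assumptions chosen for
illustration, and the code needs a perl that has C<$^N>.

```perl
use strict;
use warnings;

# Underscores between digits are ignored in numeric constants.
my $million = 1_000_000;

# $^N holds the text of the most recently closed group.
my $last_group = '';
if ("2006-12-04" =~ /(\d+)-(\d+)-(\d+)/) {
    $last_group = $^N;     # the third group closed last, so same as $3
}

print "$million $last_group\n";   # 1000000 04
```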


=head1 Modules and Pragmata

=head2 New Modules and Distributions

=over 4

=item *

L<Attribute::Handlers> - Simpler definition of attribute handlers

=item *

L<ExtUtils::Constant> - generate XS code to import C header constants

=item *

L<I18N::Langinfo> - query locale information

=item *

L<I18N::LangTags> - functions for dealing with RFC3066-style language tags

=item *

L<libnet> - a collection of perl5 modules related to network programming

Perl installation leaves libnet unconfigured, use F<libnetcfg> to configure.

=item *

L<List::Util> - selection of general-utility list subroutines

=item *

L<Locale::Maketext> - framework for localization

=item *

L<Memoize> - Make your functions faster by trading space for time

=item *

L<NEXT> - pseudo-class for method redispatch

=item *

L<Scalar::Util> - selection of general-utility scalar subroutines

=item *

L<Test::More> - yet another framework for writing test scripts

=item *

L<Test::Simple> - Basic utilities for writing tests

=item *

L<Time::HiRes> - high resolution ualarm, usleep, and gettimeofday

=item *

L<Time::Piece> - Object Oriented time objects

(Previously known as L<Time::Object>.)

=item *

L<Time::Seconds> - a simple API to convert seconds to other date values

=item *

L<UnicodeCD> - Unicode Character Database

=back
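
As a rough taste of two of the modules listed above, the following
sketch calls a few L<List::Util> and L<Scalar::Util> functions.  The
sample numbers and the class name C<Widget> are made up for the
example.

```perl
use strict;
use warnings;
use List::Util   qw(first max sum);
use Scalar::Util qw(blessed reftype);

my @sizes   = (3, 9, 4);                  # sample data
my $total   = sum(@sizes);                # 16
my $largest = max(@sizes);                # 9
my $big     = first { $_ > 3 } @sizes;    # 9, the first element over 3

my $obj = bless {}, 'Widget';             # hypothetical class name
print "$total $largest $big ", blessed($obj), " ", reftype($obj), "\n";
# prints "16 9 9 Widget HASH"
```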


=head2 Updated And Improved Modules and Pragmata

=over 4

=item *

L<B::Deparse> module has been significantly enhanced.  It now
can deparse almost all of the standard test suite (so that the
tests still succeed).  There is a make target "test.deparse"
for trying this out.

=item *

L<Class::Struct> now assigns the array/hash element if the accessor
is called with an array/hash element as the B<sole> argument.

=item *

L<Cwd> extension is now (even) faster.

=item *

L<DB_File> extension has been updated to version 1.77.

=item *

L<Fcntl>, L<Socket>, and L<Sys::Syslog> have been rewritten to use the
new-style constant dispatch section (see L<ExtUtils::Constant>).

=item *

L<File::Find> is now (again) reentrant.  It also has been made
more portable.

=item *

L<File::Glob> now supports C<GLOB_LIMIT> constant to limit the
size of the returned list of filenames.

=item *

L<IO::Socket::INET> now supports C<LocalPort> of zero (usually meaning
that the operating system will make one up.)

=item *

The L<vars> pragma now supports declaring fully qualified variables.
(Something that C<our()> does not and will not support.)

=back


=head1 Utility Changes

=over 4

=item *

The F<emacs/e2ctags.pl> is now much faster.

=item *

L<h2ph> now supports C trigraphs.

=item *

L<h2xs> uses the new L<ExtUtils::Constant> module, which will affect
newly created extensions that define constants.  The new code is
more correct (with the old code, if you had two constants where the
first one was a prefix of the second one, the first constant B<never>
got defined), less lossy (it uses integers for integer constants, as
opposed to the old code that used floating point numbers even for
integer constants), and slightly faster, so you might want to consider
regenerating your extension code (the new scheme makes regenerating easy).
L<h2xs> now also supports C trigraphs.

=item *

L<libnetcfg> has been added for configuring libnet.

=item *

L<Pod::Html> (and thus L<pod2html>) now allows specifying
a cache directory.

=back


=head1 New Documentation

=over 4

=item *

L<Locale::Maketext::TPJ13> is an article about software localization,
originally published in The Perl Journal #13, republished here with
kind permission.

=item *

More README.$PLATFORM files have been converted into pod, which
means that they will also be installed as perl$PLATFORM documentation
files.  The new files are L<perlapollo>, L<perlbeos>, L<perldgux>,
L<perlhurd>, L<perlmint>, L<perlnetware>, L<perlplan9>, L<perlqnx>,
and L<perltru64>.

=item *

The F<Todo> and F<Todo-5.6> files have been merged into L<perltodo>.

=item *

Use of the F<gprof> tool to profile Perl has been documented in
L<perlhack>.  There is a make target "perl.gprof" for generating a
gprofiled Perl executable.

=back


=head1 Installation and Configuration Improvements

=head2 New Or Improved Platforms

=over 4

=item *

AIX should now work better with gcc, threads, and 64-bitness.  Also the
long doubles support in AIX should be better now.  See L<perlaix>.

=item *

AtheOS ( http://www.atheos.cx/ ) is a new platform.

=item *

DG/UX platform now supports the 5.005-style threads.  See L<perldgux>.

=item *

DYNIX/ptx platform (a.k.a. dynixptx) is supported at or near osvers 4.5.2.

=item *

Several Mac OS (Classic) portability patches have been applied.  We
hope to get a fully working port by 5.8.0.  (The remaining problems
relate to the changed IO model of Perl.)  See L<perlmacos>.

=item *

Mac OS X (or Darwin) should now be able to build Perl even on HFS+
filesystems.  (The case-insensitivity confused the Perl build process.)

=item *

NetWare from Novell is now supported.  See L<perlnetware>.

=item *

The Amdahl UTS UNIX mainframe platform is now supported.

=back


=head2 Generic Improvements

=over 4

=item *

In AFS installations one can configure the root of the AFS to be
somewhere else than the default F</afs> by using the Configure
parameter C<-Dafsroot=/some/where/else>.

=item *

The version of Berkeley DB used when Perl (and, presumably, the
DB_File extension) was built is now available as
C<@Config{qw(db_version_major db_version_minor db_version_patch)}>.

=item *

The Thread extension is now not built at all under ithreads
(C<Configure -Duseithreads>) because it wouldn't work anyway (the
Thread extension requires being Configured with C<-Duse5005threads>).

=item *

The C<B::Deparse> compiler backend has been so significantly improved
that almost the whole Perl test suite passes after being deparsed.  A
make target has been added to help in further testing: C<make test.deparse>.

=back
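
The Berkeley DB version entries mentioned in the list above can be read with an ordinary hash slice on C<%Config>.  The following is a small sketch; the variable names and the C<?> placeholder for missing values are assumptions of the example, and the values may well be empty on a Perl built without Berkeley DB.

```perl
use strict;
use warnings;
use Config;

# A hash slice pulls out all three components at once.  Each value may be
# undefined or empty when Perl was configured without Berkeley DB.
my @db_version = @Config{qw(db_version_major db_version_minor db_version_patch)};

# Substitute "?" for any component that Configure did not record.
my $db_version = join '.',
    map { defined($_) && length($_) ? $_ : '?' } @db_version;

print "Berkeley DB version seen by Configure: $db_version\n";
```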


=head1 Selected Bug Fixes

=over 5

=item *

The autouse pragma didn't work for Multi::Part::Function::Names.

=item *

The behaviour of non-decimal but numeric string constants such as
"0x23" was platform-dependent: in some platforms that was seen as 35,
in some as 0, in some as a floating point number (don't ask).  This
was caused by Perl using the operating system libraries in a situation
where the result of the string to number conversion is undefined: now
Perl consistently handles such strings as zero in numeric contexts.

=item *

L<dprofpp> -R didn't work.

=item *

PERL5OPT with embedded spaces didn't work.

=item *

L<Sys::Syslog> ignored the C<LOG_AUTH> constant.

=back
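
The handling of strings such as "0x23" described in the list above can be checked with a short sketch like the one below.  It also shows C<hex> and C<oct>, which are the explicit ways to convert such strings; the variable names are chosen for the example only.

```perl
use strict;
use warnings;

my $string = "0x23";

# In numeric context only the leading "0" is used, so the result is 0.
# The conversion also triggers a "numeric" warning, silenced here.
my $implicit = do { no warnings 'numeric'; $string + 0 };

# hex() and oct() interpret the string explicitly.
my $from_hex = hex($string);   # hex() accepts an optional "0x" prefix
my $from_oct = oct($string);   # a "0x" prefix makes oct() parse hexadecimal

print "$implicit $from_hex $from_oct\n";   # prints "0 35 35"
```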


=head2 Platform Specific Changes and Fixes

=over 4

=item *

Some versions of glibc have a broken modfl().  This affects builds
with C<-Duselongdouble>.  This version of Perl detects this brokenness
and has a workaround for it.  The glibc release 2.2.2 is known to have
fixed the modfl() bug.

=back


=head1 New or Changed Diagnostics

=over 4

=item *

In the regular expression diagnostics the C<E<lt>E<lt> HERE> marker
introduced in 5.7.0 has been changed to be C<E<lt>-- HERE> since too
many people found the C<E<lt>E<lt>> to be too similar to here-document

=item *

If you try to L<perlfunc/pack> a number less than 0 or larger than 255
using the C<"C"> format you will get an optional warning.  Similarly
for the C<"c"> format and a number less than -128 or more than 127.

=item *

Certain regex modifiers such as C<(?o)> make sense only if applied to
the entire regex.  You will get an optional warning if you try to do otherwise.

=item *

Using arrays or hashes as references (e.g. C<< %foo->{bar} >>) has been
deprecated for a while.  Now you will get an optional warning.

=back
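
The C<pack> range warning from the list above can be observed with a sketch along these lines.  The C<$SIG{__WARN__}> handler and the variable names are assumptions of the example, used only to count the warnings instead of printing them.

```perl
use strict;
use warnings;

# Collect warnings in an array so that they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $too_big = 300;   # does not fit into an unsigned char
my $in_range = 200;  # fits, so no warning is expected

my $wrapped = pack("C", $too_big);
my $fits    = pack("C", $in_range);

printf "%d warning(s) issued\n", scalar @warnings;
```

Only the out-of-range value should produce a warning; the in-range value packs silently.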


=head1 Source Code Enhancements

=head2 MAGIC constants

The MAGIC constants (e.g. C<'P'>) have been macrofied
(e.g. C<PERL_MAGIC_TIED>) for better source code readability
and maintainability.

=head2 Better commented code

F<perly.c>, F<sv.c>, and F<sv.h> have now been extensively commented.

=head2 Regex pre-/post-compilation items matched up

The regex compiler now maintains a structure that identifies nodes in
the compiled bytecode with the corresponding syntactic features of the
original regex expression.  The information is attached to the new
C<offsets> member of the C<struct regexp>. See L<perldebguts> for more
complete information.

=head2 gcc -Wall

The C code has been made much more C<gcc -Wall> clean.  Some warning
messages still remain, though, so if you are compiling with gcc you
will see some warnings about dubious practices.  The warnings are
being worked on.

=head1 New Tests

Several new tests have been added, especially for the F<lib> subsection.

The tests are now reported in a different order than in earlier Perls.
(This happens because the test scripts from under t/lib have been moved
to be closer to the library/extension they are testing.)

=head1 Known Problems

Note that unlike other sections in this document (which describe
changes since 5.7.0) this section is cumulative, containing known
problems for all the 5.7 releases.

=head2 AIX

=over 4

=item *

In AIX 4.2 Perl extensions that use C++ functions that use statics
may have problems in that the statics are not getting initialized.
In newer AIX releases this has been solved by linking Perl with
the libC_r library, but unfortunately in AIX 4.2 the said library
has an obscure bug where the various functions related to time
(such as time() and gettimeofday()) return broken values, and
therefore in AIX 4.2 Perl is not linked against the libC_r.

=item *

vac May Produce Buggy Code For Perl

The AIX C compiler vac version may produce buggy code,
resulting in a few random tests failing, but when the failing tests
are run by hand, they succeed.  We suggest upgrading to at least
vac version, that has been known to compile Perl correctly.
"lslpp -L|grep vac.C" will tell you the vac version.

=back


=head2 Amiga Perl Invoking Mystery

One cannot call Perl using the C<volume:> syntax, that is, C<perl -v>
works, but for example C<bin:perl -v> doesn't.  The exact reason is
unknown but the current suspect is the F<ixemul> library.

=head2 lib/ftmp-security tests warn 'system possibly insecure'

Don't panic.  Read INSTALL 'make test' section instead.

=head2 Cygwin intermittent failures of lib/Memoize/t/expire_file 11 and 12

The subtests 11 and 12 sometimes fail and sometimes work.

=head2 HP-UX lib/io_multihomed Fails When LP64-Configured

The lib/io_multihomed test may hang in HP-UX if Perl has been
configured to be 64-bit. Because other 64-bit platforms do not hang in
this test, HP-UX is suspect. All other tests pass in 64-bit HP-UX. The
test attempts to create and connect to "multihomed" sockets (sockets
which have multiple IP addresses).

=head2 HP-UX lib/posix Subtest 9 Fails When LP64-Configured

If perl is configured with -Duse64bitall, the successful result of the
subtest 10 of lib/posix may arrive before the successful result of the
subtest 9, which confuses the test harness so much that it thinks the
subtest 9 failed.

=head2 Linux With Sfio Fails op/misc Test 48

No known fix.

=head2 OS/390

OS/390 has rather many test failures, but the situation is actually
better than it was in 5.6.0; it's just that so many new modules and
tests have been added.

 Failed Test                     Stat Wstat Total Fail  Failed  List of Failed
 ../ext/B/Deparse.t                            14    1   7.14%  14
 ../ext/B/Showlex.t                             1    1 100.00%  1
 ../ext/Encode/Encode/Tcl.t                   610   13   2.13%  592 594 596 598
                                                                600 602 604-610
 ../ext/IO/lib/IO/t/io_unix.t     113 28928     5    3  60.00%  3-5
 ../ext/POSIX/POSIX.t                          29    1   3.45%  14
 ../ext/Storable/t/lock.t         255 65280     5    3  60.00%  3-5
 ../lib/locale.t                  129 33024   117   19  16.24%  99-117
 ../lib/warnings.t                            434    1   0.23%  75
 ../lib/ExtUtils.t                             27    1   3.70%  25
 ../lib/Math/BigInt/t/bigintpm.t             1190    1   0.08%  1145
 ../lib/Unicode/UCD.t                          81   48  59.26%  1-16 49-64 66-81
 ../lib/User/pwent.t                            9    1  11.11%  4
 op/pat.t                                     660    6   0.91%  242-243 424-425
 op/split.t                         0     9    ??   ??       %  ??
 op/taint.t                                   174    3   1.72%  156 162 168
 op/tr.t                                       70    3   4.29%  50 58-59
 Failed 16/422 test scripts, 96.21% okay. 105/23251 subtests failed, 99.55% okay.

=head2 op/sprintf tests 129 and 130

The op/sprintf tests 129 and 130 are known to fail on some platforms.
Examples include any platform using sfio, and Compaq/Tandem's NonStop-UX.
The failing platforms do not comply with the ANSI C Standard, line
19ff on page 134 of ANSI X3.159 1989 to be exact.  (They produce
something other than "1" and "-1" when formatting 0.6 and -0.6 using
the printf format "%.0f", most often they produce "0" and "-0".)
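
A quick way to see how a given platform behaves is a check like the following sketch, which formats the two values mentioned above; the variable names are chosen for the example only.

```perl
use strict;
use warnings;

# A conforming implementation rounds to the nearest integer here.
my $positive = sprintf("%.0f",  0.6);
my $negative = sprintf("%.0f", -0.6);

if ($positive eq "1" && $negative eq "-1") {
    print "rounding as expected: $positive and $negative\n";
}
else {
    print "unexpected results: $positive and $negative\n";
}
```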

=head2 Failure of Thread tests

B<Note that support for 5.005-style threading remains experimental.>

The following tests are known to fail due to fundamental problems in
the 5.005 threading implementation. These are not new failures--Perl
5.005_0x has the same bugs, but didn't have these tests.

  lib/autouse.t                 4
  t/lib/thr5005.t               19-20

=head2 UNICOS

=over 4

=item *

ext/POSIX/sigaction subtests 6 and 13 may fail.

=item *

lib/ExtUtils may spuriously claim that subtest 28 failed,
which is interesting since the test only has 27 tests.

=item *

Numerous numerical test failures

  op/numconvert                 209,210,217,218
  op/override                   7
  ext/Time/HiRes/HiRes          9
  lib/Math/BigInt/t/bigintpm    1145
  lib/Math/Trig                 25

These tests fail because of yet unresolved floating point inaccuracies.

=back


=head2 UTS

There are a few known test failures, see L<perluts>.

=head2 VMS

Rather many tests are failing in VMS, but actually more tests
succeed in VMS than they used to; it's just that there are many,
many more tests than there used to be.

Here are the known failures from some compiler/platform combinations.

DEC C V5.3-006 on OpenVMS VAX V6.2

  [-.ext.list.util.t]tainted..............FAILED on test 3
  [-.ext.posix]sigaction..................FAILED on test 7
  [-.ext.time.hires]hires.................FAILED on test 14
  [-.lib.file.find]taint..................FAILED on test 17
  [-.lib.math.bigint.t]bigintpm...........FAILED on test 1183
  [-.lib.test.simple.t]exit...............FAILED on test 1
  [.lib]vmsish............................FAILED on test 13
  [.op]sprintf............................FAILED on test 12
  Failed 8/399 tests, 91.23% okay.

DEC C V6.0-001 on OpenVMS Alpha V7.2-1 and
Compaq C V6.2-008 on OpenVMS Alpha V7.1

  [-.ext.list.util.t]tainted..............FAILED on test 3 
  [-.lib.file.find]taint..................FAILED on test 17
  [-.lib.test.simple.t]exit...............FAILED on test 1
  [.lib]vmsish............................FAILED on test 13
  Failed 4/399 tests, 92.48% okay.

Compaq C V6.4-005 on OpenVMS Alpha 7.2.1

  [-.ext.b]showlex........................FAILED on test 1
  [-.ext.list.util.t]tainted..............FAILED on test 3
  [-.lib.file.find]taint..................FAILED on test 17 
  [-.lib.test.simple.t]exit...............FAILED on test 1
  [.lib]vmsish............................FAILED on test 13
  [.op]misc...............................FAILED on test 49
  Failed 6/401 tests, 92.77% okay.

=head2 Win32

In multi-CPU boxes there are some problems with the I/O buffering:
some output may appear twice.

=head2 Localising a Tied Variable Leaks Memory

    use Tie::Hash;
    tie my %tie_hash => 'Tie::StdHash';


    local($tie_hash{Foo}) = 1; # leaks

Code like the above is known to leak memory every time the local()
is executed.

=head2 Self-tying of Arrays and Hashes Is Forbidden

Self-tying of arrays and hashes is broken in rather deep and
hard-to-fix ways.  As a stop-gap measure to keep people from getting
frustrated at the mysterious results (core dumps, most often) it is
for now forbidden (you will get a fatal error even from an attempt).

=head2 Variable Attributes are not Currently Usable for Tieing

This limitation will hopefully be fixed in future.  (Subroutine
attributes work fine for tieing, see L<Attribute::Handlers>).

=head2 Building Extensions Can Fail Because Of Largefiles

Some extensions like mod_perl are known to have issues with
`largefiles', a change brought by Perl 5.6.0 in which file offsets
default to 64 bits wide, where supported.  Modules may fail to compile
at all or compile and work incorrectly.  Currently there is no good
solution for the problem, but Configure now provides appropriate
non-largefile ccflags, ldflags, libswanted, and libs in the %Config
hash (e.g., $Config{ccflags_nolargefiles}) so the extensions that are
having problems can try configuring themselves without the
largefileness.  This is admittedly not a clean solution, and the
solution may not even work at all.  One potential failure is whether
one can (or, if one can, whether it's a good idea) link together at
all binaries with different ideas about file offsets, all this is

=head2 The Compiler Suite Is Still Experimental

The compiler suite is slowly getting better but is nowhere near
working order yet.

=head2 The Long Double Support is Still Experimental

The ability to configure Perl's numbers to use "long doubles",
floating point numbers of hopefully better accuracy, is still
experimental.  The implementations of long doubles are not yet
widespread and the existing implementations are not quite mature
or standardised, therefore trying to support them is a rare
and moving target.  The gain of more precision may also be offset
by slowdown in computations (more bits to move around, and the
operations are more likely to be executed by less optimised

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ .  There may also be
information at http://www.perl.com/perl/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Jarkko Hietaniemi <F<jhi at iki.fi>>, with many contributions
from The Perl Porters and Perl Users submitting feedback and patches.

Send omissions or corrections to <F<perlbug at perl.org>>.


--- NEW FILE: perltoc.pod ---

# !!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
# This file is autogenerated by buildtoc from all the other pods.
# Edit those files and run buildtoc --build-toc to effect changes.

=head1 NAME

perltoc - perl documentation table of contents

=head1 DESCRIPTION


This page provides a brief table of contents for the rest of the Perl
documentation set.  It is meant to be scanned quickly or grepped
through to locate the proper section you're looking for.


=head2 perl - Practical Extraction and Report Language

[...22585 lines suppressed...]

=item pl2pm

=item pod2html

=item pod2man

=item s2p

=item splain

=item xsubpp


=head1 AUTHOR

Larry Wall <F<larry at wall.org>>, with the help of oodles
of other folks.

--- NEW FILE: perlfaq5.pod ---
=head1 NAME

perlfaq5 - Files and Formats ($Revision: 1.2 $, $Date: 2006-12-04 17:01:32 $)

=head1 DESCRIPTION


This section deals with I/O and the "f" issues: filehandles, flushing,
formats, and footers.

=head2 How do I flush/unbuffer an output filehandle?  Why must I do this?
X<flush> X<buffer> X<unbuffer> X<autoflush>

Perl does not support truly unbuffered output (except
insofar as you can C<syswrite(OUT, $char, 1)>), although it
does support "command buffering", in which a physical
write is performed after every output command.
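
As a minimal sketch of turning on command buffering, the following uses the C<autoflush> method from the standard IO::Handle module and the older C<$|> idiom.  The choice of handles is an assumption made for the example only.

```perl
use strict;
use warnings;
use IO::Handle;

# Enable command buffering on STDOUT: every print is followed by a flush.
STDOUT->autoflush(1);
print "this line is written out immediately\n";

# The older idiom does the same for the currently selected handle.
my $previous = select(STDERR);   # make STDERR the selected handle
$| = 1;                          # turn on command buffering for it
select($previous);               # restore the original selection
```

The method form is usually easier to read, since it names the handle directly instead of relying on which handle is currently selected.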

The C standard I/O library (stdio) normally buffers
characters sent to devices so that there isn't a system call
[...1080 lines suppressed...]

If your array contains lines, just print them:

    print @lines;


Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public
domain.  You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit.  A simple comment in the code giving credit to the FAQ would
be courteous but is not required.

--- NEW FILE: perl588delta.pod ---
=head1 NAME

perldelta - what is new for perl v5.8.8

=head1 DESCRIPTION


This document describes differences between the 5.8.7 release and
the 5.8.8 release.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.8.7. If any exist,
they are bugs and reports are welcome.

=head1 Core Enhancements

=over


=item *
[...1593 lines suppressed...]
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: perl582delta.pod ---
=head1 NAME

perl582delta - what is new for perl v5.8.2

=head1 DESCRIPTION


This document describes differences between the 5.8.1 release and
the 5.8.2 release.

If you are upgrading from an earlier release such as 5.6.1, first read
the L<perl58delta>, which describes differences between 5.6.0 and
5.8.0, and the L<perl581delta>, which describes differences between
5.8.0 and 5.8.1.

=head1 Incompatible Changes

For threaded builds for modules calling certain re-entrant system calls,
binary compatibility was accidentally lost between 5.8.0 and 5.8.1.
Binary compatibility with 5.8.0 has been restored in 5.8.2, which
necessitates breaking compatibility with 5.8.1. We see this as the
lesser of two evils.

This will only affect people who have a threaded perl 5.8.1, and compiled
modules which use these calls, and now attempt to run the compiled modules
with 5.8.2. The fix is to re-compile and re-install the modules using 5.8.2.

=head1 Core Enhancements

=head2 Hash Randomisation

The hash randomisation introduced with 5.8.1 has been amended. It
transpired that although the implementation introduced in 5.8.1 was source
compatible with 5.8.0, it was not binary compatible in certain cases. 5.8.2
contains an improved implementation which is both source and binary
compatible with both 5.8.0 and 5.8.1, and remains robust against the form of
attack which prompted the change for 5.8.1.

We are grateful to the Debian project for their input in this area.
See L<perlsec/"Algorithmic Complexity Attacks"> for the original
rationale behind this change.

=head2 Threading

Several memory leaks associated with variables shared between threads
have been fixed.

=head1 Modules and Pragmata

=head2 Updated Modules And Pragmata

The following modules and pragmata have been updated since Perl 5.8.1:

=over 4

=item Devel::PPPort

=item Digest::MD5

=item I18N::LangTags

=item libnet

=item MIME::Base64

=item Pod::Perldoc

=item strict

Documentation improved

=item Tie::Hash

Documentation improved

=item Time::HiRes

=item Unicode::Collate

=item Unicode::Normalize


Documentation improved

=back


=head1 Selected Bug Fixes

Some syntax errors involving unrecognized filetest operators are now handled
correctly by the parser.

=head1 Changed Internals

Interpreter initialization is more complete when -DMULTIPLICITY is off.
This should resolve problems with initializing and destroying the Perl
interpreter more than once in a single process.                      

=head1 Platform Specific Problems

Dynamic linker flags have been tweaked for Solaris and OS X, which should
solve problems seen while building some XS modules.

Bugs in OS/2 sockets and tmpfile have been fixed.

In OS X C<setreuid> and friends are troublesome - perl will now work
around their problems as best possible.

=head1 Future Directions

Starting with 5.8.3 we intend to make more frequent maintenance releases,
with a smaller number of changes in each. The intent is to propagate
bug fixes out to stable releases more rapidly and make upgrading stable
releases less of an upheaval. This should give end users more
flexibility in their choice of upgrade timing, and allow them easier
assessment of the impact of upgrades. The current plan is for code freezes
as follows:

=over 4

=item *

5.8.3 23:59:59 GMT, Wednesday December 31st 2003

=item *

5.8.4 23:59:59 GMT, Wednesday March 31st 2004

=item *

5.8.5 23:59:59 GMT, Wednesday June 30th 2004

=back


with the release following soon after, when testing is complete.

See L<perl581delta/"Future Directions"> for more soothsaying.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/.  There may also be
information at http://www.perl.com/, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: perlport.pod ---
=head1 NAME

perlport - Writing portable Perl

=head1 DESCRIPTION


Perl runs on numerous operating systems.  While most of them share
much in common, they also have their own unique features.

This document is meant to help you to find out what constitutes portable
Perl code.  That way once you make a decision to write portably,
you know where the lines are drawn, and you can stay within them.

There is a tradeoff between taking full advantage of one particular
type of computer and taking advantage of a full range of them.
Naturally, as you broaden your range and become more diverse, the
common factors drop, and you are left with an increasingly smaller
area of common ground in which you can operate to accomplish a
particular task.  Thus, when you begin attacking a problem, it is
[...2221 lines suppressed...]
Nick Ing-Simmons <nick at ing-simmons.net>,
Andreas J. KE<ouml>nig <a.koenig at mind.de>,
Markus Laker <mlaker at contax.co.uk>,
Andrew M. Langmead <aml at world.std.com>,
Larry Moore <ljmoore at freespace.net>,
Paul Moore <Paul.Moore at uk.origin-it.com>,
Chris Nandor <pudge at pobox.com>,
Matthias Neeracher <neeracher at mac.com>,
Philip Newton <pne at cpan.org>,
Gary Ng <71564.1743 at CompuServe.COM>,
Tom Phoenix <rootbeer at teleport.com>,
AndrE<eacute> Pirard <A.Pirard at ulg.ac.be>,
Peter Prymmer <pvhp at forte.com>,
Hugo van der Sanden <hv at crypt0.demon.co.uk>,
Gurusamy Sarathy <gsar at activestate.com>,
Paul J. Schinder <schinder at pobox.com>,
Michael G Schwern <schwern at pobox.com>,
Dan Sugalski <dan at sidhe.org>,
Nathan Torkington <gnat at frii.com>.

--- NEW FILE: perl5004delta.pod ---
=head1 NAME

perl5004delta - what's new for perl5.004

=head1 DESCRIPTION


This document describes differences between the 5.003 release (as
documented in I<Programming Perl>, second edition--the Camel Book) and
this one.

=head1 Supported Environments

Perl5.004 builds out of the box on Unix, Plan 9, LynxOS, VMS, OS/2,
QNX, AmigaOS, and Windows NT.  Perl runs on Windows 95 as well, but it
cannot be built there, for lack of a reasonable command interpreter.

=head1 Core Changes

Most importantly, many bugs were fixed, including several security
[...1573 lines suppressed...]

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.  This file has been
significantly updated for 5.004, so even veteran users should
look through it.

The F<README> file for general stuff.

The F<Copying> file for copyright information.

=head1 HISTORY

Constructed by Tom Christiansen, grabbing material with permission
from innumerable contributors, with kibitzing by more than a few Perl

Last update: Wed May 14 11:14:09 EDT 1997

--- NEW FILE: perltrap.pod ---
=head1 NAME

perltrap - Perl traps for the unwary

=head1 DESCRIPTION


The biggest trap of all is forgetting to C<use warnings> or use the B<-w>
switch; see L<perllexwarn> and L<perlrun>. The second biggest trap is not
making your entire program runnable under C<use strict>.  The third biggest
trap is not reading the list of changes in this version of Perl; see

=head2 Awk Traps

Accustomed B<awk> users should take special note of the following:

=over 4

=item *
[...1551 lines suppressed...]
Running doit.pl gives the following:

    # perl 4 prints: 3 (aborts the subroutine early)
    # perl 5 prints: 8

Same behavior if you replace C<do> with C<require>.

=item * C<split> on empty string with LIMIT specified

    $string = '';
    @list = split(/foo/, $string, 2);

Perl4 returns a one element list containing the empty string but Perl5
returns an empty list.


As always, if any of these are ever officially declared as bugs,
they'll be fixed and removed.

--- NEW FILE: perlre.pod ---
=head1 NAME
X<regular expression> X<regex> X<regexp>

perlre - Perl regular expressions

=head1 DESCRIPTION


This page describes the syntax of regular expressions in Perl.  

If you haven't used regular expressions before, a quick-start
introduction is available in L<perlrequick>, and a longer tutorial
introduction is available in L<perlretut>.

For reference on how regular expressions are used in matching
operations, plus various examples of the same, see discussions of
C<m//>, C<s///>, C<qr//> and C<??> in L<perlop/"Regexp Quote-Like Operators">.

Matching operations can have various modifiers.  Modifiers
[...1367 lines suppressed...]
=head1 SEE ALSO



L<perlop/"Regexp Quote-Like Operators">.

L<perlop/"Gory details of parsing quoted constructs">.





I<Mastering Regular Expressions> by Jeffrey Friedl, published
by O'Reilly and Associates.

--- NEW FILE: perl584delta.pod ---
=head1 NAME

perl584delta - what is new for perl v5.8.4

=head1 DESCRIPTION


This document describes differences between the 5.8.3 release and
the 5.8.4 release.

=head1 Incompatible Changes

Many minor bugs have been fixed. Scripts which happen to rely on previously
erroneous behaviour will consider these fixes as incompatible changes :-)
You are advised to perform sufficient acceptance testing on this release
to satisfy yourself that this does not affect you, before putting this
release into production.

The diagnostic output of Carp has been changed slightly, to add a space after
the comma between arguments. This makes it much easier for tools such as
web browsers to wrap it, but might confuse any automatic tools which perform
detailed parsing of Carp output.

The internal dump output has been improved, so that non-printable characters
such as newline and backspace are output in C<\x> notation, rather than
octal. This might just confuse non-robust tools which parse the output of
modules such as Devel::Peek.

=head1 Core Enhancements

=head2 Malloc wrapping

Perl can now be built to detect attempts to assign pathologically large chunks
of memory.  Previously such assignments would suffer from integer wrap-around
during size calculations causing a misallocation, which would crash perl, and
could theoretically be used for "stack smashing" attacks.  The wrapping
defaults to enabled on platforms where we know it works (most AIX
configurations, BSDi, Darwin, DEC OSF/1, FreeBSD, HP/UX, GNU Linux, OpenBSD,
Solaris, VMS and most Win32 compilers) and defaults to disabled on other

=head2 Unicode Character Database 4.0.1

The copy of the Unicode Character Database included in Perl 5.8 has
been updated to 4.0.1 from 4.0.0.

=head2 suidperl less insecure

Paul Szabo has analysed and patched C<suidperl> to remove existing known
insecurities. Currently there are no known holes in C<suidperl>, but previous
experience shows that we cannot be confident that these were the last. You may
no longer invoke the set uid perl directly, so to preserve backwards
compatibility with scripts that invoke #!/usr/bin/suidperl the only set uid
binary is now C<sperl5.8.>I<n> (C<sperl5.8.4> for this release). C<suidperl>
is installed as a hard link to C<perl>; both C<suidperl> and C<perl> will
automatically invoke the set uid binary C<sperl5.8.4>, so this change should
be completely transparent.

For new projects the core perl team would strongly recommend that you use
dedicated, single purpose security tools such as C<sudo> in preference to

=head2 format

In addition to bug fixes, C<format>'s features have been enhanced. See

=head1 Modules and Pragmata

The (mis)use of C</tmp> in core modules and documentation has been tidied up.
Some modules available both within the perl core and independently from CPAN
("dual-life modules") have not yet had these changes applied; the changes
will be integrated into future stable perl releases as the modules are
updated on CPAN.

=head2 Updated modules

=over 4

=item Attribute::Handlers

=item B

=item Benchmark

=item CGI

=item Carp

=item Cwd

=item Exporter

=item File::Find

=item IO

=item IPC::Open3

=item Locale::Maketext

=item Math::BigFloat

=item Math::BigInt

=item Math::BigRat

=item MIME::Base64

=item ODBM_File

=item POSIX

=item Shell

=item Socket

There is experimental support for Linux abstract Unix domain sockets.

=item Storable

=item Switch

Synced with its CPAN version 2.10

=item Sys::Syslog

C<syslog()> can now use numeric constants for facility names and priorities,
in addition to strings.

=item Term::ANSIColor

=item Time::HiRes

=item Unicode::UCD

=item Win32

Win32.pm/Win32.xs has moved from the libwin32 module to core Perl

=item base

=item open

=item threads

Detached threads are now also supported on Windows.

=item utf8

=back


=head1 Performance Enhancements

=over 4

=item *

Accelerated Unicode case mappings (C</i>, C<lc>, C<uc>, etc).

=item *

In place sort optimised (eg C<@a = sort @a>)

=item *

Unnecessary assignment optimised away in

  my $s = undef;
  my @a = ();
  my %h = ();

=item *

Optimised C<map> in scalar context.

=back


=head1 Utility Changes

The Perl debugger (F<lib/perl5db.pl>) can now save all debugger commands for
sourcing later, and can display the parent inheritance tree of a given class.

=head1 Installation and Configuration Improvements

The build process on both VMS and Windows has had several minor improvements
made. On Windows Borland's C compiler can now compile perl with PerlIO and/or

C<perl.exe> on Windows now has a "Camel" logo icon. The use of a camel with
the topic of Perl is a trademark of O'Reilly and Associates Inc., and is used
with their permission (ie distribution of the source, compiling a Windows
executable from it, and using that executable locally). Use of the supplied
camel for anything other than a perl executable's icon is specifically not
covered, and anyone wishing to redistribute perl binaries I<with> the icon
should check directly with O'Reilly beforehand.

Perl should build cleanly on Stratus VOS once more.

=head1 Selected Bug Fixes

More utf8 bugs fixed, notably in how C<chomp>, C<chop>, C<send>, and
C<syswrite> interact with utf8 data. Concatenation now works correctly
when C<use bytes;> is in scope.

Pragmata are now correctly propagated into C<(?{...})> constructions in regexps.
Code such as

   my $x = qr{ ... (??{ $x }) ... };

will now (correctly) fail under C<use strict>. (The inner C<$x> refers, and
has always referred, to C<$::x>.)

The "const in void context" warning has been suppressed for a constant in an
optimised-away boolean expression such as C<5 || print;>

C<perl -i> could C<fchmod(stdin)> by mistake. This is serious if stdin is
attached to a terminal, and perl is running as root. Now fixed.

=head1 New or Changed Diagnostics

C<Carp> and the internal diagnostic routines used by C<Devel::Peek> have been
made clearer, as described in L</Incompatible Changes>.

=head1 Changed Internals

Some bugs have been fixed in the hash internals. Restricted hashes and
their place holders are now allocated and deleted at slightly different times,
but this should not be visible to user code.

=head1 Future Directions

Code freeze for the next maintenance release (5.8.5) will be on 30th June
2004, with release by mid July.

=head1 Platform Specific Problems

This release is known not to build on Windows 95.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: splitman ---

while (<>) {
    # Collect the man page header: every line up to and including .TH.
    if ($seqno = 1 .. /^\.TH/) {
	unless ($seqno =~ /e/i) {
	    $header .= $_;
	}
	next;
    }

    # Each .Ip line names a function; start or extend its own page.
    if ( /^\.Ip\s*"(.*)"\s*\d+$/) {
	$desking = 0;
	$desc = $1;
	if (name($desc) ne $myname) {
	    $myname = name($desc);
	    print $myname, "\n";
	    open(MAN, "> $myname.3pl");
	    print MAN <<EOALL;
.TH $myname 3PL "\\*(RP"
.B $desc
EOALL
	} else {
	    print MAN <<EOMORE;
.ti +3n
.B $desc
EOMORE
	}
	next;
    }
    unless ($desking) {
	$desking = 1;
    }
    print MAN;
}

sub name {
    ($_[0] =~ /(\w+)/)[0];
}

--- NEW FILE: perlguts.pod ---
=head1 NAME

perlguts - Introduction to the Perl API

=head1 DESCRIPTION


This document attempts to describe how to use the Perl API, as well as
to provide some info on the basic workings of the Perl core. It is far
from complete and probably contains many errors. Please refer any
questions or comments to the author below.

=head1 Variables

=head2 Datatypes

Perl has three typedefs that handle Perl's three main data types:

    SV  Scalar Value
    AV  Array Value
[...2532 lines suppressed...]
need to enter a name and description for your op at the appropriate
place in the C<PL_custom_op_names> and C<PL_custom_op_descs> hashes.

Forthcoming versions of C<B::Generate> (version 1.0 and above) should
directly support the creation of custom ops by name.

=head1 AUTHORS

Until May 1997, this document was maintained by Jeff Okamoto
E<lt>okamoto at corp.hp.comE<gt>.  It is now maintained as part of Perl
itself by the Perl 5 Porters E<lt>perl5-porters at perl.orgE<gt>.

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil
Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer,
Stephen McCamant, and Gurusamy Sarathy.

=head1 SEE ALSO

perlapi(1), perlintern(1), perlxs(1), perlembed(1)

--- NEW FILE: perlfaq3.pod ---
=head1 NAME

perlfaq3 - Programming Tools ($Revision: 1.2 $, $Date: 2006-12-04 17:01:32 $)

=head1 DESCRIPTION


This section of the FAQ answers questions related to programmer tools
and programming support.

=head2 How do I do (anything)?

Have you looked at CPAN (see L<perlfaq2>)?  The chances are that
someone has already written a module that can solve your problem.
Have you read the appropriate manpages?  Here's a brief index:

	Basics	        perldata, perlvar, perlsyn, perlop, perlsub
	Execution	perlrun, perldebug
	Functions	perlfunc
	Objects		perlref, perlmod, perlobj, perltie
	Data Structures	perlref, perllol, perldsc
	Modules		perlmod, perlmodlib, perlsub
	Regexes		perlre, perlfunc, perlop, perllocale
	Moving to perl5	perltrap, perl
	Linking w/C	perlxstut, perlxs, perlcall, perlguts, perlembed
	Various 	http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz
			(not a man-page but still useful, a collection
			 of various essays on Perl techniques)

A crude table of contents for the Perl manpage set is found in L<perltoc>.

=head2 How can I use Perl interactively?

The typical approach uses the Perl debugger, described in the
perldebug(1) manpage, on an "empty" program, like this:

    perl -de 42

Now just type in any legal Perl code, and it will be immediately
evaluated.  You can also examine the symbol table, get stack
backtraces, check variable values, set breakpoints, and other
operations typically found in symbolic debuggers.

=head2 Is there a Perl shell?

The psh (Perl sh) is currently at version 1.8. The Perl Shell is a shell
that combines the interactive nature of a Unix shell with the power of
Perl. The goal is a full featured shell that behaves as expected for
normal shell activity and uses Perl syntax and functionality for
control-flow statements and other things. You can get psh at
http://sourceforge.net/projects/psh/ .

Zoidberg is a similar project and provides a shell written in perl,
configured in perl and operated in perl. It is intended as a login shell
and development environment. It can be found at http://zoidberg.sf.net/
or your local CPAN mirror.

The Shell.pm module (distributed with Perl) makes Perl try commands
which aren't part of the Perl language as shell commands.  perlsh from
the source distribution is simplistic and uninteresting, but may still
be what you want.

=head2 How do I find which modules are installed on my system?

You can use the ExtUtils::Installed module to show all installed
distributions, although it can take a while to do its magic.  The
standard library which comes with Perl just shows up as "Perl" (although
you can get those with Module::CoreList).

	use ExtUtils::Installed;

	my $inst    = ExtUtils::Installed->new();
	my @modules = $inst->modules();

If you want a list of all of the Perl module filenames, you
can use File::Find::Rule.

	use File::Find::Rule;

	my @files = File::Find::Rule->file()->name( '*.pm' )->in( @INC );

If you do not have that module, you can do the same thing
with File::Find which is part of the standard library.

	use File::Find;

	my @files;

	# Collect every plain file ending in .pm below the directories in @INC.
	find(
	    sub {
	        push @files, $File::Find::name
	            if -f $File::Find::name && /\.pm$/;
	    },
	    @INC
	);

	print join "\n", @files;

If you simply need to quickly check to see if a module is
available, you can check for its documentation.  If you can
read the documentation the module is most likely installed.
If you cannot read the documentation, the module might not
have any (in rare cases).

	prompt% perldoc Module::Name

You can also try to include the module in a one-liner to see if
perl finds it.

	perl -MModule::Name -e1
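
The same check can be made from inside a program.  The following is
only a minimal sketch: C<module_available> is a hypothetical helper
name, not something shipped with Perl, and it relies on nothing more
than the builtin C<require> wrapped in C<eval>.

```perl
use strict;
use warnings;

# Hypothetical helper: returns true if the named module can be loaded.
sub module_available {
    my ($module) = @_;

    # Turn Module::Name into Module/Name.pm, the form require() looks up in @INC.
    (my $file = "$module.pm") =~ s{::}{/}g;

    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(File::Find No::Such::Module::Here)) {
    printf "%-25s %s\n", $module,
        module_available($module) ? "installed" : "not found";
}
```

Note that this actually loads the module, so it is best suited to
optional dependencies that you intend to use anyway.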

=head2 How do I debug my Perl programs?

Have you tried C<use warnings> or used C<-w>?  They enable warnings
to detect dubious practices.

Have you tried C<use strict>?  It prevents you from using symbolic
references, makes you predeclare any subroutines that you call as bare
words, and (probably most importantly) forces you to predeclare your
variables with C<my>, C<our>, or C<use vars>.

Did you check the return values of each and every system call?  The operating
system (and thus Perl) tells you whether they worked and, if not, why.

  open(FH, "> /etc/cantwrite")
    or die "Couldn't write to /etc/cantwrite: $!\n";

Did you read L<perltrap>?  It's full of gotchas for old and new Perl
programmers and even has sections for those of you who are upgrading
from languages like I<awk> and I<C>.

Have you tried the Perl debugger, described in L<perldebug>?  You can
step through your program and see what it's doing and thus work out
why what it's doing isn't what it should be doing.

=head2 How do I profile my Perl programs?

You should get the Devel::DProf module from the standard distribution
(or separately on CPAN) and also use Benchmark.pm from the standard
distribution.  The Benchmark module lets you time specific portions of
your code, while Devel::DProf gives detailed breakdowns of where your
code spends its time.

Here's a sample use of Benchmark:

  use Benchmark;

  @junk = `cat /etc/motd`;
  $count = 10_000;

  timethese($count, {
            'map' => sub { my @a = @junk;
			   map { s/a/b/ } @a;
			   return @a },
            'for' => sub { my @a = @junk;
			   for (@a) { s/a/b/ };
			   return @a },
           });

This is what it prints (on one machine--your results will be dependent
on your hardware, operating system, and the load on your machine):

  Benchmark: timing 10000 iterations of for, map...
         for:  4 secs ( 3.97 usr  0.01 sys =  3.98 cpu)
         map:  6 secs ( 4.97 usr  0.00 sys =  4.97 cpu)

Be aware that a good benchmark is very hard to write.  It only tests the
data you give it and proves little about the differing complexities
of contrasting algorithms.
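
One modest safeguard is to confirm that the candidates produce the
same result before timing them.  The sketch below assumes synthetic
input data instead of F</etc/motd>, and the names C<with_map> and
C<with_for> are made up for illustration; C<timethese> and
C<cmpthese> are the functions documented in the Benchmark module.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Synthetic input (an assumption) so the example does not depend on /etc/motd.
my @junk = map { "a line of sample data number $_\n" } 1 .. 100;

sub with_map { my @a = @junk; map { s/a/b/ } @a; return @a }
sub with_for { my @a = @junk; for (@a) { s/a/b/ } return @a }

# Check first that both candidates produce the same result ...
my $same = join('', with_map()) eq join('', with_for());
print "results agree: ", ($same ? "yes" : "no"), "\n";

# ... then time them.  A negative count asks Benchmark to run each
# candidate for at least that many CPU seconds; 'none' suppresses the
# per-candidate report so that only the comparison table is printed.
my $results = timethese(-1, {
    'map' => \&with_map,
    'for' => \&with_for,
}, 'none');
cmpthese($results);
```

The comparison table shows relative rates, which tend to be easier to
read than raw timings, but the caveats above still apply.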

=head2 How do I cross-reference my Perl programs?

The B::Xref module can be used to generate cross-reference reports
for Perl programs.

    perl -MO=Xref[,OPTIONS] scriptname.plx

=head2 Is there a pretty-printer (formatter) for Perl?

Perltidy is a Perl script which indents and reformats Perl scripts
to make them easier to read by trying to follow the rules of
L<perlstyle>. If you write Perl scripts, or spend much time reading
them, you will probably find it useful.  It is available at

Of course, if you simply follow the guidelines in L<perlstyle>,
you shouldn't need to reformat.  The habit of formatting your code
as you write it will help prevent bugs.  Your editor can and should
help you with this.  The perl-mode or newer cperl-mode for emacs
can provide remarkable amounts of help with most (but not all)
code, and even less programmable editors can provide significant
assistance.  Tom Christiansen and many other vi users swear by
the following settings in vi and its clones:

    set ai sw=4
    map! ^O {^M}^[O^T

Put that in your F<.exrc> file (replacing the caret characters
with control characters) and away you go.  In insert mode, ^T is
for indenting, ^D is for undenting, and ^O is for blockdenting--
as it were.  A more complete example, with comments, can be found at

The a2ps program ( http://www-inf.enst.fr/%7Edemaille/a2ps/black+white.ps.gz )
does lots of things related to generating nicely printed output of
documents, as does enscript ( http://people.ssh.fi/mtr/genscript/ ).

=head2 Is there a ctags for Perl?

(contributed by brian d foy)

Exuberant ctags supports Perl: http://ctags.sourceforge.net/

You might also try pltags: http://www.mscha.com/pltags.zip

=head2 Is there an IDE or Windows Perl Editor?

Perl programs are just plain text, so any editor will do.

If you're on Unix, you already have an IDE--Unix itself.  The UNIX
philosophy is the philosophy of several small tools that each do one
thing and do it well.  It's like a carpenter's toolbox.

If you want an IDE, check the following (in alphabetical order, not
order of preference):

=over 4

=item Eclipse


The Eclipse Perl Integration Project integrates Perl
editing/debugging with Eclipse.

=item Enginsite


Perl Editor by EngInSite is a complete integrated development
environment (IDE) for creating, testing, and debugging Perl scripts;
the tool runs on Windows 9x/NT/2000/XP or later.

=item Komodo


ActiveState's cross-platform (as of October 2004, that's Windows, Linux,
and Solaris), multi-language IDE has Perl support, including a regular expression
debugger and remote debugging.

=item Open Perl IDE


Open Perl IDE is an integrated development environment for writing
and debugging Perl scripts with ActiveState's ActivePerl distribution
under Windows 95/98/NT/2000.

=item OptiPerl


OptiPerl is a Windows IDE with simulated CGI environment, including
debugger and syntax highlighting editor.

=item PerlBuilder


PerlBuilder is an integrated development environment for Windows that
supports Perl development.

=item visiPerl+


From Help Consulting, for Windows.

=item Visual Perl


Visual Perl is a Visual Studio.NET plug-in from ActiveState.

=item Zeus


Zeus for Windows is another Win32 multi-language editor/IDE
that comes with support for Perl:

=back


For editors: if you're on Unix you probably have vi or a vi clone
already, and possibly an emacs too, so you may not need to download
anything. In any emacs the cperl-mode (M-x cperl-mode) gives you
perhaps the best available Perl editing mode in any editor.

If you are using Windows, you can use any editor that lets you work
with plain text, such as NotePad or WordPad.  Word processors, such as
Microsoft Word or WordPerfect, typically do not work since they insert
all sorts of behind-the-scenes information, although some allow you to
save files as "Text Only". You can also download text editors designed
specifically for programming, such as Textpad (
http://www.textpad.com/ ) and UltraEdit ( http://www.ultraedit.com/ ),
among others.

If you are using MacOS, the same concerns apply.  MacPerl (for Classic
environments) comes with a simple editor. Popular external editors are
BBEdit ( http://www.bbedit.com/ ) or Alpha (
http://www.his.com/~jguyer/Alpha/Alpha8.html ). MacOS X users can use
Unix editors as well. Neil Bowers (the man behind Geekcruises) has a
list of Mac editors that can handle Perl (
http://www.neilbowers.org/macperleditors.html ).

=over 4

=item GNU Emacs


=item MicroEMACS


=item XEmacs


=item Jed

=back



or a vi clone such as

=over 4

=item Elvis

ftp://ftp.cs.pdx.edu/pub/elvis/ http://www.fh-wedel.de/elvis/

=item Vile


=item Vim

=back



For vi lovers in general, Windows or elsewhere:


nvi ( http://www.bostic.com/vi/ , available from CPAN in src/misc/) is
yet another vi clone, unfortunately not available for Windows, but in
UNIX platforms you might be interested in trying it out, firstly because
strictly speaking it is not a vi clone, it is the real vi, or the new
incarnation of it, and secondly because you can embed Perl inside it
to use Perl as the scripting language.  nvi is not alone in this,
though: at least also vim and vile offer an embedded Perl.

The following are Win32 multilanguage editor/IDEs that support Perl:

=over 4

=item Codewright


=item MultiEdit


=item SlickEdit

=back



There is also a toyedit Text widget based editor written in Perl
that is distributed with the Tk module on CPAN.  The ptkdb
( http://world.std.com/~aep/ptkdb/ ) is a Perl/tk based debugger that
acts as a development environment of sorts.  Perl Composer
( http://perlcomposer.sourceforge.net/ ) is an IDE for Perl/Tk
GUI creation.

In addition to an editor/IDE you might be interested in a more
powerful shell environment for Win32.  Your options include

=over 4

=item Bash

from the Cygwin package ( http://sources.redhat.com/cygwin/ )

=item Ksh

from the MKS Toolkit ( http://www.mks.com/ ), or the Bourne shell of
the U/WIN environment ( http://www.research.att.com/sw/tools/uwin/ )

=item Tcsh

ftp://ftp.astron.com/pub/tcsh/ , see also

=item Zsh

ftp://ftp.blarg.net/users/amol/zsh/ , see also http://www.zsh.org/

=back


MKS and U/WIN are commercial (U/WIN is free for educational and
research purposes); Cygwin is covered by the GNU General Public License (but
that shouldn't matter for Perl use).  Cygwin, MKS, and U/WIN all
contain (in addition to the shells) a comprehensive set of standard
UNIX toolkit utilities.

If you're transferring text files between Unix and Windows using FTP
be sure to transfer them in ASCII mode so the ends of lines are
appropriately converted.

On Mac OS the MacPerl Application comes with a simple 32k text editor
that behaves like a rudimentary IDE.  In contrast to the MacPerl Application
the MPW Perl tool can make use of the MPW Shell itself as an editor (with
no 32k limit).

=over 4

=item Affrus

is a full Perl development environment with full debugger support
( http://www.latenightsw.com ).

=item Alpha

is an editor, written and extensible in Tcl, that nonetheless has
built in support for several popular markup and programming languages
including Perl and HTML ( http://www.his.com/~jguyer/Alpha/Alpha8.html ).

=item BBEdit and BBEdit Lite

are text editors for Mac OS that have a Perl sensitivity mode
( http://web.barebones.com/ ).

=back


Pepper and Pe are programming language sensitive text editors for Mac
OS X and BeOS respectively ( http://www.hekkelman.com/ ).

=head2 Where can I get Perl macros for vi?

For a complete version of Tom Christiansen's vi configuration file,
see http://www.cpan.org/authors/Tom_Christiansen/scripts/toms.exrc.gz ,
the standard benchmark file for vi emulators.  The file runs best with nvi,
the current version of vi out of Berkeley, which incidentally can be built
with an embedded Perl interpreter--see http://www.cpan.org/src/misc/ .

=head2 Where can I get perl-mode for emacs?

Since Emacs version 19 patchlevel 22 or so, there have been both a
perl-mode.el and support for the Perl debugger built in.  These should
come with the standard Emacs 19 distribution.

In the Perl source directory, you'll find a directory called "emacs",
which contains a cperl-mode that color-codes keywords, provides
context-sensitive help, and other nifty things.

Note that the perl-mode of emacs will have fits with C<"main'foo">
(single quote), and mess up the indentation and highlighting.  You
are probably using C<"main::foo"> in new Perl code anyway, so this
shouldn't be an issue.

=head2 How can I use curses with Perl?

The Curses module from CPAN provides a dynamically loadable object
module interface to a curses library.  A small demo can be found at the
directory http://www.cpan.org/authors/Tom_Christiansen/scripts/rep.gz ;
this program repeats a command and updates the screen as needed, rendering
B<rep ps axu> similar to B<top>.

=head2 How can I use X or Tk with Perl?

Tk is a completely Perl-based, object-oriented interface to the Tk toolkit
that doesn't force you to use Tcl just to get at Tk.  Sx is an interface
to the Athena Widget set.  Both are available from CPAN.  See the
directory http://www.cpan.org/modules/by-category/08_User_Interfaces/

Invaluable for Perl/Tk programming are the Perl/Tk FAQ at
http://phaseit.net/claird/comp.lang.perl.tk/ptkFAQ.html , the Perl/Tk Reference
Guide available at
http://www.cpan.org/authors/Stephen_O_Lidie/ , and the
online manpages at
http://www-users.cs.umn.edu/%7Eamundson/perl/perltk/toc.html .

=head2 How can I make my Perl program run faster?

The best way to do this is to come up with a better algorithm.  This
can often make a dramatic difference.  Jon Bentley's book
I<Programming Pearls> (that's not a misspelling!)  has some good tips
on optimization, too.  Advice on benchmarking boils down to: benchmark
and profile to make sure you're optimizing the right part, look for
better algorithms instead of microtuning your code, and when all else
fails consider just buying faster hardware.  You will probably want to
read the answer to the earlier question "How do I profile my Perl
programs?" if you haven't done so already.

A different approach is to autoload seldom-used Perl code.  See the
AutoSplit and AutoLoader modules in the standard distribution for
that.  Or you could locate the bottleneck and think about writing just
that part in C, the way we used to take bottlenecks in C code and
write them in assembler.  Similar to rewriting in C, modules that have
critical sections can be written in C (for instance, the PDL module
from CPAN).

If you're currently linking your perl executable to a shared
I<libc.so>, you can often gain a 10-25% performance benefit by
rebuilding it to link with a static libc.a instead.  This will make a
bigger perl executable, but your Perl programs (and programmers) may
thank you for it.  See the F<INSTALL> file in the source distribution
for more information.

The undump program was an ancient attempt to speed up Perl programs by
storing the already-compiled form to disk.  This is no longer a viable
option, as it only worked on a few architectures, and wasn't a good
solution anyway.

=head2 How can I make my Perl program take less memory?

When it comes to time-space tradeoffs, Perl nearly always prefers to
throw memory at a problem.  Scalars in Perl use more memory than
strings in C, arrays take more than that, and hashes use even more.  While
there's still a lot to be done, recent releases have been addressing
these issues.  For example, as of 5.004, duplicate hash keys are
shared amongst all hashes using them, so require no reallocation.

In some cases, using substr() or vec() to simulate arrays can be
highly beneficial.  For example, an array of a thousand booleans will
take at least 20,000 bytes of space, but it can be turned into one
125-byte bit vector--a considerable memory savings.  The standard
Tie::SubstrHash module can also help for certain types of data
structure.  If you're working with specialist data structures
(matrices, for instance) modules that implement these in C may use
less memory than equivalent Perl modules.
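
To make the bit vector idea concrete, here is a minimal sketch using
the builtin vec().  The variable names and the choice of which flags
to set are made up for illustration.

```perl
use strict;
use warnings;

my $count = 1000;
my $bits  = '';

# Mark every third flag as true; vec() grows the string as needed.
vec($bits, $_, 1) = 1 for grep { $_ % 3 == 0 } 0 .. $count - 1;

# Individual flags are read back with the same three-argument form.
print "flag 9 is ",  (vec($bits, 9, 1)  ? "set" : "clear"), "\n";
print "flag 10 is ", (vec($bits, 10, 1) ? "set" : "clear"), "\n";

# One bit per flag: 1000 flags fit into 125 bytes.
print "bytes used: ", length($bits), "\n";
```

The trade-off is readability: every access has to go through vec()
instead of ordinary array syntax.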

Another thing to try is learning whether your Perl was compiled with
the system malloc or with Perl's builtin malloc.  Whichever one it
is, try using the other one and see whether this makes a difference.
Information about malloc is in the F<INSTALL> file in the source
distribution.  You can find out whether you are using perl's malloc by
typing C<perl -V:usemymalloc>.
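
The same setting can be read from inside a program through the
standard Config module.  This is just a small sketch; the wording of
the messages is made up, and only the C<usemymalloc> key is taken
from the text above.

```perl
use strict;
use warnings;
use Config;

# %Config holds the values that `perl -V:name` reports.
my $malloc = $Config{usemymalloc};
$malloc = 'undefined' unless defined $malloc;

print "usemymalloc=$malloc\n";
print $malloc eq 'y'
    ? "This perl was built with its own malloc.\n"
    : "This perl was built to use the system malloc.\n";
```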

Of course, the best way to save memory is to not do anything to waste
it in the first place. Good programming practices can go a long way
toward this:

=over 4

=item * Don't slurp!

Don't read an entire file into memory if you can process it line
by line. Or more concretely, use a loop like this:

	# Good Idea
	while (<FILE>) {
	   # ...
	}

instead of this:

	# Bad Idea
	@data = <FILE>;
	foreach (@data) {
	    # ...
	}

When the files you're processing are small, it doesn't much matter which
way you do it, but it makes a huge difference when they start getting
larger.

=item * Use map and grep selectively

Remember that both map and grep expect a LIST argument, so doing this:

        @wanted = grep {/pattern/} <FILE>;

will cause the entire file to be slurped. For large files, it's better
to loop:

        while (<FILE>) {
                push(@wanted, $_) if /pattern/;
        }

=item * Avoid unnecessary quotes and stringification

Don't quote large strings unless absolutely necessary:

        my $copy = "$large_string";

makes 2 copies of $large_string (one for $copy and another for the
quotes), whereas

        my $copy = $large_string;

only makes one copy.

Ditto for stringifying large arrays:

        {
                local $, = "\n";
                print @big_array;
        }

is much more memory-efficient than either

        print join "\n", @big_array;

or

        {
                local $" = "\n";
                print "@big_array";
        }

=item * Pass by reference

Pass arrays and hashes by reference, not by value. For one thing, it's
the only way to pass multiple lists or hashes (or both) in a single
call/return. It also avoids creating a copy of all the contents. This
requires some judgment, however, because any changes will be propagated
back to the original data. If you really want to mangle (er, modify) a
copy, you'll have to sacrifice the memory needed to make one.

=item * Tie large variables to disk.

For "big" data stores (i.e. ones that exceed available memory) consider
using one of the DB modules to store it on disk instead of in RAM. This
will incur a penalty in access time, but that's probably better than
causing your hard disk to thrash due to massive swapping.
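
As a rough illustration, the sketch below ties a hash to an on-disk
DBM file.  The choice of SDBM_File (which ships with Perl), the
scratch directory, and the key names are all assumptions made for the
example; other DBM modules are tied in much the same way.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Scratch directory for the demo; a real program would use a fixed path.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/bigdata";

# Everything stored in %store is written to the DBM files on disk.
tie my %store, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Couldn't tie SDBM file '$file': $!\n";

$store{"key$_"} = "value $_" for 1 .. 1000;
untie %store;

# Reopen the file: the data is fetched from disk on demand.
tie my %again, 'SDBM_File', $file, O_RDWR, 0666
    or die "Couldn't reopen SDBM file '$file': $!\n";

my $value = $again{key500};
print "key500 => $value\n";
untie %again;
```

SDBM limits the size of each key/value pair, so larger records call
for one of the other DBM modules.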


=back

=head2 Is it safe to return a reference to local or lexical data?

Yes. Perl's garbage collection system takes care of this so
everything works out right.

    sub makeone {
	my @a = ( 1 .. 10 );
	return \@a;
    }

    for ( 1 .. 10 ) {
        push @many, makeone();
    }

    print $many[4][5], "\n";

    print "@many\n";

=head2 How can I free an array or hash so my program shrinks?

(contributed by Michael Carman)

You usually can't. Memory allocated to lexicals (i.e. my() variables)
cannot be reclaimed or reused even if they go out of scope. It is
reserved in case the variables come back into scope. Memory allocated
to global variables can be reused (within your program) by using
undef() and/or delete().

On most operating systems, memory allocated to a program can never be
returned to the system. That's why long-running programs sometimes
re-exec themselves. Some operating systems (notably, systems that use
mmap(2) for allocating large chunks of memory) can reclaim memory that
is no longer used, but on such systems, perl must be configured and
compiled to use the OS's malloc, not perl's.

In general, memory allocation and de-allocation isn't something you can
or should be worrying about much in Perl.

See also "How can I make my Perl program take less memory?"

=head2 How can I make my CGI script more efficient?

Beyond the normal measures described to make general Perl programs
faster or smaller, a CGI program has additional issues.  It may be run
several times per second.  Given that each time it runs it will need
to be re-compiled and will often allocate a megabyte or more of system
memory, this can be a killer.  Compiling into C B<isn't going to help
you> because the process start-up overhead is where the bottleneck is.

There are two popular ways to avoid this overhead.  One solution
involves running the Apache HTTP server (available from
http://www.apache.org/ ) with either of the mod_perl or mod_fastcgi
plugin modules.

With mod_perl and the Apache::Registry module (distributed with
mod_perl), httpd will run with an embedded Perl interpreter which
pre-compiles your script and then executes it within the same address
space without forking.  The Apache extension also gives Perl access to
the internal server API, so modules written in Perl can do just about
anything a module written in C can.  For more on mod_perl, see

With the FCGI module (from CPAN) and the mod_fastcgi
module (available from http://www.fastcgi.com/ ) each of your Perl
programs becomes a permanent CGI daemon process.

Both of these solutions can have far-reaching effects on your system
and on the way you write your CGI programs, so investigate them with
care.

See http://www.cpan.org/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/ .

=head2 How can I hide the source for my Perl program?

Delete it. :-) Seriously, there are a number of (mostly
unsatisfactory) solutions with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted.  (That doesn't mean that a CGI script's source is
readable by people on the web, though--only by people with access to
the filesystem.)  So you have to leave the permissions at the socially
friendly 0755 level.

Some people regard this as a security problem.  If your program does
insecure things and relies on people not knowing how to exploit those
insecurities, it is not secure.  It is often possible for someone to
determine the insecure things and exploit them without viewing the
source.  Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Starting from Perl
5.8 the Filter::Simple and Filter::Util::Call modules are included in
the standard distribution), but any decent programmer will be able to
decrypt it.  You can try using the byte code compiler and interpreter
described below, but the curious might still be able to de-compile it.
You can try using the native-code compiler described below, but
crackers might be able to disassemble it.  These pose varying degrees
of difficulty to people wanting to get at your code, but none can
definitively conceal it (true of every language, not just Perl).

It is very easy to recover the source of Perl programs.  You simply
feed the program to the perl interpreter and use the modules in
the B:: hierarchy.  The B::Deparse module should be able to
defeat most attempts to hide source.  Again, this is not
unique to Perl.
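
To see how little effort this takes, here is a minimal sketch that
regenerates source text from an already-compiled subroutine.  The
C<$secret> subroutine is a made-up stand-in for "hidden" code; the
C<new> and C<coderef2text> methods are those documented by
B::Deparse.

```perl
use strict;
use warnings;
use B::Deparse;

# A stand-in for "hidden" code: only the compiled form is needed.
my $secret = sub {
    my ($name) = @_;
    return "Hello, $name!";
};

# coderef2text() regenerates Perl source for the body of a code reference.
my $deparser = B::Deparse->new;
my $source   = $deparser->coderef2text($secret);

print "sub $source\n";
```

The regenerated text is not byte-for-byte identical to the original
(comments and formatting are lost), but the logic is plain to read.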

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security.  License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah."  We are not lawyers, of course, so you should see a lawyer if
you want to be sure your license's wording will stand up in court.

=head2 How can I compile my Perl program into byte code or C?

(contributed by brian d foy)

In general, you can't do this.  There are some things that may work
for your situation though.  People usually ask this question
because they want to distribute their works without giving away
the source code, and most solutions trade disk space for convenience.
You probably won't see much of a speed increase either, since most
solutions simply bundle a Perl interpreter in the final product
(but see L<How can I make my Perl program run faster?>).

The Perl Archive Toolkit ( http://par.perl.org/index.cgi ) is Perl's
analog to Java's JAR.  It's freely available and on CPAN (
http://search.cpan.org/dist/PAR/ ).

The B::* namespace, often called "the Perl compiler", is really a way
for Perl programs to peek at their own innards rather than a way to create
pre-compiled versions of your program.  However, the B::Bytecode module can
turn your script into a bytecode format that could be loaded later by the
ByteLoader module and executed as a regular Perl script.

There are also some commercial products that may work for you, although
you have to buy a license for them.

The Perl Dev Kit ( http://www.activestate.com/Products/Perl_Dev_Kit/ )
from ActiveState can "Turn your Perl programs into ready-to-run
executables for HP-UX, Linux, Solaris and Windows."

Perl2Exe ( http://www.indigostar.com/perl2exe.htm ) is a command line
program for converting perl scripts to executable files.  It targets both
Windows and unix platforms.

=head2 How can I compile Perl into Java?

You can integrate Java and Perl with the
Perl Resource Kit from O'Reilly Media.  See
http://www.oreilly.com/catalog/prkunix/ .

Perl 5.6 comes with Java Perl Lingo, or JPL.  JPL, still in
development, allows Perl code to be called from Java.  See jpl/README
in the Perl source tree.

=head2 How can I get C<#!perl> to work on [MS-DOS,NT,...]?

For OS/2 just use

    extproc perl -S -your_switches

as the first line of a C<*.cmd> file (C<-S> is needed because of a bug in cmd.exe's
"extproc" handling).  For DOS one should first invent a corresponding
batch file and codify it in C<ALTERNATE_SHEBANG> (see the
F<dosish.h> file in the source distribution for more information).

The Win95/NT installation, when using the ActiveState port of Perl,
will modify the Registry to associate the C<.pl> extension with the
perl interpreter.  If you install another port, perhaps even building
your own Win95/NT Perl from the standard sources by using a Windows port
of gcc (e.g., with cygwin or mingw32), then you'll have to modify
the Registry yourself.  In addition to associating C<.pl> with the
interpreter, NT people can use: C<SET PATHEXT=%PATHEXT%;.PL> to let them
run the program C<install-linux.pl> merely by typing C<install-linux>.

Under "Classic" MacOS, a perl program will have the appropriate Creator and
Type, so that double-clicking it will invoke the MacPerl application.
Under Mac OS X, clickable apps can be made from any C<#!> script using Wil
Sanchez' DropScript utility: http://www.wsanchez.net/software/ .

I<IMPORTANT!>: Whatever you do, PLEASE don't get frustrated, and just
throw the perl interpreter into your cgi-bin directory, in order to
get your programs working for a web server.  This is an EXTREMELY big
security risk.  Take the time to figure out how to do it correctly.

=head2 Can I write useful Perl programs on the command line?

Yes.  Read L<perlrun> for more information.  Some examples follow.
(These assume standard Unix shell quoting rules.)

    # sum first and last fields
    perl -lane 'print $F[0] + $F[-1]' *

    # identify text files
    perl -le 'for(@ARGV) {print if -f && -T _}' *

    # remove (most) comments from C program
    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

    # make file a month younger than today, defeating reaper daemons
    perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' *

    # find first unused uid
    perl -le '$i++ while getpwuid($i); print $i'

    # display reasonable manpath
    echo $PATH | perl -nl -072 -e '
	s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}'

OK, the last one was actually an Obfuscated Perl Contest entry. :-)
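
To see what the switches are doing for you, here is the first one-liner spelled out as ordinary code.  This is a sketch of the idea rather than the exact code perl generates; the C<sum_first_last> name is invented for the example.

```perl
use strict;
use warnings;

# What -a does for each input line: split on whitespace into @F.
# The -e code then adds the first and last fields.
sub sum_first_last {
    my ($line) = @_;
    my @F = split ' ', $line;
    return $F[0] + $F[-1];
}

print sum_first_last("3 apples and 4"), "\n";
```

The C<-n> switch supplies the loop over input lines and C<-l> takes care of the line endings, which is why the one-liner itself can be so short.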

=head2 Why don't Perl one-liners work on my DOS/Mac/VMS system?

The problem is usually that the command interpreters on those systems
have rather different ideas about quoting than the Unix shells under
which the one-liners were created.  On some systems, you may have to
change single-quotes to double ones, which you must I<NOT> do on Unix
or Plan9 systems.  You might also have to change a single % to a %%.

For example:

    # Unix
    perl -e 'print "Hello world\n"'

    # DOS, etc.
    perl -e "print \"Hello world\n\""

    # Mac
    print "Hello world\n"
     (then Run "Myscript" or Shift-Command-R)

    # MPW
    perl -e 'print "Hello world\n"'

    # VMS
    perl -e "print ""Hello world\n"""

The problem is that none of these examples are reliable: they depend on the
command interpreter.  Under Unix, the first two often work. Under DOS,
it's entirely possible that neither works.  If 4DOS was the command shell,
you'd probably have better luck like this:

  perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""

Under the Mac, it depends which environment you are using.  The MacPerl
shell, or MPW, is much like Unix shells in its support for several
quoting variants, except that it makes free use of the Mac's non-ASCII
characters as control characters.

Using qq(), q(), and qx(), instead of "double quotes", 'single
quotes', and `backticks`, may make one-liners easier to write.
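
A small sketch of the difference between the two operators (the subroutine names are made up for the example):

```perl
use strict;
use warnings;

# qq() interpolates like double quotes; q() is literal like single quotes.
sub greeting_qq { my ($name) = @_; return qq(Hello $name\n) }
sub greeting_q  { return q(Hello $name\n) }

print greeting_qq("world");
print greeting_q(), "\n";
```

Because neither form uses a quote character, the Perl code needs only one kind of quoting from the shell, whichever kind that shell happens to want.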

There is no general solution to all of this.  It is a mess.

[Some of this answer was contributed by Kenneth Albanowski.]

=head2 Where can I learn about CGI or Web programming in Perl?

For modules, get the CGI or LWP modules from CPAN.  For textbooks,
see the two especially dedicated to web stuff in the question on
books.  For problems and questions related to the web, like "Why
do I get 500 Errors" or "Why doesn't it run from the browser right
when it runs fine on the command line", see the troubleshooting
guides and references in L<perlfaq9> or in the CGI MetaFAQ:


=head2 Where can I learn about object-oriented Perl programming?

A good place to start is L<perltoot>, and you can use L<perlobj>,
L<perlboot>, L<perltoot>, L<perltooc>, and L<perlbot> for reference.

Good books on OO in Perl are "Object-Oriented Perl"
by Damian Conway from Manning Publications, and "Learning Perl
References, Objects, & Modules" by Randal Schwartz and Tom
Phoenix from O'Reilly Media.

=head2 Where can I learn about linking C with Perl?

If you want to call C from Perl, start with L<perlxstut>,
moving on to L<perlxs>, L<xsubpp>, and L<perlguts>.  If you want to
call Perl from C, then read L<perlembed>, L<perlcall>, and
L<perlguts>.  Don't forget that you can learn a lot from looking at
how the authors of existing extension modules wrote their code and
solved their problems.

You might not need all the power of XS. The Inline::C module lets
you put C code directly in your Perl source. It handles all the
magic to make it work. You still have to learn at least some of
the perl API but you won't have to deal with the complexity of the
XS support files.

=head2 I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong?

Download the ExtUtils::Embed kit from CPAN and run C<make test>.  If
the tests pass, read the pods again and again and again.  If they
fail, see L<perlbug> and send a bug report with the output of
C<make test TEST_VERBOSE=1> along with C<perl -V>.

=head2 When I tried to run my script, I got this message. What does it mean?

A complete list of Perl's error messages and warnings with explanatory
text can be found in L<perldiag>. You can also use the splain program
(distributed with Perl) to explain the error messages:

    perl program 2>diag.out
    splain [-v] [-p] diag.out

or change your program to explain the messages for you:

    use diagnostics;

or

    use diagnostics -verbose;

=head2 What's MakeMaker?

This module (part of the standard Perl distribution) is designed to
write a Makefile for an extension module from a Makefile.PL.  For more
information, see L<ExtUtils::MakeMaker>.


Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public
domain.  You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit.  A simple comment in the code giving credit to the FAQ would
be courteous but is not required.

--- NEW FILE: perlretut.pod ---
=head1 NAME

perlretut - Perl regular expressions tutorial


This page provides a basic tutorial on understanding, creating and
using regular expressions in Perl.  It serves as a complement to the
reference page on regular expressions L<perlre>.  Regular expressions
are an integral part of the C<m//>, C<s///>, C<qr//> and C<split>
operators and so this tutorial also overlaps with
L<perlop/"Regexp Quote-Like Operators"> and L<perlfunc/split>.

Perl is widely renowned for excellence in text processing, and regular
expressions are one of the big factors behind this fame.  Perl regular
expressions display an efficiency and flexibility unknown in most
other computer languages.  Mastering even the basics of regular
expressions will allow you to manipulate text with surprising ease.
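
As a first taste, here is a short sketch that touches each of the four operators named above.  The subroutine names and sample strings are invented for illustration.

```perl
use strict;
use warnings;

# m// with captures: pull the hour and minute out of a time string.
sub parse_time {
    my ($text) = @_;
    return $text =~ m/(\d\d):(\d\d)/ ? ($1, $2) : ();
}

# qr// builds a reusable pattern; s/// uses it to swap two words.
sub swap_words {
    my ($text) = @_;
    my $pair = qr/(\w+)\s+(\w+)/;
    $text =~ s/$pair/$2 $1/;
    return $text;
}

# split takes a regex that describes the separator.
sub csv_fields { return split /\s*,\s*/, $_[0] }

print join(":", parse_time("at 09:41 sharp")), "\n";
print swap_words("Hello World"), "\n";
print join("|", csv_fields("a , b,c")), "\n";
```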

[...2484 lines suppressed...]
Jeffrey Friedl (published by O'Reilly, ISBN 1556592-257-3).


Copyright (c) 2000 Mark Kvale
All rights reserved.

This document may be distributed under the same terms as Perl itself.

=head2 Acknowledgments

The inspiration for the stop codon DNA example came from the ZIP
code example in chapter 7 of I<Mastering Regular Expressions>.

The author would like to thank Jeff Pinyan, Andrew Johnson, Peter
Haworth, Ronald J Kimball, and Joe Smith for all their helpful


--- NEW FILE: perlmodinstall.pod ---
=head1 NAME

perlmodinstall - Installing CPAN Modules


You can think of a module as the fundamental unit of reusable Perl
code; see L<perlmod> for details.  Whenever anyone creates a chunk of
Perl code that they think will be useful to the world, they register
as a Perl developer at http://www.cpan.org/modules/04pause.html
so that they can then upload their code to the CPAN.  The CPAN is the
Comprehensive Perl Archive Network and can be accessed at
http://www.cpan.org/ , and searched at http://search.cpan.org/ .

This documentation is for people who want to download CPAN modules
and install them on their own computer.


First, are you sure that the module isn't already on your system?  Try
C<perl -MFoo -e 1>.  (Replace "Foo" with the name of the module; for
instance, C<perl -MCGI::Carp -e 1>.)

If you don't see an error message, you have the module.  (If you do
see an error message, it's still possible you have the module, but
that it's not in your path, which you can display with C<perl -e
"print qq(@INC)">.)  For the remainder of this document, we'll assume
that you really honestly truly lack an installed module, but have
found it on the CPAN.
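
The same check can be made from inside a program.  This is a minimal sketch; C<module_installed> is a made-up helper name, and the two modules tested are only examples.

```perl
use strict;
use warnings;

# Returns true if the named module can be loaded from the current @INC.
sub module_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ("File::Spec", "CGI::Carp") {
    printf "%-12s %s\n", $module,
        module_installed($module) ? "found" : "not found";
}

print "Search path:\n";
print "  $_\n" for @INC;
```

Note that a successful check actually loads the module, just as the C<-M> switch does.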

So now you have a file ending in .tar.gz (or, less often, .zip).  You
know there's a tasty module inside.  There are four steps you must now take:

=over 5

=item B<DECOMPRESS> the file

=item B<UNPACK> the file into a directory

=item B<BUILD> the module (sometimes unnecessary)

=item B<INSTALL> the module.

=back


Here's how to perform each step for each operating system.  This is
I<not> a substitute for reading the README and INSTALL files that
might have come with your module!

Also note that these instructions are tailored for installing the
module into your system's repository of Perl modules -- but you can
install modules into any directory you wish.  For instance, where I
say C<perl Makefile.PL>, you can substitute C<perl Makefile.PL
PREFIX=/my/perl_directory> to install the modules into
C</my/perl_directory>.  Then you can use the modules from your Perl
programs with C<use lib "/my/perl_directory/lib/site_perl";> or
sometimes just C<use lib "/my/perl_directory";>.  If you're on a system
that requires superuser/root access to install modules into the
directories you see when you type C<perl -e "print qq(@INC)">, you'll
want to install them into a local directory (such as your home
directory) and use this approach.
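
A short sketch of what C<use lib> does to the search path.  The directory is the illustrative one from the paragraph above, not a real location, and C<searches_private_dir> is a name invented for the example.

```perl
use strict;
use warnings;

# Prepend the private library directory to @INC at compile time.
use lib "/my/perl_directory/lib/site_perl";

# True if the private directory is now on the module search path.
sub searches_private_dir {
    return scalar grep { $_ eq "/my/perl_directory/lib/site_perl" } @INC;
}

print "$_\n" for @INC;
```

Any C<use> or C<require> that follows will look in the private directory before the system-wide ones.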

=over 4

=item *

B<If you're on a Unix or Unix-like system,>

You can use Andreas Koenig's CPAN module
( http://www.cpan.org/modules/by-module/CPAN )
to automate the following steps, from DECOMPRESS through INSTALL.


A. DECOMPRESS

Decompress the file with C<gzip -d yourmodule.tar.gz>

You can get gzip from ftp://prep.ai.mit.edu/pub/gnu/

Or, you can combine this step with the next to save disk space:

     gzip -dc yourmodule.tar.gz | tar -xof -


B. UNPACK

Unpack the result with C<tar -xof yourmodule.tar>


C. BUILD

Go into the newly-created directory and type:

      perl Makefile.PL
      make test


or

      perl Makefile.PL PREFIX=/my/perl_directory

to install it locally.  (Remember that if you do this, you'll have to
put C<use lib "/my/perl_directory";> near the top of the program that
is to use this module.)


D. INSTALL

While still in that directory, type:

      make install

Make sure you have the appropriate permissions to install the module
in your Perl 5 library directory.  Often, you'll need to be root.

That's all you need to do on Unix systems with dynamic linking.
Most Unix systems have dynamic linking -- if yours doesn't, or if for
another reason you have a statically-linked perl, B<and> the
module requires compilation, you'll need to build a new Perl binary
that includes the module.  Again, you'll probably need to be root.

=item *

B<If you're running ActivePerl (Win95/98/2K/NT/XP, Linux, Solaris)>

First, type C<ppm> from a shell and see whether ActiveState's PPM
repository has your module.  If so, you can install it with C<ppm> and
you won't have to bother with any of the other steps here.  You might
be able to use the CPAN instructions from the "Unix or Linux" section
above as well; give it a try.  Otherwise, you'll have to follow the
steps below.


A. DECOMPRESS

You can use the shareware WinZip ( http://www.winzip.com ) to
decompress and unpack modules.


B. UNPACK

If you used WinZip, this was already done for you.


C. BUILD

You'll need the C<nmake> utility, available at
or dmake, available on CPAN.

Does the module require compilation (i.e. does it have files that end
in .xs, .c, .h, .y, .cc, .cxx, or .C)?  If it does, life is now
officially tough for you, because you have to compile the module
yourself -- no easy feat on Windows.  You'll need a compiler such as
Visual C++.  Alternatively, you can download a pre-built PPM package
from ActiveState.

Go into the newly-created directory and type:

      perl Makefile.PL
      nmake test


D. INSTALL

While still in that directory, type:

      nmake install

=item *

B<If you're using a Macintosh with "Classic" MacOS and MacPerl,>


A. DECOMPRESS

First, make sure you have the latest B<cpan-mac> distribution (
http://www.cpan.org/authors/id/CNANDOR/ ), which has utilities for
doing all of the steps.  Read the cpan-mac directions carefully and
install it.  If you choose not to use cpan-mac for some reason, there
are alternatives listed here.

After installing cpan-mac, drop the module archive on the
B<untarzipme> droplet, which will decompress and unpack for you.

B<Or>, you can either use the shareware B<StuffIt Expander> program
( http://www.aladdinsys.com/expander/ )
in combination with B<DropStuff with Expander Enhancer>
( http://www.aladdinsys.com/dropstuff/ )
or the freeware B<MacGzip> program (
http://persephone.cps.unizar.es/general/gente/spd/gzip/gzip.html ).


B. UNPACK

If you're using untarzipme or StuffIt, the archive should be extracted
now.  B<Or>, you can use the freeware B<suntar> or I<Tar> (
http://hyperarchive.lcs.mit.edu/HyperArchive/Archive/cmp/ ).


C. BUILD

Check the contents of the distribution.
Read the module's documentation, looking for
reasons why you might have trouble using it with MacPerl.  Look for
F<.xs> and F<.c> files, which normally denote that the distribution
must be compiled, and you cannot install it "out of the box."

If a module does not work on MacPerl but should, or needs to be
compiled, see if the module exists already as a port on the
MacPerl Module Porters site ( http://pudge.net/mmp/ ).
For more information on doing XS with MacPerl yourself, see
Arved Sandstrom's XS tutorial ( http://macperl.com/depts/Tutorials/ ),
and then consider uploading your binary to the CPAN and
registering it on the MMP site.


D. INSTALL

If you are using cpan-mac, just drop the folder on the
B<installme> droplet, and use the module.

B<Or>, if you aren't using cpan-mac, do some manual labor.

Make sure the newlines for the modules are in Mac format, not Unix format.
If they are not then you might have decompressed them incorrectly.  Check
your decompression and unpacking utilities' settings to make sure they are
translating text files properly.

As a last resort, you can use the perl one-liner:

    perl -i.bak -pe 's/(?:\015)?\012/\015/g' <filenames>

on the source files.
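
The substitution in that one-liner turns either a Unix newline (LF) or a DOS newline (CR LF) into a Mac newline (CR).  Here it is as a small sketch you can try on a string first; C<to_mac_newlines> is a name made up for the example.

```perl
use strict;
use warnings;

# Convert LF and CR LF line endings to the "Classic" MacOS CR.
sub to_mac_newlines {
    my ($text) = @_;
    $text =~ s/(?:\015)?\012/\015/g;
    return $text;
}

my $sample = "one\012two\015\012three";
(my $shown = to_mac_newlines($sample)) =~ s/\015/<CR>/g;   # make CRs visible
print "$shown\n";
```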

Then move the files (probably just the F<.pm> files, though there
may be some additional ones, too; check the module documentation)
to their final destination: This will
most likely be in C<$ENV{MACPERL}site_lib:> (i.e.,
C<HD:MacPerl folder:site_lib:>).  You can add new paths to
the default C<@INC> in the Preferences menu item in the
MacPerl application (C<$ENV{MACPERL}site_lib:> is added
automagically).  Create whatever directory structures are required
(i.e., for C<Some::Module>, create
C<$ENV{MACPERL}site_lib:Some:> and put
C<Module.pm> in that directory).

Then run the following script (or something like it):

     #!perl -w
     use AutoSplit;
     my $dir = "$ENV{MACPERL}site_lib";
     autosplit("$dir:Some:Module.pm", "$dir:auto", 0, 1, 1);

=item *

B<If you're on the DJGPP port of DOS,>


A. DECOMPRESS

djtarx ( ftp://ftp.simtel.net/pub/simtelnet/gnu/djgpp/v2/ )
will both uncompress and unpack.


B. UNPACK

See above.


C. BUILD

Go into the newly-created directory and type:

      perl Makefile.PL
      make test

You will need the packages mentioned in F<README.dos>
in the Perl distribution.


D. INSTALL

While still in that directory, type:

     make install	

You will need the packages mentioned in F<README.dos> in the Perl distribution.

=item *

B<If you're on OS/2,>

Get the EMX development suite and gzip/tar, from either Hobbes (
http://hobbes.nmsu.edu ) or Leo ( http://www.leo.org ), and then follow
the instructions for Unix.

=item *

B<If you're on VMS,>

When downloading from CPAN, save your file with a C<.tgz>
extension instead of C<.tar.gz>.  All other periods in the
filename should be replaced with underscores.  For example,
C<Your-Module-1.33.tar.gz> should be downloaded as
C<Your-Module-1_33.tgz>.

A. DECOMPRESS

Type

    gzip -d Your-Module.tgz

or, for zipped modules, type

    unzip Your-Module.zip

Executables for gzip, zip, and VMStar:


and their source code:


Note that GNU's gzip/gunzip is not the same as Info-ZIP's zip/unzip
package.  The former is a simple compression tool; the latter permits
creation of multi-file archives.


B. UNPACK

If you're using VMStar:

     VMStar xf Your-Module.tar

Or, if you're fond of VMS command syntax:

     tar/extract/verbose Your_Module.tar


C. BUILD

Make sure you have MMS (from Digital) or the freeware MMK ( available
from MadGoat at http://www.madgoat.com ).  Then type this to create
the DESCRIP.MMS for the module:

    perl Makefile.PL

Now you're ready to build:

    mms test

Substitute C<mmk> for C<mms> above if you're using MMK.



D. INSTALL

Type

    mms install

Substitute C<mmk> for C<mms> above if you're using MMK.

=item *

B<If you're on MVS>,

Introduce the F<.tar.gz> file into an HFS as binary; don't translate from
ASCII to EBCDIC.


A. DECOMPRESS

Decompress the file with C<gzip -d yourmodule.tar.gz>

You can get gzip from


B. UNPACK

Unpack the result with

     pax -o to=IBM-1047,from=ISO8859-1 -r < yourmodule.tar

The BUILD and INSTALL steps are identical to those for Unix.  Some
modules generate Makefiles that work better with GNU make, which is
available from http://www.mks.com/s390/gnu/

=back



Note that not all modules will work on all platforms.
See L<perlport> for more information on portability issues.
Read the documentation to see if the module will work on your
system.  There are basically three categories
of modules that will not work "out of the box" with all
platforms (with some possibility of overlap):

=over 4

=item *

B<Those that should, but don't.>  These need to be fixed; consider
contacting the author and possibly writing a patch.

=item *

B<Those that need to be compiled, where the target platform
doesn't have compilers readily available.>  (These modules contain
F<.xs> or F<.c> files, usually.)  You might be able to find
existing binaries on the CPAN or elsewhere, or you might
want to try getting compilers and building it yourself, and then
release the binary for other poor souls to use.

=item *

B<Those that are targeted at a specific platform.>
(Such as the Win32:: modules.)  If the module is targeted
specifically at a platform other than yours, you're out
of luck, most likely.

=back


Check the CPAN Testers if a module should work with your platform
but it doesn't behave as you'd expect, or you aren't sure whether or
not a module will work under your platform.  If the module you want
isn't listed there, you can test it yourself and let CPAN Testers know,
you can join CPAN Testers, or you can request it be tested.


=head1 HEY

If you have any suggested changes for this page, let me know.  Please
don't send me mail asking for help on how to install your modules.
There are too many modules, and too few Orwants, for me to be able to
answer or even acknowledge all your questions.  Contact the module
author instead, or post to comp.lang.perl.modules, or ask someone
familiar with Perl on your operating system.

=head1 AUTHOR

Jon Orwant

orwant at medita.mit.edu

with invaluable help from Chris Nandor, and valuable help from Brandon
Allbery, Charles Bailey, Graham Barr, Dominic Dunlop, Jarkko
Hietaniemi, Ben Holzman, Tom Horsley, Nick Ing-Simmons, Tuomas
J. Lukka, Laszlo Molnar, Alan Olsen, Peter Prymmer, Gurusamy Sarathy,
Christoph Spalinger, Dan Sugalski, Larry Virden, and Ilya Zakharevich.

First version July 22, 1998; last revised November 21, 2001.


Copyright (C) 1998, 2002, 2003 Jon Orwant.  All Rights Reserved.

Permission is granted to make and distribute verbatim copies of this
documentation provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of this
documentation under the conditions for verbatim copying, provided also
that they are marked clearly as modified versions, that the authors'
names and title are unchanged (though subtitles and additional
authors' names may be added), and that the entire resulting derived
work is distributed under the terms of a permission notice identical
to this one.

Permission is granted to copy and distribute translations of this
documentation into another language, under the above conditions for
modified versions.

--- NEW FILE: perlglossary.pod ---
=head1 NAME

perlglossary - Perl Glossary


A glossary of terms (technical and otherwise) used in the Perl documentation.
Other useful sources include the Free On-Line Dictionary of Computing
L<http://foldoc.doc.ic.ac.uk/foldoc/index.html>, the Jargon File
L<http://catb.org/~esr/jargon/>, and Wikipedia L<http://www.wikipedia.org/>.

=head2 A

=over 4

=item accessor methods

A L</method> used to indirectly inspect or update an L</object>'s
state (its L<instance variables|/instance variable>).
[...3344 lines suppressed...]
A subpattern L</assertion> matching the L</null string> between

=item zombie

A process that has died (exited) but whose parent has not yet received
proper notification of its demise by virtue of having called
L<wait|perlfunc/wait> or L<waitpid|perlfunc/waitpid>.  If you
L<fork|perlfunc/fork>, you must clean up after your child processes
when they exit, or else the process table will fill up and your system
administrator will Not Be Happy with you.
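
A minimal sketch of cleaning up after a child, assuming a Unix-like system where C<fork> is available.  The C<run_and_reap> name is made up for this illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Fork a child that exits at once with $code, then reap it with waitpid
# so that it does not linger as a zombie.  Returns the child's exit status.
sub run_and_reap {
    my ($code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        POSIX::_exit($code);    # child: leave without running END blocks
    }
    waitpid($pid, 0);           # parent: collect the child's status
    return $? >> 8;
}

print "child exited with status ", run_and_reap(3), "\n";
```

Between the child's exit and the parent's C<waitpid>, the child is a zombie; the call is what removes its entry from the process table.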



Based on the Glossary of Programming Perl, Third Edition,
by Larry Wall, Tom Christiansen & Jon Orwant.
Copyright (c) 2000, 1996, 1991 O'Reilly Media, Inc.
This document may be distributed under the same terms as Perl itself.

--- NEW FILE: perlartistic.pod ---

=head1 NAME

perlartistic - the Perl Artistic License


 You can refer to this document in Pod via "L<perlartistic>"
 Or you can see this document by entering "perldoc perlartistic"


This is B<"The Artistic License">. It's here so that modules,
programs, etc., that want to declare this as their distribution
license, can link to it.

It is also one of the two licenses under which Perl allows itself to be
redistributed and/or modified; for the other one, the GNU General
Public License, see L<perlgpl>.

=head1 The "Artistic License"

=head2 Preamble

The intent of this document is to state the conditions under which a
Package may be copied, such that the Copyright Holder maintains some
semblance of artistic control over the development of the package,
while giving the users of the package the right to use and distribute
the Package in a more-or-less customary fashion, plus the right to make
reasonable modifications.

=head2 Definitions


=item "Package"

refers to the collection of files distributed by the
Copyright Holder, and derivatives of that collection of files created
through textual modification.

=item "Standard Version"

refers to such a Package if it has not been
modified, or has been modified in accordance with the wishes of the
Copyright Holder as specified below.

=item "Copyright Holder"

is whoever is named in the copyright or
copyrights for the package.

=item "You"

is you, if you're thinking about copying or distributing this Package.

=item "Reasonable copying fee"

is whatever you can justify on the basis
of media cost, duplication charges, time of people involved, and so on.
(You will not be required to justify it to the Copyright Holder, but
only to the computing community at large as a market that must bear the

=item "Freely Available"

means that no fee is charged for the item
itself, though there may be fees involved in handling the item. It also
means that recipients of the item may redistribute it under the same
conditions they received it.


=head2 Conditions


=item 1.

You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.

=item 2.

You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain or from the Copyright Holder.  A Package
modified in such a way shall still be considered the Standard Version.

=item 3.

You may otherwise modify your copy of this Package in any way, provided
that you insert a prominent notice in each changed file stating how and
when you changed that file, and provided that you do at least ONE of the


=item a)

place your modifications in the Public Domain or otherwise make them
Freely Available, such as by posting said modifications to Usenet or an
equivalent medium, or placing the modifications on a major archive site
such as uunet.uu.net, or by allowing the Copyright Holder to include
your modifications in the Standard Version of the Package.

=item b)

use the modified Package only within your corporation or organization.

=item c)

rename any non-standard executables so the names do not conflict with
standard executables, which must also be provided, and provide a
separate manual page for each non-standard executable that clearly
documents how it differs from the Standard Version.

=item d)

make other distribution arrangements with the Copyright Holder.


=item 4.

You may distribute the programs of this Package in object code or
executable form, provided that you do at least ONE of the following:


=item a)

distribute a Standard Version of the executables and library files,
together with instructions (in the manual page or equivalent) on where
to get the Standard Version.

=item b)

accompany the distribution with the machine-readable source of the
Package with your modifications.

=item c)

give non-standard executables non-standard names, and clearly
document the differences in manual pages (or equivalent), together with
instructions on where to get the Standard Version.

=item d)

make other distribution arrangements with the Copyright Holder.


=item 5.

You may charge a reasonable copying fee for any distribution of this
Package.  You may charge any fee you choose for support of this
Package.  You may not charge a fee for this Package itself.  However,
you may distribute this Package in aggregate with other (possibly
commercial) programs as part of a larger (possibly commercial) software
distribution provided that you do not advertise this Package as a
product of your own.  You may embed this Package's interpreter within
an executable of yours (by linking); this shall be construed as a mere
form of aggregation, provided that the complete Standard Version of the
interpreter is so embedded.

=item 6.

The scripts and library files supplied as input to or produced as
output from the programs of this Package do not automatically fall
under the copyright of this Package, but belong to whoever generated
them, and may be sold commercially, and may be aggregated with this
Package.  If such scripts or library files are aggregated with this
Package via the so-called "undump" or "unexec" methods of producing a
binary executable image, then distribution of such an image shall
neither be construed as a distribution of this Package nor shall it
fall under the restrictions of Paragraphs 3 and 4, provided that you do
not represent such an executable image as a Standard Version of this

=item 7.

C subroutines (or comparably compiled subroutines in other
languages) supplied by you and linked into this Package in order to
emulate subroutines and variables of the language defined by this
Package shall not be considered part of this Package, but are the
equivalent of input as in Paragraph 6, provided these subroutines do
not change the language in any way that would cause it to fail the
regression tests for the language.

=item 8.

Aggregation of this Package with a commercial distribution is always
permitted provided that the use of this Package is embedded; that is,
when no overt attempt is made to make this Package's interfaces visible
to the end user of the commercial distribution.  Such use shall not be
construed as a distribution of this Package.

=item 9.

The name of the Copyright Holder may not be used to endorse or promote
products derived from this software without specific prior written permission.

=item 10.



The End


--- NEW FILE: perlpacktut.pod ---
=head1 NAME

perlpacktut - tutorial on C<pack> and C<unpack>


C<pack> and C<unpack> are two functions for transforming data according
to a user-defined template, between the guarded way Perl stores values
and some well-defined representation as might be required in the 
environment of a Perl program. Unfortunately, they're also two of 
the most misunderstood and most often overlooked functions that Perl
provides. This tutorial will demystify them for you.
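
To give a first impression, here is a small round trip, offered as a sketch.  The template and the C<to_record> and C<from_record> names are chosen purely for illustration.

```perl
use strict;
use warnings;

# Template: a 3-character space-padded string, a 16-bit big-endian
# unsigned integer, and a single unsigned byte.
use constant TEMPLATE => "A3 n C";

sub to_record   { return pack(TEMPLATE, @_) }
sub from_record { return unpack(TEMPLATE, $_[0]) }

my $record = to_record("abc", 258, 65);
printf "%d bytes: %s\n", length($record), unpack("H*", $record);
print join(", ", from_record($record)), "\n";
```

The same template drives both directions: C<pack> lays the values out as bytes, and C<unpack> reads them back into Perl values.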

=head1 The Basic Principle

Most programming languages don't shelter the memory where variables are
stored. In C, for instance, you can take the address of some variable,
and the C<sizeof> operator tells you how many bytes are allocated to
[...1107 lines suppressed...]
               unpack( 'H2' x length( $mem ), $mem ) ),
          length( $mem ) % 16 ? "\n" : '';

=head1 Funnies Section

    # Pulling digits out of nowhere...
    print unpack( 'C', pack( 'x' ) ),
          unpack( '%B*', pack( 'A' ) ),
          unpack( 'H', pack( 'A' ) ),
          unpack( 'A', unpack( 'C', pack( 'A' ) ) ), "\n";

    # One for the road ;-)
    my $advice = pack( 'all u can in a van' );

=head1 Authors

Simon Cozens and Wolfgang Laun.

--- NEW FILE: perlform.pod ---
=head1 NAME
X<format> X<report> X<chart>

perlform - Perl formats


Perl has a mechanism to help you generate simple reports and charts.  To
facilitate this, Perl helps you code up your output page close to how it
will look when it's printed.  It can keep track of things like how many
lines are on a page, what page you're on, when to print page headers,
etc.  Keywords are borrowed from FORTRAN: format() to declare and write()
to execute; see their entries in L<perlfunc>.  Fortunately, the layout is
much more legible, more like BASIC's PRINT USING statement.  Think of it
as a poor man's nroff(1).

Formats, like packages and subroutines, are declared rather than
executed, so they may occur at any point in your program.  (Usually it's
best to keep them all together though.) They have their own namespace
apart from all the other "types" in Perl.  This means that if you have a
function named "Foo", it is not the same thing as having a format named
"Foo".  However, the default name for the format associated with a given
filehandle is the same as the name of the filehandle.  Thus, the default
format for STDOUT is named "STDOUT", and the default format for filehandle
TEMP is named "TEMP".  They just look the same.  They aren't.

Output record formats are declared as follows:

    format NAME =
    FORMLIST
    .

If the name is omitted, format "STDOUT" is defined. A single "." in 
column 1 is used to terminate a format.  FORMLIST consists of a sequence 
of lines, each of which may be one of three types:

=over 4

=item 1.

A comment, indicated by putting a '#' in the first column.

=item 2.

A "picture" line giving the format for one output line.

=item 3.

An argument line supplying values to plug into the previous picture line.

=back


Picture lines contain output field definitions, intermingled with
literal text. These lines do not undergo any kind of variable interpolation.
Field definitions are made up from a set of characters, for starting and
extending a field to its desired width. This is the complete set of
characters for field definitions:
X<format, picture line>
X<@> X<^> X<< < >> X<< | >> X<< > >> X<#> X<0> X<.> X<...>
X<@*> X<^*> X<~> X<~~>

   @    start of regular field
   ^    start of special field
   <    pad character for left justification
   |    pad character for centering
   >    pad character for right justification
   #    pad character for a right justified numeric field
   0    instead of first #: pad number with leading zeroes
   .    decimal point within a numeric field
   ...  terminate a text field, show "..." as truncation evidence
   @*   variable width field for a multi-line value
   ^*   variable width field for next line of a multi-line value
   ~    suppress line with all fields empty
   ~~   repeat line until all fields are exhausted

Each field in a picture line starts with either "@" (at) or "^" (caret),
indicating what we'll call, respectively, a "regular" or "special" field.
The choice of pad characters determines whether a field is textual or
numeric. The tilde operators are not part of a field.  Let's look at
the various possibilities in detail.

=head2 Text Fields
X<format, text field>

The length of the field is supplied by padding out the field with multiple 
"E<lt>", "E<gt>", or "|" characters to specify a non-numeric field with,
respectively, left justification, right justification, or centering. 
For a regular field, the value (up to the first newline) is taken and
printed according to the selected justification, truncating excess characters.
If you terminate a text field with "...", three dots will be shown if
the value is truncated. A special text field may be used to do rudimentary 
multi-line text block filling; see L</Using Fill Mode> for details.

      format STDOUT =
      @<<<<<<   @||||||   @>>>>>>
      "left",   "middle", "right"
      .
   Output:
      left      middle    right

=head2 Numeric Fields
X<#> X<format, numeric field>

Using "#" as a padding character specifies a numeric field, with
right justification. An optional "." defines the position of the
decimal point. With a "0" (zero) instead of the first "#", the
formatted number will be padded with leading zeroes if necessary.
A special numeric field is blanked out if the value is undefined.
If the resulting value would exceed the width specified, the field is
filled with "#" as overflow evidence.

      format STDOUT =
      @###   @.###   @##.###  @###   @###   ^####
       42,   3.1415,  undef,    0, 10000,   undef
      .
   Output:
        42   3.142     0.000     0   ####

=head2 The Field @* for Variable Width Multi-Line Text

The field "@*" can be used for printing multi-line, nontruncated
values; it should (but need not) appear by itself on a line. A final
line feed is chomped off, but all other characters are emitted verbatim.
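
As a small illustration of the chomping rule, here is a sketch that feeds a
two-line value to "@*".  It uses formline() and the accumulator C<$^A>
(described under "Accessing Formatting Internals") rather than a declared
format, only so that the result lands in a string that can be inspected;
the variable names are made up for this example.

```perl
use strict;
use warnings;

# A value with two lines and a trailing newline.
my $body = "Line one\nLine two\n";

# A single-quoted here-doc keeps Perl from trying to interpolate the "@".
my $picture = <<'END';
Message:
@*
End
END

$^A = "";                      # start with an empty accumulator
formline($picture, $body);     # "@*" takes the whole multi-line value
my $report = $^A;
$^A = "";

# The value's final newline is chomped, so no blank line precedes "End".
print $report;
```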

=head2 The Field ^* for Variable Width One-line-at-a-time Text

Like "@*", this is a variable width field. The value supplied must be a 
scalar variable. Perl puts the first line (up to the first "\n") of the 
text into the field, and then chops off the front of the string so that 
the next time the variable is referenced, more of the text can be printed. 
The variable will I<not> be restored.

      $text = "line 1\nline 2\nline 3";
      format STDOUT =
      Text: ^*
            $text
      ~~    ^*
            $text
      .
   Output:
      Text: line 1
            line 2
            line 3

=head2 Specifying Values
X<format, specifying values>

The values are specified on the following format line in the same order as
the picture fields.  The expressions providing the values must be
separated by commas.  They are all evaluated in a list context
before the line is processed, so a single list expression could produce
multiple list elements.  The expressions may be spread out to more than
one line if enclosed in braces.  If so, the opening brace must be the first
token on the first line.  If an expression evaluates to a number with a
decimal part, and if the corresponding picture specifies that the decimal
part should appear in the output (that is, any picture except multiple "#"
characters B<without> an embedded "."), the character used for the decimal
point is B<always> determined by the current LC_NUMERIC locale.  This
means that, if, for example, the run-time environment happens to specify a
German locale, "," will be used instead of the default ".".  See
L<perllocale> and L<"WARNINGS"> for more information.

=head2 Using Fill Mode
X<format, fill mode>

On text fields the caret enables a kind of fill mode.  Instead of an
arbitrary expression, the value supplied must be a scalar variable
that contains a text string.  Perl puts the next portion of the text into
the field, and then chops off the front of the string so that the next time
the variable is referenced, more of the text can be printed.  (Yes, this
means that the variable itself is altered during execution of the write()
call, and is not restored.)  The next portion of text is determined by
a crude line breaking algorithm. You may use the carriage return character
(C<\r>) to force a line break. You can change which characters are legal 
to break on by changing the variable C<$:> (that's 
$FORMAT_LINE_BREAK_CHARACTERS if you're using the English module) to a 
list of the desired characters.

Normally you would use a sequence of fields in a vertical stack associated 
with the same scalar variable to print out a block of text. You might wish 
to end the final field with the text "...", which will appear in the output 
if the text was too long to appear in its entirety.  
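
Here is a minimal sketch of fill mode at work.  Instead of stacking several
caret fields, it repeats one ten-column field with "~~", and it calls
formline() directly (see "Accessing Formatting Internals") so the filled
text ends up in a string.  The sample text and the variable names are
assumptions made for this example.

```perl
use strict;
use warnings;

# Text to pour into a ten-column caret field.
my $text     = "the quick brown fox jumps over the lazy dog";
my $original = $text;

# "^" plus nine "<" makes a field ten columns wide; "~~" repeats the
# line until the variable has been used up.
my $picture = "^<<<<<<<<< ~~\n";

$^A = "";                      # clear the accumulator
formline($picture, $text);     # eats $text one line's worth at a time
my $filled = $^A;
$^A = "";

print $filled;

# The variable was altered along the way and is not restored.
print "text left over: '$text'\n";
```

Note that C<$text> is consumed by the call, which is why the sketch keeps a
copy in C<$original>.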

=head2 Suppressing Lines Where All Fields Are Void
X<format, suppressing lines>

Using caret fields can produce lines where all fields are blank. You can
suppress such lines by putting a "~" (tilde) character anywhere in the
line.  The tilde will be translated to a space upon output.

=head2 Repeating Format Lines
X<format, repeating lines>

If you put two contiguous tilde characters "~~" anywhere into a line,
the line will be repeated until all the fields on the line are exhausted,
i.e. undefined. For special (caret) text fields this will occur sooner or
later, but if you use a text field of the at variety, the  expression you
supply had better not give the same value every time forever! (C<shift(@f)>
is a simple example that would work.)  Don't use a regular (at) numeric 
field in such lines, because it will never go blank.

=head2 Top of Form Processing
X<format, top of form> X<top> X<header>

Top-of-form processing is by default handled by a format with the
same name as the current filehandle with "_TOP" concatenated to it.
It's triggered at the top of each page.  See L<perlfunc/write>.


 # a report on the /etc/passwd file
 format STDOUT_TOP =
                         Passwd File
 Name                Login    Office   Uid   Gid Home
 ------------------------------------------------------------------
 .
 format STDOUT =
 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
 $name,              $login,  $office,$uid,$gid, $home
 .

 # a report from a bug report form
 format STDOUT_TOP =
                         Bug Reports
 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
 $system,                      $%,         $date
 ------------------------------------------------------------------
 .
 format STDOUT =
 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
          $subject
 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        $index,                       $description
 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
            $priority,        $date,   $description
 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
       $from,                         $description
 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
              $programmer,            $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
                                      $description
 .

It is possible to intermix print()s with write()s on the same output
channel, but you'll have to handle C<$-> (C<$FORMAT_LINES_LEFT>)
yourself.

=head2 Format Variables
X<format variables>
X<format, variables>

The current format name is stored in the variable C<$~> (C<$FORMAT_NAME>),
and the current top of form format name is in C<$^> (C<$FORMAT_TOP_NAME>).
The current output page number is stored in C<$%> (C<$FORMAT_PAGE_NUMBER>),
and the number of lines on the page is in C<$=> (C<$FORMAT_LINES_PER_PAGE>).
Whether to autoflush output on this handle is stored in C<$|>
(C<$OUTPUT_AUTOFLUSH>).  The string output before each top of page (except
the first) is stored in C<$^L> (C<$FORMAT_FORMFEED>).  These variables are
set on a per-filehandle basis, so you'll need to select() into a different
one to affect them:

    select((select(OUTF),
            $~ = "My_Other_Format",
            $^ = "My_Top_Format"
           )[0]);

Pretty ugly, eh?  It's a common idiom though, so don't be too surprised
when you see it.  You can at least use a temporary variable to hold
the previous filehandle.  This is a much better approach in general,
because not only does legibility improve, but you also get intermediate
statements to single-step through in the debugger:

    $ofh = select(OUTF);
    $~ = "My_Other_Format";
    $^ = "My_Top_Format";
    select($ofh);

If you use the English module, you can even read the variable names:

    use English '-no_match_vars';
    $ofh = select(OUTF);
    $FORMAT_NAME     = "My_Other_Format";
    $FORMAT_TOP_NAME = "My_Top_Format";
    select($ofh);

But you still have those funny select()s.  So just use the FileHandle
module.  Now, you can access these special variables using lowercase
method names instead:

    use FileHandle;
    format_name     OUTF "My_Other_Format";
    format_top_name OUTF "My_Top_Format";

Much better!

=head1 NOTES

Because the values line may contain arbitrary expressions (for at fields,
not caret fields), you can farm out more sophisticated processing
to other functions, like sprintf() or one of your own.  For example:

    format Ident =
        @<<<<<<<<<<<<<<<
        &commify($n)
    .

To get a real at or caret into the field, do this:

    format Ident =
    I have an @ here.
            "@"
    .

To center a whole line of text, do something like this:

    format Ident =
    @|||||||||||||||||||||||||||||||||||||||||||||||
            "Some text line"
    .

There is no builtin way to say "float this to the right hand side
of the page, however wide it is."  You have to specify where it goes.
The truly desperate can generate their own format on the fly, based
on the current number of columns, and then eval() it:

    $format  = "format STDOUT = \n"
             . '^' . '<' x $cols . "\n"
             . '$entry' . "\n"
             . "\t^" . "<" x ($cols-8) . "~~\n"
             . '$entry' . "\n"
             . ".\n";
    print $format if $Debugging;
    eval $format;
    die $@ if $@;

Which would generate a format looking something like this:

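 format STDOUT =
 ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
 $entry
         ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
 $entry
 .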

Here's a little program that's somewhat like fmt(1):

 format =
 ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ~~
 $_

 .

 $/ = '';
 while (<>) {
     s/\s*\n\s*/ /g;
     write;
 }

=head2 Footers
X<format, footer> X<footer>

While $FORMAT_TOP_NAME contains the name of the current header format,
there is no corresponding mechanism to automatically do the same thing
for a footer.  Not knowing how big a format is going to be until you
evaluate it is one of the major problems.  It's on the TODO list.

Here's one strategy:  If you have a fixed-size footer, you can get footers
by checking $FORMAT_LINES_LEFT before each write() and print the footer
yourself if necessary.

Here's another strategy: Open a pipe to yourself, using C<open(MYSELF, "|-")>
(see L<perlfunc/open()>) and always write() to MYSELF instead of STDOUT.
Have your child process massage its STDIN to rearrange headers and footers
however you like.  Not very convenient, but doable.

=head2 Accessing Formatting Internals
X<format, internals>

For low-level access to the formatting mechanism, you may use formline()
and access C<$^A> (the $ACCUMULATOR variable) directly.

For example:

    $str = formline <<'END', 1,2,3;
    @<<<  @|||  @>>>
    END

    print "Wow, I just stored `$^A' in the accumulator!\n";

Or to make an swrite() subroutine, which is to write() what sprintf()
is to printf(), do this:

    use Carp;
    sub swrite {
        croak "usage: swrite PICTURE ARGS" unless @_;
        my $format = shift;
        $^A = "";
        formline($format, @_);
        return $^A;
    }

    $string = swrite(<<'END', 1, 2, 3);
 Check me out
 @<<<  @|||  @>>>
 END
    print $string;


=head1 WARNINGS

The lone dot that ends a format can also prematurely end a mail
message passing through a misconfigured Internet mailer (and based on
experience, such misconfiguration is the rule, not the exception).  So
when sending format code through mail, you should indent it so that
the format-ending dot is not on the left margin; this will prevent
SMTP cutoff.

Lexical variables (declared with "my") are not visible within a
format unless the format is declared within the scope of the lexical
variable.  (They weren't visible at all before version 5.001.)

Formats are the only part of Perl that unconditionally use information
from a program's locale; if a program's environment specifies an
LC_NUMERIC locale, it is always used to specify the decimal point
character in formatted output.  Perl ignores all other aspects of locale
handling unless the C<use locale> pragma is in effect.  Formatted output
cannot be controlled by C<use locale> because the pragma is tied to the
block structure of the program, and, for historical reasons, formats
exist outside that block structure.  See L<perllocale> for further
discussion of locale handling.

Within strings that are to be displayed in a fixed length text field,
each control character is substituted by a space. (But remember the
special meaning of C<\r> when using fill mode.) This is done to avoid
misalignment when control characters "disappear" on some output media.

--- NEW FILE: perlreftut.pod ---
=head1 NAME

perlreftut - Mark's very short tutorial about references

=head1 DESCRIPTION

One of the most important new features in Perl 5 was the capability to
manage complicated data structures like multidimensional arrays and
nested hashes.  To enable these, Perl 5 introduced a feature called
`references', and using references is the key to managing complicated,
structured data in Perl.  Unfortunately, there's a lot of funny syntax
to learn, and the main manual page can be hard to follow.  The manual
is quite complete, and sometimes people find that a problem, because
it can be hard to tell what is important and what isn't.

Fortunately, you only need to know 10% of what's in the main page to get
90% of the benefit.  This page will show you that 10%.

=head1 Who Needs Complicated Data Structures?

One problem that came up all the time in Perl 4 was how to represent a
hash whose values were lists.  Perl 4 had hashes, of course, but the
values had to be scalars; they couldn't be lists.

Why would you want a hash of lists?  Let's take a simple example: You
have a file of city and country names, like this:

	Chicago, USA
	Frankfurt, Germany
	Berlin, Germany
	Washington, USA
	Helsinki, Finland
	New York, USA

and you want to produce an output like this, with each country mentioned
once, and then an alphabetical list of the cities in that country:

	Finland: Helsinki.
	Germany: Berlin, Frankfurt.
	USA:  Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are country
names.  Associated with each country name key is a list of the cities in
that country.  Each time you read a line of input, split it into a country
and a city, look up the list of cities already known to be in that
country, and append the new city to the list.  When you're done reading
the input, iterate over the hash as usual, sorting each list of cities
before you print it out.

If hash values can't be lists, you lose.  In Perl 4, hash values can't
be lists; they can only be strings.  You lose.  You'd probably have to
combine all the cities into a single string somehow, and then when
time came to write the output, you'd have to break the string into a
list, sort the list, and turn it back into a string.  This is messy
and error-prone.  And it's frustrating, because Perl already has
perfectly good lists that would solve the problem if only you could
use them.

=head1 The Solution

By the time Perl 5 rolled around, we were already stuck with this
design: Hash values must be scalars.  The solution to this is
references.

A reference is a scalar value that I<refers to> an entire array or an
entire hash (or to just about anything else).  Names are one kind of
reference that you're already familiar with.  Think of the President
of the United States: a messy, inconvenient bag of blood and bones.
But to talk about him, or to represent him in a computer program, all
you need is the easy, convenient scalar string "George Bush".

References in Perl are like names for arrays and hashes.  They're
Perl's private, internal names, so you can be sure they're
unambiguous.  Unlike "George Bush", a reference only refers to one
thing, and you always know what it refers to.  If you have a reference
to an array, you can recover the entire array from it.  If you have a
reference to a hash, you can recover the entire hash.  But the
reference is still an easy, compact scalar value.

You can't have a hash whose values are arrays; hash values can only be
scalars.  We're stuck with that.  But a single reference can refer to
an entire array, and references are scalars, so you can have a hash of
references to arrays, and it'll act a lot like a hash of arrays, and
it'll be just as useful as a hash of arrays.

We'll come back to this city-country problem later, after we've seen
some syntax for managing references.

=head1 Syntax

There are just two ways to make a reference, and just two ways to use
it once you have it.

=head2 Making References

=head3 B<Make Rule 1>

If you put a C<\> in front of a variable, you get a
reference to that variable.

    $aref = \@array;         # $aref now holds a reference to @array
    $href = \%hash;          # $href now holds a reference to %hash
    $sref = \$scalar;        # $sref now holds a reference to $scalar

Once the reference is stored in a variable like $aref or $href, you
can copy it or store it just the same as any other scalar value:

    $xy = $aref;             # $xy now holds a reference to @array
    $p[3] = $href;           # $p[3] now holds a reference to %hash
    $z = $p[3];              # $z now holds a reference to %hash

These examples show how to make references to variables with names.
Sometimes you want to make an array or a hash that doesn't have a
name.  This is analogous to the way you like to be able to use the
string C<"\n"> or the number 80 without having to store it in a named
variable first.

=head3 B<Make Rule 2>

C<[ ITEMS ]> makes a new, anonymous array, and returns a reference to
that array.  C<{ ITEMS }> makes a new, anonymous hash, and returns a
reference to that hash.

    $aref = [ 1, "foo", undef, 13 ];
    # $aref now holds a reference to an array

    $href = { APR => 4, AUG => 8 };
    # $href now holds a reference to a hash

The references you get from rule 2 are the same kind of
references that you get from rule 1:

	# This:
	$aref = [ 1, 2, 3 ];

	# Does the same as this:
	@array = (1, 2, 3);
	$aref = \@array;

The first line is an abbreviation for the following two lines, except
that it doesn't create the superfluous array variable C<@array>.

If you write just C<[]>, you get a new, empty anonymous array.
If you write just C<{}>, you get a new, empty anonymous hash.
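
The two Make Rules can be tried side by side in a short sketch.  The
variable names below are only illustrative, and the C<@{...}> notation used
to look inside the references is explained under "Using References".

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my %hash  = (APR => 4, AUG => 8);

# Make Rule 1: a backslash in front of a variable gives a reference to it.
my $aref = \@array;
my $href = \%hash;

# Make Rule 2: [ ITEMS ] and { ITEMS } build anonymous structures
# and hand back references to them.
my $anon_aref = [ 1, "foo", undef, 13 ];
my $anon_href = { APR => 4, AUG => 8 };

# A reference is an ordinary scalar, so it can be copied like one.
my $copy = $aref;

# Both $aref and $copy refer to @array, so a change made through
# either one shows up in the named array.
push @{$copy}, 4;

print "named array has ", scalar(@array), " elements\n";
print "anonymous array has ", scalar(@{$anon_aref}), " elements\n";
```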

=head2 Using References

What can you do with a reference once you have it?  It's a scalar
value, and we've seen that you can store it as a scalar and get it back
again just like any scalar.  There are just two more ways to use it:

=head3 B<Use Rule 1>

You can always use an array reference, in curly braces, in place of
the name of an array.  For example, C<@{$aref}> instead of C<@array>.

Here are some examples of that:


	@a		@{$aref}		An array
	reverse @a	reverse @{$aref}	Reverse the array
	$a[3]		${$aref}[3]		An element of the array
	$a[3] = 17;	${$aref}[3] = 17	Assigning an element

On each line are two expressions that do the same thing.  The
left-hand versions operate on the array C<@a>.  The right-hand
versions operate on the array that is referred to by C<$aref>.  Once
they find the array they're operating on, both versions do the same
things to the arrays.

Using a hash reference is I<exactly> the same:

	%h		%{$href}	      A hash
	keys %h		keys %{$href}	      Get the keys from the hash
	$h{'red'}	${$href}{'red'}	      An element of the hash
	$h{'red'} = 17	${$href}{'red'} = 17  Assigning an element

Whatever you want to do with a reference, B<Use Rule 1> tells you how
to do it.  You just write the Perl code that you would have written
for doing the same thing to a regular array or hash, and then replace
the array or hash name with C<{$reference}>.  "How do I loop over an
array when all I have is a reference?"  Well, to loop over an array, you
would write

        for my $element (@array) {
           ...
        }

so replace the array name, C<@array>, with the reference:

        for my $element (@{$aref}) {
           ...
        }

"How do I print out the contents of a hash when all I have is a
reference?"  First write the code for printing out a hash:

        for my $key (keys %hash) {
          print "$key => $hash{$key}\n";
        }

And then replace the hash name with the reference:

        for my $key (keys %{$href}) {
          print "$key => ${$href}{$key}\n";
        }

=head3 B<Use Rule 2>

B<Use Rule 1> is all you really need, because it tells you how to do
absolutely everything you ever need to do with references.  But the
most common thing to do with an array or a hash is to extract a single
element, and the B<Use Rule 1> notation is cumbersome.  So there is an
abbreviation.

C<${$aref}[3]> is too hard to read, so you can write C<< $aref->[3] >>
instead.

C<${$href}{red}> is too hard to read, so you can write
C<< $href->{red} >> instead.

If C<$aref> holds a reference to an array, then C<< $aref->[3] >> is
the fourth element of the array.  Don't confuse this with C<$aref[3]>,
which is the fourth element of a totally different array, one
deceptively named C<@aref>.  C<$aref> and C<@aref> are unrelated the
same way that C<$item> and C<@item> are.

Similarly, C<< $href->{'red'} >> is part of the hash referred to by
the scalar variable C<$href>, perhaps even one with no name.
C<$href{'red'}> is part of the deceptively named C<%href> hash.  It's
easy to leave out the C<< -> >> by mistake, and if you do, you'll get
bizarre results when your program gets array and hash elements out of
totally unexpected hashes and arrays that weren't the ones you wanted
to use.
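
A short sketch makes the difference concrete.  The deceptively named
C<@aref> array is declared here on purpose, purely for illustration:

```perl
use strict;
use warnings;

my $aref = [ 10, 20, 30, 40 ];        # reference to an anonymous array
my @aref = ( 'a', 'b', 'c', 'd' );    # unrelated array with a similar name
my $href = { red => 17 };

my $long  = ${$aref}[3];    # Use Rule 1: fourth element via the reference
my $short = $aref->[3];     # Use Rule 2: the same element, easier to read
my $other = $aref[3];       # fourth element of @aref, a different array

print "long=$long short=$short other=$other\n";
print "red is $href->{red}\n";
```

Under C<use strict>, leaving out the arrow when no C<@aref> exists is caught
at compile time, which is one more reason to enable it.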

=head2 An Example

Let's see a quick example of how all this is useful.

First, remember that C<[1, 2, 3]> makes an anonymous array containing
C<(1, 2, 3)>, and gives you a reference to that array.

Now think about

	@a = ( [1, 2, 3],
	       [4, 5, 6],
	       [7, 8, 9]
	     );

@a is an array with three elements, and each one is a reference to
another array.

C<$a[1]> is one of these references.  It refers to an array, the array
containing C<(4, 5, 6)>, and because it is a reference to an array,
B<Use Rule 2> says that we can write C<< $a[1]->[2] >> to get the
third element from that array.  C<< $a[1]->[2] >> is the 6.
Similarly, C<< $a[0]->[1] >> is the 2.  What we have here is like a
two-dimensional array; you can write C<< $a[ROW]->[COLUMN] >> to get
or set the element in any row and any column of the array.

The notation still looks a little cumbersome, so there's one more
simplification:

=head2 Arrow Rule

In between two B<subscripts>, the arrow is optional.

Instead of C<< $a[1]->[2] >>, we can write C<$a[1][2]>; it means the
same thing.  Instead of C<< $a[0]->[1] = 23 >>, we can write
C<$a[0][1] = 23>; it means the same thing.

Now it really looks like two-dimensional arrays!

You can see why the arrows are important.  Without them, we would have
had to write C<${$a[1]}[2]> instead of C<$a[1][2]>.  For
three-dimensional arrays, they let us write C<$x[2][3][5]> instead of
the unreadable C<${${$x[2]}[3]}[5]>.
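
To check that the three spellings really do reach the same element, here is
a small sketch built on the C<@a> array from the example above:

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
        );

my $rule1 = ${$a[1]}[2];    # Use Rule 1
my $rule2 = $a[1]->[2];     # Use Rule 2
my $arrow = $a[1][2];       # Arrow Rule: arrow omitted between subscripts

print "$rule1 $rule2 $arrow\n";

# Assignment works the same way with or without the arrow.
$a[0][1] = 23;
print "row 0 is now @{$a[0]}\n";
```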

=head1 Solution

Here's the answer to the problem I posed earlier, of reformatting a
file of city and country names.

    1   my %table;

    2   while (<>) {
    3    chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13	}

The program has two pieces: Lines 2-7 read the input and build a data
structure, and lines 8-13 analyze the data and print out the report.
We're going to have a hash, C<%table>, whose keys are country names,
and whose values are references to arrays of city names.  The data
structure will look like this:

           %table
        +-------+---+
        |       |   |   +-----------+--------+
        |Germany| *---->| Frankfurt | Berlin |
        |       |   |   +-----------+--------+
        +-------+---+
        |       |   |   +----------+
        |Finland| *---->| Helsinki |
        |       |   |   +----------+
        +-------+---+
        |       |   |   +---------+------------+----------+
        |  USA  | *---->| Chicago | Washington | New York |
        |       |   |   +---------+------------+----------+
        +-------+---+

We'll look at output first.  Supposing we already have this structure,
how do we print it out?

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13	}

C<%table> is an
ordinary hash, and we get a list of keys from it, sort the keys, and
loop over the keys as usual.  The only use of references is in line 10.
C<$table{$country}> looks up the key C<$country> in the hash
and gets the value, which is a reference to an array of cities in that country.
B<Use Rule 1> says that
we can recover the array by saying
C<@{$table{$country}}>.  Line 10 is just like

	@cities = @array;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>.  The C<@> tells Perl to get the entire array.
Having gotten the list of cities, we sort it, join it, and print it
out as usual.

Lines 2-7 are responsible for building the structure in the first
place.  Here they are again:

    2   while (<>) {
    3    chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

Lines 2-4 acquire a city and country name.  Line 5 looks to see if the
country is already present as a key in the hash.  If it's not, the
program uses the C<[]> notation (B<Make Rule 2>) to manufacture a new,
empty anonymous array of cities, and installs a reference to it into
the hash under the appropriate key.

Line 6 installs the city name into the appropriate array.
C<$table{$country}> now holds a reference to the array of cities seen
in that country so far.  Line 6 is exactly like

	push @array, $city;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>.  The C<push> adds a city name to the end of the
referred-to array.

There's one fine point I skipped.  Line 5 is unnecessary, and we can
get rid of it.

    2   while (<>) {
    3    chomp;
    4     my ($city, $country) = split /, /;
    5   ####  $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

If there's already an entry in C<%table> for the current C<$country>,
then nothing is different.  Line 6 will locate the value in
C<$table{$country}>, which is a reference to an array, and push
C<$city> into the array.  But
what does it do when
C<$country> holds a key, say C<Greece>, that is not yet in C<%table>?

This is Perl, so it does the exact right thing.  It sees that you want
to push C<Athens> onto an array that doesn't exist, so it helpfully
makes a new, empty, anonymous array for you, installs it into
C<%table>, and then pushes C<Athens> onto it.  This is called
`autovivification'--bringing things to life automatically.  Perl saw
that the key wasn't in the hash, so it created a new hash entry
automatically. Perl saw that you wanted to use the hash value as an
array, so it created a new empty array and installed a reference to it
in the hash automatically.  And as usual, Perl made the array one
element longer to hold the new city name.
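
If you'd like to watch autovivification happen, the following sketch runs
the shortened program (without line 5) on the sample data from the start of
this tutorial.  Reading from an in-memory list instead of C<< <> >>, and
collecting the report in C<@report>, are assumptions made so the example is
self-contained.

```perl
use strict;
use warnings;

my @input = (
    "Chicago, USA",
    "Frankfurt, Germany",
    "Berlin, Germany",
    "Washington, USA",
    "Helsinki, Finland",
    "New York, USA",
);

my %table;
for my $line (@input) {
    my ($city, $country) = split /, /, $line;
    # No "$table{$country} = []" here: the first push for a new
    # country autovivifies the array and the hash entry.
    push @{$table{$country}}, $city;
}

my @report;
for my $country (sort keys %table) {
    my @cities = @{$table{$country}};
    push @report, "$country: " . join(', ', sort @cities) . ".";
}

print "$_\n" for @report;
```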

=head1 The Rest

I promised to give you 90% of the benefit with 10% of the details, and
that means I left out 90% of the details.  Now that you have an
overview of the important parts, it should be easier to read the
L<perlref> manual page, which discusses 100% of the details.

Some of the highlights of L<perlref>:

=over 4

=item *

You can make references to anything, including scalars, functions, and
other references.

=item *

In B<Use Rule 1>, you can omit the curly brackets whenever the thing
inside them is an atomic scalar variable like C<$aref>.  For example,
C<@$aref> is the same as C<@{$aref}>, and C<$$aref[1]> is the same as
C<${$aref}[1]>.  If you're just starting out, you may want to adopt
the habit of always including the curly brackets.

=item *

This doesn't copy the underlying array:

        $aref2 = $aref1;

You get two references to the same array.  If you modify
C<< $aref1->[23] >> and then look at
C<< $aref2->[23] >> you'll see the change.

To copy the array, use

        $aref2 = [@{$aref1}];

This uses C<[...]> notation to create a new anonymous array, and
C<$aref2> is assigned a reference to the new array.  The new array is
initialized with the contents of the array referred to by C<$aref1>.

Similarly, to copy an anonymous hash, you can use

        $href2 = {%{$href1}};

=item *

To see if a variable contains a reference, use the C<ref> function.  It
returns true if its argument is a reference.  Actually it's a little
better than that: It returns C<HASH> for hash references and C<ARRAY>
for array references.

=item *

If you try to use a reference like a string, you get strings like

	ARRAY(0x80f5dec)   or    HASH(0x826afc0)

If you ever see a string that looks like this, you'll know you
printed out a reference by mistake.

A side effect of this representation is that you can use C<eq> to see
if two references refer to the same thing.  (But you should usually use
C<==> instead because it's much faster.)

=item *

You can use a string as if it were a reference.  If you use the string
C<"foo"> as an array reference, it's taken to be a reference to the
array C<@foo>.  This is called a I<soft reference> or I<symbolic
reference>.  The declaration C<use strict 'refs'> disables this
feature, which can cause all sorts of trouble if you use it by accident.

=back


You might prefer to go on to L<perllol> instead of L<perlref>; it
discusses lists of lists and multidimensional arrays in detail.  After
that, you should move on to L<perldsc>; it's a Data Structure Cookbook
that shows recipes for using and printing out arrays of hashes, hashes
of arrays, and other kinds of data.

=head1 Summary

Everyone needs compound data structures, and in Perl the way you get
them is with references.  There are four important rules for managing
references: Two for making references and two for using them.  Once
you know these rules you can do most of the important things you need
to do with references.

=head1 Credits

Author: Mark Jason Dominus, Plover Systems (C<mjd-perl-ref+ at plover.com>)

This article originally appeared in I<The Perl Journal>
( http://www.tpj.com/ ) volume 3, #2.  Reprinted with permission.

The original title was I<Understand References Today>.

=head2 Distribution Conditions

Copyright 1998 The Perl Journal.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit.  A simple comment in the code giving credit would be
courteous but is not required.


--- NEW FILE: perldiag.pod ---
=head1 NAME

perldiag - various Perl diagnostics

=head1 DESCRIPTION


These messages are classified as follows (listed in increasing order of
desperation):

    (W) A warning (optional).
    (D) A deprecation (optional).
    (S) A severe warning (default).
    (F) A fatal error (trappable).
    (P) An internal error you should never see (trappable).
    (X) A very fatal error (nontrappable).
    (A) An alien error message (not generated by Perl).

The majority of messages from the first three classifications above
(W, D & S) can be controlled using the C<warnings> pragma.
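
As a rough sketch of what "controlled" means here: a (W) warning can be
switched off lexically by category, while an (F) error can be trapped with
C<eval>.  The C<$SIG{__WARN__}> handler below exists only so the example can
count warnings; it is not required for normal use.

```perl
use strict;
use warnings;

my @seen;                                   # collected warning texts
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $undef;
my $text = "value: " . $undef;              # (W uninitialized) warning is emitted

{
    no warnings 'uninitialized';            # switch this category off lexically
    $text = "value: " . $undef;             # silent inside this block
}

# (F) errors are fatal but trappable: eval catches them and sets $@.
my $zero = 0;
my $ok = eval { my $x = 1 / $zero; 1 };

print "warnings seen: ", scalar(@seen), "\n";
print "trapped: $@" unless $ok;
```
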
[...4575 lines suppressed...]
about what you want.  Your best bet is to put a setuid C wrapper around
your script.

=item You need to quote "%s"

(W syntax) You assigned a bareword as a signal handler name.
Unfortunately, you already have a subroutine of that name declared,
which means that Perl 5 will try to call the subroutine when the
assignment is executed, which is probably not what you want.  (If it IS
what you want, put an & in front.)

=item Your random numbers are not that random

(F) When trying to initialise the random seed for hashes, Perl could
not get any randomness out of your system.  This usually indicates
Something Very Wrong.

=back
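
For the C<You need to quote "%s"> entry above, the difference between a
quoted handler name and a bareword can be sketched as follows.  The example
assumes a Unix-like system that has C<SIGUSR1>; the handler name
C<on_usr1> is invented for the illustration.

```perl
use strict;
use warnings;

my $caught = 0;
sub on_usr1 { $caught++ }

# Quoted name: Perl looks up main::on_usr1 when the signal arrives.
$SIG{USR1} = 'on_usr1';
kill 'USR1', $$;

# A code reference is an unambiguous alternative.
$SIG{USR1} = \&on_usr1;
kill 'USR1', $$;

# Writing  $SIG{USR1} = on_usr1;  would instead CALL on_usr1() right away
# and install whatever it returned as the handler.

print "handler ran $caught times\n";
```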



--- NEW FILE: perlstyle.pod ---
=head1 NAME

perlstyle - Perl style guide

=head1 DESCRIPTION


Each programmer will, of course, have his or her own preferences with
regard to formatting, but there are some general guidelines that will
make your programs easier to read, understand, and maintain.

The most important thing is to run your programs under the B<-w>
flag at all times.  You may turn it off explicitly for particular
portions of code via the C<no warnings> pragma or the C<$^W> variable
if you must.  You should also always run under C<use strict> or know the
reason why not.  The C<use sigtrap> and even C<use diagnostics> pragmas
may also prove useful.

Regarding aesthetics of code layout, about the only thing Larry
cares strongly about is that the closing curly bracket of
a multi-line BLOCK should line up with the keyword that started the construct.
Beyond that, he has other preferences that aren't so strong:

=over 4

=item *

4-column indent.

=item *

Opening curly on same line as keyword, if possible, otherwise line up.

=item *

Space before the opening curly of a multi-line BLOCK.

=item *

One-line BLOCK may be put on one line, including curlies.

=item *

No space before the semicolon.

=item *

Semicolon omitted in "short" one-line BLOCK.

=item *

Space around most operators.

=item *

Space around a "complex" subscript (inside brackets).

=item *

Blank lines between chunks that do different things.

=item *

Uncuddled elses.

=item *

No space between function name and its opening parenthesis.

=item *

Space after each comma.

=item *

Long lines broken after an operator (except C<and> and C<or>).

=item *

Space after last parenthesis matching on current line.

=item *

Line up corresponding items vertically.

=item *

Omit redundant punctuation as long as clarity doesn't suffer.

=back


Larry has his reasons for each of these things, but he doesn't claim that
everyone else's mind works the same as his does.
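
A small fragment laid out along these lines might look like the following.
It is only one reading of the preferences listed above, and the data in it
is invented for the example.

```perl
use strict;
use warnings;

my %count = (apple => 3, pear => 0, fig => 12);    # space after each comma
my @report;

foreach my $name (sort keys %count) {              # opening curly on the keyword's line
    if ($count{$name} == 0) {                      # 4-column indent
        push @report, "$name: none";
    }
    else {                                         # uncuddled else
        push @report, "$name: $count{$name}";
    }
}                                                  # closing curly lines up with "foreach"

my @big = grep { $count{$_} > 5 } keys %count;     # short one-line BLOCK, no semicolon

my $total = 0;
$total += $_ for values %count;                    # space around most operators

print "$_\n" for @report;
print "total: $total\n" if $total > 10;
```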

Here are some other more substantive style issues to think about:

=over 4

=item *

Just because you I<CAN> do something a particular way doesn't mean that
you I<SHOULD> do it that way.  Perl is designed to give you several
ways to do anything, so consider picking the most readable one.  For
instance

    open(FOO,$foo) || die "Can't open $foo: $!";

is better than

    die "Can't open $foo: $!" unless open(FOO,$foo);

because the second way hides the main point of the statement in a
modifier.  On the other hand

    print "Starting analysis\n" if $verbose;

is better than

    $verbose && print "Starting analysis\n";

because the main point isn't whether the user typed B<-v> or not.

Similarly, just because an operator lets you assume default arguments
doesn't mean that you have to make use of the defaults.  The defaults
are there for lazy systems programmers writing one-shot programs.  If
you want your program to be readable, consider supplying the argument.

Along the same lines, just because you I<CAN> omit parentheses in many
places doesn't mean that you ought to:

    return print reverse sort num values %array;
    return print(reverse(sort num (values(%array))));

When in doubt, parenthesize.  At the very least it will let some poor
schmuck bounce on the % key in B<vi>.

Even if you aren't in doubt, consider the mental welfare of the person
who has to maintain the code after you, and who will probably put
parentheses in the wrong place.

=item *

Don't go through silly contortions to exit a loop at the top or the
bottom, when Perl provides the C<last> operator so you can exit in
the middle.  Just "outdent" it a little to make it more visible:

    LINE:
	for (;;) {
	    statements;
	  last LINE if $foo;
	    next LINE if /^#/;
	}

=item *

Don't be afraid to use loop labels--they're there to enhance
readability as well as to allow multilevel loop breaks.  See the
previous example.

=item *

Avoid using C<grep()> (or C<map()>) or `backticks` in a void context, that is,
when you just throw away their return values.  Those functions all
have return values, so use them.  Otherwise use a C<foreach()> loop or
the C<system()> function instead.

=item *

For portability, when using features that may not be implemented on
every machine, test the construct in an eval to see if it fails.  If
you know what version or patchlevel a particular feature was
implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it
will be there.  The C<Config> module will also let you interrogate values
determined by the B<Configure> program when Perl was installed.

=item *

Choose mnemonic identifiers.  If you can't remember what mnemonic means,
you've got a problem.

=item *

While short identifiers like C<$gotit> are probably ok, use underscores to
separate words in longer identifiers.  It is generally easier to read
C<$var_names_like_this> than C<$VarNamesLikeThis>, especially for
non-native speakers of English. It's also a simple rule that works
consistently with C<VAR_NAMES_LIKE_THIS>.

Package names are sometimes an exception to this rule.  Perl informally
reserves lowercase module names for "pragma" modules like C<integer> and
C<strict>.  Other modules should begin with a capital letter and use mixed
case, but probably without underscores due to limitations in primitive
file systems' representations of module names as files that must fit into a
few sparse bytes.

=item *

You may find it helpful to use letter case to indicate the scope
or nature of a variable. For example:

    $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)
    $Some_Caps_Here  package-wide global/static
    $no_caps_here    function scope my() or local() variables

Function and method names seem to work best as all lowercase.
E.g., C<$obj-E<gt>as_string()>.

You can use a leading underscore to indicate that a variable or
function should not be used outside the package that defined it.

=item *

If you have a really hairy regular expression, use the C</x> modifier and
put in some whitespace to make it look a little less like line noise.
Don't use slash as a delimiter when your regexp has slashes or backslashes.

=item *

Use the new C<and> and C<or> operators to avoid having to parenthesize
list operators so much, and to reduce the incidence of punctuation
operators like C<&&> and C<||>.  Call your subroutines as if they were
functions or list operators to avoid excessive ampersands and parentheses.

=item *

Use here documents instead of repeated C<print()> statements.

=item *

Line up corresponding things vertically, especially if it'd be too long
to fit on one line anyway.

    $IDX = $ST_MTIME;
    $IDX = $ST_ATIME       if $opt_u;
    $IDX = $ST_CTIME       if $opt_c;
    $IDX = $ST_SIZE        if $opt_s;

    mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!";
    chdir($tmpdir)      or die "can't chdir $tmpdir: $!";
    mkdir 'tmp',   0777 or die "can't mkdir $tmpdir/tmp: $!";

=item *

Always check the return codes of system calls.  Good error messages should
go to C<STDERR>, include which program caused the problem, what the failed
system call and arguments were, and (VERY IMPORTANT) should contain the
standard system error message for what went wrong.  Here's a simple but
sufficient example:

    opendir(D, $dir)	 or die "can't opendir $dir: $!";

=item *

Line up your transliterations when it makes sense:

    tr [abc]
       [xyz];

=item *

Think about reusability.  Why waste brainpower on a one-shot when you
might want to do something like it again?  Consider generalizing your
code.  Consider writing a module or object class.  Consider making your
code run cleanly with C<use strict> and C<use warnings> (or B<-w>) in
effect.  Consider giving away your code.  Consider changing your whole
world view.  Consider... oh, never mind.

=item *

Try to document your code and use Pod formatting in a consistent way. Here
are commonly expected conventions:

=over 4

=item *

use C<CE<lt>E<gt>> for function, variable and module names (and more
generally anything that can be considered part of code, like filehandles
or specific values). Note that function names are considered more readable
with parentheses after their name, that is C<function()>.

=item *

use C<BE<lt>E<gt>> for command names like B<cat> or B<grep>.

=item *

use C<FE<lt>E<gt>> or C<CE<lt>E<gt>> for file names. C<FE<lt>E<gt>> should
be the only Pod code for file names, but as most Pod formatters render it
as italic, Unix and Windows paths with their slashes and backslashes may
be less readable, and better rendered with C<CE<lt>E<gt>>.

=back


=item *

Be consistent.

=item *

Be nice.

=back
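
Several of the guidelines above can be seen together in one short program.
This is a sketch under assumptions: the input lines, the C<%setting> hash
and the report layout are invented for the example.

```perl
use strict;
use warnings;

my @lines = ("# comment", "name = camel", "humps=2", "__END__", "ignored = yes");
my %setting;

LINE:
for my $line (@lines) {
  last LINE if $line eq '__END__';         # "outdented" exit in the middle
    next LINE if $line =~ /^#/;

    # A hairy regex reads better with /x and some whitespace.
    if ($line =~ m{ \A \s* (\w+) \s* = \s* (.*?) \s* \z }x) {
        $setting{$1} = $2;
    }
}

# One here-document instead of repeated print() statements.
print <<"END_REPORT";
name:  $setting{name}
humps: $setting{humps}
END_REPORT

# Check system calls and include $! in the message.
my $dir = '.';
opendir(my $dh, $dir) or die "$0: can't opendir $dir: $!";
closedir($dh);

# Feature test in an eval, for portability: symlink() may be unimplemented.
my $has_symlink = eval { symlink('', ''); 1 } ? 'yes' : 'no';
print "symlink available: $has_symlink\n";
```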


--- NEW FILE: perlfork.pod ---
=head1 NAME

perlfork - Perl's fork() emulation

=head1 SYNOPSIS


    NOTE:  As of the 5.8.0 release, fork() emulation has considerably
    matured.  However, there are still a few known bugs and differences
    from real fork() that might affect you.  See the "BUGS" and
    "CAVEATS AND LIMITATIONS" sections below.

Perl provides a fork() keyword that corresponds to the Unix system call
of the same name.  On most Unix-like platforms where the fork() system
call is available, Perl's fork() simply calls it.

On some platforms such as Windows where the fork() system call is not
available, Perl can be built to emulate fork() at the interpreter level.
While the emulation is designed to be as compatible as possible with the
real fork() at the level of the Perl program, there are certain
important differences that stem from the fact that all the pseudo child
"processes" created this way live in the same real process as far as the
operating system is concerned.

This document provides a general overview of the capabilities and
limitations of the fork() emulation.  Note that the issues discussed here
are not applicable to platforms where a real fork() is available and Perl
has been configured to use it.

=head1 DESCRIPTION


The fork() emulation is implemented at the level of the Perl interpreter.
What this means in general is that running fork() will actually clone the
running interpreter and all its state, and run the cloned interpreter in
a separate thread, beginning execution in the new thread just after the
point where the fork() was called in the parent.  We will refer to the
thread that implements this child "process" as the pseudo-process.

To the Perl program that called fork(), all this is designed to be
transparent.  The parent returns from the fork() with a pseudo-process
ID that can be subsequently used in any process manipulation functions;
the child returns from the fork() with a value of C<0> to signify that
it is the child pseudo-process.
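
From the program's point of view this looks the same whether fork() is real
or emulated.  A minimal sketch (the exit status of 7 is arbitrary, chosen
only so it can be recognised in the parent):

```perl
use strict;
use warnings;

my $pid = fork();
die "fork() failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child (or pseudo-child under the emulation): fork() returned 0.
    exit(7);
}

# Parent: fork() returned the ID of the child, usable with waitpid().
my $reaped = waitpid($pid, 0);
my $status = $? >> 8;
print "child $reaped exited with status $status\n";
```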

=head2 Behavior of other Perl features in forked pseudo-processes

Most Perl features behave in a natural way within pseudo-processes.

=over 8

=item $$ or $PROCESS_ID

This special variable is correctly set to the pseudo-process ID.
It can be used to identify pseudo-processes within a particular
session.  Note that this value is subject to recycling if any
pseudo-processes are launched after others have been wait()-ed on.

=item %ENV

Each pseudo-process maintains its own virtual environment.  Modifications
to %ENV affect the virtual environment, and are only visible within that
pseudo-process, and in any processes (or pseudo-processes) launched from
it.

=item chdir() and all other builtins that accept filenames

Each pseudo-process maintains its own virtual idea of the current directory.
Modifications to the current directory using chdir() are only visible within
that pseudo-process, and in any processes (or pseudo-processes) launched from
it.  All file and directory accesses from the pseudo-process will correctly
map the virtual working directory to the real working directory appropriately.

=item wait() and waitpid()

wait() and waitpid() can be passed a pseudo-process ID returned by fork().
These calls will properly wait for the termination of the pseudo-process
and return its status.

=item kill()

kill() can be used to terminate a pseudo-process by passing it the ID returned
by fork().  This should not be used except under dire circumstances, because
the operating system may not guarantee integrity of the process resources
when a running thread is terminated.  Note that using kill() on a
pseudo-process may cause memory leaks, because the thread that
implements the pseudo-process does not get a chance to clean up its resources.

=item exec()

Calling exec() within a pseudo-process actually spawns the requested
executable in a separate process and waits for it to complete before
exiting with the same exit status as that process.  This means that the
process ID reported within the running executable will be different from
what the earlier Perl fork() might have returned.  Similarly, any process
manipulation functions applied to the ID returned by fork() will affect the
waiting pseudo-process that called exec(), not the real process it is
waiting for after the exec().

=item exit()

exit() always exits just the executing pseudo-process, after automatically
wait()-ing for any outstanding child pseudo-processes.  Note that this means
that the process as a whole will not exit unless all running pseudo-processes
have exited.

=item Open handles to files, directories and network sockets

All open handles are dup()-ed in pseudo-processes, so that closing
any handles in one process does not affect the others.  See below for
some limitations.

=back
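
The isolation of C<%ENV> and the current directory described above can be
checked with a short program.  This is a sketch, not part of the original
document; C<DEMO_SETTING> is a made-up environment variable name.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

$ENV{DEMO_SETTING} = 'parent';
my $start_dir = getcwd();

my $pid = fork();
die "fork() failed: $!" unless defined $pid;

if ($pid == 0) {
    # Changes made here stay inside the child (pseudo-)process.
    $ENV{DEMO_SETTING} = 'child';
    chdir '/' or die "can't chdir /: $!";
    exit(0);
}

waitpid($pid, 0);
print "parent still sees DEMO_SETTING=$ENV{DEMO_SETTING}\n";
print "parent is still in its original directory\n" if getcwd() eq $start_dir;
```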


=head2 Resource limits

In the eyes of the operating system, pseudo-processes created via the fork()
emulation are simply threads in the same process.  This means that any
process-level limits imposed by the operating system apply to all
pseudo-processes taken together.  This includes any limits imposed by the
operating system on the number of open file, directory and socket handles,
limits on disk space usage, limits on memory size, limits on CPU utilization
etc.

=head2 Killing the parent process

If the parent process is killed (either using Perl's kill() builtin, or
using some external means) all the pseudo-processes are killed as well,
and the whole process exits.

=head2 Lifetime of the parent process and pseudo-processes

During the normal course of events, the parent process and every
pseudo-process started by it will wait for their respective pseudo-children
to complete before they exit.  This means that the parent and every
pseudo-child created by it that is also a pseudo-parent will only exit
after their pseudo-children have exited.

A way to mark pseudo-processes as running detached from their parent (so
that the parent would not have to wait() for them if it doesn't want to)
will be provided in the future.

=head1 CAVEATS AND LIMITATIONS


=over 8

=item BEGIN blocks

The fork() emulation will not work entirely correctly when called from
within a BEGIN block.  The forked copy will run the contents of the
BEGIN block, but will not continue parsing the source stream after the
BEGIN block.  For example, consider the following code:

    BEGIN {
        fork and exit;          # fork child and exit the parent
        print "inner\n";
    }
    print "outer\n";

This will print:

    inner

rather than the expected:

    inner
    outer

This limitation arises from fundamental technical difficulties in
cloning and restarting the stacks used by the Perl parser in the
middle of a parse.

=item Open filehandles

Any filehandles open at the time of the fork() will be dup()-ed.  Thus,
the files can be closed independently in the parent and child, but beware
that the dup()-ed handles will still share the same seek pointer.  Changing
the seek position in the parent will change it in the child and vice-versa.
One can avoid this by opening files that need distinct seek pointers
separately in the child.

=item Forking pipe open() not yet implemented

The C<open(FOO, "|-")> and C<open(BAR, "-|")> constructs are not yet
implemented.  This limitation can be easily worked around in new code
by creating a pipe explicitly.  The following example shows how to
write to a forked child:

    # simulate open(FOO, "|-")
    # (the handle is passed by NAME, so this relies on symbolic handles)
    sub pipe_to_fork ($) {
        my $parent = shift;
        pipe my $child, $parent or die;
        my $pid = fork();
        die "fork() failed: $!" unless defined $pid;
        if ($pid) {
            close $child;
        }
        else {
            close $parent;
            open(STDIN, "<&=" . fileno($child)) or die;
        }
        $pid;
    }

    if (pipe_to_fork('FOO')) {
        # parent
        print FOO "pipe_to_fork\n";
        close FOO;
    }
    else {
        # child
        while (<STDIN>) { print; }
        exit(0);
    }

And this one reads from the child:

    # simulate open(FOO, "-|")
    sub pipe_from_fork ($) {
        my $parent = shift;
        pipe $parent, my $child or die;
        my $pid = fork();
        die "fork() failed: $!" unless defined $pid;
        if ($pid) {
            close $child;
        }
        else {
            close $parent;
            open(STDOUT, ">&=" . fileno($child)) or die;
        }
        $pid;
    }

    if (pipe_from_fork('BAR')) {
        # parent
        while (<BAR>) { print; }
        close BAR;
    }
    else {
        # child
        print "pipe_from_fork\n";
        exit(0);
    }

Forking pipe open() constructs will be supported in future.

=item Global state maintained by XSUBs 

External subroutines (XSUBs) that maintain their own global state may
not work correctly.  Such XSUBs will either need to maintain locks to
protect simultaneous access to global data from different pseudo-processes,
or maintain all their state on the Perl symbol table, which is copied
naturally when fork() is called.  A callback mechanism that provides
extensions an opportunity to clone their state will be provided in the
near future.

=item Interpreter embedded in larger application

The fork() emulation may not behave as expected when it is executed in an
application which embeds a Perl interpreter and calls Perl APIs that can
evaluate bits of Perl code.  This stems from the fact that the emulation
only has knowledge about the Perl interpreter's own data structures and
knows nothing about the containing application's state.  For example, any
state carried on the application's own call stack is out of reach.

=item Thread-safety of extensions

Since the fork() emulation runs code in multiple threads, extensions
calling into non-thread-safe libraries may not work reliably when
calling fork().  As Perl's threading support gradually becomes more
widely adopted even on platforms with a native fork(), such extensions
are expected to be fixed for thread-safety.

=back
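
The shared seek pointer mentioned under "Open filehandles" above can be
observed directly.  The sketch below uses unbuffered C<sysread> and
C<sysseek> so that PerlIO buffering does not hide the effect; the scratch
file and its contents are invented for the demonstration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Scratch file holding ten bytes.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "0123456789";
close $out or die "can't close $path: $!";

open(my $fh, '<', $path) or die "can't open $path: $!";

my $pid = fork();
die "fork() failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: an unbuffered read advances the shared file position.
    sysread($fh, my $buf, 5);
    exit(0);
}

waitpid($pid, 0);

# Parent: it never read anything, yet its position has moved.
my $pos = sysseek($fh, 0, SEEK_CUR);
print "parent position after child read: $pos\n";
```

Opening the file again in the child, as suggested above, gives the child
its own seek pointer.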


=head1 BUGS

=over 8

=item *

Having pseudo-process IDs be negative integers breaks down for the integer
C<-1> because the wait() and waitpid() functions treat this number as
being special.  The tacit assumption in the current implementation is that
the system never allocates a thread ID of C<1> for user threads.  A better
representation for pseudo-process IDs will be implemented in future.

=item *

In certain cases, the OS-level handles created by the pipe(), socket(),
and accept() operators are apparently not duplicated accurately in
pseudo-processes.  This only happens in some situations, but where it
does happen, it may result in deadlocks between the read and write ends
of pipe handles, or inability to send or receive data across socket
handles.

=item *

This document may be incomplete in some respects.

=back


=head1 AUTHOR

Support for concurrent interpreters and the fork() emulation was implemented
by ActiveState, with funding from Microsoft Corporation.

This document is authored and maintained by Gurusamy Sarathy
E<lt>gsar at activestate.comE<gt>.

=head1 SEE ALSO

L<perlfunc/"fork">, L<perlipc>


--- NEW FILE: perlfaq2.pod ---
=head1 NAME

perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.2 $, $Date: 2006-12-04 17:01:32 $)

=head1 DESCRIPTION


This section of the FAQ answers questions about where to find
source and documentation for Perl, support, and
related matters.

=head2 What machines support perl?  Where do I get it?

The standard release of perl (the one maintained by the perl
development team) is distributed only in source code form.  You
can find this at http://www.cpan.org/src/latest.tar.gz , which
is in a standard Internet format (a gzipped archive in POSIX tar format).

Perl builds and runs on a bewildering number of platforms.  Virtually
all known and current Unix derivatives are supported (perl's native
platform), as are other systems like VMS, DOS, OS/2, Windows,
QNX, BeOS, OS X, MPE/iX and the Amiga.

Binary distributions for some proprietary platforms, including
Apple systems, can be found in the http://www.cpan.org/ports/ directory.
Because these are not part of the standard distribution, they may
and in fact do differ from the base perl port in a variety of ways.
You'll have to check their respective release notes to see just
what the differences are.  These differences can be either positive
(e.g. extensions for the features of the particular platform that
are not supported in the source release of perl) or negative (e.g.
might be based upon a less current source release of perl).

=head2 How can I get a binary version of perl?

If you don't have a C compiler because your vendor for whatever
reasons did not include one with your system, the best thing to do is
grab a binary version of gcc from the net and use that to compile perl
with.  CPAN only has binaries for systems that are terribly hard to
get free compilers for, not for Unix systems.

Some URLs that might help you are:


Someone looking for a perl for Win16 might look to Laszlo Molnar's djgpp
port in http://www.cpan.org/ports/#msdos , which comes with clear
installation instructions.  A simple installation guide for MS-DOS using
Ilya Zakharevich's OS/2 port is available at
and similarly for Windows 3.1 at http://www.cs.ruu.nl/%7Epiet/perlwin3.html .

=head2 I don't have a C compiler. How can I build my own Perl interpreter?

Since you don't have a C compiler, you're doomed and your vendor
should be sacrificed to the Sun gods.  But that doesn't help you.

What you need to do is get a binary version of gcc for your system
first.  Consult the Usenet FAQs for your operating system for
information on where to get such a binary version.

=head2 I copied the perl binary from one machine to another, but scripts don't work.

That's probably because you forgot libraries, or library paths differ.
You really should build the whole distribution on the machine it will
eventually live on, and then type C<make install>.  Most other
approaches are doomed to failure.

One simple way to check that things are in the right place is to print out
the hard-coded @INC that perl looks through for libraries:

    % perl -le 'print for @INC'

If this command lists any paths that don't exist on your system, then you
may need to move the appropriate libraries to these locations, or create
symbolic links, aliases, or shortcuts appropriately.  @INC is also printed as
part of the output of

    % perl -V

You might also want to check out
L<perlfaq8/"How do I keep my own module/library directory?">.
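
If you would rather have perl do the checking, something along these lines
reports which of the directories are missing.  It is a small sketch, not a
standard tool; the C<@present> and C<@missing> names are just for the
example.

```perl
use strict;
use warnings;

# Report which directories in the compiled-in search path are missing.
my (@present, @missing);
for my $dir (@INC) {
    next if ref $dir;                  # skip code-ref hooks some modules install
    if (-d $dir) { push @present, $dir }
    else         { push @missing, $dir }
}

print "found:   $_\n" for @present;
print "MISSING: $_\n" for @missing;
```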

=head2 I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed.  How do I make it work?

Read the F<INSTALL> file, which is part of the source distribution.
It describes in detail how to cope with most idiosyncrasies that the
Configure script can't work around for any given system or
architecture.

=head2 What modules and extensions are available for Perl?  What is CPAN?  What does CPAN/src/... mean?

CPAN stands for Comprehensive Perl Archive Network, a ~1.2Gb archive
replicated on nearly 200 machines all over the world.  CPAN contains
source code, non-native ports, documentation, scripts, and many
third-party modules and extensions, designed for everything from
commercial database interfaces to keyboard/screen control to web
walking and CGI scripts.  The master web site for CPAN is
http://www.cpan.org/ and there is the CPAN Multiplexer at
http://www.cpan.org/CPAN.html which will choose a mirror near you
via DNS.  See http://www.perl.com/CPAN (without a slash at the
end) for how this process works. Also, http://mirror.cpan.org/
has a nice interface to the http://www.cpan.org/MIRRORED.BY
mirror directory.

See the CPAN FAQ at http://www.cpan.org/misc/cpan-faq.html for
answers to the most frequently asked questions about CPAN
including how to become a mirror.

CPAN/path/... is a naming convention for files available on CPAN
sites.  CPAN indicates the base directory of a CPAN mirror, and the
rest of the path is the path from that directory to the file.  For
instance, if you're using ftp://ftp.funet.fi/pub/languages/perl/CPAN
as your CPAN site, the file CPAN/misc/japh is downloadable as
ftp://ftp.funet.fi/pub/languages/perl/CPAN/misc/japh .
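
The convention amounts to simple string substitution, as this sketch shows.
The C<cpan_url()> helper is hypothetical and not part of any module; the
mirror is the one used in the paragraph above.

```perl
use strict;
use warnings;

# Turn a "CPAN/..." style name into a URL on a chosen mirror.
# cpan_url() is a hypothetical helper written for this example.
sub cpan_url {
    my ($mirror, $name) = @_;
    $mirror =~ s{/+\z}{};                    # tolerate a trailing slash
    (my $path = $name) =~ s{\ACPAN/}{};      # "CPAN" stands for the mirror's base
    return "$mirror/$path";
}

my $mirror = 'ftp://ftp.funet.fi/pub/languages/perl/CPAN';
print cpan_url($mirror, 'CPAN/misc/japh'), "\n";
```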

Considering that there are close to two thousand existing modules in
the archive, one probably exists to do nearly anything you can think of.
Current categories under CPAN/modules/by-category/ include Perl core
modules; development support; operating system interfaces; networking,
devices, and interprocess communication; data type utilities; database
interfaces; user interfaces; interfaces to other languages; filenames,
file systems, and file locking; internationalization and locale; world
wide web support; server and daemon utilities; archiving and
compression; image manipulation; mail and news; control flow
utilities; filehandle and I/O; Microsoft Windows modules; and
miscellaneous modules.

See http://www.cpan.org/modules/00modlist.long.html or
http://search.cpan.org/ for a more complete list of modules by category.

CPAN is not affiliated with O'Reilly Media.

=head2 Is there an ISO or ANSI certified version of Perl?

Certainly not.  Larry expects that he'll be certified before Perl is.

=head2 Where can I get information on Perl?

The complete Perl documentation is available with the Perl distribution.
If you have Perl installed locally, you probably have the documentation
installed as well: type C<man perl> if you're on a system resembling Unix.
This will lead you to other important man pages, including how to set your
$MANPATH.  If you're not on a Unix system, access to the documentation
will be different; for example, documentation might only be in HTML format.  All
proper perl installations have fully-accessible documentation.

You might also try C<perldoc perl> in case your system doesn't
have a proper man command, or it's been misinstalled.  If that doesn't
work, try looking in /usr/local/lib/perl5/pod for documentation.

If all else fails, consult http://perldoc.perl.org/ which has the
complete documentation in HTML and PDF format.

Many good books have been written about Perl--see the section below
for more details.

Tutorial documents included in current or upcoming Perl releases
include L<perltoot> for objects or L<perlboot> for a beginner's
approach to objects, L<perlopentut> for file opening semantics,
L<perlreftut> for managing references, L<perlretut> for regular
expressions, L<perlthrtut> for threads, L<perldebtut> for debugging,
and L<perlxstut> for linking C and Perl together.  There may be more
by the time you read this.  These URLs might also be useful:


=head2 What are the Perl newsgroups on Usenet?  Where do I post questions?

Several groups devoted to the Perl language are on Usenet:

    comp.lang.perl.announce             Moderated announcement group
    comp.lang.perl.misc                 High traffic general Perl discussion
    comp.lang.perl.moderated            Moderated discussion group
    comp.lang.perl.modules              Use and development of Perl modules
    comp.lang.perl.tk                   Using Tk (and X) from Perl

    comp.infosystems.www.authoring.cgi  Writing CGI scripts for the Web.

Some years ago, comp.lang.perl was divided into those groups, and
comp.lang.perl itself officially removed.  While that group may still
be found on some news servers, it is unwise to use it, because
postings there will not appear on news servers which honour the
official list of group names.  Use comp.lang.perl.misc for topics
which do not have a more-appropriate specific group.

There is also a Usenet gateway to Perl mailing lists sponsored by
perl.org at nntp://nntp.perl.org , a web interface to the same lists
at http://nntp.perl.org/group/ and these lists are also available
under the C<perl.*> hierarchy at http://groups.google.com . Other
groups are listed at http://lists.perl.org/ ( also known as
http://lists.cpan.org/ ).

A nice place to ask questions is the PerlMonks site,
http://www.perlmonks.org/ , or the Perl Beginners mailing list
http://lists.perl.org/showlist.cgi?name=beginners .

Note that none of the above are supposed to write your code for you:
asking questions about particular problems or general advice is fine,
but asking someone to write your code for free is not very cool.

=head2 Where should I post source code?

You should post source code to whichever group is most appropriate, but
feel free to cross-post to comp.lang.perl.misc.  If you want to cross-post
to alt.sources, please make sure it follows their posting standards,
including setting the Followup-To header line to NOT include alt.sources;
see their FAQ ( http://www.faqs.org/faqs/alt-sources-intro/ ) for details.

If you're just looking for software, first use Google
( http://www.google.com ), Google's Usenet search interface
( http://groups.google.com ), and CPAN Search ( http://search.cpan.org ).
This is faster and more productive than just posting a request.

=head2 Perl Books

A number of books on Perl and/or CGI programming are available.  A few
of these are good, some are OK, but many aren't worth your money.
There is a list of these books, some with extensive reviews, at
http://books.perl.org/ . If you don't see your book listed here, you
can write to perlfaq-workers at perl.org .

The incontestably definitive reference book on Perl, written by
the creator of Perl, is Programming Perl:

	Programming Perl (the "Camel Book"):
	by Larry Wall, Tom Christiansen, and Jon Orwant
	ISBN 0-596-00027-8  [3rd edition July 2000]
	(English, translations to several languages are also available)

The companion volume to the Camel containing thousands
of real-world examples, mini-tutorials, and complete programs is:

	The Perl Cookbook (the "Ram Book"):
	by Tom Christiansen and Nathan Torkington,
	    with Foreword by Larry Wall
	ISBN 0-596-00313-7 [2nd Edition August 2003]

If you're already a seasoned programmer, then the Camel Book might
suffice for you to learn Perl.  If you're not, check out the
Llama book:

	Learning Perl
	by Randal L. Schwartz, Tom Phoenix, and brian d foy
	ISBN 0-596-10105-8 [4th edition July 2005]

And for more advanced information on writing larger programs,
presented in the same style as the Llama book, continue your education
with the Alpaca book:

	Learning Perl Objects, References, and Modules (the "Alpaca Book")
	by Randal L. Schwartz, with Tom Phoenix (foreword by Damian Conway)
	ISBN 0-596-00478-8 [1st edition June 2003]

If you're not an accidental programmer, but a more serious and
possibly even degreed computer scientist who doesn't need as much
hand-holding as we try to provide in the Llama, please check out the
delightful book

	Perl: The Programmer's Companion
	by Nigel Chapman
	ISBN 0-471-97563-X [1997, 3rd printing Spring 1998]
	http://www.wiley.com/compbooks/chapman/perl/perltpc.html (errata etc)

If you are more at home in Windows, the following is available
(though unfortunately rather dated):

	Learning Perl on Win32 Systems (the "Gecko Book")
	by Randal L. Schwartz, Erik Olson, and Tom Christiansen,
	    with foreword by Larry Wall
	ISBN 1-56592-324-3 [1st edition August 1997]

Addison-Wesley ( http://www.awlonline.com/ ) and Manning
( http://www.manning.com/ ) are also publishers of some fine Perl books
such as I<Object Oriented Programming with Perl> by Damian Conway and
I<Network Programming with Perl> by Lincoln Stein.

An excellent technical book discounter is Bookpool at
http://www.bookpool.com/ where a 30% discount or more is not unusual.

What follows is a list of the books that the FAQ authors found personally
useful.  Your mileage may (but, we hope, probably won't) vary.

Recommended books on (or mostly on) Perl follow.

=over 4

=item References

	Programming Perl
	by Larry Wall, Tom Christiansen, and Jon Orwant
	ISBN 0-596-00027-8 [3rd edition July 2000]

	Perl 5 Pocket Reference
	by Johan Vromans
	ISBN 0-596-00032-4 [3rd edition May 2000]

=item Tutorials

	Beginning Perl
	by James Lee
	ISBN 1-59059-391-X [2nd edition August 2004]

	Elements of Programming with Perl
	by Andrew L. Johnson
	ISBN 1-884777-80-5 [1st edition October 1999]

	Learning Perl
	by Randal L. Schwartz, Tom Phoenix, and brian d foy
	ISBN 0-596-10105-8 [4th edition July 2005]

	Learning Perl Objects, References, and Modules
	by Randal L. Schwartz, with Tom Phoenix (foreword by Damian Conway)
	ISBN 0-596-00478-8 [1st edition June 2003]

=item Task-Oriented

	Writing Perl Modules for CPAN
	by Sam Tregar
	ISBN 1-59059-018-X [1st edition Aug 2002]

	The Perl Cookbook
	by Tom Christiansen and Nathan Torkington
	    with foreword by Larry Wall
	ISBN 1-56592-243-3 [1st edition August 1998]

	Effective Perl Programming
	by Joseph Hall
	ISBN 0-201-41975-0 [1st edition 1998]

	Real World SQL Server Administration with Perl
	by Linchi Shea
	ISBN 1-59059-097-X [1st edition July 2003]

=item Special Topics

	Perl Best Practices
	by Damian Conway
	ISBN 0-596-00173-8 [1st edition July 2005]

	Higher Order Perl
	by Mark-Jason Dominus
	ISBN 1-55860-701-3 [1st edition March 2005]

	Perl 6 Now: The Core Ideas Illustrated with Perl 5
	by Scott Walters
	ISBN 1-59059-395-2 [1st edition December 2004]

	Mastering Regular Expressions
	by Jeffrey E. F. Friedl
	ISBN 0-596-00289-0 [2nd edition July 2002]

	Network Programming with Perl
	by Lincoln Stein
	ISBN 0-201-61571-1 [1st edition 2001]

	Object Oriented Perl
	by Damian Conway
	    with foreword by Randal L. Schwartz
	ISBN 1-884777-79-1 [1st edition August 1999]

	Data Munging with Perl
	by Dave Cross
	ISBN 1-930110-00-6 [1st edition 2001]

	Mastering Perl/Tk
	by Steve Lidie and Nancy Walsh
	ISBN 1-56592-716-8 [1st edition January 2002]

	Extending and Embedding Perl
	by Tim Jenness and Simon Cozens
	ISBN 1-930110-82-0 [1st edition August 2002]

	Perl Debugger Pocket Reference
	by Richard Foley
	ISBN 0-596-00503-2 [1st edition January 2004]

=back


=head2 Which magazines have Perl content?

The first (and for a long time, only) periodical devoted to All Things Perl,
I<The Perl Journal> contains tutorials, demonstrations, case studies,
announcements, contests, and much more.  I<TPJ> has columns on web
development, databases, Win32 Perl, graphical programming, regular
expressions, and networking, and sponsors the Obfuscated Perl Contest
and the Perl Poetry Contests.  Beginning in November 2002, TPJ moved to a
reader-supported monthly e-zine format in which subscribers can download
issues as PDF documents. For more details on TPJ, see http://www.tpj.com/

Beyond this, magazines that frequently carry quality articles on
Perl are I<The Perl Review> ( http://www.theperlreview.com ),
I<Unix Review> ( http://www.unixreview.com/ ),
I<Linux Magazine> ( http://www.linuxmagazine.com/ ),
and Usenix's newsletter/magazine to its members, I<login:>
( http://www.usenix.org/ )

The Perl columns of Randal L. Schwartz are available on the web at
http://www.stonehenge.com/merlyn/WebTechniques/ ,
http://www.stonehenge.com/merlyn/UnixReview/ , and
http://www.stonehenge.com/merlyn/LinuxMag/ .

=head2 What mailing lists are there for Perl?

Most of the major modules (Tk, CGI, libwww-perl) have their own
mailing lists.  Consult the documentation that came with the module for
subscription information.

A comprehensive list of Perl related mailing lists can be found at:


=head2 Where are the archives for comp.lang.perl.misc?

The Google search engine now carries archived and searchable newsgroup
content.


If you have a question, you can be sure someone has already asked the
same question at some point on c.l.p.m. It requires some time and patience
to sift through all the content, but often you will find the answer you
seek.

=head2 Where can I buy a commercial version of perl?

In a real sense, perl already I<is> commercial software: it has a license
that you can grab and carefully read to your manager. It is distributed
in releases and comes in well-defined packages. There is a very large
user community and an extensive literature.  The comp.lang.perl.*
newsgroups and several of the mailing lists provide free answers to your
questions in near real-time.  Perl has traditionally been supported by
Larry, scores of software designers and developers, and myriad
programmers, all working for free to create a useful thing to make life
better for everyone.

However, these answers may not suffice for managers who require a
purchase order from a company whom they can sue should anything go awry.
Or maybe they need very serious hand-holding and contractual obligations.
Shrink-wrapped CDs with perl on them are available from several sources if
that will help.  For example, many Perl books include a distribution of perl,
as do the O'Reilly Perl Resource Kits (in both the Unix flavor
and in the proprietary Microsoft flavor); the free Unix distributions
also all come with perl.

=head2 Where do I send bug reports?

If you are reporting a bug in the perl interpreter or the modules
shipped with Perl, use the I<perlbug> program in the Perl distribution or
mail your report to perlbug at perl.org or at http://rt.perl.org/perlbug/ .

For Perl modules, you can submit bug reports to the Request Tracker set
up at http://rt.cpan.org .

If you are posting a bug with a non-standard port (see the answer to
"What platforms is perl available for?"), a binary distribution, or a
non-standard module (such as Tk, CGI, etc.), then please see the
documentation that came with it to determine the correct place to post
bug reports.

Read the perlbug(1) man page (perl5.004 or later) for more information.

=head2 What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org?

Perl.com at http://www.perl.com/ is part of the O'Reilly Network, a
subsidiary of O'Reilly Media.

The Perl Foundation is an advocacy organization for the Perl language
which maintains the web site http://www.perl.org/ as a general
advocacy site for the Perl language. It uses the domain to provide
general support services to the Perl community, including the hosting
of mailing lists, web sites, and other services.  There are also many
other sub-domains for special topics.


Perl Mongers uses the pm.org domain for services related to Perl user
groups, including the hosting of mailing lists and web sites.  See the
Perl user group web site at http://www.pm.org/ for more information about
joining, starting, or requesting services for a Perl user group.

http://www.cpan.org/ is the Comprehensive Perl Archive Network,
a replicated worldwide repository of Perl software; see
the I<What is CPAN?> question earlier in this document.


Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public
domain.  You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit.  A simple comment in the code giving credit to the FAQ would
be courteous but is not required.

--- NEW FILE: perlrun.pod ---
=head1 NAME

perlrun - how to execute the Perl interpreter

=head1 SYNOPSIS


B<perl>	S<[ B<-sTtuUWX> ]>
	S<[ B<-hv> ] [ B<-V>[:I<configvar>] ]>
	S<[ B<-cw> ] [ B<-d>[B<t>][:I<debugger>] ] [ B<-D>[I<number/list>] ]>
	S<[ B<-pna> ] [ B<-F>I<pattern> ] [ B<-l>[I<octal>] ] [ B<-0>[I<octal/hexadecimal>] ]>
	S<[ B<-I>I<dir> ] [ B<-m>[B<->]I<module> ] [ B<-M>[B<->]I<'module...'> ] [ B<-f> ]>
	S<[ B<-C [I<number/list>] >]>
	S<[ B<-P> ]>
	S<[ B<-S> ]>
	S<[ B<-x>[I<dir>] ]>
	S<[ B<-i>[I<extension>] ]>
	S<[ B<-e> I<'command'> ] [ B<--> ] [ I<programfile> ] [ I<argument> ]...>

[...1281 lines suppressed...]

=item SYS$LOGIN (specific to the VMS port)

Used if chdir has no argument and HOME and LOGDIR are not set.


Perl also has environment variables that control how Perl handles data
specific to particular natural languages.  See L<perllocale>.

Apart from these, Perl uses no other environment variables, except
to make them available to the program being executed, and to child
processes.  However, programs running setuid would do well to execute
the following lines before doing anything else, just to keep people
honest:

    $ENV{PATH}  = '/bin:/usr/bin';    # or whatever you need
    $ENV{SHELL} = '/bin/sh' if exists $ENV{SHELL};
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

--- NEW FILE: perl583delta.pod ---
=head1 NAME

perl583delta - what is new for perl v5.8.3

=head1 DESCRIPTION


This document describes differences between the 5.8.2 release and
the 5.8.3 release.

If you are upgrading from an earlier release such as 5.6.1, first read
L<perl58delta>, which describes differences between 5.6.0 and
5.8.0, and then L<perl581delta> and L<perl582delta>, which describe
differences between 5.8.0, 5.8.1 and 5.8.2.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.2.

=head1 Core Enhancements

A C<SCALAR> method is now available for tied hashes. This is called when
a tied hash is used in scalar context, such as

    if (%tied_hash) {

The old behaviour was that %tied_hash would return whatever would have been
returned for that hash before the hash was tied (so usually 0). The new
behaviour in the absence of a SCALAR method is to return TRUE if in the
middle of an C<each> iteration, and otherwise call FIRSTKEY to check if the
hash is empty (making sure that a subsequent C<each> will also begin by
calling FIRSTKEY). Please see L<perltie/SCALAR> for the full details and
caveats.
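
As a rough illustration of the C<SCALAR> method described above, the sketch
below ties a hash to a small class that reports its key count in scalar
context.  The class name C<KeyCountHash> is hypothetical and the class
simply inherits storage from C<Tie::StdHash>; treat it as one possible
arrangement, not as the recommended implementation.

```perl
use strict;
use warnings;

package KeyCountHash;    # hypothetical class name, for illustration only
require Tie::Hash;
our @ISA = ('Tie::StdHash');

# Called when the tied hash is used in scalar (including boolean) context.
sub SCALAR {
    my ($self) = @_;
    return scalar keys %$self;
}

package main;

tie my %tied_hash, 'KeyCountHash';
my $empty_result = scalar %tied_hash;     # SCALAR on an empty hash
$tied_hash{camel} = 'book';
my $filled_result = scalar %tied_hash;    # SCALAR after one store

print "empty: $empty_result, after one store: $filled_result\n";
if (%tied_hash) { print "tied hash is not empty\n" }
```

Returning the key count is only one choice; any value whose truth reflects
whether the hash holds data would serve the C<if (%tied_hash)> test.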

=head1 Modules and Pragmata

=over 4

=item CGI

=item Cwd

=item Digest

=item Digest::MD5

=item Encode

=item File::Spec

=item FindBin

A function C<again> is provided to resolve problems where modules in different
directories wish to use FindBin.

=item List::Util

You can now weaken references to read only values.

=item Math::BigInt

=item PodParser

=item Pod::Perldoc

=item POSIX

=item Unicode::Collate

=item Unicode::Normalize

=item Test::Harness

=item threads::shared

C<cond_wait> has a new two argument form. C<cond_timedwait> has been added.

=back
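
The C<again> function mentioned under FindBin above can be called as in the
following minimal sketch.  The F<lib> subdirectory is an assumption made for
the example, not something FindBin requires.

```perl
use strict;
use warnings;
use FindBin ();

# Ask FindBin to work out the script location afresh, in case an earlier
# module loaded FindBin when a different script was being run.
FindBin::again();

# Hypothetical layout: private modules live in a lib/ directory next to
# the script.
my $lib_dir = "$FindBin::Bin/lib";
print "script directory: $FindBin::Bin\n";
print "private library directory would be: $lib_dir\n";
```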


=head1 Utility Changes

C<find2perl> now assumes C<-print> as a default action. Previously, it
needed to be specified explicitly.

A new utility, C<prove>, makes it easy to run an individual regression test
at the command line. C<prove> is part of Test::Harness, which users of earlier
Perl versions can install from CPAN.

=head1 New Documentation

The documentation has been revised in places to produce more standard manpages.

The documentation for the special code blocks (BEGIN, CHECK, INIT, END)
has been improved.

=head1 Installation and Configuration Improvements

Perl now builds on OpenVMS I64.

=head1 Selected Bug Fixes

Using substr() on a UTF8 string could cause subsequent accesses on that
string to return garbage. This was due to incorrect UTF8 offsets being
cached, and is now fixed.

join() could return garbage when the same join() statement was used to
process 8 bit data having earlier processed UTF8 data, due to the flags
on that statement's temporary workspace not being reset correctly. This
is now fixed.

C<$a .. $b> will now work as expected when either $a or $b is C<undef>.

Using Unicode keys with tied hashes should now work correctly.

Reading $^E now preserves $!. Previously, the C code implementing $^E
did not preserve C<errno>, so reading $^E could cause C<errno> and therefore
C<$!> to change unexpectedly.

Reentrant functions will (once more) work with C++. 5.8.2 introduced a bugfix
which accidentally broke the compilation of Perl extensions written in C++.

=head1 New or Changed Diagnostics

The fatal error "DESTROY created new reference to dead object" is now
documented in L<perldiag>.

=head1 Changed Internals

The hash code has been refactored to reduce source duplication. The
external interface is unchanged, and aside from the bug fixes described
above, there should be no change in behaviour.

C<hv_clear_placeholders> is now part of the perl API.

Some C macros have been tidied. In particular macros which create temporary
local variables now name these variables more defensively, which should
avoid bugs where names clash.

<signal.h> is now always included.

=head1 Configuration and Building

C<Configure> now invokes callbacks regardless of the value of the variable
they are called for. Previously callbacks were only invoked in the
C<case $variable $define)> branch. This change should only affect platform
maintainers writing configuration hints files.

=head1 Platform Specific Problems

The regression test ext/threads/shared/t/wait.t fails on early RedHat 9
and HP-UX 10.20 due to bugs in their threading implementations.
RedHat users should see https://rhn.redhat.com/errata/RHBA-2003-136.html
and consider upgrading their glibc.

=head1 Known Problems

Detached threads aren't supported on Windows yet, as they may lead to
memory access violation problems.

There is a known race condition opening scripts in C<suidperl>. C<suidperl>
is neither built nor installed by default, and has been deprecated since
perl 5.8.0. You are advised to replace use of suidperl with tools such
as sudo ( http://www.courtesan.com/sudo/ ).

We have a backlog of unresolved bugs. Dealing with bugs and bug reports
is unglamorous work; not something ideally suited to volunteer labour,
but that is all that we have.

The perl5 development team are implementing changes to help address this
problem, which should go live in early 2004.

=head1 Future Directions

Code freeze for the next maintenance release (5.8.4) is on March 31st 2004,
with release expected by mid April. Similarly 5.8.5's freeze will be at
the end of June, with release by mid July.

=head1 Obituary

Iain 'Spoon' Truskett, Perl hacker, author of L<perlreref> and
contributor to CPAN, died suddenly on 29th December 2003, aged 24.
He will be missed.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: pod2text.PL ---

use Config;
use File::Basename qw(&basename &dirname);
use Cwd;

# List explicitly here the variables you want Configure to
# generate.  Metaconfig only looks for shell variables, so you
# have to mention them as if they were shell variables, not
# %Config entries.  Thus you write
#  $startperl
# to ensure Configure will look for $Config{startperl}.

# This forces PL files to create target in same directory as PL file.
# This is so that make depend always knows where to find PL derivatives.
$origdir = cwd;
chdir dirname($0);
$file = basename($0, '.PL');
$file .= '.com' if $^O eq 'VMS';

open OUT,">$file" or die "Can't create $file: $!";

print "Extracting $file (with variable substitutions)\n";

# In this section, perl variables will be expanded during extraction.
# You can use $Config{...} to use Configure variables.

print OUT <<"!GROK!THIS!";
    eval 'exec $Config{perlpath} -S \$0 \${1+"\$@"}'
        if \$running_under_some_shell;
!GROK!THIS!

# In the following, perl variables are not expanded during extraction.

print OUT <<'!NO!SUBS!';

# pod2text -- Convert POD data to formatted ASCII text.
# Copyright 1999, 2000, 2001 by Russ Allbery <rra at stanford.edu>
# This program is free software; you may redistribute it and/or modify it
# under the same terms as Perl itself.
# The driver script for Pod::Text, Pod::Text::Termcap, and Pod::Text::Color,
# invoked by perldoc -t among other things.

require 5.004;

use Getopt::Long qw(GetOptions);
use Pod::Text ();
use Pod::Usage qw(pod2usage);

use strict;

# Silence -w warnings.
use vars qw($running_under_some_shell);

# Take an initial pass through our options, looking for one of the form
# -<number>.  We turn that into -w <number> for compatibility with the
# original pod2text script.
for (my $i = 0; $i < @ARGV; $i++) {
    last if $ARGV[$i] =~ /^--$/;
    if ($ARGV[$i] =~ /^-(\d+)$/) {
        splice (@ARGV, $i++, 1, '-w', $1);
    }
}

# Insert -- into @ARGV before any single dash argument to hide it from
# Getopt::Long; we want to interpret it as meaning stdin (which Pod::Parser
# does correctly).
my $stdin;
@ARGV = map { $_ eq '-' && !$stdin++ ? ('--', $_) : $_ } @ARGV;

# Parse our options.  Use the same names as Pod::Text for simplicity, and
# default to sentence boundaries turned off for compatibility.
my %options;
$options{sentence} = 0;
Getopt::Long::config ('bundling');
GetOptions (\%options, 'alt|a', 'code', 'color|c', 'help|h', 'indent|i=i',
            'loose|l', 'margin|left-margin|m=i', 'overstrike|o',
            'quotes|q=s', 'sentence|s', 'termcap|t', 'width|w=i') or exit 1;
pod2usage (1) if $options{help};

# Figure out what formatter we're going to use.  -c overrides -t.
my $formatter = 'Pod::Text';
if ($options{color}) {
    $formatter = 'Pod::Text::Color';
    eval { require Term::ANSIColor };
    if ($@) { die "-c (--color) requires Term::ANSIColor be installed\n" }
    require Pod::Text::Color;
} elsif ($options{termcap}) {
    $formatter = 'Pod::Text::Termcap';
    require Pod::Text::Termcap;
} elsif ($options{overstrike}) {
    $formatter = 'Pod::Text::Overstrike';
    require Pod::Text::Overstrike;
}
delete @options{'color', 'termcap', 'overstrike'};

# Initialize and run the formatter.
my $parser = $formatter->new (%options);
$parser->parse_from_file (@ARGV);


=head1 NAME

pod2text - Convert POD data to formatted ASCII text

=head1 SYNOPSIS


pod2text [B<-aclost>] [B<--code>] [B<-i> I<indent>] S<[B<-q> I<quotes>]>
S<[B<-w> I<width>]> [I<input> [I<output>]]

pod2text B<-h>

=head1 DESCRIPTION


B<pod2text> is a front-end for Pod::Text and its subclasses.  It uses them
to generate formatted ASCII text from POD source.  It can optionally use
either termcap sequences or ANSI color escape sequences to format the text.

I<input> is the file to read for POD source (the POD can be embedded in
code).  If I<input> isn't given, it defaults to STDIN.  I<output>, if given,
is the file to which to write the formatted output.  If I<output> isn't
given, the formatted output is written to STDOUT.
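
Since B<pod2text> is only a front-end, the same conversion can be driven
from Perl.  The sketch below is a minimal example, assuming a throwaway
input and output file created with File::Temp; the sample POD text and the
width of 60 are arbitrary choices for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Pod::Text ();

# Write a small POD document to a temporary input file.
my ($in_fh, $in_name) = tempfile(SUFFIX => '.pod', UNLINK => 1);
print {$in_fh} "=head1 NAME\n\ndemo - a tiny example\n\n=cut\n";
close $in_fh or die "Can't close $in_name: $!";

# Reserve a temporary output file name.
my ($out_fh, $out_name) = tempfile(SUFFIX => '.txt', UNLINK => 1);
close $out_fh or die "Can't close $out_name: $!";

# Same option names as the command-line switches: width is -w, and
# sentence => 0 matches the script's default.
my $parser = Pod::Text->new(width => 60, sentence => 0);
$parser->parse_from_file($in_name, $out_name);

# Read the formatted text back and show it.
open my $result, '<', $out_name or die "Can't read $out_name: $!";
my $text = do { local $/; <$result> };
close $result;
print $text;
```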

=head1 OPTIONS

=over 4

=item B<-a>, B<--alt>

Use an alternate output format that, among other things, uses a different
heading style and marks C<=item> entries with a colon in the left margin.

=item B<--code>

Include any non-POD text from the input file in the output as well.  Useful
for viewing code documented with POD blocks with the POD rendered and the
code left intact.

=item B<-c>, B<--color>

Format the output with ANSI color escape sequences.  Using this option
requires that Term::ANSIColor be installed on your system.

=item B<-i> I<indent>, B<--indent=>I<indent>

Set the number of spaces to indent regular text, and the default indentation
for C<=over> blocks.  Defaults to 4 spaces if this option isn't given.

=item B<-h>, B<--help>

Print out usage information and exit.

=item B<-l>, B<--loose>

Print a blank line after a C<=head1> heading.  Normally, no blank line is
printed after C<=head1>, although one is still printed after C<=head2>,
because this is the expected formatting for manual pages; if you're
formatting arbitrary text documents, using this option is recommended.

=item B<-m> I<width>, B<--left-margin>=I<width>, B<--margin>=I<width>

The width of the left margin in spaces.  Defaults to 0.  This is the margin
for all text, including headings, not the amount by which regular text is
indented; for the latter, see the B<-i> option.

=item B<-o>, B<--overstrike>

Format the output with overstruck printing.  Bold text is rendered as
character, backspace, character.  Italics and file names are rendered as
underscore, backspace, character.  Many pagers, such as B<less>, know how
to convert this to bold or underlined text.

=item B<-q> I<quotes>, B<--quotes>=I<quotes>

Sets the quote marks used to surround CE<lt>> text to I<quotes>.  If
I<quotes> is a single character, it is used as both the left and right
quote; if I<quotes> is two characters, the first character is used as the
left quote and the second as the right quote; and if I<quotes> is four
characters, the first two are used as the left quote and the second two as
the right quote.

I<quotes> may also be set to the special value C<none>, in which case no
quote marks are added around CE<lt>> text.

=item B<-s>, B<--sentence>

Assume each sentence ends with two spaces and try to preserve that spacing.
Without this option, all consecutive whitespace in non-verbatim paragraphs
is compressed into a single space.

=item B<-t>, B<--termcap>

Try to determine the width of the screen and the bold and underline
sequences for the terminal from termcap, and use that information in
formatting the output.  Output will be wrapped at two columns less than the
width of your terminal device.  Using this option requires that your system
have a termcap file somewhere where Term::Cap can find it and requires that
your system support termios.  With this option, the output of B<pod2text>
will contain terminal control sequences for your current terminal type.

=item B<-w>, B<--width=>I<width>, B<->I<width>

The column at which to wrap text on the right-hand side.  Defaults to 76,
unless B<-t> is given, in which case it's two columns less than the width of
your terminal device.

=back
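
The rules given for B<-q> above (one, two, or four characters, or the
special value C<none>) can be sketched as a small function.  The name
C<split_quotes> is hypothetical and this is not Pod::Text's own code; it
only restates the documented rules in executable form.

```perl
use strict;
use warnings;

# Turn a -q argument into a (left, right) pair of quote strings.
sub split_quotes {
    my ($spec) = @_;
    return ('', '') if $spec eq 'none';          # no quote marks at all
    my $length = length $spec;
    return ($spec, $spec) if $length == 1;       # same mark on both sides
    if ($length == 2 || $length == 4) {          # first half left, second half right
        my $half = $length / 2;
        return (substr($spec, 0, $half), substr($spec, $half));
    }
    die "Invalid quote specification \"$spec\"\n";
}

for my $spec ('"', '<>', q{``''}, 'none') {
    my ($left, $right) = split_quotes($spec);
    print "$spec: ${left}text${right}\n";
}
```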



=head1 DIAGNOSTICS

If B<pod2text> fails with errors, see L<Pod::Text> and L<Pod::Parser> for
information about what those errors might mean.  Internally, it can also
produce the following diagnostics:

=over 4

=item -c (--color) requires Term::ANSIColor be installed

(F) B<-c> or B<--color> was given, but Term::ANSIColor could not be
loaded.

=item Unknown option: %s

(F) An unknown command line option was given.

=back


In addition, other L<Getopt::Long|Getopt::Long> error messages may result
from invalid command-line options.


=over 4


If B<-t> is given, B<pod2text> will take the current width of your screen
from this environment variable, if available.  It overrides terminal width
information in TERMCAP.


If B<-t> is given, B<pod2text> will use the contents of this environment
variable if available to determine the correct formatting sequences for your
current terminal device.


=head1 SEE ALSO

L<Pod::Text>, L<Pod::Text::Color>, L<Pod::Text::Overstrike>,
L<Pod::Text::Termcap>, L<Pod::Parser>

The current version of this script is always available from its web site at
L<http://www.eyrie.org/~eagle/software/podlators/>.  It is also part of the
Perl core distribution as of 5.6.0.

=head1 AUTHOR

Russ Allbery <rra at stanford.edu>.


Copyright 1999, 2000, 2001 by Russ Allbery <rra at stanford.edu>.

This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.


!NO!SUBS!

close OUT or die "Can't close $file: $!";
chmod 0755, $file or die "Can't reset permissions for $file: $!\n";
exec("$Config{'eunicefix'} $file") if $Config{'eunicefix'} ne ':';
chdir $origdir;

--- NEW FILE: perlothrtut.pod ---
=head1 NAME

perlothrtut - old tutorial on threads in Perl

=head1 DESCRIPTION


This tutorial describes the old-style thread model that was introduced in
release 5.005. This model is now deprecated, and will be removed, probably
in version 5.10. The interfaces described here were considered
experimental, and are likely to be buggy.

For information about the new interpreter threads ("ithreads") model, see
the F<perlthrtut> tutorial, and the L<threads> and L<threads::shared>

You are strongly encouraged to migrate any existing threads code to the
new model as soon as possible.

[...1029 lines suppressed...]
=head1 Acknowledgements

Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua
Pritikin, and Alan Burlison, for their help in reality-checking and
polishing this article.  Big thanks to Tom Christiansen for his rewrite
of the prime number generator.

=head1 AUTHOR

Dan Sugalski E<lt>sugalskd at ous.eduE<gt>

=head1 Copyrights

This article originally appeared in The Perl Journal #10, and is
copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and
The Perl Journal.  This document may be distributed under the same terms
as Perl itself.

--- NEW FILE: perldoc.pod ---

=head1 NAME

perldoc - Look up Perl documentation in Pod format.

=head1 SYNOPSIS


B<perldoc> [B<-h>] [B<-v>] [B<-t>] [B<-u>] [B<-m>] [B<-l>] [B<-F>]
[B<-i>] [B<-V>] [B<-T>] [B<-r>]

B<perldoc> B<-f> BuiltinFunction

B<perldoc> B<-q> FAQ Keyword

See below for more description of the switches.

=head1 DESCRIPTION


I<perldoc> looks up a piece of documentation in .pod format that is embedded
in the perl installation tree or in a perl script, and displays it via
C<pod2man | nroff -man | $PAGER>. (In addition, if running under HP-UX,
C<col -x> will be used.) This is primarily used for the documentation for
the perl library modules.

Your system may also have man pages installed for those modules, in
which case you can probably just use the man(1) command.

If you are looking for a table of contents to the Perl library modules
documentation, see the L<perltoc> page.

=head1 OPTIONS

=over 5

=item B<-h>

Prints out a brief B<h>elp message.

=item B<-v>

Describes search for the item in detail (B<v>erbosely).

=item B<-t>

Display docs using plain B<t>ext converter, instead of nroff. This may be faster,
but it probably won't look as nice.

=item B<-u>

Skip the real Pod formatting, and just show the raw Pod source (B<U>nformatted).

=item B<-m> I<module>

Display the entire module: both code and unformatted pod documentation.
This may be useful if the docs don't explain a function in the detail
you need, and you'd like to inspect the code directly; perldoc will find
the file for you and simply hand it off for display.

=item B<-l>

Display onB<l>y the file name of the module found.

=item B<-F>

Consider arguments as file names; no search in directories will be performed.

=item B<-f> I<perlfunc>

The B<-f> option followed by the name of a perl built-in function will
extract the documentation of this function from L<perlfunc>.

Example:

      perldoc -f sprintf

=item B<-q> I<perlfaq-search-regexp>

The B<-q> option takes a regular expression as an argument.  It will search
the B<q>uestion headings in perlfaq[1-9] and print the entries matching
the regular expression.  Example: C<perldoc -q shuffle>

=item B<-T>

This specifies that the output is not to be sent to a pager, but is to
be sent right to STDOUT.

=item B<-d> I<destination-filename>

This specifies that the output is to be sent neither to a pager nor
to STDOUT, but is to be saved to the specified filename.  Example:
C<perldoc -oLaTeX -dtextwrapdocs.tex Text::Wrap>

=item B<-o> I<output-formatname>

This specifies that you want Perldoc to try using a Pod-formatting
class for the output format that you specify.  For example:
C<-oman>.  This is actually just a wrapper around the C<-M> switch;
using C<-oI<formatname>> just looks for a loadable class by adding
that format name (with different capitalizations) to the end of
different classname prefixes.

For example, C<-oLaTeX> currently tries all of the following classes:
Pod::Perldoc::ToLaTeX Pod::Perldoc::Tolatex Pod::Perldoc::ToLatex
Pod::Perldoc::ToLATEX Pod::Simple::LaTeX Pod::Simple::latex
Pod::Simple::Latex Pod::Simple::LATEX Pod::LaTeX Pod::latex Pod::Latex

=item B<-M> I<module-name>

This specifies the module that you want to try using for formatting the
pod.  The class must at least provide a C<parse_from_file> method.
For example: C<perldoc -MPod::Perldoc::ToChecker>.

You can specify several classes to try by joining them with commas
or semicolons, as in C<-MTk::SuperPod;Tk::Pod>.

=item B<-w> I<option:value> or B<-w> I<option> 

This specifies an option to call the formatter B<w>ith.  For example,
C<-w textsize:15> will call
C<< $formatter->textsize(15) >> on the formatter object before it is
used to format the object.  For this to be valid, the formatter class
must provide such a method, and the value you pass should be valid.
(So if C<textsize> expects an integer, and you do C<-w textsize:big>,
expect trouble.)

You can use C<-w optionname> (without a value) as shorthand for
C<-w optionname:I<TRUE>>.  This is presumably useful in cases of on/off
features like: C<-w page_numbering>.

You can use a "=" instead of the ":", as in: C<-w textsize=15>.  This
might be more (or less) convenient, depending on what shell you use.

=item B<-X>

Use an index if it is present -- the B<-X> option looks for an entry
whose basename matches the name given on the command line in the file
C<$Config{archlib}/pod.idx>. The F<pod.idx> file should contain fully
qualified filenames, one per line.

=item B<PageName|ModuleName|ProgramName>

The item you want to look up.  Nested modules (such as C<File::Basename>)
are specified either as C<File::Basename> or C<File/Basename>.  You may also
give a descriptive name of a page, such as C<perlfunc>.

=item B<-n> I<some-formatter>

Specify a replacement for nroff.

=item B<-r>

Recursive search.

=item B<-i>

Ignore case.

=item B<-V>

Displays the version of perldoc you're running.



=back

=head1 SECURITY

Because B<perldoc> does not run properly tainted, and is known to
have security issues, when run as the superuser it will attempt to
drop privileges by setting the effective and real IDs to nobody's
or nouser's account, or -2 if unavailable.  If it cannot relinquish
its privileges, it will not run.


=head1 ENVIRONMENT

Any switches in the C<PERLDOC> environment variable will be used before the
command line arguments.

Useful values for C<PERLDOC> include C<-oman>, C<-otext>, C<-otk>, C<-ortf>,
C<-oxml>, and so on, depending on what modules you have on hand; or
exactly specify the formatter class with C<-MPod::Perldoc::ToMan>
or the like.

C<perldoc> also searches directories
specified by the C<PERL5LIB> (or C<PERLLIB> if C<PERL5LIB> is not
defined) and C<PATH> environment variables.
(The latter is so that embedded pods for executables, such as
C<perldoc> itself, are available.)

C<perldoc> will use, in order of preference, the pager defined in
C<PERLDOC_PAGER>, C<MANPAGER>, or C<PAGER> before trying to find a pager
on its own. (C<MANPAGER> is not used if C<perldoc> was told to display
plain text or unformatted pod.)

One useful value for C<PERLDOC_PAGER> is C<less -+C -E>.
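
The lookup order described above can be sketched in a few lines.  This is
only an illustration of the precedence rule, not perldoc's actual code; the
C<pick_pager> name and the C<more> fallback are assumptions made for the
example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first pager named by the environment,
# following the precedence PERLDOC_PAGER, MANPAGER, PAGER.
# $plain_text should be true when the output is plain text or raw pod,
# in which case MANPAGER is skipped.
sub pick_pager {
    my ($env, $plain_text) = @_;
    my @candidates = ('PERLDOC_PAGER', ($plain_text ? () : 'MANPAGER'), 'PAGER');
    for my $name (@candidates) {
        my $value = $env->{$name};
        return $value if defined $value && length $value;
    }
    return 'more';   # assumed fallback, for illustration only
}

print pick_pager(\%ENV, 0), "\n";
```

Passing the environment as a hash reference keeps the helper easy to try
with made-up settings instead of the real C<%ENV>.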

Setting C<PERLDOCDEBUG> to a positive integer makes perldoc emit
even more descriptive output than the C<-v> switch does; the higher the
number, the more it emits.

=head1 AUTHOR

Current maintainer: Sean M. Burke, <sburke at cpan.org>

Past contributors are:
Kenneth Albanowski <kjahds at kjahds.com>,
Andy Dougherty  <doughera at lafcol.lafayette.edu>,
and many others.


--- NEW FILE: podchecker.PL ---

use Config;
use File::Basename qw(&basename &dirname);
use Cwd;

# List explicitly here the variables you want Configure to
# generate.  Metaconfig only looks for shell variables, so you
# have to mention them as if they were shell variables, not
# %Config entries.  Thus you write
#  $startperl
# to ensure Configure will look for $Config{startperl}.

# This forces PL files to create target in same directory as PL file.
# This is so that make depend always knows where to find PL derivatives.
$origdir = cwd;
($file = basename($0)) =~ s/\.PL$//;
$file =~ s/\.pl$//
        if ($^O eq 'VMS' or $^O eq 'os2' or $^O eq 'dos');  # "case-forgiving"
$file .= '.com' if $^O eq 'VMS';

open OUT,">$file" or die "Can't create $file: $!";

print "Extracting $file (with variable substitutions)\n";

# In this section, perl variables will be expanded during extraction.
# You can use $Config{...} to use Configure variables.

print OUT <<"!GROK!THIS!";
$Config{startperl}
    eval 'exec perl -S \$0 "\$@"'
        if 0;
!GROK!THIS!

# In the following, perl variables are not expanded during extraction.

print OUT <<'!NO!SUBS!';
# podchecker -- command to invoke the podchecker function in Pod::Checker
# Copyright (c) 1998-2000 by Bradford Appleton. All rights reserved.
# This file is part of "PodParser". PodParser is free software;
# you can redistribute it and/or modify it under the same terms
# as Perl itself.

use strict;
#use diagnostics;

=head1 NAME

podchecker - check the syntax of POD format documentation files


=head1 SYNOPSIS

B<podchecker> [B<-help>] [B<-man>] [B<-(no)warnings>] [I<file>S< >...]

=head1 OPTIONS AND ARGUMENTS


=over 8

=item B<-help>

Print a brief help message and exit.

=item B<-man>

Print the manual page and exit.

=item B<-warnings> B<-nowarnings>

Turn on/off printing of warnings. Repeating B<-warnings> increases the
warning level, i.e. more warnings are printed. Currently increasing to
level two causes flagging of unescaped "E<lt>,E<gt>" characters.

=item I<file>

The pathname of a POD file to syntax-check (defaults to standard input).

=back

=head1 DESCRIPTION



B<podchecker> will read the given input files looking for POD
syntax errors in the POD documentation and will print any errors
it finds to STDERR. At the end, it will print a status message
indicating the number of errors found.

Directories are ignored, and an appropriate warning message is printed.

B<podchecker> invokes the B<podchecker()> function exported by B<Pod::Checker>.
Please see L<Pod::Checker/podchecker()> for more details.


=head1 RETURN VALUE

B<podchecker> returns a 0 (zero) exit status if all specified
POD files are ok.

=head1 ERRORS

B<podchecker> returns the exit status 1 if at least one of
the given POD files has syntax errors.

The status 2 indicates that at least one of the specified 
files does not contain I<any> POD commands.

Status 1 overrides status 2. If you want unambiguous
results, call B<podchecker> with one single argument only.
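
The way the per-file results combine into one exit status can be sketched
as follows.  The C<combined_status> helper is hypothetical; it only mirrors
the rules stated above, using the same sign convention as the script
(positive for syntax errors, negative for a file without pod commands).

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rules above.  Each argument is the
# result for one file: > 0 means syntax errors, < 0 means no pod
# commands were found, 0 means the file is fine.
sub combined_status {
    my @results = @_;
    my $status = 0;
    for my $errors (@results) {
        if ($errors > 0) {
            $status = 1;                  # errors always win
        }
        elsif ($errors < 0) {
            $status = 2 unless $status;   # only if nothing was flagged yet
        }
    }
    return $status;
}

# One clean file, one file without pod, one file with three errors.
print combined_status(0, -1, 3), "\n";
```

Because status 1 hides status 2, a caller that needs to tell the two apart
has to check files one at a time, as the text recommends.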

=head1 SEE ALSO

L<Pod::Parser> and L<Pod::Checker>

=head1 AUTHORS

Please report bugs using L<http://rt.cpan.org>.

Brad Appleton E<lt>bradapp at enteract.comE<gt>,
Marek Rouchal E<lt>marekr at cpan.orgE<gt>

Based on code for B<Pod::Text::pod2text(1)> written by
Tom Christiansen E<lt>tchrist at mox.perl.comE<gt>

=cut


use Pod::Checker;
use Pod::Usage;
use Getopt::Long;

## Define options
my %options;

## Parse options
GetOptions(\%options, qw(help man warnings+ nowarnings))  ||  pod2usage(2);
pod2usage(1)  if ($options{help});
pod2usage(-verbose => 2)  if ($options{man});

if($options{nowarnings}) {
  $options{warnings} = 0;
}
elsif(!defined $options{warnings}) {
  $options{warnings} = 1; # default is warnings on
}

## Dont default to STDIN if connected to a terminal
pod2usage(2) if ((@ARGV == 0) && (-t STDIN));

## Invoke podchecker()
my $status = 0;
@ARGV = qw(-) unless(@ARGV);
for my $podfile (@ARGV) {
    if($podfile eq '-') {
      $podfile = "<&STDIN";
    }
    elsif(-d $podfile) {
      warn "podchecker: Warning: Ignoring directory '$podfile'\n";
      next;
    }
    my $errors = 
      podchecker($podfile, undef, '-warnings' => $options{warnings});
    if($errors > 0) {
        # errors occurred
        $status = 1;
        printf STDERR ("%s has %d pod syntax %s.\n",
                       $podfile, $errors,
                       ($errors == 1) ? "error" : "errors");
    }
    elsif($errors < 0) {
        # no pod found
        $status = 2 unless($status);
        print STDERR "$podfile does not contain any pod commands.\n";
    }
    else {
        print STDERR "$podfile pod syntax OK.\n";
    }
}
exit $status;
!NO!SUBS!


close OUT or die "Can't close $file: $!";
chmod 0755, $file or die "Can't reset permissions for $file: $!\n";
exec("$Config{'eunicefix'} $file") if $Config{'eunicefix'} ne ':';
chdir $origdir;

--- NEW FILE: perlfaq4.pod ---
=head1 NAME

perlfaq4 - Data Manipulation ($Revision: 1.2 $, $Date: 2006-12-04 17:01:32 $)


This section of the FAQ answers questions related to manipulating
numbers, dates, strings, arrays, hashes, and miscellaneous data issues.

=head1 Data: Numbers

=head2 Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?

Internally, your computer represents floating-point numbers
in binary. Digital (as in powers of two) computers cannot
store all numbers exactly.  Some real numbers lose precision
in the process.  This is a problem with how computers store
numbers and affects all computer languages, not just Perl.
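
A short sketch of the effect and of the usual remedy, which is to round
when formatting for display.  The digits printed by the first C<printf>
depend on the platform's floating-point format, so no particular output
is claimed for it; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $price = 19.95;

# Ask for more digits than the default formatting shows; the exact
# digits printed depend on the platform's floating-point format.
printf "%.20f\n", $price;

# Round for display with sprintf rather than expecting exact storage.
my $shown = sprintf "%.2f", 19.9499999999999;
print "$shown\n";
```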

[...2129 lines suppressed...]

=head2 How do I pack arrays of doubles or floats for XS code?

The kgbpack.c code in the PGPLOT module on CPAN does just this.
If you're doing a lot of float or double processing, consider using
the PDL module from CPAN instead--it makes number-crunching easy.


Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit.  A simple comment in the code giving
credit would be courteous but is not required.

--- NEW FILE: perlapi.pod ---
=head1 NAME

perlapi - autogenerated documentation for the perl public API

X<Perl API> X<API> X<api>

This file contains the documentation of the perl public API generated by
embed.pl, specifically a listing of functions, macros, flags, and variables
that may be used by extension writers.  The interfaces of any functions that
are not listed here are subject to change without notice.  For this reason,
blindly using functions listed in proto.h is to be avoided when writing
extensions.

Note that all Perl API global variables must be referenced with the C<PL_>
prefix.  Some macros are provided for compatibility with the older,
unadorned names, but this support may be disabled in a future release.

The listing is alphabetical, case insensitive.
[...6165 lines suppressed...]

=head1 AUTHORS

Until May 1997, this document was maintained by Jeff Okamoto
<okamoto at corp.hp.com>.  It is now maintained as part of Perl itself.

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil
Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer,
Stephen McCamant, and Gurusamy Sarathy.

API Listing originally by Dean Roehrich <roehrich at cray.com>.

Updated to be autogenerated from comments in the source by Benjamin Stuhl.

=head1 SEE ALSO

perlguts(1), perlxs(1), perlxstut(1), perlintern(1)

--- NEW FILE: splitpod ---

use lib '../lib';  # If you haven't installed perl yet.
use Pod::Functions;

local $/ = '';

$level = 0;

$cur = '';
while (<>) {

    next unless /^=(?!cut)/ .. /^=cut/;

    ++$level if /^=over/;
    --$level if /^=back/;

    # Ignore items that are nested within other items, e.g. don't split on the
    # items nested within the pack() and sprintf() items in perlfunc.pod.
    if (/=item (\S+)/ and $level == 1) {
	my $item = $1;
	s/=item //; 
	$next{$cur} = $item;
	$cur = $item;
	$syn{$cur} .= $_;
    } else { 
	push @{$pod{$cur}}, $_ if $cur;
    }
}

for $f ( keys %syn ) {
    next unless $Type{$f};
    $flavor = $Flavor{$f};
    $orig = $f;
    ($name = $f) =~ s/\W//g;

    # deal with several functions sharing a description
    $func = $orig;
    $func = $next{$func} until $pod{$func};
    my $body = join "", @{$pod{$func}};

    # deal with unbalanced =over and =back caused by the split
    my $has_over = $body =~ /^=over/;
    my $has_back = $body =~ /^=back/;
    $body =~ s/^=over\s*//m if $has_over and !$has_back;
    $body =~ s/^=back\s*//m if $has_back and !$has_over;
    open (POD, "> $name.pod") || die "can't open $name.pod: $!";
    print POD <<EOF;
=head1 NAME

$orig - $flavor

=head1 SYNOPSIS

$syn{$orig}

=head1 DESCRIPTION

$body

EOF

    close POD;

}


--- NEW FILE: perlvar.pod ---
=head1 NAME

perlvar - Perl predefined variables


=head2 Predefined Names

The following names have special meaning to Perl.  Most 
punctuation names have reasonable mnemonics, or analogs in the
shells.  Nevertheless, if you wish to use long variable names,
you need only say

    use English;

at the top of your program. This aliases all the short names to the long
names in the current package. Some even have medium names, generally
borrowed from B<awk>. In general, it's best to use the

[...1493 lines suppressed...]

In particular, the new special C<${^_XYZ}> variables are always taken
to be in package C<main>, regardless of any C<package> declarations
presently in scope.  

=head1 BUGS

Due to an unfortunate accident of Perl's implementation, C<use
English> imposes a considerable performance penalty on all regular
expression matches in a program, regardless of whether they occur
in the scope of C<use English>.  For that reason, saying C<use
English> in libraries is strongly discouraged.  See the
Devel::SawAmpersand module documentation from CPAN
( http://www.cpan.org/modules/by-module/Devel/ )
for more information.
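
One way to keep the long names while avoiding most of this cost is the
C<-no_match_vars> import flag of the English module, which leaves out the
aliases for the match variables (C<$PREMATCH>, C<$MATCH> and
C<$POSTMATCH>).  A minimal sketch; the C<$pid_matches> variable exists
only to show that the long and short names refer to the same value.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);   # long names, minus the match variables

# $PROCESS_ID is an alias for $$, and $PROGRAM_NAME for $0.
my $pid_matches = ($PROCESS_ID == $$) ? 1 : 0;

print "running as $PROGRAM_NAME, pid $PROCESS_ID\n";
```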

Having to even think about the C<$^S> variable in your exception
handlers is simply wrong.  C<$SIG{__DIE__}> as currently implemented
invites grievous and difficult to track down errors.  Avoid it
and use an C<END{}> or CORE::GLOBAL::die override instead.

--- NEW FILE: perlsec.pod ---
=head1 NAME

perlsec - Perl security


Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs.  Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags.  Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs.  The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set.  You can also enable taint
mode explicitly by using the B<-T> command line flag. This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script. Once taint mode is on, it's on for
the remainder of your script.

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps.  Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these.  Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident.  All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods, for example C<< $obj->$method(@args) >>, and symbolic
sub references, for example C<&$foo(@args)> or C<< $foo->(@args) >>,
are not checked for taintedness.  This requires extra care
unless you want external data to affect your control flow.  Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=back


For efficiency reasons, Perl takes a conservative view of
whether data is tainted.  If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are never tainted.

For example:

    $arg = shift;		# $arg is tainted
    $hid = $arg, 'bar';		# $hid is also tainted
    $line = <>;			# Tainted
    $line = <STDIN>;		# Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;		# Still tainted
    $path = $ENV{'PATH'};	# Tainted, but see below
    $data = 'abc';		# Not tainted

    system "echo $arg";		# Insecure
    system "/bin/echo", $arg;	# Considered insecure
				# (Perl doesn't know about /bin/echo)
    system "echo $hid";		# Insecure
    system "echo $data";	# Insecure until PATH set

    $path = $ENV{'PATH'};	# $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};	# $path now NOT tainted
    system "echo $data";	# Is secure now!

    open(FOO, "< $arg");	# OK - read-only file
    open(FOO, "> $arg"); 	# Not OK - trying to write

    open(FOO,"echo $arg|");	# Not OK
    open(FOO,"-|")
	or exec 'echo', $arg;	# Also not OK

    $shout = `echo $arg`;	# Insecure, $shout now tainted

    unlink $data, $arg;		# Insecure
    umask $arg;			# Insecure

    exec "echo $arg";		# Insecure
    exec "echo", $arg;		# Insecure
    exec "sh", '-c', $arg;	# Very insecure!

    @files = <*.c>;		# insecure (uses readdir() or similar)
    @files = glob('*.c');	# insecure (uses readdir() or similar)

    # In Perl releases older than 5.6.0 the <*.c> and glob('*.c') would
    # have used an external program to do the filename expansion; but in
    # either case the result is tainted since the list of filenames comes
    # from outside of the program.

    $bad = ($arg, 23);		# $bad will be tainted
    $arg, `true`;		# Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>.  Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, and whose use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted.  It
would be inefficient for every operator to test every argument for
taintedness.  Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
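
A small sketch of the C<Scalar::Util> route mentioned above.  The
C<taint_label> helper is hypothetical.  Run without B<-T>, nothing is
tainted and both lines report "clean"; under B<-T>, a command line
argument is reported as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical reporting helper built on Scalar::Util::tainted().
sub taint_label {
    my ($value) = @_;
    return tainted($value) ? 'tainted' : 'clean';
}

my $literal = 'abc';                   # defined inside the program
my $arg     = @ARGV ? $ARGV[0] : '';   # comes from outside when present

print "literal: ", taint_label($literal), "\n";
print "arg:     ", taint_label($arg), "\n";   # 'tainted' only under -T with an argument
```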

But testing for taintedness gets you only so far.  Sometimes you have just
to clear your data's taintedness.  Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., that
you knew what you were doing when you wrote the pattern.  That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism.  It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters.  That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
	$data = $1; 			# $data now untainted
    } else {
	die "Bad data in '$data'"; 	# log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell.  Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that.  The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.
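
The same test can be wrapped in a reusable routine that returns the
laundered value or C<undef>, leaving the decision about what to do with bad
input to the caller.  The C<launder> name and the sample strings are
assumptions made for this sketch; the pattern is the one shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $data taken from the capture
# group (and therefore untainted) if it consists only of word
# characters, hyphens, at signs, and dots; otherwise return undef.
sub launder {
    my ($data) = @_;
    return undef unless defined $data;
    return $data =~ /^([-\@\w.]+)$/ ? $1 : undef;
}

for my $candidate ('user@example.com', 'rm -rf /; echo') {
    my $clean = launder($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

Listing the characters that are allowed, rather than those that are
forbidden, is what keeps the second string out.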

The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program.  If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block.  See L<perllocale/SECURITY> for further discussion and examples.

=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line.  Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line.  Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems.  (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl. You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>. The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program, is to use
the C<lib> pragma, e.g.:

  perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo>, is that the former
will automagically remove any duplicated directories, while the latter
will not.
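
The duplicate removal can be observed from inside a program by asking the
C<lib> pragma for the same directory twice.  In this sketch C</foo> is a
placeholder that does not need to exist, and C<$copies> is an illustrative
name.

```perl
use strict;
use warnings;
require lib;

# '/foo' is a placeholder directory; it does not need to exist.
lib->import('/foo');
lib->import('/foo');    # second request for the same directory

# Count how many times the directory ended up in @INC.
my $copies = grep { $_ eq '/foo' } @INC;
print "copies of /foo in \@INC: $copies\n";
```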

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

  Insecure dependency in require while running with -T switch

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
non-writable by others than its owner and group.  You may be surprised to
get this message even if the pathname to your executable is fully
qualified.  This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses. You may wish to add something like this to your
setid and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer
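
Both steps, setting a known PATH and removing the shell-sensitive
variables, can be collected in one routine.  The C<sanitize_env> helper
below is a hypothetical sketch; it works on a hash reference so that it can
be tried on a copy, and a real program would pass C<\%ENV>.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the advice above to an environment hash.
# Pass \%ENV in a real program; a copy is used here so the example
# does not disturb the caller's environment.
sub sanitize_env {
    my ($env) = @_;
    $env->{PATH} = '/bin:/usr/bin';                # known, absolute directories
    delete @{$env}{qw(IFS CDPATH ENV BASH_ENV)};   # variables some shells honor
    return $env;
}

my %copy = (%ENV, IFS => " \t\n");
sanitize_env(\%copy);
print "PATH is now $copy{PATH}\n";
```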

It's also possible to get into trouble with other operations that don't
care whether they use tainted values.  Make judicious use of the file
tests in dealing with any user-supplied filenames.  When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges. Perl doesn't prevent you from opening tainted filenames for reading,
so be careful what you print out.  The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them.  Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege who
does the dirty work for you.  First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe.  Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values.  Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent.  Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely.  Notice how the C<exec> is
not called with a string that the shell could expand.  This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.  

        use English '-no_match_vars';
        die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
        if ($pid) {           # parent
            while (<KID>) {
                # do something
            }
            close KID;
        } else {
            my @temp     = ($EUID, $EGID);
            my $orig_uid = $UID;
            my $orig_gid = $GID;
            $EUID = $UID;
            $EGID = $GID;
            # Drop privileges
            $UID  = $orig_uid;
            $GID  = $orig_gid;
            # Make sure privs are really gone
            ($EUID, $EGID) = @temp;
            die "Can't drop privileges"
                unless $UID == $EUID  && $GID eq $EGID;
            $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
            # Consider sanitizing the environment even more.
            exec 'myprog', 'arg1', 'arg2'
                or die "can't exec myprog: $!";
        }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.
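
A sketch of the C<readdir> alternative: the directory is read directly and
the names are filtered in Perl, so no shell is involved.  The
C<files_with_suffix> helper is hypothetical.  The names it returns still
come from outside the program, so under taint mode they remain tainted
until laundered.

```perl
use strict;
use warnings;

# Hypothetical helper: list the plain files in $dir whose names end in
# $suffix, without involving a shell.
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @names = grep { /\Q$suffix\E\z/ && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return sort @names;
}

# Roughly the equivalent of glob('*.c') in the current directory.
print "$_\n" for files_with_suffix('.', '.c');
```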

Taint checking is most useful when, although you trust yourself not to have
written a program that gives away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad.  This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil.  That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this."  For that kind of safety, check out the Safe module,
included standard in the Perl distribution.  This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.
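
A minimal sketch of such a compartment, using only C<Safe-E<gt>new> and
C<reval>.  The code strings and variable names are illustrative.  The
second string is refused because C<system> is not among the operators the
default compartment permits; the reason is left in C<$@>.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Ordinary arithmetic is allowed by the default operator mask.
my $sum = $compartment->reval('2 + 3');

# system() is not in the default set of permitted operators, so the
# code is rejected and the reason is left in $@.
my $result  = $compartment->reval('system("true")');
my $blocked = defined $result ? '' : $@;

print "sum: $sum\n";
print "blocked: $blocked" if $blocked;
```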

=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start.  The problem is a race
condition in the kernel.  Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it.  The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternately, it can simply ignore the set-id bits on scripts.  If the
latter is true, Perl can emulate the setuid and setgid mechanism when it
notices the otherwise useless setuid/gid bits on Perl scripts.  It does
this via a special executable called F<suidperl> that is automatically
invoked for you if it's needed.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure.  You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script.  A C wrapper is just a compiled program that does nothing
except call your Perl program.   Compiled programs are not subject to the
kernel bug that plagues set-id scripts.  Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #define REAL_PATH "/path/to/script"
    int main(int ac, char **av)
    {
	execv(REAL_PATH, av);
	return 1;	/* reached only if execv() fails */
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug.  On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>.  This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit.  On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>.  The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself.  Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Prior to release 5.6.1 of Perl, bugs in the code of F<suidperl> could
introduce a security hole.

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted.  (That doesn't mean that a CGI script's source is
readable by people on the web, though.)  So you have to leave the
permissions at the socially friendly 0755 level.  This lets only
people on your local system see your source.

Some people mistakenly regard this as a security problem.  If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure.  It is often possible for someone to
determine the insecure things and exploit them without viewing the
source.  Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it.  You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it.  You can try using the native-code compiler
described below, but crackers might be able to disassemble it.  These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive licence will give you
legal security.  License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah."  You should see a lawyer to be sure your licence's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls.  See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both.  This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Function - the algorithm used to "order" hash elements has been
changed several times during the development of Perl, mainly to be
reasonably fast.  In Perl 5.8.1 the security aspect was also taken
into account.

In Perls before 5.8.1 one could rather easily generate data that, as
hash keys, would cause Perl to consume large amounts of time because
the internal structure of hashes would badly degenerate.  In Perl 5.8.1
the hash function is randomly perturbed by a pseudorandom seed, which
makes generating such naughty hash keys harder.
See L<perlrun/PERL_HASH_SEED> for more information.

The random perturbation is done by default, but if one wants for some
reason to emulate the old behaviour one can set the environment variable
PERL_HASH_SEED to zero (or any other integer).  One possible reason
for wanting to emulate the old behaviour is that in the new behaviour
consecutive runs of Perl will order hash keys differently, which may
confuse some applications (like Data::Dumper: the outputs of two
different runs are no longer identical).

B<Perl has never guaranteed any ordering of the hash keys>, and the
ordering has already changed several times during the lifetime of
Perl 5.  Also, the ordering of hash keys has always been, and
continues to be, affected by the insertion order.

Also note that while the order of the hash elements might be
randomised, this "pseudoordering" should B<not> be used for
applications like shuffling a list randomly (use List::Util::shuffle()
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module Algorithm::Numerical::Shuffle), or for generating
permutations (use e.g. the CPAN modules Algorithm::Permute or
Algorithm::FastPermute), or for any cryptographic applications.

=item *

Regular expressions - Perl's regular expression engine is a so-called
NFA (Non-deterministic Finite Automaton), which among other things means that it can
rather easily consume large amounts of both time and space if the
regular expression may match in several ways.  Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>).  Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time.  Nothing more is required than
resorting a list already sorted.  Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used.  Mergesort is insensitive to
its input data, so it cannot be similarly fooled.

=back


See L<http://www.cs.rice.edu/~scrosby/hash/> for more information,
and any computer science textbook on algorithmic complexity.
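
The advice about hash ordering can be illustrated with a small sketch.
It is only an example under stated assumptions: the hashes C<%first> and
C<%second> and their contents are invented here, and the code relies
only on List::Util and Data::Dumper as shipped with Perl 5.8 and later.

```perl
    use strict;
    use warnings;
    use List::Util qw(shuffle);
    use Data::Dumper;

    # Two hashes with the same contents but different insertion order.
    my %first  = (alice => 31, bob => 27, carol => 45);
    my %second = (carol => 45, bob => 27, alice => 31);

    # Sort the keys explicitly whenever the output order matters.
    my @ordered = sort keys %first;

    # Ask Data::Dumper to sort keys so that dumps are reproducible.
    $Data::Dumper::Sortkeys = 1;
    my $dump_first  = Dumper(\%first);
    my $dump_second = Dumper(\%second);

    # Use a real shuffle, not hash order, when randomness is wanted.
    my @shuffled = shuffle(@ordered);
```

With C<Sortkeys> set, the two dumps should be identical from run to run
regardless of the hash seed, while C<@shuffled> holds the same three
names in an order that may differ on every run.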

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.

--- NEW FILE: perlfilter.pod ---
=head1 NAME

perlfilter - Source Filters

=head1 DESCRIPTION


This article is about a little-known feature of Perl called
I<source filters>. Source filters alter the program text of a module
before Perl sees it, much as a C preprocessor alters the source text of
a C program before the compiler sees it. This article tells you more
about what source filters are, how they work, and how to write your
own.

The original purpose of source filters was to let you encrypt your
program source to prevent casual piracy. This isn't all they can do, as
you'll soon learn. But first, the basics.

=head1 CONCEPTS


Before the Perl interpreter can execute a Perl script, it must first
read it from a file into memory for parsing and compilation. If that
script itself includes other scripts with a C<use> or C<require>
statement, then each of those scripts will have to be read from their
respective files as well.

Now think of each logical connection between the Perl parser and an
individual file as a I<source stream>. A source stream is created when
the Perl parser opens a file, it continues to exist as the source code
is read into memory, and it is destroyed when Perl is finished parsing
the file. If the parser encounters a C<require> or C<use> statement in
a source stream, a new and distinct stream is created just for that
file.

The diagram below represents a single source stream, with the flow of
source from a Perl script file on the left into the Perl parser on the
right. This is how Perl normally operates.

    file -------> parser

There are two important points to remember:

=over 5

=item 1.

Although there can be any number of source streams in existence at any
given time, only one will be active.

=item 2.

Every source stream is associated with only one file.

=back


A source filter is a special kind of Perl module that intercepts and
modifies a source stream before it reaches the parser. A source filter
changes our diagram like this:

    file ----> filter ----> parser

If that doesn't make much sense, consider the analogy of a command
pipeline. Say you have a shell script stored in the compressed file
I<trial.gz>. The simple pipeline command below runs the script without
needing to create a temporary file to hold the uncompressed file.

    gunzip -c trial.gz | sh

In this case, the data flow from the pipeline can be represented as follows:

    trial.gz ----> gunzip ----> sh

With source filters, you can store the text of your script compressed and use a source filter to uncompress it for Perl's parser:

     compressed           gunzip
    Perl program ---> source filter ---> parser

=head1 USING FILTERS


So how do you use a source filter in a Perl script? Above, I said that
a source filter is just a special kind of module. Like all Perl
modules, a source filter is invoked with a use statement.

Say you want to pass your Perl source through the C preprocessor before
execution. You could use the existing C<-P> command line option to do
this, but as it happens, the source filters distribution comes with a C
preprocessor filter module called Filter::cpp. Let's use that instead.

Below is an example program, C<cpp_test>, which makes use of this filter.
Line numbers have been added to allow specific lines to be referenced
easily.

    1: use Filter::cpp;
    2: #define TRUE 1
    3: $a = TRUE;
    4: print "a = $a\n";

When you execute this script, Perl creates a source stream for the
file. Before the parser processes any of the lines from the file, the
source stream looks like this:

    cpp_test ---------> parser

Line 1, C<use Filter::cpp>, includes and installs the C<cpp> filter
module. All source filters work this way. The use statement is compiled
and executed at compile time, before any more of the file is read, and
it attaches the cpp filter to the source stream behind the scenes. Now
the data flow looks like this:

    cpp_test ----> cpp filter ----> parser

As the parser reads the second and subsequent lines from the source
stream, it feeds those lines through the C<cpp> source filter before
processing them. The C<cpp> filter simply passes each line through the
real C preprocessor. The output from the C preprocessor is then
inserted back into the source stream by the filter.

                  .-> cpp --.
                  |         |
                  |         |
                  |       <-'
   cpp_test ----> cpp filter ----> parser

The parser then sees the following code:

    use Filter::cpp;
    $a = 1;
    print "a = $a\n";

Let's consider what happens when the filtered code includes another
module with use:

    1: use Filter::cpp;
    2: #define TRUE 1
    3: use Fred;
    4: $a = TRUE;
    5: print "a = $a\n";

The C<cpp> filter does not apply to the text of the Fred module, only
to the text of the file that used it (C<cpp_test>). Although the use
statement on line 3 will pass through the cpp filter, the module that
gets included (C<Fred>) will not. The source streams look like this
after line 3 has been parsed and before line 4 is parsed:

    cpp_test ---> cpp filter ---> parser (INACTIVE)

    Fred.pm ----> parser

As you can see, a new stream has been created for reading the source
from C<Fred.pm>. This stream will remain active until all of C<Fred.pm>
has been parsed. The source stream for C<cpp_test> will still exist,
but is inactive. Once the parser has finished reading Fred.pm, the
source stream associated with it will be destroyed. The source stream
for C<cpp_test> then becomes active again and the parser reads line 4
and subsequent lines from C<cpp_test>.

You can use more than one source filter on a single file. Similarly,
you can reuse the same filter in as many files as you like.

For example, if you have a uuencoded and compressed source file, it is
possible to stack a uudecode filter and an uncompression filter like
this:

    use Filter::uudecode; use Filter::uncompress;

Once the first line has been processed, the flow will look like this:

    file ---> uudecode ---> uncompress ---> parser
               filter         filter

Data flows through filters in the same order they appear in the source
file. The uudecode filter appeared before the uncompress filter, so the
source file will be uudecoded before it's uncompressed.

=head1 WRITING A SOURCE FILTER


There are three ways to write your own source filter. You can write it
in C, use an external program as a filter, or write the filter in Perl.
I won't cover the first two in any great detail, so I'll get them out
of the way first. Writing the filter in Perl is most convenient, so
I'll devote the most space to it.

=head1 WRITING A SOURCE FILTER IN C


The first of the three available techniques is to write the filter
completely in C. The external module you create interfaces directly
with the source filter hooks provided by Perl.

The advantage of this technique is that you have complete control over
the implementation of your filter. The big disadvantage is the
increased complexity required to write the filter - not only do you
need to understand the source filter hooks, but you also need a
reasonable knowledge of Perl guts. One of the few times it is worth
going to this trouble is when writing a source scrambler. The
C<decrypt> filter (which unscrambles the source before Perl parses it)
included with the source filter distribution is an example of a C
source filter (see Decryption Filters, below).

=over 5

=item B<Decryption Filters>

All decryption filters work on the principle of "security through
obscurity." Regardless of how well you write a decryption filter and
how strong your encryption algorithm, anyone determined enough can
retrieve the original source code. The reason is quite simple - once
the decryption filter has decrypted the source back to its original
form, fragments of it will be stored in the computer's memory as Perl
parses it. The source might only be in memory for a short period of
time, but anyone possessing a debugger, skill, and lots of patience can
eventually reconstruct your program.

That said, there are a number of steps that can be taken to make life
difficult for the potential cracker. The most important: Write your
decryption filter in C and statically link the decryption module into
the Perl binary. For further tips to make life difficult for the
potential cracker, see the file I<decrypt.pm> in the source filters
distribution.

=back

=head1 CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE



An alternative to writing the filter in C is to create a separate
executable in the language of your choice. The separate executable
reads from standard input, does whatever processing is necessary, and
writes the filtered data to standard output. C<Filter::cpp> is an
example of a source filter implemented as a separate executable - the
executable is the C preprocessor bundled with your C compiler.

The source filter distribution includes two modules that simplify this
task: C<Filter::exec> and C<Filter::sh>. Both allow you to run any
external executable. Both use a coprocess to control the flow of data
into and out of the external executable. (For details on coprocesses,
see Stevens, W.R. "Advanced Programming in the UNIX Environment."
Addison-Wesley, ISBN 0-201-56317-7, pages 441-445.) The difference
between them is that C<Filter::exec> spawns the external command
directly, while C<Filter::sh> spawns a shell to execute the external
command. (Unix uses the Bourne shell; NT uses the cmd shell.) Spawning
a shell allows you to make use of the shell metacharacters and
redirection facilities.

Here is an example script that uses C<Filter::sh>:

    use Filter::sh 'tr XYZ PQR';
    $a = 1;
    print "XYZ a = $a\n";

The output you'll get when the script is executed:

    PQR a = 1
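
To see why the output changes, it can help to mimic the effect of
C<tr XYZ PQR> in plain Perl. The sketch below is only a simulation: it
holds the program text in a hypothetical C<$source> variable and applies
Perl's own C<tr///> to it, rather than loading C<Filter::sh> and
spawning the external command.

```perl
    use strict;
    use warnings;

    # The lines that follow the "use Filter::sh" statement, as plain text.
    my $source = join "\n",
        '$a = 1;',
        'print "XYZ a = $a\n";',
        '';

    # Filter::sh would pipe that text through the external tr command;
    # Perl's own tr/// performs the same character mapping here.
    (my $filtered = $source) =~ tr/XYZ/PQR/;

    print $filtered;
```

Note that the mapping applies to the whole program text, not just to
string literals, so an identifier containing X, Y or Z would be
rewritten as well.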

Writing a source filter as a separate executable works fine, but a
small performance penalty is incurred. For example, if you execute the
small example above, a separate subprocess will be created to run the
Unix C<tr> command. Each use of the filter requires its own subprocess.
If creating subprocesses is expensive on your system, you might want to
consider one of the other options for creating source filters.

=head1 WRITING A SOURCE FILTER IN PERL


The easiest and most portable option available for creating your own
source filter is to write it completely in Perl. To distinguish this
from the previous two techniques, I'll call it a Perl source filter.

To help understand how to write a Perl source filter we need an example
to study. Here is a complete source filter that performs rot13
decoding. (Rot13 is a very simple encryption scheme used in Usenet
postings to hide the contents of offensive posts. It moves every letter
forward thirteen places, so that A becomes N, B becomes O, and Z
becomes M.)

   package Rot13;

   use Filter::Util::Call;

   sub import {
      my ($type) = @_;
      my ($ref) = [];
      filter_add(bless $ref);
   }

   sub filter {
      my ($self) = @_;
      my ($status);

      # decode the line that filter_read() appended to $_
      tr/n-za-mN-ZA-M/a-zA-Z/
         if ($status = filter_read()) > 0;
      $status;
   }

   1;


All Perl source filters are implemented as Perl classes and have the
same basic structure as the example above.

First, we include the C<Filter::Util::Call> module, which exports a
number of functions into your filter's namespace. The filter shown
above uses two of these functions, C<filter_add()> and
C<filter_read()>.

Next, we create the filter object and associate it with the source
stream by defining the C<import> function. If you know Perl well
enough, you know that C<import> is called automatically every time a
module is included with a use statement. This makes C<import> the ideal
place to both create and install a filter object.

In the example filter, the object (C<$ref>) is blessed just like any
other Perl object. Our example uses an anonymous array, but this isn't
a requirement. Because this example doesn't need to store any context
information, we could have used a scalar or hash reference just as
well. The next section demonstrates context data.

The association between the filter object and the source stream is made
with the C<filter_add()> function. This takes a filter object as a
parameter (C<$ref> in this case) and installs it in the source stream.

Finally, there is the code that actually does the filtering. For this
type of Perl source filter, all the filtering is done in a method
called C<filter()>. (It is also possible to write a Perl source filter
using a closure. See the C<Filter::Util::Call> manual page for more
details.) It's called every time the Perl parser needs another line of
source to process. The C<filter()> method, in turn, reads lines from
the source stream using the C<filter_read()> function.

If a line was available from the source stream, C<filter_read()>
returns a status value greater than zero and appends the line to C<$_>.
A status value of zero indicates end-of-file, less than zero means an
error. The filter function itself is expected to return its status in
the same way, and put the filtered line it wants written to the source
stream in C<$_>. The use of C<$_> accounts for the brevity of most Perl
source filters.
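
For comparison, here is a minimal sketch of the closure style mentioned
above. The package name C<Rot13Closure> is made up for this example. The
only assumption about the interface is that C<filter_add()> accepts a
code reference, as the C<Filter::Util::Call> manual page describes.

```perl
    package Rot13Closure;   # hypothetical module name

    use strict;
    use warnings;
    use Filter::Util::Call;

    sub import {
        my ($type) = @_;

        # Instead of a blessed object with a filter() method, hand
        # filter_add() an anonymous sub; it is called each time the
        # parser needs another line.
        filter_add(
            sub {
                my $status = filter_read();   # appends next line to $_
                tr/n-za-mN-ZA-M/a-zA-Z/ if $status > 0;
                return $status;
            }
        );
    }

    1;
```

The closure returns its status and leaves the filtered line in C<$_>,
just as the C<filter()> method does.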

In order to make use of the rot13 filter we need some way of encoding
the source file in rot13 format. The script below, C<mkrot13>, does
just that.

    die "usage mkrot13 filename\n" unless @ARGV;
    my $in = $ARGV[0];
    my $out = "$in.tmp";
    open(IN, "<$in") or die "Cannot open file $in: $!\n";
    open(OUT, ">$out") or die "Cannot open file $out: $!\n";

    print OUT "use Rot13;\n";
    while (<IN>) {
       tr/a-zA-Z/n-za-mN-ZA-M/;
       print OUT;
    }

    close IN;
    close OUT;
    unlink $in;
    rename $out, $in;

If we encrypt this with C<mkrot13>:

    print "hello fred\n";

the result will be this:

    use Rot13;
    cevag "uryyb serq\a";

Running it produces this output:

    hello fred
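
The encoding step can be checked without installing any filter. The
sketch below uses a helper named C<rot13>, invented for this example, to
show that applying the same translation twice restores the original
text. That is why the filter and C<mkrot13> can use mirror-image C<tr>
operations.

```perl
    use strict;
    use warnings;

    # Rot13 as a plain function: each letter moves thirteen places.
    sub rot13 {
        my ($text) = @_;
        $text =~ tr/a-zA-Z/n-za-mN-ZA-M/;
        return $text;
    }

    # Single quotes keep the backslash and the "n" as two characters,
    # which is how they appear in a source file.
    my $plain   = 'print "hello fred\n";';
    my $encoded = rot13($plain);      # what mkrot13 would write
    my $decoded = rot13($encoded);    # what the filter hands the parser
```

Because the C<n> of C<\n> is an ordinary letter in the source text, it
is rotated to C<a>, which explains the C<\a> in the encoded line.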


=head1 USING CONTEXT: THE DEBUG FILTER

The rot13 example was a trivial one. Here's another demonstration
that shows off a few more features.

Say you wanted to include a lot of debugging code in your Perl script
during development, but you didn't want it available in the released
product. Source filters offer a solution. In order to keep the example
simple, let's say you wanted the debugging output to be controlled by
an environment variable, C<DEBUG>. Debugging code is enabled if the
variable exists, otherwise it is disabled.

Two special marker lines will bracket debugging code, like this:

    ## DEBUG_BEGIN
    if ($year > 1999) {
       warn "Debug: millennium bug in year $year\n";
    }
    ## DEBUG_END

The filter ensures that Perl parses the code between the C<DEBUG_BEGIN>
and C<DEBUG_END> markers only when the C<DEBUG> environment variable
exists. That means that when C<DEBUG> does exist, the code above
should be passed through the filter unchanged. The marker lines can
also be passed through as-is, because the Perl parser will see them as
comment lines. When C<DEBUG> isn't set, we need a way to disable the
debug code. A simple way to achieve that is to convert the lines
between the two markers into comments:

    ## DEBUG_BEGIN
    #if ($year > 1999) {
    #     warn "Debug: millennium bug in year $year\n";
    #}
    ## DEBUG_END

Here is the complete Debug filter:

    package Debug;

    use strict;
    use warnings;
    use Filter::Util::Call;

    use constant TRUE => 1;
    use constant FALSE => 0;

    sub import {
       my ($type) = @_;
       my (%context) = (
         Enabled => defined $ENV{DEBUG},
         InTraceBlock => FALSE,
         Filename => (caller)[1],
         LineNo => 0,
         LastBegin => 0,
       );
       filter_add(bless \%context);
    }

    sub Die {
       my ($self) = shift;
       my ($message) = shift;
       my ($line_no) = shift || $self->{LastBegin};
       die "$message at $self->{Filename} line $line_no.\n"
    }

    sub filter {
       my ($self) = @_;
       my ($status);
       $status = filter_read();
       ++ $self->{LineNo};

       # deal with EOF/error first
       if ($status <= 0) {
           $self->Die("DEBUG_BEGIN has no DEBUG_END")
               if $self->{InTraceBlock};
           return $status;
       }

       if ($self->{InTraceBlock}) {
          if (/^\s*##\s*DEBUG_BEGIN/ ) {
              $self->Die("Nested DEBUG_BEGIN", $self->{LineNo})
          } elsif (/^\s*##\s*DEBUG_END/) {
              $self->{InTraceBlock} = FALSE;
          }

          # comment out the debug lines when the filter is disabled
          s/^/#/ if ! $self->{Enabled};
       } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) {
          $self->{InTraceBlock} = TRUE;
          $self->{LastBegin} = $self->{LineNo};
       } elsif ( /^\s*##\s*DEBUG_END/ ) {
          $self->Die("DEBUG_END has no DEBUG_BEGIN", $self->{LineNo});
       }
       return $status;
    }

    1;


The big difference between this filter and the previous example is the
use of context data in the filter object. The filter object is based on
a hash reference, and is used to keep various pieces of context
information between calls to the filter function. All but two of the
hash fields are used for error reporting. The first of those two,
Enabled, is used by the filter to determine whether the debugging code
should be given to the Perl parser. The second, InTraceBlock, is true
when the filter has encountered a C<DEBUG_BEGIN> line, but has not yet
encountered the following C<DEBUG_END> line.

If you ignore all the error checking that most of the code does, the
essence of the filter is as follows:

    sub filter {
       my ($self) = @_;
       my ($status);
       $status = filter_read();

       # deal with EOF/error first
       return $status if $status <= 0;
       if ($self->{InTraceBlock}) {
          if (/^\s*##\s*DEBUG_END/) {
             $self->{InTraceBlock} = FALSE
          }

          # comment out debug lines when the filter is disabled
          s/^/#/ if ! $self->{Enabled};
       } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) {
          $self->{InTraceBlock} = TRUE;
       }
       return $status;
    }

Be warned: just as the C-preprocessor doesn't know C, the Debug filter
doesn't know Perl. It can be fooled quite easily:

    print <<EOM;
    ##DEBUG_BEGIN
    EOM

Such things aside, you can see that a lot can be achieved with a modest
amount of code.
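
If you want to experiment with the filter's logic before installing it,
the line-by-line behaviour can be mimicked on an ordinary list. The
sketch below is a simulation, not the filter itself: C<filter_lines> and
its C<$enabled> argument are names invented for this example, and
C<$enabled> stands in for the C<Enabled> field.

```perl
    use strict;
    use warnings;

    # Apply the essence of the Debug filter to a list of lines.
    sub filter_lines {
        my ($enabled, @lines) = @_;
        my $in_block = 0;
        my @out;
        for my $line (@lines) {
            local $_ = $line;
            if ($in_block) {
                $in_block = 0 if /^\s*##\s*DEBUG_END/;
                # comment out debug lines when the filter is disabled
                s/^/#/ unless $enabled;
            }
            elsif (/^\s*##\s*DEBUG_BEGIN/) {
                $in_block = 1;
            }
            push @out, $_;
        }
        return @out;
    }

    my @program = (
        "## DEBUG_BEGIN\n",
        "warn \"Debug: checkpoint\\n\";\n",
        "## DEBUG_END\n",
        "print \"done\\n\";\n",
    );

    my @released = filter_lines(0, @program);   # DEBUG not set
    my @debugged = filter_lines(1, @program);   # DEBUG set
```

In the disabled case the C<DEBUG_END> marker itself also gains a
leading C<#>, which is harmless because the line was already a comment.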


=head1 CONCLUSION

You now have a better understanding of what a source filter is, and you
might even have a possible use for them. If you feel like playing with
source filters but need a bit of inspiration, here are some extra
features you could add to the Debug filter.

First, an easy one. Rather than having debugging code that is
all-or-nothing, it would be much more useful to be able to control
which specific blocks of debugging code get included. Try extending the
syntax for debug blocks to allow each to be identified. The contents of
the C<DEBUG> environment variable can then be used to control which
blocks get included.

Once you can identify individual blocks, try allowing them to be
nested. That isn't difficult either.

Here is an interesting idea that doesn't involve the Debug filter.
Currently Perl subroutines have fairly limited support for formal
parameter lists. You can specify the number of parameters and their
type, but you still have to manually take them out of the C<@_> array
yourself. Write a source filter that allows you to have a named
parameter list. Such a filter would turn this:

    sub MySub ($first, $second, @rest) { ... }

into this:

    sub MySub($$@) {
       my ($first) = shift;
       my ($second) = shift;
       my (@rest) = @_;
       ...
    }
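
As a starting point for that exercise, here is one possible sketch of
the rewriting step on its own, outside any filter. The function name
C<expand_signature> is invented for this example. It handles only a
C<sub> header that sits on a single line ending in C<{>, and makes no
attempt to understand Perl beyond that.

```perl
    use strict;
    use warnings;

    # Rewrite "sub NAME (PARAMS) {" into a prototyped header followed
    # by the statements that unpack @_.  Other lines pass unchanged.
    sub expand_signature {
        my ($line) = @_;
        return $line
            unless $line =~ /^(\s*)sub\s+(\w+)\s*\(([^)]*)\)\s*\{\s*$/;
        my ($indent, $name, $list) = ($1, $2, $3);

        $list =~ s/^\s+|\s+$//g;
        my @params = split /\s*,\s*/, $list;

        # The prototype is just the sigil of each parameter in turn.
        my $prototype = join '', map { substr($_, 0, 1) } @params;

        my @body;
        for my $param (@params) {
            # Scalars are shifted off; an array or hash takes the rest.
            my $source = $param =~ /^\$/ ? 'shift' : '@_';
            push @body, "$indent   my ($param) = $source;\n";
        }
        return "${indent}sub $name($prototype) {\n" . join '', @body;
    }

    my $expanded =
        expand_signature('sub MySub ($first, $second, @rest) {');
```

A real filter would call something like this from its C<filter()>
method and would have to cope with headers split over several lines.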

Finally, if you feel like a real challenge, have a go at writing a
full-blown Perl macro preprocessor as a source filter. Borrow the
useful features from the C preprocessor and any other macro processors
you know. The tricky bit will be choosing how much knowledge of Perl's
syntax you want your filter to have.


=head1 THINGS TO LOOK OUT FOR

=over 5

=item Some Filters Clobber the C<DATA> Handle

Some source filters use the C<DATA> handle to read the calling program.
When using these source filters you cannot rely on this handle, nor expect
any particular kind of behavior when operating on it.  Filters based on
Filter::Util::Call (and therefore Filter::Simple) do not alter the C<DATA>
filehandle.

=back



=head1 REQUIREMENTS

The Source Filters distribution is available on CPAN.


Starting from Perl 5.8 Filter::Util::Call (the core part of the
Source Filters distribution) is part of the standard Perl distribution.
Also included is a friendlier interface called Filter::Simple, by
Damian Conway.

=head1 AUTHOR

Paul Marquess E<lt>Paul.Marquess at btinternet.comE<gt>

=head1 Copyrights

This article originally appeared in The Perl Journal #11, and is
copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and
The Perl Journal.  This document may be distributed under the same terms
as Perl itself.

--- NEW FILE: perldbmfilter.pod ---
=head1 NAME

perldbmfilter - Perl DBM Filters

=head1 SYNOPSIS


    $db = tie %hash, 'DBM', ...

    $old_filter = $db->filter_store_key  ( sub { ... } );
    $old_filter = $db->filter_store_value( sub { ... } );
    $old_filter = $db->filter_fetch_key  ( sub { ... } );
    $old_filter = $db->filter_fetch_value( sub { ... } );

=head1 DESCRIPTION


The four C<filter_*> methods shown above are available in all the DBM
modules that ship with Perl, namely DB_File, GDBM_File, NDBM_File,
ODBM_File and SDBM_File.

Each of the methods works identically, and is used to install (or
uninstall) a single DBM Filter. The only difference between them is the
place where the filter is installed.

To summarise:

=over 5

=item B<filter_store_key>

If a filter has been installed with this method, it will be invoked
every time you write a key to a DBM database.

=item B<filter_store_value>

If a filter has been installed with this method, it will be invoked
every time you write a value to a DBM database.

=item B<filter_fetch_key>

If a filter has been installed with this method, it will be invoked
every time you read a key from a DBM database.

=item B<filter_fetch_value>

If a filter has been installed with this method, it will be invoked
every time you read a value from a DBM database.

=back


You can use any combination of the methods from none to all four.

All filter methods return the existing filter, if present, or C<undef>
if not.

To delete a filter pass C<undef> to it.
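
The following sketch shows installing, replacing and deleting a filter.
It assumes SDBM_File is available (it ships with Perl) and uses a
scratch database in a temporary directory. The file name C<demo> and
the two case-changing filters are invented for this example.

```perl
    use strict;
    use warnings;
    use Fcntl;
    use SDBM_File;
    use File::Temp qw(tempdir);

    # Scratch database in a temporary directory (removed at exit).
    my $dir = tempdir(CLEANUP => 1);
    my %hash;
    my $db = tie(%hash, 'SDBM_File', "$dir/demo", O_RDWR|O_CREAT, 0640)
        or die "Cannot open database: $!\n";

    my $to_upper = sub { $_ = uc };
    my $to_lower = sub { $_ = lc };

    # No filter is installed yet, so the first call returns undef.
    my $before_first  = $db->filter_store_value($to_upper);
    # Replacing a filter returns the one that was installed before.
    my $before_second = $db->filter_store_value($to_lower);

    $hash{a} = "MiXeD";                 # stored through $to_lower
    my $filtered = $hash{a};

    # Passing undef removes the filter again.
    $db->filter_store_value(undef);
    $hash{b} = "MiXeD";                 # stored unchanged
    my $unfiltered = $hash{b};

    undef $db;
    untie %hash;
```

Keeping the returned value makes it possible to restore the previous
filter later by passing it back to the same method.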

=head2 The Filter

When each filter is called by Perl, a local copy of C<$_> will contain
the key or value to be filtered. Filtering is achieved by modifying
the contents of C<$_>. The return code from the filter is ignored.

=head2 An Example -- the NULL termination problem.

DBM Filters are useful for a class of problems where you I<always>
want to make the same transformation to all keys, all values or both.

For example, consider the following scenario. You have a DBM database
that you need to share with a third-party C application. The C application
assumes that I<all> keys and values are NULL terminated. Unfortunately
when Perl writes to DBM databases it doesn't use NULL termination, so
your Perl application will have to manage NULL termination itself. When
you write to the database you will have to use something like this:

    $hash{"$key\0"} = "$value\0";

Similarly the NULL needs to be taken into account when you are considering
the length of existing keys/values.

It would be much better if you could ignore the NULL termination issue
in the main application code and have a mechanism that automatically
added the terminating NULL to all keys and values whenever you write to
the database and have them removed when you read from the database. As I'm
sure you have already guessed, this is a problem that DBM Filters can
fix very easily.

    use strict;
    use warnings;
    use SDBM_File;
    use Fcntl;

    my %hash;
    my $filename = "filt";
    unlink $filename;

    my $db = tie(%hash, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0640)
      or die "Cannot open $filename: $!\n";

    # Install DBM Filters
    $db->filter_fetch_key  ( sub { s/\0$//    } );
    $db->filter_store_key  ( sub { $_ .= "\0" } );
    $db->filter_fetch_value(
        sub { no warnings 'uninitialized'; s/\0$// } );
    $db->filter_store_value( sub { $_ .= "\0" } );

    $hash{"abc"} = "def";
    my $a = $hash{"ABC"};
    # ...
    undef $db;
    untie %hash;

The code above uses SDBM_File, but it will work with any of the DBM
modules.

Hopefully the contents of each of the filters should be
self-explanatory. Both "fetch" filters remove the terminating NULL,
and both "store" filters add a terminating NULL.
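
To convince yourself that the terminating NULL really reaches the
database, you can write through the filters and then reopen the file
without them. This is a sketch under the same assumptions as the code
above (SDBM_File, a scratch file). The temporary directory and the
C<%raw> hash are additions made for this example.

```perl
    use strict;
    use warnings;
    use Fcntl;
    use SDBM_File;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $file = "$dir/filt";

    # First pass: write and read through the four filters.
    my %hash;
    my $db = tie(%hash, 'SDBM_File', $file, O_RDWR|O_CREAT, 0640)
        or die "Cannot open $file: $!\n";
    $db->filter_fetch_key  ( sub { s/\0$//    } );
    $db->filter_store_key  ( sub { $_ .= "\0" } );
    $db->filter_fetch_value(
        sub { no warnings 'uninitialized'; s/\0$// } );
    $db->filter_store_value( sub { $_ .= "\0" } );
    $hash{"abc"} = "def";
    my $seen_by_perl = $hash{"abc"};
    undef $db;
    untie %hash;

    # Second pass: reopen without filters to see the stored bytes.
    my %raw;
    tie(%raw, 'SDBM_File', $file, O_RDONLY, 0640)
        or die "Cannot reopen $file: $!\n";
    my ($raw_key) = keys %raw;
    my $raw_value = $raw{$raw_key};
    untie %raw;
```

The application code sees plain C<abc> and C<def>, while the unfiltered
view shows what a C program sharing the database would read.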

=head2 Another Example -- Key is a C int.

Here is another real-life example. By default, whenever Perl writes to
a DBM database it always writes the key and value as strings. So when
you use this:

    $hash{12345} = "something";

the key 12345 will get stored in the DBM database as the 5 byte string
"12345". If you actually want the key to be stored in the DBM database
as a C int, you will have to use C<pack> when writing, and C<unpack>
when reading.
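
A quick way to see what C<pack> and C<unpack> do with the C<i> template
is to try them outside the database. The variable names in this sketch
are invented for the example, and the width of the packed value depends
on the size of a C int on your platform.

```perl
    use strict;
    use warnings;

    my $key    = 12345;
    my $packed = pack("i", $key);        # native signed int
    my $width  = length $packed;         # sizeof(int) on this platform
    my $back   = unpack("i", $packed);   # the number again
```

On many platforms C<$width> is 4, compared with the 5 bytes of the
string form, but that should not be relied upon.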

Here is a DBM Filter that does it:

    use strict;
    use warnings;
    use DB_File;
    my %hash;
    my $filename = "filt";
    unlink $filename;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH 
      or die "Cannot open $filename: $!\n";

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } );
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } );
    $hash{123} = "def";
    # ...
    undef $db;
    untie %hash;

The code above uses DB_File, but again it will work with any of the
DBM modules.

This time only two filters have been used -- we only need to manipulate
the contents of the key, so it wasn't necessary to install any value
filters.

=head1 SEE ALSO

L<DB_File>, L<GDBM_File>, L<NDBM_File>, L<ODBM_File> and L<SDBM_File>.

=head1 AUTHOR

Paul Marquess

--- NEW FILE: perlipc.pod ---
=head1 NAME

perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)

=head1 DESCRIPTION


The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
IPC calls.  Each is used in slightly different situations.

=head1 Signals

Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers.  Each handler will
be called with an argument which is the name of the signal that
triggered it.  A signal may be generated intentionally from a
particular keyboard sequence like control-C or control-Z, sent to you
from another process, or triggered automatically by the kernel when
special events transpire, like a child process exiting, your process
[...1658 lines suppressed...]
There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is I<Unix
Network Programming, 2nd Edition, Volume 1> by W. Richard Stevens
(published by Prentice-Hall).  Note that most books on networking
address the subject from the perspective of a C programmer; translation
to Perl is left as an exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the Socket(3)
manpage describes the low-level interface to sockets.  Besides the obvious
functions in L<perlfunc>, you should also check out the F<modules> file
at your nearest CPAN site.  (See L<perlmodlib> or, better yet, the F<Perl
FAQ> for a description of what CPAN is and where to get it.)

Section 5 of the F<modules> file is devoted to "Networking, Device Control
(modems), and Interprocess Communication", and contains numerous unbundled
modules for networking, Chat and Expect operations, CGI
programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
Threads, and ToolTalk--just to name a few.

--- NEW FILE: perl570delta.pod ---
=head1 NAME

perl570delta - what's new for perl v5.7.0

=head1 DESCRIPTION


This document describes differences between the 5.6.0 release and
the 5.7.0 release.

=head1 Security Vulnerability Closed

A potential security vulnerability in the optional suidperl component
of Perl has been identified.  suidperl is neither built nor installed
by default.  As of September the 2nd, 2000, the only known vulnerable
platform is Linux, most likely all Linux distributions.  CERT and
various vendors have been alerted about the vulnerability.

The problem was caused by Perl trying to report a suspected security
exploit attempt using an external program, /bin/mail.  On Linux
platforms the /bin/mail program had an undocumented feature which
when combined with suidperl gave access to a root shell, resulting in
a serious compromise instead of reporting the exploit attempt.  If you
don't have /bin/mail, or if you have 'safe setuid scripts', or if
suidperl is not installed, you are safe.

The exploit attempt reporting feature has been completely removed from
the Perl 5.7.0 release, so that particular vulnerability isn't there
anymore.  However, further security vulnerabilities are,
unfortunately, always possible.  The suidperl code is being reviewed
and if deemed too risky to continue to be supported, it may be
completely removed from future releases.  In any case, suidperl should
only be used by security experts who know exactly what they are doing
and why they are using suidperl instead of some other solution such as
sudo ( see http://www.courtesan.com/sudo/ ).

=head1 Incompatible Changes

=over 4

=item *

Arrays now always interpolate into double-quoted strings:
constructs like "foo@bar" now always assume C<@bar> is an array,
whether or not the compiler has seen use of C<@bar>.

=item *

The semantics of bless(REF, REF) were unclear and until someone proves
it to make some sense, it is forbidden.

=item *

A reference to a reference now stringifies as "REF(0x81485ec)" instead
of "SCALAR(0x81485ec)" in order to be more consistent with the return
value of ref().

=item *

The very dusty examples in the eg/ directory have been removed.
Suggestions for new shiny examples are welcome, but the main issue is that
the examples need to be documented, tested and (most importantly)
maintained.

=item *

The obsolete chat2 library that should never have been allowed
to escape the laboratory has been decommissioned.

=item *

The unimplemented POSIX regex features [[.cc.]] and [[=c=]] are still
recognised but now cause fatal errors.  The previous behaviour of
ignoring them by default and warning if requested was unacceptable
since it, in a way, falsely promised that the features could be used.

=item *

The (bogus) escape sequences \8 and \9 now give an optional warning
("Unrecognized escape passed through").  There is no need to \-escape
any C<\w> character.

=item *

lstat(FILEHANDLE) now gives a warning because the operation makes no sense.
In future releases this may become a fatal error.

=item *

The long deprecated uppercase aliases for the string comparison
operators (EQ, NE, LT, LE, GE, GT) have now been removed.

=item *

The regular expression captured submatches ($1, $2, ...) are now
more consistently unset if the match fails, instead of leaving false
data lying around in them.

=item *

The tr///C and tr///U features have been removed and will not return;
the interface was a mistake.  Sorry about that.  For similar
functionality, see pack('U0', ...) and pack('C0', ...).


=head1 Core Enhancements

=over 4

=item *

C<perl -d:Module=arg,arg,arg> now works (previously one couldn't pass
in multiple arguments.)

=item *

my __PACKAGE__ $obj now works.

=item *

C<no Module;> now works even if there is no "sub unimport" in the Module.

=item *

The numerical comparison operators return C<undef> if either operand
is a NaN.  Previously the behaviour was unspecified.
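
A small sketch of the new behaviour, assuming an IEEE 754 platform where
subtracting infinity from itself yields a NaN (the way the NaN is
produced here is an assumption, not part of this change), and using the
C<< <=> >> operator:

    my $inf = 9**9**9;        # overflows to infinity on IEEE 754 platforms
    my $nan = $inf - $inf;    # infinity minus infinity is a NaN

    # With a NaN operand the comparison has no meaningful answer,
    # so the result is undef rather than -1, 0, or 1.
    my $result = $nan <=> 1;
    print defined $result ? "defined\n" : "undef\n";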

=item *

C<pack('U0a*', ...)> can now be used to force a string to UTF-8.

=item *

prototype(\&) is now available.

=item *

There is now an UNTIE method.


=head1 Modules and Pragmata

=head2 New Modules

=over 4

=item *

File::Temp allows one to create temporary files and directories in an
easy, portable, and secure way.
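
A minimal usage sketch, assuming the C<tempfile> and C<tempdir>
functions exported by File::Temp (the variable names are illustrative):

    use File::Temp qw(tempfile tempdir);

    # A scratch directory that is removed when the program exits.
    my $dir = tempdir(CLEANUP => 1);

    # An open filehandle plus the name of a newly created file in $dir.
    my ($fh, $filename) = tempfile(DIR => $dir);
    print $fh "scratch data\n";
    close $fh or die "Cannot close $filename: $!";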

=item *

Storable gives persistence to Perl data structures by allowing the
storage and retrieval of Perl data to and from files in a fast and
compact binary format.
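
A short round-trip sketch, assuming Storable's C<store> and C<retrieve>
functions; the file name and the data are hypothetical:

    use Storable qw(store retrieve);
    use File::Temp qw(tempdir);

    # Hypothetical file inside a scratch directory.
    my $file = tempdir(CLEANUP => 1) . "/settings.sto";
    my %settings = (name => 'demo', sizes => [1, 2, 3]);

    store(\%settings, $file);      # write the structure to disk
    my $copy = retrieve($file);    # read it back as a hash reference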


=head2 Updated And Improved Modules and Pragmata

=over 4

=item *

The following independently supported modules have been updated to
newer versions from CPAN: CGI, CPAN, DB_File, File::Spec, Getopt::Long,
the podlators bundle, Pod::LaTeX, Pod::Parser, Term::ANSIColor, Test.

=item *

Bug fixes and minor enhancements have been applied to B::Deparse,
Data::Dumper, IO::Poll, IO::Socket::INET, Math::BigFloat,
Math::Complex, Math::Trig, Net::protoent, the re pragma, SelfLoader,
Sys::Syslog, Test::Harness, Text::Wrap, UNIVERSAL, and the warnings
pragma.

=item *

The attributes::reftype() now works on tied arguments.

=item *

AutoLoader can now be disabled with C<no AutoLoader;>.

=item *

The English module can now be used without the infamous performance
hit by saying

	use English '-no_performance_hit';

(Assuming, of course, that one doesn't need the troublesome variables
C<$`>, C<$&>, or C<$'>.)  Also, introduced C<@LAST_MATCH_START> and
C<@LAST_MATCH_END> English aliases for C<@-> and C<@+>.

=item *

File::Find now has pre- and post-processing callbacks.  It also
correctly changes directories when chasing symbolic links.  Callbacks
(naughtily) exiting with "next;" instead of "return;" now work.

=item *

File::Glob::glob() renamed to File::Glob::bsd_glob() to avoid
prototype mismatch with CORE::glob().

=item *

IPC::Open3 now allows the use of numeric file descriptors.

=item *

use lib now works identically to @INC.  Removing directories
with 'no lib' now works.

=item *

C<%INC> now localised in a Safe compartment so that use/require work.

=item *

The Shell module now has an OO interface.


=head1 Utility Changes

=over 4

=item *

The Emacs perl mode (emacs/cperl-mode.el) has been updated to version

=item *

Perlbug is now much more robust.  It also sends the bug report to
perl.org, not perl.com.

=item *

The perlcc utility has been rewritten and its user interface (that is,
command line) is much more like that of the UNIX C compiler, cc.

=item *

The xsubpp utility for extension writers now understands POD
documentation embedded in the *.xs files.


=head1 New Documentation

=over 4

=item *

perl56delta details the changes between the 5.005 release and the
5.6.0 release.

=item *

perldebtut is a Perl debugging tutorial.

=item *

perlebcdic contains considerations for running Perl on EBCDIC platforms.
Note that unfortunately EBCDIC platforms that used to be supported back in
Perl 5.005 are still unsupported by Perl 5.7.0; the plan, however, is to
bring them back to the fold.  

=item *

perlnewmod tells about writing and submitting a new module.

=item *

perlposix-bc explains using Perl on the POSIX-BC platform
(an EBCDIC mainframe platform).

=item *

perlretut is a regular expression tutorial.

=item *

perlrequick is a regular expressions quick-start guide.
Yes, much quicker than perlretut.

=item *

perlutil explains the command line utilities packaged with the Perl
distribution.


=head1 Performance Enhancements

=over 4

=item *

map() that changes the size of the list should now work faster.

=item *

sort() has been changed to use mergesort internally as opposed to the
earlier quicksort.  For very small lists this may result in slightly
slower sorting times, but in general the speedup should be at least
20%.  Additional bonuses are that the worst case behaviour of sort()
is now better (in computer science terms it now runs in time O(N log N),
as opposed to quicksort's Theta(N**2) worst-case run time behaviour),
and that sort() is now stable (meaning that elements with identical
keys will stay ordered as they were before the sort).
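
To illustrate what stability means, here is a small sketch (the data is
made up) that sorts records by a numeric key only; it assumes the
mergesort implementation described above, so records with equal keys
keep their original relative order:

    my @records = (['pear', 2], ['apple', 1], ['fig', 2], ['kiwi', 1]);

    # Compare on the numeric key alone; names play no part in the ordering.
    my @sorted = sort { $a->[1] <=> $b->[1] } @records;

    # With a stable sort, 'apple' stays ahead of 'kiwi' and 'pear'
    # stays ahead of 'fig'.
    my @names = map { $_->[0] } @sorted;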


=head1 Installation and Configuration Improvements

=head2 Generic Improvements

=over 4

=item *

INSTALL now explains how you can configure Perl to use 64-bit
integers even on non-64-bit platforms.

=item *

Policy.sh policy change: if you are reusing a Policy.sh file
(see INSTALL) and you use Configure -Dprefix=/foo/bar and in the old
Policy $prefix eq $siteprefix and $prefix eq $vendorprefix, all of
them will now be changed to the new prefix, /foo/bar.  (Previously
only $prefix changed.)  If you do not like this new behaviour,
specify prefix, siteprefix, and vendorprefix explicitly.

=item *

A new optional location for Perl libraries, otherlibdirs, is available.
It can be used for example for vendor add-ons without disturbing Perl's
own library directories.

=item *

In many platforms the vendor-supplied 'cc' is too stripped-down to
build Perl (basically, 'cc' doesn't do ANSI C).  If this seems
to be the case and 'cc' does not seem to be the GNU C compiler
'gcc', an automatic attempt is made to find and use 'gcc' instead.

=item *

gcc needs to closely track the operating system release to avoid
build problems. If Configure finds that gcc was built for a different
operating system release than is running, it now gives a clearly visible
warning that there may be trouble ahead.

=item *

If binary compatibility with the 5.005 release is not wanted, Configure
no longer suggests including the 5.005 modules in @INC.

=item *

Configure C<-S> can now run non-interactively.

=item *

configure.gnu now works with options with whitespace in them.

=item *

installperl now outputs everything to STDERR.

=item *

$Config{byteorder} is now computed dynamically (this is more robust
with "fat binaries" where an executable image contains binaries for
more than one binary platform.)


=head1 Selected Bug Fixes

=over 4

=item *

Several debugger fixes: exit code now reflects the script exit code,
condition C<"0"> now treated correctly, the C<d> command now checks
line number, the C<$.> no longer gets corrupted, all debugger output now
goes correctly to the socket if RemotePort is set.

=item *

C<*foo{FORMAT}> now works.

=item *

Lexical warnings now propagate correctly between scopes.

=item *

Line renumbering with eval and C<#line> now works.

=item *

Fixed numerous memory leaks, especially in eval "".

=item *

Modulus of unsigned numbers now works (4063328477 % 65535 used to
return 27406, instead of 27047).

=item *

Some "not a number" warnings introduced in 5.6.0 eliminated to be
more compatible with 5.005.  Infinity is now recognised as a number.

=item *

our() variables will not cause "will not stay shared" warnings.

=item *

pack "Z" now correctly terminates the string with "\0".

=item *

Fix password routines which in some shadow password platforms
(e.g. HP-UX) caused getpwent() to return every other entry.

=item *

printf() no longer resets the numeric locale to "C".

=item *

C<q(a\\b)> now parses correctly as C<'a\\b'>.

=item *

Printing quads (64-bit integers) with printf/sprintf now works
without the q L ll prefixes (assuming you are on a quad-capable platform).

=item *

Regular expressions on references and overloaded scalars now work.

=item *

scalar() now forces scalar context even when used in void context.

=item *

sort() arguments are now compiled in the right wantarray context
(they were accidentally using the context of the sort() itself).

=item *

Changed the POSIX character class C<[[:space:]]> to include the (very
rare) vertical tab character.  Added a new POSIX-ish character class
C<[[:blank:]]> which stands for horizontal whitespace (currently,
the space and the tab).
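
A quick sketch of the difference between the two classes (the test
string is made up for illustration):

    my $text = "a \tb\nc";

    # [[:blank:]] matches only the space and the tab.
    my $blank_count = () = $text =~ /[[:blank:]]/g;

    # [[:space:]] additionally matches the newline here.
    my $space_count = () = $text =~ /[[:space:]]/g;

    # The vertical tab belongs to [[:space:]] but not to [[:blank:]].
    my $vt_is_space = "\x0B" =~ /[[:space:]]/ ? 1 : 0;
    my $vt_is_blank = "\x0B" =~ /[[:blank:]]/ ? 1 : 0;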

=item *

$AUTOLOAD, sort(), lock(), and spawning subprocesses
in multiple threads simultaneously are now thread-safe.

=item *

Allow read-only string on left hand side of non-modifying tr///.

=item *

Several Unicode fixes (but still not perfect).

=over 8

=item *

BOMs (byte order marks) in the beginning of Perl files
(scripts, modules) should now be transparently skipped.
UTF-16 (UCS-2) encoded Perl files should now be read correctly.

=item *

The character tables have been updated to Unicode 3.0.1.

=item *

chr() for values greater than 127 now creates utf8 when under use
utf8.

=item *

Comparing with utf8 data does not magically upgrade non-utf8 data into
utf8.

=item *

C<IsAlnum>, C<IsAlpha>, and C<IsWord> now match titlecase.

=item *

Concatenation with the C<.> operator or via variable interpolation,
C<eq>, C<substr>, C<reverse>, C<quotemeta>, the C<x> operator,
substitution with C<s///>, single-quoted UTF-8, should now work--in

=item *

The C<tr///> operator now works I<slightly> better but is still rather
broken.  Note that the C<tr///CU> functionality has been removed (but
see pack('U0', ...)).

=item *

vec() now refuses to deal with characters >255.

=item *

Zero entries were missing from the Unicode classes like C<IsDigit>.


=item *

UNIVERSAL::isa no longer caches methods incorrectly.  (This broke
the Tk extension with 5.6.0.)


=head2 Platform Specific Changes and Fixes

=over 4

=item *

BSDI 4.*

Perl now works on post-4.0 BSD/OSes.

=item *

All BSDs

Setting C<$0> now works (as much as possible; see perlvar for details).

=item *

Cygwin

Numerous updates; currently synchronised with Cygwin 1.1.4.

=item *

EPOC

EPOC update after Perl 5.6.0.  See README.epoc.

=item *

FreeBSD 3.*

Perl now works on post-3.0 FreeBSDs.

=item *

HP-UX

README.hpux updated; C<Configure -Duse64bitall> now almost works.

=item *


Numerous compilation flag and hint enhancements; accidental mixing
of 32-bit and 64-bit libraries (a doomed attempt) made much harder.

=item *


Long doubles should now work (see INSTALL).

=item *

Mac OS Classic

Compilation of the standard Perl distribution in Mac OS Classic should
now work if you have the Metrowerks development environment and the
missing Mac-specific toolkit bits.  Contact the macperl mailing list
for details.

=item *

MPE/iX

MPE/iX update after Perl 5.6.0.  See README.mpeix.

=item *

NetBSD/sparc

Perl now works on NetBSD/sparc.

=item *


Now works with usethreads (see INSTALL).

=item *


64-bitness using the Sun Workshop compiler now works.

=item *

Tru64 (aka Digital UNIX, aka DEC OSF/1)

The operating system version letter now recorded in $Config{osvers}.
Allow compiling with gcc (previously explicitly forbidden).  Compiling
with gcc still not recommended because buggy code results, even with
gcc 2.95.2.

=item *


Fixed various alignment problems that led to core dumps either
during build or later; no longer dies on math errors at runtime;
now using full quad integers (64 bits), previously was using 
only 46 bit integers for speed.

=item *


chdir() now works better despite a CRT bug; now works with MULTIPLICITY
(see INSTALL); now works with Perl's malloc.

=item *


=over 8

=item *

accept() no longer leaks memory.

=item *

Better chdir() return value for a non-existent directory.

=item *

New %ENV entries now propagate to subprocesses.

=item *

$ENV{LIB} now used to search for libs under Visual C.

=item *

A failed (pseudo)fork now returns undef and sets errno to EAGAIN.

=item *

Allow REG_EXPAND_SZ keys in the registry.

=item *

Can now send() from all threads, not just the first one.

=item *

Fake signal handling reenabled, bugs and all.

=item *

Less stack reserved per thread so that more threads can run
concurrently. (Still 16M per thread.)

=item *

C<< File::Spec->tmpdir() >> now prefers C:/temp over /tmp
(works better when perl is running as service).

=item *

Better UNC path handling under ithreads.

=item *

wait() and waitpid() now work much better.

=item *

winsock handle leak fixed.



=head1 New or Changed Diagnostics

All regular expression compilation error messages are now hopefully
easier to understand both because the error message now comes before
the failed regex and because the point of failure is now clearly
marked.

The various "opened only for", "on closed", "never opened" warnings
drop the C<main::> prefix for filehandles in the C<main> package,
for example C<STDIN> instead of C<main::STDIN>.

The "Unrecognized escape" warning has been extended to include C<\8>,
C<\9>, and C<\_>.  There is no need to escape any of the C<\w> characters.

=head1 Changed Internals

=over 4

=item *

perlapi.pod (a companion to perlguts) now attempts to document the
internal API.

=item *

You can now build a really minimal perl called microperl.
Building microperl does not require even running Configure;
C<make -f Makefile.micro> should be enough.  Beware: microperl makes
many assumptions, some of which may be too bold; the resulting
executable may crash or otherwise misbehave in wondrous ways.
For careful hackers only.

=item *

Added rsignal(), whichsig(), do_join() to the publicised API.

=item *

Made it possible to propagate customised exceptions via croak()ing.

=item *

Added is_utf8_char(), is_utf8_string(), bytes_to_utf8(), and utf8_to_bytes().

=item *

Now xsubs can have attributes just like subs.


=head1 Known Problems

=head2 Unicode Support Still Far From Perfect

We're working on it.  Stay tuned.

=head2 EBCDIC Still A Lost Platform

The plan is to bring them back.

=head2 Building Extensions Can Fail Because Of Largefiles

Certain extensions like mod_perl and BSD::Resource are known to have
issues with `largefiles', a change brought by Perl 5.6.0 in which file
offsets default to 64 bits wide, where supported.  Modules may fail to
compile at all or compile and work incorrectly.  Currently there is no
good solution for the problem, but Configure now provides appropriate
non-largefile ccflags, ldflags, libswanted, and libs in the %Config
hash (e.g., $Config{ccflags_nolargefiles}) so the extensions that are
having problems can try configuring themselves without the
largefileness.  This is admittedly not a clean solution, and the
solution may not even work at all.  One potential failure is whether
one can (or, if one can, whether it's a good idea) link together at
all binaries with different ideas about file offsets, all this is

=head2 ftmp-security tests warn 'system possibly insecure'

Don't panic.  Read INSTALL 'make test' section instead. 

=head2 Test lib/posix Subtest 9 Fails In LP64-Configured HP-UX

If perl is configured with -Duse64bitall, the successful result of the
subtest 10 of lib/posix may arrive before the successful result of the
subtest 9, which confuses the test harness so much that it thinks the
subtest 9 failed.

=head2 Long Doubles Still Don't Work In Solaris

Long double support is still very much experimental in Solaris.
(Other platforms like Linux and Tru64 are beginning to solidify in
this area.)

=head2 Linux With Sfio Fails op/misc Test 48

No known fix.

=head2 Storable tests fail in some platforms

If any Storable tests fail the use of Storable is not advisable.

=over 4

=item *

Many Storable tests fail on AIX configured with 64 bit integers.

So far unidentified problems break Storable in AIX if Perl is
configured to use 64 bit integers.  AIX in 32-bit mode works and
other 64-bit platforms work with Storable.

=item *

DOS DJGPP may hang when testing Storable.

=item *

st-06compat fails in UNICOS and UNICOS/mk.

This means that you cannot read old (pre-Storable-0.7) Storable images
made in other platforms.

=item *

st-store.t and st-retrieve may fail with Compaq C 6.2 on OpenVMS Alpha 7.2.


=head2 Threads Are Still Experimental

Multithreading is still an experimental feature.  Some platforms
emit the following message for lib/thr5005

    # This is a KNOWN FAILURE, and one of the reasons why threading
    # is still an experimental feature.  It is here to stop people
    # from deploying threads in production. ;-)

and another known thread-related warning is

   pragma/overload......Unbalanced saves: 3 more saves than restores
   panic: magic_mutexfree during global destruction.
   lib/selfloader.......Unbalanced saves: 3 more saves than restores
   panic: magic_mutexfree during global destruction.
   lib/st-dclone........Unbalanced saves: 3 more saves than restores
   panic: magic_mutexfree during global destruction.

=head2 The Compiler Suite Is Still Experimental

The compiler suite is slowly getting better but is nowhere near
working order yet.  The backend part that has seen perhaps the most
progress is the bytecode compiler.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/  There may also be
information at http://www.perl.com/perl/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Jarkko Hietaniemi <F<jhi@iki.fi>>, with many contributions
from The Perl Porters and Perl Users submitting feedback and patches.

Send omissions or corrections to <F<perlbug@perl.org>>.


--- NEW FILE: perlpod.pod ---

=for comment
This document is in Pod format.  To read this, use a Pod formatter,
like "perldoc perlpod".

=head1 NAME
X<POD> X<plain old documentation>

perlpod - the Plain Old Documentation format

=head1 DESCRIPTION

Pod is a simple-to-use markup language used for writing documentation
for Perl, Perl programs, and Perl modules.

Translators are available for converting Pod to various formats
like plain text, HTML, man pages, and more.

Pod markup consists of three basic kinds of paragraphs:
L<ordinary|/"Ordinary Paragraph">,
L<verbatim|/"Verbatim Paragraph">, and 
L<command|/"Command Paragraph">.

=head2 Ordinary Paragraph
X<POD, ordinary paragraph>

Most paragraphs in your documentation will be ordinary blocks
of text, like this one.  You can simply type in your text without
any markup whatsoever, and with just a blank line before and
after.  When it gets formatted, it will undergo minimal formatting, 
like being rewrapped, probably put into a proportionally spaced
font, and maybe even justified.

You can use formatting codes in ordinary paragraphs, for B<bold>,
I<italic>, C<code-style>, L<hyperlinks|perlfaq>, and more.  Such
codes are explained in the "L<Formatting Codes|/"Formatting Codes">"
section, below.

=head2 Verbatim Paragraph
X<POD, verbatim paragraph> X<verbatim>

Verbatim paragraphs are usually used for presenting a codeblock or
other text which does not require any special parsing or formatting,
and which shouldn't be wrapped.

A verbatim paragraph is distinguished by having its first character
be a space or a tab.  (And commonly, all its lines begin with spaces
and/or tabs.)  It should be reproduced exactly, with tabs assumed to
be on 8-column boundaries.  There are no special formatting codes,
so you can't italicize or anything like that.  A \ means \, and
nothing else.

=head2 Command Paragraph
X<POD, command>

A command paragraph is used for special treatment of whole chunks
of text, usually as headings or parts of lists.

All command paragraphs (which are typically only one line long) start
with "=", followed by an identifier, followed by arbitrary text that
the command can use however it pleases.  Currently recognized commands
are

    =pod
    =head1 Heading Text
    =head2 Heading Text
    =head3 Heading Text
    =head4 Heading Text
    =over indentlevel
    =item stuff
    =back
    =begin format
    =end format
    =for format text...
    =encoding type
    =cut

To explain them each in detail:


=item C<=head1 I<Heading Text>>
X<=head1> X<=head2> X<=head3> X<=head4>
X<head1> X<head2> X<head3> X<head4>

=item C<=head2 I<Heading Text>>

=item C<=head3 I<Heading Text>>

=item C<=head4 I<Heading Text>>

Head1 through head4 produce headings, head1 being the highest
level.  The text in the rest of this paragraph is the content of the
heading.  For example:

  =head2 Object Attributes

The text "Object Attributes" comprises the heading there.  (Note that
head3 and head4 are recent additions, not supported in older Pod
translators.)  The text in these heading commands can use
formatting codes, as seen here:

  =head2 Possible Values for C<$/>

Such commands are explained in the
"L<Formatting Codes|/"Formatting Codes">" section, below.

=item C<=over I<indentlevel>>
X<=over> X<=item> X<=back> X<over> X<item> X<back>

=item C<=item I<stuff...>>

=item C<=back>

Item, over, and back require a little more explanation:  "=over" starts
a region specifically for the generation of a list using "=item"
commands, or for indenting (groups of) normal paragraphs.  At the end
of your list, use "=back" to end it.  The I<indentlevel> option to
"=over" indicates how far over to indent, generally in ems (where
one em is the width of an "M" in the document's base font) or roughly
comparable units; if there is no I<indentlevel> option, it defaults
to four.  (And some formatters may just ignore whatever I<indentlevel>
you provide.)  In the I<stuff> in C<=item I<stuff...>>, you may
use formatting codes, as seen here:

  =item Using C<$|> to Control Buffering

Such commands are explained in the
"L<Formatting Codes|/"Formatting Codes">" section, below.

Note also that there are some basic rules to using "=over" ...
"=back" regions:


=item *

Don't use "=item"s outside of an "=over" ... "=back" region.

=item *

The first thing after the "=over" command should be an "=item", unless
there aren't going to be any items at all in this "=over" ... "=back"
region.

=item *

Don't put "=headI<n>" commands inside an "=over" ... "=back" region.

=item *

And perhaps most importantly, keep the items consistent: either use
"=item *" for all of them, to produce bullets; or use "=item 1.",
"=item 2.", etc., to produce numbered lists; or use "=item foo",
"=item bar", etc. -- namely, things that look nothing like bullets or
numbers.

If you start with bullets or numbers, stick with them, as
formatters use the first "=item" type to decide how to format the
list.


=item C<=cut>
X<=cut> X<cut>

To end a Pod block, use a blank line,
then a line beginning with "=cut", and a blank
line after it.  This lets Perl (and the Pod formatter) know that
this is where Perl code is resuming.  (The blank line before the "=cut"
is not technically necessary, but many older Pod processors require it.)

=item C<=pod>
X<=pod> X<pod>

The "=pod" command by itself doesn't do much of anything, but it
signals to Perl (and Pod formatters) that a Pod block starts here.  A
Pod block starts with I<any> command paragraph, so a "=pod" command is
usually used just when you want to start a Pod block with an ordinary
paragraph or a verbatim paragraph.  For example:

  =item stuff()

  This function does stuff.

  =cut

  sub stuff {
    ...
  }

  =pod

  Remember to check its return value, as in:

    stuff() || die "Couldn't do stuff!";

  =cut


=item C<=begin I<formatname>>
X<=begin> X<=end> X<=for> X<begin> X<end> X<for>

=item C<=end I<formatname>>

=item C<=for I<formatname> I<text...>>

For, begin, and end will let you have regions of text/code/data that
are not generally interpreted as normal Pod text, but are passed
directly to particular formatters, or are otherwise special.  A
formatter that can use that format will use the region, otherwise it
will be completely ignored.

A command "=begin I<formatname>", some paragraphs, and a
command "=end I<formatname>", mean that the text/data in between
is meant for formatters that understand the special format
called I<formatname>.  For example,

  =begin html

  <hr> <img src="thang.png">
  <p> This is a raw HTML paragraph </p>

  =end html

The command "=for I<formatname> I<text...>"
specifies that the remainder of just this paragraph (starting
right after I<formatname>) is in that special format.  

  =for html <hr> <img src="thang.png">
  <p> This is a raw HTML paragraph </p>

This means the same thing as the above "=begin html" ... "=end html"
region.

That is, with "=for", you can have only one paragraph's worth
of text (i.e., the text in "=for targetname text..."), but with
"=begin targetname" ... "=end targetname", you can have any amount
of stuff in between.  (Note that there still must be a blank line
after the "=begin" command and a blank line before the "=end"
command.)

Here are some examples of how to use these:

  =begin html

  <br>Figure 1.<br><IMG SRC="figure1.png"><br>

  =end html

  =begin text

    |  foo        |
    |        bar  |

  ^^^^ Figure 1. ^^^^

  =end text

Some format names that formatters currently are known to accept
include "roff", "man", "latex", "tex", "text", and "html".  (Some
formatters will treat some of these as synonyms.)

A format name of "comment" is common for just making notes (presumably
to yourself) that won't appear in any formatted version of the Pod
document:

  =for comment
  Make sure that all the available options are documented!

Some I<formatnames> will require a leading colon (as in
C<"=for :formatname">, or
C<"=begin :formatname" ... "=end :formatname">),
to signal that the text is not raw data, but instead I<is> Pod text
(i.e., possibly containing formatting codes) that's just not for
normal formatting (e.g., may not be a normal-use paragraph, but might
be for formatting as a footnote).
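
As an illustrative sketch (the format name "footnote" is hypothetical;
whether any formatter acts on it depends on that formatter), a
colon-prefixed region holds ordinary Pod markup rather than raw data:

  =begin :footnote

  This text is I<Pod>, so formatting codes such as C<$x> still apply.

  =end :footnote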

=item C<=encoding I<encodingname>>
X<=encoding> X<encoding>

This command is used for declaring the encoding of a document.  Most
users won't need this; but if your encoding isn't US-ASCII or Latin-1,
then put a C<=encoding I<encodingname>> command early in the document so
that pod formatters will know how to decode the document.  For
I<encodingname>, use a name recognized by the L<Encode::Supported>
module.  Examples:

  =encoding utf8

  =encoding koi8-r
  =encoding ShiftJIS
  =encoding big5


And don't forget, when using any command, that the command lasts up
until the end of its I<paragraph>, not its line.  So in the
examples below, you can see that every command needs the blank
line after it, to end its paragraph.

Some examples of lists include:


  =over

  =item *

  First item

  =item *

  Second item

  =back

  =over

  =item Foo()

  Description of Foo function

  =item Bar()

  Description of Bar function

  =back


=head2 Formatting Codes
X<POD, formatting code> X<formatting code>
X<POD, interior sequence> X<interior sequence>

In ordinary paragraphs and in some command paragraphs, various
formatting codes (a.k.a. "interior sequences") can be used:

=for comment
 "interior sequences" is such an opaque term.
 Prefer "formatting codes" instead.


=item C<IE<lt>textE<gt>> -- italic text
X<I> X<< IZ<><> >> X<POD, formatting code, italic> X<italic>

Used for emphasis ("C<be IE<lt>careful!E<gt>>") and parameters
("C<redo IE<lt>LABELE<gt>>")

=item C<BE<lt>textE<gt>> -- bold text
X<B> X<< BZ<><> >> X<POD, formatting code, bold> X<bold>

Used for switches ("C<perl's BE<lt>-nE<gt> switch>"), programs
("C<some systems provide a BE<lt>chfnE<gt> for that>"),
emphasis ("C<be BE<lt>careful!E<gt>>"), and so on
("C<and that feature is known as BE<lt>autovivificationE<gt>>").

=item C<CE<lt>codeE<gt>> -- code text
X<C> X<< CZ<><> >> X<POD, formatting code, code> X<code>

Renders code in a typewriter font, or gives some other indication that
this represents program text ("C<CE<lt>gmtime($^T)E<gt>>") or some other
form of computerese ("C<CE<lt>drwxr-xr-xE<gt>>").

=item C<LE<lt>nameE<gt>> -- a hyperlink
X<L> X<< LZ<><> >> X<POD, formatting code, hyperlink> X<hyperlink>

There are various syntaxes, listed below.  In the syntaxes given,
C<text>, C<name>, and C<section> cannot contain the characters
'/' and '|'; and any '<' or '>' should be matched.


=item *

C<LE<lt>nameE<gt>>

Link to a Perl manual page (e.g., C<LE<lt>Net::PingE<gt>>).  Note
that C<name> should not contain spaces.  This syntax
is also occasionally used for references to UNIX man pages, as in

=item *

C<LE<lt>name/"sec"E<gt>> or C<LE<lt>name/secE<gt>>

Link to a section in another manual page.  E.g.,
C<LE<lt>perlsyn/"For Loops"E<gt>>

=item *

C<LE<lt>/"sec"E<gt>> or C<LE<lt>/secE<gt>> or C<LE<lt>"sec"E<gt>>

Link to a section in this manual page.  E.g.,
C<LE<lt>/"Object Methods"E<gt>>


A section is started by the named heading or item.  For
example, C<LE<lt>perlvar/$.E<gt>> or C<LE<lt>perlvar/"$."E<gt>> both
link to the section started by "C<=item $.>" in perlvar.  And
C<LE<lt>perlsyn/For LoopsE<gt>> or C<LE<lt>perlsyn/"For Loops"E<gt>>
both link to the section started by "C<=head2 For Loops>"
in perlsyn.

To control what text is used for display, you
use "C<LE<lt>text|...E<gt>>", as in:


=item *

C<LE<lt>text|nameE<gt>>

Link this text to that manual page.  E.g.,
C<LE<lt>Perl Error Messages|perldiagE<gt>>

=item *

C<LE<lt>text|name/"sec"E<gt>> or C<LE<lt>text|name/secE<gt>>

Link this text to that section in that manual page.  E.g.,
C<LE<lt>SWITCH statements|perlsyn/"Basic BLOCKs and Switch
Statements"E<gt>>

=item *

C<LE<lt>text|/"sec"E<gt>> or C<LE<lt>text|/secE<gt>>
or C<LE<lt>text|"sec"E<gt>>

Link this text to that section in this manual page.  E.g.,
C<LE<lt>the various attributes|/"Member Data"E<gt>>


Or you can link to a web page:


=item *

C<LE<lt>scheme:...E<gt>>

Links to an absolute URL.  For example,
C<LE<lt>http://www.perl.org/E<gt>>.  But note
that there is no corresponding C<LE<lt>text|scheme:...E<gt>> syntax, for
various reasons.


=item C<EE<lt>escapeE<gt>> -- a character escape
X<E> X<< EZ<><> >> X<POD, formatting code, escape> X<escape>

Very similar to HTML/XML C<&I<foo>;> "entity references":


=item *

C<EE<lt>ltE<gt>> -- a literal E<lt> (less than)

=item *

C<EE<lt>gtE<gt>> -- a literal E<gt> (greater than)

=item *

C<EE<lt>verbarE<gt>> -- a literal | (I<ver>tical I<bar>)

=item *

C<EE<lt>solE<gt>> -- a literal / (I<sol>idus)

The above four are optional except in other formatting codes,
notably C<LE<lt>...E<gt>>, and when preceded by a
capital letter.

=item *

C<EE<lt>htmlnameE<gt>>

Some non-numeric HTML entity name, such as C<EE<lt>eacuteE<gt>>,
meaning the same thing as C<&eacute;> in HTML -- i.e., a lowercase
e with an acute (/-shaped) accent.

=item *

C<EE<lt>numberE<gt>>

The ASCII/Latin-1/Unicode character with that number.  A
leading "0x" means that I<number> is hex, as in
C<EE<lt>0x201EE<gt>>.  A leading "0" means that I<number> is octal,
as in C<EE<lt>075E<gt>>.  Otherwise I<number> is interpreted as being
in decimal, as in C<EE<lt>181E<gt>>.

Note that older Pod formatters might not recognize octal or
hex numeric escapes, and that many formatters cannot reliably
render characters above 255.  (Some formatters may even have
to use compromised renderings of Latin-1 characters, like
rendering C<EE<lt>eacuteE<gt>> as just a plain "e".)
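
The escape rules above can be summarized in a few lines of Perl.  This is
only a sketch under stated assumptions: C<decode_escape> is a hypothetical
helper, it knows the four punctuation names listed above plus numeric
escapes, and it deliberately returns nothing for other HTML entity names
such as C<eacute>, which a real formatter would look up in a table.

```perl
use strict;
use warnings;

# The four named escapes described in the text.
my %NAMED = (lt => '<', gt => '>', verbar => '|', sol => '/');

# Decode the inside of an E<...> code; returns undef if unrecognized.
sub decode_escape {
    my ($code) = @_;
    return $NAMED{$code} if exists $NAMED{$code};
    return chr hex $code if $code =~ /\A0[xX][0-9A-Fa-f]+\z/;  # leading "0x": hex
    return chr oct $code if $code =~ /\A0[0-7]+\z/;            # leading "0": octal
    return chr $code     if $code =~ /\A[0-9]+\z/;             # otherwise decimal
    return;   # other entity names are not handled in this sketch
}

# Show the code point each example from the text decodes to.
for my $code (qw(0x201E 075 181 verbar)) {
    printf "E<%s> is code point %d\n", $code, ord decode_escape($code);
}
```

The octal test comes before the decimal one so that a leading zero is
honored, as the text describes.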


=item C<FE<lt>filenameE<gt>> -- used for filenames
X<F> X<< FZ<><> >> X<POD, formatting code, filename> X<filename>

Typically displayed in italics.  Example: "C<FE<lt>.cshrcE<gt>>"

=item C<SE<lt>textE<gt>> -- text contains non-breaking spaces
X<S> X<< SZ<><> >> X<POD, formatting code, non-breaking space> 
X<non-breaking space>

This means that the words in I<text> should not be broken
across lines.  Example: S<C<SE<lt>$x ? $y : $zE<gt>>>.

=item C<XE<lt>topic nameE<gt>> -- an index entry
X<X> X<< XZ<><> >> X<POD, formatting code, index entry> X<index entry>

This is ignored by most formatters, but some may use it for building
indexes.  It always renders as empty-string.
Example: C<XE<lt>absolutizing relative URLsE<gt>>

=item C<ZE<lt>E<gt>> -- a null (zero-effect) formatting code
X<Z> X<< ZZ<><> >> X<POD, formatting code, null> X<null>

This is rarely used.  It's one way to get around using an
EE<lt>...E<gt> code sometimes.  For example, instead of
"C<NEE<lt>ltE<gt>3>" (for "NE<lt>3") you could write
"C<NZE<lt>E<gt>E<lt>3>" (the "ZE<lt>E<gt>" breaks up the "N" and
the "E<lt>" so they can't be considered
part of a (fictitious) "NE<lt>...E<gt>" code).

=for comment
 This was formerly explained as a "zero-width character".  But in
 most parser models, it parses to nothing at all, as opposed to parsing
 as if it were a E<zwnj> or E<zwj>, which are REAL zero-width characters.
 So "width" and "character" are exactly the wrong words.


Most of the time, you will need only a single set of angle brackets to
delimit the beginning and end of formatting codes.  However,
sometimes you will want to put a real right angle bracket (a
greater-than sign, '>') inside of a formatting code.  This is particularly
common when using a formatting code to provide a different font-type for a
snippet of code.  As with all things in Perl, there is more than
one way to do it.  One way is to simply escape the closing bracket
using an C<E> code:

    C<$a E<lt>=E<gt> $b>

This will produce: "C<$a E<lt>=E<gt> $b>"

A more readable, and perhaps more "plain" way is to use an alternate
set of delimiters that doesn't require a single ">" to be escaped.  With
the Pod formatters that are standard starting with perl5.5.660, doubled
angle brackets ("<<" and ">>") may be used I<if and only if there is
whitespace right after the opening delimiter and whitespace right
before the closing delimiter!>  For example, the following will
do the trick:
X<POD, formatting code, escaping with multiple brackets>

    C<< $a <=> $b >>

In fact, you can use as many repeated angle-brackets as you like so
long as you have the same number of them in the opening and closing
delimiters, and make sure that whitespace immediately follows the last
'<' of the opening delimiter, and immediately precedes the first '>'
of the closing delimiter.  (The whitespace is ignored.)  So the
following will also work:
X<POD, formatting code, escaping with multiple brackets>

    C<<< $a <=> $b >>>
    C<<<<  $a <=> $b     >>>>

And they all mean exactly the same as this:

    C<$a E<lt>=E<gt> $b>

As a further example, this means that if you wanted to put these bits of
code in C<C> (code) style:

    open(X, ">>thing.dat") || die $!
    $foo->bar();

you could do it like so:

    C<<< open(X, ">>thing.dat") || die $! >>>
    C<< $foo->bar(); >>

which is presumably easier to read than the old way:

    C<open(X, "E<gt>E<gt>thing.dat") || die $!>
    C<$foo-E<gt>bar();>

This is currently supported by pod2text (Pod::Text), pod2man (Pod::Man),
and any other pod2xxx or Pod::Xxxx translators that use
Pod::Parser 1.093 or later, or Pod::Tree 1.02 or later.
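
If Pod is generated by a program, the delimiter width can be chosen
automatically.  The following Perl sketch assumes a hypothetical helper
named C<pod_code>; it is one possible approach, not part of any Pod module.
It uses single brackets when the snippet contains no angle brackets, and
otherwise one more bracket than the longest run of '>' in the snippet, so
that nothing inside the snippet can be read as the closing delimiter.

```perl
use strict;
use warnings;

# Wrap a code snippet in a C<...> formatting code, choosing delimiters
# wide enough that the snippet cannot close the code early.
sub pod_code {
    my ($text) = @_;

    # No angle brackets at all: single brackets are enough.
    return "C<$text>" unless $text =~ /[<>]/;

    # Find the longest run of '>' characters in the snippet.
    my $longest = 0;
    while ($text =~ /(>+)/g) {
        $longest = length($1) if length($1) > $longest;
    }

    # Use one more bracket than that run, and never fewer than two.
    my $n = $longest + 1;
    $n = 2 if $n < 2;

    # Whitespace is required just inside doubled (or wider) delimiters.
    return 'C' . ('<' x $n) . " $text " . ('>' x $n);
}

print pod_code('$a <=> $b'), "\n";
print pod_code('open(X, ">>thing.dat") || die $!'), "\n";
```

For the two snippets shown, this yields the same doubled and tripled forms
used in the examples above.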

=head2 The Intent
X<POD, intent of>

The intent is simplicity of use, not power of expression.  Paragraphs
look like paragraphs (block format), so that they stand out
visually, and so that I could run them through C<fmt> easily to reformat
them (that's F7 in my version of B<vi>, or Esc Q in my version of
B<emacs>).  I wanted the translator to always leave the C<'> and C<`> and
C<"> quotes alone, in verbatim mode, so I could slurp in a
working program, shift it over four spaces, and have it print out, er,
verbatim.  And presumably in a monospace font.

The Pod format is not necessarily sufficient for writing a book.  Pod
is just meant to be an idiot-proof common source for nroff, HTML,
TeX, and other markup languages, as used for online
documentation.  Translators exist for B<pod2text>, B<pod2html>,
B<pod2man> (that's for nroff(1) and troff(1)), B<pod2latex>, and
B<pod2fm>.  Various others are available in CPAN.

=head2 Embedding Pods in Perl Modules
X<POD, embedding>

You can embed Pod documentation in your Perl modules and scripts.
Start your documentation with an empty line, a "=head1" command at the
beginning, and end it with a "=cut" command and an empty line.  Perl
will ignore the Pod text.  See any of the supplied library modules for
examples.  If you're going to put your Pod at the end of the file, and
you're using an __END__ or __DATA__ cut mark, make sure to put an
empty line there before the first Pod command.


  __END__

  =head1 NAME

  Time::Local - efficiently compute time from local and GMT time

Without that empty line before the "=head1", many translators wouldn't
have recognized the "=head1" as starting a Pod block.

=head2 Hints for Writing Pod


=item *
X<podchecker> X<POD, validating>

The B<podchecker> command is provided for checking Pod syntax for errors
and warnings.  For example, it checks for completely blank lines in
Pod blocks and for unknown commands and formatting codes.  You should
still also pass your document through one or more translators and proofread
the result, or print out the result and proofread that.  Some of the
problems found may be bugs in the translators, which you may or may not
wish to work around.

=item *

If you're more familiar with writing in HTML than with writing in Pod, you
can try your hand at writing documentation in simple HTML, and converting
it to Pod with the experimental L<Pod::HTML2Pod|Pod::HTML2Pod> module,
(available in CPAN), and looking at the resulting code.  The experimental
L<Pod::PXML|Pod::PXML> module in CPAN might also be useful.

=item *

Many older Pod translators require the lines before every Pod
command and after every Pod command (including "=cut"!) to be a blank
line.  Having something like this:

 # - - - - - - - - - - - -
 =item $firecracker->boom()

 This noisily detonates the firecracker object.
 sub boom {

...will make such Pod translators completely fail to see the Pod block
at all.

Instead, have it like this:

 # - - - - - - - - - - - -

 =item $firecracker->boom()

 This noisily detonates the firecracker object.


 sub boom {

=item *

Some older Pod translators require paragraphs (including command
paragraphs like "=head2 Functions") to be separated by I<completely>
empty lines.  If you have an apparently empty line with some spaces
on it, this might not count as a separator for those translators, and
that could cause odd formatting.

=item *

Older translators might add wording around an LE<lt>E<gt> link, so that
C<LE<lt>Foo::BarE<gt>> may become "the Foo::Bar manpage", for example.
So you shouldn't write things like C<the LE<lt>fooE<gt>
documentation>, if you want the translated document to read sensibly
-- instead write C<the LE<lt>Foo::Bar|Foo::BarE<gt> documentation> or
C<LE<lt>the Foo::Bar documentation|Foo::BarE<gt>>, to control how the
link comes out.

=item *

Text that goes past the 70th column in a verbatim block might be
ungracefully wrapped by some formatters.
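
Two of the hints above (an empty line before every Pod command, and
separator lines that are I<completely> empty) are easy to check by
program.  The following Perl sketch is an illustration only and is no
substitute for B<podchecker>: C<pod_spacing_problems> is a hypothetical
helper, and as a simplification it treats any line starting with "=" and a
letter as a Pod command and looks only at the line before it.

```perl
use strict;
use warnings;

# Return a list of [line_number, problem] pairs for a chunk of source text.
sub pod_spacing_problems {
    my ($source) = @_;
    my @lines = split /\n/, $source, -1;
    my @problems;
    for my $i (0 .. $#lines) {
        my $line = $lines[$i];
        if ($line =~ /^[ \t]+$/) {
            # Looks empty, but older translators may not treat it as a separator.
            push @problems, [$i + 1, 'whitespace-only line'];
        }
        elsif ($line =~ /^=[A-Za-z]/ && $i > 0 && $lines[$i - 1] ne '') {
            push @problems, [$i + 1, 'command not preceded by an empty line'];
        }
    }
    return @problems;
}

# The "firecracker" layout from the hint above, built line by line.
my $bad = join "\n",
    '# - - - - - - - - - - - -',
    '=item $firecracker->boom()',
    '',
    'This noisily detonates the firecracker object.',
    '',
    '=cut',
    '',
    'sub boom {',
    '';

printf "line %d: %s\n", @$_ for pod_spacing_problems($bad);
```

Run on that sample, the sketch reports only line 2, the C<=item> that
directly follows the comment line.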


=head1 SEE ALSO

L<perlpodspec>, L<perlsyn/"PODs: Embedded Documentation">,
L<perlnewmod>, L<perldoc>, L<pod2html>, L<pod2man>, L<podchecker>.

=head1 AUTHOR

Larry Wall, Sean M. Burke


--- NEW FILE: buildtoc ---
#!/usr/bin/perl -w

use strict;
use vars qw($masterpodfile %Build %Targets $Verbose $Up %Ignore
	    @Master %Readmes %Pods %Aux %Readmepods %Pragmata %Modules
	    %Copies);
use File::Spec;
use File::Find;
use FindBin;
use Text::Tabs;
use Text::Wrap;
use Getopt::Long;

no locale;

$Up = File::Spec->updir;
$masterpodfile = File::Spec->catdir($Up, "pod.lst");

# Generate any/all of these files
# --verbose gives slightly more output
# --build-all tries to build everything
# --build-foo updates foo as follows
# --showfiles shows the files to be changed

%Targets
  = (
     toc => "perltoc.pod",
     manifest => File::Spec->catdir($Up, "MANIFEST"),
     perlpod => "perl.pod",
     vms => File::Spec->catdir($Up, "vms", "descrip_mms.template"),
     nmake => File::Spec->catdir($Up, "win32", "Makefile"),
     dmake => File::Spec->catdir($Up, "win32", "makefile.mk"),
     podmak => File::Spec->catdir($Up, "win32", "pod.mak"),
     # plan9 =>  File::Spec->catdir($Up, "plan9", "mkfile"),
     unix => File::Spec->catdir($Up, "Makefile.SH"),
    );

  my @files = keys %Targets;
  my $filesopts = join(" | ", map { "--build-$_" } "all", sort @files);
  my $showfiles;
  die <<__USAGE__
$0: Usage: $0 [--verbose] [--showfiles] $filesopts
__USAGE__
  unless @ARGV
	&& GetOptions (verbose => \$Verbose,
		       showfiles => \$showfiles,
		       map {+"build-$_", \$Build{$_}} @files, 'all');
  # Set them all to true
  @Build{@files} = @files if ($Build{all});
  if ($showfiles) {
	  join(" ",
	       sort { lc $a cmp lc $b }
	       map {
		   my ($v, $d, $f) = File::Spec->splitpath($_);
		   my @d;
		   @d = defined $d ? File::Spec->splitdir($d) : ();
		   shift @d if @d;
		   File::Spec->catfile(@d ?
				       (@d == 1 && $d[0] eq '' ? () : @d)
				       : "pod", $f);
	       } @Targets{grep { $_ ne 'all' && $Build{$_} } keys %Build}),

# Don't copy these top level READMEs
%Ignore
  = (
     Y2K => 1,
     micro => 1,
#     vms => 1,
    );

if ($Verbose) {
  print "I'm building $_\n" foreach grep {$Build{$_}} keys %Build;
}

chdir $FindBin::Bin or die "$0: Can't chdir $FindBin::Bin: $!";

open MASTER, $masterpodfile or die "$0: Can't open $masterpodfile: $!";

my ($delta_source, $delta_target);

foreach (<MASTER>) {
  next if /^\#/;

  # At least one "h" flag somewhere in the first group
  if (/^(\S+)\s(.*)/ && $1 =~ tr/h//) {
    # it's a heading
    my $flags = $1;
    $flags =~ tr/h//d;
    my %flags = (header => 1);
    $flags{toc_omit} = 1 if $flags =~ tr/o//d;
    $flags{aux} = 1 if $flags =~ tr/a//d;
    die "$0: Unknown flag found in heading line: $_" if length $flags;
    push @Master, [\%flags, $2];

  } elsif (/^(\S*)\s+(\S+)\s+(.*)/) {
    # it's a section
    my ($flags, $filename, $desc) = ($1, $2, $3);

    my %flags = (indent => 0);
    $flags{indent} = $1 if $flags =~ s/(\d+)//;
    $flags{toc_omit} = 1 if $flags =~ tr/o//d; 
    $flags{aux} = 1 if $flags =~ tr/a//d;

    if ($flags =~ tr/D//d) {
      $flags{perlpod_omit} = 1;
      $delta_source = "$filename.pod";
    }
    if ($flags =~ tr/d//d) {
      $flags{manifest_omit} = 1;
      $delta_target = "$filename.pod";
    }

    if ($flags =~ tr/r//d) {
      my $readme = $filename;
      $readme =~ s/^perl//;
      $Readmepods{$filename} = $Readmes{$readme} = $desc;
      $flags{readme} = 1;
    } elsif ($flags{aux}) {
      $Aux{$filename} = $desc;
    } else {
      $Pods{$filename} = $desc;
    }
    die "$0: Unknown flag found in section line: $_" if length $flags;
    push @Master, [\%flags, $filename, $desc];
  } elsif (/^$/) {
    push @Master, undef;
  } else {
    die "$0: Malformed line: $_" if $1 =~ tr/A-Z//;
  }
}
if (defined $delta_source) {
  if (defined $delta_target) {
    # This way round so that keys can act as a MANIFEST skip list
    # Targets will always be in the pod directory. Currently we can only cope
    # with sources being in the same directory. Fix this and do perlvms.pod
    # with this?
    $Copies{$delta_target} = $delta_source;
  } else {
    die "$0: delta source defined but not target";
  }
} elsif (defined $delta_target) {
  die "$0: delta target defined but not source";
}

close MASTER;

# Sanity cross check
  my (%disk_pods, @disk_pods);
  my (@manipods, %manipods);
  my (@manireadmes, %manireadmes);
  my (@perlpods, %perlpods);
  my (%our_pods);
  my (%sources);

  # Convert these to a list of filenames.
  foreach (keys %Pods, keys %Readmepods) {

  # None of these filenames will be boolean false
  @disk_pods = glob("*.pod");
  @disk_pods{@disk_pods} = @disk_pods;

  # Things we copy from won't be in perl.pod
  # Things we copy to won't be in MANIFEST
  @sources{values %Copies} = ();

  open(MANI, "../MANIFEST") || die "$0: opening ../MANIFEST failed: $!";
  while (<MANI>) {
    if (m!^pod/([^.]+\.pod)\s+!i) {
      push @manipods, $1;
    } elsif (m!^README\.(\S+)\s+!i) {
      next if $Ignore{$1};
      push @manireadmes, "perl$1.pod";
  @manipods{@manipods} = @manipods;
  @manireadmes{@manireadmes} = @manireadmes;

  open(PERLPOD, "perl.pod") || die "$0: opening perl.pod failed: $!\n";
  while (<PERLPOD>) {
    if (/^For ease of access, /../^\(If you're intending /) {
      if (/^\s+(perl\S*)\s+\w/) {
	push @perlpods, "$1.pod";
  die "$0: could not find the pod listing of perl.pod\n"
    unless @perlpods;
  @perlpods{@perlpods} = @perlpods;

  foreach my $i (sort keys %disk_pods) {
    warn "$0: $i exists but is unknown by buildtoc\n"
      unless $our_pods{$i};
    warn "$0: $i exists but is unknown by ../MANIFEST\n"
      if !$manipods{$i} && !$manireadmes{$i} && !$Copies{$i};
    warn "$0: $i exists but is unknown by perl.pod\n"
	if !$perlpods{$i} && !exists $sources{$i};
  foreach my $i (sort keys %our_pods) {
    warn "$0: $i is known by buildtoc but does not exist\n"
      unless $disk_pods{$i};
  foreach my $i (sort keys %manipods) {
    warn "$0: $i is known by ../MANIFEST but does not exist\n"
      unless $disk_pods{$i};
  foreach my $i (sort keys %perlpods) {
    warn "$0: $i is known by perl.pod but does not exist\n"
      unless $disk_pods{$i};

# Find all the modules
  my @modpods;
  find \&getpods => qw(../lib ../ext);

  sub getpods {
    if (/\.p(od|m)$/) {
      my $file = $File::Find::name;
      return if $file eq '../lib/Pod/Functions.pm'; # Used only by pod itself
      return if $file =~ m!(?:^|/)t/!;
      return if $file =~ m!lib/Attribute/Handlers/demo/!;
      return if $file =~ m!lib/Net/FTP/.+\.pm!; # Hi, Graham! :-)
      return if $file =~ m!lib/Math/BigInt/t/!;
      return if $file =~ m!/Devel/PPPort/[Hh]arness|lib/Devel/Harness!i;
      return if $file =~ m!XS/(?:APItest|Typemap)!;
      my $pod = $file;
      return if $pod =~ s/pm$/pod/ && -e $pod;
      die "$0: tut $File::Find::name" if $file =~ /TUT/;
      unless (open (F, "< $_\0")) {
	warn "$0: bogus <$file>: $!";
	system "ls", "-l", $file;
      else {
	my $line;
	while ($line = <F>) {
	  if ($line =~ /^=head1\s+NAME\b/) {
	    push @modpods, $file;
	    #warn "GOOD $file\n";
	warn "$0: $file: cannot find =head1 NAME\n";

  die "$0: no pods" unless @modpods;

  my %done;
  for (@modpods) {
    #($name) = /(\w+)\.p(m|od)$/;
    my $name = path2modname($_);
    if ($name =~ /^[a-z]/) {
      $Pragmata{$name} = $_;
    } else {
      if ($done{$name}++) {
	# warn "already did $_\n";
      $Modules{$name} = $_;

# OK. Now a lot of ancillary function definitions follow
# Main program returns at "Do stuff"

sub path2modname {
    local $_ = shift;
    return $_;

sub output ($);

sub output_perltoc {
  open(OUT, ">perltoc.pod") || die "$0: creating perltoc.pod failed: $!";

  local $/ = '';

  ($_= <<"EOPOD2B") =~ s/^\t//gm && output($_);

	# !!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
	# This file is autogenerated by buildtoc from all the other pods.
	# Edit those files and run buildtoc --build-toc to effect changes.

	=head1 NAME

	perltoc - perl documentation table of contents


	This page provides a brief table of contents for the rest of the Perl
	documentation set.  It is meant to be scanned quickly or grepped
	through to locate the proper section you're looking for.


#' make emacs happy

  # All the things in the master list that happen to be pod filenames
  podset(map {"$_->[1].pod"} grep {defined $_ && @$_ == 3 && !$_->[0]{toc_omit}} @Master);

  ($_= <<"EOPOD2B") =~ s/^\t//gm && output($_);



  podset(sort values %Pragmata);

  ($_= <<"EOPOD2B") =~ s/^\t//gm && output($_);



  podset( @Modules{ sort keys %Modules } );

  $_= <<"EOPOD2B";


	Here should be listed all the extra programs' documentation, but they
	don't all have manual pages yet:

	=over 4


  $_ .=  join "\n", map {"\t=item $_\n"} sort keys %Aux;
  $_ .= <<"EOPOD2B" ;


	=head1 AUTHOR

	Larry Wall <F<larry\@wall.org>>, with the help of oodles
	of other folks.


  output $_;
  output "\n";                    # flush $LINE

# Below are all the auxiliary routines for generating perltoc.pod

my ($inhead1, $inhead2, $initem);

sub podset {
    local @ARGV = @_;
    my $pod;

    while(<>) {
	if (s/^=head1 (NAME)\s*/=head2 /) {
	    $pod = path2modname($ARGV);
	    output "\n \n\n=head2 ";
	    $_ = <>;
	    if ( /^\s*$pod\b/ ) {
		s/$pod\.pm/$pod/;       # '.pm' in NAME !?
		output $_;
	    } else {
		s/^/$pod, /;
		output $_;
	if (s/^=head1 (.*)/=item $1/) {
	    output "=over 4\n\n" unless $inhead1;
	    $inhead1 = 1;
	    output $_; nl(); next;
	if (s/^=head2 (.*)/=item $1/) {
	    output "=over 4\n\n" unless $inhead2;
	    $inhead2 = 1;
	    output $_; nl(); next;
	if (s/^=item ([^=].*)/$1/) {
	    next if $pod eq 'perldiag';
	    s/^\s*\*\s*$// && next;
	    s/\n/ /g;
	    next if /^[\d.]+$/;
	    next if $pod eq 'perlmodlib' && /^ftp:/;
	    ##print "=over 4\n\n" unless $initem;
	    output ", " if $initem;
	    $initem = 1;
	    output $_; next;
	if (s/^=cut\s*\n//) {

sub unhead1 {
    if ($inhead1) {
	output "\n\n=back\n\n";
    }
    $inhead1 = 0;
}

sub unhead2 {
    if ($inhead2) {
	output "\n\n=back\n\n";
    }
    $inhead2 = 0;
}

sub unitem {
    if ($initem) {
	output "\n\n";
	##print "\n\n=back\n\n";
    }
    $initem = 0;
}

sub nl {
    output "\n";
}

my $NEWLINE = 0;	# how many newlines have we seen recently
my $LINE;		# what remains to be printed

sub output ($) {
    for (split /(\n)/, shift) {
	if ($_ eq "\n") {
	    if ($LINE) {
		print OUT wrap('', '', $LINE);
		$LINE = '';
	    if (($NEWLINE) < 2) {
		print OUT;
	elsif (/\S/ && length) {
	    $LINE .= $_;
	    $NEWLINE = 0;

# End of original buildtoc. From here on are routines to generate new
# sections for, and edit in place, other files

sub generate_perlpod {
  my @output;
  my $maxlength = 0;
  foreach (@Master) {
    my $flags = $_->[0];
    next if $flags->{aux};
    next if $flags->{perlpod_omit};

    if (@$_ == 2) {
      # Heading
      push @output, "=head2 $_->[1]\n";
    } elsif (@$_ == 3) {
      # Section
      my $start = " " x (4 + $flags->{indent}) . $_->[1];
      $maxlength = length $start if length ($start) > $maxlength;
      push @output, [$start, $_->[2]];
    } elsif (@$_ == 0) {
      # blank line
      push @output, "\n";
    } else {
      die "$0: Illegal length " . scalar @$_;
  # want at least 2 spaces padding
  $maxlength += 2;
  $maxlength = ($maxlength + 3) & ~3;
  # sprintf gives $1.....$2 where ... are spaces:
  return unexpand (map {ref $_ ? sprintf "%-${maxlength}s%s\n", @$_ : $_}

sub generate_manifest {
  # Annoyingly unexpand doesn't consider it good form to replace a single
  # space before a tab with a tab
  # Annoyingly (2) it returns read only values.
  my @temp = unexpand (map {sprintf "%-32s%s\n", @$_} @_);
  map {s/ \t/\t\t/g; $_} @temp;
sub generate_manifest_pod {
  generate_manifest map {["pod/$_.pod", $Pods{$_}]}
    grep {!$Copies{"$_.pod"}} sort keys %Pods;
sub generate_manifest_readme {
  generate_manifest map {["README.$_", $Readmes{$_}]} sort keys %Readmes;

sub generate_roffitall {
  (map ({"\t\$maindir/$_.1\t\\"}sort keys %Pods),
   map ({"\t\$maindir/$_.1\t\\"}sort keys %Aux),
   map ({"\t\$libdir/$_.3\t\\"}sort keys %Pragmata),
   map ({"\t\$libdir/$_.3\t\\"}sort keys %Modules),

sub generate_descrip_mms_1 {
  local $Text::Wrap::columns = 150;
  my $count = 0;
  my @lines = map {"pod" . $count++ . " = $_"}
    split /\n/, wrap('', '', join " ", map "[.lib.pods]$_.pod",
		     sort keys %Pods, keys %Readmepods);
  @lines, "pod = " . join ' ', map {"\$(pod$_)"} 0 .. $count - 1;

sub generate_descrip_mms_2 {
  map {sprintf <<'SNIP', $_, $_ eq 'perlvms' ? 'vms' : 'pod', $_}
[.lib.pods]%s.pod : [.%s]%s.pod
	@ If F$Search("[.lib]pods.dir").eqs."" Then Create/Directory [.lib.pods]
	Copy/NoConfirm/Log $(MMS$SOURCE) [.lib.pods]
   sort keys %Pods, keys %Readmepods;

sub generate_nmake_1 {
  # XXX Fix this with File::Spec
  (map {sprintf "\tcopy ..\\README.%-8s ..\\pod\\perl$_.pod\n", $_}
    sort keys %Readmes),
      (map {"\tcopy ..\\pod\\$Copies{$_} ..\\pod\\$_\n"} sort keys %Copies);

# This doesn't have a trailing newline
sub generate_nmake_2 {
  # Spot the special case
  local $Text::Wrap::columns = 76;
  my $line = wrap ("\t    ", "\t    ",
		   join " ", sort keys %Copies,
				  map {"perl$_.pod"} "vms", keys %Readmes);
  $line =~ s/$/ \\/mg;

sub generate_pod_mak {
  my $variable = shift;
  my @lines;
  my $line = join "\\\n", "\U$variable = ",
    map {"\t$_.$variable\t"} sort keys %Pods;
  # Special case
  $line =~ s/.*perltoc.html.*\n//m;

sub do_manifest {
  my $name = shift;
  my @manifest =
    grep {! m!^pod/[^.]+\.pod.*\n!}
      grep {! m!^README\.(\S+)! || $Ignore{$1}} @_;
  # Dictionary order - fold and handle non-word chars as nothing
  map  { $_->[0] }
  sort { $a->[1] cmp $b->[1] || $a->[0] cmp $b->[0] }
  map  { my $f = lc $_; $f =~ s/[^a-z0-9\s]//g; [ $_, $f ] }

sub do_nmake {
  my $name = shift;
  my $makefile = join '', @_;
  die "$0: $name contains NUL bytes" if $makefile =~ /\0/;
  $makefile =~ s/^\tcopy \.\.\\README.*\n/\0/gm;
  my $sections = () = $makefile =~ m/\0+/g;
  die "$0: $name contains no README copies" if $sections < 1;
  die "$0: $name contains discontiguous README copies" if $sections > 1;
  # Now remove the other copies that follow
  1 while $makefile =~ s/\0\tcopy .*\n/\0/gm;
  $makefile =~ s/\0+/join ("", &generate_nmake_1)/se;

  $makefile =~ s{(del /f [^\n]+checkpods[^\n]+).*?(pod2html)}
    {"$1\n" . &generate_nmake_2."\n\t    $2"}se;

# shut up used only once warning
*do_dmake = *do_dmake = \&do_nmake;

sub do_perlpod {
  my $name = shift;
  my $pod = join '', @_;

  unless ($pod =~ s{(For\ ease\ of\ access,\ .*\n)
		    (?:\s+[a-z]{4,}.*\n	#   fooo
		    |=head.*\n		# =head foo
		    |\s*\n		# blank line
	  {$1 . join "", &generate_perlpod}mxe) {
    die "$0: Failed to insert amendments in do_perlpod";

sub do_podmak {
  my $name = shift;
  my $body = join '', @_;
  foreach my $variable (qw(pod man html tex)) {
    die "$0: could not find $variable in $name"
      unless $body =~ s{\n\U$variable\E = (?:[^\n]*\\\n)*[^\n]*}
	{"\n" . generate_pod_mak ($variable)}se;

sub do_vms {
  my $name = shift;
  my $makefile = join '', @_;
  die "$0: $name contains NUL bytes" if $makefile =~ /\0/;
  $makefile =~ s/\npod\d* =[^\n]*/\0/gs;
  my $sections = () = $makefile =~ m/\0+/g;
  die "$0: $name contains no pod assignments" if $sections < 1;
  die "$0: $name contains $sections discontiguous pod assignments"
    if $sections > 1;
  $makefile =~ s/\0+/join "\n", '', &generate_descrip_mms_1/se;

  die "$0: $name contains NUL bytes" if $makefile =~ /\0/;

# Looking for rules like this
# [.lib.pods]perl.pod : [.pod]perl.pod
#	@ If F$Search("[.lib]pods.dir").eqs."" Then Create/Directory [.lib.pods]
#	Copy/NoConfirm/Log $(MMS$SOURCE) [.lib.pods]

  $makefile =~ s/\n\Q[.lib.pods]\Eperl[^\n\.]*\.pod[^\n]+\n
		 [^\n]+\n	# Another line
		 [^\n]+\Q[.lib.pods]\E\n		# ends [.lib.pods]
  $sections = () = $makefile =~ m/\0+/g;
  die "$0: $name contains no copy rules" if $sections < 1;
  die "$0: $name contains $sections discontiguous copy rules"
    if $sections > 1;
  $makefile =~ s/\0+/join "\n", '', &generate_descrip_mms_2/se;

sub do_unix {
  my $name = shift;
  my $makefile_SH = join '', @_;
  die "$0: $name contains NUL bytes" if $makefile_SH =~ /\0/;

  $makefile_SH =~ s/\n\s+-\@test -f \S+ && cd pod && \$\(LNS\) \S+ \S+ && cd \.\. && echo "\S+" >> extra.pods \# See buildtoc\n/\0/gm;

  my $sections = () = $makefile_SH =~ m/\0+/g;

  die "$0: $name contains no copy rules" if $sections < 1;
  die "$0: $name contains $sections discontiguous copy rules"
    if $sections > 1;

  my @copy_rules = map "\t-\@test -f pod/$Copies{$_} && cd pod && \$(LNS) $Copies{$_} $_ && cd .. && echo \"pod/$_\" >> extra.pods # See buildtoc",
    keys %Copies;

  $makefile_SH =~ s/\0+/join "\n", '', @copy_rules, ''/se;


# Do stuff

my $built;
while (my ($target, $name) = each %Targets) {
  next unless $Build{$target};
  if ($target eq "toc") {
    print "Now processing $name\n" if $Verbose;
    print "Finished\n" if $Verbose;
  print "Now processing $name\n" if $Verbose;
  open THING, $name or die "Can't open $name: $!";
  my @orig = <THING>;
  my $orig = join '', @orig;
  close THING;
  my @new = do {
    no strict 'refs';
    &{"do_$target"}($target, @orig);
  my $new = join '', @new;
  if ($new eq $orig) {
    print "Was not modified\n" if $Verbose;
  rename $name, "$name.old" or die "$0: Can't rename $name to $name.old: $!";
  open THING, ">$name" or die "$0: Can't open $name for writing: $!";
  print THING $new or die "$0: print to $name failed: $!";
  close THING or die "$0: close $name failed: $!";

warn "$0: was not instructed to build anything\n" unless $built;

--- NEW FILE: perltie.pod ---
=head1 NAME

perltie - how to hide an object class in a simple variable



 $object = tied VARIABLE



Prior to release 5.0 of Perl, a programmer could use dbmopen()
to connect an on-disk database in the standard Unix dbm(3x)
format magically to a %HASH in their program.  However, their Perl was either
built with one particular dbm library or another, but not both, and
[...1150 lines suppressed...]
module that does attempt to address this need is DBM::Deep.  Check your
nearest CPAN site as described in L<perlmodlib> for source code.  Note
that despite its name, DBM::Deep does not use dbm.  Another earlier attempt
at solving the problem is MLDBM, which is also available on the CPAN, but
which has some fairly serious limitations.

Tied filehandles are still incomplete.  sysopen(), truncate(),
flock(), fcntl(), stat() and -X can't currently be trapped.

=head1 AUTHOR

Tom Christiansen

TIEHANDLE by Sven Verdoolaege <F<skimo at dns.ufsia.ac.be>> and Doug MacEachern <F<dougm at osf.org>>

UNTIE by Nick Ing-Simmons <F<nick at ing-simmons.net>>

SCALAR by Tassilo von Parseval <F<tassilo.von.parseval at rwth-aachen.de>>

Tying Arrays by Casey West <F<casey at geeknest.com>>

--- NEW FILE: perlintern.pod ---
=head1 NAME

perlintern - autogenerated documentation of purely B<internal>
		 Perl functions

X<internal Perl functions> X<interpreter functions>

This file is the autogenerated documentation of functions in the
Perl interpreter that are documented using Perl's internal documentation
format but are not marked as part of the Perl API. In other words,
B<they are not for use in extensions>!

=head1 CV reference counts and CvOUTSIDE

=over 8


Each CV has a pointer, C<CvOUTSIDE()>, to its lexically enclosing
CV (if any). Because pointers to anonymous sub prototypes are
stored in C<&> pad slots, it is possible to get a circular reference,
with the parent pointing to the child and vice-versa. To avoid the
ensuing memory leak, we do not increment the reference count of the CV
pointed to by C<CvOUTSIDE> in the I<one specific instance> that the parent
has a C<&> pad slot pointing back to us. In this case, we set the
C<CvWEAKOUTSIDE> flag in the child. This allows us to determine under what
circumstances we should decrement the refcount of the parent when freeing
the child.
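
The same idea, a back-pointer that is not counted, can be seen at the Perl
level with weak references.  The following sketch is only an analogy and
does not show what the interpreter does internally: the two hashes are
hypothetical stand-ins for a parent CV and a child anon sub prototype, and
C<weaken> from Scalar::Util plays the role that C<CvWEAKOUTSIDE> plays for
C<CvOUTSIDE>.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Hypothetical stand-ins for a parent CV and a child anon sub prototype.
my $parent = { name => 'parent', children => [] };
my $child  = { name => 'child',  outside  => $parent };

push @{ $parent->{children} }, $child;   # parent -> child: counted
weaken($child->{outside});               # child -> parent: not counted

print "back-pointer is weak\n" if isweak($child->{outside});

# Dropping the last counted reference frees the parent even though the
# child still points at it; Perl then sets the weak reference to undef.
undef $parent;
print "parent freed\n" unless defined $child->{outside};
```

Without the C<weaken> call the two hashes would keep each other alive,
which is the memory leak the paragraph above describes.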

There is a further complication with non-closure anonymous subs (i.e. those
that do not refer to any lexicals outside that sub). In this case, the
anonymous prototype is shared rather than being cloned. This has the
consequence that the parent may be freed while there are still active
children, e.g.

    BEGIN { $a = sub { eval '$x' } }

In this case, the BEGIN is freed immediately after execution since there
are no active references to it: the anon sub prototype has
C<CvWEAKOUTSIDE> set since it's not a closure, and $a points to the same
CV, so it doesn't contribute to BEGIN's refcount either.  When $a is
executed, the C<eval '$x'> causes the chain of C<CvOUTSIDE>s to be followed,
and the freed BEGIN is accessed.

To avoid this, whenever a CV and its associated pad is freed, any
C<&> entries in the pad are explicitly removed from the pad, and if the
refcount of the pointed-to anon sub is still positive, then that
child's C<CvOUTSIDE> is set to point to its grandparent. This will only
occur in the single specific case of a non-closure anon prototype
having one or more active references (such as C<$a> above).

One other thing to consider is that a CV may be merely undefined
rather than freed, e.g. C<undef &foo>. In this case, its refcount may
not have reached zero, but we still delete its pad and its C<CvROOT> etc.
Since various children may still have their C<CvOUTSIDE> pointing at this
undefined CV, we keep its own C<CvOUTSIDE> for the time being, so that
the chain of lexical scopes is unbroken. For example, the following
should print 123:

    my $x = 123;
    sub tmp { sub { eval '$x' } }
    my $a = tmp();
    undef &tmp;
    print  $a->();


=for hackers
Found in file cv.h


=head1 Functions in file pad.h

=over 8


Save the current pad in the given context block structure.

	void	CX_CURPAD_SAVE(struct context)

=for hackers
Found in file pad.h


Access the SV at offset po in the saved current pad in the given
context block structure (can be used as an lvalue).

	SV *	CX_CURPAD_SV(struct context, PADOFFSET po)

=for hackers
Found in file pad.h


Get the value from slot C<po> in the base (DEPTH=1) pad of a padlist


=for hackers
Found in file pad.h


Clone the state variables associated with running and compiling pads.

	void	PAD_CLONE_VARS(PerlInterpreter *proto_perl \)

=for hackers
Found in file pad.h


Return the flags for the current compiling pad name
at offset C<po>. Assumes a valid slot entry.


=for hackers
Found in file pad.h


The generation number of the name at offset C<po> in the current
compiling pad (lvalue). Note that C<SvCUR> is hijacked for this purpose.


=for hackers
Found in file pad.h


Sets the generation number of the name at offset C<po> in the current
compiling pad (lvalue) to C<gen>.  Note that C<SvCUR_set> is hijacked for
this purpose.


=for hackers
Found in file pad.h


Return the stash associated with an C<our> variable.
Assumes the slot entry is a valid C<our> lexical.


=for hackers
Found in file pad.h


Return the name of the current compiling pad name
at offset C<po>. Assumes a valid slot entry.


=for hackers
Found in file pad.h


Return the type (stash) of the current compiling pad name at offset
C<po>. Must be a valid name. Returns null if not typed.


=for hackers
Found in file pad.h

=item PAD_DUP

Clone a padlist.

	void	PAD_DUP(PADLIST dstpad, PADLIST srcpad, CLONE_PARAMS* param)

=for hackers
Found in file pad.h


Restore the old pad saved into the local variable opad by PAD_SAVE_LOCAL()


=for hackers
Found in file pad.h


Save the current pad to the local variable opad, then make the
current pad equal to npad

	void	PAD_SAVE_LOCAL(PAD *opad, PAD *npad)

=for hackers
Found in file pad.h


Save the current pad then set it to null.


=for hackers
Found in file pad.h


Set the slot at offset C<po> in the current pad to C<sv>


=for hackers
Found in file pad.h


Set the current pad to be pad C<n> in the padlist, saving
the previous current pad. NB currently this macro expands to a string too
long for some compilers, so it's best to replace it with


	void	PAD_SET_CUR(PADLIST padlist, I32 n)

=for hackers
Found in file pad.h


=item PAD_SET_CUR_NOSAVE

Like PAD_SET_CUR, but without the save.

	void	PAD_SET_CUR_NOSAVE(PADLIST padlist, I32 n)

=for hackers
Found in file pad.h

=item PAD_SV

Get the value at offset C<po> in the current pad


=for hackers
Found in file pad.h

=item PAD_SVl

Lightweight and lvalue version of C<PAD_SV>.
Get or set the value at offset C<po> in the current pad.
Unlike C<PAD_SV>, does not print diagnostics with -DX.
For internal use only.


=for hackers
Found in file pad.h


=item SAVECLEARSV

Clear the pointed-to pad value on scope exit (i.e. the runtime action of 'my').

	void	SAVECLEARSV(SV **svp)

=for hackers
Found in file pad.h


save PL_comppad and PL_curpad


=for hackers
Found in file pad.h


Save a pad slot (used to restore after an iteration)

XXX DAPM it would make more sense to make the arg a PADOFFSET

=for hackers
Found in file pad.h


=head1 Functions in file pp_ctl.c

=over 8

=item find_runcv

Locate the CV corresponding to the currently executing sub or eval.
If db_seqp is non-null, skip CVs that are in the DB package and populate
*db_seqp with the cop sequence number at the point that the DB:: code was
entered. (allows debuggers to eval in the scope of the breakpoint rather
than in the scope of the debugger itself).

	CV*	find_runcv(U32 *db_seqp)

=for hackers
Found in file pp_ctl.c


=head1 Global Variables

=over 8

=item PL_DBsingle

When Perl is run in debugging mode, with the B<-d> switch, this SV is a
boolean which indicates whether subs are being single-stepped.
Single-stepping is automatically turned on after every step.  This is the C
variable which corresponds to Perl's $DB::single variable.  See
C<PL_DBsub>.

	SV *	PL_DBsingle

=for hackers
Found in file intrpvar.h

=item PL_DBsub

When Perl is run in debugging mode, with the B<-d> switch, this GV contains
the SV which holds the name of the sub being debugged.  This is the C
variable which corresponds to Perl's $DB::sub variable.  See
C<PL_DBsingle>.

	GV *	PL_DBsub

=for hackers
Found in file intrpvar.h

=item PL_DBtrace

Trace variable used when Perl is run in debugging mode, with the B<-d>
switch.  This is the C variable which corresponds to Perl's $DB::trace
variable.  See C<PL_DBsingle>.

	SV *	PL_DBtrace

=for hackers
Found in file intrpvar.h

=item PL_dowarn

The C variable which corresponds to Perl's $^W warning variable.

	bool	PL_dowarn

=for hackers
Found in file intrpvar.h

=item PL_last_in_gv

The GV which was last used for a filehandle input operation. (C<< <FH> >>)

	GV*	PL_last_in_gv

=for hackers
Found in file thrdvar.h

=item PL_ofs_sv

The output field separator - C<$,> in Perl space.

	SV*	PL_ofs_sv

=for hackers
Found in file thrdvar.h

=item PL_rs

The input record separator - C<$/> in Perl space.

	SV*	PL_rs

=for hackers
Found in file thrdvar.h


=head1 GV Functions

=over 8

=item is_gv_magical

Returns C<TRUE> if given the name of a magical GV.

Currently only useful internally when determining if a GV should be
created even in rvalue contexts.

C<flags> is not used at present but available for future extension to
allow selecting particular classes of magical variable.

Currently assumes that C<name> is NUL terminated (as well as len being valid).
This assumption is met by all callers within the perl core, which all pass
pointers returned by SvPV.

	bool	is_gv_magical(char *name, STRLEN len, U32 flags)

=for hackers
Found in file gv.c


=head1 IO Functions

=over 8

=item start_glob

Function called by C<do_readline> to spawn a glob (or do the glob inside
perl on VMS). This code used to be inline, but now that perl uses
C<File::Glob>, this glob starter is only used by miniperl during the build
process. Moving it away shrinks pp_hot.c; shrinking pp_hot.c helps speed
perl up.

	PerlIO*	start_glob(SV* pattern, IO *io)

=for hackers
Found in file doio.c


=head1 Pad Data Structures

=over 8


CV's can have CvPADLIST(cv) set to point to an AV.

For these purposes "forms" are a kind-of CV, eval""s are too (except they're
not callable at will and are always thrown away after the eval"" is done).

XSUBs don't have CvPADLIST set - dXSTARG fetches values from PL_curpad,
but that is really the caller's pad (a slot of which is allocated by
every entersub).

The CvPADLIST AV does not have AvREAL set, so REFCNT of component items
is managed manually (mostly in pad.c) rather than by the normal av.c rules.
The items in the AV are not SVs as for a normal AV, but other AVs:

0'th Entry of the CvPADLIST is an AV which represents the "names" or rather
the "static type information" for lexicals.

The CvDEPTH'th entry of CvPADLIST AV is an AV which is the stack frame at that
depth of recursion into the CV.
The 0'th slot of a frame AV is an AV which is @_.
other entries are storage for variables and op targets.

During compilation:
C<PL_comppad_name> is set to the names AV.
C<PL_comppad> is set to the frame AV for the frame CvDEPTH == 1.
C<PL_curpad> is set to the body of the frame AV (i.e. AvARRAY(PL_comppad)).

During execution, C<PL_comppad> and C<PL_curpad> refer to the live
frame of the currently executing sub.

Iterating over the names AV iterates over all possible pad
items. Pad slots that are SVs_PADTMP (targets/GVs/constants) end up having
&PL_sv_undef "names" (see pad_alloc()).

Only my/our variable (SVs_PADMY/SVs_PADOUR) slots get valid names.
The rest are op targets/GVs/constants which are statically allocated
or resolved at compile time.  These don't have names by which they
can be looked up from Perl code at run time through eval"" like
my/our variables can be.  Since they can't be looked up by "name"
but only by their index allocated at compile time (which is usually
in PL_op->op_targ), wasting a name SV for them doesn't make sense.

The SVs in the names AV have their PV being the name of the variable.
NV+1..IV inclusive is a range of cop_seq numbers for which the name is
valid.  For typed lexicals name SV is SVt_PVMG and SvSTASH points at the
type.  For C<our> lexicals, the type is SVt_PVGV, and GvSTASH points at the
stash of the associated global (so that duplicate C<our> declarations in the
same package can be detected).  SvCUR is sometimes hijacked to
store the generation number during compilation.

If SvFAKE is set on the name SV, then the slot in the frame AVs is
a REFCNT'ed reference to a lexical from "outside". In this case,
the name SV does not have a cop_seq range, since it is in scope
throughout.
If the 'name' is '&' the corresponding entry in frame AV
is a CV representing a possible closure.
(SvFAKE and name of '&' is not a meaningful combination currently but could
become so if C<my sub foo {}> is implemented.)

The flag SVf_PADSTALE is cleared on lexicals each time the my() is executed,
and set on scope exit. This allows the 'Variable $x is not available' warning
to be generated in evals, such as 

    { my $x = 1; sub f { eval '$x'} } f();


=for hackers
Found in file pad.c

=item cv_clone

Clone a CV: make a new CV which points to the same code etc, but which
has a newly-created pad built by copying the prototype pad and capturing
any outer lexicals.

	CV*	cv_clone(CV* proto)

=for hackers
Found in file pad.c

=item cv_dump

dump the contents of a CV

	void	cv_dump(const CV *cv, const char *title)

=for hackers
Found in file pad.c

=item do_dump_pad

Dump the contents of a padlist

	void	do_dump_pad(I32 level, PerlIO *file, PADLIST *padlist, int full)

=for hackers
Found in file pad.c

=item intro_my

"Introduce" my variables to visible status.

	U32	intro_my()

=for hackers
Found in file pad.c

=item pad_add_anon

Add an anon code entry to the current compiling pad

	PADOFFSET	pad_add_anon(SV* sv, OPCODE op_type)

=for hackers
Found in file pad.c

=item pad_add_name

Create a new name in the current pad at the specified offset.
If C<typestash> is valid, the name is for a typed lexical; set the
name's stash to that value.
If C<ourstash> is valid, it's an our lexical, set the name's
GvSTASH to that value

Also, if the name is @.. or %.., create a new array or hash for that slot

If C<clone> (the "fake" flag) is true, it means we're cloning an existing entry

	PADOFFSET	pad_add_name(char *name, HV* typestash, HV* ourstash, bool clone)

=for hackers
Found in file pad.c

=item pad_alloc

Allocate a new my or tmp pad entry. For a my, simply push a null SV onto
the end of PL_comppad, but for a tmp, scan the pad from PL_padix upwards
for a slot which has no name and no active value.

	PADOFFSET	pad_alloc(I32 optype, U32 tmptype)

=for hackers
Found in file pad.c

=item pad_block_start

Update the pad compilation state variables on entry to a new block

	void	pad_block_start(int full)

=for hackers
Found in file pad.c

=item pad_check_dup

Check for duplicate declarations: report any of:
     * a my in the current scope with the same name;
     * an our (anywhere in the pad) with the same name and the same stash
       as C<ourstash>
C<is_our> indicates that the name to check is an 'our' declaration

	void	pad_check_dup(char* name, bool is_our, HV* ourstash)

=for hackers
Found in file pad.c

=item pad_findlex

Find a named lexical anywhere in a chain of nested pads. Add fake entries
in the inner pads if it's found in an outer one. innercv is the CV *inside*
the chain of outer CVs to be searched. If newoff is non-null, this is a
run-time cloning: don't add fake entries, just find the lexical and add a
ref to it at newoff in the current pad.

	PADOFFSET	pad_findlex(const char* name, PADOFFSET newoff, const CV* innercv)

=for hackers
Found in file pad.c

=item pad_findmy

Given a lexical name, try to find its offset, first in the current pad,
or failing that, in the pads of any lexically enclosing subs (including
the complications introduced by eval). If the name is found in an outer pad,
then a fake entry is added to the current pad.
Returns the offset in the current pad, or NOT_IN_PAD on failure.

	PADOFFSET	pad_findmy(char* name)

=for hackers
Found in file pad.c

=item pad_fixup_inner_anons

For any anon CVs in the pad, change CvOUTSIDE of that CV from
old_cv to new_cv if necessary. Needed when a newly-compiled CV has to be
moved to a pre-existing CV struct.

	void	pad_fixup_inner_anons(PADLIST *padlist, CV *old_cv, CV *new_cv)

=for hackers
Found in file pad.c

=item pad_free

Free the SV at offset po in the current pad.

	void	pad_free(PADOFFSET po)

=for hackers
Found in file pad.c

=item pad_leavemy

Cleanup at end of scope during compilation: set the max seq number for
lexicals in this scope and warn of any lexicals that never got introduced.

	void	pad_leavemy()

=for hackers
Found in file pad.c

=item pad_new

Create a new compiling padlist, saving and updating the various global
vars at the same time as creating the pad itself. The following flags
can be OR'ed together:

    padnew_CLONE	this pad is for a cloned CV
    padnew_SAVE		save old globals
    padnew_SAVESUB	also save extra stuff for start of sub

	PADLIST*	pad_new(int flags)

=for hackers
Found in file pad.c

=item pad_push

Push a new pad frame onto the padlist, unless there's already a pad at
this depth, in which case don't bother creating a new one.
If has_args is true, give the new pad an @_ in slot zero.

	void	pad_push(PADLIST *padlist, int depth, int has_args)

=for hackers
Found in file pad.c

=item pad_reset

Mark all the current temporaries for reuse

	void	pad_reset()

=for hackers
Found in file pad.c

=item pad_setsv

Set the entry at offset po in the current pad to sv.
Use the macro PAD_SETSV() rather than calling this function directly.

	void	pad_setsv(PADOFFSET po, SV* sv)

=for hackers
Found in file pad.c

=item pad_swipe

Abandon the tmp in the current pad at offset po and replace with a
new one.

	void	pad_swipe(PADOFFSET po, bool refadjust)

=for hackers
Found in file pad.c

=item pad_tidy

Tidy up a pad after we've finished compiling it:
    * remove most stuff from the pads of anonsub prototypes;
    * give it a @_;
    * mark tmps as such.

	void	pad_tidy(padtidy_type type)

=for hackers
Found in file pad.c

=item pad_undef

Free the padlist associated with a CV.
If parts of it happen to be current, we null the relevant
PL_*pad* global vars so that we don't have any dangling references left.
We also repoint the CvOUTSIDE of any about-to-be-orphaned
inner subs to the outer of this cv.

(This function should really be called pad_free, but the name was already
taken)

	void	pad_undef(CV* cv)

=for hackers
Found in file pad.c


=head1 Stack Manipulation Macros

=over 8

=item djSP

Declare Just C<SP>. This is actually identical to C<dSP>, and declares
a local copy of perl's stack pointer, available via the C<SP> macro.
See C<SP>.  (Available for backward source code compatibility with the
old (Perl 5.005) thread model.)


=for hackers
Found in file pp.h

=item LVRET

True if this op will be the return value of an lvalue subroutine

=for hackers
Found in file pp.h


=head1 SV Manipulation Functions

=over 8

=item report_uninit

Print appropriate "Use of uninitialized variable" warning

	void	report_uninit()

=for hackers
Found in file sv.c

=item sv_add_arena

Given a chunk of memory, link it to the head of the list of arenas,
and split it into a list of free SVs.

	void	sv_add_arena(char* ptr, U32 size, U32 flags)

=for hackers
Found in file sv.c

=item sv_clean_all

Decrement the refcnt of each remaining SV, possibly triggering a
cleanup. This function may have to be called multiple times to free
SVs which are in complex self-referential hierarchies.

	I32	sv_clean_all()

=for hackers
Found in file sv.c

=item sv_clean_objs

Attempt to destroy all objects not yet freed

	void	sv_clean_objs()

=for hackers
Found in file sv.c

=item sv_free_arenas

Deallocate the memory used by all arenas. Note that all the individual SV
heads and bodies within the arenas must already have been freed.

	void	sv_free_arenas()

=for hackers
Found in file sv.c


=head1 AUTHORS

The autodocumentation system was originally added to the Perl core by
Benjamin Stuhl. Documentation is by whoever was kind enough to
document their functions.

=head1 SEE ALSO

perlguts(1), perlapi(1)

--- NEW FILE: perl58delta.pod ---
=head1 NAME

perl58delta - what is new for perl v5.8.0


This document describes differences between the 5.6.0 release and
the 5.8.0 release.

Many of the bug fixes in 5.8.0 were already seen in the 5.6.1
maintenance release since the two releases were kept closely
coordinated (while 5.8.0 was still called 5.7.something).

Changes that were integrated into the 5.6.1 release are marked C<[561]>.
Many of these changes have been further developed since 5.6.1 was released,
those are marked C<[561+]>.

You can see the list of changes in the 5.6.1 release (both from the
5.005_03 release and the 5.6.0 release) by reading L<perl561delta>.
[...3707 lines suppressed...]
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Jarkko Hietaniemi <F<jhi at iki.fi>>.


--- NEW FILE: perlxs.pod ---
=head1 NAME

perlxs - XS language reference manual


=head2 Introduction

XS is an interface description file format used to create an extension
interface between Perl and C code (or a C library) which one wishes
to use with Perl.  The XS interface is combined with the library to
create a new library which can then be either dynamically loaded
or statically linked into perl.  The XS interface description is
written in the XS language and is the core component of the Perl
extension interface.

An B<XSUB> forms the basic unit of the XS interface.  After compilation
by the B<xsubpp> compiler, each XSUB amounts to a C function definition
which will provide the glue between Perl calling conventions and C
[...2020 lines suppressed...]
     $netconf = getnetconfigent();
     $a = rpcb_gettime();
     print "time = $a\n";
     print "netconf = $netconf\n";

     $netconf = getnetconfigent("tcp");
     $a = rpcb_gettime("poplar");
     print "time = $a\n";
     print "netconf = $netconf\n";


This document covers features supported by C<xsubpp> 1.935.

=head1 AUTHOR

Originally written by Dean Roehrich <F<roehrich at cray.com>>.

Maintained since 1996 by The Perl Porters <F<perlbug at perl.org>>.

--- NEW FILE: perlthrtut.pod ---
=head1 NAME

perlthrtut - tutorial on threads in Perl


B<NOTE>: this tutorial describes the new Perl threading flavour
introduced in Perl 5.6.0 called interpreter threads, or B<ithreads>
for short.  In this model each thread runs in its own Perl interpreter,
and any data sharing between threads must be explicit.

There is another older Perl threading flavour called the 5.005 model,
unsurprisingly for 5.005 versions of Perl.  The old model is known to
have problems, is deprecated, and will probably be removed around release
5.10. You are strongly encouraged to migrate any existing 5.005
threads code to the new model as soon as possible.

You can see which (or neither) threading flavour you have by
running C<perl -V> and looking at the C<Platform> section.
[...1071 lines suppressed...]
=head1 AUTHOR

Dan Sugalski E<lt>dan at sidhe.orgE<gt>

Slightly modified by Arthur Bergman to fit the new thread model/module.

Reworked slightly by Jörg Walter E<lt>jwalt at cpan.orgE<gt> to be more concise
about thread-safety of perl code.

Rearranged slightly by Elizabeth Mattijsen E<lt>liz at dijkmat.nlE<gt> to put
less emphasis on yield().

=head1 Copyrights

The original version of this article appeared in The Perl
Journal #10, and is copyright 1998 The Perl Journal. It appears courtesy
of Jon Orwant and The Perl Journal.  This document may be distributed
under the same terms as Perl itself.

For more information please see L<threads> and L<threads::shared>.

--- NEW FILE: perlpodspec.pod ---

=head1 NAME

perlpodspec - Plain Old Documentation: format specification and notes


This document is detailed notes on the Pod markup language.  Most
people will only have to read L<perlpod|perlpod> to know how to write
in Pod, but this document may answer some incidental questions to do
with parsing and rendering Pod.

In this document, "must" / "must not", "should" /
"should not", and "may" have their conventional (cf. RFC 2119)
meanings: "X must do Y" means that if X doesn't do Y, it's against
this specification, and should really be fixed.  "X should do Y"
means that it's recommended, but X may fail to do Y, if there's a
good reason.  "X may do Y" is merely a note that X can do Y at
will (although it is up to the reader to detect any connotation of
[...1860 lines suppressed...]

  =begin thing


This is invalid because every "=end" command must have a formatname

=head1 SEE ALSO

L<perlpod>, L<perlsyn/"PODs: Embedded Documentation">,

=head1 AUTHOR

Sean M. Burke


--- NEW FILE: perltoot.pod ---
=head1 NAME

perltoot - Tom's object-oriented tutorial for perl


Object-oriented programming is a big seller these days.  Some managers
would rather have objects than sliced bread.  Why is that?  What's so
special about an object?  Just what I<is> an object anyway?

An object is nothing but a way of tucking away complex behaviours into
a neat little easy-to-use bundle.  (This is what professors call
abstraction.) Smart people who have nothing to do but sit around for
weeks on end figuring out really hard problems make these nifty
objects that even regular people can use. (This is what professors call
software reuse.)  Users (well, programmers) can play with this little
bundle all they want, but they aren't to open it up and mess with the
insides.  Just like an expensive piece of hardware, the contract says
that you void the warranty if you muck with the cover.  So don't do that.
[...1736 lines suppressed...]
Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit.  A simple comment in the code giving
credit would be courteous but is not required.


=head2 Acknowledgments

Thanks to
Larry Wall,
Roderick Schertler,
Gurusamy Sarathy,
Dean Roehrich,
Raphael Manfredi,
Brent Halsey,
Greg Bacon,
Brad Appleton,
and many others for their helpful comments.

--- NEW FILE: perldata.pod ---
=head1 NAME

perldata - Perl data types


=head2 Variable names
X<variable, name> X<variable name> X<data type> X<type>

Perl has three built-in data types: scalars, arrays of scalars, and
associative arrays of scalars, known as "hashes".  A scalar is a 
single string (of any size, limited only by the available memory),
number, or a reference to something (which will be discussed
in L<perlref>).  Normal arrays are ordered lists of scalars indexed
by number, starting with 0.  Hashes are unordered collections of scalar 
values indexed by their associated string key.

Values are usually referred to by name, or through a named reference.
The first character of the name tells you to what sort of data
structure it refers.  The rest of the name tells you the particular
value to which it refers.  Usually this name is a single I<identifier>,
that is, a string beginning with a letter or underscore, and
containing letters, underscores, and digits.  In some cases, it may
be a chain of identifiers, separated by C<::> (or by the slightly
archaic C<'>); all but the last are interpreted as names of packages,
to locate the namespace in which to look up the final identifier
(see L<perlmod/Packages> for details).  It's possible to substitute
for a simple identifier, an expression that produces a reference
to the value at runtime.   This is described in more detail below
and in L<perlref>.

Perl also has its own built-in variables whose names don't follow
these rules.  They have strange names so they don't accidentally
collide with one of your normal variables.  Strings that match
parenthesized parts of a regular expression are saved under names
containing only digits after the C<$> (see L<perlop> and L<perlre>).
In addition, several special variables that provide windows into
the inner working of Perl have names containing punctuation characters
and control characters.  These are documented in L<perlvar>.
X<variable, built-in>

Scalar values are always named with '$', even when referring to a
scalar that is part of an array or a hash.  The '$' symbol works
semantically like the English word "the" in that it indicates a
single value is expected.

    $days		# the simple scalar value "days"
    $days[28]		# the 29th element of array @days
    $days{'Feb'}	# the 'Feb' value from hash %days
    $#days		# the last index of array @days

Entire arrays (and slices of arrays and hashes) are denoted by '@',
which works much like the word "these" or "those" does in English,
in that it indicates multiple values are expected.

    @days		# ($days[0], $days[1],... $days[n])
    @days[3,4,5]	# same as ($days[3],$days[4],$days[5])
    @days{'a','c'}	# same as ($days{'a'},$days{'c'})

Entire hashes are denoted by '%':

    %days		# (key1, val1, key2, val2 ...)
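The sigil rules above can be tried out with a short sketch.  The data
below is hypothetical and chosen only for illustration; note that
C<@days> and C<%days> coexist because each variable type has its own
namespace.

```perl
use strict;
use warnings;

# Hypothetical sample data; any names would do.
my @days = qw(Mon Tue Wed Thu Fri Sat Sun);
my %days = (Feb => 28, Mar => 31, Apr => 30);

my $third    = $days[2];             # '$': one element of @days
my $feb      = $days{'Feb'};         # '$': one value of %days
my $last_idx = $#days;               # last index of @days
my @weekend  = @days[5, 6];          # '@': array slice
my @lengths  = @days{'Feb', 'Mar'};  # '@': hash slice

print "$third $feb $last_idx @weekend @lengths\n";
```

The sigil reflects what the expression yields (one value or several),
not the kind of container being accessed.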

In addition, subroutines are named with an initial '&', though this
is optional when unambiguous, just as the word "do" is often redundant
in English.  Symbol table entries can be named with an initial '*',
but you don't really care about that yet (if ever :-).

Every variable type has its own namespace, as do several
non-variable identifiers.  This means that you can, without fear
of conflict, use the same name for a scalar variable, an array, or
a hash--or, for that matter, for a filehandle, a directory handle, a
subroutine name, a format name, or a label.  This means that $foo
and @foo are two different variables.  It also means that C<$foo[1]>
is a part of @foo, not a part of $foo.  This may seem a bit weird,
but that's okay, because it is weird.
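As a minimal sketch of the separate namespaces (the variable name C<foo>
is used only because the text above uses it), the following declares a
scalar, an array, and a hash with the same name and shows that the
subscripted forms belong to the array and the hash, not to the scalar:

```perl
use strict;
use warnings;

my $foo = "scalar";
my @foo = ("zero", "one");
my %foo = (key => "hash value");

# $foo[1] is an element of @foo and $foo{key} is an element of %foo.
# Neither expression touches the scalar $foo.
my $from_array = $foo[1];
my $from_hash  = $foo{key};

print "$foo / $from_array / $from_hash\n";
```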

Because variable references always start with '$', '@', or '%', the
"reserved" words aren't in fact reserved with respect to variable
names.  They I<are> reserved with respect to labels and filehandles,
however, which don't have an initial special character.  You can't
have a filehandle named "log", for instance.  Hint: you could say
C<open(LOG,'logfile')> rather than C<open(log,'logfile')>.  Using
uppercase filehandles also improves readability and protects you
from conflict with future reserved words.  Case I<is> significant--"FOO",
"Foo", and "foo" are all different names.  Names that start with a
letter or underscore may also contain digits and underscores.
X<identifier, case sensitivity>

It is possible to replace such an alphanumeric name with an expression
that returns a reference to the appropriate type.  For a description
of this, see L<perlref>.

Names that start with a digit may contain only more digits.  Names
that do not start with a letter, underscore, digit or a caret (i.e.
a control character) are limited to one character, e.g.,  C<$%> or
C<$$>.  (Most of these one character names have a predefined
significance to Perl.  For instance, C<$$> is the current process
id.)

=head2 Context
X<context> X<scalar context> X<list context>

The interpretation of operations and values in Perl sometimes depends
on the requirements of the context around the operation or value.
There are two major contexts: list and scalar.  Certain operations
return list values in contexts wanting a list, and scalar values
otherwise.  If this is true of an operation it will be mentioned in
the documentation for that operation.  In other words, Perl overloads
certain operations based on whether the expected return value is
singular or plural.  Some words in English work this way, like "fish"
and "sheep".

In a reciprocal fashion, an operation provides either a scalar or a
list context to each of its arguments.  For example, if you say

    int( <STDIN> )

the integer operation provides scalar context for the <>
operator, which responds by reading one line from STDIN and passing it
back to the integer operation, which will then find the integer value
of that line and return that.  If, on the other hand, you say

    sort( <STDIN> )

then the sort operation provides list context for <>, which
will proceed to read every line available up to the end of file, and
pass that list of lines back to the sort routine, which will then
sort those lines and return them as a list to whatever the context
of the sort was.
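The difference between the two calls can be observed with a small
sketch.  It assumes an in-memory filehandle as a stand-in for STDIN so
that it is self-contained; the input lines are made up.

```perl
use strict;
use warnings;

# Assumption: an in-memory filehandle stands in for STDIN.
my $input = "42\n7\n19\n";
open(my $fh, '<', \$input) or die "cannot open in-memory file: $!";

my $number = int(<$fh>);    # scalar context: reads only the first line
my @sorted = sort(<$fh>);   # list context: reads all remaining lines

close($fh);
print "number=$number, remaining lines: ", scalar(@sorted), "\n";
```

Because C<int> consumed just one line, C<sort> sees only the two lines
that were left, and it compares them as strings.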

Assignment is a little bit special in that it uses its left argument
to determine the context for the right argument.  Assignment to a
scalar evaluates the right-hand side in scalar context, while
assignment to an array or hash evaluates the righthand side in list
context.  Assignment to a list (or slice, which is just a list
anyway) also evaluates the righthand side in list context.
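A short sketch of how the left-hand side of an assignment selects the
context of the right-hand side (the array contents are hypothetical):

```perl
use strict;
use warnings;

my @items = qw(apple banana cherry);

my $count       = @items;   # scalar on the left: the array's length
my ($first)     = @items;   # list on the left: the first element
my ($one, $two) = @items;   # list on the left: extra elements are dropped
my @copy        = @items;   # array on the left: all elements

print "$count $first $one $two @copy\n";
```

The parentheses around C<$first> are what turn the scalar assignment
into a list assignment.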

When you use the C<use warnings> pragma or Perl's B<-w> command-line 
option, you may see warnings
about useless uses of constants or functions in "void context".
Void context just means the value has been discarded, such as a
statement containing only C<"fred";> or C<getpwuid(0);>.  It still
counts as scalar context for functions that care whether or not
they're being called in list context.

User-defined subroutines may choose to care whether they are being
called in a void, scalar, or list context.  Most subroutines do not
need to bother, though.  That's because both scalars and lists are
automatically interpolated into lists.  See L<perlfunc/wantarray>
for how you would dynamically discern your function's calling

=head2 Scalar values
X<scalar> X<number> X<string> X<reference>

All data in Perl is a scalar, an array of scalars, or a hash of
scalars.  A scalar may contain one single value in any of three
different flavors: a number, a string, or a reference.  In general,
conversion from one form to another is transparent.  Although a
scalar may not directly hold multiple values, it may contain a
reference to an array or hash which in turn contains multiple values.

Scalars aren't necessarily one thing or another.  There's no place
to declare a scalar variable to be of type "string", type "number",
type "reference", or anything else.  Because of the automatic
conversion of scalars, operations that return scalars don't need
to care (and in fact, cannot care) whether their caller is looking
for a string, a number, or a reference.  Perl is a contextually
polymorphic language whose scalars can be strings, numbers, or
references (which includes objects).  Although strings and numbers
are considered pretty much the same thing for nearly all purposes,
references are strongly-typed, uncastable pointers with builtin
reference-counting and destructor invocation.

A scalar value is interpreted as TRUE in the Boolean sense if it is not
the null string or the number 0 (or its string equivalent, "0").  The
Boolean context is just a special kind of scalar context where no 
conversion to a string or a number is ever performed.
X<boolean> X<bool> X<true> X<false> X<truth>
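The truth rule can be checked with a small sketch.  The sample values
are chosen for illustration; note that strings such as C<"00"> or
C<"0.0"> are true, because only the exact string C<"0"> and the null
string count as false.

```perl
use strict;
use warnings;

my @false_values = (0, '0', '', undef);
my @true_values  = (1, -1, '00', '0.0', '0E0', ' ', 'false');

# grep in scalar context returns the number of matching elements.
my $false_seen_as_true = grep { $_ }  @false_values;
my $true_seen_as_false = grep { !$_ } @true_values;

print "$false_seen_as_true $true_seen_as_false\n";
```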

There are actually two varieties of null strings (sometimes referred
to as "empty" strings), a defined one and an undefined one.  The
defined version is just a string of length zero, such as C<"">.
The undefined version is the value that indicates that there is
no real value for something, such as when there was an error, or
at end of file, or when you refer to an uninitialized variable or
element of an array or hash.  Although in early versions of Perl,
an undefined scalar could become defined when first used in a
place expecting a defined value, this no longer happens except for
rare cases of autovivification as explained in L<perlref>.  You can
use the defined() operator to determine whether a scalar value is
defined (this has no meaning on arrays or hashes), and the undef()
operator to produce an undefined value.
X<defined> X<undefined> X<undef> X<null> X<string, null>
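A minimal sketch distinguishing the two kinds of null string, using
hypothetical variable names:

```perl
use strict;
use warnings;

my $empty = "";                   # defined, but of length zero
my $unset;                        # undefined
my %config = (debug => undef);    # key exists, value is undefined

my $empty_defined = defined($empty)         ? 1 : 0;
my $unset_defined = defined($unset)         ? 1 : 0;
my $key_exists    = exists $config{debug}   ? 1 : 0;
my $value_defined = defined $config{debug}  ? 1 : 0;

$empty = undef;                   # make a defined value undefined again
my $now_defined = defined($empty) ? 1 : 0;

print "$empty_defined $unset_defined $key_exists $value_defined $now_defined\n";
```

Both C<$empty> and C<$unset> are false in Boolean context; only
C<defined> tells them apart.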

To find out whether a given string is a valid non-zero number, it's
sometimes enough to test it against both numeric 0 and also lexical
"0" (although this will cause noises if warnings are on).  That's 
because strings that aren't numbers count as 0, just as they do in B<awk>:

    if ($str == 0 && $str ne "0")  {
	warn "That doesn't look like a number";
    }

That method may be best because otherwise you won't treat IEEE
notations like C<NaN> or C<Infinity> properly.  At other times, you
might prefer to determine whether string data can be used numerically
by calling the POSIX::strtod() function or by inspecting your string
with a regular expression (as documented in L<perlre>).

    warn "has nondigits"	if     /\D/;
    warn "not a natural number" unless /^\d+$/;             # rejects -3
    warn "not an integer"       unless /^-?\d+$/;           # rejects +3
    warn "not an integer"       unless /^[+-]?\d+$/;
    warn "not a decimal number" unless /^-?\d+\.?\d*$/;     # rejects .2
    warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/;
    warn "not a C float"
	unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
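As an alternative to the regular expressions, here is a hedged sketch of
the POSIX::strtod() approach.  The helper name C<is_numeric> is
hypothetical.  In list context strtod() returns the parsed number and
the count of characters it could not parse; the result can depend on the
locale in effect, so treat this as an illustration rather than a
definitive test.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: true if the whole string parses as a number.
sub is_numeric {
    my ($str) = @_;
    return 0 unless defined($str) && length($str);
    my ($num, $unparsed) = POSIX::strtod($str);
    return $unparsed == 0 ? 1 : 0;
}

for my $candidate ('42', '-1e5', '12abc', 'abc') {
    printf "%-6s %s\n", $candidate,
        is_numeric($candidate) ? "looks numeric" : "not numeric";
}
```

The explicit length check is there because an empty string leaves
nothing unparsed and would otherwise be reported as numeric.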

The length of an array is a scalar value.  You may find the length
of array @days by evaluating C<$#days>, as in B<csh>.  However, this
isn't the length of the array; it's the subscript of the last element,
which is a different value since there is ordinarily a 0th element.
Assigning to C<$#days> actually changes the length of the array.
Shortening an array this way destroys intervening values.  Lengthening
an array that was previously shortened does not recover values
that were in those elements.  (It used to do so in Perl 4, but we
had to break this to make sure destructors were called when expected.)
X<$#> X<array, length>
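The effect of assigning to C<$#days> can be sketched as follows (the
array contents are hypothetical):

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri);

$#days = 2;                        # shorten: Thu and Fri are destroyed
my @after_shrink = @days;

$#days = 4;                        # lengthen again: new slots are undef
my $recovered = defined($days[3]) ? 1 : 0;
my $length    = scalar(@days);

$days[9] = 'Sun';                  # assigning past the end also extends
my $extended = scalar(@days);

print "@after_shrink / $recovered / $length / $extended\n";
```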

You can also gain some minuscule measure of efficiency by pre-extending
an array that is going to get big.  You can also extend an array
by assigning to an element that is off the end of the array.  You
can truncate an array down to nothing by assigning the null list
() to it.  The following are equivalent:

    @whatever = ();
    $#whatever = -1;

If you evaluate an array in scalar context, it returns the length
of the array.  (Note that this is not true of lists, which return
the last value, like the C comma operator, nor of built-in functions,
which return whatever they feel like returning.)  The following is
always true:
X<array, length>

    scalar(@whatever) == $#whatever - $[ + 1;

Version 5 of Perl changed the semantics of C<$[>: files that don't set
the value of C<$[> no longer need to worry about whether another
file changed its value.  (In other words, use of C<$[> is deprecated.)
So in general you can assume that

    scalar(@whatever) == $#whatever + 1;

Some programmers choose to use an explicit conversion so as to 
leave nothing to doubt:

    $element_count = scalar(@whatever);
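
A minimal sketch of the relationship between C<$#array> and the element
count, and of what happens when an array is shortened and then
lengthened again.  The array contents are made up for the example.

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri);

my $last_index = $#days;         # subscript of the last element
my $count      = scalar(@days);  # number of elements

$#days = 2;   # shorten: "Thu" and "Fri" are destroyed
$#days = 4;   # lengthen again: the new slots hold undef

my $recovered = defined $days[3] ? 1 : 0;
print "last index $last_index, count $count, recovered $recovered\n";
```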

If you evaluate a hash in scalar context, it returns false if the
hash is empty.  If there are any key/value pairs, it returns true;
more precisely, the value returned is a string consisting of the
number of used buckets and the number of allocated buckets, separated
by a slash.  This is pretty much useful only to find out whether
Perl's internal hashing algorithm is performing poorly on your data
set.  For example, you stick 10,000 things in a hash, but evaluating
%HASH in scalar context reveals C<"1/16">, which means only one out
of sixteen buckets has been touched, and presumably contains all
10,000 of your items.  This isn't supposed to happen.
X<hash, scalar context> X<hash, bucket> X<bucket>

You can preallocate space for a hash by assigning to the keys() function.
This rounds up the allocated buckets to the next power of two:

    keys(%users) = 1000;		# allocate 1024 buckets
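
The bucket string described above is mostly of diagnostic interest, and
Perl releases later than the one documented here (5.26 onward) return the
key count instead.  For everyday code, a sketch like the following tests
for emptiness and counts keys without depending on that string.  The user
names are invented for the example.

```perl
use strict;
use warnings;

my %users;
keys(%users) = 1000;                   # preallocate; creates no keys

my $empty_before = %users ? 0 : 1;     # an empty hash is false

$users{$_} = 1 for qw(alice bob carol);   # hypothetical entries
my $user_count = keys %users;          # keys() in scalar context counts

print "empty before: $empty_before, users now: $user_count\n";
```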

=head2 Scalar value constructors
X<scalar, literal> X<scalar, constant>

Numeric literals are specified in any of the following floating point or
integer formats:

    .23E-10             # a very small number
    3.14_15_92          # a very important number
    4_294_967_296       # underscore for legibility
    0xff                # hex
    0xdead_beef         # more hex   
    0377                # octal (only numbers, begins with 0)
    0b011011            # binary

You are allowed to use underscores (underbars) in numeric literals
between digits for legibility.  You could, for example, group binary
digits by threes (as for a Unix-style mode argument such as 0b110_100_100)
or by fours (to represent nibbles, as in 0b1010_0110) or in other groups.
X<number, literal>

String literals are usually delimited by either single or double
quotes.  They work much like quotes in the standard Unix shells:
double-quoted string literals are subject to backslash and variable
substitution; single-quoted strings are not (except for C<\'> and
C<\\>).  The usual C-style backslash rules apply for making
characters such as newline, tab, etc., as well as some more exotic
forms.  See L<perlop/"Quote and Quote-like Operators"> for a list.
X<string, literal>

Hexadecimal, octal, or binary representations in string literals
(e.g. '0xff') are not automatically converted to their integer
representation.  The hex() and oct() functions make these conversions
for you.  See L<perlfunc/hex> and L<perlfunc/oct> for more details.
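
A short sketch of the difference between a numeric literal and a string
that merely looks like one.  Note that C<oct()> also accepts strings with
a leading C<0x> or C<0b>.

```perl
use strict;
use warnings;

my $from_hex = hex('0xff');       # 255
my $from_oct = oct('0377');       # 255
my $from_bin = oct('0b011011');   # 27
my $also_hex = oct('0xff');       # oct() understands the 0x prefix too

my $naive;
{
    no warnings 'numeric';        # the next line would otherwise warn
    $naive = '0xff' + 0;          # only the leading "0" is numeric
}

print "$from_hex $from_oct $from_bin $also_hex $naive\n";
```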

You can also embed newlines directly in your strings, i.e., they can end
on a different line than they begin.  This is nice, but if you forget
your trailing quote, the error will not be reported until Perl finds
another line containing the quote character, which may be much further
on in the script.  Variable substitution inside strings is limited to
scalar variables, arrays, and array or hash slices.  (In other words,
names beginning with $ or @, followed by an optional bracketed
expression as a subscript.)  The following code segment prints out "The
price is $Z<>100."

    $Price = '$100';	# not interpolated
    print "The price is $Price.\n";	# interpolated

There is no double interpolation in Perl, so the C<$100> is left as is.

As in some shells, you can enclose the variable name in braces to
disambiguate it from following alphanumerics (and underscores).
You must also do
this when interpolating a variable into a string to separate the
variable name from a following double-colon or an apostrophe, since
these would be otherwise treated as a package separator:

    $who = "Larry";
    print PASSWD "${who}::0:0:Superuser:/:/bin/perl\n";
    print "We use ${who}speak when ${who}'s here.\n";

Without the braces, Perl would have looked for a $whospeak, a
C<$who::0>, and a C<$who's> variable.  The last two would be the
$0 and the $s variables in the (presumably) non-existent package C<who>.

In fact, an identifier within such curlies is forced to be a string,
as is any simple identifier within a hash subscript.  Neither need
quoting.  Our earlier example, C<$days{'Feb'}> can be written as
C<$days{Feb}> and the quotes will be assumed automatically.  But
anything more complicated in the subscript will be interpreted as an
expression.  This means for example that C<$version{2.0}++> is
equivalent to C<$version{2}++>, not to C<$version{'2.0'}++>.
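
The two rules above can be sketched together: braces delimit a variable
name during interpolation, and a simple identifier in a hash subscript is
quoted for you while anything else is evaluated as an expression.  The
variable names and values are illustrative.

```perl
use strict;
use warnings;

my $who  = "Larry";
my $line = "${who}::0:0:Superuser";         # braces stop at "who"
my $talk = "${who}speak when ${who}'s here";

my %days = (Feb => 28);
my $same = $days{Feb} == $days{'Feb'} ? 1 : 0;

my %version;
$version{2.0}++;      # expression: the number 2.0 stringifies as "2"
$version{'2.0'}++;    # string: the key is "2.0"

print "$line\n$talk\n", join(",", sort keys %version), "\n";
```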

=head3 Version Strings
X<version string> X<vstring> X<v-string>

B<Note:> Version Strings (v-strings) have been deprecated.  They will
not be available after Perl 5.8.  The marginal benefits of v-strings
were greatly outweighed by the potential for Surprise and Confusion.

A literal of the form C<v1.20.300.4000> is parsed as a string composed
of characters with the specified ordinals.  This form, known as
v-strings, provides an alternative, more readable way to construct
strings, rather than use the somewhat less readable interpolation form
C<"\x{1}\x{14}\x{12c}\x{fa0}">.  This is useful for representing
Unicode strings, and for comparing version "numbers" using the string
comparison operators, C<cmp>, C<gt>, C<lt> etc.  If there are two or
more dots in the literal, the leading C<v> may be omitted.

    print v9786;              # prints UTF-8 encoded SMILEY, "\x{263a}"
    print v102.111.111;       # prints "foo"
    print 102.111.111;        # same

Such literals are accepted by both C<require> and C<use> for
doing a version check.  The C<$^V> special variable also contains the
running Perl interpreter's version in this form.  See L<perlvar/$^V>.
Note that using the v-strings for IPv4 addresses is not portable unless
you also use the inet_aton()/inet_ntoa() routines of the Socket package.

Note that since Perl 5.8.1 the single-number v-strings (like C<v65>)
are not v-strings before the C<< => >> operator (which is usually used
to separate a hash key from a hash value), instead they are interpreted
as literal strings ('v65').  They were v-strings from Perl 5.6.0 to
Perl 5.8.0, but that caused more confusion and breakage than good.
Multi-number v-strings like C<v65.66> and C<65.66.67> continue to
be v-strings always.

=head3 Special Literals
X<special literal> X<__END__> X<__DATA__> X<END> X<DATA>
X<end> X<data> X<^D> X<^Z>

The special literals __FILE__, __LINE__, and __PACKAGE__
represent the current filename, line number, and package name at that
point in your program.  They may be used only as separate tokens; they
will not be interpolated into strings.  If there is no current package
(due to an empty C<package;> directive), __PACKAGE__ is the undefined
value.
X<__FILE__> X<__LINE__> X<__PACKAGE__> X<line> X<file> X<package>
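
A small sketch of these tokens in use.  The package name
C<Demo::Literals> is hypothetical; the point is that the tokens are
evaluated where they appear and are left alone inside quoted strings.

```perl
use strict;
use warnings;

{
    package Demo::Literals;    # hypothetical package name

    sub where {
        return sprintf "%s line %d, package %s",
            __FILE__, __LINE__, __PACKAGE__;
    }
}

my $report           = Demo::Literals::where();
my $not_interpolated = "__PACKAGE__";   # stays literal text in a string

print "$report\n$not_interpolated\n";
```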

The two control characters ^D and ^Z, and the tokens __END__ and __DATA__
may be used to indicate the logical end of the script before the actual
end of file.  Any following text is ignored.

Text after __DATA__ may be read via the filehandle C<PACKNAME::DATA>,
where C<PACKNAME> is the package that was current when the __DATA__
token was encountered.  The filehandle is left open pointing to the
contents after __DATA__.  It is the program's responsibility to
C<close DATA> when it is done reading from it.  For compatibility with
older scripts written before __DATA__ was introduced, __END__ behaves
like __DATA__ in the toplevel script (but not in files loaded with
C<require> or C<do>) and leaves the remaining contents of the
file accessible via C<main::DATA>.

See L<SelfLoader> for more description of __DATA__, and
an example of its use.  Note that you cannot read from the DATA
filehandle in a BEGIN block: the BEGIN block is executed as soon
as it is seen (during compilation), at which point the corresponding
__DATA__ (or __END__) token has not yet been seen.

=head3 Barewords

A word that has no other interpretation in the grammar will
be treated as if it were a quoted string.  These are known as
"barewords".  As with filehandles and labels, a bareword that consists
entirely of lowercase letters risks conflict with future reserved
words, and if you use the C<use warnings> pragma or the B<-w> switch, 
Perl will warn you about any
such words.  Some people may wish to outlaw barewords entirely.  If you
say

    use strict 'subs';

then any bareword that would NOT be interpreted as a subroutine call
produces a compile-time error instead.  The restriction lasts to the
end of the enclosing block.  An inner block may countermand this
by saying C<no strict 'subs'>.

=head3 Array Joining Delimiter
X<array, interpolation> X<interpolation, array> X<$">

Arrays and slices are interpolated into double-quoted strings
by joining the elements with the delimiter specified in the C<$">
variable (C<$LIST_SEPARATOR> if "use English;" is specified), 
space by default.  The following are equivalent:

    $temp = join($", @ARGV);
    system "echo $temp";

    system "echo @ARGV";
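
As a sketch of changing the delimiter for a limited scope, C<local> can
give C<$"> a temporary value.  The list contents here are made up.

```perl
use strict;
use warnings;

my @args = qw(alpha beta gamma);

my $default = "@args";            # joined with a single space

my $csv;
{
    local $" = ',';               # restored when the block exits
    $csv = "@args";
}

my $after = "@args";              # back to the default separator
print "$default\n$csv\n$after\n";
```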

Within search patterns (which also undergo double-quotish substitution)
there is an unfortunate ambiguity:  Is C</$foo[bar]/> to be interpreted as
C</${foo}[bar]/> (where C<[bar]> is a character class for the regular
expression) or as C</${foo[bar]}/> (where C<[bar]> is the subscript to array
@foo)?  If @foo doesn't otherwise exist, then it's obviously a
character class.  If @foo exists, Perl takes a good guess about C<[bar]>,
and is almost always right.  If it does guess wrong, or if you're just
plain paranoid, you can force the correct interpretation with curly
braces as above.

If you're looking for the information on how to use here-documents,
which used to be here, that's been moved to
L<perlop/Quote and Quote-like Operators>.

=head2 List value constructors

List values are denoted by separating individual values by commas
(and enclosing the list in parentheses where precedence requires it):

    (LIST)

In a context not requiring a list value, the value of what appears
to be a list literal is simply the value of the final element, as
with the C comma operator.  For example,

    @foo = ('cc', '-E', $bar);

assigns the entire list value to array @foo, but

    $foo = ('cc', '-E', $bar);

assigns the value of variable $bar to the scalar variable $foo.
Note that the value of an actual array in scalar context is the
length of the array; the following assigns the value 3 to $foo:

    @foo = ('cc', '-E', $bar);
    $foo = @foo;                # $foo gets 3
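
The three cases can be put side by side in one sketch.  The value of
C<$bar> is invented, and the C<void> warning is switched off only because
the list literal in scalar context deliberately discards its first two
elements.

```perl
use strict;
use warnings;

my $bar = 'file.c';
my @foo = ('cc', '-E', $bar);

my $count   = @foo;        # array in scalar context: its length
my ($first) = @foo;        # list assignment: the first element

my $last;
{
    no warnings 'void';
    $last = ('cc', '-E', $bar);   # comma operator: the final value
}

print "$count $first $last\n";
```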

You may have an optional comma before the closing parenthesis of a
list literal, so that you can say:

    @foo = (
        1,
        2,
        3,
    );

To use a here-document to assign an array, one line per element,
you might use an approach like this:

    @sauces = <<End_Lines =~ m/(\S.*\S)/g;
        normal tomato
        spicy tomato
        green chile
        white wine
    End_Lines

LISTs do automatic interpolation of sublists.  That is, when a LIST is
evaluated, each element of the list is evaluated in list context, and
the resulting list value is interpolated into LIST just as if each
individual element were a member of LIST.  Thus arrays and hashes lose their
identity in a LIST--the list

    (@foo,@bar,&SomeSub,%glarch)

contains all the elements of @foo followed by all the elements of @bar,
followed by all the elements returned by the subroutine named SomeSub 
called in list context, followed by the key/value pairs of %glarch.
To make a list reference that does I<NOT> interpolate, see L<perlref>.
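
A brief sketch of the contrast between flattening and references.  The
variable names echo the ones above; their contents are assumptions.

```perl
use strict;
use warnings;

my @foo    = (1, 2);
my @bar    = (3, 4);
my %glarch = (key => 'value');

my @flat   = (@foo, @bar, %glarch);      # six scalars, identities lost
my @nested = (\@foo, \@bar, \%glarch);   # three references, nothing flattened

print scalar(@flat), " flat, ", scalar(@nested), " nested\n";
print "second array starts with $nested[1][0]\n";
```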

The null list is represented by ().  Interpolating it in a list
has no effect.  Thus ((),(),()) is equivalent to ().  Similarly,
interpolating an array with no elements is the same as if no
array had been interpolated at that point.

This interpolation combines with the facts that the opening
and closing parentheses are optional (except when necessary for
precedence) and lists may end with an optional comma to mean that
multiple commas within lists are legal syntax. The list C<1,,3> is a
concatenation of two lists, C<1,> and C<3>, the first of which ends
with that optional comma.  C<1,,3> is C<(1,),(3)> is C<1,3> (And
similarly for C<1,,,3> is C<(1,),(,),3> is C<1,3> and so on.)  Not that
we'd advise you to use this obfuscation.

A list value may also be subscripted like a normal array.  You must
put the list in parentheses to avoid ambiguity.  For example:

    # Stat returns list value.
    $time = (stat($file))[8];

    $time = stat($file)[8];  # OOPS, FORGOT PARENTHESES

    # Find a hex digit.
    $hexdigit = ('a','b','c','d','e','f')[$digit-10];

    # A "reverse comma operator".
    return (pop(@foo),pop(@foo))[0];

Lists may be assigned to only when each element of the list
is itself legal to assign to:

    ($a, $b, $c) = (1, 2, 3);

    ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);

An exception to this is that you may assign to C<undef> in a list.
This is useful for throwing away some of the return values of a
function:

    ($dev, $ino, undef, undef, $uid, $gid) = stat($file);

List assignment in scalar context returns the number of elements
produced by the expression on the right side of the assignment:

    $x = (($foo,$bar) = (3,2,1));       # set $x to 3, not 2
    $x = (($foo,$bar) = f());           # set $x to f()'s return count

This is handy when you want to do a list assignment in a Boolean
context, because most list functions return a null list when finished,
which when assigned produces a 0, which is interpreted as FALSE.

It's also the source of a useful idiom for executing a function or
performing an operation in list context and then counting the number of
return values, by assigning to an empty list and then using that
assignment in scalar context. For example, this code:

    $count = () = $string =~ /\d+/g;

will place into $count the number of digit groups found in $string.
This happens because the pattern match is in list context (since it
is being assigned to the empty list), and will therefore return a list
of all matching parts of the string. The list assignment in scalar
context will translate that into the number of elements (here, the
number of times the pattern matched) and assign that to $count. Note
that simply using

    $count = $string =~ /\d+/g;

would not have worked, since a pattern match in scalar context will
only return true or false, rather than a count of matches.
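
The counting idiom and the scalar value of a list assignment can be
checked with a sketch like this one.  The sample string is made up.

```perl
use strict;
use warnings;

my $string = "10 apples, 20 pears, 30 plums";

my $count = () = $string =~ /\d+/g;       # number of matches
my ($first, $second) = $string =~ /\d+/g; # the matches themselves

# The scalar value counts the right-hand side, not the left.
my $rhs_count = (my ($x, $y) = (3, 2, 1));

print "$count matches, first two are $first and $second\n";
print "right-hand side had $rhs_count elements\n";
```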

The final element of a list assignment may be an array or a hash:

    ($a, $b, @rest) = split;
    my($a, $b, %rest) = @_;

You can actually put an array or hash anywhere in the list, but the first one
in the list will soak up all the values, and anything after it will become
undefined.  This may be useful in a my() or local().
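
A minimal sketch of an array in the middle of the list soaking up
everything that is left.  The variable names are illustrative.

```perl
use strict;
use warnings;

my ($head, @rest, $never) = (1, 2, 3, 4);

my $never_set = defined $never ? 1 : 0;
print "head=$head rest=@rest never_set=$never_set\n";
```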

A hash can be initialized using a literal list holding pairs of
items to be interpreted as a key and a value:

    # same as map assignment above
    %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);

While literal lists and named arrays are often interchangeable, that's
not the case for hashes.  Just because you can subscript a list value like
a normal array does not mean that you can subscript a list value as a
hash.  Likewise, hashes included as parts of other lists (including
parameter lists and return lists from functions) always flatten out into
key/value pairs.  That's why it's good to use references sometimes.

It is often more readable to use the C<< => >> operator between key/value
pairs.  The C<< => >> operator is mostly just a more visually distinctive
synonym for a comma, but it also arranges for its left-hand operand to be
interpreted as a string -- if it's a bareword that would be a legal simple
identifier (C<< => >> doesn't quote compound identifiers, that contain
double colons). This makes it nice for initializing hashes:

    %map = (
                 red   => 0x00f,
                 blue  => 0x0f0,
                 green => 0xf00,
    );

or for initializing hash references to be used as records:

    $rec = {
                witch => 'Mable the Merciless',
                cat   => 'Fluffy the Ferocious',
                date  => '10/31/1776',
    };

or for using call-by-named-parameter to complicated functions:

   $field = $query->radio_group(
               name      => 'group_name',
               values    => ['eenie','meenie','minie'],
               default   => 'meenie',
               linebreak => 'true',
               labels    => \%labels
   );

Note that just because a hash is initialized in that order doesn't
mean that it comes out in that order.  See L<perlfunc/sort> for examples
of how to arrange for an output ordering.
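
One way to get a predictable order, sketched here with the color map
from above, is to sort the keys yourself, either by name or by the
associated value.

```perl
use strict;
use warnings;

my %map = (red => 0x00f, blue => 0x0f0, green => 0xf00);

my @by_name  = sort keys %map;
my @by_value = sort { $map{$a} <=> $map{$b} } keys %map;

print "by name:  @by_name\n";
print "by value: @by_value\n";
```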

=head2 Subscripts

An array is subscripted by specifying a dollar sign (C<$>), then the
name of the array (without the leading C<@>), then the subscript inside
square brackets.  For example:

    @myarray = (5, 50, 500, 5000);
    print "Element Number 2 is ", $myarray[2], "\n";

The array indices start with 0. A negative subscript retrieves its 
value from the end.  In our example, C<$myarray[-1]> would have been 
5000, and C<$myarray[-2]> would have been 500.

Hash subscripts are similar, only instead of square brackets curly brackets
are used. For example:

    %scientists =
    (
        "Newton" => "Isaac",
        "Einstein" => "Albert",
        "Darwin" => "Charles",
        "Feynman" => "Richard",
    );

    print "Darwin's First Name is ", $scientists{"Darwin"}, "\n";

=head2 Slices
X<slice> X<array, slice> X<hash, slice>

A common way to access an array or a hash is one scalar element at a
time.  You can also subscript a list to get a single element from it.

    $whoami = $ENV{"USER"};             # one element from the hash
    $parent = $ISA[0];                  # one element from the array
    $dir    = (getpwnam("daemon"))[7];  # likewise, but with list

A slice accesses several elements of a list, an array, or a hash
simultaneously using a list of subscripts.  It's more convenient
than writing out the individual elements as a list of separate
scalar values.

    ($him, $her)   = @folks[0,-1];              # array slice
    @them          = @folks[0 .. 3];            # array slice
    ($who, $home)  = @ENV{"USER", "HOME"};      # hash slice
    ($uid, $dir)   = (getpwnam("daemon"))[2,7]; # list slice

Since you can assign to a list of variables, you can also assign to
an array or hash slice.

    @days[3..5]    = qw/Wed Thu Fri/;
    @colors{'red','blue','green'}
                   = (0xff0000, 0x0000ff, 0x00ff00);
    @folks[0, -1]  = @folks[-1, 0];

The previous assignments are exactly equivalent to

    ($days[3], $days[4], $days[5]) = qw/Wed Thu Fri/;
    ($colors{'red'}, $colors{'blue'}, $colors{'green'})
                   = (0xff0000, 0x0000ff, 0x00ff00);
    ($folks[0], $folks[-1]) = ($folks[-1], $folks[0]);
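
The slice assignments can be tried out in a self-contained sketch.  The
contents of C<@folks> are invented for the example.

```perl
use strict;
use warnings;

my @folks = qw(ann bob cid dee);
my %colors;

@colors{qw(red blue green)} = (0xff0000, 0x0000ff, 0x00ff00);
@folks[0, -1] = @folks[-1, 0];            # swap first and last

my ($him, $her) = @folks[0, -1];          # array slice
my @rgb         = @colors{qw(red green blue)};   # hash slice

print "@folks\n";
printf "%06x %06x %06x\n", @rgb;
```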

Since changing a slice changes the original array or hash that it's
slicing, a C<foreach> construct will alter some--or even all--of the
values of the array or hash.

    foreach (@array[ 4 .. 10 ]) { s/peter/paul/ } 

    foreach (@hash{qw[key1 key2]}) {
        s/^\s+//;           # trim leading whitespace
        s/\s+$//;           # trim trailing whitespace
        s/(\w+)/\u\L$1/g;   # "titlecase" words
    }

A slice of an empty list is still an empty list.  Thus:

    @a = ()[1,0];           # @a has no elements
    @b = (@a)[0,1];         # @b has no elements
    @c = (0,1)[2,3];        # @c has no elements

But:

    @a = (1)[1,0];          # @a has two elements
    @b = (1,undef)[1,0,2];  # @b has three elements

This makes it easy to write loops that terminate when a null list
is returned:

    while ( ($home, $user) = (getpwent)[7,0]) {
        printf "%-8s %s\n", $user, $home;
    }

As noted earlier in this document, the scalar sense of list assignment
is the number of elements on the right-hand side of the assignment.
The null list contains no elements, so when the password file is
exhausted, the result is 0, not 2.

If you're confused about why you use an '@' there on a hash slice
instead of a '%', think of it like this.  The type of bracket (square
or curly) governs whether it's an array or a hash being looked at.
On the other hand, the leading symbol ('$' or '@') on the array or
hash indicates whether you are getting back a singular value (a
scalar) or a plural one (a list).
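
The sigil rule, together with the earlier point that a C<foreach> over a
slice modifies the underlying hash, can be sketched as follows.  The keys
and values are assumptions for the example.

```perl
use strict;
use warnings;

my %hash = (
    key1 => '  hello world ',
    key2 => ' PERL data',
    key3 => ' untouched ',
);

# '@' with curly brackets: several values from a hash.
foreach (@hash{qw(key1 key2)}) {
    s/^\s+//;             # trim leading whitespace
    s/\s+$//;             # trim trailing whitespace
    s/(\w+)/\u\L$1/g;     # "titlecase" words
}

# '$' with curly brackets: a single value from the same hash.
my $one = $hash{key1};
print "[$one] [$hash{key2}] [$hash{key3}]\n";
```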

=head2 Typeglobs and Filehandles
X<typeglob> X<filehandle> X<*>

Perl uses an internal type called a I<typeglob> to hold an entire
symbol table entry.  The type prefix of a typeglob is a C<*>, because
it represents all types.  This used to be the preferred way to
pass arrays and hashes by reference into a function, but now that
we have real references, this is seldom needed.  

The main use of typeglobs in modern Perl is to create symbol table aliases.
This assignment:

    *this = *that;

makes $this an alias for $that, @this an alias for @that, %this an alias
for %that, &this an alias for &that, etc.  Much safer is to use a reference.

    local *Here::blue = \$There::green;

temporarily makes $Here::blue an alias for $There::green, but doesn't
make @Here::blue an alias for @There::green, or %Here::blue an alias for
%There::green, etc.  See L<perlmod/"Symbol Tables"> for more examples
of this.  Strange though this may seem, this is the basis for the whole
module import/export system.
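
A sketch of aliasing a single slot through a reference, using package
variables in C<main> rather than the C<Here> and C<There> packages named
above.  The variable names and values are assumptions.

```perl
use strict;
use warnings;

our $green = 'emerald';
our @green = (1, 2, 3);
our ($blue, @blue);

my $array_aliased;
{
    local *blue = \$green;    # alias only the scalar slot
    $blue = 'jade';           # writes through to $green
    $array_aliased = @blue ? 1 : 0;   # @blue is not @green
}

my $blue_after = defined $blue ? 1 : 0;   # the alias ended with the block
print "green=$green array_aliased=$array_aliased blue_after=$blue_after\n";
```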

Another use for typeglobs is to pass filehandles into a function or
to create new filehandles.  If you need to use a typeglob to save away
a filehandle, do it this way:

    $fh = *STDOUT;

or perhaps as a real reference, like this:

    $fh = \*STDOUT;

See L<perlsub> for examples of using these as indirect filehandles
in functions.

Typeglobs are also a way to create a local filehandle using the local()
operator.  These last until their block is exited, but may be passed back.
For example:

    sub newopen {
        my $path = shift;
        local  *FH;  # not my!
        open   (FH, $path)          or  return undef;
        return *FH;
    }
    $fh = newopen('/etc/passwd');

Now that we have the C<*foo{THING}> notation, typeglobs aren't used as much
for filehandle manipulations, although they're still needed to pass brand
new file and directory handles into or out of functions. That's because
C<*HANDLE{IO}> only works if HANDLE has already been used as a handle.
In other words, C<*FH> must be used to create new symbol table entries;
C<*foo{THING}> cannot.  When in doubt, use C<*FH>.

All functions that are capable of creating filehandles (open(),
opendir(), pipe(), socketpair(), sysopen(), socket(), and accept())
automatically create an anonymous filehandle if the handle passed to
them is an uninitialized scalar variable. This allows the constructs
such as C<open(my $fh, ...)> and C<open(local $fh,...)> to be used to
create filehandles that will conveniently be closed automatically when
the scope ends, provided there are no other references to them. This
largely eliminates the need for typeglobs when opening filehandles
that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
             or die "Can't open '@_': $!";
        return $fh;
    }

    {
        my $f = myopen("</etc/motd");
        print <$f>;
        # $f implicitly closed here
    }

Note that if an initialized scalar variable is used instead, the
result is different: C<my $fh='zzz'; open($fh, ...)> is equivalent
to C<open( *{'zzz'}, ...)>.
C<use strict 'refs'> forbids such practice.
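
A self-contained sketch of passing a lexical filehandle out of a
subroutine.  It uses the three-argument form of C<open()> instead of the
two-argument form shown above, and writes a scratch file with File::Temp
so that it does not depend on any particular path; the helper name
C<open_for_reading> is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns a lexical filehandle opened for reading.
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open '$path': $!";
    return $fh;
}

# Create a scratch file to read back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\nsecond line\n";
close($tmp_fh) or die "Can't close '$tmp_path': $!";

my @lines;
{
    my $in = open_for_reading($tmp_path);
    @lines = <$in>;
    # $in is closed when it goes out of scope here
}

print "read ", scalar(@lines), " lines\n";
```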

Another way to create anonymous filehandles is with the Symbol
module or with the IO::Handle module and its ilk.  These modules
have the advantage of not hiding different types of the same name
during the local().  See the bottom of L<perlfunc/open()> for an
example.

=head1 SEE ALSO

See L<perlvar> for a description of Perl's built-in variables and
a discussion of legal variable names.  See L<perlref>, L<perlsub>,
and L<perlmod/"Symbol Tables"> for more discussion on typeglobs and
the C<*foo{THING}> syntax.

--- NEW FILE: perl561delta.pod ---
=head1 NAME

perl561delta - what's new for perl v5.6.x


=head1 DESCRIPTION

This document describes differences between the 5.005 release and the 5.6.1
release.

=head1 Summary of changes between 5.6.0 and 5.6.1

This section contains a summary of the changes between the 5.6.0 release
and the 5.6.1 release.  More details about the changes mentioned here
may be found in the F<Changes> files that accompany the Perl source
distribution.  See L<perlhack> for pointers to online resources where you
can inspect the individual patches described by these changes.

=head2 Security Issues

[...3622 lines suppressed...]
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Gurusamy Sarathy <F<gsar at ActiveState.com>>, with many
contributions from The Perl Porters.

Send omissions or corrections to <F<perlbug at perl.org>>.


--- NEW FILE: perlebcdic.pod ---
=head1 NAME

perlebcdic - Considerations for running Perl on EBCDIC platforms


=head1 DESCRIPTION

An exploration of some of the issues facing Perl programmers
on EBCDIC based computers.  We do not cover localization, 
internationalization, or multi byte character set issues other
than some discussion of UTF-8 and UTF-EBCDIC.

Portions that are still incomplete are marked with XXX.


=head2 ASCII

The American Standard Code for Information Interchange is a set of
integers running from 0 to 127 (decimal) that imply character 
[...1355 lines suppressed...]

B<IBM - EBCDIC and the P-bit; The biggest Computer Goof Ever> Robert Bemer.

=head1 HISTORY

15 April 2001: added UTF-8 and UTF-EBCDIC to main table, pvhp.

=head1 AUTHOR

Peter Prymmer pvhp at best.com wrote this in 1999 and 2000 
with CCSID 0819 and 0037 help from Chris Leach and 
AndrE<eacute> Pirard A.Pirard at ulg.ac.be as well as POSIX-BC 
help from Thomas Dorner Thomas.Dorner at start.de.
Thanks also to Vickie Cooper, Philip Newton, William Raffloer, and 
Joe Smith.  Trademarks, registered trademarks, service marks and 
registered service marks used in this document are the property of 
their respective owners.

--- NEW FILE: perlcheat.pod ---
=head1 NAME

perlcheat - Perl 5 Cheat Sheet


=head1 DESCRIPTION

This 'cheat sheet' is a handy reference, meant for beginning Perl
programmers. Not everything is mentioned, but 194 features may
already be overwhelming.

=head2 The sheet

  void      $scalar   whole:   @array        %hash
  scalar    @array    slice:   @array[0, 2]  @hash{'a', 'b'}
  list      %hash     element: $array[0]     $hash{'a'}
            *glob    SCALAR VALUES
                     number, string, reference, glob, undef
  \     references      $$foo[1]       aka $foo->[1]
  $@%&* dereference     $$foo{bar}     aka $foo->{bar}
  []    anon. arrayref  ${$$foo[1]}[2] aka $foo->[1]->[2]
  {}    anon. hashref   ${$$foo[1]}[2] aka $foo->[1][2]
  \()   list of refs
                          NUMBERS vs STRINGS  LINKS
  OPERATOR PRECEDENCE     =          =        perl.plover.com
  ->                      +          .        search.cpan.org
  ++ --                   == !=      eq ne         cpan.org
  **                      < > <= >=  lt gt le ge   pm.org
  ! ~ \ u+ u-             <=>        cmp           tpj.com
  =~ !~                                            perldoc.com
  * / % x                 SYNTAX
  + - .                   for    (LIST) { }, for (a;b;c) { }
  << >>                   while  ( ) { }, until ( ) { }
  named uops              if     ( ) { } elsif ( ) { } else { }
  < > <= >= lt gt le ge   unless ( ) { } elsif ( ) { } else { }
  == != <=> eq ne cmp     for equals foreach (ALWAYS)
  | ^              REGEX METACHARS            REGEX MODIFIERS
  &&               ^     string begin         /i case insens.
  ||               $     str. end (before \n) /m line based ^$
  .. ...           +     one or more          /s . includes \n
  ?:               *     zero or more         /x ign. wh.space
  = += -= *= etc.  ?     zero or one          /g global
  , =>             {3,7} repeat in range
  list ops         ()    capture          REGEX CHARCLASSES
  not              (?:)  no capture       .  == [^\n]
  and              []    character class  \s == [\x20\f\t\r\n]
  or xor           |     alternation      \w == [A-Za-z0-9_]
                   \b    word boundary    \d == [0-9]
                   \z    string end       \S, \W and \D negate
  use strict;        DON'T            LINKS
  use warnings;      "$foo"           perl.com
  my $var;           $$variable_name  perlmonks.org
  open() or die $!;  `$userinput`     use.perl.org
  use Modules;       /$userinput/     perl.apache.org
  stat      localtime    caller         SPECIAL VARIABLES
   0 dev    0 second     0 package      $_    default variable
   1 ino    1 minute     1 filename     $0    program name
   2 mode   2 hour       2 line         $/    input separator
   3 nlink  3 day        3 subroutine   $\    output separator
   4 uid    4 month-1    4 hasargs      $|    autoflush
   5 gid    5 year-1900  5 wantarray    $!    sys/libcall error
   6 rdev   6 weekday    6 evaltext     $@    eval error
   7 size   7 yearday    7 is_require   $$    process ID
   8 atime  8 is_dst     8 hints        $.    line number
   9 mtime               9 bitmask      @ARGV command line args
  10 ctime  just use                    @INC  include paths
  11 blksz  POSIX::      3..9 only      @_    subroutine args
  12 blcks  strftime!    with EXPR      %ENV  environment


=head1 ACKNOWLEDGEMENTS

The first version of this document appeared on Perl Monks, where several
people had useful suggestions. Thank you, Perl Monks.

A special thanks to Damian Conway, who didn't only suggest important changes,
but also took the time to count the number of listed features and make a
Perl 6 version to show that Perl will stay Perl.

=head1 AUTHOR

Juerd Waalboer <juerd at cpan.org>, with the help of many Perl Monks.

=head1 SEE ALSO

 http://perlmonks.org/?node_id=216602      the original PM post
 http://perlmonks.org/?node_id=238031      Damian Conway's Perl 6 version
 http://juerd.nl/site.plp/perlcheat        home of the Perl Cheat Sheet

--- NEW FILE: perlgpl.pod ---

=head1 NAME

perlgpl - the GNU General Public License, version 2


 You can refer to this document in Pod via "L<perlgpl>"
 Or you can see this document by entering "perldoc perlgpl"


# Because the following document's language disallows "changing"
# it, we haven't gone thru and prettied it up with =item's or
# anything.  It's good enough the way it is.


This is B<"The GNU General Public License, version 2">.  It's here so
that modules, programs, etc., that want to declare this as their
distribution license, can link to it.

It is also one of the two licenses under which Perl allows itself to be
redistributed and/or modified; for the other one, the Perl Artistic
License, see L<perlartistic>.


		       Version 2, June 1991

 Copyright (C) 1989, 1991 Free Software Foundation, Inc.
                          59 Temple Place - Suite 330, Boston, MA
                          02111-1307, USA.
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.


The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users.  This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it.  (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.)  You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have.  You must make sure that they, too, receive or can get the
source code.  And you must show them these terms so they know their
rights.

We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software.  If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

Finally, any free program is threatened constantly by software
patents.  We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary.  To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and
modification follow.



0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License.  The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language.  (Hereinafter, translation is included without limitation in
the term "modification".)  Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) You must cause the modified files to carry prominent notices
    stating that you changed the files and the date of any change.

    b) You must cause any work that you distribute or publish, that in
    whole or in part contains or is derived from the Program or any
    part thereof, to be licensed as a whole at no charge to all third
    parties under the terms of this License.

    c) If the modified program normally reads commands interactively
    when run, you must cause it, when started running for such
    interactive use in the most ordinary way, to print or display an
    announcement including an appropriate copyright notice and a
    notice that there is no warranty (or else, saying that you provide
    a warranty) and that users may redistribute the program under
    these conditions, and telling the user how to view a copy of this
    License.  (Exception: if the Program itself is interactive but
    does not normally print such an announcement, your work based on
    the Program is not required to print an announcement.)


These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

    a) Accompany it with the complete corresponding machine-readable
    source code, which must be distributed under the terms of Sections
    1 and 2 above on a medium customarily used for software interchange; or,

    b) Accompany it with a written offer, valid for at least three
    years, to give any third party, for a charge no more than your
    cost of physically performing source distribution, a complete
    machine-readable copy of the corresponding source code, to be
    distributed under the terms of Sections 1 and 2 above on a medium
    customarily used for software interchange; or,

    c) Accompany it with the information you received as to the offer
    to distribute corresponding source code.  (This alternative is
    allowed only for noncommercial distribution and only if you
    received the program in object code or executable form with such
    an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it.  For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.  However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.


4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License.  Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

5. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Program or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all.  For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.


8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded.  In such case, this License incorporates
the limitation as if written in the body of this License.

9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number.  If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation.  If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission.  For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this.  Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.






	Appendix: How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) 19yy  <name of author>

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

    Gnomovision version 69, Copyright (C) 19yy name of author
    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type "show w".
    This is free software, and you are welcome to redistribute it
    under certain conditions; type "show c" for details.

The hypothetical commands "show w" and "show c" should show the appropriate
parts of the General Public License.  Of course, the commands you use may
be called something other than "show w" and "show c"; they could even be
mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
  "Gnomovision" (which makes passes at compilers) written by James Hacker.

  <signature of Ty Coon>, 1 April 1989
  Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs.  If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library.  If this is what you want to do, use the GNU Library General
Public License instead of this License.



--- NEW FILE: pod2latex.PL ---

use Config;
use File::Basename qw(&basename &dirname);
use Cwd;

# List explicitly here the variables you want Configure to
# generate.  Metaconfig only looks for shell variables, so you
# have to mention them as if they were shell variables, not
# %Config entries.  Thus you write
#  $startperl
# to ensure Configure will look for $Config{startperl}.

# This forces PL files to create target in same directory as PL file.
# This is so that make depend always knows where to find PL derivatives.
$origdir = cwd;
chdir dirname($0);
$file = basename($0, '.PL');
$file .= '.com' if $^O eq 'VMS';

open OUT,">$file" or die "Can't create $file: $!";

print "Extracting $file (with variable substitutions)\n";

# In this section, perl variables will be expanded during extraction.
# You can use $Config{...} to use Configure variables.

print OUT <<"!GROK!THIS!";
$Config{startperl}
    eval 'exec $Config{perlpath} -S \$0 \${1+"\$@"}'
	if \$running_under_some_shell;
!GROK!THIS!

# In the following, perl variables are not expanded during extraction.

print OUT <<'!NO!SUBS!';

# pod2latex conversion program

use strict;
use Pod::LaTeX;
use Pod::Find qw/ pod_find /;
use Pod::Usage;
use Getopt::Long;
use File::Basename;
use Symbol;

my $VERSION = "1.01";

# return the entire contents of a text file
# whose name is given as argument
sub _get {
    my $fn = shift;
    my $infh = gensym;
    open $infh, $fn
        or die "Could not open file $fn: $!\n";
    local $/;
    return <$infh>;
}

# Read command line arguments

my %options = (
	       "help"   => 0,
	       "man"    => 0,
	       "sections" => [],
	       "full"   => 0,
	       "out"    => undef,
	       "verbose" => 0,
	       "modify" => 0,
	       "h1level" => 1,  # section is equivalent to H1
	       "preamble" => [],
	       "postamble" => [],
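	      );
# "prefile" is just like "preamble", but the value is read from
# the file named by the argument
$options{"prefile"} = sub { shift; push @{$options{"preamble"}}, _get(shift) };
# the same between "postfile" and "postamble"
$options{"postfile"} = sub { shift; push @{$options{"postamble"}}, _get(shift) };

# Option specifications matching the options documented in the pod below
GetOptions(\%options,
	   "help",
	   "man",
	   "verbose",
	   "full",
	   "sections=s@",
	   "out=s",
	   "modify",
	   "h1level=i",
	   "preamble=s@",
	   "postamble=s@",
	   "prefile=s",
	   "postfile=s"
	  ) || pod2usage(2);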

pod2usage(1)  if ($options{help});
pod2usage(-verbose => 2)  if ($options{man});

# Read all the files from the command line
my @files = @ARGV;

# Now find which ones are real pods and convert 
# directories to their contents.

# Extract the pods from each arg since some of them might
# be directories
# This is not as efficient as using pod_find to search through
# everything at once but it allows us to preserve the order 
# supplied by the user

my @pods;
foreach my $arg (@files) {
  my %pods = pod_find($arg);
  push(@pods, sort keys %pods);
}

# Abort if nothing to do
if ($#pods == -1) {
  warn "None of the supplied Pod files actually exist\n";
  exit;
}

# Only want to override the preamble and postamble if we have
# been given values.
my %User;
$User{UserPreamble} = join("\n", @{$options{'preamble'}})
  if ($options{preamble} && @{$options{preamble}});
$User{UserPostamble} = join("\n", @{$options{'postamble'}})
  if ($options{postamble} && @{$options{postamble}});

# If $options{'out'} is set we are processing to a single output file
my $multi_documents;
if (exists $options{'out'} && defined $options{'out'}) {
  $multi_documents = 0;
} else {
  $multi_documents = 1;
}

# If the output file is not specified it is assumed that
# a single output file is required per input file using
# a .tex extension rather than any existing extension

if ($multi_documents) {

  # Case where we just generate one output per input

  foreach my $pod (@pods) {

    if (-f $pod) {

      my $output = $pod;
      $output = basename($output, '.pm', '.pod','.pl') . '.tex';

      # Create a new parser object
      my $parser = new Pod::LaTeX(
				  AddPreamble => $options{'full'},
				  AddPostamble => $options{'full'},
				  MakeIndex => $options{'full'},
				  TableOfContents => $options{'full'},
				  ReplaceNAMEwithSection => $options{'modify'},
				  UniqueLabels => $options{'modify'},
				  Head1Level => $options{'h1level'},
				  LevelNoNum => $options{'h1level'} + 1,
				  %User,
				 );

      # Select sections if supplied
      $parser->select(@{ $options{'sections'}})
	if @{$options{'sections'}};

      # Convert the pod, writing to the output file name derived above
      $parser->parse_from_file($pod, $output);

      print "Written output to $output\n" if $options{'verbose'};

    } else {
      warn "File $pod not found\n";
    }
  }

} else {

  # Case where we want everything to be in a single document

  # Need to open the output file ourselves
  my $output = $options{'out'};
  $output .= '.tex' unless $output =~ /\.tex$/;

  # Use auto-vivified file handle in perl 5.6
  my $outfh = gensym;
  open ($outfh, ">$output") || die "Could not open output file: $!\n";

  # Count of files converted so far; also serves as a flag
  # indicating whether at least one file has been converted
  my $converted = 0;

  # Loop over the input files
  foreach my $pod (@pods) {

    if (-f $pod) {

      warn "Converting $pod\n" if $options{'verbose'};

      # Open the file (need the handle)
      # Use auto-vivified handle in perl 5.6
      my $podfh = gensym;
      open ($podfh, "<$pod") || die "Could not open pod file $pod: $!\n";

      # if this is the first file to be converted we may want to add
      # a preamble (controlled by command line option)
      my $preamble = 0;
      $preamble = 1 if ($converted == 0 && $options{'full'});

      # if this is the last file to be converted may want to add
      # a postamble (controlled by command line option)
      # relies on a previous pass to check existence of all pods we
      # are converting.
      my $postamble = ( ($converted == $#pods && $options{'full'}) ? 1 : 0 );

      # Open parser object
      # May want to start with a preamble for the first one and
      # end with an index for the last
      my $parser = new Pod::LaTeX(
				  MakeIndex => $options{'full'},
				  TableOfContents => $preamble,
				  ReplaceNAMEwithSection => $options{'modify'},
				  UniqueLabels => $options{'modify'},
				  StartWithNewPage => $options{'full'},
				  AddPreamble => $preamble,
				  AddPostamble => $postamble,
				  Head1Level => $options{'h1level'},
				  LevelNoNum => $options{'h1level'} + 1,
				  %User,
				 );

      # Store the file name for error messages
      # This is a kluge that breaks the data hiding of the object
      $parser->{_INFILE} = $pod;

      # Select sections if supplied
      $parser->select(@{ $options{'sections'}})
	if @{$options{'sections'}};

      # Parse it
      $parser->parse_from_filehandle($podfh, $outfh);

      # We have converted at least one file
      $converted++;

    } else {
      warn "File $pod not found\n";
    }
  }


  # Should unlink the file if we didn't convert anything!
  # dont check for return status of unlink
  # since there is not a lot to be done if the unlink failed
  # and the program does not rely upon it.
  unlink "$output" unless $converted;

  # If verbose
  warn "Converted $converted files\n" if $options{'verbose'};
}




=head1 NAME

pod2latex - convert pod documentation to latex format


=head1 SYNOPSIS

  pod2latex *.pm

  pod2latex -out mytex.tex *.pod

  pod2latex -full -sections 'DESCRIPTION|NAME' SomeDir

  pod2latex -prefile h.tex -postfile t.tex my.pod

=head1 DESCRIPTION


C<pod2latex> is a program to convert POD format documentation
(L<perlpod>) into latex. It can process multiple input documents at a
time and either generate a latex file per input document or a single
combined output file.
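
As a rough illustration of the one-file-per-input mode, the sketch below
mirrors how the script derives each output name: it strips the directory
and one known suffix with C<File::Basename>, then appends C<.tex>. The
helper name C<tex_name_for> and the sample paths are hypothetical and not
part of pod2latex.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper (not part of pod2latex): mirrors how the script
# turns an input path into an output name when -out is not given.
sub tex_name_for {
    my ($pod) = @_;
    # basename() drops the directory and any one of the listed suffixes
    return basename($pod, '.pm', '.pod', '.pl') . '.tex';
}

# Sample paths are made up for illustration
print tex_name_for($_), "\n" for qw(lib/Foo/Bar.pm perlref.pod tools/check.pl);
```

Because the directory part is discarded, each output file is written to
the current directory rather than next to its input.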


=head1 OPTIONS AND ARGUMENTS

This section describes the supported command line options. Minimum
matching is supported.

=over 4

=item B<-out>

Name of the output file to be used. If there are multiple input pods
it is assumed that the intention is to write all translated output
into a single file. C<.tex> is appended if not present.  If the
argument is not supplied, a single document will be created for each
input file.

=item B<-full>

Creates a complete C<latex> file that can be processed immediately
(unless C<=for/=begin> directives are used that rely on extra packages).
Table of contents and index generation commands are included in the
wrapper C<latex> code.

=item B<-sections>

Specify pod sections to include (or remove if negated) in the
translation.  See L<Pod::Select/"SECTION SPECIFICATIONS"> for the
format to use for I<section-spec>. This option may be given multiple
times on the command line. This is identical to the similar option in
the C<podselect()> command.

=item B<-modify>

This option causes the output C<latex> to be slightly
modified from the input pod such that when a C<=head1 NAME>
is encountered a section is created containing the actual
pod name (rather than B<NAME>) and all subsequent C<=head1>
directives are treated as subsections. This has the advantage
that the description of a module will be in its own section
which is helpful for including module descriptions in documentation.
Also forces C<latex> label and index entries to be prefixed by the
name of the module.

=item B<-h1level>

Specifies the C<latex> section that is equivalent to a C<H1> pod
directive. This is an integer between 0 and 5 with 0 equivalent to a
C<latex> chapter, 1 equivalent to a C<latex> section etc. The default
is 1 (C<H1> equivalent to a latex section).

=item B<-help>

Print a brief help message and exit.

=item B<-man>

Print the manual page and exit.

=item B<-verbose>

Print information messages as each document is processed.

=item B<-preamble>

A user-supplied preamble for the LaTeX code. Multiple values
are supported and appended in order separated by "\n".
See B<-prefile> for reading the preamble from a file.

=item B<-postamble>

A user-supplied postamble for the LaTeX code. Multiple values
are supported and appended in order separated by "\n".
See B<-postfile> for reading the postamble from a file.

=item B<-prefile>

A user-supplied preamble for the LaTeX code to be read from the
named file. Multiple values are supported and appended in
order. See B<-preamble>.

=item B<-postfile>

A user-supplied postamble for the LaTeX code to be read from the
named file. Multiple values are supported and appended in
order. See B<-postamble>.

=back


=head1 BUGS

Known bugs are:

=over 4

=item *

Cross references between documents are not resolved when multiple
pod documents are converted into a single output C<latex> file.

=item *

Functions and variables are not automatically recognized
and they will therefore not be marked up in any special way
unless instructed by an explicit pod command.

=back


=head1 SEE ALSO

L<Pod::LaTeX>


=head1 AUTHOR

Tim Jenness E<lt>tjenness at cpan.orgE<gt>

This program is free software; you can redistribute it
and/or modify it under the same terms as Perl itself.

Copyright (C) 2000, 2003, 2004 Tim Jenness. All Rights Reserved.

=cut

!NO!SUBS!



close OUT or die "Can't close $file: $!";
chmod 0755, $file or die "Can't reset permissions for $file: $!\n";
exec("$Config{'eunicefix'} $file") if $Config{'eunicefix'} ne ':';
chdir $origdir;

--- NEW FILE: perlcall.pod ---
=head1 NAME

perlcall - Perl calling conventions from C


=head1 DESCRIPTION

The purpose of this document is to show you how to call Perl subroutines
directly from C, i.e., how to write I<callbacks>.

Apart from discussing the C interface provided by Perl for writing
callbacks, the document uses a series of examples to show how the
interface actually works in practice.  In addition, some techniques for
coding callbacks are covered.

Examples where callbacks are necessary include

=over 5

=item * An Error Handler
[...1855 lines suppressed...]
L<perlapi/eval_pv>).  Once this code reference is in hand, it
can be mixed in with all the previous examples we've shown.

=head1 SEE ALSO

L<perlxs>, L<perlguts>, L<perlembed>

=head1 AUTHOR

Paul Marquess 

Special thanks to the following people who assisted in the creation of
the document.

Jeff Okamoto, Tim Bunce, Nick Gianniotis, Steve Kelem, Gurusamy Sarathy
and Larry Wall.

=head1 DATE

Version 1.3, 14th Apr 1997

--- NEW FILE: perlfunc.pod ---
=head1 NAME

perlfunc - Perl builtin functions


=head1 DESCRIPTION

The functions in this section can serve as terms in an expression.
They fall into two major categories: list operators and named unary
operators.  These differ in their precedence relationship with a
following comma.  (See the precedence table in L<perlop>.)  List
operators take more than one argument, while unary operators can never
take more than one argument.  Thus, a comma terminates the argument of
a unary operator, but merely separates the arguments of a list
operator.  A unary operator generally provides a scalar context to its
argument, while a list operator may provide either scalar or list
contexts for its arguments.  If it does both, the scalar arguments will
be first, and the list argument will follow.  (Note that there can ever
be only one such list argument.)  For instance, splice() has three scalar
[...7001 lines suppressed...]
is used to format the new page header, and then the record is written.
By default the top-of-page format is the name of the filehandle with
"_TOP" appended, but it may be dynamically set to the format of your
choice by assigning the name to the C<$^> variable while the filehandle is
selected.  The number of lines remaining on the current page is in
variable C<$->, which can be set to C<0> to force a new page.

If FILEHANDLE is unspecified, output goes to the current default output
channel, which starts out as STDOUT but may be changed by the
C<select> operator.  If the FILEHANDLE is an EXPR, then the expression
is evaluated and the resulting string is used to look up the name of
the FILEHANDLE at run time.  For more on formats, see L<perlform>.

Note that write is I<not> the opposite of C<read>.  Unfortunately.

=item y///

The transliteration operator.  Same as C<tr///>.  See L<perlop>.
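
A small sketch of the equivalence noted above: C<y///> and C<tr///> are
interchangeable spellings, and both return the number of characters
matched. The variable names and sample string are made up for
illustration.

```perl
use strict;
use warnings;

my $text = 'hello world';

# y/// and tr/// are the same operator; both return the number of
# characters matched.
(my $upper = $text) =~ y/a-z/A-Z/;     # transliterate a copy of $text
my $vowels = ($text =~ tr/aeiou//);    # empty replacement list: count only

print "$upper\n";
print "$vowels\n";
```

With an empty replacement list and no C</d> modifier the string is left
unchanged, which makes this form a convenient character counter.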


--- NEW FILE: perlbook.pod ---
=head1 NAME

perlbook - Perl book information


=head1 DESCRIPTION

The Camel Book, officially known as I<Programming Perl, Third Edition>,
by Larry Wall et al, is the definitive reference work covering nearly
all of Perl.  You can order it and other Perl books from O'Reilly &
Associates, 1-800-998-9938.  Local/overseas is +1 707 829 0515.  If you
can locate an O'Reilly order form, you can also fax to +1 707 829 0104.
If you're web-connected, you can even mosey on over to
L<http://www.oreilly.com/> for an online order form.

Other Perl books from various publishers and authors 
can be found listed in L<perlfaq2> or on the web at

--- NEW FILE: perlref.pod ---
=head1 NAME
X<reference> X<pointer> X<data structure> X<structure> X<struct>

perlref - Perl references and nested data structures

=head1 NOTE

This is complete documentation about all aspects of references.
For a shorter, tutorial introduction to just the essential features,
see L<perlreftut>.


=head1 DESCRIPTION

Before release 5 of Perl it was difficult to represent complex data
structures, because all references had to be symbolic--and even then
it was difficult to refer to a variable instead of a symbol table entry.
Perl now not only makes it easier to use symbolic references to variables,
but also lets you have "hard" references to any piece of data or code.
Any scalar may hold a hard reference.  Because arrays and hashes contain
scalars, you can now easily build arrays of arrays, arrays of hashes,
hashes of arrays, arrays of hashes of functions, and so on.

Hard references are smart--they keep track of reference counts for you,
automatically freeing the thing referred to when its reference count goes
to zero.  (Reference counts for values in self-referential or
cyclic data structures may not go to zero without a little help; see
L<perlobj/"Two-Phased Garbage Collection"> for a detailed explanation.)
If that thing happens to be an object, the object is destructed.  See
L<perlobj> for more about objects.  (In a sense, everything in Perl is an
object, but we usually reserve the word for references to objects that
have been officially "blessed" into a class package.)
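
The reference counting described above can be observed with a small
sketch. The class name C<Tracked> and the C<@log> array are hypothetical,
introduced only to record when the destructor runs.

```perl
use strict;
use warnings;

my @log;    # records events in the order they happen

package Tracked;    # hypothetical class, for illustration only
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { my $self = shift; push @log, "destroyed $self->{name}" }

package main;

{
    my $first  = Tracked->new('a');
    my $second = $first;       # second hard reference to the same object
    undef $first;              # count drops to 1; object is still alive
    push @log, 'still alive';
}                              # $second goes out of scope; count reaches 0
push @log, 'after block';

print "$_\n" for @log;
```

The log ends up in the order "still alive", "destroyed a", "after block":
the object is destructed as soon as the last reference to it disappears,
not at program exit.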

Symbolic references are names of variables or other objects, just as a
symbolic link in a Unix filesystem contains merely the name of a file.
The C<*glob> notation is something of a symbolic reference.  (Symbolic
references are sometimes called "soft references", but please don't call
them that; references are confusing enough without useless synonyms.)
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

In contrast, hard references are more like hard links in a Unix file
system: They are used to access an underlying object without concern for
what its (other) name is.  When the word "reference" is used without an
adjective, as in the following paragraph, it is usually talking about a
hard reference.
X<reference, hard> X<hard reference>

References are easy to use in Perl.  There is just one overriding
principle: Perl does no implicit referencing or dereferencing.  When a
scalar is holding a reference, it always behaves as a simple scalar.  It
doesn't magically start being an array or hash or subroutine; you have to
tell it explicitly to do so, by dereferencing it.
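
A minimal sketch of that principle follows; the variable names are
invented for the example.  The scalar C<$ref> stays a scalar until it
is dereferenced explicitly.

```perl
use strict;
use warnings;

my @months = ("January", "February", "March");
my $ref    = \@months;        # $ref is a plain scalar holding a reference

my $kind  = ref $ref;         # the type of thing referred to
my $count = scalar @$ref;     # explicit dereference of the whole array
my $first = $ref->[0];        # explicit dereference of one element

print "kind=$kind count=$count first=$first\n";
print "used as a string it is just: $ref\n";   # something like ARRAY(0x...)
```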

=head2 Making References
X<reference, creation> X<referencing>

References can be created in several ways.

=over 4

=item 1.
X<\> X<backslash>

By using the backslash operator on a variable, subroutine, or value.
(This works much like the & (address-of) operator in C.)  
This typically creates I<another> reference to a variable, because
there's already a reference to the variable in the symbol table.  But
the symbol table reference might go away, and you'll still have the
reference that the backslash returned.  Here are some examples:

    $scalarref = \$foo;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;
    $globref   = \*foo;

It isn't possible to create a true reference to an IO handle (filehandle
or dirhandle) using the backslash operator.  The most you can get is a
reference to a typeglob, which is actually a complete symbol table entry.
But see the explanation of the C<*foo{THING}> syntax below.  However,
you can still use type globs and globrefs as though they were IO handles.

=item 2.
X<array, anonymous> X<[> X<[]> X<square bracket>
X<bracket, square> X<arrayref> X<array reference> X<reference, array>

A reference to an anonymous array can be created using square
brackets:

    $arrayref = [1, 2, ['a', 'b', 'c']];

Here we've created a reference to an anonymous array of three elements
whose final element is itself a reference to another anonymous array of three
elements.  (The multidimensional syntax described later can be used to
access this.  For example, after the above, C<< $arrayref->[2][1] >> would have
the value "b".)

Taking a reference to an enumerated list is not the same
as using square brackets--instead it's the same as creating
a list of references!

    @list = (\$a, \@b, \%c);
    @list = \($a, @b, %c);	# same thing!

As a special case, C<\(@foo)> returns a list of references to the contents
of C<@foo>, not a reference to C<@foo> itself.  Likewise for C<%foo>,
except that the key references are to copies (since the keys are just
strings rather than full-fledged scalars).
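
The difference between C<\(@foo)> and C<\@foo> can be seen in a short
sketch such as this one, where the variable names are chosen purely for
illustration:

```perl
use strict;
use warnings;

my @foo = (10, 20, 30);

my @refs   = \(@foo);     # one scalar reference per element
my $single = \@foo;       # one reference to the array itself

${ $refs[1] } = 99;       # writes through to $foo[1]

print scalar(@refs), " element refs, each a ", ref($refs[0]), "\n";
print "single ref is a ", ref($single), "\n";
print "\@foo is now @foo\n";
```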

=item 3.
X<hash, anonymous> X<{> X<{}> X<curly bracket>
X<bracket, curly> X<brace> X<hashref> X<hash reference> X<reference, hash>

A reference to an anonymous hash can be created using curly
brackets:

    $hashref = {
	'Adam'  => 'Eve',
	'Clyde' => 'Bonnie',
    };

Anonymous hash and array composers like these can be intermixed freely to
produce as complicated a structure as you want.  The multidimensional
syntax described below works for these too.  The values above are
literals, but variables and expressions would work just as well, because
assignment operators in Perl (even within local() or my()) are executable
statements, not compile-time declarations.

Because curly brackets (braces) are used for several other things
including BLOCKs, you may occasionally have to disambiguate braces at the
beginning of a statement by putting a C<+> or a C<return> in front so
that Perl realizes the opening brace isn't starting a BLOCK.  The economy and
mnemonic value of using curlies is deemed worth this occasional extra
hassle.

For example, if you wanted a function to make a new hash and return a
reference to it, you have these options:

    sub hashem {        { @_ } }   # silently wrong
    sub hashem {       +{ @_ } }   # ok
    sub hashem { return { @_ } }   # ok

On the other hand, if you want the other meaning, you can do this:

    sub showem {        { @_ } }   # ambiguous (currently ok, but may change)
    sub showem {       {; @_ } }   # ok
    sub showem { { return @_ } }   # ok

The leading C<+{> and C<{;> always serve to disambiguate
the expression to mean either the HASH reference, or the BLOCK.

=item 4.
X<subroutine, anonymous> X<subroutine, reference> X<reference, subroutine>
X<scope, lexical> X<closure> X<lexical> X<lexical scope>

A reference to an anonymous subroutine can be created by using
C<sub> without a subname:

    $coderef = sub { print "Boink!\n" };

Note the semicolon.  Except for the code
inside not being immediately executed, a C<sub {}> is not so much a
declaration as it is an operator, like C<do{}> or C<eval{}>.  (However, no
matter how many times you execute that particular line (unless you're in an
C<eval("...")>), $coderef will still have a reference to the I<same>
anonymous subroutine.)

Anonymous subroutines act as closures with respect to my() variables,
that is, variables lexically visible within the current scope.  Closure
is a notion out of the Lisp world that says if you define an anonymous
function in a particular lexical context, it pretends to run in that
context even when it's called outside the context.

In human terms, it's a funny way of passing arguments to a subroutine when
you define it as well as when you call it.  It's useful for setting up
little bits of code to run later, such as callbacks.  You can even
do object-oriented stuff with it, though Perl already provides a different
mechanism to do that--see L<perlobj>.

You might also think of closure as a way to write a subroutine
template without using eval().  Here's a small example of how
closures work:

    sub newprint {
	my $x = shift;
	return sub { my $y = shift; print "$x, $y!\n"; };
    }
    $h = newprint("Howdy");
    $g = newprint("Greetings");

    # Time passes...

    &$h("world");
    &$g("earthlings");


This prints

    Howdy, world!
    Greetings, earthlings!

Note particularly that $x continues to refer to the value passed
into newprint() I<despite> "my $x" having gone out of scope by the
time the anonymous subroutine runs.  That's what a closure is all
about.

This applies only to lexical variables, by the way.  Dynamic variables
continue to work as they have always worked.  Closure is not something
that most Perl programmers need trouble themselves about to begin with.

=item 5.
X<constructor> X<new>

References are often returned by special subroutines called constructors.
Perl objects are just references to a special type of object that happens to know
which package it's associated with.  Constructors are just special
subroutines that know how to create that association.  They do so by
starting with an ordinary reference, and it remains an ordinary reference
even while it's also being an object.  Constructors are often
named new() and called indirectly:

    $objref = new Doggie (Tail => 'short', Ears => 'long');

But don't have to be:

    $objref   = Doggie->new(Tail => 'short', Ears => 'long');

    use Term::Cap;
    $terminal = Term::Cap->Tgetent( { OSPEED => 9600 });

    use Tk;
    $main    = MainWindow->new();
    $menubar = $main->Frame(-relief              => "raised",
                            -borderwidth         => 2);

=item 6.

References of the appropriate type can spring into existence if you
dereference them in a context that assumes they exist.  Because we haven't
talked about dereferencing yet, we can't show you any examples yet.

=item 7.
X<*foo{THING}> X<*>

A reference can be created by using a special syntax, lovingly known as
the *foo{THING} syntax.  *foo{THING} returns a reference to the THING
slot in *foo (which is the symbol table entry which holds everything
known as foo).

    $scalarref = *foo{SCALAR};
    $arrayref  = *ARGV{ARRAY};
    $hashref   = *ENV{HASH};
    $coderef   = *handler{CODE};
    $ioref     = *STDIN{IO};
    $globref   = *foo{GLOB};
    $formatref = *foo{FORMAT};

All of these are self-explanatory except for C<*foo{IO}>.  It returns
the IO handle, used for file handles (L<perlfunc/open>), sockets
(L<perlfunc/socket> and L<perlfunc/socketpair>), and directory
handles (L<perlfunc/opendir>).  For compatibility with previous
versions of Perl, C<*foo{FILEHANDLE}> is a synonym for C<*foo{IO}>, though it
is deprecated as of 5.8.0.  If deprecation warnings are in effect, it will warn
of its use.

C<*foo{THING}> returns undef if that particular THING hasn't been used yet,
except in the case of scalars.  C<*foo{SCALAR}> returns a reference to an
anonymous scalar if $foo hasn't been used yet.  This might change in a
future release.

C<*foo{IO}> is an alternative to the C<*HANDLE> mechanism given in
L<perldata/"Typeglobs and Filehandles"> for passing filehandles
into or out of subroutines, or storing into larger data structures.
Its disadvantage is that it won't create a new filehandle for you.
Its advantage is that you have less risk of clobbering more than
you want to with a typeglob assignment.  (It still conflates file
and directory handles, though.)  However, if you assign the incoming
value to a scalar instead of a typeglob as we do in the examples
below, there's no risk of that happening.

    splutter(*STDOUT);		# pass the whole glob
    splutter(*STDOUT{IO});	# pass both file and dir handles

    sub splutter {
	my $fh = shift;
	print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(*STDIN);	# pass the whole glob
    $rec = get_rec(*STDIN{IO}); # pass both file and dir handles

    sub get_rec {
	my $fh = shift;
	return scalar <$fh>;
    }

=back


=head2 Using References
X<reference, use> X<dereferencing> X<dereference>

That's it for creating references.  By now you're probably dying to
know how to use references to get back to your long-lost data.  There
are several basic methods.

=over 4

=item 1.

Anywhere you'd put an identifier (or chain of identifiers) as part
of a variable or subroutine name, you can replace the identifier with
a simple scalar variable containing a reference of the correct type:

    $bar = $$scalarref;
    push(@$arrayref, $filename);
    $$arrayref[0] = "January";
    $$hashref{"KEY"} = "VALUE";
    print $globref "output\n";

It's important to understand that we are specifically I<not> dereferencing
C<$arrayref[0]> or C<$hashref{"KEY"}> there.  The dereference of the
scalar variable happens I<before> it does any key lookups.  Anything more
complicated than a simple scalar variable must use methods 2 or 3 below.
However, a "simple scalar" includes an identifier that itself uses method
1 recursively.  Therefore, the following prints "howdy".

    $refrefref = \\\"howdy";
    print $$$$refrefref;

=item 2.
X<${}> X<@{}> X<%{}>

Anywhere you'd put an identifier (or chain of identifiers) as part of a
variable or subroutine name, you can replace the identifier with a
BLOCK returning a reference of the correct type.  In other words, the
previous examples could be written like this:

    $bar = ${$scalarref};
    push(@{$arrayref}, $filename);
    ${$arrayref}[0] = "January";
    ${$hashref}{"KEY"} = "VALUE";
    $globref->print("output\n");  # iff IO::Handle is loaded

Admittedly, it's a little silly to use the curlies in this case, but
the BLOCK can contain any arbitrary expression, in particular,
subscripted expressions:

    &{ $dispatch{$index} }(1,2,3);	# call correct routine

Because of being able to omit the curlies for the simple case of C<$$x>,
people often make the mistake of viewing the dereferencing symbols as
proper operators, and wonder about their precedence.  If they were,
though, you could use parentheses instead of braces.  That's not the case.
Consider the difference below; case 0 is a short-hand version of case 1,
I<not> case 2:

    $$hashref{"KEY"}   = "VALUE";	# CASE 0
    ${$hashref}{"KEY"} = "VALUE";	# CASE 1
    ${$hashref{"KEY"}} = "VALUE";	# CASE 2
    ${$hashref->{"KEY"}} = "VALUE";	# CASE 3

Case 2 is also deceptive in that you're accessing a variable
called %hashref, not dereferencing through $hashref to the hash
it's presumably referencing.  That would be case 3.

=item 3.
X<autovivification> X<< -> >> X<arrow>

Subroutine calls and lookups of individual array elements arise often
enough that it gets cumbersome to use method 2.  As a form of
syntactic sugar, the examples for method 2 may be written:

    $arrayref->[0] = "January";   # Array element
    $hashref->{"KEY"} = "VALUE";  # Hash element
    $coderef->(1,2,3);            # Subroutine call

The left side of the arrow can be any expression returning a reference,
including a previous dereference.  Note that C<$array[$x]> is I<not> the
same thing as C<< $array->[$x] >> here:

    $array[$x]->{"foo"}->[0] = "January";

This is one of the cases we mentioned earlier in which references could
spring into existence when in an lvalue context.  Before this
statement, C<$array[$x]> may have been undefined.  If so, it's
automatically defined with a hash reference so that we can look up
C<{"foo"}> in it.  Likewise C<< $array[$x]->{"foo"} >> will automatically get
defined with an array reference so that we can look up C<[0]> in it.
This process is called I<autovivification>.

One more thing here.  The arrow is optional I<between> bracket
subscripts, so you can shrink the above down to

    $array[$x]{"foo"}[0] = "January";

Which, in the degenerate case of using only ordinary arrays, gives you
multidimensional arrays just like C's:

    $score[$x][$y][$z] += 42;

Well, okay, not entirely like C's arrays, actually.  C doesn't know how
to grow its arrays on demand.  Perl does.
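
Autovivification and on-demand growth can be observed directly.  The
following is only a sketch, with an index of 2 picked arbitrarily:

```perl
use strict;
use warnings;

my @array;
my $x = 2;

# Nothing below $array[$x] exists yet; Perl creates it on assignment.
$array[$x]{"foo"}[0] = "January";

my $outer = ref $array[$x];          # a hash reference sprang into being
my $inner = ref $array[$x]{"foo"};   # and an array reference inside it
my $size  = scalar @array;           # the array grew to hold index 2

print "outer=$outer inner=$inner size=$size\n";
print "element 0 is ", (defined($array[0]) ? "defined" : "undefined"), "\n";
```

Only the elements actually assigned to are filled in; the earlier slots
of C<@array> remain undefined.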

=item 4.

If a reference happens to be a reference to an object, then there are
probably methods to access the things referred to, and you should probably
stick to those methods unless you're in the class package that defines the
object's methods.  In other words, be nice, and don't violate the object's
encapsulation without a very good reason.  Perl does not enforce
encapsulation.  We are not totalitarians here.  We do expect some basic
civility though.

=back


Using a string or number as a reference produces a symbolic reference,
as explained above.  Using a reference as a number produces an
integer representing its storage location in memory.  The only
useful thing to be done with this is to compare two references
numerically to see whether they refer to the same location.
X<reference, numeric context>

    if ($ref1 == $ref2) {  # cheap numeric compare of references
	print "refs 1 and 2 refer to the same thing\n";
    }

Using a reference as a string produces both its referent's type,
including any package blessing as described in L<perlobj>, as well
as the numeric address expressed in hex.  The ref() operator returns
just the type of thing the reference is pointing to, without the
address.  See L<perlfunc/ref> for details and examples of its use.
X<reference, string context>
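
The numeric and string behaviours just described can be tried out with
a sketch like the one below.  The variable names are illustrative, and
the address printed will differ from run to run.

```perl
use strict;
use warnings;

my @data = (1, 2, 3);
my $ref1 = \@data;
my $ref2 = \@data;        # a second reference to the same array
my $ref3 = [ @data ];     # a reference to a separate copy

my $one_two   = ($ref1 == $ref2) ? "same" : "different";
my $one_three = ($ref1 == $ref3) ? "same" : "different";

print "ref1 vs ref2: $one_two\n";
print "ref1 vs ref3: $one_three\n";
print "type only:    ", ref($ref1), "\n";
print "string form:  $ref1\n";     # type plus hex address
```

Equal contents are not enough for C<==> to succeed; the two references
must point at the very same array.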

The bless() operator may be used to associate the object a reference
points to with a package functioning as an object class.  See L<perlobj>.

A typeglob may be dereferenced the same way a reference can, because
the dereference syntax always indicates the type of reference desired.
So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.
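
For instance, in this small sketch (package variables named C<foo> are
assumed only for the demonstration) the sigil in front of the braces
picks which slot of the typeglob is used:

```perl
use strict;
use warnings;

our $foo = "a scalar";
our @foo = ("an", "array");

my $scalar = ${*foo};      # the scalar slot, i.e. $foo
my @list   = @{*foo};      # the array slot, i.e. @foo

# Both routes lead to the same variable, so the addresses agree.
my $same = (\${*foo} == \$foo) ? "yes" : "no";

print "scalar: $scalar\n";
print "array:  @list\n";
print "same variable: $same\n";
```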

Here's a trick for interpolating a subroutine call into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

The way it works is that when the C<@{...}> is seen in the double-quoted
string, it's evaluated as a block.  The block creates a reference to an
anonymous array containing the results of the call to C<mysub(1,2,3)>.  So
the whole block returns a reference to an array, which is then
dereferenced by C<@{...}> and stuck into the double-quoted string. This
chicanery is also useful for arbitrary expressions:

    print "That yields @{[$n + 5]} widgets\n";

=head2 Symbolic references
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

We said that references spring into existence as necessary if they are
undefined, but we didn't say what happens if a value used as a
reference is already defined, but I<isn't> a hard reference.  If you
use it as a reference, it'll be treated as a symbolic
reference.  That is, the value of the scalar is taken to be the I<name>
of a variable, rather than a direct link to a (possibly) anonymous
value.

People frequently expect it to work like this.  So it does.

    $name = "foo";
    $$name = 1;			# Sets $foo
    ${$name} = 2;		# Sets $foo
    ${$name x 2} = 3;		# Sets $foofoo
    $name->[0] = 4;		# Sets $foo[0]
    @$name = ();		# Clears @foo
    &$name();			# Calls &foo() (as in Perl 4)
    $pack = "THAT";
    ${"${pack}::$name"} = 5;	# Sets $THAT::foo without eval

This is powerful, and slightly dangerous, in that it's possible
to intend (with the utmost sincerity) to use a hard reference, and
accidentally use a symbolic reference instead.  To protect against
that, you can say

    use strict 'refs';

and then only hard references will be allowed for the rest of the enclosing
block.  An inner block may countermand that with

    no strict 'refs';
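
A sketch of both halves follows.  The names C<$foo> and C<$name> are
reused from the earlier example, and C<eval> is used here only so that
the error can be shown without stopping the program.

```perl
use strict;
use warnings;

our $foo = "package variable";
my $name = "foo";

# With strict refs in effect, a string used as a reference is fatal
# at run time.
my $ok    = eval { my $v = ${$name}; 1 };
my $error = $ok ? "" : $@;
print "blocked: $error";

# An inner block can switch the check off again.
my $looked_up;
{
    no strict 'refs';
    $looked_up = ${$name};     # symbolic reference to $main::foo
}
print "allowed in inner block: $looked_up\n";
```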

Only package variables (globals, even if localized) are visible to
symbolic references.  Lexical variables (declared with my()) aren't in
a symbol table, and thus are invisible to this mechanism.  For example:

    local $value = 10;
    $ref = "value";
    {
	my $value = 20;
	print $$ref;
    }

This will still print 10, not 20.  Remember that local() affects package
variables, which are all "global" to the package.

=head2 Not-so-symbolic references

A new feature contributing to readability in perl version 5.001 is that the
brackets around a symbolic reference behave more like quotes, just as they
always have within a string.  That is,

    $push = "pop on ";
    print "${push}over";

has always meant to print "pop on over", even though push is
a reserved word.  This has been generalized to work the same outside
of quotes, so that

    print ${push} . "over";

and even

    print ${ push } . "over";

will have the same effect.  (This would have been a syntax error in
Perl 5.000, though Perl 4 allowed it in the spaceless form.)  This
construct is I<not> considered to be a symbolic reference when you're
using strict refs:

    use strict 'refs';
    ${ bareword };	# Okay, means $bareword.
    ${ "bareword" };	# Error, symbolic reference.

Similarly, because of all the subscripting that is done using single
words, we've applied the same rule to any bareword that is used for
subscripting a hash.  So now, instead of writing

    $array{ "aaa" }{ "bbb" }{ "ccc" }

you can write just

    $array{ aaa }{ bbb }{ ccc }

and not worry about whether the subscripts are reserved words.  In the
rare event that you do wish to do something like

    $array{ shift }

you can force interpretation as a reserved word by adding anything that
makes it more than a bareword:

    $array{ shift() }
    $array{ +shift }
    $array{ shift @_ }

The C<use warnings> pragma or the B<-w> switch will warn you if it
interprets a reserved word as a string.
But it will no longer warn you about using lowercase words, because the
string is effectively quoted.
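
The two readings of a subscript like C<shift> can be compared in a
short sketch.  The hash contents and the subroutine names are made up
for the purpose.

```perl
use strict;
use warnings;

my %table = (
    shift => 'stored under the string "shift"',
    apple => 'stored under the first argument',
);

sub by_bareword { return $table{shift} }       # key is the string "shift"
sub by_function { return $table{ shift() } }   # key is shift(@_)

my $quoted = by_bareword("apple");
my $called = by_function("apple");

print "bareword: $quoted\n";
print "function: $called\n";
```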

=head2 Pseudo-hashes: Using an array as a hash
X<pseudo-hash> X<pseudo hash> X<pseudohash>

B<WARNING>:  This section describes an experimental feature.  Details may
change without notice in future versions.

B<NOTE>: The current user-visible implementation of pseudo-hashes
(the weird use of the first array element) is deprecated starting from
Perl 5.8.0 and will be removed in Perl 5.10.0, and the feature will be
implemented differently.  Not only is the current interface rather ugly,
but the current implementation slows down normal array and hash use quite
noticeably.  The 'fields' pragma interface will remain available.

Beginning with release 5.005 of Perl, you may use an array reference
in some contexts that would normally require a hash reference.  This
allows you to access array elements using symbolic names, as if they
were fields in a structure.

For this to work, the array must contain extra information.  The first
element of the array has to be a hash reference that maps field names
to array indices.  Here is an example:

    $struct = [{foo => 1, bar => 2}, "FOO", "BAR"];

    $struct->{foo};  # same as $struct->[1], i.e. "FOO"
    $struct->{bar};  # same as $struct->[2], i.e. "BAR"

    keys %$struct;   # will return ("foo", "bar") in some order
    values %$struct; # will return ("FOO", "BAR") in the same order

    while (my($k,$v) = each %$struct) {
       print "$k => $v\n";
    }

Perl will raise an exception if you try to access nonexistent fields.
To avoid inconsistencies, always use the fields::phash() function
provided by the C<fields> pragma.

    use fields;
    $pseudohash = fields::phash(foo => "FOO", bar => "BAR");

For better performance, Perl can also do the translation from field
names to array indices at compile time for typed object references.
See L<fields>.
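
Since the C<fields> interface is the part said to remain available,
here is a hedged sketch of a class built on it.  The C<Pet> class and
its field names are invented for the example.  Depending on the perl
in use, C<fields::new> may hand back a pseudo-hash or some other
restricted structure; the sketch relies only on undeclared fields being
rejected.

```perl
use strict;
use warnings;

# Hypothetical class for illustration; only the fields pragma is real.
package Pet;
use fields qw(name sound);

sub new {
    my ($class, %args) = @_;
    my $self = fields::new($class);   # object limited to declared fields
    $self->{name}  = $args{name};
    $self->{sound} = $args{sound};
    return $self;
}

package main;

my $pet = Pet->new(name => "Rex", sound => "Woof");
print "$pet->{name} says $pet->{sound}\n";

# An undeclared field is refused at run time.
my $key      = "colour";
my $accepted = eval { $pet->{$key} = "brown"; 1 };
print "undeclared field accepted: ", ($accepted ? "yes" : "no"), "\n";
```

The key is held in a variable here so that the check happens at run
time; with a typed lexical and a literal key the same mistake can be
caught during compilation instead.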

There are two ways to check for the existence of a key in a
pseudo-hash.  The first is to use exists().  This checks to see if the
given field has ever been set.  It acts this way to match the behavior
of a regular hash.  For instance:

    use fields;
    $phash = fields::phash([qw(foo bar pants)], ['FOO']);
    $phash->{pants} = undef;

    print exists $phash->{foo};    # true, 'foo' was set in the declaration
    print exists $phash->{bar};    # false, 'bar' has not been used.
    print exists $phash->{pants};  # true, your 'pants' have been touched

The second is to use exists() on the hash reference sitting in the
first array element.  This checks to see if the given key is a valid
field in the pseudo-hash.

    print exists $phash->[0]{bar};	# true, 'bar' is a valid field
    print exists $phash->[0]{shoes};# false, 'shoes' can't be used

delete() on a pseudo-hash element only deletes the value corresponding
to the key, not the key itself.  To delete the key, you'll have to
explicitly delete it from the first hash element.

    print delete $phash->{foo};     # prints $phash->[1], "FOO"
    print exists $phash->{foo};     # false
    print exists $phash->[0]{foo};  # true, key still exists
    print delete $phash->[0]{foo};  # now key is gone
    print $phash->{foo};            # runtime exception

=head2 Function Templates
X<scope, lexical> X<closure> X<lexical> X<lexical scope>
X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local>

As explained above, an anonymous function with access to the lexical
variables visible when that function was compiled, creates a closure.  It
retains access to those variables even though it doesn't get run until
later, such as in a signal handler or a Tk callback.

Using a closure as a function template allows us to generate many functions
that act similarly.  Suppose you wanted functions named after the colors
that generated HTML font changes for the various colors:

    print "Be ", red("careful"), "with that ", green("light");

The red() and green() functions would be similar.  To create these,
we'll assign a closure to a typeglob of the name of the function we're
trying to build.  

    @colors = qw(red blue green yellow orange purple violet);
    for my $name (@colors) {
        no strict 'refs';	# allow symbol table manipulation
        *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

Now all those different functions appear to exist independently.  You can
call red(), RED(), blue(), BLUE(), green(), etc.  This technique saves on
both compile time and memory use, and is less error-prone as well, since
syntax checks happen at compile time.  It's critical that any variables in
the anonymous subroutine be lexicals in order to create a proper closure.
That's the reason for the C<my> on the loop iteration variable.

This is one of the only places where giving a prototype to a closure makes
much sense.  If you wanted to impose scalar context on the arguments of
these functions (probably not a wise idea for this particular example),
you could have written it this way instead:

    *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };

However, since prototype checking happens at compile time, the assignment
above happens too late to be of much use.  You could address this by
putting the whole loop of assignments within a BEGIN block, forcing it
to occur during compilation.
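
A sketch of the BEGIN-block variant might look like this.  Only two of
the colors are generated, and the C<@words> array is an assumption
added to make the effect of the prototype visible.

```perl
use strict;
use warnings;

BEGIN {
    no strict 'refs';                 # allow symbol table manipulation
    for my $name (qw(red green)) {
        *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };
    }
}

my @words = ("stop", "look", "listen");

# The ($) prototype is known by the time this line is compiled, so
# @words is evaluated in scalar context and yields its element count.
my $html = red(@words);
print "$html\n";
```

Without the BEGIN block the same call would be compiled before the
prototype existed, and the whole list would be passed instead.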

Access to lexicals that change over time--like those in the C<for> loop
above--only works with closures, not general subroutines.  In the general
case, then, named subroutines do not nest properly, although anonymous
ones do.  This is because named subroutines are created (and capture any
outer lexicals) only once at compile time, whereas anonymous subroutines
get to capture each time you execute the 'sub' operator.  If you are
accustomed to using nested subroutines in other programming languages with
their own private variables, you'll have to work at it a bit in Perl.  The
intuitive coding of this type of thing incurs mysterious warnings about
"will not stay shared".  For example, this won't work:

    sub outer {
        my $x = $_[0] + 35;
        sub inner { return $x * 19 }   # WRONG
        return $x + inner();
    }

A work-around is the following:

    sub outer {
        my $x = $_[0] + 35;
        local *inner = sub { return $x * 19 };
        return $x + inner();
    }

Now inner() can only be called from within outer(), because of the
temporary assignment of the closure (anonymous subroutine).  But when
it is called, it has normal access to the lexical variable $x from the
scope of outer().

This has the interesting effect of creating a function local to another
function, something not normally supported in Perl.
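
Another way to get a similar effect, offered here as a sketch rather
than as the approach of the text above, is to keep the closure in a
lexical scalar and call it through the arrow.  Nothing is installed in
the symbol table, so no C<local> is needed.

```perl
use strict;
use warnings;

sub outer {
    my $x = $_[0] + 35;
    my $inner = sub { return $x * 19 };   # captures this call's $x
    return $x + $inner->();
}

my $first  = outer(1);    # $x is 36
my $second = outer(2);    # $x is 37, freshly captured
print "$first $second\n";
```

Because the anonymous subroutine is created anew on each call, every
call sees its own C<$x>.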

=head1 WARNING
X<reference, string context> X<reference, use as hash key>

You may not (usefully) use a reference as the key to a hash.  It will be
converted into a string:

    $x{ \$a } = $a;

If you try to dereference the key, it won't do a hard dereference, and
you won't accomplish what you're attempting.  You might want to do something
more like

    $r = \@a;
    $x{ $r } = $r;

And then at least you can use the values(), which will be
real refs, instead of the keys(), which won't.

The standard Tie::RefHash module provides a convenient workaround to this.
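
A brief sketch of the difference, with hash and variable names chosen
only for the example:

```perl
use strict;
use warnings;
use Tie::RefHash;

my @a = (1, 2, 3);

my %plain;
$plain{ \@a } = "plain";
my ($plain_key) = keys %plain;      # only a string such as "ARRAY(0x...)"

tie my %by_ref, 'Tie::RefHash';
$by_ref{ \@a } = "tied";
my ($ref_key) = keys %by_ref;       # still a usable array reference

print "plain key is a ", (ref($plain_key) || "string"), "\n";
print "tied key is a ",  (ref($ref_key)   || "string"),
      " holding @$ref_key\n";
```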

=head1 SEE ALSO

Besides the obvious documents, source code can be instructive.
Some pathological examples of the use of references can be found
in the F<t/op/ref.t> regression test in the Perl source directory.

See also L<perldsc> and L<perllol> for how to use references to create
complex data structures, and L<perltoot>, L<perlobj>, and L<perlbot>
for how to use them to create objects.

--- NEW FILE: perldsc.pod ---
=head1 NAME
X<data structure> X<complex data structure> X<struct>

perldsc - Perl Data Structures Cookbook

=head1 DESCRIPTION


The single feature most sorely lacking in the Perl programming language
prior to its 5.0 release was complex data structures.  Even without direct
language support, some valiant programmers did manage to emulate them, but
it was hard work and not for the faint of heart.  You could occasionally
get away with the C<$m{$AoA,$b}> notation borrowed from B<awk> in which the
keys are actually more like a single concatenated string C<"$AoA$b">, but
traversal and sorting were difficult.  More desperate programmers even
hacked Perl's internal symbol table directly, a strategy that proved hard
to develop and maintain--to put it mildly.

The 5.0 release of Perl let us have complex data structures.  You
may now write something like this and, all of a sudden, you have an array
with three dimensions!

    for $x (1 .. 10) {
	for $y (1 .. 10) {
	    for $z (1 .. 10) {
		$AoA[$x][$y][$z] =
		    $x ** $y + $z;
	    }
	}
    }

Alas, however simple this may appear, underneath it's a much more
elaborate construct than meets the eye!

How do you print it out?  Why can't you say just C<print @AoA>?  How do
you sort it?  How can you pass it to a function or get one of these back
from a function?  Is it an object?  Can you save it to disk to read
back later?  How do you access whole rows or columns of that matrix?  Do
all the values have to be numeric?

As you see, it's quite easy to become confused.  While some small portion
of the blame for this can be attributed to the reference-based
implementation, it's really more due to a lack of existing documentation with
examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the
many different sorts of data structures you might want to develop.  It
should also serve as a cookbook of examples.  That way, when you need to
create one of these complex data structures, you can just pinch, pilfer, or
purloin a drop-in example from here.

Let's look at each of these possible constructs in detail.  There are separate
sections on each of the following:

=over 5

=item * arrays of arrays

=item * hashes of arrays

=item * arrays of hashes

=item * hashes of hashes

=item * more elaborate constructs

=back


But for now, let's look at general issues common to all
these types of data structures.

=head1 REFERENCES
X<reference> X<dereference> X<dereferencing> X<pointer>

The most important thing to understand about all data structures in Perl
--including multidimensional arrays--is that even though they might
appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
one-dimensional.  They can hold only scalar values (meaning a string,
number, or a reference).  They cannot directly contain other arrays or
hashes, but instead contain I<references> to other arrays or hashes.
X<multidimensional array> X<array, multidimensional>

You can't use a reference to an array or hash in quite the same way that you
would a real array or hash.  For C or C++ programmers unused to
distinguishing between arrays and pointers to the same, this can be
confusing.  If so, just think of it as the difference between a structure
and a pointer to a structure.

You can (and should) read more about references in the perlref(1) man
page.  Briefly, references are rather like pointers that know what they
point to.  (Objects are also a kind of reference, but we won't be needing
them right away--if ever.)  This means that when you have something which
looks to you like an access to a two-or-more-dimensional array and/or hash,
what's really going on is that the base type is
merely a one-dimensional entity that contains references to the next
level.  It's just that you can I<use> it as though it were a
two-dimensional one.  This is actually the way almost all C
multidimensional arrays work as well.

    $array[7][12]			# array of arrays
    $array[7]{string}			# array of hashes
    $hash{string}[7]			# hash of arrays
    $hash{string}{'another string'}	# hash of hashes

Now, because the top level contains only references, if you try to print
out your array with a simple print() function, you'll get something
that doesn't look very nice, like this:

    @AoA = ( [2, 3], [4, 5, 7], [0] );
    print $AoA[1][2];
    print @AoA;

That's because Perl doesn't (ever) implicitly dereference your variables.
If you want to get at the thing a reference is referring to, then you have
to do this yourself using either prefix typing indicators, like
C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.
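
To print the C<@AoA> from above in a readable form, each row has to be
dereferenced explicitly.  A minimal sketch, using the same data:

```perl
use strict;
use warnings;

my @AoA = ( [2, 3], [4, 5, 7], [0] );

my @lines;
for my $row (@AoA) {
    push @lines, join(" ", @$row);     # prefix form: the whole row
}
print "$_\n" for @lines;

my $element  = $AoA[1][2];             # arrow between subscripts implied
my $last_len = scalar @{ $AoA[-1] };   # block form for a subscripted ref
print "element [1][2] is $element\n";
print "last row has $last_len element(s)\n";
```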


=head1 COMMON MISTAKES

The two most common mistakes made in constructing something like
an array of arrays are either accidentally counting the number of
elements or else taking a reference to the same memory location
repeatedly.  Here's the case where you just get the count instead
of a nested array:

    for $i (1..10) {
	@array = somefunc($i);
	$AoA[$i] = @array;	# WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting
its element count.  If that's what you really and truly want, then you
might do well to consider being a tad more explicit about it, like this:

    for $i (1..10) {
	@array = somefunc($i);
	$counts[$i] = scalar @array;
    }

Here's the case of taking a reference to the same memory location
again and again:

    for $i (1..10) {
	@array = somefunc($i);
	$AoA[$i] = \@array;	# WRONG!
    }

So, what's the big problem with that?  It looks right, doesn't it?
After all, I just told you that you need an array of references, so by
golly, you've made me one!

Unfortunately, while this is true, it's still broken.  All the references
in @AoA refer to the I<very same place>, and they will therefore all hold
whatever was last in @array!  It's similar to the problem demonstrated in
the following C program:

    #include <pwd.h>
    main() {
	struct passwd *getpwnam(), *rp, *dp;
	rp = getpwnam("root");
	dp = getpwnam("daemon");

	printf("daemon name is %s\nroot name is %s\n",
		dp->pw_name, rp->pw_name);
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both C<rp> and C<dp> are pointers to the same location
in memory!  In C, you'd have to remember to malloc() yourself some new
memory.  In Perl, you'll want to use the array constructor C<[]> or the
hash constructor C<{}> instead.   Here's the right way to do the preceding
broken code fragments:
X<[]> X<{}>

    for $i (1..10) {
	@array = somefunc($i);
	$AoA[$i] = [ @array ];
    }

The square brackets make a reference to a new array with a I<copy>
of what's in @array at the time of the assignment.  This is what
you want.
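The difference between the broken and the corrected loop can be seen side
by side in a small sketch.  Here C<somefunc()> is a hypothetical stand-in
(the original never defines it), and C<@shared> and C<@copied> are names
invented for this illustration.

```perl
    use strict;
    use warnings;

    # Hypothetical stand-in for somefunc(): returns $i copies of $i.
    sub somefunc { my ($i) = @_; return ($i) x $i }

    our @array;                       # one package variable reused on every pass
    my (@shared, @copied);

    for my $i (1 .. 3) {
        @array = somefunc($i);
        $shared[$i] = \@array;        # every slot refers to the very same @array
        $copied[$i] = [ @array ];     # a fresh anonymous array holding a copy
    }

    print "shared row 1: @{ $shared[1] }\n";   # whatever @array held last
    print "copied row 1: @{ $copied[1] }\n";   # what @array held on the first pass
```

All the entries in C<@shared> are the same reference, so they all show the
final contents of C<@array>; the entries in C<@copied> are independent.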

Note that this will produce something similar, but it's
much harder to read:

    for $i (1..10) {
	@array = 0 .. $i;
	@{$AoA[$i]} = @array;
    }

Is it the same?  Well, maybe so--and maybe not.  The subtle difference
is that when you assign something in square brackets, you know for sure
it's always a brand new reference with a new I<copy> of the data.
Something else could be going on in this new case with the C<@{$AoA[$i]}>
dereference on the left-hand-side of the assignment.  It all depends on
whether C<$AoA[$i]> had been undefined to start with, or whether it
already contained a reference.  If you had already populated @AoA with
references, as in

    $AoA[3] = \@another_array;

Then the assignment with the indirection on the left-hand-side would
use the existing reference that was already there:

    @{$AoA[3]} = @array;

Of course, this I<would> have the "interesting" effect of clobbering
@another_array.  (Have you ever noticed how when a programmer says
something is "interesting", that rather than meaning "intriguing",
they're disturbingly more apt to mean that it's "annoying",
"difficult", or both?  :-)

So just remember always to use the array or hash constructors with C<[]>
or C<{}>, and you'll be fine, although it's not always optimally
efficient.

Surprisingly, the following dangerous-looking construct will
actually work out fine:

    for $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = \@array;
    }

That's because my() is more of a run-time statement than it is a
compile-time declaration I<per se>.  This means that the my() variable is
remade afresh each time through the loop.  So even though it I<looks> as
though you stored the same variable reference each time, you actually did
not!  This is a subtle distinction that can produce more efficient code at
the risk of misleading all but the most experienced of programmers.  So I
usually advise against teaching it to beginners.  In fact, except for
passing arguments to functions, I seldom like to see the gimme-a-reference
operator (backslash) used much at all in code.  Instead, I advise
beginners that they (and most of the rest of us) should try to use the
much more easily understood constructors C<[]> and C<{}> instead of
relying upon lexical (or dynamic) scoping and hidden reference-counting to
do the right thing behind the scenes.

In summary:

    $AoA[$i] = [ @array ];	# usually best
    $AoA[$i] = \@array;		# perilous; just how my() was that array?
    @{ $AoA[$i] } = @array;	# way too tricky for most programmers
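The claim that a my() array is made afresh on each pass through the loop
can be checked with a short sketch.  The C<%seen> hash is an assumption of
this example; it relies only on the fact that distinct live arrays have
references that stringify differently.

```perl
    use strict;
    use warnings;

    my @AoA;
    for my $i (1 .. 3) {
        my @array = ($i) x 2;     # a new lexical @array on every iteration
        $AoA[$i] = \@array;       # so each stored reference is to a distinct array
    }

    # Count how many different arrays the stored references point to.
    my %seen;
    $seen{$_}++ for @AoA[1 .. 3];

    print scalar(keys %seen), " distinct arrays\n";
    print "row $_: @{ $AoA[$_] }\n" for 1 .. 3;
```

Each row keeps its own values, which is why the backslash form happens to
work here even though it fails with a single shared array.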

X<dereference, precedence> X<dereferencing, precedence>

Speaking of things like C<@{$AoA[$i]}>, the following are actually the
same thing:
X<< -> >>

    $aref->[2][2]	# clear
    $$aref[2][2]	# confusing

That's because Perl's precedence rules on its five prefix dereferencers
(which look like someone swearing: C<$ @ * % &>) make them bind more
tightly than the postfix subscripting brackets or braces!  This will no
doubt come as a great shock to the C or C++ programmer, who is quite
accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
element of C<a>.  That is, they first take the subscript, and only then
dereference the thing at that subscript.  That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, C<$$aref[$i]>, first
dereferences $aref, treating it as a reference to an array, and then
gives you the I<i'th> element of the array that $aref points to.  If you
wanted the C notion, you'd have to write C<${$AoA[$i]}> to force
C<$AoA[$i]> to be evaluated first, before the leading C<$> dereferencer
is applied.
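A small sketch makes the binding order concrete.  The data and the names
C<$arrow>, C<$prefix>, C<$braces> and C<$c_style> are assumptions made for
this example only.

```perl
    use strict;
    use warnings;

    my @AoA  = ( [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7, 8, 9 ] );
    my $aref = \@AoA;

    my $arrow  = $aref->[2][2];      # explicit arrow
    my $prefix = $$aref[2][2];       # $ binds to $aref first, then the subscripts apply
    my $braces = ${$aref}[2][2];     # the same binding, spelled out with braces

    # The "C notion": subscript @AoA first, then dereference that element.
    my $c_style = ${ $AoA[1] }[0];   # same as $AoA[1][0]

    print "$arrow $prefix $braces $c_style\n";
```

The first three expressions all reach the same element; only the last one
subscripts C<@AoA> before dereferencing.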

=head1 WHY YOU SHOULD ALWAYS C<use strict>

If this is starting to sound scarier than it's worth, relax.  Perl has
some features to help you avoid its most common pitfalls.  The best
way to avoid getting confused is to start every program like this:

    #!/usr/bin/perl -w
    use strict;

This way, you'll be forced to declare all your variables with my() and
also disallow accidental "symbolic dereferencing".  Therefore if you'd done

    my $aref = [
	[ "fred", "barney", "pebbles", "bambam", "dino", ],
	[ "homer", "bart", "marge", "maggie", ],
	[ "george", "jane", "elroy", "judy", ],
    ];

    print $aref[2][2];

The compiler would immediately flag that as an error I<at compile time>,
because you were accidentally accessing C<@aref>, an undeclared
variable, and it would thereby remind you to write instead:

    print $aref->[2][2]
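To watch the compile-time check fire without aborting a script, the faulty
code can be compiled inside a string eval.  This is a sketch, not part of
the original text; the names C<$ok> and C<$error> are illustrative.

```perl
    use strict;
    use warnings;

    # Compile the faulty snippet in a string eval so the error can be
    # captured rather than stopping this script.
    my $ok = eval q{
        use strict;
        my $aref = [ [ "fred", "barney" ], [ "homer", "bart" ] ];
        my $x = $aref[1][1];    # oops: this names @aref, which was never declared
        1;
    };
    my $error = $@;

    print "strict rejected it: $error" unless $ok;
```

The captured message names C<@aref> as an undeclared global symbol, which
is the hint that C<$aref-E<gt>[1][1]> was intended.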

X<data structure, debugging> X<complex data structure, debugging>
X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
X<array of arrays, debugging> X<hash of arrays, debugging>
X<array of hashes, debugging> X<hash of hashes, debugging>

Before version 5.002, the standard Perl debugger didn't do a very nice job of
printing out complex data structures.  With 5.002 or above, the
debugger includes several new features, including command line editing as
well as the C<x> command to dump out complex data structures.  For
example, given the assignment to $AoA above, here's the debugger output:

    DB<1> x $AoA
    $AoA = ARRAY(0x13b5a0)
       0  ARRAY(0x1f0a24)
	  0  'fred'
	  1  'barney'
	  2  'pebbles'
	  3  'bambam'
	  4  'dino'
       1  ARRAY(0x13b558)
	  0  'homer'
	  1  'bart'
	  2  'marge'
	  3  'maggie'
       2  ARRAY(0x13b540)
	  0  'george'
	  1  'jane'
	  2  'elroy'
	  3  'judy'
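Outside the debugger, a comparable dump can be had from the Data::Dumper
module that ships with Perl.  The original text does not mention it, so
treat this as a sketch; C<$dump> is a name invented for the example.

```perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $AoA = [
        [ "fred", "barney", "pebbles", "bambam", "dino" ],
        [ "homer", "bart", "marge", "maggie" ],
        [ "george", "jane", "elroy", "judy" ],
    ];

    local $Data::Dumper::Indent   = 1;   # compact, fixed-width indentation
    local $Data::Dumper::Sortkeys = 1;   # stable key order for any hashes

    my $dump = Dumper($AoA);             # returns the structure as Perl source text
    print $dump;
```

Because the result is an ordinary string, it can be printed, logged or
compared in a test.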


Presented with little comment (these will get their own manpages someday)
here are short code examples illustrating access of various
types of data structures.

X<array of arrays> X<AoA>

=head2 Declaration of an ARRAY OF ARRAYS

 @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
 );

=head2 Generation of an ARRAY OF ARRAYS

 # reading from file
 while ( <> ) {
     push @AoA, [ split ];
 }

 # calling a function
 for $i ( 1 .. 10 ) {
     $AoA[$i] = [ somefunc($i) ];
 }

 # using temp vars
 for $i ( 1 .. 10 ) {
     @tmp = somefunc($i);
     $AoA[$i] = [ @tmp ];
 }

 # add to an existing row
 push @{ $AoA[0] }, "wilma", "betty";

=head2 Access and Printing of an ARRAY OF ARRAYS

 # one element
 $AoA[0][0] = "Fred";

 # another element
 $AoA[1][1] =~ s/(\w)/\u$1/;

 # print the whole thing with refs
 for $aref ( @AoA ) {
     print "\t [ @$aref ],\n";
 }

 # print the whole thing with indices
 for $i ( 0 .. $#AoA ) {
     print "\t [ @{$AoA[$i]} ],\n";
 }

 # print the whole thing one at a time
 for $i ( 0 .. $#AoA ) {
     for $j ( 0 .. $#{ $AoA[$i] } ) {
         print "elt $i $j is $AoA[$i][$j]\n";
     }
 }

X<hash of arrays> X<HoA>

=head2 Declaration of a HASH OF ARRAYS

 %HoA = (
        flintstones        => [ "fred", "barney" ],
        jetsons            => [ "george", "jane", "elroy" ],
        simpsons           => [ "homer", "marge", "bart" ],
 );

=head2 Generation of a HASH OF ARRAYS

 # reading from file
 # flintstones: fred barney wilma dino
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $HoA{$1} = [ split ];
 }

 # reading from file; more temps
 # flintstones: fred barney wilma dino
 while ( $line = <> ) {
     ($who, $rest) = split /:\s*/, $line, 2;
     @fields = split ' ', $rest;
     $HoA{$who} = [ @fields ];
 }

 # calling a function that returns a list
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     $HoA{$group} = [ get_family($group) ];
 }

 # likewise, but using temps
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     @members = get_family($group);
     $HoA{$group} = [ @members ];
 }

 # append new members to an existing family
 push @{ $HoA{"flintstones"} }, "wilma", "betty";

=head2 Access and Printing of a HASH OF ARRAYS

 # one element
 $HoA{flintstones}[0] = "Fred";

 # another element
 $HoA{simpsons}[1] =~ s/(\w)/\u$1/;

 # print the whole thing
 foreach $family ( keys %HoA ) {
     print "$family: @{ $HoA{$family} }\n"
 }

 # print the whole thing with indices
 foreach $family ( keys %HoA ) {
     print "$family: ";
     foreach $i ( 0 .. $#{ $HoA{$family} } ) {
         print " $i = $HoA{$family}[$i]";
     }
     print "\n";
 }

 # print the whole thing sorted by number of members
 foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
     print "$family: @{ $HoA{$family} }\n"
 }

 # print the whole thing sorted by number of members and name
 foreach $family ( sort {
			    @{$HoA{$b}} <=> @{$HoA{$a}}
					||
				    $a cmp $b
	    } keys %HoA )
 {
     print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
 }

X<array of hashes> X<AoH>

=head2 Declaration of an ARRAY OF HASHES

 @AoH = (
        {
            Lead     => "fred",
            Friend   => "barney",
        },
        {
            Lead     => "george",
            Wife     => "jane",
            Son      => "elroy",
        },
        {
            Lead     => "homer",
            Wife     => "marge",
            Son      => "bart",
        },
 );

=head2 Generation of an ARRAY OF HASHES

 # reading from file
 # format: LEAD=fred FRIEND=barney
 while ( <> ) {
     $rec = {};
     for $field ( split ) {
         ($key, $value) = split /=/, $field;
         $rec->{$key} = $value;
     }
     push @AoH, $rec;
 }

 # reading from file
 # format: LEAD=fred FRIEND=barney
 # no temp
 while ( <> ) {
     push @AoH, { split /[\s=]+/ };
 }

 # calling a function  that returns a key/value pair list, like
 # "lead","fred","daughter","pebbles"
 while ( %fields = getnextpairset() ) {
     push @AoH, { %fields };
 }

 # likewise, but using no temp vars
 while (<>) {
     push @AoH, { parsepairs($_) };
 }

 # add key/value to an element
 $AoH[0]{pet} = "dino";
 $AoH[2]{pet} = "santa's little helper";

=head2 Access and Printing of an ARRAY OF HASHES

 # one element
 $AoH[0]{lead} = "fred";

 # another element
 $AoH[1]{lead} =~ s/(\w)/\u$1/;

 # print the whole thing with refs
 for $href ( @AoH ) {
     print "{ ";
     for $role ( keys %$href ) {
         print "$role=$href->{$role} ";
     }
     print "}\n";
 }

 # print the whole thing with indices
 for $i ( 0 .. $#AoH ) {
     print "$i is { ";
     for $role ( keys %{ $AoH[$i] } ) {
         print "$role=$AoH[$i]{$role} ";
     }
     print "}\n";
 }

 # print the whole thing one at a time
 for $i ( 0 .. $#AoH ) {
     for $role ( keys %{ $AoH[$i] } ) {
         print "elt $i $role is $AoH[$i]{$role}\n";
     }
 }

X<hash of hashes> X<HoH>

=head2 Declaration of a HASH OF HASHES

 %HoH = (
        flintstones => {
		lead      => "fred",
		pal       => "barney",
        },
        jetsons     => {
		lead      => "george",
		wife      => "jane",
		"his boy" => "elroy",
        },
        simpsons    => {
		lead      => "homer",
		wife      => "marge",
		kid       => "bart",
        },
 );

=head2 Generation of a HASH OF HASHES

 # reading from file
 # flintstones: lead=fred pal=barney wife=wilma pet=dino
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $who = $1;
     for $field ( split ) {
         ($key, $value) = split /=/, $field;
         $HoH{$who}{$key} = $value;
     }
 }

 # reading from file; more temps
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $who = $1;
     $rec = {};
     $HoH{$who} = $rec;
     for $field ( split ) {
         ($key, $value) = split /=/, $field;
         $rec->{$key} = $value;
     }
 }

 # calling a function  that returns a key,value hash
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     $HoH{$group} = { get_family($group) };
 }

 # likewise, but using temps
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     %members = get_family($group);
     $HoH{$group} = { %members };
 }

 # append new members to an existing family
 %new_folks = (
     wife => "wilma",
     pet  => "dino",
 );

 for $what (keys %new_folks) {
     $HoH{flintstones}{$what} = $new_folks{$what};
 }

=head2 Access and Printing of a HASH OF HASHES

 # one element
 $HoH{flintstones}{wife} = "wilma";

 # another element
 $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;

 # print the whole thing
 foreach $family ( keys %HoH ) {
     print "$family: { ";
     for $role ( keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # print the whole thing  somewhat sorted
 foreach $family ( sort keys %HoH ) {
     print "$family: { ";
     for $role ( sort keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # print the whole thing sorted by number of members
 foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) {
     print "$family: { ";
     for $role ( sort keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # establish a sort order (rank) for each role
 $i = 0;
 for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

 # now print the whole thing sorted by number of members
 foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) {
     print "$family: { ";
     # and print these according to rank order
     for $role ( sort { $rank{$a} <=> $rank{$b} }  keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

X<record> X<structure> X<struct>

=head2 Declaration of MORE ELABORATE RECORDS

Here's a sample showing how to create and use a record whose fields are of
many different sorts:

     $rec = {
	 TEXT      => $string,
	 SEQUENCE  => [ @old_values ],
	 LOOKUP    => { %some_table },
	 THATCODE  => \&some_function,
	 THISCODE  => sub { $_[0] ** $_[1] },
	 HANDLE    => \*STDOUT,
     };

     print $rec->{TEXT};

     print $rec->{SEQUENCE}[0];
     $last = pop @{ $rec->{SEQUENCE} };

     print $rec->{LOOKUP}{"key"};
     ($first_k, $first_v) = each %{ $rec->{LOOKUP} };

     $answer = $rec->{THATCODE}->($arg);
     $answer = $rec->{THISCODE}->($arg1, $arg2);

     # careful of extra block braces on fh ref
     print { $rec->{HANDLE} } "a string\n";

     use FileHandle;
     $rec->{HANDLE}->print(" a string\n");

=head2 Declaration of a HASH OF COMPLEX RECORDS

     %TV = (
        flintstones => {
            series   => "flintstones",
            nights   => [ qw(monday thursday friday) ],
            members  => [
                { name => "fred",    role => "lead", age  => 36, },
                { name => "wilma",   role => "wife", age  => 31, },
                { name => "pebbles", role => "kid",  age  =>  4, },
            ],
        },

        jetsons     => {
            series   => "jetsons",
            nights   => [ qw(wednesday saturday) ],
            members  => [
                { name => "george",  role => "lead", age  => 41, },
                { name => "jane",    role => "wife", age  => 39, },
                { name => "elroy",   role => "kid",  age  =>  9, },
            ],
        },

        simpsons    => {
            series   => "simpsons",
            nights   => [ qw(monday) ],
            members  => [
                { name => "homer", role => "lead", age  => 34, },
                { name => "marge", role => "wife", age  => 37, },
                { name => "bart",  role => "kid",  age  => 11, },
            ],
        },
     );

=head2 Generation of a HASH OF COMPLEX RECORDS

     # reading from file
     # this is most easily done by having the file itself be
     # in the raw data format as shown above.  perl is happy
     # to parse complex data structures if declared as data, so
     # sometimes it's easiest to do that

     # here's a piece by piece build up
     $rec = {};
     $rec->{series} = "flintstones";
     $rec->{nights} = [ find_days() ];

     @members = ();
     # assume this file in field=value syntax
     while (<>) {
         %fields = split /[\s=]+/;
         push @members, { %fields };
     }
     $rec->{members} = [ @members ];

     # now remember the whole thing
     $TV{ $rec->{series} } = $rec;

     # now, you might want to make interesting extra fields that
     # include pointers back into the same data structure so that if
     # you change one piece, it changes everywhere.  for example,
     # you might want a {kids} field that is a reference
     # to an array of the kids' records without having duplicate
     # records and thus update problems.
     foreach $family (keys %TV) {
         $rec = $TV{$family}; # temp pointer
         @kids = ();
         for $person ( @{ $rec->{members} } ) {
             if ($person->{role} =~ /kid|son|daughter/) {
                 push @kids, $person;
             }
         }
         # REMEMBER: $rec and $TV{$family} point to same data!!
         $rec->{kids} = [ @kids ];
     }

     # you copied the array, but the array itself contains pointers
     # to uncopied objects. this means that if you make bart get
     # older via

     $TV{simpsons}{kids}[0]{age}++;

     # then this would also change in
     print $TV{simpsons}{members}[2]{age};

     # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
     # both point to the same underlying anonymous hash table

     # print the whole thing
     foreach $family ( keys %TV ) {
         print "the $family";
         print " is on during @{ $TV{$family}{nights} }\n";
         print "its members are:\n";
         for $who ( @{ $TV{$family}{members} } ) {
             print " $who->{name} ($who->{role}), age $who->{age}\n";
         }
         print "it turns out that $TV{$family}{lead} has ";
         print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
         print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
         print "\n";
     }

=head1 Database Ties

You cannot easily tie a multilevel data structure (such as a hash of
hashes) to a dbm file.  The first problem is that all but GDBM and
Berkeley DB have size limitations, but beyond that, you also have problems
with how references are to be represented on disk.  One experimental
module that does partially attempt to address this need is the MLDBM
module.  Check your nearest CPAN site as described in L<perlmodlib> for
source code to MLDBM.

=head1 SEE ALSO

perlref(1), perllol(1), perldata(1), perlobj(1)

=head1 AUTHOR

Tom Christiansen <F<tchrist at perl.com>>

Last update:
Wed Oct 23 04:57:50 MET DST 1996

--- NEW FILE: perl585delta.pod ---
=head1 NAME

perl585delta - what is new for perl v5.8.5


This document describes differences between the 5.8.4 release and
the 5.8.5 release.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.4.

=head1 Core Enhancements

Perl's regular expression engine now contains support for matching on the
intersection of two Unicode character classes. You can also now refer to
user-defined character classes from within other user-defined character
classes.

=head1 Modules and Pragmata

=over 4

=item *

Carp improved to work nicely with Safe. Carp's message reporting should now
be anomaly free - it will always print out line number information.

=item *

CGI upgraded to version 3.05

=item *

charnames now avoids clobbering $_

=item *

Digest upgraded to version 1.08

=item *

Encode upgraded to version 2.01

=item *

FileCache upgraded to version 1.04

=item *

libnet upgraded to version 1.19

=item *

Pod::Parser upgraded to version 1.28

=item *

Pod::Perldoc upgraded to version 3.13

=item *

Pod::LaTeX upgraded to version 0.57

=item *

Safe now works properly with Carp

=item *

Scalar-List-Utils upgraded to version 1.14

=item *

Shell's documentation has been re-written, and its historical partial
auto-quoting of command arguments can now be disabled.

=item *

Test upgraded to version 1.25

=item *

Test::Harness upgraded to version 2.42

=item *

Time::Local upgraded to version 1.10

=item *

Unicode::Collate upgraded to version 0.40

=item *

Unicode::Normalize upgraded to version 0.30

=back


=head1 Utility Changes

=head2 Perl's debugger

The debugger can now emulate stepping backwards, by restarting and rerunning
all bar the last command from a saved command history.

=head2 h2ph

F<h2ph> is now able to understand a very limited set of C inline functions
-- basically, the inline functions that look like CPP macros. This has
been introduced to deal with some of the headers of the newest versions of
the glibc. The standard warning still applies; to quote F<h2ph>'s
documentation, I<you may need to dicker with the files produced>.

=head1 Installation and Configuration Improvements

Perl 5.8.5 should build cleanly from source on LynxOS.

=head1 Selected Bug Fixes

=over 4

=item *

The in-place sort optimisation introduced in 5.8.4 had a bug. For example,
in code such as

    @a = sort ($b, @a)

the result would omit the value $b. This is now fixed.

=item *

The optimisation for unnecessary assignments introduced in 5.8.4 could give
spurious warnings. This has been fixed.

=item *

Perl should now correctly detect and read BOM-marked and (BOMless) UTF-16
scripts of either endianness.

=item *

Creating a new thread when weak references exist was buggy, and would often
cause warnings at interpreter destruction time. The known bug is now fixed.

=item *

Several obscure bugs involving manipulating Unicode strings with C<substr> have
been fixed.

=item *

Previously if Perl's file globbing function encountered a directory that it
did not have permission to open it would return immediately, leading to
unexpected truncation of the list of results. This has been fixed, to be
consistent with Unix shells' globbing behaviour.

=item *

Thread creation time could vary wildly between identical runs. This was caused
by a poor hashing algorithm in the thread cloning routines, which has now
been fixed.

=item *

The internals of the ithreads implementation were not checking if OS-level
thread creation had failed. threads->create() now returns C<undef> if
thread creation fails instead of crashing perl.

=back


=head1 New or Changed Diagnostics

=over 4

=item *

Perl -V has several improvements

=over 4

=item  *

correctly outputs local patch names that contain embedded code snippets
or other characters that used to confuse it.

=item * 

arguments to -V that look like regexps will give multiple lines of output.

=item *

a trailing colon suppresses the linefeed and ';'  terminator, allowing
embedding of queries into shell commands.

=item *

a leading colon removes the 'name=' part of the response, allowing mapping to
any name.

=back


=item *

When perl fails to find the specified script, it now outputs a second line
suggesting that the user use the C<-S> flag:

    $ perl5.8.5 missing.pl
    Can't open perl script "missing.pl": No such file or directory.
    Use -S to search $PATH for it.

=back


=head1 Changed Internals

The Unicode character class files used by the regular expression engine are
now built at build time from the supplied Unicode consortium data files,
instead of being shipped prebuilt. This makes the compressed Perl source
tarball about 200K smaller. A side effect is that the layout of files inside
lib/unicore has changed.

=head1 Known Problems

The regression test F<t/uni/class.t> is now performing considerably more
tests, and can take several minutes to run even on a fast machine.

=head1 Platform Specific Problems

This release is known not to build on Windows 95.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: perlsub.pod ---
=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines


To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;			  # A "forward" declaration.
    sub NAME(PROTO);		  #  ditto, but with prototypes
    sub NAME : ATTRS;		  #  with attributes
    sub NAME(PROTO) : ATTRS;	  #  with attributes and prototypes

    sub NAME BLOCK		  # A declaration and a definition.
    sub NAME(PROTO) BLOCK	  #  ditto, but with prototypes
    sub NAME : ATTRS BLOCK	  #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK #  with prototypes and attributes
[...1390 lines suppressed...]
    sub snurt : foo + bar;	  # "+" not a colon or space

The attribute list is passed as a list of constant strings to the code
which associates them with the subroutine.  In particular, the second example
of valid syntax above currently looks like this in terms of how it's
parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation,
see L<attributes> and L<Attribute::Handlers>.

=head1 SEE ALSO

See L<perlref/"Function Templates"> for more about references and closures.
See L<perlxs> if you'd like to learn about calling C subroutines from Perl.  
See L<perlembed> if you'd like to learn about calling Perl subroutines from C.  
See L<perlmod> to learn about bundling up your functions in separate files.
See L<perlmodlib> to learn what library modules come standard on your system.
See L<perltoot> to learn how to make object method calls.

--- NEW FILE: perlapio.pod ---
=head1 NAME

perlapio - perl's IO abstraction interface.


    #define PERLIO_NOT_STDIO 0    /* For co-existence with stdio only */
    #include <perlio.h>           /* Usually via #include <perl.h> */

    PerlIO *PerlIO_stdin(void);
    PerlIO *PerlIO_stdout(void);
    PerlIO *PerlIO_stderr(void);

    PerlIO *PerlIO_open(const char *path,const char *mode);
    PerlIO *PerlIO_fdopen(int fd, const char *mode);
    PerlIO *PerlIO_reopen(const char *path, const char *mode, PerlIO *old);  /* deprecated */
    int     PerlIO_close(PerlIO *f);

    int     PerlIO_stdoutf(const char *fmt,...)
    int     PerlIO_puts(PerlIO *f,const char *string);
    int     PerlIO_putc(PerlIO *f,int ch);
    int     PerlIO_write(PerlIO *f,const void *buf,size_t numbytes);
    int     PerlIO_printf(PerlIO *f, const char *fmt,...);
    int     PerlIO_vprintf(PerlIO *f, const char *fmt, va_list args);
    int     PerlIO_flush(PerlIO *f);

    int     PerlIO_eof(PerlIO *f);
    int     PerlIO_error(PerlIO *f);
    void    PerlIO_clearerr(PerlIO *f);

    int     PerlIO_getc(PerlIO *d);
    int     PerlIO_ungetc(PerlIO *f,int ch);
    int     PerlIO_read(PerlIO *f, void *buf, size_t numbytes);

    int     PerlIO_fileno(PerlIO *f);

    void    PerlIO_setlinebuf(PerlIO *f);

    Off_t   PerlIO_tell(PerlIO *f);
    int     PerlIO_seek(PerlIO *f, Off_t offset, int whence);
    void    PerlIO_rewind(PerlIO *f);

    int     PerlIO_getpos(PerlIO *f, SV *save);        /* prototype changed */
    int     PerlIO_setpos(PerlIO *f, SV *saved);       /* prototype changed */

    int     PerlIO_fast_gets(PerlIO *f);
    int     PerlIO_has_cntptr(PerlIO *f);
    int     PerlIO_get_cnt(PerlIO *f);
    char   *PerlIO_get_ptr(PerlIO *f);
    void    PerlIO_set_ptrcnt(PerlIO *f, char *ptr, int count);

    int     PerlIO_canset_cnt(PerlIO *f);              /* deprecated */
    void    PerlIO_set_cnt(PerlIO *f, int count);      /* deprecated */

    int     PerlIO_has_base(PerlIO *f);
    char   *PerlIO_get_base(PerlIO *f);
    int     PerlIO_get_bufsiz(PerlIO *f);

    PerlIO *PerlIO_importFILE(FILE *stdio, const char *mode);
    FILE   *PerlIO_exportFILE(PerlIO *f, int flags);
    FILE   *PerlIO_findFILE(PerlIO *f);
    void    PerlIO_releaseFILE(PerlIO *f,FILE *stdio);

    int     PerlIO_apply_layers(PerlIO *f, const char *mode, const char *layers);
    int     PerlIO_binmode(PerlIO *f, int ptype, int imode, const char *layers);
    void    PerlIO_debug(const char *fmt,...)


Perl's source code, and extensions that want maximum portability,
should use the above functions instead of those defined in ANSI C's
I<stdio.h>.  The perl headers (in particular "perlio.h") will
C<#define> them to the I/O mechanism selected at Configure time.

The functions are modeled on those in I<stdio.h>, but parameter order
has been "tidied up a little".

C<PerlIO *> takes the place of FILE *. Like FILE * it should be
treated as opaque (it is probably safe to assume it is a pointer to
something).

There are currently three implementations:

=over 4

=item 1. USE_STDIO

All above are #define'd to stdio functions or are trivial wrapper
functions which call stdio. In this case I<only> PerlIO * is a FILE *.
This has been the default implementation since the abstraction was
introduced in perl5.003_02.

=item 2. USE_SFIO

A "legacy" implementation in terms of the "sfio" library. Used for
some specialist applications on Unix machines ("sfio" is not widely
ported away from Unix).  Most of above are #define'd to the sfio
functions. PerlIO * is in this case Sfio_t *.

=item 3. USE_PERLIO

Introduced just after perl5.7.0, this is a re-implementation of the
above abstraction which allows perl more control over how IO is done
as it decouples IO from the way the operating system and C library
choose to do things. For USE_PERLIO PerlIO * has an extra layer of
indirection - it is a pointer-to-a-pointer.  This allows the PerlIO *
to remain with a known value while swapping the implementation around
underneath I<at run time>. In this case all the above are true (but
very simple) functions which call the underlying implementation.

This is the only implementation for which C<PerlIO_apply_layers()>
does anything "interesting".

The USE_PERLIO implementation is described in L<perliol>.

=back


Because "perlio.h" is a thin layer (for efficiency) the semantics of
these functions are somewhat dependent on the underlying implementation.
Where these variations are understood they are noted below.

Unless otherwise noted, functions return 0 on success, or a negative
value (usually C<EOF> which is usually -1) and set C<errno> on error.

=over 4

=item B<PerlIO_stdin()>, B<PerlIO_stdout()>, B<PerlIO_stderr()>

Use these rather than C<stdin>, C<stdout>, C<stderr>. They are written
to look like "function calls" rather than variables because this makes
it easier to I<make them> function calls if the platform cannot export data
to loaded modules, or if (say) different "threads" might have different
values.

=item B<PerlIO_open(path, mode)>, B<PerlIO_fdopen(fd,mode)>

These correspond to fopen()/fdopen() and the arguments are the same.
Return C<NULL> and set C<errno> if there is an error.  There may be an
implementation limit on the number of open handles, which may be lower
than the limit on the number of open files - C<errno> may not be set
when C<NULL> is returned if this limit is exceeded.

=item B<PerlIO_reopen(path,mode,f)>

While this currently exists in all three implementations perl itself
does not use it. I<As perl does not use it, it is not well tested.>

Perl prefers to C<dup> the new low-level descriptor to the descriptor
used by the existing PerlIO. This may become the behaviour of this
function in the future.

=item B<PerlIO_printf(f,fmt,...)>, B<PerlIO_vprintf(f,fmt,a)>

These are fprintf()/vfprintf() equivalents.

=item B<PerlIO_stdoutf(fmt,...)>

This is printf() equivalent. printf is #defined to this function,
so it is (currently) legal to use C<printf(fmt,...)> in perl sources.

=item B<PerlIO_read(f,buf,count)>, B<PerlIO_write(f,buf,count)>

These correspond functionally to fread() and fwrite() but the
arguments and return values are different.  The PerlIO_read() and
PerlIO_write() signatures have been modeled on the more sane low level
read() and write() functions instead: The "file" argument is passed
first, there is only one "count", and the return value can distinguish
between error and C<EOF>.

Returns a byte count if successful (which may be zero or
positive); returns a negative value and sets C<errno> on error.
Depending on the implementation, C<errno> may be C<EINTR> if the
operation was interrupted by a signal.

=item B<PerlIO_close(f)>

Depending on implementation C<errno> may be C<EINTR> if operation was
interrupted by a signal.

=item B<PerlIO_puts(f,s)>, B<PerlIO_putc(f,c)>

These correspond to fputs() and fputc().
Note that arguments have been revised to have "file" first.

=item B<PerlIO_ungetc(f,c)>

This corresponds to ungetc().  Note that arguments have been revised
to have "file" first.  Arranges that next read operation will return
the byte B<c>.  Despite the implied "character" in the name only
values in the range 0..0xFF are defined. Returns the byte B<c> on
success or -1 (C<EOF>) on error.  The number of bytes that can be
"pushed back" may vary, only 1 character is certain, and then only if
it is the last character that was read from the handle.

=item B<PerlIO_getc(f)>

This corresponds to getc().
Despite the c in the name only byte range 0..0xFF is supported.
Returns the character read or -1 (C<EOF>) on error.

=item B<PerlIO_eof(f)>

This corresponds to feof().  Returns a true/false indication of
whether the handle is at end of file.  For terminal devices this may
or may not be "sticky" depending on the implementation.  The flag is
cleared by PerlIO_seek(), or PerlIO_rewind().

=item B<PerlIO_error(f)>

This corresponds to ferror().  Returns a true/false indication of
whether there has been an IO error on the handle.

=item B<PerlIO_fileno(f)>

This corresponds to fileno().  Note that on some platforms the meaning
of "fileno" may not match Unix. Returns -1 if the handle has no open
descriptor associated with it.

=item B<PerlIO_clearerr(f)>

This corresponds to clearerr(), i.e., clears 'error' and (usually)
'eof' flags for the "stream". Does not return a value.

=item B<PerlIO_flush(f)>

This corresponds to fflush().  Sends any buffered write data to the
underlying file.  If called with C<NULL> this may flush all open
streams (or core dump with some USE_STDIO implementations).  Calling
on a handle open for read only, or on which last operation was a read
of some kind may lead to undefined behaviour on some USE_STDIO
implementations.  The USE_PERLIO (layers) implementation tries to
behave better: it flushes all open streams when passed C<NULL>, and
attempts to retain data on read streams either in the buffer or by
seeking the handle to the current logical position.

=item B<PerlIO_seek(f,offset,whence)>

This corresponds to fseek().  Sends buffered write data to the
underlying file, or discards any buffered read data, then positions
the file descriptor as specified by B<offset> and B<whence> (sic).
This is the correct thing to do when switching between read and write
on the same handle (see issues with PerlIO_flush() above).  Offset is
of type C<Off_t> which is a perl Configure value which may not be same
as stdio's C<off_t>.

=item B<PerlIO_tell(f)>

This corresponds to ftell().  Returns the current file position, or
(Off_t) -1 on error.  May just return value system "knows" without
making a system call or checking the underlying file descriptor (so
use on shared file descriptors is not safe without a
PerlIO_seek()). Return value is of type C<Off_t> which is a perl
Configure value which may not be same as stdio's C<off_t>.

=item B<PerlIO_getpos(f,p)>, B<PerlIO_setpos(f,p)>

These correspond (loosely) to fgetpos() and fsetpos(). Rather than
stdio's Fpos_t they expect a "Perl Scalar Value" to be passed. What is
stored there should be considered opaque. The layout of the data may
vary from handle to handle.  When not using stdio or if platform does
not have the stdio calls then they are implemented in terms of
PerlIO_tell() and PerlIO_seek().

=item B<PerlIO_rewind(f)>

This corresponds to rewind(). It is usually defined as being

    PerlIO_seek(f,(Off_t)0L, SEEK_SET);

=item B<PerlIO_tmpfile()>

This corresponds to tmpfile(), i.e., returns an anonymous PerlIO or
NULL on error.  The system will attempt to automatically delete the
file when closed.  On Unix the file is usually C<unlink>-ed just after
it is created so it does not matter how it gets closed. On other
systems the file may only be deleted if closed via PerlIO_close()
and/or the program exits via C<exit>.  Depending on the implementation
there may be "race conditions" which allow other processes access to
the file, though in general it will be safer in this regard than
ad hoc schemes.

=item B<PerlIO_setlinebuf(f)>

This corresponds to setlinebuf().  Does not return a value. What
constitutes a "line" is implementation dependent but usually means
that writing "\n" flushes the buffer.  What happens with things like
"this\nthat" is uncertain.  (Perl core uses it I<only> when "dumping";
it has nothing to do with $| auto-flush.)

=back


=head2 Co-existence with stdio

There is outline support for co-existence of PerlIO with stdio.
Obviously if PerlIO is implemented in terms of stdio there is no
problem. In other cases, however, mechanisms must exist to create a
FILE * which can be passed to library code which is going to use stdio
calls.

The first step is to add this line:

   #define PERLIO_NOT_STDIO 0

I<before> including any perl header files. (This will probably become
the default at some point).  That prevents "perlio.h" from attempting
to #define stdio functions onto PerlIO functions.

XS code is probably better using "typemap" if it expects FILE *
arguments.  The standard typemap will be adjusted to comprehend any
changes in this area.

=over 4

=item B<PerlIO_importFILE(f,mode)>

Used to get a PerlIO * from a FILE *.

The mode argument should be a string as would be passed to
fopen/PerlIO_open.  If it is NULL then - for legacy support - the code
will (depending upon the platform and the implementation) either
attempt to empirically determine the mode in which I<f> is open, or
use "r+" to indicate a read/write stream.

Once called the FILE * should I<ONLY> be closed by calling
C<PerlIO_close()> on the returned PerlIO *.

The PerlIO is set to textmode. Use PerlIO_binmode if this is
not the desired mode.

This is B<not> the reverse of PerlIO_exportFILE().

=item B<PerlIO_exportFILE(f,mode)>

Given a PerlIO * create a 'native' FILE * suitable for passing to code
expecting to be compiled and linked with ANSI C I<stdio.h>.  The mode
argument should be a string as would be passed to fopen/PerlIO_open.
If it is NULL then - for legacy support - the FILE * is opened in same
mode as the PerlIO *.

The fact that such a FILE * has been 'exported' is recorded, (normally
by pushing a new :stdio "layer" onto the PerlIO *), which may affect
future PerlIO operations on the original PerlIO *.  You should not
call C<fclose()> on the file unless you call C<PerlIO_releaseFILE()>
to disassociate it from the PerlIO *.  (Do not use PerlIO_importFILE()
for doing the disassociation.)

Calling this function repeatedly will create a FILE * on each call
(and will push an :stdio layer each time as well).

=item B<PerlIO_releaseFILE(p,f)>

Calling PerlIO_releaseFILE informs PerlIO that all use of FILE * is
complete. It is removed from the list of 'exported' FILE *s, and the
associated PerlIO * should revert to its original behaviour.

Use this to disassociate a file from a PerlIO * that was associated
using PerlIO_exportFILE().

=item B<PerlIO_findFILE(f)>

Returns a native FILE * used by a stdio layer. If there is none, it
will create one with PerlIO_exportFILE. In either case the FILE *
should be considered as belonging to PerlIO subsystem and should
only be closed by calling C<PerlIO_close()>.

=back


=head2 "Fast gets" Functions

In addition to the stdio-like API defined so far, there is an
"implementation" interface which allows perl to get at internals of
PerlIO.  The following calls correspond to the various FILE_xxx macros
determined by Configure - or their equivalent in other
implementations. This section is really of interest to only those
concerned with detailed perl-core behaviour, implementing a PerlIO
mapping or writing code which can make use of the "read ahead" that
has been done by the IO system in the same way perl does. Note that
any code that uses these interfaces must be prepared to do things the
traditional way if a handle does not support them.

=over 4

=item B<PerlIO_fast_gets(f)>

Returns true if implementation has all the interfaces required to
allow perl's C<sv_gets> to "bypass" normal IO mechanism.  This can
vary from handle to handle.

  PerlIO_fast_gets(f) = PerlIO_has_cntptr(f) && \
                        PerlIO_canset_cnt(f) && \
                        `Can set pointer into buffer'

=item B<PerlIO_has_cntptr(f)>

Implementation can return pointer to current position in the "buffer"
and a count of bytes available in the buffer.  Do not use this - use
PerlIO_fast_gets.

=item B<PerlIO_get_cnt(f)>

Return count of readable bytes in the buffer. Zero or negative return
means no more bytes available.

=item B<PerlIO_get_ptr(f)>

Return pointer to next readable byte in buffer, accessing via the
pointer (dereferencing) is only safe if PerlIO_get_cnt() has returned
a positive value.  Only positive offsets up to value returned by
PerlIO_get_cnt() are allowed.

=item B<PerlIO_set_ptrcnt(f,p,c)>

Set pointer into buffer, and a count of bytes still in the
buffer. Should be used only to set pointer to within range implied by
previous calls to C<PerlIO_get_ptr> and C<PerlIO_get_cnt>. The two
values I<must> be consistent with each other (implementation may only
use one or the other or may require both).

=item B<PerlIO_canset_cnt(f)>

Implementation can adjust its idea of number of bytes in the buffer.
Do not use this - use PerlIO_fast_gets.

=item B<PerlIO_set_cnt(f,c)>

Obscure - set count of bytes in the buffer. Deprecated.  Only usable
if PerlIO_canset_cnt() returns true.  Currently used in only doio.c to
force count less than -1 to -1.  Perhaps should be PerlIO_set_empty or
similar.  This call may actually do nothing if "count" is deduced from
pointer and a "limit".  Do not use this - use PerlIO_set_ptrcnt().

=item B<PerlIO_has_base(f)>

Returns true if implementation has a buffer, and can return pointer
to whole buffer and its size. Used by perl for B<-T> / B<-B> tests.
Other uses would be very obscure...

=item B<PerlIO_get_base(f)>

Return I<start> of buffer. Access only positive offsets in the buffer
up to the value returned by PerlIO_get_bufsiz().

=item B<PerlIO_get_bufsiz(f)>

Return the I<total number of bytes> in the buffer, this is neither the
number that can be read, nor the amount of memory allocated to the
buffer. Rather it is what the operating system and/or implementation
happened to C<read()> (or whatever) last time IO was requested.

=back


=head2 Other Functions

=over 4

=item PerlIO_apply_layers(f,mode,layers)

The new interface to the USE_PERLIO implementation. The layers ":crlf"
and ":raw" are the only ones allowed for other implementations and those
are silently ignored. (As of perl5.8 ":raw" is deprecated.)  Use
PerlIO_binmode() below for the portable case.

=item PerlIO_binmode(f,ptype,imode,layers)

The hook used by perl's C<binmode> operator.
B<ptype> is perl's character for the kind of IO:

=over 8

=item 'E<lt>' read

=item 'E<gt>' write

=item '+' read/write

=back


B<imode> is C<O_BINARY> or C<O_TEXT>.

B<layers> is a string of layers to apply, only ":crlf" makes sense in
the non USE_PERLIO case. (As of perl5.8 ":raw" is deprecated in favour
of passing NULL.)

Portable cases are passing C<O_BINARY> with a NULL B<layers> argument,
and passing C<O_TEXT> with C<":crlf">.


On Unix these calls probably have no effect whatsoever.  Elsewhere
they alter "\n" to CR,LF translation and possibly cause a special text
"end of file" indicator to be written or honoured on read. The effect
of making the call after doing any IO to the handle depends on the
implementation. (It may be ignored, affect any data which is already
buffered as well, or only apply to subsequent data.)
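
At the Perl level the same hook is reached through C<binmode>.  The
following is only a sketch, assuming a perl built with USE_PERLIO so
that the C<:crlf> layer can be pushed on any platform; the temporary
file and the variable names are illustrative and not part of the API.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Write one line through the :crlf layer (the O_TEXT case).
    my ($out, $name) = tempfile(UNLINK => 1);
    binmode($out, ":crlf") or die "binmode: $!";
    print {$out} "line\n";
    close($out) or die "close: $!";

    # Read it back with no translation (the O_BINARY case).
    open(my $in, '<', $name) or die "open: $!";
    binmode($in) or die "binmode: $!";
    my $raw = do { local $/; <$in> };
    close($in);

    printf "%d bytes, ends in CR LF: %s\n",
        length($raw), ($raw =~ /\r\n\z/ ? "yes" : "no");

Reading the file back in binary mode is what makes the translation
visible; reading it through C<:crlf> again would hide it.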

=item PerlIO_debug(fmt,...)

PerlIO_debug is a printf()-like function which can be used for
debugging.  No return value. Its main use is inside PerlIO where using
real printf, warn() etc. would recursively call PerlIO and be a
problem.

PerlIO_debug writes to the file named by $ENV{'PERLIO_DEBUG'}.  Typical
use might be

  Bourne shells (sh, ksh, bash, zsh, ash, ...):
   PERLIO_DEBUG=/dev/tty ./perl somescript some args

  Csh/Tcsh:
   setenv PERLIO_DEBUG /dev/tty
   ./perl somescript some args

  If you have the "env" utility:
   env PERLIO_DEBUG=/dev/tty ./perl somescript some args

  Win32:
   set PERLIO_DEBUG=CON
   perl somescript some args

If $ENV{'PERLIO_DEBUG'} is not set PerlIO_debug() is a no-op.

=back


--- NEW FILE: perl5005delta.pod ---
=head1 NAME

perl5005delta - what's new for perl5.005

=head1 DESCRIPTION


This document describes differences between the 5.004 release and this one.

=head1 About the new versioning system

Perl is now developed on two tracks: a maintenance track that makes
small, safe updates to released production versions with emphasis on
compatibility; and a development track that pursues more aggressive
evolution.  Maintenance releases (which should be considered production
quality) have subversion numbers that run from C<1> to C<49>, and
development releases (which should be considered "alpha" quality) run
from C<50> to C<99>.

Perl 5.005 is the combined product of the new dual-track development
scheme.

=head1 Incompatible Changes

=head2 WARNING:  This version is not binary compatible with Perl 5.004.

Starting with Perl 5.004_50 there were many deep and far-reaching changes
to the language internals.  If you have dynamically loaded extensions
that you built under perl 5.003 or 5.004, you can continue to use them
with 5.004, but you will need to rebuild and reinstall those extensions
to use them with 5.005.  See F<INSTALL> for detailed instructions on how to
upgrade.

=head2 Default installation structure has changed

The new Configure defaults are designed to allow a smooth upgrade from
5.004 to 5.005, but you should read F<INSTALL> for a detailed
discussion of the changes in order to adapt them to your system.

=head2 Perl Source Compatibility

When none of the experimental features are enabled, there should be
very few user-visible Perl source compatibility issues.

If threads are enabled, then some caveats apply. C<@_> and C<$_> become
lexical variables.  The effect of this should be largely transparent to
the user, but there are some boundary conditions under which the user will
need to be aware of the issues.  For example, C<local(@_)> results in
a "Can't localize lexical variable @_ ..." message.  This may be enabled
in a future version.

Some new keywords have been introduced.  These are generally expected to
have very little impact on compatibility.  See L<New C<INIT> keyword>,
L<New C<lock> keyword>, and L<New C<qrE<sol>E<sol>> operator>.

Certain barewords are now reserved.  Use of these will provoke a warning
if you have asked for them with the C<-w> switch.
See L<C<our> is now a reserved word>.

=head2 C Source Compatibility

There have been a large number of changes in the internals to support
the new features in this release.

=over 4

=item *

Core sources now require ANSI C compiler

An ANSI C compiler is now B<required> to build perl.  See F<INSTALL>.

=item *

All Perl global variables must now be referenced with an explicit prefix

All Perl global variables that are visible for use by extensions now
have a C<PL_> prefix.  New extensions should B<not> refer to perl globals
by their unqualified names.  To preserve sanity, we provide limited
backward compatibility for globals that are being widely used like
C<sv_undef> and C<na> (which should now be written as C<PL_sv_undef>,
C<PL_na> etc.)

If you find that your XS extension does not compile anymore because a
perl global is not visible, try adding a C<PL_> prefix to the global
and rebuild.

It is strongly recommended that all functions in the Perl API that don't
begin with C<perl> be referenced with a C<Perl_> prefix.  The bare function
names without the C<Perl_> prefix are supported with macros, but this
support may cease in a future release.

See L<perlapi>.

=item *

Enabling threads has source compatibility issues

Perl built with threading enabled requires extensions to use the new
C<dTHR> macro to initialize the handle to access per-thread data.
If you see a compiler error that talks about the variable C<thr> not
being declared (when building a module that has XS code),  you need
to add C<dTHR;> at the beginning of the block that elicited the error.

The API function C<perl_get_sv("@",FALSE)> should be used instead of
directly accessing perl globals as C<GvSV(errgv)>.  The API call is
backward compatible with existing perls and provides source compatibility
when threading is enabled.

See L<"C Source Compatibility"> for more information.

=back


=head2 Binary Compatibility

This version is NOT binary compatible with older versions.  All extensions
will need to be recompiled.  Furthermore, binaries built with threads enabled
are incompatible with binaries built without.  This should largely be
transparent to the user, as all binary incompatible configurations have
their own unique architecture name, and extension binaries get installed at
unique locations.  This allows coexistence of several configurations in
the same directory hierarchy.  See F<INSTALL>.

=head2 Security fixes may affect compatibility

A few taint leaks and taint omissions have been corrected.  This may lead
to "failure" of scripts that used to work with older versions.  Compiling
with -DINCOMPLETE_TAINTS provides a perl with minimal amounts of changes
to the tainting behavior.  But note that the resulting perl will have
known insecurities.

Oneliners with the C<-e> switch do not create temporary files anymore.

=head2 Relaxed new mandatory warnings introduced in 5.004

Many new warnings that were introduced in 5.004 have been made
optional.  Some of these warnings are still present, but perl's new
features make them less often a problem.  See L<New Diagnostics>.

=head2 Licensing

Perl has a new Social Contract for contributors.  See F<Porting/Contract>.

The license included in much of the Perl documentation has changed.
Most of the Perl documentation was previously under the implicit GNU
General Public License or the Artistic License (at the user's choice).
Now much of the documentation unambiguously states the terms under which
it may be distributed.  Those terms are in general much less restrictive
than the GNU GPL.  See L<perl> and the individual perl manpages listed
therein.

=head1 Core Changes

=head2 Threads

WARNING: Threading is considered an B<experimental> feature.  Details of the
implementation may change without notice.  There are known limitations
and some bugs.  These are expected to be fixed in future versions.

See F<README.threads>.

=head2 Compiler

WARNING: The Compiler and related tools are considered B<experimental>.
Features may change without notice, and there are known limitations
and bugs.  Since the compiler is fully external to perl, the default
configuration will build and install it.

The Compiler produces three different types of transformations of a
perl program.  The C backend generates C code that captures perl's state
just before execution begins.  It eliminates the compile-time overheads
of the regular perl interpreter, but the run-time performance remains
comparatively the same.  The CC backend generates optimized C code
equivalent to the code path at run-time.  The CC backend has greater
potential for big optimizations, but only a few optimizations are
implemented currently.  The Bytecode backend generates a platform
independent bytecode representation of the interpreter's state
just before execution.  Thus, the Bytecode back end also eliminates
much of the compilation overhead of the interpreter.

The compiler comes with several valuable utilities.

C<B::Lint> is an experimental module to detect and warn about suspicious
code, especially the cases that the C<-w> switch does not detect.

C<B::Deparse> can be used to demystify perl code, and understand
how perl optimizes certain constructs.

C<B::Xref> generates cross reference reports of all definition and use
of variables, subroutines and formats in a program.

C<B::Showlex> shows the lexical variables used by a subroutine or file
at a glance.

C<perlcc> is a simple frontend for compiling perl.

See C<ext/B/README>, L<B>, and the respective compiler modules.

=head2 Regular Expressions

Perl's regular expression engine has been seriously overhauled, and
many new constructs are supported.  Several bugs have been fixed.

Here is an itemized summary:

=over 4

=item Many new and improved optimizations

Changes in the RE engine:

	Unneeded nodes removed;
	Substrings merged together;
	New types of nodes to process (SUBEXPR)* and similar expressions
	    quickly, used if the SUBEXPR has no side effects and matches
	    strings of the same length;
	Better optimizations by lookup for constant substrings;
	Better search for constant substrings anchored by $ ;

Changes in Perl code using RE engine:

	More optimizations to s/longer/short/;
	study() was not working;
	/blah/ may be optimized to an analogue of index() if $& $` $' not seen;
	Unneeded copying of matched-against string removed;
	Only matched part of the string is copied if $` $' were not seen;

=item Many bug fixes

Note that only the major bug fixes are listed here.  See F<Changes> for others.

	Backtracking might not restore start of $3.
	No feedback if max count for * or + on "complex" subexpression
	    was reached, similarly (but at compile time) for {3,34567}
	Primitive restrictions on max count introduced to decrease a 
	    possibility of a segfault;
	(ZERO-LENGTH)* could segfault;
	(ZERO-LENGTH)* was prohibited;
	Long REs were not allowed;
	/RE/g could skip matches at the same position after a 
	  zero-length match;

=item New regular expression constructs

The following new syntax elements are supported:

	(?{ CODE })

=item New operator for precompiled regular expressions

See L<New C<qrE<sol>E<sol>> operator>.

=item Other improvements

	Better debugging output (possibly with colors),
            even from non-debugging Perl;
	RE engine code now looks like C, not like assembler;
	Behaviour of RE modifiable by `use re' directive;
	Improved documentation;
	Test suite significantly extended;
	Syntax [:^upper:] etc., reserved inside character classes;

=item Incompatible changes

	(?i) localized inside enclosing group;
	$( is not interpolated into RE any more;
	/RE/g may match at the same position (with non-zero length)
	    after a zero-length match (bug fix).

=back


See L<perlre> and L<perlop>.

=head2 Improved malloc()

See banner at the beginning of C<malloc.c> for details.

=head2 Quicksort is internally implemented

Perl now contains its own highly optimized qsort() routine.  The new qsort()
is resistant to inconsistent comparison functions, so Perl's C<sort()> will
not provoke coredumps any more when given poorly written sort subroutines.
(Some C library C<qsort()>s that were being used before used to have this
problem.)  In our testing, the new C<qsort()> required the minimal number
of pair-wise compares on average, among all known C<qsort()> implementations.

See L<perlfunc/sort>.

=head2 Reliable signals

Perl's signal handling is susceptible to random crashes, because signals
arrive asynchronously, and the Perl runtime is not reentrant at arbitrary
times.

However, one experimental implementation of reliable signals is available
when threads are enabled.  See C<Thread::Signal>.  Also see F<INSTALL> for
how to build a Perl capable of threads.

=head2 Reliable stack pointers

The internals now reallocate the perl stack only at predictable times.
In particular, magic calls never trigger reallocations of the stack,
because all reentrancy of the runtime is handled using a "stack of stacks".
This should improve reliability of cached stack pointers in the internals
and in XSUBs.

=head2 More generous treatment of carriage returns

Perl used to complain if it encountered literal carriage returns in
scripts.  Now they are mostly treated like whitespace within program text.
Inside string literals and here documents, literal carriage returns are
ignored if they occur paired with linefeeds, or get interpreted as whitespace
if they stand alone.  This behavior means that literal carriage returns
in files should be avoided.  You can get the older, more compatible (but
less generous) behavior by defining the preprocessor symbol
C<PERL_STRICT_CR> when building perl.  Of course, all this has nothing
whatever to do with how escapes like C<\r> are handled within strings.

Note that this doesn't somehow magically allow you to keep all text files
in DOS format.  The generous treatment only applies to files that perl
itself parses.  If your C compiler doesn't allow carriage returns in
files, you may still be unable to build modules that need a C compiler.

=head2 Memory leaks

C<substr>, C<pos> and C<vec> don't leak memory anymore when used in lvalue
context.  Many small leaks that impacted applications that embed multiple
interpreters have been fixed.

=head2 Better support for multiple interpreters

The build-time option C<-DMULTIPLICITY> has had many of the details
reworked.  Some previously global variables that should have been
per-interpreter now are.  With care, this allows interpreters to call
each other.  See the C<PerlInterp> extension on CPAN.

=head2 Behavior of local() on array and hash elements is now well-defined

See L<perlsub/"Temporary Values via local()">.

=head2 C<%!> is transparently tied to the L<Errno> module

See L<perlvar>, and L<Errno>.

=head2 Pseudo-hashes are supported

See L<perlref>.

=head2 C<EXPR foreach EXPR> is supported

See L<perlsyn>.

=head2 Keywords can be globally overridden

See L<perlsub>.

=head2 C<$^E> is meaningful on Win32

See L<perlvar>.

=head2 C<foreach (1..1000000)> optimized

C<foreach (1..1000000)> is now optimized into a counting loop.  It does
not try to allocate a 1000000-size list anymore.

=head2 C<Foo::> can be used as implicitly quoted package name

Barewords caused unintuitive behavior when a subroutine with the same
name as a package happened to be defined.  Thus, C<new Foo @args>
would use the result of the call to C<Foo()> instead of C<Foo> being treated
as a literal.  The recommended way to write barewords in the indirect
object slot is C<new Foo:: @args>.  Note that the method C<new()> is
called with a first argument of C<Foo>, not C<Foo::> when you do that.
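
As a rough illustration, the sketch below defines both a package
C<Foo> and a subroutine C<Foo()> in C<main>, then uses the C<Foo::>
form in the indirect object slot.  The package contents and the
C<class_seen> field are made up for this example.

    use strict;
    use warnings;

    package Foo;
    sub new {
        my ($class, @args) = @_;
        # Record the class name that new() actually received.
        return bless { class_seen => $class, args => [@args] }, $class;
    }

    package main;

    # A sub with the same name as the package: the troublesome case.
    sub Foo { return "Surprise" }

    my @args = (1, 2, 3);
    my $obj  = new Foo:: @args;     # Foo:: is always read as a package name

    print ref($obj), " ", $obj->{class_seen}, "\n";

The C<class_seen> field lets you confirm that C<new()> received
C<Foo> rather than C<Foo::> as its first argument.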

=head2 C<exists $Foo::{Bar::}> tests existence of a package

It was impossible to test for the existence of a package without
actually creating it before.  Now C<exists $Foo::{Bar::}> can be
used to test if the C<Foo::Bar> namespace has been created.
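
A minimal sketch of such a test follows.  The package names are
placeholders, and the stash key is written as a quoted string here
simply to make the trailing C<::> explicit.

    use strict;
    use warnings;

    package Foo::Bar;
    sub hello { return "hi" }

    package main;

    # Nested packages appear in the parent's symbol table under "Name::".
    my $has_bar = exists $Foo::{"Bar::"} ? 1 : 0;
    my $has_baz = exists $Foo::{"Baz::"} ? 1 : 0;

    print "Foo::Bar exists: $has_bar\n";
    print "Foo::Baz exists: $has_baz\n";

Neither test creates the package it asks about, so the second check
stays false on repeated runs.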

=head2 Better locale support

See L<perllocale>.

=head2 Experimental support for 64-bit platforms

Perl5 has always had 64-bit support on systems with 64-bit longs.
Starting with 5.005, the beginnings of experimental support for systems
with 32-bit long and 64-bit 'long long' integers have been added.
If you add -DUSE_LONG_LONG to your ccflags in config.sh (or manually
define it in perl.h) then perl will be built with 'long long' support.
There will be many compiler warnings, and the resultant perl may not
work on all systems.  There are many other issues related to
third-party extensions and libraries.  This option exists to allow
people to work on those issues.

=head2 prototype() returns useful results on builtins

See L<perlfunc/prototype>.

=head2 Extended support for exception handling

C<die()> now accepts a reference value, and C<$@> gets set to that
value in exception traps.  This makes it possible to propagate
exception objects.  This is an undocumented B<experimental> feature.
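
The following sketch shows one way this might be used.  C<MyError>
and its C<message> and C<code> methods are hypothetical names chosen
for the example, not part of perl.

    use strict;
    use warnings;

    package MyError;    # hypothetical exception class
    sub new     { my ($class, %f) = @_; return bless {%f}, $class }
    sub message { return $_[0]{message} }
    sub code    { return $_[0]{code} }

    package main;

    my $caught;
    eval {
        # die() is handed an object instead of a string.
        die MyError->new(message => "disk full", code => 28);
    };
    if (ref $@ && $@->isa('MyError')) {
        $caught = $@;   # $@ holds the very object that was thrown
        printf "caught %s (code %d)\n", $caught->message, $caught->code;
    }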

=head2 Re-blessing in DESTROY() supported for chaining DESTROY() methods

See L<perlobj/Destructors>.

=head2 All C<printf> format conversions are handled internally

See L<perlfunc/printf>.

=head2 New C<INIT> keyword

C<INIT> subs are like C<BEGIN> and C<END>, but they get run just before
the perl runtime begins execution.  For example, the Perl Compiler makes use of
C<INIT> blocks to initialize and resolve pointers to XSUBs.
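
A small sketch of the ordering, assuming the code is run as an
ordinary script (not through a string C<eval>, where it is too late
for C<INIT> to run).  The C<@order> array is only there to record what
happened.

    use strict;
    use warnings;

    our @order;

    INIT  { push @order, "INIT"  }   # after compilation, before run time
    BEGIN { push @order, "BEGIN" }   # as soon as this block is compiled

    push @order, "main";
    print join(" -> ", @order), "\n";

Although the C<INIT> block appears first in the file, the C<BEGIN>
block is the first to run.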

=head2 New C<lock> keyword

The C<lock> keyword is the fundamental synchronization primitive
in threaded perl.  When threads are not enabled, it is currently a noop.

To minimize impact on source compatibility this keyword is "weak", i.e., any
user-defined subroutine of the same name overrides it, unless a C<use Thread>
has been seen.

=head2 New C<qr//> operator

The C<qr//> operator, which is syntactically similar to the other quote-like
operators, is used to create precompiled regular expressions.  This compiled
form can now be explicitly passed around in variables, and interpolated in
other regular expressions.  See L<perlop>.
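
For instance, a compiled pattern might be stored in a variable and
reused inside a larger one.  This is just a sketch; the patterns and
test strings are arbitrary.

    use strict;
    use warnings;

    my $word = qr/[A-Za-z]+/;               # compiled once, kept in a scalar
    my $pair = qr/^($word)\s+($word)$/;     # interpolated into a larger RE

    my @found;
    for my $text ("hello world", "42 apples") {
        push @found, "$1/$2" if $text =~ $pair;
    }
    print "@found\n";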

=head2 C<our> is now a reserved word

Calling a subroutine with the name C<our> will now provoke a warning when
using the C<-w> switch.

=head2 Tied arrays are now fully supported

See L<Tie::Array>.

=head2 Tied handles support is better

Several missing hooks have been added.  There is also a new base class for
TIEARRAY implementations.  See L<Tie::Array>.

=head2 4th argument to substr

substr() can now both return and replace in one operation.  The optional
4th argument is the replacement string.  See L<perlfunc/substr>.
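
A short sketch of the four-argument form, with an arbitrary string:

    use strict;
    use warnings;

    my $s   = "Hello, world";
    # Replace 5 characters starting at offset 7; the old text is returned.
    my $old = substr($s, 7, 5, "Perl");

    print "$old\n";
    print "$s\n";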

=head2 Negative LENGTH argument to splice

splice() with a negative LENGTH argument now works similarly to what the
LENGTH did for substr().  Previously a negative LENGTH was treated as
0.  See L<perlfunc/splice>.
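
As a sketch, a LENGTH of -2 removes everything from OFFSET onward
except the last two elements:

    use strict;
    use warnings;

    my @a       = (1 .. 6);
    my @removed = splice(@a, 1, -2);   # leave the last two elements alone

    print "removed: @removed\n";
    print "left:    @a\n";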

=head2 Magic lvalues are now more magical

When you say something like C<substr($x, 5) = "hi">, the scalar returned
by substr() is special, in that any modifications to it affect $x.
(This is called a 'magic lvalue' because an 'lvalue' is something on
the left side of an assignment.)  Normally, this is exactly what you
would expect to happen, but Perl uses the same magic if you use substr(),
pos(), or vec() in a context where they might be modified, like taking
a reference with C<\> or as an argument to a sub that modifies C<@_>.
In previous versions, this 'magic' only went one way, but now changes
to the scalar the magic refers to ($x in the above example) affect the
magic lvalue too. For instance, this code now acts differently:

    $x = "hello";
    sub printit {
	$x = "g'bye";
	print $_[0], "\n";
    }
    printit(substr($x, 0, 5));

In previous versions, this would print "hello", but it now prints "g'bye".

=head2 <> now reads in records

If C<$/> is a reference to an integer, or a scalar that holds an integer,
<> will read in records instead of lines. For more info, see
L<perlvar>.
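
A sketch of record-oriented reading is shown below.  To stay
self-contained it reads from an in-memory filehandle, which assumes a
later perl (5.8 or newer); with 5.005 an ordinary file would be used
instead.

    use strict;
    use warnings;

    my $data = "abcdefghij";
    open(my $fh, '<', \$data) or die "open: $!";

    my @records;
    {
        local $/ = \4;      # each read returns at most 4 bytes
        @records = <$fh>;
    }
    close($fh);

    print join("|", @records), "\n";

The last record is simply shorter when the input length is not a
multiple of the record size.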

=head1 Supported Platforms

Configure has many incremental improvements.  Site-wide policy for building
perl can now be made persistent, via Policy.sh.  Configure also records
the command-line arguments used in F<config.sh>.

=head2 New Platforms

BeOS is now supported.  See F<README.beos>.

DOS is now supported under the DJGPP tools.  See F<README.dos> (installed 
as L<perldos> on some systems).

MiNT is now supported.  See F<README.mint>.

MPE/iX is now supported.  See F<README.mpeix>.

MVS (aka OS390, aka Open Edition) is now supported.  See F<README.os390> 
(installed as L<perlos390> on some systems).

Stratus VOS is now supported.  See F<README.vos>.

=head2 Changes in existing support

Win32 support has been vastly enhanced.  There is now support for Perl
Object, a C++ encapsulation of Perl.  GCC and EGCS are now supported on
Win32.  See F<README.win32>, aka L<perlwin32>.

VMS configuration system has been rewritten.  See F<README.vms> (installed 
as L<README_vms> on some systems).

The hints files for most Unix platforms have seen incremental improvements.

=head1 Modules and Pragmata

=head2 New Modules

=over 4

=item B

Perl compiler and tools.  See L<B>.

=item Data::Dumper

A module to pretty print Perl data.  See L<Data::Dumper>.

=item Dumpvalue

A module to dump perl values to the screen. See L<Dumpvalue>.

=item Errno

A module to look up errors more conveniently.  See L<Errno>.

=item File::Spec

A portable API for file operations.

=item ExtUtils::Installed

Query and manage installed modules.

=item ExtUtils::Packlist

Manipulate .packlist files.

=item Fatal

Make functions/builtins succeed or die.

=item IPC::SysV

Constants and other support infrastructure for System V IPC operations
in perl.

=item Test

A framework for writing testsuites.

=item Tie::Array

Base class for tied arrays.

=item Tie::Handle

Base class for tied handles.

=item Thread

Perl thread creation, manipulation, and support.

=item attrs

Set subroutine attributes.

=item fields

Compile-time class fields.

=item re

Various pragmata to control behavior of regular expressions.

=back


=head2 Changes in existing modules

=over 4

=item Benchmark

You can now run tests for I<x> seconds instead of guessing the right
number of tests to run.

Keeps better time.

=item Carp

Carp has a new function cluck(). cluck() warns, like carp(), but also adds
a stack backtrace to the error message, like confess().

=item CGI

CGI has been updated to version 2.42.

=item Fcntl

More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for
large (more than 4G) file access (the 64-bit support is not yet
working, though, so no need to get overly excited), Free/Net/OpenBSD
locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and
O_ACCMODE: the mask of O_RDONLY, O_WRONLY, and O_RDWR.

=item Math::Complex

The accessor methods Re, Im, arg, abs, rho, and theta ($z->Re()) can
now also act as mutators ($z->Re(3)).

=item Math::Trig

A little bit of radial trigonometry (cylindrical and spherical) added,
for example the great circle distance.

=item POSIX

POSIX now has its own platform-specific hints files.

=item DB_File

DB_File supports version 2.x of Berkeley DB.  See C<ext/DB_File/Changes>.

=item MakeMaker

MakeMaker now supports writing empty makefiles and provides a way to
specify that site umask() policy should be honored.  There is also
better support for manipulating .packlist files and for getting
information about installed modules.

Extensions that have both architecture-dependent and
architecture-independent files are now always installed completely in
the architecture-dependent locations.  Previously, the shareable parts
were shared both across architectures and across perl versions and were
therefore liable to be overwritten with newer versions that might have
subtle incompatibilities.

=item CPAN

See L<perlmodinstall> and L<CPAN>.

=item Cwd

Cwd::cwd is faster on most platforms.

=back
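
To illustrate the Carp entry above, here is a hedged sketch of cluck().
The subroutine names are made up for the example, and the C<__WARN__>
handler is installed only so that the message can be captured and
inspected rather than going straight to STDERR.

```perl
use strict;
use warnings;
use Carp qw(cluck);

my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };   # capture warnings

sub inner { cluck "something odd" }   # warns, with a stack backtrace
sub outer { inner() }

outer();

# Each captured message holds the warning followed by the call chain.
print STDERR $seen[0];
```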


=head1 Utility Changes

C<h2ph> and related utilities have been vastly overhauled.

C<perlcc>, a new experimental front end for the compiler, is available.

The crude GNU C<configure> emulator is now called C<configure.gnu> to
avoid trampling on C<Configure> under case-insensitive filesystems.

C<perldoc> used to be rather slow.  The slower features are now optional.
In particular, case-insensitive searches need the C<-i> switch, and
recursive searches need C<-r>.  You can set these switches in the
C<PERLDOC> environment variable to get the old behavior.

=head1 Documentation Changes

Config.pm now has a glossary of variables.

F<Porting/patching.pod> has detailed instructions on how to create and
submit patches for perl.

L<perlport> specifies guidelines on how to write portably. 

L<perlmodinstall> describes how to fetch and install modules from C<CPAN>.

Some more Perl traps are documented now.  See L<perltrap>.

L<perlopentut> gives a tutorial on using open().

L<perlreftut> gives a tutorial on references.

L<perlthrtut> gives a tutorial on threads.

=head1 New Diagnostics

=over 4

=item Ambiguous call resolved as CORE::%s(), qualify as such or use &

(W) A subroutine you have declared has the same name as a Perl keyword,
and you have used the name without qualification for calling one or the
other.  Perl decided to call the builtin because the subroutine is
not imported.

To force interpretation as a subroutine call, either put an ampersand
before the subroutine name, or qualify the name with its package.
Alternatively, you can import the subroutine (or pretend that it's
imported with the C<use subs> pragma).

To silently interpret it as the Perl operator, use the C<CORE::> prefix
on the operator (e.g. C<CORE::log($x)>) or declare the subroutine
to be an object method (see L<attrs>).

=item Bad index while coercing array into hash

(F) The index looked up in the hash found as the 0'th element of a
pseudo-hash is not legal.  Index values must be at 1 or greater.
See L<perlref>.

=item Bareword "%s" refers to nonexistent package

(W) You used a qualified bareword of the form C<Foo::>, but
the compiler saw no other uses of that namespace before that point.
Perhaps you need to predeclare a package?

=item Can't call method "%s" on an undefined value

(F) You used the syntax of a method call, but the slot filled by the
object reference or package name contains an undefined value.
Something like this will reproduce the error:

    $BADREF = undef;
    process $BADREF 1,2,3;

=item Can't check filesystem of script "%s" for nosuid

(P) For some reason you can't check the filesystem of the script for nosuid.

=item Can't coerce array into hash

(F) You used an array where a hash was expected, but the array has no
information on how to map from keys to array indices.  You can do that
only with arrays that have a hash reference at index 0.

=item Can't goto subroutine from an eval-string

(F) The "goto subroutine" call can't be used to jump out of an eval "string".
(You can use it to jump out of an eval {BLOCK}, but you probably don't want to.)

=item Can't localize pseudo-hash element

(F) You said something like C<< local $ar->{'key'} >>, where $ar is
a reference to a pseudo-hash.  That hasn't been implemented yet, but
you can get a similar effect by localizing the corresponding array
element directly -- C<< local $ar->[$ar->[0]{'key'}] >>.

=item Can't use %%! because Errno.pm is not available

(F) The first time the %! hash is used, perl automatically loads the
Errno.pm module. The Errno module is expected to tie the %! hash to
provide symbolic names for C<$!> errno values.

=item Cannot find an opnumber for "%s"

(F) A string of a form C<CORE::word> was given to prototype(), but
there is no builtin with the name C<word>.

=item Character class syntax [. .] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax beginning
with "[." and ending with ".]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[." and ".\]".

=item Character class syntax [: :] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax beginning
with "[:" and ending with ":]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[:" and ":\]".

=item Character class syntax [= =] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax
beginning with "[=" and ending with "=]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[=" and "=\]".

=item %s: Eval-group in insecure regular expression

(F) Perl detected tainted data when trying to compile a regular expression
that contains the C<(?{ ... })> zero-width assertion, which is unsafe.
See L<perlre/(?{ code })>, and L<perlsec>.

=item %s: Eval-group not allowed, use re 'eval'

(F) A regular expression contained the C<(?{ ... })> zero-width assertion,
but that construct is only allowed when the C<use re 'eval'> pragma is
in effect.  See L<perlre/(?{ code })>.

=item %s: Eval-group not allowed at run time

(F) Perl tried to compile a regular expression containing the C<(?{ ... })>
zero-width assertion at run time, as it would when the pattern contains
interpolated values.  Since that is a security risk, it is not allowed.
If you insist, you may still do this by explicitly building the pattern
from an interpolated string at run time and using that in an eval().
See L<perlre/(?{ code })>.

=item Explicit blessing to '' (assuming package main)

(W) You are blessing a reference to a zero length string.  This has
the effect of blessing the reference into the package main.  This is
usually not what you want.  Consider providing a default target
package, e.g. bless($ref, $p || 'MyPackage');

=item Illegal hex digit ignored

(W) You may have tried to use a character other than 0 - 9 or A - F in a
hexadecimal number.  Interpretation of the hexadecimal number stopped
before the illegal character.

=item No such array field

(F) You tried to access an array as a hash, but the field name used is
not defined.  The hash at index 0 should map all valid field names to
array indices for that to work.

=item No such field "%s" in variable %s of type %s

(F) You tried to access a field of a typed variable where the type
does not know about the field name.  The field names are looked up in
the %FIELDS hash in the type package at compile time.  The %FIELDS hash
is usually set up with the 'fields' pragma.

=item Out of memory during ridiculously large request

(F) You can't allocate more than 2^31+"small amount" bytes.  This error
is most likely to be caused by a typo in the Perl program, e.g.
C<$arr[time]> instead of C<$arr[$time]>.

=item Range iterator outside integer range

(F) One (or both) of the numeric arguments to the range operator ".."
are outside the range which can be represented by integers internally.
One possible workaround is to force Perl to use magical string
increment by prepending "0" to your numbers.

=item Recursive inheritance detected while looking for method '%s' %s

(F) More than 100 levels of inheritance were encountered while invoking a
method.  Probably indicates an unintended loop in your inheritance hierarchy.

=item Reference found where even-sized list expected

(W) You gave a single reference where Perl was expecting a list with
an even number of elements (for assignment to a hash). This
usually means that you used the anon hash constructor when you meant 
to use parens. In any case, a hash requires key/value B<pairs>.

    %hash = { one => 1, two => 2, };   # WRONG
    %hash = [ qw/ an anon array / ];   # WRONG
    %hash = ( one => 1, two => 2, );   # right
    %hash = qw( one 1 two 2 );         # also fine

=item Undefined value assigned to typeglob

(W) An undefined value was assigned to a typeglob, a la C<*foo = undef>.
This does nothing.  It's possible that you really mean C<undef *foo>.

=item Use of reserved word "%s" is deprecated

(D) The indicated bareword is a reserved word.  Future versions of perl
may use it as a keyword, so you're better off either explicitly quoting
the word in a manner appropriate for its context of use, or using a
different name altogether.  The warning can be suppressed for subroutine
names by either adding a C<&> prefix, or using a package qualifier,
e.g. C<&our()>, or C<Foo::our()>.

=item perl: warning: Setting locale failed.

(S) The whole warning message will look something like:

       perl: warning: Setting locale failed.
       perl: warning: Please check that your locale settings:
               LC_ALL = "En_US",
               LANG = (unset)
           are supported and installed on your system.
       perl: warning: Falling back to the standard locale ("C").

Exactly what the failed locale settings were varies.  In the above, the
settings were that LC_ALL was "En_US" and LANG had no value.
This error means that Perl detected that you and/or your system
administrator have set up the so-called locale system but Perl could
not use those settings.  This is not dead serious, fortunately: there
is a "default locale" called "C" that Perl can and will use, so the
script will be run.  Until you really fix the problem, however, you
will get the same error message each time you run Perl.  How to really
fix the problem can be found in L<perllocale/"LOCALE PROBLEMS">.

=back


=head1 Obsolete Diagnostics

=over 4

=item Can't mktemp()

(F) The mktemp() routine failed for some reason while trying to process
a B<-e> switch.  Maybe your /tmp partition is full, or clobbered.

Removed because B<-e> doesn't use temporary files any more.

=item Can't write to temp file for B<-e>: %s

(F) The write routine failed for some reason while trying to process
a B<-e> switch.  Maybe your /tmp partition is full, or clobbered.

Removed because B<-e> doesn't use temporary files any more.

=item Cannot open temporary file

(F) The create routine failed for some reason while trying to process
a B<-e> switch.  Maybe your /tmp partition is full, or clobbered.

Removed because B<-e> doesn't use temporary files any more.

=item regexp too big

(F) The current implementation of regular expressions uses shorts as
address offsets within a string.  Unfortunately this means that if
the regular expression compiles to longer than 32767, it'll blow up.
Usually when you want a regular expression this big, there is a better
way to do it with multiple statements.  See L<perlre>.

=back


=head1 Configuration Changes

You can use "Configure -Uinstallusrbinperl", which causes installperl
to skip installing perl also as /usr/bin/perl.  This is useful if you
prefer not to modify /usr/bin for some reason or another, but harmful
because many scripts assume that Perl can be found at /usr/bin/perl.

=head1 BUGS

If you find what you think is a bug, you might check the headers of
recently posted articles in the comp.lang.perl.misc newsgroup.
There may also be information at http://www.perl.com/perl/ , the Perl
Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Make sure you trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to <F<perlbug at perl.com>> to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Gurusamy Sarathy <F<gsar at activestate.com>>, with many contributions
from The Perl Porters.

Send omissions or corrections to <F<perlbug at perl.com>>.


--- NEW FILE: perlutil.pod ---
=head1 NAME

perlutil - utilities packaged with the Perl distribution

=head1 DESCRIPTION


Along with the Perl interpreter itself, the Perl distribution installs a
range of utilities on your system. There are also several utilities
which are used by the Perl distribution itself as part of the install
process. This document exists to list all of these utilities, explain
what they are for and provide pointers to each module's documentation,
if appropriate.


=over 3

=item L<perldoc|perldoc>

The main interface to Perl's documentation is C<perldoc>, although
if you're reading this, it's more than likely that you've already found
it. F<perldoc> will extract and format the documentation from any file
in the current directory, any Perl module installed on the system, or
any of the standard documentation pages, such as this one. Use 
C<perldoc E<lt>nameE<gt>> to get information on any of the utilities
described in this document.

=item L<pod2man|pod2man> and L<pod2text|pod2text>

If it's run from a terminal, F<perldoc> will usually call F<pod2man> to
translate POD (Plain Old Documentation - see L<perlpod> for an
explanation) into a manpage, and then run F<man> to display it; if
F<man> isn't available, F<pod2text> will be used instead and the output
piped through your favourite pager.

=item L<pod2html|pod2html> and L<pod2latex|pod2latex>

As well as these two, there are two other converters: F<pod2html> will
produce HTML pages from POD, and F<pod2latex> will produce LaTeX files.

=item L<pod2usage|pod2usage>

If you just want to know how to use the utilities described here,
F<pod2usage> will just extract the "USAGE" section; some of
the utilities will automatically call F<pod2usage> on themselves when
you call them with C<-help>.

=item L<podselect|podselect>

F<pod2usage> is a special case of F<podselect>, a utility to extract
named sections from documents written in POD. For instance, while
utilities have "USAGE" sections, Perl modules usually have "SYNOPSIS"
sections: C<podselect -s "SYNOPSIS" ...> will extract this section for
a given file.

=item L<podchecker|podchecker>

If you're writing your own documentation in POD, the F<podchecker>
utility will look for errors in your markup.

=item L<splain|splain>

F<splain> is an interface to L<perldiag> - paste in your error message
to it, and it'll explain it for you.

=item L<roffitall|roffitall>

The C<roffitall> utility is not installed on your system but lives in
the F<pod/> directory of your Perl source kit; it converts all the
documentation from the distribution to F<*roff> format, and produces a
typeset PostScript or text file of the whole lot.

=back



To help you convert legacy programs to Perl, we've included three
conversion filters:

=over 3

=item L<a2p|a2p>

F<a2p> converts F<awk> scripts to Perl programs; for example, C<a2p -F:>
on the simple F<awk> script C<{print $2}> will produce a Perl program
based around this code:

    while (<>) {
        ($Fld1,$Fld2) = split(/[:\n]/, $_, 9999);
        print $Fld2;
    }

=item L<s2p|s2p>

Similarly, F<s2p> converts F<sed> scripts to Perl programs. F<s2p> run
on C<s/foo/bar/> will produce a Perl program based around this:

    while (<>) {
        s/foo/bar/;
        print if $printit;
    }

=item L<find2perl|find2perl>

Finally, F<find2perl> translates C<find> commands to Perl equivalents which 
use the L<File::Find|File::Find> module. As an example, 
C<find2perl . -user root -perm 4000 -print> produces the following callback
subroutine for C<File::Find>:

    sub wanted {
        my ($dev,$ino,$mode,$nlink,$uid,$gid);
        (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
        ($uid == $uid{'root'}) &&
        (($mode & 0777) == 04000) &&
        print("$File::Find::name\n");
    }

=back
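
For comparison with the generated callback above, here is a hand-written
sketch of how a C<wanted> subroutine is driven by C<File::Find>.  The
temporary directory, the file names and the C<.txt> filter are assumptions
made purely so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway tree to search.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.log)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $dir/$name: $!";
    close($fh);
}

my @found;

# Called once per entry; $_ holds the entry's name within its directory.
sub wanted {
    push @found, $_ if -f $_ && /\.txt\z/;
}

find(\&wanted, $dir);
print "$_\n" for @found;
```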


As well as these filters for converting other languages, the
L<pl2pm|pl2pm> utility will help you convert old-style Perl 4 libraries to 
new-style Perl5 modules.

=head2 Administration

=over 3

=item L<libnetcfg|libnetcfg>

To display and change the libnet configuration, run the F<libnetcfg> command.

=back


=head2 Development

There is a set of utilities which help you in developing Perl programs,
and in particular, extending Perl with C.

=over 3

=item L<perlbug|perlbug>

F<perlbug> is the recommended way to report bugs in the perl interpreter
itself or any of the standard library modules back to the developers;
please read through the documentation for F<perlbug> thoroughly before
using it to submit a bug report.

=item L<h2ph|h2ph>

Back before Perl had the XS system for connecting with C libraries,
programmers used to get library constants by reading through the C
header files. You may still see C<require 'syscall.ph'> or similar
around - the F<.ph> file should be created by running F<h2ph> on the
corresponding F<.h> file. See the F<h2ph> documentation for more on how
to convert a whole bunch of header files at once.

=item L<c2ph|c2ph> and L<pstruct|pstruct>

F<c2ph> and F<pstruct>, which are actually the same program but behave
differently depending on how they are called, provide another way of
getting at C with Perl - they'll convert C structure and union declarations
to Perl code. This is deprecated in favour of F<h2xs> these days.

=item L<h2xs|h2xs>

F<h2xs> converts C header files into XS modules, and will try and write
as much glue between C libraries and Perl modules as it can. It's also
very useful for creating skeletons of pure Perl modules.

=item L<dprofpp|dprofpp>

Perl comes with a profiler, the F<Devel::DProf> module. The
F<dprofpp> utility analyzes the output of this profiler and tells you
which subroutines are taking up the most run time. See L<Devel::DProf>
for more information.

=item L<perlcc|perlcc>

F<perlcc> is the interface to the experimental Perl compiler suite.

=back


=head2 SEE ALSO

L<perldoc|perldoc>, L<pod2man|pod2man>, L<perlpod>,
L<pod2html|pod2html>, L<pod2usage|pod2usage>, L<podselect|podselect>,
L<podchecker|podchecker>, L<splain|splain>, L<perldiag>,
L<roffitall|roffitall>, L<a2p|a2p>, L<s2p|s2p>, L<find2perl|find2perl>,
L<File::Find|File::Find>, L<pl2pm|pl2pm>, L<perlbug|perlbug>,
L<h2ph|h2ph>, L<c2ph|c2ph>, L<h2xs|h2xs>, L<dprofpp|dprofpp>,
L<Devel::DProf>, L<perlcc|perlcc>


--- NEW FILE: roffitall ---
# Usage: roffitall [-nroff|-psroff|-groff]
# Authors: Tom Christiansen, Raphael Manfredi


if test -f ../config.sh; then
	. ../config.sh
fi

test -d $mandir || mandir=/usr/new/man/man1
test -d $libdir || libdir=/usr/new/man/man3

case "$1" in
-nroff) cmd="nroff -man"; ext='txt';;
-psroff) cmd="psroff -t"; ext='ps';;
-groff) cmd="groff -man"; ext='ps';;
*)
	echo "Usage: roffitall [-nroff|-psroff|-groff]" >&2
	exit 1
	;;
esac

# NEEDS TO BE BUILT BASED ON Makefile (or Makefile.SH, should such happen)
toroff=`
	echo		\
	$mandir/perl.1	\
	$mandir/perl5004delta.1	\
	$mandir/perl5005delta.1	\
	$mandir/perl56delta.1	\
	$mandir/perlapi.1	\
	$mandir/perlapio.1	\
	$mandir/perlbook.1	\
	$mandir/perlboot.1	\
	$mandir/perlbot.1	\
	$mandir/perlcall.1	\
	$mandir/perlcompile.1	\
	$mandir/perldata.1	\
	$mandir/perldbmfilter.1	\
	$mandir/perldebguts.1	\
	$mandir/perldebug.1	\
	$mandir/perldelta.1	\
	$mandir/perldiag.1	\
	$mandir/perldsc.1	\
	$mandir/perlembed.1	\
	$mandir/perlfaq.1	\
	$mandir/perlfaq1.1	\
	$mandir/perlfaq2.1	\
	$mandir/perlfaq3.1	\
	$mandir/perlfaq4.1	\
	$mandir/perlfaq5.1	\
	$mandir/perlfaq6.1	\
	$mandir/perlfaq7.1	\
	$mandir/perlfaq8.1	\
	$mandir/perlfaq9.1	\
	$mandir/perlfilter.1	\
	$mandir/perlfork.1	\
	$mandir/perlform.1	\
	$mandir/perlfunc.1	\
	$mandir/perlguts.1	\
	$mandir/perlhack.1	\
	$mandir/perlhist.1	\
	$mandir/perlintern.1	\
	$mandir/perlipc.1	\
	$mandir/perllexwarn.1	\
	$mandir/perllocale.1	\
	$mandir/perllol.1	\
	$mandir/perlmod.1	\
	$mandir/perlmodinstall.1	\
	$mandir/perlmodlib.1	\
	$mandir/perlnewmod.1	\
	$mandir/perlnumber.1	\
	$mandir/perlobj.1	\
	$mandir/perlop.1	\
	$mandir/perlopentut.1	\
	$mandir/perlpod.1	\
	$mandir/perlport.1	\
	$mandir/perlre.1	\
	$mandir/perlref.1	\
	$mandir/perlreftut.1	\
	$mandir/perlrequick.1	\
	$mandir/perlretut.1	\
	$mandir/perlrun.1	\
	$mandir/perlsec.1	\
	$mandir/perlstyle.1	\
	$mandir/perlsub.1	\
	$mandir/perlsyn.1	\
	$mandir/perlthrtut.1	\
	$mandir/perltie.1	\
	$mandir/perltoc.1	\
	$mandir/perltodo.1	\
	$mandir/perltooc.1	\
	$mandir/perltoot.1	\
	$mandir/perltrap.1	\
	$mandir/perlunicode.1	\
	$mandir/perlutil.1	\
	$mandir/perlvar.1	\
	$mandir/perlxs.1	\
	$mandir/perlxstut.1	\
    $mandir/a2p.1	\
    $mandir/c2ph.1	\
    $mandir/dprofpp.1	\
    $mandir/h2ph.1	\
    $mandir/h2xs.1	\
    $mandir/perlbug.1	\
    $mandir/perldoc.1	\
    $mandir/pl2pm.1	\
    $mandir/pod2html.1	\
    $mandir/pod2man.1	\
    $mandir/s2p.1	\
    $mandir/splain.1	\
    $mandir/xsubpp.1	\
    $libdir/attrs.3	\
    $libdir/autouse.3	\
    $libdir/base.3	\
    $libdir/blib.3	\
    $libdir/constant.3	\
    $libdir/diagnostics.3	\
    $libdir/fields.3	\
    $libdir/filetest.3	\
    $libdir/integer.3	\
    $libdir/less.3	\
    $libdir/lib.3	\
    $libdir/locale.3	\
    $libdir/ops.3	\
    $libdir/overload.3	\
    $libdir/re.3	\
    $libdir/sigtrap.3	\
    $libdir/strict.3	\
    $libdir/subs.3	\
    $libdir/vars.3	\
    $libdir/AnyDBM_File.3	\
    $libdir/AutoLoader.3	\
    $libdir/AutoSplit.3	\
    $libdir/B.3	\
    $libdir/B::Asmdata.3	\
    $libdir/B::Assembler.3	\
    $libdir/B::Bblock.3	\
    $libdir/B::Bytecode.3	\
    $libdir/B::C.3	\
    $libdir/B::CC.3	\
    $libdir/B::Debug.3	\
    $libdir/B::Deparse.3	\
    $libdir/B::Disassembler.3	\
    $libdir/B::Lint.3	\
    $libdir/B::Showlex.3	\
    $libdir/B::Stackobj.3	\
    $libdir/B::Terse.3	\
    $libdir/B::Xref.3	\
    $libdir/Benchmark.3	\
    $libdir/Carp.3	\
    $libdir/CGI.3	\
    $libdir/CGI::Apache.3	\
    $libdir/CGI::Carp.3	\
    $libdir/CGI::Cookie.3	\
    $libdir/CGI::Fast.3	\
    $libdir/CGI::Push.3	\
    $libdir/CGI::Switch.3	\
    $libdir/Class::Struct.3	\
    $libdir/Config.3	\
    $libdir/CPAN.3	\
    $libdir/CPAN::FirstTime.3	\
    $libdir/CPAN::Nox.3	\
    $libdir/Cwd.3	\
    $libdir/Data::Dumper.3	\
    $libdir/DB_File.3	\
    $libdir/Devel::SelfStubber.3	\
    $libdir/DirHandle.3	\
    $libdir/DynaLoader.3	\
    $libdir/Dumpvalue.3	\
    $libdir/English.3	\
    $libdir/Env.3	\
    $libdir/Errno.3	\
    $libdir/Exporter.3	\
    $libdir/ExtUtils::Command.3	\
    $libdir/ExtUtils::Embed.3	\
    $libdir/ExtUtils::Install.3	\
    $libdir/ExtUtils::Installed.3	\
    $libdir/ExtUtils::Liblist.3	\
    $libdir/ExtUtils::MakeMaker.3	\
    $libdir/ExtUtils::Manifest.3	\
    $libdir/ExtUtils::Miniperl.3	\
    $libdir/ExtUtils::Mkbootstrap.3	\
    $libdir/ExtUtils::Mksymlists.3	\
    $libdir/ExtUtils::MM_OS2.3	\
    $libdir/ExtUtils::MM_Unix.3	\
    $libdir/ExtUtils::MM_VMS.3	\
    $libdir/ExtUtils::MM_Win32.3	\
    $libdir/ExtUtils::Packlist.3	\
    $libdir/ExtUtils::testlib.3	\
    $libdir/Fatal.3	\
    $libdir/Fcntl.3	\
    $libdir/File::Basename.3	\
    $libdir/File::CheckTree.3	\
    $libdir/File::Compare.3	\
    $libdir/File::Copy.3	\
    $libdir/File::DosGlob.3	\
    $libdir/File::Find.3	\
    $libdir/File::Path.3	\
    $libdir/File::Spec.3	\
    $libdir/File::Spec::Mac.3	\
    $libdir/File::Spec::OS2.3	\
    $libdir/File::Spec::Unix.3	\
    $libdir/File::Spec::VMS.3	\
    $libdir/File::Spec::Win32.3	\
    $libdir/File::stat.3	\
    $libdir/FileCache.3	\
    $libdir/FileHandle.3	\
    $libdir/FindBin.3	\
    $libdir/GDBM_File.3	\
    $libdir/Getopt::Long.3	\
    $libdir/Getopt::Std.3	\
    $libdir/I18N::Collate.3	\
    $libdir/IO.3 \
    $libdir/IO::File.3 \
    $libdir/IO::Handle.3 \
    $libdir/IO::Pipe.3 \
    $libdir/IO::Seekable.3 \
    $libdir/IO::Select.3 \
    $libdir/IO::Socket.3 \
    $libdir/IPC::Msg.3	\
    $libdir/IPC::Open2.3	\
    $libdir/IPC::Open3.3	\
    $libdir/IPC::Semaphore.3	\
    $libdir/IPC::SysV.3	\
    $libdir/Math::BigFloat.3	\
    $libdir/Math::BigInt.3	\
    $libdir/Math::Complex.3	\
    $libdir/Math::Trig.3	\
    $libdir/NDBM_File.3	\
    $libdir/Net::hostent.3	\
    $libdir/Net::netent.3	\
    $libdir/Net::Ping.3	\
    $libdir/Net::protoent.3	\
    $libdir/Net::servent.3	\
    $libdir/O.3	\
    $libdir/Opcode.3	\
    $libdir/Pod::Html.3	\
    $libdir/Pod::Text.3	\
    $libdir/POSIX.3	\
    $libdir/Safe.3	\
    $libdir/SDBM_File.3	\
    $libdir/Search::Dict.3	\
    $libdir/SelectSaver.3	\
    $libdir/SelfLoader.3	\
    $libdir/Shell.3	\
    $libdir/Socket.3	\
    $libdir/Symbol.3	\
    $libdir/Sys::Hostname.3	\
    $libdir/Sys::Syslog.3	\
    $libdir/Term::Cap.3	\
    $libdir/Term::Complete.3	\
    $libdir/Term::ReadLine.3	\
    $libdir/Test.3	\
    $libdir/Test::Harness.3	\
    $libdir/Text::Abbrev.3	\
    $libdir/Text::ParseWords.3	\
    $libdir/Text::Soundex.3	\
    $libdir/Text::Tabs.3	\
    $libdir/Text::Wrap.3	\
    $libdir/Tie::Array.3	\
    $libdir/Tie::Handle.3	\
    $libdir/Tie::Hash.3	\
    $libdir/Tie::RefHash.3	\
    $libdir/Tie::Scalar.3	\
    $libdir/Tie::SubstrHash.3	\
    $libdir/Time::gmtime.3	\
    $libdir/Time::Local.3	\
    $libdir/Time::localtime.3	\
    $libdir/Time::tm.3		\
    $libdir/UNIVERSAL.3		\
    $libdir/User::grent.3		\
    $libdir/User::pwent.3 | \
    perl -ne 'map { -r && print "$_ " } split'`

    # Bypass internal shell buffer limit -- can't use case
    if perl -e '$a = shift; exit($a =~ m|/|)' $toroff; then
	echo "$me: empty file list -- did you run install?" >&2
	exit 1
    fi

    #psroff -t -man -rC1 -rD1 -rF1 > $tmp/PerlDoc.ps 2>$tmp/PerlTOC.raw
    #nroff -man -rC1 -rD1 -rF1 > $tmp/PerlDoc.txt 2>$tmp/PerlTOC.nr.raw

    # First, create the raw data
    run="$cmd -rC1 -rD1 -rF1 >$tmp/PerlDoc.$ext 2>$tmp/PerlTOC.$ext.raw"
    echo "$me: running $run"
    eval $run $toroff

    #Now create the TOC
    echo "$me: parsing TOC"
    ./rofftoc $tmp/PerlTOC.$ext.raw > $tmp/PerlTOC.tmp.man
    run="$cmd $tmp/PerlTOC.tmp.man >$tmp/PerlTOC.$ext"
    echo "$me: running $run"
    eval $run

    # Finally, recreate the Doc, without the blank page 0
    run="$cmd -rC1 -rD1 >$tmp/PerlDoc.$ext 2>$tmp/PerlTOC.$ext.raw"
    echo "$me: running $run"
    eval $run $toroff
    rm -f $tmp/PerlTOC.tmp.man $tmp/PerlTOC.$ext.raw
    echo "$me: leaving you with $tmp/PerlDoc.$ext and $tmp/PerlTOC.$ext"

--- NEW FILE: perlhist.pod ---
=head1 NAME

perlhist - the Perl history records

=head1 DESCRIPTION


This document aims to record the Perl source code releases.


Perl history in brief, by Larry Wall:

    Perl 0 introduced Perl to my officemates.
    Perl 1 introduced Perl to the world, and changed /\(...\|...\)/ to
        /(...|...)/.  \(Dan Faigin still hasn't forgiven me. :-\)
    Perl 2 introduced Henry Spencer's regular expression package.
    Perl 3 introduced the ability to handle binary data (embedded nulls).
    Perl 4 introduced the first Camel book.  Really.  We mostly just
        switched version numbers so the book could refer to 4.000.
    Perl 5 introduced everything else, including the ability to
        introduce everything else.


Larry Wall, Andy Dougherty, Tom Christiansen, Charles Bailey, Nick
Ing-Simmons, Chip Salzenberg, Tim Bunce, Malcolm Beattie, Gurusamy
Sarathy, Graham Barr, Jarkko Hietaniemi, Hugo van der Sanden,
Michael Schwern, Rafael Garcia-Suarez, Nicholas Clark, Richard Clamp,
Leon Brocard.

=head2 PUMPKIN?

[from Porting/pumpkin.pod in the Perl source code distribution]

Chip Salzenberg gets credit for that, with a nod to his cow orker,
David Croy.  We had passed around various names (baton, token, hot
potato) but none caught on.  Then, Chip asked:

[begin quote]

   Who has the patch pumpkin?

To explain:  David Croy once told me once that at a previous job,
there was one tape drive and multiple systems that used it for backups.
But instead of some high-tech exclusion software, they used a low-tech
method to prevent multiple simultaneous backups: a stuffed pumpkin.
No one was allowed to make backups unless they had the "backup pumpkin".

[end quote]

The name has stuck.  The holder of the pumpkin is sometimes called
the pumpking (keeping the source afloat?) or the pumpkineer (pulling
the strings?).


 Pump-  Release         Date            Notes
 king                                   (by no means
                                         comprehensive,
                                         see Changes*
                                         for details)

 Larry   0              Classified.     Don't ask.

 Larry   1.000          1987-Dec-18

          1.001..10     1988-Jan-30
          1.011..14     1988-Feb-02
 Schwern  1.0.15        2002-Dec-18     Modernization
 Richard  1.0.16        2003-Dec-18

 Larry   2.000          1988-Jun-05

          2.001         1988-Jun-28

 Larry   3.000          1989-Oct-18

          3.001         1989-Oct-26
          3.002..4      1989-Nov-11
          3.005         1989-Nov-18
          3.006..8      1989-Dec-22
          3.009..13     1990-Mar-02
          3.014         1990-Mar-13
          3.015         1990-Mar-14
          3.016..18     1990-Mar-28
          3.019..27     1990-Aug-10     User subs.
          3.028         1990-Aug-14
          3.029..36     1990-Oct-17
          3.037         1990-Oct-20
          3.040         1990-Nov-10
          3.041         1990-Nov-13
          3.042..43     1991-Jan-??
          3.044         1991-Jan-12

 Larry   4.000          1991-Mar-21

          4.001..3      1991-Apr-12
          4.004..9      1991-Jun-07
          4.010         1991-Jun-10
          4.011..18     1991-Nov-05
          4.019         1991-Nov-11     Stable.
          4.020..33     1992-Jun-08
          4.034         1992-Jun-11
          4.035         1992-Jun-23
 Larry    4.036         1993-Feb-05     Very stable.

          5.000alpha1   1993-Jul-31
          5.000alpha2   1993-Aug-16
          5.000alpha3   1993-Oct-10
          5.000alpha4   1993-???-??
          5.000alpha5   1993-???-??
          5.000alpha6   1994-Mar-18
          5.000alpha7   1994-Mar-25
 Andy     5.000alpha8   1994-Apr-04
 Larry    5.000alpha9   1994-May-05     ext appears.
          5.000alpha10  1994-Jun-11
          5.000alpha11  1994-Jul-01
 Andy     5.000a11a     1994-Jul-07     To fit 14.
          5.000a11b     1994-Jul-14
          5.000a11c     1994-Jul-19
          5.000a11d     1994-Jul-22
 Larry    5.000alpha12  1994-Aug-04
 Andy     5.000a12a     1994-Aug-08
          5.000a12b     1994-Aug-15
          5.000a12c     1994-Aug-22
          5.000a12d     1994-Aug-22
          5.000a12e     1994-Aug-22
          5.000a12f     1994-Aug-24
          5.000a12g     1994-Aug-24
          5.000a12h     1994-Aug-24
 Larry    5.000beta1    1994-Aug-30
 Andy     5.000b1a      1994-Sep-06
 Larry    5.000beta2    1994-Sep-14     Core slushified.
 Andy     5.000b2a      1994-Sep-14
          5.000b2b      1994-Sep-17
          5.000b2c      1994-Sep-17
 Larry    5.000beta3    1994-Sep-??
 Andy     5.000b3a      1994-Sep-18
          5.000b3b      1994-Sep-22
          5.000b3c      1994-Sep-23
          5.000b3d      1994-Sep-27
          5.000b3e      1994-Sep-28
          5.000b3f      1994-Sep-30
          5.000b3g      1994-Oct-04
 Andy     5.000b3h      1994-Oct-07
 Larry?   5.000gamma    1994-Oct-13?

 Larry   5.000          1994-Oct-17

 Andy     5.000a        1994-Dec-19
          5.000b        1995-Jan-18
          5.000c        1995-Jan-18
          5.000d        1995-Jan-18
          5.000e        1995-Jan-18
          5.000f        1995-Jan-18
          5.000g        1995-Jan-18
          5.000h        1995-Jan-18
          5.000i        1995-Jan-26
          5.000j        1995-Feb-07
          5.000k        1995-Feb-11
          5.000l        1995-Feb-21
          5.000m        1995-Feb-28
          5.000n        1995-Mar-07
          5.000o        1995-Mar-13?

 Larry   5.001          1995-Mar-13

 Andy     5.001a        1995-Mar-15
          5.001b        1995-Mar-31
          5.001c        1995-Apr-07
          5.001d        1995-Apr-14
          5.001e        1995-Apr-18     Stable.
          5.001f        1995-May-31
          5.001g        1995-May-25
          5.001h        1995-May-25
          5.001i        1995-May-30
          5.001j        1995-Jun-05
          5.001k        1995-Jun-06
          5.001l        1995-Jun-06     Stable.
          5.001m        1995-Jul-02     Very stable.
          5.001n        1995-Oct-31     Very unstable.
          5.002beta1    1995-Nov-21
          5.002b1a      1995-Dec-04
          5.002b1b      1995-Dec-04
          5.002b1c      1995-Dec-04
          5.002b1d      1995-Dec-04
          5.002b1e      1995-Dec-08
          5.002b1f      1995-Dec-08
 Tom      5.002b1g      1995-Dec-21     Doc release.
 Andy     5.002b1h      1996-Jan-05
          5.002b2       1996-Jan-14
 Larry    5.002b3       1996-Feb-02
 Andy     5.002gamma    1996-Feb-11
 Larry    5.002delta    1996-Feb-27

 Larry   5.002          1996-Feb-29     Prototypes.

 Charles  5.002_01      1996-Mar-25

         5.003          1996-Jun-25     Security release.

          5.003_01      1996-Jul-31
 Nick     5.003_02      1996-Aug-10
 Andy     5.003_03      1996-Aug-28
          5.003_04      1996-Sep-02
          5.003_05      1996-Sep-12
          5.003_06      1996-Oct-07
          5.003_07      1996-Oct-10
 Chip     5.003_08      1996-Nov-19
          5.003_09      1996-Nov-26
          5.003_10      1996-Nov-29
          5.003_11      1996-Dec-06
          5.003_12      1996-Dec-19
          5.003_13      1996-Dec-20
          5.003_14      1996-Dec-23
          5.003_15      1996-Dec-23
          5.003_16      1996-Dec-24
          5.003_17      1996-Dec-27
          5.003_18      1996-Dec-31
          5.003_19      1997-Jan-04
          5.003_20      1997-Jan-07
          5.003_21      1997-Jan-15
          5.003_22      1997-Jan-16
          5.003_23      1997-Jan-25
          5.003_24      1997-Jan-29
          5.003_25      1997-Feb-04
          5.003_26      1997-Feb-10
          5.003_27      1997-Feb-18
          5.003_28      1997-Feb-21
          5.003_90      1997-Feb-25     Ramping up to the 5.004 release.
          5.003_91      1997-Mar-01
          5.003_92      1997-Mar-06
          5.003_93      1997-Mar-10
          5.003_94      1997-Mar-22
          5.003_95      1997-Mar-25
          5.003_96      1997-Apr-01
          5.003_97      1997-Apr-03     Fairly widely used.
          5.003_97a     1997-Apr-05
          5.003_97b     1997-Apr-08
          5.003_97c     1997-Apr-10
          5.003_97d     1997-Apr-13
          5.003_97e     1997-Apr-15
          5.003_97f     1997-Apr-17
          5.003_97g     1997-Apr-18
          5.003_97h     1997-Apr-24
          5.003_97i     1997-Apr-25
          5.003_97j     1997-Apr-28
          5.003_98      1997-Apr-30
          5.003_99      1997-May-01
          5.003_99a     1997-May-09
          p54rc1        1997-May-12     Release Candidates.
          p54rc2        1997-May-14

 Chip    5.004          1997-May-15     A major maintenance release.

 Tim      5.004_01-t1   1997-???-??     The 5.004 maintenance track.
          5.004_01-t2   1997-Jun-11     aka perl5.004m1t2
          5.004_01      1997-Jun-13
          5.004_01_01   1997-Jul-29     aka perl5.004m2t1
          5.004_01_02   1997-Aug-01     aka perl5.004m2t2
          5.004_01_03   1997-Aug-05     aka perl5.004m2t3
          5.004_02      1997-Aug-07
          5.004_02_01   1997-Aug-12     aka perl5.004m3t1
          5.004_03-t2   1997-Aug-13     aka perl5.004m3t2
          5.004_03      1997-Sep-05
          5.004_04-t1   1997-Sep-19     aka perl5.004m4t1
          5.004_04-t2   1997-Sep-23     aka perl5.004m4t2
          5.004_04-t3   1997-Oct-10     aka perl5.004m4t3
          5.004_04-t4   1997-Oct-14     aka perl5.004m4t4
          5.004_04      1997-Oct-15
          5.004_04-m1   1998-Mar-04     (5.004m5t1) Maint. trials for 5.004_05.
          5.004_04-m2   1998-May-01
          5.004_04-m3   1998-May-15
          5.004_04-m4   1998-May-19
          5.004_05-MT5  1998-Jul-21
          5.004_05-MT6  1998-Oct-09
          5.004_05-MT7  1998-Nov-22
          5.004_05-MT8  1998-Dec-03
 Chip     5.004_05-MT9  1999-Apr-26
          5.004_05      1999-Apr-29

 Malcolm  5.004_50      1997-Sep-09     The 5.005 development track.
          5.004_51      1997-Oct-02
          5.004_52      1997-Oct-15
          5.004_53      1997-Oct-16
          5.004_54      1997-Nov-14
          5.004_55      1997-Nov-25
          5.004_56      1997-Dec-18
          5.004_57      1998-Feb-03
          5.004_58      1998-Feb-06
          5.004_59      1998-Feb-13
          5.004_60      1998-Feb-20
          5.004_61      1998-Feb-27
          5.004_62      1998-Mar-06
          5.004_63      1998-Mar-17
          5.004_64      1998-Apr-03
          5.004_65      1998-May-15
          5.004_66      1998-May-29
 Sarathy  5.004_67      1998-Jun-15
          5.004_68      1998-Jun-23
          5.004_69      1998-Jun-29
          5.004_70      1998-Jul-06
          5.004_71      1998-Jul-09
          5.004_72      1998-Jul-12
          5.004_73      1998-Jul-13
          5.004_74      1998-Jul-14     5.005 beta candidate.
          5.004_75      1998-Jul-15     5.005 beta1.
          5.004_76      1998-Jul-21     5.005 beta2.
          5.005         1998-Jul-22     Oneperl.

 Sarathy  5.005_01      1998-Jul-27     The 5.005 maintenance track.
          5.005_02-T1   1998-Aug-02
          5.005_02-T2   1998-Aug-05
          5.005_02      1998-Aug-08
 Graham   5.005_03-MT1  1998-Nov-30
          5.005_03-MT2  1999-Jan-04
          5.005_03-MT3  1999-Jan-17
          5.005_03-MT4  1999-Jan-26
          5.005_03-MT5  1999-Jan-28
          5.005_03-MT6  1999-Mar-05
          5.005_03      1999-Mar-28
 Leon     5.005_04-RC1  2004-Feb-05
          5.005_04-RC2  2004-Feb-18
          5.005_04      2004-Feb-23

 Sarathy  5.005_50      1998-Jul-26     The 5.6 development track.
          5.005_51      1998-Aug-10
          5.005_52      1998-Sep-25
          5.005_53      1998-Oct-31
          5.005_54      1998-Nov-30
          5.005_55      1999-Feb-16
          5.005_56      1999-Mar-01
          5.005_57      1999-May-25
          5.005_58      1999-Jul-27
          5.005_59      1999-Aug-02
          5.005_60      1999-Aug-02
          5.005_61      1999-Aug-20
          5.005_62      1999-Oct-15
          5.005_63      1999-Dec-09
          5.5.640       2000-Feb-02
          5.5.650       2000-Feb-08     beta1
          5.5.660       2000-Feb-22     beta2
          5.5.670       2000-Feb-29     beta3
          5.6.0-RC1     2000-Mar-09     Release candidate 1.
          5.6.0-RC2     2000-Mar-14     Release candidate 2.
          5.6.0-RC3     2000-Mar-21     Release candidate 3.
          5.6.0         2000-Mar-22

 Sarathy  5.6.1-TRIAL1  2000-Dec-18     The 5.6 maintenance track.
          5.6.1-TRIAL2  2001-Jan-31
          5.6.1-TRIAL3  2001-Mar-19
          5.6.1-foolish 2001-Apr-01     The "fools-gold" release.
          5.6.1         2001-Apr-08
 Rafael   5.6.2-RC1     2003-Nov-08
          5.6.2         2003-Nov-15     Fix new build issues

 Jarkko   5.7.0         2000-Sep-02     The 5.7 track: Development.
          5.7.1         2001-Apr-09
          5.7.2         2001-Jul-13     Virtual release candidate 0.
          5.7.3         2002-Mar-05
          5.8.0-RC1     2002-Jun-01
          5.8.0-RC2     2002-Jun-21
          5.8.0-RC3     2002-Jul-13
          5.8.0         2002-Jul-18
          5.8.1-RC1     2003-Jul-10
          5.8.1-RC2     2003-Jul-11
          5.8.1-RC3     2003-Jul-30
          5.8.1-RC4     2003-Aug-01
          5.8.1-RC5     2003-Sep-22
          5.8.1         2003-Sep-25
 Nicholas 5.8.2-RC1     2003-Oct-27
          5.8.2-RC2     2003-Nov-03
          5.8.2         2003-Nov-05
          5.8.3-RC1     2004-Jan-07
          5.8.3         2004-Jan-14
          5.8.4-RC1     2004-Apr-05
          5.8.4-RC2     2004-Apr-15
          5.8.4         2004-Apr-21
          5.8.5-RC1     2004-Jul-06
          5.8.5-RC2     2004-Jul-08
          5.8.5         2004-Jul-19
          5.8.6-RC1     2004-Nov-11
          5.8.6         2004-Nov-27
          5.8.7-RC1     2005-May-18
          5.8.7         2005-May-30
          5.8.8-RC1     2006-Jan-20
          5.8.8         2006-Jan-31

 Hugo     5.9.0         2003-Oct-27
 Rafael   5.9.1         2004-Mar-16
          5.9.2         2005-Apr-01
          5.9.3         2006-Jan-28


For example, the notation "core: 212  29" for release 1.000 means that
the core of that release comprised 212 kilobytes in 29 files.  The
categories "core" through "doc" are explained below.

 release        core       lib         ext        t         doc

 1.000           212  29      -   -      -   -     38  51     62   3
 1.014           219  29      -   -      -   -     39  52     68   4
 2.000           309  31      2   3      -   -     55  57     92   4
 2.001           312  31      2   3      -   -     55  57     94   4
 3.000           508  36     24  11      -   -     79  73    156   5
 3.044           645  37     61  20      -   -     90  74    190   6
 4.000           635  37     59  20      -   -     91  75    198   4
 4.019           680  37     85  29      -   -     98  76    199   4
 4.036           709  37     89  30      -   -     98  76    208   5
 5.000alpha2     785  50    114  32      -   -    112  86    209   5
 5.000alpha3     801  50    117  33      -   -    121  87    209   5
 5.000alpha9    1022  56    149  43    116  29    125  90    217   6
 5.000a12h       978  49    140  49    205  46    152  97    228   9
 5.000b3h       1035  53    232  70    216  38    162  94    218  21
 5.000          1038  53    250  76    216  38    154  92    536  62
 5.001m         1071  54    388  82    240  38    159  95    544  29
 5.002          1121  54    661 101    287  43    155  94    847  35
 5.003          1129  54    680 102    291  43    166 100    853  35
 5.003_07       1231  60    748 106    396  53    213 137    976  39
 5.004          1351  60   1230 136    408  51    355 161   1587  55
 5.004_01       1356  60   1258 138    410  51    358 161   1587  55
 5.004_04       1375  60   1294 139    413  51    394 162   1629  55
 5.004_05       1463  60   1435 150    394  50    445 175   1855  59
 5.004_51       1401  61   1260 140    413  53    358 162   1594  56
 5.004_53       1422  62   1295 141    438  70    394 162   1637  56
 5.004_56       1501  66   1301 140    447  74    408 165   1648  57
 5.004_59       1555  72   1317 142    448  74    424 171   1678  58
 5.004_62       1602  77   1327 144    629  92    428 173   1674  58
 5.004_65       1626  77   1358 146    615  92    446 179   1698  60
 5.004_68       1856  74   1382 152    619  92    463 187   1784  60
 5.004_70       1863  75   1456 154    675  92    494 194   1809  60
 5.004_73       1874  76   1467 152    762 102    506 196   1883  61
 5.004_75       1877  76   1467 152    770 103    508 196   1896  62
 5.005          1896  76   1469 152    795 103    509 197   1945  63
 5.005_03       1936  77   1541 153    813 104    551 201   2176  72       
 5.005_50       1969  78   1842 301    795 103    514 198   1948  63
 5.005_53       1999  79   1885 303    806 104    602 224   2002  67
 5.005_56       2086  79   1970 307    866 113    672 238   2221  75
 5.6.0          2930  80   2626 364   1096 129    868 281   2841  93
 5.7.0          2977  80   2801 425   1250 132    975 307   3206 100
 5.6.1          3049  80   3764 484   1924 159   1025 304   3593 119
 5.7.1          3351  84   3442 455   1944 167   1334 357   3698 124
 5.7.2          3491  87   4858 618   3290 298   1598 449   3910 139
 5.7.3          3415  87   5367 630  14448 410   2205 640   4491 148

The "core"..."doc" mean the following files from the Perl source code
distribution.  The glob notation ** means recursively, (.) means
regular files.

 core   *.[hcy]
 lib    lib/**/*.p[ml]
 ext    ext/**/*.{[hcyt],xs,pm}
 t      t/**/*(.) (for 1-5.005_56) or **/*.t (for 5.6.0-5.7.3)
 doc    {README*,INSTALL,*[_.]man{,.?},pod/**/*.pod}
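
The following is only a sketch of how figures like these could be
gathered; it is not the script that produced the table.  The helper
name C<tally> and its rounding to whole kilobytes are assumptions made
for illustration.  It walks a directory recursively with File::Find
and totals the size and count of regular files whose names match a
pattern.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: total size (in kB, rounded) and number of regular
# files under $dir whose names match $pattern.
sub tally {
    my ($dir, $pattern) = @_;
    my ($bytes, $files) = (0, 0);
    find(sub {
        return unless -f $_ && $_ =~ $pattern;
        $bytes += -s _;    # reuse the stat performed by -f
        $files++;
    }, $dir);
    return (int($bytes / 1024 + 0.5), $files);
}
```

Under these assumptions, C<tally('lib', qr/\.p[ml]\z/)> run at the top
of a source tree would approximate the "lib" column.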

Here are some statistics for the other subdirectories and one file in
the Perl source distribution for somewhat more selected releases.

   Legend:  kB   #

            1.014   2.001   3.044   4.000   4.019   4.036

 atarist      -  -    -  -    -  -    -  -    -  -  113 31
 Configure   31  1   37  1   62  1   73  1   83  1   86  1
 eg           -  -   34 28   47 39   47 39   47 39   47 39
 emacs        -  -    -  -    -  -   67  4   67  4   67  4
 h2pl         -  -    -  -   12 12   12 12   12 12   12 12
 hints        -  -    -  -    -  -    -  -    5 42   11 56
 msdos        -  -    -  -   41 13   57 15   58 15   60 15
 os2          -  -    -  -   63 22   81 29   81 29  113 31
 usub         -  -    -  -   21 16   25  7   43  8   43  8
 x2p        103 17  104 17  137 17  147 18  152 19  154 19


            5.000a2 5.000a12h 5.000b3h 5.000  5.001m  5.002   5.003

 atarist    113 31  113 31    -  -      -  -    -  -    -  -    -  -
 bench        -  -    0  1    -  -      -  -    -  -    -  -    -  -
 Bugs         2  5   26  1    -  -      -  -    -  -    -  -    -  -
 dlperl      40  5    -  -    -  -      -  -    -  -    -  -    -  -
 do         127 71    -  -    -  -      -  -    -  -    -  -    -  -
 Configure    -  -  153  1  159  1    160  1  180  1  201  1  201  1
 Doc          -  -   26  1   75  7     11  1   11  1    -  -    -  -
 eg          79 58   53 44   51 43     54 44   54 44   54 44   54 44
 emacs       67  4  104  6  104  6    104  1  104  6  108  1  108  1
 h2pl        12 12   12 12   12 12     12 12   12 12   12 12   12 12
 hints       11 56   12 46   18 48     18 48   44 56   73 59   77 60
 msdos       60 15   60 15    -  -      -  -    -  -    -  -    -  -
 os2        113 31  113 31    -  -      -  -    -  -   84 17   56 10
 U            -  -   62  8  112 42      -  -    -  -    -  -    -  -
 usub        43  8    -  -    -  -      -  -    -  -    -  -    -  -
 utils        -  -    -  -    -  -      -  -    -  -   87  7   88  7
 vms          -  -   80  7  123  9    184 15  304 20  500 24  475 26
 x2p        171 22  171 21  162 20    162 20  279 20  280 20  280 20


            5.003_07 5.004   5.004_04 5.004_62 5.004_65 5.004_68

 beos         -  -     -  -    -  -     -  -     1   1    1   1
 Configure  217  1   225  1  225  1   240  1   248   1  256   1
 cygwin32     -  -    23  5   23  5    23  5    24   5   24   5
 djgpp        -  -     -  -    -  -    14  5    14   5   14   5
 eg          54 44    81 62   81 62    81 62    81  62   81  62
 emacs      143  1   194  1  204  1   212  2   212   2  212   2
 h2pl        12 12    12 12   12 12    12 12    12  12   12  12
 hints       90 62   129 69  132 71   144 72   151  74  155  74
 os2        117 42   121 42  127 42   127 44   129  44  129  44
 plan9       79 15    82 15   82 15    82 15    82  15   82  15
 Porting     51  1    94  2  109  4   203  6   234   8  241   9
 qnx          -  -     1  2    1  2     1  2     1   2    1   2
 utils       97  7   112  8  118  8   124  8   156   9  159   9
 vms        505 27   518 34  524 34   538 34   569  34  569  34
 win32        -  -   285 33  378 36   470 39   493  39  575  41
 x2p        280 19   281 19  281 19   281 19   282  19  281  19


            5.004_70 5.004_73 5.004_75  5.005  5.005_03

 apollo       -   -    -   -    -   -    -   -    0   1
 beos         1   1    1   1    1   1    1   1    1   1
 Configure  256   1  256   1  264   1  264   1  270   1
 cygwin32    24   5   24   5   24   5   24   5   24   5
 djgpp       14   5   14   5   14   5   14   5   15   5
 eg          86  65   86  65   86  65   86  65   86  65
 emacs      262   2  262   2  262   2  262   2  274   2
 h2pl        12  12   12  12   12  12   12  12   12  12
 hints      157  74  157  74  159  74  160  74  179  77
 mint         -   -    -   -    -   -    -   -    4   7
 mpeix        -   -    -   -    5   3    5   3    5   3
 os2        129  44  139  44  142  44  143  44  148  44
 plan9       82  15   82  15   82  15   82  15   82  15
 Porting    241   9  253   9  259  10  264  12  272  13
 qnx          1   2    1   2    1   2    1   2    1   2
 utils      160   9  160   9  160   9  160   9  164   9
 vms        570  34  572  34  573  34  575  34  583  34
 vos          -   -    -   -    -   -    -   -  156  10
 win32      577  41  585  41  585  41  587  41  600  42
 x2p        281  19  281  19  281  19  281  19  281  19


The "diff lines kB" columns mean, for example, that the patch 5.003_08,
applied on top of 5.003_07 (or whatever preceded 5.003_08), added
lines amounting to 110 kilobytes, removed lines amounting to 19
kilobytes, and changed lines amounting to 424 kilobytes.  Only the
lines themselves are counted, not their context.  The "+ - !" markers
come from the diff(1) context diff output format.

 Pump-  Release         Date           diff lines kB
 king                                  -------------
                                          +   -   !

 Chip     5.003_08      1996-Nov-19     110  19 424
          5.003_09      1996-Nov-26      38   9 248
          5.003_10      1996-Nov-29      29   2  27
          5.003_11      1996-Dec-06      73  12 165
          5.003_12      1996-Dec-19     275   6 436
          5.003_13      1996-Dec-20      95   1  56
          5.003_14      1996-Dec-23      23   7 333
          5.003_15      1996-Dec-23       0   0   1
          5.003_16      1996-Dec-24      12   3  50
          5.003_17      1996-Dec-27      19   1  14
          5.003_18      1996-Dec-31      21   1  32
          5.003_19      1997-Jan-04      80   3  85
          5.003_20      1997-Jan-07      18   1 146
          5.003_21      1997-Jan-15      38  10 221
          5.003_22      1997-Jan-16       4   0  18
          5.003_23      1997-Jan-25      71  15 119
          5.003_24      1997-Jan-29     426   1  20
          5.003_25      1997-Feb-04      21   8 169
          5.003_26      1997-Feb-10      16   1  15
          5.003_27      1997-Feb-18      32  10  38
          5.003_28      1997-Feb-21      58   4  66
          5.003_90      1997-Feb-25      22   2  34
          5.003_91      1997-Mar-01      37   1  39
          5.003_92      1997-Mar-06      16   3  69
          5.003_93      1997-Mar-10      12   3  15
          5.003_94      1997-Mar-22     407   7 200
          5.003_95      1997-Mar-25      41   1  37
          5.003_96      1997-Apr-01     283   5 261
          5.003_97      1997-Apr-03      13   2  34
          5.003_97a     1997-Apr-05      57   1  27
          5.003_97b     1997-Apr-08      14   1  20
          5.003_97c     1997-Apr-10      20   1  16
          5.003_97d     1997-Apr-13       8   0  16
          5.003_97e     1997-Apr-15      15   4  46
          5.003_97f     1997-Apr-17       7   1  33
          5.003_97g     1997-Apr-18       6   1  42
          5.003_97h     1997-Apr-24      23   3  68
          5.003_97i     1997-Apr-25      23   1  31
          5.003_97j     1997-Apr-28      36   1  49
          5.003_98      1997-Apr-30     171  12 539
          5.003_99      1997-May-01       6   0   7
          5.003_99a     1997-May-09      36   2  61
          p54rc1        1997-May-12       8   1  11
          p54rc2        1997-May-14       6   0  40

        5.004           1997-May-15       4   0   4

 Tim      5.004_01      1997-Jun-13     222  14  57
          5.004_02      1997-Aug-07     112  16 119
          5.004_03      1997-Sep-05     109   0  17
          5.004_04      1997-Oct-15      66   8 173


Jarkko Hietaniemi <F<jhi at iki.fi>>.

Thanks to the collective memory of the Perlfolk.  In addition to the
Keepers of the Pumpkin also Alan Champion, Mark Dominus, 
Andreas König, John Macdonald, Matthias Neeracher, Jeff Okamoto,
Michael Peppler, Randal Schwartz, and Paul D. Smith sent corrections
and additions.


--- NEW FILE: perlfaq7.pod ---
=head1 NAME

perlfaq7 - General Perl Language Issues ($Revision: 1.2 $, $Date: 2006-12-04 17:01:33 $)


This section deals with general Perl language issues that don't
clearly fit into any of the other sections.

=head2 Can I get a BNF/yacc/RE for the Perl language?

There is no BNF, but you can paw your way through the yacc grammar in
perly.y in the source distribution if you're particularly brave.  The
grammar relies on very smart tokenizing code, so be prepared to
venture into toke.c as well.

In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF.
The work of parsing perl is distributed between yacc, the lexer, smoke
and mirrors."

=head2 What are all these $@%&* punctuation signs, and how do I know when to use them?

They are type specifiers, as detailed in L<perldata>:

    $ for scalar values (number, string or reference)
    @ for arrays
    % for hashes (associative arrays)
    & for subroutines (aka functions, procedures, methods)
    * for all types of that symbol name.  In version 4 you used them like
      pointers, but in modern perls you can just use references.

There are a couple of other symbols that you're likely to encounter that
aren't really type specifiers:

    <> are used for inputting a record from a filehandle.
    \  takes a reference to something.

Note that <FILE> is I<neither> the type specifier for files
nor the name of the handle.  It is the C<< <> >> operator applied
to the handle FILE.  It reads one line (well, record--see
L<perlvar/$E<sol>>) from the handle FILE in scalar context, or I<all> lines
in list context.  When performing open, close, or any other operation
besides C<< <> >> on files, or even when talking about the handle, do
I<not> use the brackets.  These are correct: C<eof(FH)>, C<seek(FH, 0,
2)> and "copying from STDIN to FILE".
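
As a small illustration of the two contexts, the sketch below reads
from an in-memory filehandle (available since Perl 5.8), which stands
in for a real file.  The variable names and the sample text are made
up for the example.

```perl
use strict;
use warnings;

# In-memory filehandle (Perl 5.8+), standing in for a real file.
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";

my $one  = <$fh>;     # scalar context: one record ("first\n")
my @rest = <$fh>;     # list context: all remaining records
close($fh);           # no angle brackets when naming the handle

print "read: $one";
print "left: ", scalar(@rest), " lines\n";
```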

=head2 Do I always/never have to quote my strings or use semicolons and commas?

Normally, a bareword doesn't need to be quoted, but in most cases
probably should be (and must be under C<use strict>).  But a hash key
consisting of a simple word (that isn't the name of a defined
subroutine) and the left-hand operand to the C<< => >> operator both
count as though they were quoted:

    This                    is like this
    ------------            ---------------
    $foo{line}              $foo{'line'}
    bar => stuff            'bar' => stuff
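
A short sketch of both cases, with a made-up hash and key; it runs
under C<use strict> because neither bareword needs explicit quotes
here:

```perl
use strict;
use warnings;

my %foo = (line => 42);              # 'line' is quoted by =>
my $bare   = $foo{line};             # simple word as hash key
my $quoted = $foo{'line'};           # explicit quotes, same key
my @keys   = keys %foo;

print "same value\n" if $bare == $quoted;
```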

The final semicolon in a block is optional, as is the final comma in a
list.  Good style (see L<perlstyle>) says to put them in except for
one-liners:

    if ($whoops) { exit 1 }
    @nums = (1, 2, 3);

    if ($whoops) {
        exit 1;
    }
    @lines = (
        "There Beren came from mountains cold",
        "And lost he wandered under leaves",
    );
=head2 How do I skip some return values?

One way is to treat the return values as a list and index into it:

        $dir = (getpwnam($user))[7];

Another way is to use undef as an element on the left-hand-side:

    ($dev, $ino, undef, undef, $uid, $gid) = stat($file);

You can also use a list slice to select only the elements that
you need:

	($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5];

=head2 How do I temporarily block warnings?

If you are running Perl 5.6.0 or better, the C<use warnings> pragma
allows fine control of which warnings are produced.
See L<perllexwarn> for more details.

    {
        no warnings;          # temporarily turn off warnings
        $a = $b + $c;         # I know these might be undef
    }

Additionally, you can enable and disable categories of warnings.
You turn off the categories you want to ignore and you can still
get other categories of warnings.  See L<perllexwarn> for the
complete details, including the category names and hierarchy.

    {
        no warnings 'uninitialized';
        $a = $b + $c;
    }
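
To see that the pragma is scoped to its enclosing block, the sketch
below collects warnings through a C<$SIG{__WARN__}> handler instead of
printing them.  The variable names are invented for the example.

```perl
use strict;
use warnings;

my @caught;
$SIG{__WARN__} = sub { push @caught, $_[0] };   # collect instead of printing

my ($p, $q);                 # both undef
{
    no warnings 'uninitialized';
    my $quiet = $p + 1;      # silenced inside this block
}
my $noisy = $q + 1;          # warns: the pragma ended with the block

print "caught ", scalar(@caught), " warning(s)\n";
```

Only the addition outside the block should reach the handler.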

If you have an older version of Perl, the C<$^W> variable (documented
in L<perlvar>) controls runtime warnings for a block:

    {
        local $^W = 0;        # temporarily turn off warnings
        $a = $b + $c;         # I know these might be undef
    }

Note that, as with all the punctuation variables, you cannot currently
use my() on C<$^W>, only local().

=head2 What's an extension?

An extension is a way of calling compiled C code from Perl.  Reading
L<perlxstut> is a good place to learn more about extensions.

=head2 Why do Perl operators have different precedence than C operators?

Actually, they don't.  All C operators that Perl copies have the same
precedence in Perl as they do in C.  The problem is with operators that C
doesn't have, especially functions that give a list context to everything
on their right, e.g. print, chmod, exec, and so on.  Such functions are
called "list operators" and appear as such in the precedence table in
L<perlop>.

A common mistake is to write:

    unlink $file || die "snafu";

This gets interpreted as:

    unlink ($file || die "snafu");

To avoid this problem, either put in extra parentheses or use the
super low precedence C<or> operator:

    (unlink $file) || die "snafu";
    unlink $file or die "snafu";

The "English" operators (C<and>, C<or>, C<xor>, and C<not>)
deliberately have precedence lower than that of list operators for
just such situations as the one above.

Another operator with surprising precedence is exponentiation.  It
binds more tightly even than unary minus, making C<-2**2> produce a
negative, not a positive, four.  It is also right-associative, meaning
that C<2**3**2> is two raised to the ninth power, not eight squared.
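
The arithmetic can be checked directly.  The variable names below are
only labels for the example:

```perl
use strict;
use warnings;

my $neg   = -2**2;       # parsed as -(2**2)
my $pos   = (-2)**2;     # parentheses force the other grouping
my $right = 2**3**2;     # parsed as 2**(3**2), i.e. 2**9
my $left  = (2**3)**2;   # eight squared

print "$neg $pos $right $left\n";    # prints "-4 4 512 64"
```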

Although it has the same precedence as in C, Perl's C<?:> operator
produces an lvalue.  This assigns $x to either $a or $b, depending
on the trueness of $maybe:

    ($maybe ? $a : $b) = $x;
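
A runnable variant of the same idea, using made-up lexical names
rather than $a and $b:

```perl
use strict;
use warnings;

my ($left, $right) = ('old', 'old');

for my $maybe (1, 0) {
    # The conditional picks which variable receives the assignment.
    ($maybe ? $left : $right) = "set when maybe=$maybe";
}

print "$left / $right\n";
```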

=head2 How do I declare/create a structure?

In general, you don't "declare" a structure.  Just use a (probably
anonymous) hash reference.  See L<perlref> and L<perldsc> for details.
Here's an example:

    $person = {};                   # new anonymous hash
    $person->{AGE}  = 24;           # set field AGE to 24
    $person->{NAME} = "Nat";        # set field NAME to "Nat"
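
The same record can also be built in one step with the anonymous hash
constructor, and fields can hold references to other structures.  The
PEERS field and its contents are invented for this sketch:

```perl
use strict;
use warnings;

my $person = {
    NAME  => "Nat",
    AGE   => 24,
    PEERS => [ "Norbert", "Rhys", "Phineas" ],   # nested array reference
};

$person->{AGE}++;                                # update a field
my $summary = "$person->{NAME} is $person->{AGE} and has "
            . scalar(@{ $person->{PEERS} }) . " peers";
print "$summary\n";
```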

If you're looking for something a bit more rigorous, try L<perltoot>.

=head2 How do I create a module?

(contributed by brian d foy)

L<perlmod>, L<perlmodlib>, L<perlmodstyle> explain modules
in all the gory details. L<perlnewmod> gives a brief
overview of the process along with a couple of suggestions
about style.

If you need to include C code or C library interfaces in
your module, you'll need h2xs.  h2xs will create the module
distribution structure and the initial interface files
you'll need.  L<perlxs> and L<perlxstut> explain the details.

If you don't need to use C code, other tools such as
ExtUtils::ModuleMaker and Module::Starter can help you
create a skeleton module distribution.

You may also want to see Sam Tregar's "Writing Perl Modules
for CPAN" ( http://apress.com/book/bookDisplay.html?bID=14 ),
which is the best hands-on guide to creating module
distributions.

=head2 How do I create a class?

See L<perltoot> for an introduction to classes and objects, as well as
L<perlobj> and L<perlbot>.

=head2 How can I tell if a variable is tainted?

You can use the tainted() function of the Scalar::Util module, available
from CPAN (or included with Perl since release 5.8.0).
See also L<perlsec/"Laundering and Detecting Tainted Data">.
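
A minimal sketch of such a check follows.  It assumes nothing beyond
Scalar::Util itself; the choice of C<$ENV{PATH}> as the external value
is only for illustration.  Without taint mode (the B<-T> switch)
nothing is reported as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

my $from_env = $ENV{PATH};     # external data: tainted when taint mode is on
my $literal  = "hello";        # program text: never tainted

for my $pair ([ 'PATH' => $from_env ], [ 'literal' => $literal ]) {
    my ($name, $value) = @$pair;
    printf "%-8s %s\n", $name, tainted($value) ? "tainted" : "not tainted";
}
```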

=head2 What's a closure?

Closures are documented in L<perlref>.

I<Closure> is a computer science term with a precise but
hard-to-explain meaning. Closures are implemented in Perl as anonymous
subroutines with lasting references to lexical variables outside their
own scopes.  These lexicals magically refer to the variables that were
around when the subroutine was defined (deep binding).

Closures make sense in any programming language where you can have the
return value of a function be itself a function, as you can in Perl.
Note that some languages provide anonymous functions but are not
capable of providing proper closures: the Python language, for
example.  For more information on closures, check out any textbook on
functional programming.  Scheme is a language that not only supports
but encourages closures.

Here's a classic function-generating function:

    sub add_function_generator {
      return sub { shift() + shift() };
    }

    $add_sub = add_function_generator();
    $sum = $add_sub->(4,5);                # $sum is 9 now.

The closure works as a I<function template> with some customization
slots left out to be filled later.  The anonymous subroutine returned
by add_function_generator() isn't technically a closure because it
refers to no lexicals outside its own scope.

Contrast this with the following make_adder() function, in which the
returned anonymous function contains a reference to a lexical variable
outside the scope of that function itself.  Such a reference requires
that Perl return a proper closure, thus locking in for all time the
value that the lexical had when the function was created.

    sub make_adder {
        my $addpiece = shift;
        return sub { shift() + $addpiece };
    }

    $f1 = make_adder(20);
    $f2 = make_adder(555);

Now C<&$f1($n)> is always 20 plus whatever $n you pass in, whereas
C<&$f2($n)> is always 555 plus whatever $n you pass in.  The $addpiece
in the closure sticks around.
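
The following sketch (make_counter() is a hypothetical name, not part
of the text above) suggests two things: each call creates a fresh
lexical, and the closure holds on to the variable itself, so it can
also update it between calls.

    sub make_counter {
        my $count = shift;                  # a fresh lexical on every call
        return sub { return $count++ };     # the closure keeps $count alive
    }

    my $c1 = make_counter(10);
    my $c2 = make_counter(100);

    print $c1->(), "\n";    # 10
    print $c1->(), "\n";    # 11
    print $c2->(), "\n";    # 100

The two counters do not interfere with each other because each
closure refers to its own C<$count>.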

Closures are often used for less esoteric purposes.  For example, when
you want to pass in a bit of code into a function:

    my $line;
    timeout( 30, sub { $line = <STDIN> } );

If the code to execute had been passed in as a string,
C<< '$line = <STDIN>' >>, there would have been no way for the
hypothetical timeout() function to access the lexical variable
$line back in its caller's scope.

=head2 What is variable suicide and how can I prevent it?

This problem was fixed in perl 5.004_05, so preventing it means upgrading
your version of perl. ;)

Variable suicide is when you (temporarily or permanently) lose the value
of a variable.  It is caused by scoping through my() and local()
interacting with either closures or aliased foreach() iterator variables
and subroutine arguments.  It used to be easy to inadvertently lose a
variable's value this way, but now it's much harder.  Take this code:

    my $f = 'foo';
    sub T {
      while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" }
    }
    T;
    print "Finally $f\n";

If you are experiencing variable suicide, that C<my $f> in the subroutine
doesn't pick up a fresh copy of the C<$f> whose value is C<foo>.  The output
shows that inside the subroutine the value of C<$f> leaks through when it
shouldn't, as in this output:

	foobar
	foobarbar
	foobarbarbar
	Finally foo

The C<$f> that has "bar" added to it three times should be a new C<$f>;
C<my $f> should create a new lexical variable each time through the loop.
The expected output is:

	foobar
	foobar
	foobar
	Finally foo

=head2 How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}?

With the exception of regexes, you need to pass references to these
objects.  See L<perlsub/"Pass by Reference"> for this particular
question, and L<perlref> for information on references.

See "Passing Regexes", below, for information on passing regular
expressions.

=over 4

=item Passing Variables and Functions

Regular variables and functions are quite easy to pass: just pass in a
reference to an existing or anonymous variable or function:

    func( \$some_scalar );

    func( \@some_array  );
    func( [ 1 .. 10 ]   );

    func( \%some_hash   );
    func( { this => 10, that => 20 }   );

    func( \&some_func   );
    func( sub { $_[0] ** $_[1] }   );

=item Passing Filehandles

As of Perl 5.6, you can represent filehandles with scalar variables
which you treat as any other scalar.

	open my $fh, '<', $filename or die "Cannot open $filename! $!";
	func( $fh );

	sub func {
		my $passed_fh = shift;

		my $line = <$passed_fh>;
	}

Before Perl 5.6, you had to use the C<*FH> or C<\*FH> notations.
These are "typeglobs"--see L<perldata/"Typeglobs and Filehandles">
and especially L<perlsub/"Pass by Reference"> for more information.

=item Passing Regexes

To pass regexes around, you'll need to be using a release of Perl
sufficiently recent as to support the C<qr//> construct, pass around
strings and use an exception-trapping eval, or else be very, very clever.

Here's an example of how to pass in a string to be regex compared
using C<qr//>:

    sub compare($$) {
        my ($val1, $regex) = @_;
        my $retval = $val1 =~ /$regex/;
        return $retval;
    }
    $match = compare("old McDonald", qr/d.*D/i);

Notice how C<qr//> allows flags at the end.  That pattern was compiled
at compile time, although it was executed later.  The nifty C<qr//>
notation wasn't introduced until the 5.005 release.  Before that, you
had to approach this problem much less intuitively.  For example, here
it is again if you don't have C<qr//>:

    sub compare($$) {
        my ($val1, $regex) = @_;
        my $retval = eval { $val1 =~ /$regex/ };
        die if $@;
        return $retval;
    }

    $match = compare("old McDonald", q/(?i)d.*D/);

Make sure you never say something like this:

    return eval "\$val =~ /$regex/";   # WRONG

or someone can sneak shell escapes into the regex due to the double
interpolation of the eval and the double-quoted string.  For example:

    $pattern_of_evil = 'danger ${ system("rm -rf * &") } danger';

    eval "\$string =~ /$pattern_of_evil/";

Those preferring to be very, very clever might see the O'Reilly book,
I<Mastering Regular Expressions>, by Jeffrey Friedl.  Page 273's
Build_MatchMany_Function() is particularly interesting.  A complete
citation of this book is given in L<perlfaq2>.

=item Passing Methods

To pass an object method into a subroutine, you can do this:

    call_a_lot(10, $some_obj, "methname");
    sub call_a_lot {
        my ($count, $widget, $trick) = @_;
        for (my $i = 0; $i < $count; $i++) {
            $widget->$trick();
        }
    }
Or, you can use a closure to bundle up the object, its
method call, and arguments:

    my $whatnot =  sub { $some_obj->obfuscate(@args) };
    func($whatnot);
    sub func {
        my $code = shift;
        &$code();
    }

You could also investigate the can() method in the UNIVERSAL class
(part of the standard perl distribution).

=back


=head2 How do I create a static variable?

(contributed by brian d foy)

Perl doesn't have "static" variables, which can only be accessed from
the function in which they are declared. You can get the same effect
with lexical variables, though.

You can fake a static variable by using a lexical variable which goes
out of scope. In this example, you define the subroutine C<counter>, and
it uses the lexical variable C<$count>. Since you wrap this in a BEGIN
block, C<$count> is defined at compile-time, but also goes out of
scope at the end of the BEGIN block. The BEGIN block also ensures that
the subroutine and the value it uses is defined at compile-time so the
subroutine is ready to use just like any other subroutine, and you can
put this code in the same place as other subroutines in the program
text (i.e. at the end of the code, typically). The subroutine
C<counter> still has a reference to the data, and is the only way you
can access the value (and each time you do, you increment the value).
The data in the chunk of memory defined by C<$count> is private to
C<counter>.

    BEGIN {
        my $count = 1;
        sub counter { $count++ }
    }

    my $start = counter();

    .... # code that calls counter();

    my $end = counter();

In the previous example, you created a function-private variable
because only one function remembered its reference. You could define
multiple functions while the variable is in scope, and each function
can share the "private" variable. It's not really "static" because you
can access it outside the function while the lexical variable is in
scope, and even create references to it. In this example,
C<increment_count> and C<return_count> share the variable. One
function adds to the value and the other simply returns the value.
They can both access C<$count>, and since it has gone out of scope,
there is no other way to access it.

    BEGIN {
        my $count = 1;
        sub increment_count { $count++ }
        sub return_count    { $count }
    }

To declare a file-private variable, you still use a lexical variable.
A file is also a scope, so a lexical variable defined in the file
cannot be seen from any other file.

See L<perlsub/"Persistent Private Variables"> for more information.
The discussion of closures in L<perlref> may help you even though we
did not use anonymous subroutines in this answer.

=head2 What's the difference between dynamic and lexical (static) scoping?  Between local() and my()?

C<local($x)> saves away the old value of the global variable C<$x>
and assigns a new value for the duration of the subroutine I<which is
visible in other functions called from that subroutine>.  This is done
at run-time, so is called dynamic scoping.  local() always affects global
variables, also called package variables or dynamic variables.

C<my($x)> creates a new variable that is only visible in the current
subroutine.  This is done at compile-time, so it is called lexical or
static scoping.  my() always affects private variables, also called
lexical variables or (improperly) static(ly scoped) variables.

For instance:

    sub visible {
	print "var has value $var\n";
    }

    sub dynamic {
	local $var = 'local';	# new temporary value for the still-global
	visible();              #   variable called $var
    }

    sub lexical {
	my $var = 'private';    # new private variable, $var
	visible();              # (invisible outside of sub scope)
    }

    $var = 'global';

    visible();      		# prints global
    dynamic();      		# prints local
    lexical();      		# prints global

Notice how at no point does the value "private" get printed.  That's
because $var only has that value within the block of the lexical()
function, and it is hidden from the called subroutine.

In summary, local() doesn't make what you think of as private, local
variables.  It gives a global variable a temporary value.  my() is
what you're looking for if you want private variables.

See L<perlsub/"Private Variables via my()"> and
L<perlsub/"Temporary Values via local()"> for excruciating details.
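
A short sketch of the restore-on-exit behavior of local() follows.
The names C<$level>, show(), and nested() are made up for this
illustration, and C<our> assumes Perl 5.6 or later.

    our $level = 'outer';

    sub show { return $level }          # reads the global $level

    sub nested {
        local $level = 'inner';         # temporary value, seen by callees
        return show();
    }

    print nested(), "\n";   # inner
    print show(),   "\n";   # outer (old value restored when nested() exits)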

=head2 How can I access a dynamic variable while a similarly named lexical is in scope?

If you know your package, you can just mention it explicitly, as in
$Some_Pack::var. Note that the notation $::var is B<not> the dynamic $var
in the current package, but rather the one in the "main" package, as
though you had written $main::var.

	use vars '$var';
	local $var = "global";
	my    $var = "lexical";

	print "lexical is $var\n";
	print "global  is $main::var\n";

Alternatively you can use the compiler directive our() to bring a
dynamic variable into the current lexical scope.

	require 5.006; # our() did not exist before 5.6
	use vars '$var';

	local $var = "global";
	my $var    = "lexical";

	print "lexical is $var\n";

	{
	  our $var;
	  print "global  is $var\n";
	}

=head2 What's the difference between deep and shallow binding?

In deep binding, lexical variables mentioned in anonymous subroutines
are the same ones that were in scope when the subroutine was created.
In shallow binding, they are whichever variables with the same names
happen to be in scope when the subroutine is called.  Perl always uses
deep binding of lexical variables (i.e., those created with my()).
However, dynamic variables (aka global, local, or package variables)
are effectively shallowly bound.  Consider this just one more reason
not to use them.  See the answer to L<"What's a closure?">.
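
The contrast can be sketched as follows.  The names C<$dyn>, C<$lex>,
and make_reader() are hypothetical and exist only for this example.

    our $dyn = 'global';

    sub make_reader {
        my $lex = 'captured';
        local $dyn = 'temporary';
        # $lex is bound when the anonymous sub is created;
        # $dyn is looked up each time the sub is called.
        return sub { return "$lex/$dyn" };
    }

    my $reader = make_reader();
    my $lex = 'other';              # a different variable entirely
    print $reader->(), "\n";        # captured/global

By the time the closure runs, the local() in make_reader() has been
undone, so the closure sees whatever value C<$dyn> has at call time.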

=head2 Why doesn't "my($foo) = E<lt>FILEE<gt>;" work right?

C<my()> and C<local()> give list context to the right hand side
of C<=>.  The C<< <FH> >> read operation, like so many of Perl's
functions and operators, can tell which context it was called in and
behaves appropriately.  In general, the scalar() function can help.
This function does nothing to the data itself (contrary to popular myth)
but rather tells its argument to behave in whatever its scalar fashion is.
If that function doesn't have a defined scalar behavior, this of course
doesn't help you (such as with sort()).

To enforce scalar context in this particular case, however, you need
merely omit the parentheses:

    local($foo) = <FILE>;	    # WRONG
    local($foo) = scalar(<FILE>);   # ok
    local $foo  = <FILE>;	    # right

You should probably be using lexical variables anyway, although the
issue is the same here:

    my($foo) = <FILE>;	# WRONG
    my $foo  = <FILE>;	# right
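
To see the difference, here is a sketch that reads from an in-memory
filehandle (available as of Perl 5.8).  The variable names are
illustrative.

    my $data = "one\ntwo\nthree\n";

    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my ($first) = <$fh>;    # list context: reads ALL lines, keeps one
    my $next    = <$fh>;    # nothing left to read, so this is undef
    close $fh;

    open $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $line1 = <$fh>;      # scalar context: reads a single line
    my $line2 = <$fh>;      # "two\n" is still available
    close $fh;

In the first half, the remaining lines are silently discarded, which
is usually not what was intended.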

=head2 How do I redefine a builtin function, operator, or method?

Why do you want to do that? :-)

If you want to override a predefined function, such as open(),
then you'll have to import the new definition from a different
module.  See L<perlsub/"Overriding Built-in Functions">.  There's
also an example in L<perltoot/"Class::Template">.

If you want to overload a Perl operator, such as C<+> or C<**>,
then you'll want to use the C<use overload> pragma, documented
in L<overload>.
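
As a minimal sketch of operator overloading, consider a hypothetical
C<Vec2> class (the class and its method names are invented for this
example) that overloads C<+> and stringification:

    package Vec2;
    use overload
        '+'  => \&add,
        '""' => \&to_string;

    sub new {
        my ($class, $x, $y) = @_;
        return bless { x => $x, y => $y }, $class;
    }

    sub add {
        my ($left, $right) = @_;
        return Vec2->new($left->{x} + $right->{x},
                         $left->{y} + $right->{y});
    }

    sub to_string {
        my $self = shift;
        return "($self->{x}, $self->{y})";
    }

    package main;

    my $sum = Vec2->new(1, 2) + Vec2->new(3, 4);
    print "$sum\n";     # (4, 6)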

If you're talking about obscuring method calls in parent classes,
see L<perltoot/"Overridden Methods">.

=head2 What's the difference between calling a function as &foo and foo()?

When you call a function as C<&foo>, you allow that function access to
your current @_ values, and you bypass prototypes.
The function doesn't get an empty @_--it gets yours!  While not
strictly speaking a bug (it's documented that way in L<perlsub>), it
would be hard to consider this a feature in most cases.

When you call your function as C<&foo()>, then you I<do> get a new @_,
but prototyping is still circumvented.

Normally, you want to call a function using C<foo()>.  You may only
omit the parentheses if the function is already known to the compiler
because it already saw the definition (C<use> but not C<require>),
or via a forward reference or C<use subs> declaration.  Even in this
case, you get a clean @_ without any of the old values leaking through
where they don't belong.
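
The following sketch shows the @_ pass-through.  The subroutine names
count_args() and caller_sub() are hypothetical.

    sub count_args { return scalar @_ }

    sub caller_sub {
        my $with_amp    = &count_args;      # sees caller_sub's own @_
        my $with_parens = count_args();     # gets a fresh, empty @_
        return ($with_amp, $with_parens);
    }

    my ($amp, $parens) = caller_sub('a', 'b', 'c');
    print "$amp $parens\n";     # 3 0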

=head2 How do I create a switch or case statement?

This is explained in more depth in L<perlsyn>.  Briefly, there's
no official case statement, because of the variety of tests possible
in Perl (numeric comparison, string comparison, glob comparison,
regex matching, overloaded comparisons, ...).
Larry couldn't decide how best to do this, so he left it out, even
though it's been on the wish list since perl1.

Starting with Perl 5.8, you can get switch and case by loading the
Switch extension:

	use Switch;

after which one has switch and case.  It is not as fast as it could be
because it's not really part of the language (it's done using source
filters) but it is available, and it's very flexible.

But if one wants to use pure Perl, the general answer is to write a
construct like this:

    for ($variable_to_test) {
	if    (/pat1/)  { }     # do something
	elsif (/pat2/)  { }     # do something else
	elsif (/pat3/)  { }     # do something else
	else            { }     # default
    }

Here's a simple example of a switch based on pattern matching, this
time lined up in a way to make it look more like a switch statement.
We'll do a multiway conditional based on the type of reference stored
in $whatchamacallit:

    SWITCH: for (ref $whatchamacallit) {

	/^$/		&& die "not a reference";

	/SCALAR/	&& do {
				# handle a scalar reference here
				last SWITCH;
			};

	/ARRAY/		&& do {
				# handle an array reference here
				last SWITCH;
			};

	/HASH/		&& do {
				# handle a hash reference here
				last SWITCH;
			};

	/CODE/		&& do {
				warn "can't print function ref";
				last SWITCH;
			};

	# DEFAULT

	warn "User defined type skipped";

    }

See L<perlsyn/"Basic BLOCKs and Switch Statements"> for many other
examples in this style.

Sometimes you should change the positions of the constant and the variable.
For example, let's say you wanted to test which of many answers you were
given, but in a case-insensitive way that also allows abbreviations.
You can use the following technique if the strings all start with
different characters or if you want to arrange the matches so that
one takes precedence over another, as C<"SEND"> has precedence over
C<"STOP"> here:

    chomp($answer = <>);
    if    ("SEND"  =~ /^\Q$answer/i) { print "Action is send\n"  }
    elsif ("STOP"  =~ /^\Q$answer/i) { print "Action is stop\n"  }
    elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" }
    elsif ("LIST"  =~ /^\Q$answer/i) { print "Action is list\n"  }
    elsif ("EDIT"  =~ /^\Q$answer/i) { print "Action is edit\n"  }

A totally different approach is to create a hash of function references.

    my %commands = (
        "happy" => \&joy,
        "sad"   => \&sullen,
        "done"  => sub { die "See ya!" },
        "mad"   => \&angry,
    );

    print "How are you? ";
    chomp($string = <STDIN>);
    if ($commands{$string}) {
        $commands{$string}->();
    } else {
        print "No such command: $string\n";
    }

=head2 How can I catch accesses to undefined variables, functions, or methods?

The AUTOLOAD method, discussed in L<perlsub/"Autoloading"> and
L<perltoot/"AUTOLOAD: Proxy Methods">, lets you capture calls to
undefined functions and methods.

When it comes to undefined variables that would trigger a warning
under C<use warnings>, you can promote the warning to an error.

	use warnings FATAL => qw(uninitialized);
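
For undefined methods, a bare-bones AUTOLOAD might be sketched like
this.  The C<Greeter> package and the frobnicate() call are invented
for the example, and a real AUTOLOAD would normally do something more
useful than return a message.

    package Greeter;
    our $AUTOLOAD;

    sub new { return bless {}, shift }

    sub AUTOLOAD {
        my $self = shift;
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;              # strip the package qualifier
        return if $name eq 'DESTROY';   # don't intercept object destruction
        return "no method '$name' defined";
    }

    package main;

    my $message = Greeter->new->frobnicate;
    print "$message\n";     # no method 'frobnicate' defined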

=head2 Why can't a method included in this same file be found?

Some possible reasons: your inheritance is getting confused, you've
misspelled the method name, or the object is of the wrong type.  Check
out L<perltoot> for details about any of the above cases.  You may
also use C<print ref($object)> to find out the class C<$object> was
blessed into.

Another possible reason for problems is that you've used the
indirect object syntax (e.g., C<find Guru "Samy">) on a class name
before Perl has seen that such a package exists.  It's wisest to make
sure your packages are all defined before you start using them, which
will be taken care of if you use the C<use> statement instead of
C<require>.  If not, make sure to use arrow notation (e.g.,
C<< Guru->find("Samy") >>) instead.  Object notation is explained in
L<perlobj>.
Make sure to read about creating modules in L<perlmod> and
the perils of indirect objects in L<perlobj/"Method Invocation">.

=head2 How can I find out my current package?

If you're just a random program, you can do this to find
out what the currently compiled package is:

    my $packname = __PACKAGE__;

But, if you're a method and you want to print an error message
that includes the kind of object you were called on (which is
not necessarily the same as the one in which you were compiled):

    sub amethod {
	my $self  = shift;
	my $class = ref($self) || $self;
	warn "called me from a $class object";
    }

=head2 How can I comment out a large block of perl code?

You can use embedded POD to discard it.  Enclose the blocks you want
to comment out in POD markers.  The C<=begin> directive marks a section
for a specific formatter.  Use the C<comment> format, which no formatter
should claim to understand (by policy).  Mark the end of the block
with C<=end>.

    # program is here

    =begin comment

    all of this stuff

    here will be ignored
    by everyone

    =end comment

    =cut


    # program continues

The pod directives cannot go just anywhere.  You must put a
pod directive where the parser is expecting a new statement,
not just in the middle of an expression or some other
arbitrary grammar production.

See L<perlpod> for more details.

=head2 How do I clear a package?

Use this code, provided by Mark-Jason Dominus:

    sub scrub_package {
	no strict 'refs';
	my $pack = shift;
	die "Shouldn't delete main package"
	    if $pack eq "" || $pack eq "main";
	my $stash = *{$pack . '::'}{HASH};
	my $name;
	foreach $name (keys %$stash) {
	    my $fullname = $pack . '::' . $name;
	    # Get rid of everything with that name.
	    undef $$fullname;
	    undef @$fullname;
	    undef %$fullname;
	    undef &$fullname;
	    undef *$fullname;
	}
    }

Or, if you're using a recent release of Perl, you can
just use the Symbol::delete_package() function instead.

=head2 How can I use a variable as a variable name?

Beginners often think they want to have a variable contain the name
of a variable.

    $fred    = 23;
    $varname = "fred";
    ++$$varname;         # $fred now 24

This works I<sometimes>, but it is a very bad idea for two reasons.

The first reason is that this technique I<only works on global
variables>.  That means that if $fred is a lexical variable created
with my() in the above example, the code wouldn't work at all: you'd
accidentally access the global and skip right over the private lexical
altogether.  Global variables are bad because they can easily collide
accidentally and in general make for non-scalable and confusing code.

Symbolic references are forbidden under the C<use strict> pragma.
They are not true references and consequently are not reference counted
or garbage collected.

The other reason why using a variable to hold the name of another
variable is a bad idea is that the question often stems from a lack of
understanding of Perl data structures, particularly hashes.  By using
symbolic references, you are just using the package's symbol-table hash
(like C<%main::>) instead of a user-defined hash.  The solution is to
use your own hash or a real reference instead.

    $USER_VARS{"fred"} = 23;
    $varname = "fred";
    $USER_VARS{$varname}++;  # not $$varname++

There we're using the %USER_VARS hash instead of symbolic references.
Sometimes this comes up in reading strings from the user with variable
references and wanting to expand them to the values of your perl
program's variables.  This is also a bad idea because it conflates the
program-addressable namespace and the user-addressable one.  Instead of
reading a string and expanding it to the actual contents of your program's
own variables:

    $str = 'this has a $fred and $barney in it';
    $str =~ s/(\$\w+)/$1/eeg;		  # need double eval

it would be better to keep a hash around like %USER_VARS and have
variable references actually refer to entries in that hash:

    $str =~ s/\$(\w+)/$USER_VARS{$1}/g;   # no /e here at all

That's faster, cleaner, and safer than the previous approach.  Of course,
you don't need to use a dollar sign.  You could use your own scheme to
make it less confusing, like bracketed percent symbols, etc.

    $str = 'this has a %fred% and %barney% in it';
    $str =~ s/%(\w+)%/$USER_VARS{$1}/g;   # no /e here at all

Another reason that folks sometimes think they want a variable to
contain the name of a variable is because they don't know how to build
proper data structures using hashes.  For example, let's say they
wanted two hashes in their program: %fred and %barney, and that they
wanted to use another scalar variable to refer to those by name.

    $name = "fred";
    $$name{WIFE} = "wilma";     # set %fred

    $name = "barney";
    $$name{WIFE} = "betty";	# set %barney

This is still a symbolic reference, and is still saddled with the
problems enumerated above.  It would be far better to write:

    $folks{"fred"}{WIFE}   = "wilma";
    $folks{"barney"}{WIFE} = "betty";

And just use a multilevel hash to start with.

The only times that you absolutely I<must> use symbolic references are
when you really must refer to the symbol table.  This may be because it's
something that you can't take a real reference to, such as a format name.
Doing so may also be important for method calls, since these always go
through the symbol table for resolution.

In those cases, you would turn off C<strict 'refs'> temporarily so you
can play around with the symbol table.  For example:

    @colors = qw(red blue green yellow orange purple violet);
    for my $name (@colors) {
        no strict 'refs';  # renege for the block
        *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

All those functions (red(), blue(), green(), etc.) appear to be separate,
but the real code in the closure actually was compiled only once.

So, sometimes you might want to use symbolic references to directly
manipulate the symbol table.  This doesn't matter for formats, handles, and
subroutines, because they are always global--you can't use my() on them.
For scalars, arrays, and hashes, though--and usually for subroutines--
you probably only want to use hard references.

=head2 What does "bad interpreter" mean?

(contributed by brian d foy)

The "bad interpreter" message comes from the shell, not perl.  The
actual message may vary depending on your platform, shell, and locale
settings.

If you see "bad interpreter - no such file or directory", the first
line in your perl script (the "shebang" line) does not contain the
right path to perl (or any other program capable of running scripts).
Sometimes this happens when you move the script from one machine to
another and each machine has a different path to perl---/usr/bin/perl
versus /usr/local/bin/perl for instance. It may also indicate
that the source machine has CRLF line terminators and the
destination machine has LF only: the shell tries to find
/usr/bin/perl<CR>, but can't.

If you see "bad interpreter: Permission denied", you need to make your
script executable.

In either case, you should still be able to run the scripts with perl
explicitly:

	% perl script.pl

If you get a message like "perl: command not found", perl is not in
your PATH, which might also mean that the location of perl is not
where you expect it so you need to adjust your shebang line.


=head1 AUTHOR AND COPYRIGHT

Copyright (c) 1997-2006 Tom Christiansen, Nathan Torkington, and
other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file
are hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun
or for profit as you see fit.  A simple comment in the code giving
credit would be courteous but is not required.

--- NEW FILE: perlnumber.pod ---
=head1 NAME

perlnumber - semantics of numbers and numeric operations in Perl

=head1 SYNOPSIS


    $n = 1234;		    # decimal integer
    $n = 0b1110011;	    # binary integer
    $n = 01234;		    # octal integer
    $n = 0x1234;	    # hexadecimal integer
    $n = 12.34e-56;	    # exponential notation
    $n = "-12.34e56";	    # number specified as a string
    $n = "1234";	    # number specified as a string


=head1 DESCRIPTION

This document describes how Perl internally handles numeric values.

Perl's operator overloading facility is completely ignored here.  Operator
overloading allows user-defined behaviors for numbers, such as operations
over arbitrarily large integers, floating point numbers with arbitrary
precision, operations over "exotic" numbers such as modular arithmetic or
p-adic arithmetic, and so on.  See L<overload> for details.

=head1 Storing numbers

Perl can internally represent numbers in 3 different ways: as native
integers, as native floating point numbers, and as decimal strings.
Decimal strings may have an exponential notation part, as in C<"12.34e-56">.
I<Native> here means "a format supported by the C compiler which was used
to build perl".

The term "native" does not mean quite as much when we talk about native
integers, as it does when native floating point numbers are involved.
The only implication of the term "native" on integers is that the limits for
the maximal and the minimal supported true integral quantities are close to
powers of 2.  However, "native" floats have a most fundamental
restriction: they may represent only those numbers which have a relatively
"short" representation when converted to a binary fraction.  For example,
0.9 cannot be represented by a native float, since the binary fraction
for 0.9 is infinite:

  binary0.1110011001100...


with the sequence C<1100> repeating again and again.  In addition to this
limitation,  the exponent of the binary number is also restricted when it
is represented as a floating point number.  On typical hardware, floating
point values can store numbers with up to 53 binary digits, and with binary
exponents between -1024 and 1024.  In decimal representation this is close
to 16 decimal digits and decimal exponents in the range of -304..304.
The upshot of all this is that Perl cannot store a number like
12345678901234567 as a floating point number on such architectures without
loss of information.
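
A small sketch of these limits, assuming IEEE 754 double precision
floats as found on typical hardware.  Exact digits may differ on
builds that use a wider floating point type.

    # Shows the nearest representable value, which is not exactly 0.9.
    printf "%.20f\n", 0.9;

    # Rounding errors in the operands show up in the comparison.
    my $sum = 0.1 + 0.2;
    print "0.1 + 0.2 differs from 0.3\n" if $sum != 0.3;

    # 17 significant digits do not fit in a 53-bit mantissa, so the
    # stored value may be a neighboring number.
    my $big = 12345678901234567.0;
    printf "%.0f\n", $big;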

Similarly, decimal strings can represent only those numbers which have a
finite decimal expansion.  Being strings, and thus of arbitrary length, there
is no practical limit for the exponent or number of decimal digits for these
numbers.  (But realize that we are discussing the rules for just the
I<storage> of these numbers.  The fact that you can store such "large" numbers
does not mean that the I<operations> over these numbers will use all
of the significant digits.
See L<"Numeric operators and numeric conversions"> for details.)

In fact numbers stored in the native integer format may be stored either
in the signed native form, or in the unsigned native form.  Thus the limits
for Perl numbers stored as native integers would typically be -2**31..2**32-1,
with appropriate modifications in the case of 64-bit integers.  Again, this
does not mean that Perl can do operations only over integers in this range:
it is possible to store many more integers in floating point format.

Summing up, Perl numeric values can store only those numbers which have
a finite decimal expansion or a "short" binary expansion.

=head1 Numeric operators and numeric conversions

As mentioned earlier, Perl can store a number in any one of three formats,
but most operators typically understand only one of those formats.  When
a numeric value is passed as an argument to such an operator, it will be
converted to the format understood by the operator.

Six such conversions are possible:

  native integer        --> native floating point	(*)
  native integer        --> decimal string
  native floating_point --> native integer		(*)
  native floating_point --> decimal string		(*)
  decimal string        --> native integer
  decimal string        --> native floating point	(*)

These conversions are governed by the following general rules:

=over 4

=item *

If the source number can be represented in the target form, that
representation is used.

=item *

If the source number is outside of the limits representable in the target form,
a representation of the closest limit is used.  (I<Loss of information>)

=item *

If the source number is between two numbers representable in the target form,
a representation of one of these numbers is used.  (I<Loss of information>)

=item *

In C<< native floating point --> native integer >> conversions the magnitude
of the result is less than or equal to the magnitude of the source.
(I<"Rounding to zero".>)

=item *

If the C<< decimal string --> native integer >> conversion cannot be done
without loss of information, the result is compatible with the conversion
sequence C<< decimal_string --> native_floating_point --> native_integer >>.
In particular, rounding is strongly biased to 0, though a number like
C<"0.99999999999999999999"> has a chance of being rounded to 1.

=back


B<RESTRICTION>: The conversions marked with C<(*)> above involve steps
performed by the C compiler.  In particular, bugs/features of the compiler
used may lead to breakage of some of the above rules.
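
A few of these rules can be observed from Perl code.  In this sketch,
int() is used as a visible stand-in for the float-to-integer
conversion; it is an illustration, not a view into the internals.

    print int(3.7),  "\n";      # 3
    print int(-3.7), "\n";      # -3, rounded toward zero rather than to -4

    # A decimal string is converted when it is used as a number.
    my $str = "1.5e3";
    print $str + 0, "\n";       # 1500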

=head1 Flavors of Perl numeric operations

Perl operations which take a numeric argument treat that argument in one
of four different ways: they may force it to one of the integer/floating/
string formats, or they may behave differently depending on the format of
the operand.  Forcing a numeric value to a particular format does not
change the number stored in the value.

All the operators which need an argument in the integer format treat the
argument as in modular arithmetic, e.g., C<mod 2**32> on a 32-bit
architecture.  C<sprintf "%u", -1> therefore provides the same result as
C<sprintf "%u", ~0>.
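
This can be checked directly.  The printed value depends on the
integer size perl was built with, so the comment below is only what
one would expect with 32-bit integers.

    my $from_minus_one = sprintf "%u", -1;
    my $from_all_ones  = sprintf "%u", ~0;

    print "$from_minus_one\n";      # 4294967295 with 32-bit integers
    print "same\n" if $from_minus_one eq $from_all_ones;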

=over 4

=item Arithmetic operators

The binary operators C<+> C<-> C<*> C</> C<%> C<==> C<!=> C<E<gt>> C<E<lt>>
C<E<gt>=> C<E<lt>=> and the unary operators C<-> C<abs> and C<--> will
attempt to convert arguments to integers.  If both conversions are possible
without loss of precision, and the operation can be performed without
loss of precision then the integer result is used.  Otherwise arguments are
converted to floating point format and the floating point result is used.
The caching of conversions (as described below) means that the integer
conversion does not throw away fractional parts on floating point numbers.

=item ++

C<++> behaves as the other operators above, except that if it is a string
matching the format C</^[a-zA-Z]*[0-9]*\z/> the string increment described
in L<perlop> is used.

=item Arithmetic operators during C<use integer>

In scopes where C<use integer;> is in force, nearly all the operators listed
above will force their argument(s) into integer format, and return an integer
result.  The exceptions, C<abs>, C<++> and C<-->, do not change their
behavior with C<use integer;>.

=item Other mathematical operators

Operators such as C<**>, C<sin> and C<exp> force arguments to floating point
format.

=item Bitwise operators

Arguments are forced into the integer format if not strings.

=item Bitwise operators during C<use integer>

forces arguments to integer format. Also shift operations internally use
signed integers rather than the default unsigned.

=item Operators which expect an integer

force the argument into the integer format.  This is applicable
to the third and fourth arguments of C<sysread>, for example.

=item Operators which expect a string

force the argument into the string format.  For example, this is
applicable to C<printf "%s", $value>.

=back


Though forcing an argument into a particular form does not change the
stored number, Perl remembers the result of such conversions.  In
particular, though the first such conversion may be time-consuming,
repeated operations will not need to redo the conversion.
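A few of the flavors listed above can be observed from plain Perl.  This is a rough sketch with made-up variable names, not an exhaustive demonstration: it contrasts default division with division under C<use integer>, and shows the string form of C<++>.

```perl
use strict;
use warnings;

# Default arithmetic: 7/2 has no exact integer result, so the
# floating point result is used.
my $default_div = 7 / 2;

# Inside the scope of "use integer" the operands are forced to
# integer format and an integer result is returned.
my $integer_div;
{
    use integer;
    $integer_div = 7 / 2;
}

# "++" on strings matching /^[a-zA-Z]*[0-9]*\z/ uses the string
# increment described in perlop.
my $label = "az";
$label++;
my $code = "a9";
$code++;

print "$default_div $integer_div $label $code\n";
```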

=head1 AUTHOR

Ilya Zakharevich C<ilya at math.ohio-state.edu>

Editorial adjustments by Gurusamy Sarathy <gsar at ActiveState.com>

Updates for 5.8.0 by Nicholas Clark <nick at ccl4.org>

=head1 SEE ALSO

L<overload>, L<perlop>

--- NEW FILE: perl586delta.pod ---
=head1 NAME

perl586delta - what is new for perl v5.8.6

=head1 DESCRIPTION

This document describes differences between the 5.8.5 release and
the 5.8.6 release.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.5.

=head1 Core Enhancements

The perl interpreter is now more tolerant of UTF-16-encoded scripts.

On Win32, Perl can now use non-IFS compatible LSPs, which allows Perl to
work in conjunction with firewalls such as McAfee Guardian. For full details
see the file F<README.win32>, particularly if you're running Win95.

=head1 Modules and Pragmata

=over 4

=item *

With the C<base> pragma, an intermediate class with no fields used to mess
up private fields in the base class. This has been fixed.

=item *

Cwd upgraded to version 3.01 (as part of the new PathTools distribution)

=item *

Devel::PPPort upgraded to version 3.03

=item *

File::Spec upgraded to version 3.01 (as part of the new PathTools distribution)

=item *

Encode upgraded to version 2.08

=item *

ExtUtils::MakeMaker remains at version 6.17, as later stable releases currently
available on CPAN have some issues with core modules on some core platforms.

=item *

I18N::LangTags upgraded to version 0.35

=item *

Math::BigInt upgraded to version 1.73

=item *

Math::BigRat upgraded to version 0.13

=item *

MIME::Base64 upgraded to version 3.05

=item *

POSIX::sigprocmask function can now retrieve the current signal mask without
also setting it.

=item *

Time::HiRes upgraded to version 1.65

=back


=head1 Utility Changes

Perl has a new -dt command-line flag, which enables threads support in the
debugger.

=head1 Performance Enhancements

C<reverse sort ...> is now optimized to sort in reverse, avoiding the
generation of a temporary intermediate list.

C<for (reverse @foo)> now iterates in reverse, avoiding the generation of a
temporary reversed list.
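Both optimizations are internal, so existing code benefits without change and the results are the same as before.  The sketch below, using a made-up word list, simply shows the two idioms the optimizations apply to.

```perl
use strict;
use warnings;

my @words = qw(pear apple fig banana);

# Descending sort written as "reverse sort".
my @descending = reverse sort @words;

# Iterate from the last element to the first.
my @visited;
for my $word (reverse @words) {
    push @visited, $word;
}

print "@descending\n";
print "@visited\n";
```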

=head1 Selected Bug Fixes

The regexp engine is now more robust when given invalid utf8 input, as is
sometimes generated by buggy XS modules.

C<foreach> on threads::shared array used to be able to crash Perl. This bug
has now been fixed.

A regexp in C<STDOUT>'s destructor used to coredump, because the regexp pad
was already freed. This has been fixed.

C<goto &> is now more robust - bugs in deep recursion and chained C<goto &>
have been fixed.

Using C<delete> on an array no longer leaks memory. A C<pop> of an item from a
shared array reference no longer causes a leak.

C<eval_sv()> failing a taint test could corrupt the stack - this has been
fixed.

On platforms with 64 bit pointers numeric comparison operators used to
erroneously compare the addresses of references that are overloaded, rather
than using the overloaded values. This has been fixed.

C<read> into a UTF8-encoded buffer with an offset off the end of the buffer
no longer mis-calculates buffer lengths.

Although Perl has promised since version 5.8 that C<sort()> would be
stable, the two cases C<sort {$b cmp $a}> and C<< sort {$b <=> $a} >> could
produce non-stable sorts.   This is corrected in perl5.8.6.
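Stability only becomes visible when elements that compare equal can still be told apart.  The following sketch uses hypothetical records and a comparator on one field (not the exact C<{$b E<lt>=E<gt> $a}> form named above), and assumes the default merge sort implementation.

```perl
use strict;
use warnings;

# Each record is [name, score]; "bob" and "cat" tie on score.
my @records = (
    [ ann => 3 ],
    [ bob => 7 ],
    [ cat => 7 ],
    [ dan => 5 ],
);

# Descending numeric sort on the score.  With a stable sort, records
# that compare equal keep their original relative order.
my @by_score = sort { $b->[1] <=> $a->[1] } @records;

my $order = join " ", map { $_->[0] } @by_score;
print "$order\n";
```

If the sort is stable, C<bob> is listed before C<cat>, because that is the order in which they appear in the input.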

Localising C<$^D> no longer generates a diagnostic message about valid -D
flags.

=head1 New or Changed Diagnostics

For -t and -T,

   Too late for "-T" option

has been changed to the more informative

   "-T" is on the #! line, it must also be used on the command line

=head1 Changed Internals

From now on all applications embedding perl will behave as if perl
were compiled with -DPERL_USE_SAFE_PUTENV.  See "Environment access" in
the F<INSTALL> file for details.

Most C<C> source files now have comments at the top explaining their purpose,
which should help anyone wishing to get an overview of the implementation.

=head1 New Tests

There are significantly more tests for the C<B> suite of modules.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug at perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.


--- NEW FILE: perlembed.pod ---
=head1 NAME

perlembed - how to embed perl in your C program

=head1 DESCRIPTION

=head2 PREAMBLE

Do you want to:

=over 5

=item B<Use C from Perl?>

Read L<perlxstut>, L<perlxs>, L<h2xs>, L<perlguts>, and L<perlapi>.

=item B<Use a Unix program from Perl?>

Read about back-quotes and about C<system> and C<exec> in L<perlfunc>.
[...1101 lines suppressed...]

Copyright (C) 1995, 1996, 1997, 1998 Doug MacEachern and Jon Orwant.  All
Rights Reserved.

Permission is granted to make and distribute verbatim copies of this
documentation provided the copyright notice and this permission notice are
preserved on all copies.

Permission is granted to copy and distribute modified versions of this
documentation under the conditions for verbatim copying, provided also
that they are marked clearly as modified versions, that the authors'
names and title are unchanged (though subtitles and additional
authors' names may be added), and that the entire resulting derived
work is distributed under the terms of a permission notice identical
to this one.

Permission is granted to copy and distribute translations of this
documentation into another language, under the above conditions for
modified versions.

--- NEW FILE: pod2usage.PL ---

use Config;
use File::Basename qw(&basename &dirname);
use Cwd;

# List explicitly here the variables you want Configure to
# generate.  Metaconfig only looks for shell variables, so you
# have to mention them as if they were shell variables, not
# %Config entries.  Thus you write
#  $startperl
# to ensure Configure will look for $Config{startperl}.

# This forces PL files to create target in same directory as PL file.
# This is so that make depend always knows where to find PL derivatives.
$origdir = cwd;
$file = basename($0, '.PL');
$file .= '.com' if $^O eq 'VMS';

open OUT,">$file" or die "Can't create $file: $!";

print "Extracting $file (with variable substitutions)\n";

# In this section, perl variables will be expanded during extraction.
# You can use $Config{...} to use Configure variables.

print OUT <<"!GROK!THIS!";
$Config{'startperl'}
    eval 'exec perl -S \$0 "\$@"'
        if 0;
!GROK!THIS!

# In the following, perl variables are not expanded during extraction.

print OUT <<'!NO!SUBS!';

# pod2usage -- command to print usage messages from embedded pod docs
# Copyright (c) 1996-2000 by Bradford Appleton. All rights reserved.
# This file is part of "PodParser". PodParser is free software;
# you can redistribute it and/or modify it under the same terms
# as Perl itself.

use strict;
use diagnostics;

=head1 NAME

pod2usage - print usage messages from embedded pod docs in files

=head1 SYNOPSIS

=over 12

=item B<pod2usage>

[B<-help>]
[B<-man>]
[B<-exit>S< >I<exitval>]
[B<-output>S< >I<outfile>]
[B<-verbose>S< >I<level>]
[B<-pathlist>S< >I<dirlist>]
I<file>

=back

=head1 OPTIONS AND ARGUMENTS

=over 8

=item B<-help>

Print a brief help message and exit.

=item B<-man>

Print this command's manual page and exit.

=item B<-exit> I<exitval>

The exit status value to return.

=item B<-output> I<outfile>

The output file to print to. If the special names "-" or ">&1" or ">&STDOUT"
are used then standard output is used. If ">&2" or ">&STDERR" is used then
standard error is used.

=item B<-verbose> I<level>

The desired level of verbosity to use:

    1 : print SYNOPSIS only
    2 : print SYNOPSIS sections and any OPTIONS/ARGUMENTS sections
    3 : print the entire manpage (similar to running pod2text)

=item B<-pathlist> I<dirlist>

Specifies one or more directories to search for the input file if it
was not supplied with an absolute path. Each directory path in the given
list should be separated by a ':' on Unix (';' on MSWin32 and DOS).

=item I<file>

The pathname of a file containing pod documentation to be output in
usage message format (defaults to standard input).

=back

=head1 DESCRIPTION

B<pod2usage> will read the given input file looking for pod
documentation and will print the corresponding usage message.
If no input file is specified, then standard input is read.

B<pod2usage> invokes the B<pod2usage()> function in the B<Pod::Usage>
module. Please see L<Pod::Usage/pod2usage()>.
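The same function can be called directly from a script.  The sketch below is an illustration under stated assumptions, not part of B<pod2usage> itself: the "demo" command and its pod text are made up, temporary files stand in for real input and output, and C<-exitval =E<gt> 'NOEXIT'> is used so that the call returns instead of exiting.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Pod::Usage;

# Pod text for a made-up command called "demo".
my $pod = join "\n",
    '=head1 NAME', '',
    'demo - a made-up command', '',
    '=head1 SYNOPSIS', '',
    'demo [-help] file', '',
    '=cut', '';

# Write the pod to a temporary input file.
my ($pod_fh, $pod_path) = tempfile(UNLINK => 1);
print {$pod_fh} $pod;
close $pod_fh or die "Can't close $pod_path: $!";

# Collect the usage message in a temporary output file.
my ($out_fh, $out_path) = tempfile(UNLINK => 1);

pod2usage(
    -input   => $pod_path,
    -output  => $out_fh,
    -verbose => 0,           # SYNOPSIS only
    -exitval => 'NOEXIT',    # return to the caller instead of exiting
);
close $out_fh;

open my $in, '<', $out_path or die "Can't read $out_path: $!";
my $usage_text = do { local $/; <$in> };
close $in;

print $usage_text;
```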

=head1 SEE ALSO

L<Pod::Usage>, L<pod2text(1)>

=head1 AUTHOR

Please report bugs using L<http://rt.cpan.org>.

Brad Appleton E<lt>bradapp at enteract.comE<gt>

Based on code for B<pod2text(1)> written by
Tom Christiansen E<lt>tchrist at mox.perl.comE<gt>


=cut

use Pod::Usage;
use Getopt::Long;

## Define options
my %options = ();
my @opt_specs = (
    "help",
    "man",
    "exit=i",
    "output=s",
    "pathlist=s",
    "verbose=i",
);

## Parse options
GetOptions(\%options, @opt_specs)  ||  pod2usage(2);
pod2usage(1)  if ($options{help});
pod2usage(VERBOSE => 2)  if ($options{man});

## Don't default to STDIN if connected to a terminal
pod2usage(2) if ((@ARGV == 0) && (-t STDIN));

@ARGV = ("-")  unless (@ARGV > 0);
if (@ARGV > 1) {
    print STDERR "pod2usage: Too many filenames given\n\n";
    pod2usage(2);
}

my %usage = ();
$usage{-input}    = shift(@ARGV);
$usage{-exitval}  = $options{"exit"}      if (defined $options{"exit"});
$usage{-output}   = $options{"output"}    if (defined $options{"output"});
$usage{-verbose}  = $options{"verbose"}   if (defined $options{"verbose"});
$usage{-pathlist} = $options{"pathlist"}  if (defined $options{"pathlist"});

## Print the usage message using the options collected above
pod2usage(\%usage);

!NO!SUBS!

close OUT or die "Can't close $file: $!";
chmod 0755, $file or die "Can't reset permissions for $file: $!\n";
exec("$Config{'eunicefix'} $file") if $Config{'eunicefix'} ne ':';
chdir $origdir;

--- NEW FILE: perlbot.pod ---
=head1 NAME

perlbot - Bag'o Object Tricks (the BOT)

=head1 DESCRIPTION

The following collection of tricks and hints is intended to whet curious
appetites about such things as the use of instance variables and the
mechanics of object and class relationships.  The reader is encouraged to
consult relevant textbooks for discussion of Object Oriented definitions and
methodology.  This is not intended as a tutorial for object-oriented
programming or as a comprehensive guide to Perl's object oriented features,
nor should it be construed as a style guide.  If you're looking for tutorials,
be sure to read L<perlboot>, L<perltoot>, and L<perltooc>.

The Perl motto still holds:  There's more than one way to do it.

=head1 OO SCALING TIPS

=over 5

=item 1

Do not attempt to verify the type of $self.  That'll break if the class is
inherited, when the type of $self is valid but its package isn't what you
expect.  See rule 5.

=item 2

If an object-oriented (OO) or indirect-object (IO) syntax was used, then the
object is probably the correct type and there's no need to become paranoid
about it.  Perl isn't a paranoid language anyway.  If people subvert the OO
or IO syntax then they probably know what they're doing and you should let
them do it.  See rule 1.

=item 3

Use the two-argument form of bless().  Let a subclass use your constructor.

=item 4

The subclass is allowed to know things about its immediate superclass, the
superclass is allowed to know nothing about a subclass.

=item 5

Don't be trigger happy with inheritance.  A "using", "containing", or
"delegation" relationship (some sort of aggregation, at least) is often more
appropriate.

=item 6

The object is the namespace.  Make package globals accessible via the
object.  This will remove the guess work about the symbol's home package.

=item 7

IO syntax is certainly less noisy, but it is also prone to ambiguities that
can cause difficult-to-find bugs.  Allow people to use the sure-thing OO
syntax, even if you don't like it.

=item 8

Do not use function-call syntax on a method.  You're going to be bitten
someday.  Someone might move that method into a superclass and your code
will be broken.  On top of that you're feeding the paranoia in rule 2.

=item 9

Don't assume you know the home package of a method.  You're making it
difficult for someone to override that method.  See L<THINKING OF CODE REUSE>.

=back
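Tip 8 can be seen in a few lines.  The classes C<Widget> and C<Button> below are hypothetical and exist only for this sketch: the method lives in the superclass, so method-call syntax finds it while function-call syntax does not.

```perl
use strict;
use warnings;

package Widget;
sub new   { my $type = shift; return bless {}, $type }
sub greet { my $self = shift; return "hello from " . ref($self) }

package Button;
our @ISA = ('Widget');

package main;

my $obj = Button->new;

# Method-call syntax searches @ISA, so the inherited method is found.
my $via_method = $obj->greet;

# Function-call syntax looks only in package Button and does not
# search @ISA, so it fails because greet() lives in the superclass.
my $via_function = eval { Button::greet($obj) };
my $error = $@;

print "$via_method\n";
print "function call failed: $error" unless defined $via_function;
```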



=head1 INSTANCE VARIABLES

An anonymous array or anonymous hash can be used to hold instance
variables.  Named parameters are also demonstrated.

	package Foo;

	sub new {
		my $type = shift;
		my %params = @_;
		my $self = {};
		$self->{'High'} = $params{'High'};
		$self->{'Low'}  = $params{'Low'};
		bless $self, $type;
	}

	package Bar;

	sub new {
		my $type = shift;
		my %params = @_;
		my $self = [];
		$self->[0] = $params{'Left'};
		$self->[1] = $params{'Right'};
		bless $self, $type;
	}

	package main;

	$a = Foo->new( 'High' => 42, 'Low' => 11 );
	print "High=$a->{'High'}\n";
	print "Low=$a->{'Low'}\n";

	$b = Bar->new( 'Left' => 78, 'Right' => 40 );
	print "Left=$b->[0]\n";
	print "Right=$b->[1]\n";


=head1 SCALAR INSTANCE VARIABLES

An anonymous scalar can be used when only one instance variable is needed.

	package Foo;

	sub new {
		my $type = shift;
		my $self;
		$self = shift;
		bless \$self, $type;
	}

	package main;

	$a = Foo->new( 42 );
	print "a=$$a\n";


=head1 INSTANCE VARIABLE INHERITANCE

This example demonstrates how one might inherit instance variables from a
superclass for inclusion in the new class.  This requires calling the
superclass's constructor and adding one's own instance variables to the new
object.
	package Bar;

	sub new {
		my $type = shift;
		my $self = {};
		$self->{'buz'} = 42;
		bless $self, $type;
	}

	package Foo;
	@ISA = qw( Bar );

	sub new {
		my $type = shift;
		my $self = Bar->new;
		$self->{'biz'} = 11;
		bless $self, $type;
	}

	package main;

	$a = Foo->new;
	print "buz = ", $a->{'buz'}, "\n";
	print "biz = ", $a->{'biz'}, "\n";


=head1 OBJECT RELATIONSHIPS

The following demonstrates how one might implement "containing" and "using"
relationships between objects.

	package Bar;

	sub new {
		my $type = shift;
		my $self = {};
		$self->{'buz'} = 42;
		bless $self, $type;
	}

	package Foo;

	sub new {
		my $type = shift;
		my $self = {};
		$self->{'Bar'} = Bar->new;
		$self->{'biz'} = 11;
		bless $self, $type;
	}

	package main;

	$a = Foo->new;
	print "buz = ", $a->{'Bar'}->{'buz'}, "\n";
	print "biz = ", $a->{'biz'}, "\n";


=head1 OVERRIDING SUPERCLASS METHODS

The following example demonstrates how to override a superclass method and
then call the overridden method.  The B<SUPER> pseudo-class allows the
programmer to call an overridden superclass method without actually knowing
where that method is defined.

	package Buz;
	sub goo { print "here's the goo\n" }

	package Bar; @ISA = qw( Buz );
	sub google { print "google here\n" }

	package Baz;
	sub mumble { print "mumbling\n" }

	package Foo;
	@ISA = qw( Bar Baz );

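	sub new {
		my $type = shift;
		bless [], $type;
	}
	sub grr { print "grumble\n" }
	sub goo {
		my $self = shift;
		$self->SUPER::goo();
	}
	sub mumble {
		my $self = shift;
		$self->SUPER::mumble();
	}
	sub google {
		my $self = shift;
		$self->SUPER::google();
	}

	package main;

	$foo = Foo->new;
	$foo->mumble;
	$foo->grr;
	$foo->goo;
	$foo->google;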

Note that C<SUPER> refers to the superclasses of the current package
(C<Foo>), not to the superclasses of C<$self>.
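The distinction matters as soon as a class with an overriding method is itself subclassed.  In the sketch below the class names C<Animal>, C<Dog> and C<Puppy> are made up for illustration; the object is a C<Puppy>, yet C<SUPER> inside C<Dog::speak> still starts its search in the superclasses of C<Dog>.

```perl
use strict;
use warnings;

package Animal;
sub new   { my $type = shift; return bless {}, $type }
sub speak { return "animal" }

package Dog;
our @ISA = ('Animal');
# SUPER is resolved relative to package Dog, where this sub is compiled.
sub speak { my $self = shift; return "dog, then " . $self->SUPER::speak() }

package Puppy;
our @ISA = ('Dog');    # inherits speak() from Dog

package main;

my $puppy = Puppy->new;
my $said  = $puppy->speak;
print "$said\n";
```

Had C<SUPER> been resolved relative to the class of C<$self>, the call would have found C<Dog::speak> again and recursed without end.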


=head1 USING RELATIONSHIP WITH SDBM

This example demonstrates an interface for the SDBM class.  This creates a
"using" relationship between the SDBM class and the new class Mydbm.

	package Mydbm;

	require SDBM_File;
	require Tie::Hash;
	@ISA = qw( Tie::Hash );

	sub TIEHASH {
	    my $type = shift;
	    my $ref  = SDBM_File->new(@_);
	    bless {'dbm' => $ref}, $type;
	}
	sub FETCH {
	    my $self = shift;
	    my $ref  = $self->{'dbm'};
	    $ref->FETCH(@_);
	}
	sub STORE {
	    my $self = shift;
	    if (defined $_[0]){
		my $ref = $self->{'dbm'};
		$ref->STORE(@_);
	    } else {
		die "Cannot STORE an undefined key in Mydbm\n";
	    }
	}

	package main;
	use Fcntl qw( O_RDWR O_CREAT );

	tie %foo, "Mydbm", "Sdbm", O_RDWR|O_CREAT, 0640;
	$foo{'bar'} = 123;
	print "foo-bar = $foo{'bar'}\n";

	tie %bar, "Mydbm", "Sdbm2", O_RDWR|O_CREAT, 0640;
	$bar{'Cathy'} = 456;
	print "bar-Cathy = $bar{'Cathy'}\n";


=head1 THINKING OF CODE REUSE

One strength of Object-Oriented languages is the ease with which old code
can use new code.  The following examples will demonstrate first how one can
hinder code reuse and then how one can promote code reuse.

This first example illustrates a class which uses a fully-qualified method
call to access the "private" method BAZ().  The second example will show
that it is impossible to override the BAZ() method.

	package FOO;

	sub new {
		my $type = shift;
		bless {}, $type;
	}
	sub bar {
		my $self = shift;
		$self->FOO::private::BAZ;
	}

	package FOO::private;

	sub BAZ {
		print "in BAZ\n";
	}

	package main;

	$a = FOO->new;
	$a->bar;

Now we try to override the BAZ() method.  We would like FOO::bar() to call
GOOP::BAZ(), but this cannot happen because FOO::bar() explicitly calls
FOO::private::BAZ().

	package FOO;

	sub new {
		my $type = shift;
		bless {}, $type;
	}
	sub bar {
		my $self = shift;
		$self->FOO::private::BAZ;
	}

	package FOO::private;

	sub BAZ {
		print "in BAZ\n";
	}

	package GOOP;
	@ISA = qw( FOO );
	sub new {
		my $type = shift;
		bless {}, $type;
	}

	sub BAZ {
		print "in GOOP::BAZ\n";
	}

	package main;

	$a = GOOP->new;
	$a->bar;

To create reusable code we must modify class FOO, flattening class
FOO::private.  The next example shows a reusable class FOO which allows the
method GOOP::BAZ() to be used in place of FOO::BAZ().

	package FOO;

	sub new {
		my $type = shift;
		bless {}, $type;
	}
	sub bar {
		my $self = shift;
		$self->BAZ;
	}

	sub BAZ {
		print "in BAZ\n";
	}

	package GOOP;
	@ISA = qw( FOO );

	sub new {
		my $type = shift;
		bless {}, $type;
	}
	sub BAZ {
		print "in GOOP::BAZ\n";
	}

	package main;

	$a = GOOP->new;
	$a->bar;


=head1 CLASS CONTEXT AND THE OBJECT

Use the object to solve package and class context problems.  Everything a
method needs should be available via the object or should be passed as a
parameter to the method.

A class will sometimes have static or global data to be used by the
methods.  A subclass may want to override that data and replace it with new
data.  When this happens the superclass may not know how to find the new
copy of the data.

This problem can be solved by using the object to define the context of the
method.  Let the method look in the object for a reference to the data.  The
alternative is to force the method to go hunting for the data ("Is it in my
class, or in a subclass?  Which subclass?"), and this can be inconvenient
and will lead to hackery.  It is better just to let the object tell the
method where that data is located.

	package Bar;

	%fizzle = ( 'Password' => 'XYZZY' );

	sub new {
		my $type = shift;
		my $self = {};
		$self->{'fizzle'} = \%fizzle;
		bless $self, $type;
	}

	sub enter {
		my $self = shift;

		# Don't try to guess if we should use %Bar::fizzle
		# or %Foo::fizzle.  The object already knows which
		# we should use, so just ask it.
		my $fizzle = $self->{'fizzle'};

		print "The word is ", $fizzle->{'Password'}, "\n";
	}

	package Foo;
	@ISA = qw( Bar );

	%fizzle = ( 'Password' => 'Rumple' );

	sub new {
		my $type = shift;
		my $self = Bar->new;
		$self->{'fizzle'} = \%fizzle;
		bless $self, $type;
	}

	package main;

	$a = Bar->new;
	$b = Foo->new;
	$a->enter;
	$b->enter;


=head1 INHERITING A CONSTRUCTOR

An inheritable constructor should use the second form of bless() which allows
blessing directly into a specified class.  Notice in this example that the
object will be a BAR not a FOO, even though the constructor is in class FOO.

	package FOO;

	sub new {
		my $type = shift;
		my $self = {};
		bless $self, $type;
	}

	sub baz {
		print "in FOO::baz()\n";
	}

	package BAR;
	@ISA = qw(FOO);

	sub baz {
		print "in BAR::baz()\n";
	}

	package main;

	$a = BAR->new;
	$a->baz;


=head1 DELEGATION

Some classes, such as SDBM_File, cannot be effectively subclassed because
they create foreign objects.  Such a class can be extended with some sort of
aggregation technique such as the "using" relationship mentioned earlier or
by delegation.

The following example demonstrates delegation using an AUTOLOAD() function to
perform message-forwarding.  This will allow the Mydbm object to behave
exactly like an SDBM_File object.  The Mydbm class could now extend the
behavior by adding custom FETCH() and STORE() methods, if this is desired.

	package Mydbm;

	require SDBM_File;
	require Tie::Hash;
	@ISA = qw(Tie::Hash);

	sub TIEHASH {
		my $type = shift;
		my $ref = SDBM_File->new(@_);
		bless {'delegate' => $ref}, $type;
	}

	sub AUTOLOAD {
		my $self = shift;

		# The Perl interpreter places the name of the
		# message in a variable called $AUTOLOAD.

		# DESTROY messages should never be propagated.
		return if $AUTOLOAD =~ /::DESTROY$/;

		# Remove the package name.
		$AUTOLOAD =~ s/^Mydbm:://;

		# Pass the message to the delegate.
		$self->{'delegate'}->$AUTOLOAD(@_);
	}

	package main;
	use Fcntl qw( O_RDWR O_CREAT );

	tie %foo, "Mydbm", "adbm", O_RDWR|O_CREAT, 0640;
	$foo{'bar'} = 123;
	print "foo-bar = $foo{'bar'}\n";

=head1 SEE ALSO

L<perlboot>, L<perltoot>, L<perltooc>.

--- NEW FILE: perl571delta.pod ---
=head1 NAME

perl571delta - what's new for perl v5.7.1

=head1 DESCRIPTION

This document describes differences between the 5.7.0 release and the
5.7.1 release.

(To view the differences between the 5.6.0 release and the 5.7.0
release, see L<perl570delta>.)

=head1 Security Vulnerability Closed

(This change was already made in 5.7.0 but bears repeating here.)

A potential security vulnerability in the optional suidperl component
of Perl was identified in August 2000.  suidperl is neither built nor
installed by default.  As of April 2001 the only known vulnerable
[...1036 lines suppressed...]
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Jarkko Hietaniemi <F<jhi at iki.fi>>, with many contributions
from The Perl Porters and Perl Users submitting feedback and patches.

Send omissions or corrections to <F<perlbug at perl.org>>.


--- NEW FILE: perlclib.pod ---
=head1 NAME

perlclib - Internal replacements for standard C library functions

=head1 DESCRIPTION

One thing Perl porters should note is that F<perl> doesn't tend to use that
much of the C standard library internally; you'll see very little use of, 
for example, the F<ctype.h> functions in there. This is because Perl
tends to reimplement or abstract standard library functions, so that we
know exactly how they're going to operate.

This is a reference card for people who are familiar with the C library
and who want to do things the Perl way; to tell them which functions
they ought to use instead of the more normal C functions. 

=head2 Conventions

In the following tables:

=over 3

=item C<t>

is a type.

=item C<p>

is a pointer.

=item C<n>

is a number.

=item C<s>

is a string.

=back

C<sv>, C<av>, C<hv>, etc. represent variables of their respective types.

=head2 File Operations

Instead of the F<stdio.h> functions, you should use the Perl abstraction
layer. Instead of C<FILE*> types, you need to be handling C<PerlIO*>
types.  Don't forget that with the new PerlIO layered I/O abstraction 
C<FILE*> types may not even be available. See also the C<perlapio>
documentation for more information about the following functions:

    Instead Of:                 Use:

    stdin                       PerlIO_stdin()
    stdout                      PerlIO_stdout()
    stderr                      PerlIO_stderr()

    fopen(fn, mode)             PerlIO_open(fn, mode)
    freopen(fn, mode, stream)   PerlIO_reopen(fn, mode, perlio) (Deprecated)
    fflush(stream)              PerlIO_flush(perlio)
    fclose(stream)              PerlIO_close(perlio)

=head2 File Input and Output

    Instead Of:                 Use:

    fprintf(stream, fmt, ...)   PerlIO_printf(perlio, fmt, ...)

    [f]getc(stream)             PerlIO_getc(perlio)
    [f]putc(stream, n)          PerlIO_putc(perlio, n)
    ungetc(n, stream)           PerlIO_ungetc(perlio, n)

Note that the PerlIO equivalents of C<fread> and C<fwrite> are slightly
different from their C library counterparts:

    fread(p, size, n, stream)   PerlIO_read(perlio, buf, numbytes)
    fwrite(p, size, n, stream)  PerlIO_write(perlio, buf, numbytes)

    fputs(s, stream)            PerlIO_puts(perlio, s)

There is no equivalent to C<fgets>; one should use C<sv_gets> instead:

    fgets(s, n, stream)         sv_gets(sv, perlio, append)

=head2 File Positioning

    Instead Of:                 Use:

    feof(stream)                PerlIO_eof(perlio)
    fseek(stream, n, whence)    PerlIO_seek(perlio, n, whence)
    rewind(stream)              PerlIO_rewind(perlio)

    fgetpos(stream, p)          PerlIO_getpos(perlio, sv)
    fsetpos(stream, p)          PerlIO_setpos(perlio, sv)

    ferror(stream)              PerlIO_error(perlio)
    clearerr(stream)            PerlIO_clearerr(perlio)

=head2 Memory Management and String Handling

    Instead Of:                   Use:

    t* p = malloc(n)              Newx(p, n, t)
    t* p = calloc(n, s)           Newxz(p, n, t)
    p = realloc(p, n)             Renew(p, n, t)
    memcpy(dst, src, n)           Copy(src, dst, n, t)
    memmove(dst, src, n)          Move(src, dst, n, t)
    memcpy(dst, src, sizeof(t))   StructCopy(src, dst, t)
    memset(dst, 0, n * sizeof(t)) Zero(dst, n, t)
    memzero(dst, 0)               Zero(dst, n, char)
    free(p)                       Safefree(p)

    strdup(p)                   savepv(p)
    strndup(p, n)               savepvn(p, n) (Hey, strndup doesn't exist!)

    strstr(big, little)         instr(big, little)
    strcmp(s1, s2)              strLE(s1, s2) / strEQ(s1, s2) / strGT(s1,s2)
    strncmp(s1, s2, n)          strnNE(s1, s2, n) / strnEQ(s1, s2, n)

Notice the different order of arguments to C<Copy> and C<Move> than used
in C<memcpy> and C<memmove>.

Most of the time, though, you'll want to be dealing with SVs internally
instead of raw C<char *> strings:

    strlen(s)                   sv_len(sv)
    strcpy(dt, src)             sv_setpv(sv, s)
    strncpy(dt, src, n)         sv_setpvn(sv, s, n)
    strcat(dt, src)             sv_catpv(sv, s)
    strncat(dt, src, n)         sv_catpvn(sv, s, n)
    sprintf(s, fmt, ...)        sv_setpvf(sv, fmt, ...)

Note also the existence of C<sv_catpvf> and C<sv_vcatpvfn>, combining
concatenation with formatting.

Sometimes instead of zeroing the allocated heap by using Newxz() you
should consider "poisoning" the data.  This means writing a bit
pattern into it that should be illegal as pointers (and floating point
numbers), and also hopefully surprising enough as integers, so that
any code attempting to use the data without forethought will break
sooner rather than later.  Poisoning can be done using the Poison()
macro, which takes arguments similar to those of Zero():

    Poison(dst, n, t)

=head2 Character Class Tests

There are two types of character class tests that Perl implements: one
type deals in C<char>s and is thus B<not> Unicode aware (and hence
deprecated unless you B<know> you should use it) and the other type
deals in C<UV>s and knows about Unicode properties. In the following
table, C<c> is a C<char>, and C<u> is a Unicode codepoint.

    Instead Of:                 Use:            But better use:

    isalnum(c)                  isALNUM(c)      isALNUM_uni(u)
    isalpha(c)                  isALPHA(c)      isALPHA_uni(u)
    iscntrl(c)                  isCNTRL(c)      isCNTRL_uni(u)
    isdigit(c)                  isDIGIT(c)      isDIGIT_uni(u)
    isgraph(c)                  isGRAPH(c)      isGRAPH_uni(u)
    islower(c)                  isLOWER(c)      isLOWER_uni(u)
    isprint(c)                  isPRINT(c)      isPRINT_uni(u)
    ispunct(c)                  isPUNCT(c)      isPUNCT_uni(u)
    isspace(c)                  isSPACE(c)      isSPACE_uni(u)
    isupper(c)                  isUPPER(c)      isUPPER_uni(u)
    isxdigit(c)                 isXDIGIT(c)     isXDIGIT_uni(u)

    tolower(c)                  toLOWER(c)      toLOWER_uni(u)
    toupper(c)                  toUPPER(c)      toUPPER_uni(u)

=head2 F<stdlib.h> functions

    Instead Of:                 Use: 

    atof(s)                     Atof(s)
    atol(s)                     Atol(s)
    strtod(s, *p)               Nothing.  Just don't use it.
    strtol(s, *p, n)            Strtol(s, *p, n)
    strtoul(s, *p, n)           Strtoul(s, *p, n)

Notice also the C<grok_bin>, C<grok_hex>, and C<grok_oct> functions in
F<numeric.c> for converting strings representing numbers in the respective
bases into C<NV>s.

In theory C<Strtol> and C<Strtoul> may not be defined if the machine perl is
built on doesn't actually have strtol and strtoul. But as those 2
functions are part of the 1989 ANSI C spec we suspect you'll find them
everywhere by now.

    int rand()                  double Drand01()
    srand(n)                    { seedDrand01((Rand_seed_t)n); 
                                  PL_srand_called = TRUE; }

    exit(n)                     my_exit(n)
    system(s)                   Don't. Look at pp_system or use my_popen

    getenv(s)                   PerlEnv_getenv(s)
    setenv(s, val)              my_putenv(s, val)

=head2 Miscellaneous functions

You should not even B<want> to use F<setjmp.h> functions, but if you
think you do, use the C<JMPENV> stack in F<scope.h> instead.

For C<signal>/C<sigaction>, use C<rsignal(signo, handler)>.

=head1 SEE ALSO

L<perlapi>, L<perlapio>, L<perlguts>

--- NEW FILE: perlsyn.pod ---
=head1 NAME

perlsyn - Perl syntax

=head1 DESCRIPTION

A Perl program consists of a sequence of declarations and statements
which run from the top to the bottom.  Loops, subroutines and other
control structures allow you to jump around within the code.

Perl is a B<free-form> language: you can format and indent it however
you like.  Whitespace mostly serves to separate tokens, unlike
languages like Python where it is an important part of the syntax.

Many of Perl's syntactic elements are B<optional>.  Rather than
requiring you to put parentheses around every function call and
declare every variable, you can often leave such explicit elements off
and Perl will figure out what you meant.  This is known as B<Do What I
Mean>, abbreviated B<DWIM>.  It allows programmers to be B<lazy> and to
code in a style with which they are comfortable.

Perl B<borrows syntax> and concepts from many languages: awk, sed, C,
Bourne Shell, Smalltalk, Lisp and even English.  Other
languages have borrowed syntax from Perl, particularly its regular
expression extensions.  So if you have programmed in another language
you will see familiar pieces in Perl.  They often work the same, but
see L<perltrap> for information about how they differ.

=head2 Declarations
X<declaration> X<undef> X<undefined> X<uninitialized>

The only things you need to declare in Perl are report formats and
subroutines (and sometimes not even subroutines).  A variable holds
the undefined value (C<undef>) until it has been assigned a defined
value, which is anything other than C<undef>.  When used as a number,
C<undef> is treated as C<0>; when used as a string, it is treated as
the empty string, C<"">; and when used as a reference that isn't being
assigned to, it is treated as an error.  If you enable warnings,
you'll be notified of an uninitialized value whenever you treat
C<undef> as a string or a number.  Well, usually.  Boolean contexts,
such as:

    my $a;
    if ($a) {}

are exempt from warnings (because they care about truth rather than
definedness).  Operators such as C<++>, C<-->, C<+=>,
C<-=>, and C<.=>, that operate on undefined left values such as:

    my $a;
    $a++;

are also always exempt from such warnings.
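
As a minimal sketch of the rules above (the variable names are purely
illustrative), the following shows C<undef> acting as C<0> and as the
empty string.  The C<uninitialized> warning category is switched off in
the inner block only so that the first two uses run quietly; the last
two statements are exempt anyway.

    use strict;
    use warnings;

    my ($count, $text);
    my ($sum, $join);
    {
        no warnings 'uninitialized';   # silence the notices for this demo
        $sum  = $count + 5;            # undef acts as 0, so $sum is 5
        $join = $text . "abc";         # undef acts as "", so $join is "abc"
    }
    $count++;                          # exempt: $count is now 1
    $text .= "xyz";                    # exempt: $text is now "xyz"
    print "$sum $join $count $text\n";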

A declaration can be put anywhere a statement can, but has no effect on
the execution of the primary sequence of statements--declarations all
take effect at compile time.  Typically all the declarations are put at
the beginning or the end of the script.  However, if you're using
lexically-scoped private variables created with C<my()>, you'll
have to make sure
your format or subroutine definition is within the same block scope
as the my if you expect to be able to access those private variables.

Declaring a subroutine allows a subroutine name to be used as if it were a
list operator from that point forward in the program.  You can declare a
subroutine without defining it by saying C<sub name>, thus:
X<subroutine, declaration>

    sub myname;
    $me = myname $0 		or die "can't get myname";

Note that myname() functions as a list operator, not as a unary operator;
so be careful to use C<or> instead of C<||> in this case.  However, if
you were to declare the subroutine as C<sub myname ($)>, then
C<myname> would function as a unary operator, so either C<or> or
C<||> would work.

Subroutine declarations can also be loaded up with the C<require> statement
or both loaded and imported into your namespace with a C<use> statement.
See L<perlmod> for details on this.

A statement sequence may contain declarations of lexically-scoped
variables, but apart from declaring a variable name, the declaration acts
like an ordinary statement, and is elaborated within the sequence of
statements as if it were an ordinary statement.  That means it actually
has both compile-time and run-time effects.
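
A small sketch of those two effects, using made-up names: at compile
time the C<my> makes the name C<$seen> known to the statements that
follow it, and at run time it yields a fresh, undefined variable each
time the statement is executed.

    use strict;
    use warnings;

    my @states;
    for my $round (1 .. 2) {
        my $seen;                  # run-time effect: fresh variable each pass
        push @states, defined($seen) ? "defined" : "undef";
        $seen = $round;            # does not survive into the next pass
    }
    print "@states\n";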

=head2 Comments
X<comment> X<#>

Text from a C<"#"> character until the end of the line is a comment,
and is ignored.  Exceptions include C<"#"> inside a string or regular
expression.

=head2 Simple Statements
X<statement> X<semicolon> X<expression> X<;>

The only kind of simple statement is an expression evaluated for its
side effects.  Every simple statement must be terminated with a
semicolon, unless it is the final statement in a block, in which case
the semicolon is optional.  (A semicolon is still encouraged if the
block takes up more than one line, because you may eventually add
another line.)  Note that there are some operators like C<eval {}> and
C<do {}> that look like compound statements, but aren't (they're just
TERMs in an expression), and thus need an explicit termination if used
as the last item in a statement.
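
To illustrate the last point, here is a hedged sketch (the variable
names are invented) in which C<do {}> and C<eval {}> appear as terms in
assignments.  Each statement needs its own terminating semicolon after
the closing brace.

    use strict;
    use warnings;

    my $value = do { 1 + 2 };               # semicolon required after }
    my $ok    = eval { die "oops\n"; 1 };   # likewise for eval {}

    print "value=$value\n";
    print "eval failed: $@" unless $ok;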

=head2 Truth and Falsehood
X<truth> X<falsehood> X<true> X<false> X<!> X<not> X<negation> X<0>

The number 0, the strings C<'0'> and C<''>, the empty list C<()>, and
C<undef> are all false in a boolean context. All other values are true.
Negation of a true value by C<!> or C<not> returns a special false value.
When evaluated as a string it is treated as C<''>, but as a number, it
is treated as 0.
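
The following sketch checks a handful of values in boolean context.
The list of test values is only an illustration; note that strings such
as C<'0.0'> and C<'00'> are true even though they are numerically zero.

    use strict;
    use warnings;

    my @candidates = (0, '0', '', '0.0', '00', ' ', 'false');
    my @true       = grep { $_ } @candidates;
    print "true values: ", join(', ', map { "'$_'" } @true), "\n";

    my $negated = !1;                        # the special false value
    print "as string: '$negated'\n";         # empty string
    print "as number: ", $negated + 0, "\n"; # zero, and no warning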

=head2 Statement Modifiers
X<statement modifier> X<modifier> X<if> X<unless> X<while>
X<until> X<foreach> X<for>

Any simple statement may optionally be followed by a I<SINGLE> modifier,
just before the terminating semicolon (or block ending).  The possible
modifiers are:

    if EXPR
    unless EXPR
    while EXPR
    until EXPR
    foreach LIST

The C<EXPR> following the modifier is referred to as the "condition".
Its truth or falsehood determines how the modifier will behave.

C<if> executes the statement once I<if> and only if the condition is
true.  C<unless> is the opposite, it executes the statement I<unless>
the condition is true (i.e., if the condition is false).

    print "Basset hounds got long ears" if length $ear >= 10;
    go_outside() and play() unless $is_raining;

The C<foreach> modifier is an iterator: it executes the statement once
for each item in the LIST (with C<$_> aliased to each item in turn).

    print "Hello $_!\n" foreach qw(world Dolly nurse);

C<while> repeats the statement I<while> the condition is true.
C<until> does the opposite, it repeats the statement I<until> the
condition is true (or while the condition is false):

    # Both of these count from 0 to 10.
    print $i++ while $i <= 10;
    print $j++ until $j >  10;

The C<while> and C<until> modifiers have the usual "C<while> loop"
semantics (conditional evaluated first), except when applied to a
C<do>-BLOCK (or to the deprecated C<do>-SUBROUTINE statement), in
which case the block executes once before the conditional is
evaluated.  This is so that you can write loops like:

    do {
	$line = <STDIN>;
    } until $line  eq ".\n";

See L<perlfunc/do>.  Note also that the loop control statements described
later will I<NOT> work in this construct, because modifiers don't take
loop labels.  Sorry.  You can always put another block inside of it
(for C<next>) or around it (for C<last>) to do that sort of thing.
For C<next>, just double the braces:
X<next> X<last> X<redo>

    do {{
	next if $x == $y;
	# do something here
    }} until $x++ > $z;

For C<last>, you have to be more elaborate:

    LOOP: {
            do {
                last if $x == $y**2;
                # do something here
            } while $x++ <= $z;
    }

B<NOTE:> The behaviour of a C<my> statement modified with a statement
modifier conditional or loop construct (e.g. C<my $x if ...>) is
B<undefined>.  The value of the C<my> variable may be C<undef>, any
previously assigned value, or possibly anything else.  Don't rely on
it.  Future versions of perl might do something different from the
version of perl you try it out on.  Here be dragons.

=head2 Compound Statements
X<statement, compound> X<block> X<bracket, curly> X<curly bracket> X<brace>
X<{> X<}> X<if> X<unless> X<while> X<until> X<foreach> X<for> X<continue>

In Perl, a sequence of statements that defines a scope is called a block.
Sometimes a block is delimited by the file containing it (in the case
of a required file, or the program as a whole), and sometimes a block
is delimited by the extent of a string (in the case of an eval).

But generally, a block is delimited by curly brackets, also known as braces.
We will call this syntactic construct a BLOCK.

The following compound statements may be used to control flow:

    if (EXPR) BLOCK
    if (EXPR) BLOCK else BLOCK
    if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
    LABEL while (EXPR) BLOCK
    LABEL while (EXPR) BLOCK continue BLOCK
    LABEL until (EXPR) BLOCK
    LABEL until (EXPR) BLOCK continue BLOCK
    LABEL foreach VAR (LIST) BLOCK
    LABEL foreach VAR (LIST) BLOCK continue BLOCK
    LABEL BLOCK continue BLOCK

Note that, unlike C and Pascal, these are defined in terms of BLOCKs,
not statements.  This means that the curly brackets are I<required>--no
dangling statements allowed.  If you want to write conditionals without
curly brackets there are several other ways to do it.  The following
all do the same thing:

    if (!open(FOO)) { die "Can't open $FOO: $!"; }
    die "Can't open $FOO: $!" unless open(FOO);
    open(FOO) or die "Can't open $FOO: $!";	# FOO or bust!
    open(FOO) ? 'hi mom' : die "Can't open $FOO: $!";
			# a bit exotic, that last one

The C<if> statement is straightforward.  Because BLOCKs are always
bounded by curly brackets, there is never any ambiguity about which
C<if> an C<else> goes with.  If you use C<unless> in place of C<if>,
the sense of the test is reversed.

The C<while> statement executes the block as long as the expression is
true (does not evaluate to the null string C<""> or C<0> or C<"0">).
The C<until> statement executes the block as long as the expression is
false.

The LABEL is optional, and if present, consists of an identifier followed
by a colon.  The LABEL identifies the loop for the loop control
statements C<next>, C<last>, and C<redo>.
If the LABEL is omitted, the loop control statement
refers to the innermost enclosing loop.  This may include dynamically
looking back your call-stack at run time to find the LABEL.  Such
desperate behavior triggers a warning if you use the C<use warnings>
pragma or the B<-w> flag.

If there is a C<continue> BLOCK, it is always executed just before the
conditional is about to be evaluated again.  Thus it can be used to
increment a loop variable, even when the loop has been continued via
the C<next> statement.
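
As a brief sketch (the variable names are arbitrary), this loop skips
even numbers with C<next> and still advances its counter, because the
increment lives in the C<continue> block.  Without that block the loop
would never terminate.

    use strict;
    use warnings;

    my $i = 0;
    my @odd;
    while ($i < 6) {
        next if $i % 2 == 0;      # skip even numbers...
        push @odd, $i;
    } continue {
        $i++;                     # ...but still advance the counter
    }
    print "@odd\n";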

=head2 Loop Control
X<loop control> X<loop, control> X<next> X<last> X<redo> X<continue>

The C<next> command starts the next iteration of the loop:

    LINE: while (<STDIN>) {
        next LINE if /^#/;      # discard comments
        ...
    }

The C<last> command immediately exits the loop in question.  The
C<continue> block, if any, is not executed:

    LINE: while (<STDIN>) {
        last LINE if /^$/;      # exit when done with header
        ...
    }

The C<redo> command restarts the loop block without evaluating the
conditional again.  The C<continue> block, if any, is I<not> executed.
This command is normally used by programs that want to lie to themselves
about what was just input.

For example, suppose you are processing a file like F</etc/termcap>.
If your input lines might end in backslashes to indicate continuation, you
want to skip ahead and get the next record.

    while (<>) {
        chomp;
        if (s/\\$//) {
            $_ .= <>;
            redo unless eof();
        }
        # now process $_
    }

which is Perl short-hand for the more explicitly written version:

    LINE: while (defined($line = <ARGV>)) {
        chomp($line);
        if ($line =~ s/\\$//) {
            $line .= <ARGV>;
            redo LINE unless eof(); # not eof(ARGV)!
        }
        # now process $line
    }

Note that if there were a C<continue> block on the above code, it would
get executed only on lines discarded by the regex (since redo skips the
continue block). A continue block is often used to reset line counters
or C<?pat?> one-time matches:

    # inspired by :1,$g/fred/s//WILMA/
    while (<>) {
	?(fred)?    && s//WILMA $1 WILMA/;
	?(barney)?  && s//BETTY $1 BETTY/;
	?(homer)?   && s//MARGE $1 MARGE/;
    } continue {
        print "$ARGV $.: $_";
        close ARGV  if eof();           # reset $.
        reset       if eof();           # reset ?pat?
    }

If the word C<while> is replaced by the word C<until>, the sense of the
test is reversed, but the conditional is still tested before the first
iteration.

The loop control statements don't work in an C<if> or C<unless>, since
they aren't loops.  You can double the braces to make them such, though.

    if (/pattern/) {{
        last if /fred/;
        next if /barney/; # same effect as "last", but doesn't document as well
        # do something here
    }}

This is caused by the fact that a block by itself acts as a loop that
executes once, see L<"Basic BLOCKs and Switch Statements">.

The form C<while/if BLOCK BLOCK>, available in Perl 4, is no longer
available.   Replace any occurrence of C<if BLOCK> by C<if (do BLOCK)>.

=head2 For Loops
X<for> X<foreach>

Perl's C-style C<for> loop works like the corresponding C<while> loop;
that means that this:

    for ($i = 1; $i < 10; $i++) {
        ...
    }

is the same as this:

    $i = 1;
    while ($i < 10) {
        ...
    } continue {
        $i++;
    }

There is one minor difference: if variables are declared with C<my>
in the initialization section of the C<for>, the lexical scope of
those variables is exactly the C<for> loop (the body of the loop
and the control sections).
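
A short sketch of that scoping rule, with an outer variable
deliberately given the same name: the C<$i> declared in the
initialization section exists only for the duration of the loop, so the
outer C<$i> is left alone.

    use strict;
    use warnings;

    my $i = 'outer';
    my @inner;
    for (my $i = 0; $i < 3; $i++) {
        push @inner, $i;          # the loop's own $i
    }
    print "inside:  @inner\n";
    print "outside: $i\n";        # the outer $i is untouched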

Besides the normal array index looping, C<for> can lend itself
to many other interesting applications.  Here's one that avoids the
problem you get into if you explicitly test for end-of-file on
an interactive file descriptor causing your program to appear to
hang.
X<eof> X<end-of-file> X<end of file>

    $on_a_tty = -t STDIN && -t STDOUT;
    sub prompt { print "yes? " if $on_a_tty }
    for ( prompt(); <STDIN>; prompt() ) {
        # do something
    }

Using C<readline> (or the operator form, C<< <EXPR> >>) as the
conditional of a C<for> loop is shorthand for the following.  This
behaviour is the same as a C<while> loop conditional.
X<readline> X<< <> >>

    for ( prompt(); defined( $_ = <STDIN> ); prompt() ) {
        # do something
    }

=head2 Foreach Loops
X<for> X<foreach>

The C<foreach> loop iterates over a normal list value and sets the
variable VAR to be each element of the list in turn.  If the variable
is preceded with the keyword C<my>, then it is lexically scoped, and
is therefore visible only within the loop.  Otherwise, the variable is
implicitly local to the loop and regains its former value upon exiting
the loop.  If the variable was previously declared with C<my>, it uses
that variable instead of the global one, but it's still localized to
the loop.  This implicit localisation occurs I<only> in a C<foreach>
loop.
X<my> X<local>

The C<foreach> keyword is actually a synonym for the C<for> keyword, so
you can use C<foreach> for readability or C<for> for brevity.  (Or because
the Bourne shell is more familiar to you than I<csh>, so writing C<for>
comes more naturally.)  If VAR is omitted, C<$_> is set to each value.

If any element of LIST is an lvalue, you can modify it by modifying
VAR inside the loop.  Conversely, if any element of LIST is NOT an
lvalue, any attempt to modify that element will fail.  In other words,
the C<foreach> loop index variable is an implicit alias for each item
in the list that you're looping over.

If any part of LIST is an array, C<foreach> will get very confused if
you add or remove elements within the loop body, for example with
C<splice>.   So don't do that.

C<foreach> probably won't do what you expect if VAR is a tied or other
special variable.   Don't do that either.

Examples:

    for (@ary) { s/foo/bar/ }

    for my $elem (@elements) {
        $elem *= 2;
    }

    for $count (10,9,8,7,6,5,4,3,2,1,'BOOM') {
        print $count, "\n"; sleep(1);
    }

    for (1..15) { print "Merry Christmas\n"; }

    foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
        print "Item: $item\n";
    }

Here's how a C programmer might code up a particular algorithm in Perl:

    for (my $i = 0; $i < @ary1; $i++) {
        for (my $j = 0; $j < @ary2; $j++) {
            if ($ary1[$i] > $ary2[$j]) {
                last; # can't go to outer :-(
            }
            $ary1[$i] += $ary2[$j];
        }
        # this is where that last takes me
    }

Whereas here's how a Perl programmer more comfortable with the idiom might
do it:

    OUTER: for my $wid (@ary1) {
    INNER:   for my $jet (@ary2) {
                next OUTER if $wid > $jet;
                $wid += $jet;
             }
           }

See how much easier this is?  It's cleaner, safer, and faster.  It's
cleaner because it's less noisy.  It's safer because if code gets added
between the inner and outer loops later on, the new code won't be
accidentally executed.  The C<next> explicitly iterates the other loop
rather than merely terminating the inner one.  And it's faster because
Perl executes a C<foreach> statement more rapidly than it would the
equivalent C<for> loop.

=head2 Basic BLOCKs and Switch Statements
X<switch> X<block> X<case>

A BLOCK by itself (labeled or not) is semantically equivalent to a
loop that executes once.  Thus you can use any of the loop control
statements in it to leave or restart the block.  (Note that this is
I<NOT> true in C<eval{}>, C<sub{}>, or contrary to popular belief
C<do{}> blocks, which do I<NOT> count as loops.)  The C<continue>
block is optional.

The BLOCK construct is particularly nice for doing case
structures.

    SWITCH: {
        if (/^abc/) { $abc = 1; last SWITCH; }
        if (/^def/) { $def = 1; last SWITCH; }
        if (/^xyz/) { $xyz = 1; last SWITCH; }
        $nothing = 1;
    }

There is no official C<switch> statement in Perl, because there are
already several ways to write the equivalent.

However, starting with Perl 5.8, you can get C<switch> and C<case> by
using the Switch extension and saying:

	use Switch;

after which one has switch and case.  It is not as fast as it could be
because it's not really part of the language (it's done using source
filters) but it is available, and it's very flexible.

In addition to the above BLOCK construct, you could write

    SWITCH: {
        $abc = 1, last SWITCH  if /^abc/;
        $def = 1, last SWITCH  if /^def/;
        $xyz = 1, last SWITCH  if /^xyz/;
        $nothing = 1;
    }

(That's actually not as strange as it looks once you realize that you can
use loop control "operators" within an expression.  That's just the binary
comma operator in scalar context.  See L<perlop/"Comma Operator">.)

or

    SWITCH: {
        /^abc/ && do { $abc = 1; last SWITCH; };
        /^def/ && do { $def = 1; last SWITCH; };
        /^xyz/ && do { $xyz = 1; last SWITCH; };
        $nothing = 1;
    }

or formatted so it stands out more as a "proper" C<switch> statement:

    SWITCH: {
        /^abc/      && do {
                            $abc = 1;
                            last SWITCH;
                       };

        /^def/      && do {
                            $def = 1;
                            last SWITCH;
                       };

        /^xyz/      && do {
                            $xyz = 1;
                            last SWITCH;
                       };
        $nothing = 1;
    }

or

    SWITCH: {
        /^abc/ and $abc = 1, last SWITCH;
        /^def/ and $def = 1, last SWITCH;
        /^xyz/ and $xyz = 1, last SWITCH;
        $nothing = 1;
    }

or even, horrors,

    if (/^abc/)
        { $abc = 1 }
    elsif (/^def/)
        { $def = 1 }
    elsif (/^xyz/)
        { $xyz = 1 }
    else
        { $nothing = 1 }

A common idiom for a C<switch> statement is to use C<foreach>'s aliasing to make
a temporary assignment to C<$_> for convenient matching:

    SWITCH: for ($where) {
                /In Card Names/     && do { push @flags, '-e'; last; };
                /Anywhere/          && do { push @flags, '-h'; last; };
                /In Rulings/        && do {                    last; };
                die "unknown value for form variable where: `$where'";
            }

Another interesting approach to a switch statement is to arrange
for a C<do> block to return the proper value:

    $amode = do {
        if     ($flag & O_RDONLY) { "r" }       # XXX: isn't this 0?
        elsif  ($flag & O_WRONLY) { ($flag & O_APPEND) ? "a" : "w" }
        elsif  ($flag & O_RDWR)   {
            if ($flag & O_CREAT)  { "w+" }
            else                  { ($flag & O_APPEND) ? "a+" : "r+" }
        }
    };

Or

        print do {
            ($flags & O_WRONLY) ? "write-only"          :
            ($flags & O_RDWR)   ? "read-write"          :
                                  "read-only";
        };

Or if you are certain that all the C<&&> clauses are true, you can use
something like this, which "switches" on the value of the
C<HTTP_USER_AGENT> environment variable.

    # pick out jargon file page based on browser
    $dir = 'http://www.wins.uva.nl/~mes/jargon';
    for ($ENV{HTTP_USER_AGENT}) {
        $page  =    /Mac/            && 'm/Macintrash.html'
                 || /Win(dows )?NT/  && 'e/evilandrude.html'
                 || /Win|MSIE|WebTV/ && 'm/MicroslothWindows.html'
                 || /Linux/          && 'l/Linux.html'
                 || /HP-UX/          && 'h/HP-SUX.html'
                 || /SunOS/          && 's/ScumOS.html'
                 ||                     'a/AppendixB.html';
    }
    print "Location: $dir/$page\015\012\015\012";

That kind of switch statement only works when you know the C<&&> clauses
will be true.  If you don't, the previous C<?:> example should be used.

You might also consider writing a hash of subroutine references
instead of synthesizing a C<switch> statement.
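
A minimal sketch of such a dispatch table follows.  The command names,
the handlers and the C<run_command> helper are all hypothetical; the
point is only that a hash lookup replaces the chain of tests.

    use strict;
    use warnings;

    # Hypothetical commands and handlers, for illustration only.
    my %dispatch = (
        start => sub { "starting $_[0]" },
        stop  => sub { "stopping $_[0]" },
    );

    sub run_command {
        my ($command, @args) = @_;
        my $handler = $dispatch{$command}
            or return "unknown command: $command";   # the "default" case
        return $handler->(@args);
    }

    print run_command('start', 'server'), "\n";
    print run_command('dance', 'server'), "\n";

Adding a new case then means adding one hash entry rather than editing
control flow.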

=head2 Goto

Although not for the faint of heart, Perl does support a C<goto>
statement.  There are three forms: C<goto>-LABEL, C<goto>-EXPR, and
C<goto>-&NAME.  A loop's LABEL is not actually a valid target for
a C<goto>; it's just the name of the loop.

The C<goto>-LABEL form finds the statement labeled with LABEL and resumes
execution there.  It may not be used to go into any construct that
requires initialization, such as a subroutine or a C<foreach> loop.  It
also can't be used to go into a construct that is optimized away.  It
can be used to go almost anywhere else within the dynamic scope,
including out of subroutines, but it's usually better to use some other
construct such as C<last> or C<die>.  The author of Perl has never felt the
need to use this form of C<goto> (in Perl, that is--C is another matter).

The C<goto>-EXPR form expects a label name, whose scope will be resolved
dynamically.  This allows for computed C<goto>s per FORTRAN, but isn't
necessarily recommended if you're optimizing for maintainability:

    goto(("FOO", "BAR", "GLARCH")[$i]);

The C<goto>-&NAME form is highly magical, and substitutes a call to the
named subroutine for the currently running subroutine.  This is used by
C<AUTOLOAD()> subroutines that wish to load another subroutine and then
pretend that the other subroutine had been called in the first place
(except that any modifications to C<@_> in the current subroutine are
propagated to the other subroutine.)  After the C<goto>, not even C<caller()>
will be able to tell that this routine was called first.
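
Here is a hedged sketch of that C<AUTOLOAD> idiom.  The package name
C<Greeter> and the C<say_*> naming scheme are invented for the example;
the routine builds the missing subroutine on first use, installs it, and
then uses C<goto>-&NAME to hand over the original arguments.

    use strict;
    use warnings;

    package Greeter;    # hypothetical package name

    our $AUTOLOAD;

    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;                  # strip the package qualifier
        return if $name eq 'DESTROY';       # never autoload destructors
        die "no such function: $name" unless $name =~ /^say_(\w+)$/;
        my $word = $1;

        no strict 'refs';
        *{$AUTOLOAD} = sub { "$word, $_[0]!" };   # install the real sub
        goto &{$AUTOLOAD};                  # re-dispatch with the same @_
    }

    package main;

    print Greeter::say_hello('world'), "\n";   # goes through AUTOLOAD
    print Greeter::say_hello('again'), "\n";   # calls the installed sub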

In almost all cases like this, it's usually a far, far better idea to use the
structured control flow mechanisms of C<next>, C<last>, or C<redo> instead of
resorting to a C<goto>.  For certain applications, the catch and throw pair of
C<eval{}> and die() for exception processing can also be a prudent approach.
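
As a small sketch of that catch-and-throw pairing (the C<risky>
subroutine is a made-up helper), C<die> raises the exception and
C<eval {}> traps it, leaving the message in C<$@>:

    use strict;
    use warnings;

    sub risky {                       # hypothetical helper
        my ($n) = @_;
        die "negative input\n" if $n < 0;
        return sqrt $n;
    }

    for my $n (16, -1) {
        my $root = eval { risky($n) };
        if (defined $root) { print "root of $n is $root\n" }
        else               { print "caught: $@" }
    }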

=head2 PODs: Embedded Documentation
X<POD> X<documentation>

Perl has a mechanism for intermixing documentation with source code.
While it's expecting the beginning of a new statement, if the compiler
encounters a line that begins with an equal sign and a word, like this

    =head1 Here There Be Pods!

then that text and all remaining text up through and including a line
beginning with C<=cut> will be ignored.  The format of the intervening
text is described in L<perlpod>.

This allows you to intermix your source code
and your documentation text freely, as in

    =item snazzle($)

    The snazzle() function will behave in the most spectacular
    form that you can possibly imagine, not even excepting
    cybernetic pyrotechnics.

    =cut back to the compiler, nuff of this pod stuff!

    sub snazzle($) {
        my $thingie = shift;
        ...
    }

Note that pod translators should look at only paragraphs beginning
with a pod directive (it makes parsing easier), whereas the compiler
actually knows to look for pod escapes even in the middle of a
paragraph.  This means that the following secret stuff will be
ignored by both the compiler and the translators.

    =secret stuff
     warn "Neither POD nor CODE!?"
    =cut back
    print "got $a\n";

You probably shouldn't rely upon the C<warn()> being podded out forever.
Not all pod translators are well-behaved in this regard, and perhaps
the compiler will become pickier.

One may also use pod directives to quickly comment out a section
of code.
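
A sketch of that trick, with invented statements: everything between
the first pod directive and the C<=cut> line is skipped by the compiler.
In a real source file the directives must start in the first column;
they are indented here only because this is a verbatim paragraph.

    use strict;
    use warnings;

    my @log = ('before');

    =begin comment

    push @log, 'skipped';      # never compiled, never run

    =end comment

    =cut

    push @log, 'after';
    print "@log\n";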

=head2 Plain Old Comments (Not!)
X<comment> X<line> X<#> X<preprocessor> X<eval>

Perl can process line directives, much like the C preprocessor.  Using
this, one can control Perl's idea of filenames and line numbers in
error or warning messages (especially for strings that are processed
with C<eval()>).  The syntax for this mechanism is the same as for most
C preprocessors: it matches the regular expression

    # example: '# line 42 "new_filename.plx"'
    /^\#   \s*
      line \s+ (\d+)   \s*
      (?:\s("?)([^"]+)\2)? \s*
     $/x

with C<$1> being the line number for the next line, and C<$3> being
the optional filename (specified with or without quotes).

There is a fairly obvious gotcha included with the line directive:
Debuggers and profilers will only show the last source line to appear
at a particular line number in a given file.  Care should be taken not
to cause line number collisions in code you'd like to debug later.

Here are some examples that you should be able to type into your command
shell:

    % perl
    # line 200 "bzzzt"
    # the `#' on the previous line must be the first char on line
    die 'foo';
    foo at bzzzt line 201.

    % perl
    # line 200 "bzzzt"
    eval qq[\n#line 2001 ""\ndie 'foo']; print $@;
    foo at - line 2001.

    % perl
    eval qq[\n#line 200 "foo bar"\ndie 'foo']; print $@;
    foo at foo bar line 200.

    % perl
    # line 345 "goop"
    eval "\n#line " . __LINE__ . ' "' . __FILE__ ."\"\ndie 'foo'";
    print $@;
    foo at goop line 345.


--- NEW FILE: perlopentut.pod ---
=head1 NAME

perlopentut - tutorial on opening things in Perl

=head1 DESCRIPTION

Perl has two simple, built-in ways to open files: the shell way for
convenience, and the C way for precision.  The shell way also has 2- and
3-argument forms, which have different semantics for handling the filename.
The choice is yours.

=head1 Open E<agrave> la shell

Perl's C<open> function was designed to mimic the way command-line
redirection in the shell works.  Here are some basic examples
from the shell:

    $ myprogram file1 file2 file3
    $ myprogram    <  inputfile
    $ myprogram    >  outputfile
    $ myprogram    >> outputfile
    $ myprogram    |  otherprogram 
    $ otherprogram |  myprogram

And here are some more advanced examples:

    $ otherprogram      | myprogram f1 - f2
    $ otherprogram 2>&1 | myprogram -
    $ myprogram     <&3
    $ myprogram     >&4

Programmers accustomed to constructs like those above can take comfort
in learning that Perl directly supports these familiar constructs using
virtually the same syntax as the shell.

=head2 Simple Opens

The C<open> function takes two arguments: the first is a filehandle,
and the second is a single string comprising both what to open and how
to open it.  C<open> returns true when it works, and when it fails,
returns a false value and sets the special variable C<$!> to reflect
the system error.  If the filehandle was previously opened, it will
be implicitly closed first.

For example:

    open(INFO,      "datafile") || die("can't open datafile: $!");
    open(INFO,   "<  datafile") || die("can't open datafile: $!");
    open(RESULTS,">  runstats") || die("can't open runstats: $!");
    open(LOG,    ">> logfile ") || die("can't open logfile:  $!");

If you prefer the low-punctuation version, you could write that this way:

    open INFO,   "<  datafile"  or die "can't open datafile: $!";
    open RESULTS,">  runstats"  or die "can't open runstats: $!";
    open LOG,    ">> logfile "  or die "can't open logfile:  $!";

A few things to notice.  First, the leading less-than is optional.
If omitted, Perl assumes that you want to open the file for reading.

Note also that the first example uses the C<||> logical operator, and the
second uses C<or>, which has lower precedence.  Using C<||> in the latter
examples would effectively mean

    open INFO, ( "<  datafile"  || die "can't open datafile: $!" );

which is definitely not what you want.

The other important thing to notice is that, just as in the shell,
any whitespace before or after the filename is ignored.  This is good,
because you wouldn't want these to do different things:

    open INFO,   "<datafile"   
    open INFO,   "< datafile" 
    open INFO,   "<  datafile"

Ignoring surrounding whitespace also helps for when you read a filename
in from a different file, and forget to trim it before opening:

    $filename = <INFO>;         # oops, \n still there
    open(EXTRA, "< $filename") || die "can't open $filename: $!";

This is not a bug, but a feature.  Because C<open> mimics the shell in
its style of using redirection arrows to specify how to open the file, it
also does so with respect to extra whitespace around the filename
itself.  For accessing files with naughty names, see
L<"Dispelling the Dweomer">.

There is also a 3-argument version of C<open>, which lets you put the
special redirection characters into their own argument:

    open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

In this case, the filename to open is the actual string in C<$datafile>,
so you don't have to worry about C<$datafile> containing characters
that might influence the open mode, or whitespace at the beginning of
the filename that would be absorbed in the 2-argument version.  Also,
any reduction of unnecessary string interpolation is a good thing.
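
The difference can be seen in a short sketch.  It assumes a Unix-like
filesystem that allows a trailing space in a filename, and the name
F<notes.txt > (with that trailing space) is made up for the purpose.
The 3-argument form uses the name exactly as given, while the
2-argument form trims the whitespace and so looks for a different file.

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $name = "$dir/notes.txt ";      # note the trailing space

    open(my $out, ">", $name) or die "Can't create '$name': $!";
    print $out "hello\n";
    close($out)               or die "Can't close '$name': $!";

    open(my $in, "<", $name)  or die "Can't open '$name': $!";
    my $line = <$in>;
    close($in);
    print "3-argument form read: $line";

    # The 2-argument form strips the trailing space before opening.
    my $two_arg_ok = open(my $two, "< $name") ? 1 : 0;
    print "2-argument form ", ($two_arg_ok ? "opened" : "missed"), " it\n";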

=head2 Indirect Filehandles

C<open>'s first argument can be a reference to a filehandle.  As of
perl 5.6.0, if the argument is uninitialized, Perl will automatically
create a filehandle and put a reference to it in the first argument,
like so:

    open( my $in, $infile )   or die "Couldn't read $infile: $!";
    while ( <$in> ) {
        # do something with $_
    }
    close $in;

Indirect filehandles make namespace management easier.  Since filehandles
are global to the current package, two subroutines trying to open
C<INFILE> will clash.  With two functions opening indirect filehandles
like C<my $infile>, there's no clash and no need to worry about future
conflicts.

Another convenient behavior is that an indirect filehandle automatically
closes when it goes out of scope or when you undefine it:

    sub firstline {
        open( my $in, shift ) && return scalar <$in>;
        # no close() required
    }

=head2 Pipe Opens

In C, when you want to open a file using the standard I/O library,
you use the C<fopen> function, but when opening a pipe, you use the
C<popen> function.  But in the shell, you just use a different redirection
character.  That's also the case for Perl.  The C<open> call 
remains the same--just its argument differs.  

If the leading character is a pipe symbol, C<open> starts up a new
command and opens a write-only filehandle leading into that command.
This lets you write into that handle and have what you write show up on
that command's standard input.  For example:

    open(PRINTER, "| lpr -Plp1")    || die "can't run lpr: $!";
    print PRINTER "stuff\n";
    close(PRINTER)                  || die "can't close lpr: $!";

If the trailing character is a pipe, you start up a new command and open a
read-only filehandle leading out of that command.  This lets whatever that
command writes to its standard output show up on your handle for reading.
For example:

    open(NET, "netstat -i -n |")    || die "can't fork netstat: $!";
    while (<NET>) { }               # do something with input
    close(NET)                      || die "can't close netstat: $!";

What happens if you try to open a pipe to or from a non-existent
command?  If possible, Perl will detect the failure and set C<$!> as
usual.  But if the command contains special shell characters, such as
C<E<gt>> or C<*>, called 'metacharacters', Perl does not execute the
command directly.  Instead, Perl runs the shell, which then tries to
run the command.  This means that it's the shell that gets the error
indication.  In such a case, the C<open> call will only indicate
failure if Perl can't even run the shell.  See L<perlfaq8/"How can I
capture STDERR from an external command?"> to see how to cope with
this.  There's also an explanation in L<perlipc>.

If you would like to open a bidirectional pipe, the IPC::Open2
library will handle this for you.  Check out 
L<perlipc/"Bidirectional Communication with Another Process">
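
As a rough sketch of what that looks like, the following starts a
second copy of perl (found through C<$^X>) as the child and has it
upper-case whatever it is sent.  The choice of child command is just an
assumption made to keep the example self-contained.  The write side is
closed before reading so that the child sees end-of-file and the two
processes cannot deadlock waiting on each other.

    use strict;
    use warnings;
    use IPC::Open2;

    # The child is another perl that upper-cases each line of its input.
    my $pid = open2(my $from_child, my $to_child,
                    $^X, '-ne', 'print uc');

    print $to_child "hello\n";
    close($to_child);                  # send EOF so the child can finish
    my $reply = <$from_child>;
    close($from_child);
    waitpid($pid, 0);                  # reap the child

    print "child said: $reply";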

=head2 The Minus File

Again following the lead of the standard shell utilities, Perl's
C<open> function treats a file whose name is a single minus, "-", in a
special way.  If you open minus for reading, it really means to access
the standard input.  If you open minus for writing, it really means to
access the standard output.

If minus can be used as the default input or default output, what happens
if you open a pipe into or out of minus?  What's the default command it
would run?  The same script as you're currently running!  This is actually
a stealth C<fork> hidden inside an C<open> call.  See 
L<perlipc/"Safe Pipe Opens"> for details.

=head2 Mixing Reads and Writes

It is possible to specify both read and write access.  All you do is
add a "+" symbol in front of the redirection.  But as in the shell,
using a less-than on a file never creates a new file; it only opens an
existing one.  On the other hand, using a greater-than always clobbers
(truncates to zero length) an existing file, or creates a brand-new one
if there isn't an old one.  Adding a "+" for read-write doesn't affect
whether it only works on existing files or always clobbers existing ones.

    open(WTMP, "+< /usr/adm/wtmp") 
        || die "can't open /usr/adm/wtmp: $!";

    open(SCREEN, "+> lkscreen")
        || die "can't open lkscreen: $!";

    open(LOGFILE, "+>> /var/log/applog")
        || die "can't open /var/log/applog: $!";

The first one won't create a new file, and the second one will always
clobber an old one.  The third one will create a new file if necessary
and not clobber an old one, and it will allow you to read at any point
in the file, but all writes will always go to the end.  In short,
the first case is substantially more common than the second and third
cases, which are almost always wrong.  (If you know C, the plus in
Perl's C<open> is historically derived from the one in C's fopen(3S),
which it ultimately calls.)

In fact, when it comes to updating a file, unless you're working on
a binary file as in the WTMP case above, you probably don't want to
use this approach for updating.  Instead, Perl's B<-i> flag comes to
the rescue.  The following command takes all the C, C++, or yacc source
or header files and changes all their foo's to bar's, leaving
the old version in the original filename with a ".orig" tacked
on the end:

    $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

This is a short cut for some renaming games that are really
the best way to update textfiles.  See the second question in 
L<perlfaq5> for more details.
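
For the curious, the renaming games go roughly like the sketch below:
write the changed text to a new file, then shuffle the names around.
The C<replace_foo> function and the F<.tmp> suffix are inventions for
this example, and B<-i> itself does its renaming in a slightly
different order:

    # Hypothetical helper; roughly what -i.orig does for one file.
    sub replace_foo {
        my $old = shift;
        my $new = "$old.tmp";        # assumed not to be in use already
        my $bak = "$old.orig";

        open(my $in,  "< $old")     or die "can't open $old: $!";
        open(my $out, "> $new")     or die "can't open $new: $!";
        while (<$in>) {
            s/\bfoo\b/bar/g;
            print $out $_           or die "can't write $new: $!";
        }
        close($in)                  or die "can't close $old: $!";
        close($out)                 or die "can't close $new: $!";

        rename($old, $bak)          or die "can't rename $old to $bak: $!";
        rename($new, $old)          or die "can't rename $new to $old: $!";
    }

Calling C<replace_foo("notes.txt")> would leave the edited text in
F<notes.txt> and the untouched original in F<notes.txt.orig>.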

=head2 Filters 

One of the most common uses for C<open> is one you never
even notice.  When you process the ARGV filehandle using
C<< <ARGV> >>, Perl actually does an implicit open 
on each file in @ARGV.  Thus a program called like this:

    $ myprogram file1 file2 file3

can have all its files opened and processed one at a time
using a construct no more complex than:

    while (<>) {
        # do something with $_
    }

If @ARGV is empty when the loop first begins, Perl pretends you've opened
up minus, that is, the standard input.  In fact, $ARGV, the currently
open file during C<< <ARGV> >> processing, is even set to "-"
in these circumstances.

You are welcome to pre-process your @ARGV before starting the loop to
make sure it's to your liking.  One reason to do this might be to remove
command options beginning with a minus.  While you can always roll the
simple ones by hand, the Getopt modules are good for this:

    use Getopt::Std;

    # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
    getopts("vDo:");

    # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
    getopts("vDo:", \%args);

Or the standard Getopt::Long module to permit named arguments:

    use Getopt::Long;
    GetOptions( "verbose"  => \$verbose,        # --verbose
                "Debug"    => \$debug,          # --Debug
                "output=s" => \$output );       # --output=somestring or
                                                # --output somestring

Another reason for preprocessing arguments is to make an empty
argument list default to all files:

    @ARGV = glob("*") unless @ARGV;

You could even filter out all but plain, text files.  This is a bit
silent, of course, and you might prefer to mention them on the way.

    @ARGV = grep { -f && -T } @ARGV;

If you're using the B<-n> or B<-p> command-line options, you
should put changes to @ARGV in a C<BEGIN{}> block.

Remember that a normal C<open> has special properties, in that it might
call fopen(3S) or it might call popen(3S), depending on what its
argument looks like; that's why it's sometimes called "magic open".
Here's an example:

    $pwdinfo = `domainname` =~ /^(\(none\))?$/
                    ? '< /etc/passwd'
                    : 'ypcat passwd |';

    open(PWD, $pwdinfo)                 
                or die "can't open $pwdinfo: $!";

This sort of thing also comes into play in filter processing.  Because
C<< <ARGV> >> processing employs the normal, shell-style Perl C<open>,
it respects all the special things we've already seen:

    $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

That program will read from the file F<f1>, the process F<cmd1>, standard
input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command,
and finally the F<f3> file.

Yes, this also means that if you have files named "-" (and so on) in
your directory, they won't be processed as literal files by C<open>.
You'll need to pass them as "./-", much as you would for the I<rm> program,
or you could use C<sysopen> as described below.

One of the more interesting applications is to change files of a certain
name into pipes.  For example, to autoprocess gzipped or compressed
files by decompressing them with I<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_  } @ARGV;

Or, if you have the I<GET> program installed from LWP,
you can fetch URLs before processing them:

    @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

It's not for nothing that this is called magic C<< <ARGV> >>.
Pretty nifty, eh?

=head1 Open E<agrave> la C

If you want the convenience of the shell, then Perl's C<open> is
definitely the way to go.  On the other hand, if you want finer precision
than C's simplistic fopen(3S) provides you should look to Perl's
C<sysopen>, which is a direct hook into the open(2) system call.
That does mean it's a bit more involved, but that's the price of
precision.

C<sysopen> takes 3 (or 4) arguments.

    sysopen HANDLE, PATH, FLAGS, [MASK]

The HANDLE argument is a filehandle just as with C<open>.  The PATH is
a literal path, one that doesn't pay attention to any greater-thans or
less-thans or pipes or minuses, nor ignore whitespace.  If it's there,
it's part of the path.  The FLAGS argument contains one or more values
derived from the Fcntl module that have been or'd together using the
bitwise "|" operator.  The final argument, the MASK, is optional; if
present, it is combined with the user's current umask for the creation
mode of the file.  You should usually omit this.

Although the traditional values of read-only, write-only, and read-write
are 0, 1, and 2 respectively, this is known not to hold true on some
systems.  Instead, it's best to load in the appropriate constants first
from the Fcntl module, which supplies the following standard flags:

    O_RDONLY            Read only
    O_WRONLY            Write only
    O_RDWR              Read and write
    O_CREAT             Create the file if it doesn't exist
    O_EXCL              Fail if the file already exists
    O_APPEND            Append to the file
    O_TRUNC             Truncate the file
    O_NONBLOCK          Non-blocking access

Less common flags that are sometimes available on some operating
systems include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>,
C<O_NOCTTY>, C<O_NDELAY> and C<O_LARGEFILE>.  Consult your open(2)
manpage or its local equivalent for details.  (Note: starting from
Perl release 5.6 the C<O_LARGEFILE> flag, if available, is automatically
added to the sysopen() flags because large files are the default.)

Here's how to use C<sysopen> to emulate the simple C<open> calls we had
before.  We'll omit the C<|| die $!> checks for clarity, but make sure
you always check the return values in real code.  These aren't quite
the same, since C<open> will trim leading and trailing whitespace,
but you'll get the idea.

To open a file for reading:

    open(FH, "< $path");
    sysopen(FH, $path, O_RDONLY);

To open a file for writing, creating a new file if needed or else truncating
an old file:

    open(FH, "> $path");
    sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);

To open a file for appending, creating one if necessary:

    open(FH, ">> $path");
    sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

To open a file for update, where the file must already exist:

    open(FH, "+< $path");
    sysopen(FH, $path, O_RDWR);

And here are things you can do with C<sysopen> that you cannot do with
a regular C<open>.  As you'll see, it's just a matter of controlling the
flags in the third argument.

To open a file for writing, creating a new file which must not previously
exist:

    sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);

To open a file for appending, where that file must already exist:

    sysopen(FH, $path, O_WRONLY | O_APPEND);

To open a file for update, creating a new file if necessary:

    sysopen(FH, $path, O_RDWR | O_CREAT);

To open a file for update, where that file must not already exist:

    sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);

To open a file without blocking, creating one if necessary:

    sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);

=head2 Permissions E<agrave> la mode

If you omit the MASK argument to C<sysopen>, Perl uses the octal value
0666.  The normal MASK to use for executables and directories should
be 0777, and for anything else, 0666.

Why so permissive?  Well, it isn't really.  The MASK will be modified
by your process's current C<umask>.  A umask is a number representing
I<disabled> permissions bits; that is, bits that will not be turned on
in the created files' permissions field.

For example, if your C<umask> were 027, then the 020 part would
disable the group from writing, and the 007 part would disable others
from reading, writing, or executing.  Under these conditions, passing
C<sysopen> 0666 would create a file with mode 0640, since C<0666 & ~027>
is 0640.
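
If octal arithmetic isn't your strong suit, you can let Perl do the sum.
This little sketch just replays the numbers from the paragraph above:

    my $mask  = 0666;               # what sysopen uses when MASK is omitted
    my $umask = 027;                # the example umask from the text
    my $mode  = $mask & ~$umask;    # permission bits that survive

    printf "%04o & ~%03o = %04o\n", $mask, $umask, $mode;

That prints C<0666 & ~027 = 0640>.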

You should seldom use the MASK argument to C<sysopen()>.  That takes
away the user's freedom to choose what permission new files will have.
Denying choice is almost always a bad thing.  One exception would be for
cases where sensitive or private data is being stored, such as with mail
folders, cookie files, and internal temporary files.
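
For that exceptional case, a sketch along these lines would do.  The
path is hypothetical, and the mode you end up with is still filtered
through the umask, so expect 0600 only if the umask leaves the owner's
bits alone:

    use Fcntl;

    # Hypothetical path, for illustration only.
    my $path = "/tmp/private-notes.$$";

    sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT, 0600)
        || die "can't create $path: $!";
    print $fh "for my eyes only\n";
    close($fh)
        || die "can't close $path: $!";

    printf "%s has mode %04o\n", $path, (stat $path)[2] & 07777;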

=head1 Obscure Open Tricks

=head2 Re-Opening Files (dups)

Sometimes you already have a filehandle open, and want to make another
handle that's a duplicate of the first one.  In the shell, we place an
ampersand in front of a file descriptor number when doing redirections.
For example, C<< 2>&1 >> makes descriptor 2 (that's STDERR in Perl)
be redirected into descriptor 1 (which is usually Perl's STDOUT).
The same is essentially true in Perl: a filename that begins with an
ampersand is treated instead as a file descriptor if a number, or as a
filehandle if a string.

    open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
    open(MHCONTEXT, "<&4")     || die "couldn't dup fd4: $!";

That means that if a function is expecting a filename, but you don't
want to give it a filename because you already have the file open, you
can just pass the filehandle with a leading ampersand.  It's best to
use a fully qualified handle though, just in case the function happens
to be in a different package:

    somefunction("&main::LOGFILE");

This way if somefunction() is planning on opening its argument, it can
just use the already opened handle.  This differs from passing a handle,
because with a handle, you don't open the file.  Here you have something
you can pass to open.
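
To see why this works, consider what a function on the receiving end
might look like.  This C<somefunction> is purely hypothetical; all that
matters is that it glues its argument onto a C<< "<" >> and calls
C<open>, so a leading ampersand turns the open into a dup:

    # A hypothetical somefunction() that opens whatever it is given.
    sub somefunction {
        my $source = shift;         # "/etc/motd" or "&main::LOGFILE"
        open(my $in, "<$source") || die "can't open $source: $!";
        my $first = <$in>;          # read the first line
        close($in);
        return $first;
    }

Both C<somefunction("/etc/motd")> and, once LOGFILE is open for reading,
C<somefunction("&main::LOGFILE")> go through the same code.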

If you have one of those tricky, newfangled I/O objects that the C++
folks are raving about, then this doesn't work because those aren't a
proper filehandle in the native Perl sense.  You'll have to use fileno()
to pull out the proper descriptor number, assuming you can:

    use IO::Socket;
    $handle = IO::Socket::INET->new("www.perl.com:80");
    $fd = $handle->fileno;
    somefunction("&$fd");  # not an indirect function call

It can be easier (and certainly will be faster) just to use real
filehandles though:

    use IO::Socket;
    local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
    die "can't connect" unless defined(fileno(REMOTE));

If the filehandle or descriptor number is preceded not just with a simple
"&" but rather with a "&=" combination, then Perl will not create a
completely new descriptor opened to the same place using the dup(2)
system call.  Instead, it will just make something of an alias to the
existing one using the fdopen(3S) library call.  This is slightly more
parsimonious of system resources, although this is less a concern
these days.  Here's an example of that:

    $fd = $ENV{"MHCONTEXTFD"};
    open(MHCONTEXT, "<&=$fd")   or die "couldn't fdopen $fd: $!";

If you're using magic C<< <ARGV> >>, you could even pass in as a
command line argument in @ARGV something like C<"<&=$MHCONTEXTFD">,
but we've never seen anyone actually do this.

=head2 Dispelling the Dweomer

Perl is more of a DWIMmer language than something like Java--where DWIM
is an acronym for "do what I mean".  But this principle sometimes leads
to more hidden magic than one knows what to do with.  In this way, Perl
is also filled with I<dweomer>, an obscure word meaning an enchantment.
Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.

If magic C<open> is a bit too magical for you, you don't have to turn
to C<sysopen>.  To open a file with arbitrary weird characters in
it, it's necessary to protect any leading and trailing whitespace.
Leading whitespace is protected by inserting a C<"./"> in front of a
filename that starts with whitespace.  Trailing whitespace is protected
by appending an ASCII NUL byte (C<"\0">) at the end of the string.

    $file =~ s#^(\s)#./$1#;
    open(FH, "< $file\0")   || die "can't open $file: $!";

This assumes, of course, that your system considers dot the current
working directory, slash the directory separator, and disallows ASCII
NULs within a valid filename.  Most systems follow these conventions,
including all POSIX systems as well as proprietary Microsoft systems.
The only vaguely popular system that doesn't work this way is the
"Classic" Macintosh system, which uses a colon where the rest of us
use a slash.  Maybe C<sysopen> isn't such a bad idea after all.
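
Another way out, assuming your Perl is 5.6 or later, is the
three-argument form of C<open>, which keeps the mode and the path
apart so that the path is taken literally.  The file name below is a
deliberately awkward invention:

    # Hypothetical name: leading and trailing blanks plus a metacharacter.
    my $file = " >odd name ";

    open(my $out, ">", $file)   || die "can't create '$file': $!";
    print $out "still just a file\n";
    close($out)                 || die "can't close '$file': $!";

    open(my $in, "<", $file)    || die "can't open '$file': $!";
    my $line = <$in>;
    close($in);

    unlink($file)               || die "can't remove '$file': $!";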

If you want to use C<< <ARGV> >> processing in a totally boring
and non-magical way, you could do this first:

    #   "Sam sat on the ground and put his head in his hands.  
    #   'I wish I had never come here, and I don't want to see 
    #   no more magic,' he said, and fell silent."
    for (@ARGV) {
        s#^([^./])#./$1#;
        $_ .= "\0";
    }
    while (<>) {
        # now process $_
    }

But be warned that users will not appreciate being unable to use "-"
to mean standard input, per the standard convention.

=head2 Paths as Opens

You've probably noticed how Perl's C<warn> and C<die> functions can
produce messages like:

    Some warning at scriptname line 29, <FH> line 7.

That's because you opened a filehandle FH, and had read in seven records
from it.  But what was the name of the file, rather than the handle?

If you aren't running with C<strict refs>, or if you've turned them off
temporarily, then all you have to do is this:

    open($path, "< $path") || die "can't open $path: $!";
    while (<$path>) {
        # whatever
    }

Since you're using the pathname of the file as its handle,
you'll get warnings more like

    Some warning at scriptname line 29, </etc/motd> line 7.

=head2 Single Argument Open

Remember how we said that Perl's open took two arguments?  That was a
passive prevarication.  You see, it can also take just one argument.
If and only if the variable is a global variable, not a lexical, you
can pass C<open> just one argument, the filehandle, and it will 
get the path from the global scalar variable of the same name.

    $FILE = "/etc/motd";
    open FILE or die "can't open $FILE: $!";
    while (<FILE>) {
        # whatever
    }

Why is this here?  Someone has to cater to the hysterical porpoises.
It's something that's been in Perl since the very beginning, if not
before.

=head2 Playing with STDIN and STDOUT

One clever move with STDOUT is to explicitly close it when you're done
with the program.

    END { close(STDOUT) || die "can't close stdout: $!" }

If you don't do this, and your program fills up the disk partition due
to a command line redirection, it won't report the error or exit with a
failure status.

You don't have to accept the STDIN and STDOUT you were given.  You are
welcome to reopen them if you'd like.

    open(STDIN, "< datafile")
	|| die "can't open datafile: $!";

    open(STDOUT, "> output")
	|| die "can't open output: $!";

And then these can be accessed directly or passed on to subprocesses.
This makes it look as though the program were initially invoked
with those redirections from the command line.
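
If you want the original STDOUT back afterwards, dup it first, as in
this sketch.  The file name F<output> is the one from the example
above, and the I<echo> command is assumed to exist; it is there only to
show that a subprocess inherits the redirection:

    $| = 1;     # autoflush, so our output isn't left sitting in a buffer

    open(SAVEOUT, ">&STDOUT")   || die "can't dup STDOUT: $!";
    open(STDOUT, "> output")    || die "can't open output: $!";

    print "this goes to the file\n";
    system("echo so does this");            # assumes an echo command

    close(STDOUT)               || die "can't close output: $!";
    open(STDOUT, ">&SAVEOUT")   || die "can't restore STDOUT: $!";
    close(SAVEOUT)              || die "can't close SAVEOUT: $!";

    print "and this goes wherever STDOUT pointed originally\n";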

It's probably more interesting to connect these to pipes.  For example:

    $pager = $ENV{PAGER} || "(less || more)";
    open(STDOUT, "| $pager")
	|| die "can't fork a pager: $!";

This makes it appear as though your program were called with its stdout
already piped into your pager.  You can also use this kind of thing
in conjunction with an implicit fork to yourself.  You might do this
if you would rather handle the post processing in your own program,
just in a different process:

    head(100);
    while (<>) {
        print;
    }

    sub head {
        my $lines = shift || 20;
        return if $pid = open(STDOUT, "|-");       # return if parent
        die "cannot fork: $!" unless defined $pid;
        while (<STDIN>) {
            last if --$lines < 0;
            print;
        }
        exit;
    }

This technique can be applied to repeatedly push as many filters on your
output stream as you wish.

=head1 Other I/O Issues

These topics aren't really arguments related to C<open> or C<sysopen>,
but they do affect what you do with your open files.

=head2 Opening Non-File Files

When is a file not a file?  Well, you could say when it exists but
isn't a plain file.   We'll check whether it's a symbolic link first,
just in case.

    if (-l $file || ! -f _) {
        print "$file is not a plain file\n";
    }

What other kinds of files are there than, well, files?  Directories,
symbolic links, named pipes, Unix-domain sockets, and block and character
devices.  Those are all files, too--just not I<plain> files.  This isn't
the same issue as being a text file. Not all text files are plain files.
Not all plain files are text files.  That's why there are separate C<-f>
and C<-T> file tests.
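
As a sketch of how the various file tests fit together, here's a
hypothetical C<kind_of_file> helper (the name and the labels it returns
are made up for this example) that says which of these a path is:

    # A hypothetical helper that names the kind of file at a path.
    sub kind_of_file {
        my $file = shift;
        return "symbolic link"    if -l $file;    # lstat: the link itself
        return "nonexistent"      unless -e $file;
        return "directory"        if -d _;        # _ reuses the last stat
        return "named pipe"       if -p _;
        return "socket"           if -S _;
        return "block device"     if -b _;
        return "character device" if -c _;
        return "something exotic" unless -f _;
        return -T $file ? "plain text file" : "plain binary file";
    }

    print "$_: ", kind_of_file($_), "\n" for @ARGV;

Note that C<-T> is only a heuristic guess based on the start of the
file's contents.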

To open a directory, you should use the C<opendir> function, then
process it with C<readdir>, carefully restoring the directory 
name if necessary:

    opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
    while (defined($file = readdir(DIR))) {
        # do something with "$dirname/$file"
    }
    closedir(DIR);

If you want to process directories recursively, it's better to use the
File::Find module.  For example, this prints out all files recursively
and adds a slash to their names if the file is a directory.

    @ARGV = qw(.) unless @ARGV;
    use File::Find;
    find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;

This finds all bogus symbolic links beneath a particular directory:

    find sub { print "$File::Find::name\n" if -l && !-e }, $dir;

As you see, with symbolic links, you can just pretend that it is
what it points to.  Or, if you want to know I<what> it points to, then
C<readlink> is called for:

    if (-l $file) {
        if (defined($whither = readlink($file))) {
            print "$file points to $whither\n";
        } else {
            print "$file points nowhere: $!\n";
        }
    }

=head2 Opening Named Pipes

Named pipes are a different matter.  You pretend they're regular files,
but their opens will normally block until there is both a reader and
a writer.  You can read more about them in L<perlipc/"Named Pipes">.
Unix-domain sockets are rather different beasts as well; they're
described in L<perlipc/"Unix-Domain TCP Clients and Servers">.
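
To watch the blocking behaviour of a named pipe in action, here is a
minimal sketch for a Unix-like system.  It uses C<mkfifo> from the POSIX
module, a scratch path that is only an example, and a forked child as
the writer; each C<open> waits for the other side to turn up:

    use POSIX qw(mkfifo);

    my $fifo = "/tmp/demo-fifo.$$";         # hypothetical scratch path
    mkfifo($fifo, 0600) || die "can't mkfifo $fifo: $!";

    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;

    if ($pid == 0) {                        # child: the writer
        open(my $w, "> $fifo") || die "can't write $fifo: $!";
        print $w "through the pipe\n";
        close($w);
        exit 0;
    }

    open(my $r, "< $fifo") || die "can't read $fifo: $!";   # parent: reader
    my $line = <$r>;
    close($r);
    waitpid($pid, 0);                       # reap the child
    unlink($fifo);

    print "read: $line";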

When it comes to opening devices, it can be easy and it can be tricky.
We'll assume that if you're opening up a block device, you know what
you're doing.  The character devices are more interesting.  These are
typically used for modems, mice, and some kinds of printers.  This is
described in L<perlfaq8/"How do I read and write the serial port?">.
It's often enough to open them carefully:

    sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
		# (O_NOCTTY no longer needed on POSIX systems)
        or die "can't open /dev/ttyS1: $!";
    open(TTYOUT, "+>&TTYIN")
        or die "can't dup TTYIN: $!";

    $ofh = select(TTYOUT); $| = 1; select($ofh);

    print TTYOUT "+++at\015";
    $answer = <TTYIN>;

With descriptors that you haven't opened using C<sysopen>, such as
sockets, you can set them to be non-blocking using C<fcntl>:

    use Fcntl;
    my $old_flags = fcntl($handle, F_GETFL, 0) 
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK) 
        or die "can't set non blocking: $!";

Rather than losing yourself in a morass of twisting, turning C<ioctl>s,
all dissimilar, if you're going to manipulate ttys, it's best to
make calls out to the stty(1) program if you have it, or else use the
portable POSIX interface.  To figure this all out, you'll need to read the
termios(3) manpage, which describes the POSIX interface to tty devices,
and then L<POSIX>, which describes Perl's interface to POSIX.  There are
also some high-level modules on CPAN that can help you with these games.
Check out Term::ReadKey and Term::ReadLine.

=head2 Opening Sockets

What else can you open?  To open a connection using sockets, you won't use
one of Perl's two open functions.  See 
L<perlipc/"Sockets: Client/Server Communication"> for that.  Here's an 
example.  Once you have it, you can use FH as a bidirectional filehandle.

    use IO::Socket;
    local *FH = IO::Socket::INET->new("www.perl.com:80");

For opening up a URL, the LWP modules from CPAN are just what
the doctor ordered.  There's no filehandle interface, but
it's still easy to get the contents of a document:

    use LWP::Simple;
    $doc = get('http://www.linpro.no/lwp/');

=head2 Binary Files

On certain legacy systems with what could charitably be called terminally
convoluted (some would say broken) I/O models, a file isn't a file--at
least, not with respect to the C standard I/O library.  On these old
systems whose libraries (but not kernels) distinguish between text and
binary streams, to get files to behave properly you'll have to bend over
backwards to avoid nasty problems.  On such infelicitous systems, sockets
and pipes are already opened in binary mode, and there is currently no
way to turn that off.  With files, you have more options.

One option is to use the C<binmode> function on the appropriate
handles before doing regular I/O on them:

    binmode(STDIN);
    binmode(STDOUT);
    while (<STDIN>) { print }

Passing C<sysopen> a non-standard flag option will also open the file in
binary mode on those systems that support it.  This is the equivalent of
opening the file normally, then calling C<binmode> on the handle.

    sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
        || die "can't open records.data: $!";

Now you can use C<read> and C<print> on that handle without worrying
about the non-standard system I/O library breaking your data.  It's not
a pretty picture, but then, legacy systems seldom are.  CP/M will be
with us until the end of days, and after.

On systems with exotic I/O systems, it turns out that, astonishingly
enough, even unbuffered I/O using C<sysread> and C<syswrite> might do
sneaky data mutilation behind your back.

    while (sysread(WHENCE, $buf, 1024)) {
        syswrite(WHITHER, $buf, length($buf));
    }

Depending on the vicissitudes of your runtime system, even these calls
may need C<binmode> or C<O_BINARY> first.  Systems known to be free of
such difficulties include Unix, the Mac OS, Plan 9, and Inferno.

=head2 File Locking

In a multitasking environment, you may need to be careful not to collide
with other processes that want to do I/O on the same files you
are working on.  You'll often need shared or exclusive locks
on files for reading and writing respectively.  You might just
pretend that only exclusive locks exist.

Never use the existence of a file C<-e $file> as a locking indication,
because there is a race condition between the test for the existence of
the file and its creation.  It's possible for another process to create
a file in the slice of time between your existence check and your attempt
to create the file.  Atomicity is critical.
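
If what you really need is "create this file, but only if it isn't
there yet", let the kernel do the test and the creation in one step
with C<O_EXCL> instead of testing with C<-e> first.  The C<claim_file>
helper and the file name below are made up for this sketch, and some
older network filesystems are known not to honour C<O_EXCL> reliably:

    use Fcntl;

    # Hypothetical helper: returns an open handle if we created the
    # file ourselves, or nothing if it was already there.
    sub claim_file {
        my $path = shift;
        sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT) || return;
        return $fh;
    }

    my $lockfile = "/tmp/myprogram.$$.lock";    # hypothetical name
    if (my $fh = claim_file($lockfile)) {
        print $fh "$$\n";
        close($fh) || die "can't close $lockfile: $!";
        print "created $lockfile\n";
        unlink($lockfile);
    } else {
        print "can't claim $lockfile: $!\n";
    }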

Perl's most portable locking interface is via the C<flock> function,
whose simplicity is emulated on systems that don't directly support it
such as SysV or Windows.  The underlying semantics may affect how
it all works, so you should learn how C<flock> is implemented on your
system's port of Perl.

File locking I<does not> lock out another process that would like to
do I/O.  A file lock only locks out others trying to get a lock, not
processes trying to do I/O.  Because locks are advisory, if one process
uses locking and another doesn't, all bets are off.

By default, the C<flock> call will block until a lock is granted.
A request for a shared lock will be granted as soon as there is no
exclusive locker.  A request for an exclusive lock will be granted as
soon as there is no locker of any kind.  Locks are on file descriptors,
not file names.  You can't lock a file until you open it, and you can't
hold on to a lock once the file has been closed.

Here's how to get a blocking shared lock on a file, typically used
for reading:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    flock(FH, LOCK_SH) 	    or die "can't lock filename: $!";
    # now read from FH

You can get a non-blocking lock by using C<LOCK_NB>.

    flock(FH, LOCK_SH | LOCK_NB)
        or die "can't lock filename: $!";

This can be useful for producing more user-friendly behaviour by warning
if you're going to be blocking:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    unless (flock(FH, LOCK_SH | LOCK_NB)) {
        $| = 1;
        print "Waiting for lock...";
        flock(FH, LOCK_SH)  or die "can't lock filename: $!";
        print "got it.\n"
    }
    # now read from FH

To get an exclusive lock, typically used for writing, you have to be
careful.  We C<sysopen> the file so it can be locked before it gets
emptied.  You can get a nonblocking version using C<LOCK_EX | LOCK_NB>.

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "filename", O_WRONLY | O_CREAT)
        or die "can't open filename: $!";
    flock(FH, LOCK_EX)
        or die "can't lock filename: $!";
    truncate(FH, 0)
        or die "can't truncate filename: $!";
    # now write to FH

Finally, due to the uncounted millions who cannot be dissuaded from
wasting cycles on useless vanity devices called hit counters, here's
how to increment a number in a file safely:

    use Fcntl qw(:DEFAULT :flock);

    sysopen(FH, "numfile", O_RDWR | O_CREAT)
        or die "can't open numfile: $!";
    # autoflush FH
    $ofh = select(FH); $| = 1; select ($ofh);
    flock(FH, LOCK_EX)
        or die "can't write-lock numfile: $!";

    $num = <FH> || 0;
    seek(FH, 0, 0)
        or die "can't rewind numfile : $!";
    print FH $num+1, "\n"
        or die "can't write numfile: $!";

    truncate(FH, tell(FH))
        or die "can't truncate numfile: $!";
    close(FH)
        or die "can't close numfile: $!";

=head2 IO Layers

In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced.
This is a new "plumbing" for all the I/O happening in Perl; for the
most part everything will work just as it did, but PerlIO also brought
in some new features such as the ability to think of I/O as "layers".
One I/O layer may in addition to just moving the data also do
transformations on the data.  Such transformations may include
compression and decompression, encryption and decryption, and transforming
between various character encodings.

Full discussion about the features of PerlIO is out of scope for this
tutorial, but here is how to recognize the layers being used:

=over 4

=item *

The three-(or more)-argument form of C<open> is being used and the
second argument contains something else in addition to the usual
C<< '<' >>, C<< '>' >>, C<< '>>' >>, C<< '|' >> and their variants,
for example:

    open(my $fh, "<:utf8", $fn);

=item *

The two-argument form of C<binmode> is being used, for example

    binmode($fh, ":encoding(utf16)");

=back

For more detailed discussion about PerlIO see L<PerlIO>;
for more detailed discussion about Unicode and I/O see L<perluniintro>.
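
As a small taste of what a layer does, this sketch writes one
character through the C<:utf8> layer and then reads the file back
twice, once decoded and once as raw bytes.  The scratch file name is
only an example:

    my $fn = "/tmp/layers-demo.$$";         # hypothetical scratch file

    open(my $out, ">:utf8", $fn)    || die "can't create $fn: $!";
    print $out chr(0x263A), "\n";           # one character, a smiling face
    close($out)                     || die "can't close $fn: $!";

    open(my $chars, "<:utf8", $fn)  || die "can't open $fn: $!";
    my $decoded = <$chars>;                 # read as characters
    close($chars);

    open(my $bytes, "<", $fn)       || die "can't open $fn: $!";
    binmode($bytes);                        # no layers: just the bytes
    my $raw = <$bytes>;
    close($bytes);

    chomp($decoded, $raw);
    printf "%d character, %d bytes\n", length($decoded), length($raw);
    unlink($fn);

That prints C<1 character, 3 bytes>, since the UTF-8 encoding of that
character takes three bytes on disk.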

=head1 SEE ALSO 

The C<open> and C<sysopen> functions in perlfunc(1);
the system open(2), dup(2), fopen(3), and fdopen(3) manpages;
the POSIX documentation.


=head1 AUTHOR and COPYRIGHT

Copyright 1998 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit.  A simple comment in the code giving credit would be
courteous but is not required.

=head1 HISTORY

First release: Sat Jan  9 08:09:11 MST 1999

--- NEW FILE: perlmodstyle.pod ---
=head1 NAME

perlmodstyle - Perl module style guide


=head1 INTRODUCTION

This document attempts to describe the Perl Community's "best practice"
for writing Perl modules.  It extends the recommendations found in
L<perlstyle>, which should be considered required reading
before reading this document.

While this document is intended to be useful to all module authors, it is
particularly aimed at authors who wish to publish their modules on CPAN.

The focus is on elements of style which are visible to the users of a 
module, rather than those parts which are only seen by the module's 
developers.  However, many of the guidelines presented in this document
can be extrapolated and applied successfully to a module's internals.

This document differs from L<perlnewmod> in that it is a style guide
rather than a tutorial on creating CPAN modules.  It provides a
checklist against which modules can be compared to determine whether
they conform to best practice, without necessarily describing in detail
how to achieve this.  

All the advice contained in this document has been gleaned from
extensive conversations with experienced CPAN authors and users.  Every
piece of advice given here is the result of previous mistakes.  This
information is here to help you avoid the same mistakes and the extra
work that would inevitably be required to fix them.

The first section of this document provides an itemized checklist; 
subsequent sections provide a more detailed discussion of the items on 
the list.  The final section, "Common Pitfalls", describes some of the 
most popular mistakes made by CPAN authors.


=head1 QUICK CHECKLIST

For more detail on each item in this checklist, see below.

=head2 Before you start

=over 4

=item *

Don't re-invent the wheel

=item *

Patch, extend or subclass an existing module where possible

=item *

Do one thing and do it well

=item *

Choose an appropriate name

=back


=head2 The API

=over 4

=item *

API should be understandable by the average programmer

=item *

Simple methods for simple tasks

=item *

Separate functionality from output

=item *

Consistent naming of subroutines or methods

=item *

Use named parameters (a hash or hashref) when there are more than two

=back


=head2 Stability

=over 4

=item *

Ensure your module works under C<use strict> and C<-w>

=item *

Stable modules should maintain backwards compatibility

=back


=head2 Documentation

=over 4

=item *

Write documentation in POD

=item *

Document purpose, scope and target applications

=item *

Document each publicly accessible method or subroutine, including params and return values

=item *

Give examples of use in your documentation

=item *

Provide a README file and perhaps also release notes, changelog, etc

=item *

Provide links to further information (URL, email)

=back


=head2 Release considerations

=over 4

=item *

Specify pre-requisites in Makefile.PL or Build.PL

=item *

Specify Perl version requirements with C<use>

=item *

Include tests with your module

=item *

Choose a sensible and consistent version numbering scheme (X.YY is the common Perl module numbering scheme)

=item *

Increment the version number for every change, no matter how small

=item *

Package the module using "make dist"

=item *

Choose an appropriate license (GPL/Artistic is a good default)

=back



=head1 BEFORE YOU START WRITING A MODULE

Try not to launch headlong into developing your module without spending
some time thinking first.  A little forethought may save you a vast
amount of effort later on.

=head2 Has it been done before?

You may not even need to write the module.  Check whether it's already 
been done in Perl, and avoid re-inventing the wheel unless you have a 
good reason.

Good places to look for pre-existing modules include
http://search.cpan.org/ and asking on modules@perl.org.

If an existing module B<almost> does what you want, consider writing a
patch, writing a subclass, or otherwise extending the existing module
rather than rewriting it.

=head2 Do one thing and do it well

At the risk of stating the obvious, modules are intended to be modular.
A Perl developer should be able to use modules to put together the
building blocks of their application.  However, it's important that the
blocks are the right shape, and that the developer shouldn't have to use
a big block when all they need is a small one.

Your module should have a clearly defined scope which can be described
in no more than a single sentence.  Can your module be broken down into
a family of related modules?

Bad example:

"FooBar.pm provides an implementation of the FOO protocol and the
related BAR standard."

Good example:

"Foo.pm provides an implementation of the FOO protocol.  Bar.pm
implements the related BAR protocol."

This means that if a developer only needs a module for the BAR standard,
they should not be forced to install libraries for FOO as well.

=head2 What's in a name?

Make sure you choose an appropriate name for your module early on.  This
will help people find and remember your module, and make programming
with your module more intuitive.

When naming your module, consider the following:

=over 4

=item *

Be descriptive (i.e. accurately describe the purpose of the module).

=item * 

Be consistent with existing modules.

=item *

Reflect the functionality of the module, not the implementation.

=item *

Avoid starting a new top-level hierarchy, especially if a suitable
hierarchy already exists under which you could place your module.

=back


You should contact modules@perl.org to ask them about your module name
before publishing your module.  You should also try to ask people who 
are already familiar with the module's application domain and the CPAN
naming system.  Authors of similar modules, or modules with similar
names, may be a good place to start.


Considerations for module design and coding:

=head2 To OO or not to OO?

Your module may be object oriented (OO) or not, or it may have both kinds 
of interfaces available.  There are pros and cons of each technique, which 
should be considered when you design your API.

According to Damian Conway, you should consider using OO:

=over 4

=item * 

When the system is large or likely to become so

=item * 

When the data is aggregated in obvious structures that will become objects 

=item * 

When the types of data form a natural hierarchy that can make use of inheritance

=item *

When operations on data vary according to data type (making
polymorphic invocation of methods feasible)

=item *

When it is likely that new data types may be later introduced
into the system, and will need to be handled by existing code

=item *

When interactions between data are best represented by
overloaded operators

=item *

When the implementation of system components is likely to
change over time (and hence should be encapsulated)

=item *

When the system design is itself object-oriented

=item *

When large amounts of client code will use the software (and
should be insulated from changes in its implementation)

=item *

When many separate operations will need to be applied to the
same set of data

=back


Think carefully about whether OO is appropriate for your module.
Gratuitous object orientation results in complex APIs which are
difficult for the average module user to understand or use.

=head2 Designing your API

Your interfaces should be understandable by an average Perl programmer.  
The following guidelines may help you judge whether your API is
sufficiently straightforward:

=over 4

=item Write simple routines to do simple things.

It's better to have numerous simple routines than a few monolithic ones.
If your routine changes its behaviour significantly based on its
arguments, it's a sign that you should have two (or more) separate
routines.

=item Separate functionality from output.  

Return your results in the most generic form possible and allow the user 
to choose how to use them.  The most generic form possible is usually a
Perl data structure which can then be used to generate a text report,
HTML, XML, a database query, or whatever else your users require.

If your routine iterates through some kind of list (such as a list of
files, or records in a database) you may consider providing a callback
so that users can manipulate each element of the list in turn.
File::Find provides an example of this with its 
C<find(\&wanted, $dir)> syntax.

=item Provide sensible shortcuts and defaults.

Don't require every module user to jump through the same hoops to achieve a
simple result.  You can always include optional parameters or routines for 
more complex or non-standard behaviour.  If most of your users have to
type a few almost identical lines of code when they start using your
module, it's a sign that you should have made that behaviour a default.
Another good indicator that you should use defaults is if most of your 
users call your routines with the same arguments.

=item Naming conventions

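Your naming should be consistent.  For instance, it's better for a family
of related routines to share one verb and one word order than for each
routine to follow its own pattern.
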
This applies equally to method names, parameter names, and anything else
which is visible to the user (and most things that aren't!)

=item Parameter passing

Use named parameters. It's easier to use a hash like this:

    $obj->do_something(
        name => "wibble",
        type => "text",
        size => 1024,
    );

... than to have a long list of unnamed parameters like this:

    $obj->do_something("wibble", "text", 1024);

While the list of arguments might work fine for one, two or even three
arguments, any more arguments become hard for the module user to
remember, and hard for the module author to manage.  If you want to add
a new parameter you will have to add it to the end of the list for
backward compatibility, and this will probably make your list order
unintuitive.  Also, if many elements may be undefined you may see the
following unattractive method calls:

    $obj->do_something(undef, undef, undef, undef, undef, undef, 1024);

Provide sensible defaults for parameters which have them.  Don't make
your users specify parameters which will almost always be the same.

The issue of whether to pass the arguments in a hash or a hashref is
largely a matter of personal style. 

The use of hash keys starting with a hyphen (C<-name>) or entirely in 
upper case (C<NAME>) is a relic of older versions of Perl in which
ordinary lower case strings were not handled correctly by the C<=E<gt>>
operator.  While some modules retain uppercase or hyphenated argument
keys for historical reasons or as a matter of personal style, most new
modules should use simple lower case keys.  Whatever you choose, be
consistent!

=back
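
The parameter-passing advice above can be sketched as follows.  This is
only an illustration: the C<My::Widget> package, its C<do_something>
method and the default values are hypothetical and not taken from any
real module.

```perl
use strict;
use warnings;

# Hypothetical class, used only to illustrate named parameters.
package My::Widget;

sub new { return bless {}, shift }

sub do_something {
    my ($self, %args) = @_;

    # Defaults come first, so caller-supplied values override them.
    my %opt = (
        type => "text",
        size => 1024,
        %args,
    );

    return "$opt{name}: $opt{type}, $opt{size} bytes";
}

package main;

my $obj = My::Widget->new;

# Only the parameter that differs from the defaults needs to be given.
my $summary = $obj->do_something(name => "wibble");
print "$summary\n";
```

Because the defaults are merged with the caller's hash, a new optional
parameter can be added later without disturbing existing callers.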


=head2 Strictness and warnings

Your module should run successfully under the strict pragma and should
run without generating any warnings.  Your module should also handle 
taint-checking where appropriate, though this can cause difficulties in
many cases.

=head2 Backwards compatibility

Modules which are "stable" should not break backwards compatibility
without at least a long transition phase and a major change in version

=head2 Error handling and messages

When your module encounters an error it should do one or more of:

=over 4

=item *

Return an undefined value.

=item *

Set C<$Module::errstr> or similar (C<errstr> is a common name used by
DBI and other popular modules; if you choose something else, be sure to
document it clearly).

=item *

C<warn()> or C<carp()> a message to STDERR.  

=item *

C<croak()> only when your module absolutely cannot figure out what to
do.  (C<croak()> is a better version of C<die()> for use within 
modules, which reports its errors from the perspective of the caller.  
See L<Carp> for details of C<croak()>, C<carp()> and other useful

=item *

As an alternative to the above, you may prefer to throw exceptions using 
the Error module.

=back


Configurable error handling can be very useful to your users.  Consider
offering a choice of levels for warning and debug messages, an option to
send messages to a separate file, a way to specify an error-handling
routine, or other such features.  Be sure to default all these options
to the commonest use.
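
A minimal sketch of how these options might combine is shown below.  The
C<My::Parser> package and its C<parse_number> routine are hypothetical;
only C<croak()> from L<Carp> is an existing routine.

```perl
use strict;
use warnings;

# Hypothetical module, used only to illustrate the error-handling options.
package My::Parser;
use Carp qw(croak);

our $errstr = "";

sub parse_number {
    my ($text) = @_;

    # A missing argument is a caller bug we cannot recover from.
    croak "parse_number() needs an argument" unless defined $text;

    if ($text !~ /^-?\d+$/) {
        # Recoverable: record the reason and return an undefined value.
        $errstr = "'$text' is not an integer";
        return;
    }

    $errstr = "";
    return $text + 0;
}

package main;

my $n   = My::Parser::parse_number("42");
my $bad = My::Parser::parse_number("forty-two");
print "parsed $n\n";
print "error: $My::Parser::errstr\n" unless defined $bad;
```

The caller decides what to do with an undefined result, while a plain
programming mistake is reported from the caller's perspective.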


=head2 POD

Your module should include documentation aimed at Perl developers.
You should use Perl's "plain old documentation" (POD) for your general 
technical documentation, though you may wish to write additional
documentation (white papers, tutorials, etc) in some other format.  
You need to cover the following subjects:

=over 4

=item *

A synopsis of the common uses of the module

=item *

The purpose, scope and target applications of your module

=item *

Use of each publicly accessible method or subroutine, including
parameters and return values

=item *

Examples of use

=item *

Sources of further information

=item *

A contact email address for the author/maintainer

=back


The level of detail in Perl module documentation generally goes from
less detailed to more detailed.  Your SYNOPSIS section should contain a
minimal example of use (perhaps as little as one line of code; skip the
unusual use cases or anything not needed by most users); the
DESCRIPTION should describe your module in broad terms, generally in
just a few paragraphs; more detail of the module's routines or methods,
lengthy code examples, or other in-depth material should be given in 
subsequent sections.

Ideally, someone who's slightly familiar with your module should be able
to refresh their memory without hitting "page down".  As your reader
continues through the document, they should receive a progressively
greater amount of knowledge.

The recommended order of sections in Perl module documentation is:

=over 4

=item * 

NAME

=item *

SYNOPSIS

=item *

DESCRIPTION

=item *

One or more sections or subsections giving greater detail of available 
methods and routines and any other relevant information.

=item *

BUGS/CAVEATS/etc

=item *

AUTHOR

=item *

SEE ALSO

=item *

COPYRIGHT and LICENSE

=back



Keep your documentation near the code it documents ("inline"
documentation).  Include POD for a given method right above that 
method's subroutine.  This makes it easier to keep the documentation up
to date, and avoids having to document each piece of code twice (once in
POD and once in comments).
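
As a rough illustration of inline documentation, the sketch below places
the POD for a routine directly above it.  The C<My::Temperature> package
and its C<c_to_f> routine are hypothetical.

```perl
use strict;
use warnings;

package My::Temperature;    # hypothetical module name

=head2 c_to_f($celsius)

Converts a temperature from degrees Celsius to degrees Fahrenheit and
returns the result as a number.

=cut

sub c_to_f {
    my ($celsius) = @_;
    return $celsius * 9 / 5 + 32;
}

package main;

print My::Temperature::c_to_f(100), "\n";
```

The perl interpreter skips everything between the POD directive and
C<=cut>, so the documentation costs nothing at run time.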

=head2 README, INSTALL, release notes, changelogs

Your module should also include a README file describing the module and
giving pointers to further information (website, author email).  

An INSTALL file should be included, and should contain simple installation 
instructions. When using ExtUtils::MakeMaker this will usually be:

=over 4

=item perl Makefile.PL

=item make

=item make test

=item make install

=back


When using Module::Build, this will usually be:

=over 4

=item perl Build.PL

=item perl Build

=item perl Build test

=item perl Build install

=back


Release notes or changelogs should be produced for each release of your
software describing user-visible changes to your module, in terms
relevant to the user.


=head2 Version numbering

Version numbers should indicate at least major and minor releases, and
possibly sub-minor releases.  A major release is one in which most of
the functionality has changed, or in which major new functionality is
added.  A minor release is one in which a small amount of functionality
has been added or changed.  Sub-minor version numbers are usually used
for changes which do not affect functionality, such as documentation

The most common CPAN version numbering scheme looks like this:

    1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

A correct CPAN version number is a floating point number with at least 
2 digits after the decimal point. You can test whether it conforms to CPAN by
using:

    perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' 'Foo.pm'

If you want to release a 'beta' or 'alpha' version of a module but
don't want CPAN.pm to list it as most recent use an '_' after the
regular version number followed by at least 2 digits, eg. 1.20_01. If
you do this, the following idiom is recommended:

  $VERSION = "1.12_01";
  $XS_VERSION = $VERSION; # only needed if you have XS code
  $VERSION = eval $VERSION;

With that trick MakeMaker will only read the first line and thus read
the underscore, while the perl interpreter will evaluate the $VERSION
and convert the string into a number. Later operations that treat
$VERSION as a number will then be able to do so without provoking a
warning about $VERSION not being a number.
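
The effect of evaluating the version string can be checked with a short
sketch.  The package name C<My::Module> is hypothetical, and C<our> is
used here only so that the example also runs under C<use strict>.

```perl
use strict;
use warnings;

# Hypothetical package name; the version string follows the text above.
package My::Module;

our $VERSION = "1.12_01";      # the line MakeMaker parses, underscore intact
$VERSION = eval $VERSION;      # perl itself now holds the number 1.1201

package main;

# Numeric comparison now works without an "isn't numeric" warning.
print "development release\n" if $My::Module::VERSION > 1.12;
```

Perl allows underscores inside numeric literals, which is why the string
evaluates to an ordinary number.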

Never release anything without incrementing the version number.  Even a
one-word documentation patch should result in a change in version at the
sub-minor level.

=head2 Pre-requisites

Module authors should carefully consider whether to rely on other
modules, and which modules to rely on.

Most importantly, choose modules which are as stable as possible.  In
order of preference: 

=over 4

=item *

Core Perl modules

=item *

Stable CPAN modules

=item *

Unstable CPAN modules

=item *

Modules not available from CPAN

=back


Specify version requirements for other Perl modules in the
pre-requisites in your Makefile.PL or Build.PL.

Be sure to specify Perl version requirements both in Makefile.PL or
Build.PL and with C<require 5.6.1> or similar. See the section on
C<use VERSION> of L<perlfunc/require> for details.
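
A minimal Makefile.PL along these lines might look like the sketch
below.  The module name, its version and the prerequisite versions are
hypothetical; C<NAME>, C<VERSION> and C<PREREQ_PM> are existing
C<WriteMakefile> parameters.

```perl
# Makefile.PL: a minimal sketch; the module name and versions are hypothetical.
use 5.006001;                     # same requirement as "require 5.6.1"
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME      => 'My::Module',
    VERSION   => '0.01',
    PREREQ_PM => {
        'File::Spec' => 0,        # core module, any version
        'Test::More' => '0.47',   # stable CPAN module, minimum version
    },
);
```

A value of 0 means that any version of the prerequisite is acceptable.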

=head2 Testing

All modules should be tested before distribution (using "make disttest"),
and the tests should also be available to people installing the modules 
(using "make test").  
For Module::Build you would use the C<make test> equivalent C<perl Build test>.

The importance of these tests is proportional to the alleged stability of a 
module -- a module which purports to be stable or which hopes to achieve wide 
use should adhere to as strict a testing regime as possible.

Useful modules to help you write tests (with minimum impact on your 
development process or your time) include Test::Simple, Carp::Assert 
and Test::Inline.
For more sophisticated test suites there are Test::More and Test::MockObject.
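
A test file written with Test::More can be as small as the sketch below.
The C<add> routine is a hypothetical stand-in for a routine from your own
module and is defined inline only to keep the example self-contained.

```perl
# t/basic.t: a minimal sketch; add() stands in for a routine from your module.
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical routine under test.  A real test file would load it from
# the module being tested instead of defining it here.
sub add { return $_[0] + $_[1] }

is( add(2, 2), 4,  'adds two positive numbers' );
is( add(-1, 1), 0, 'handles a negative operand' );
ok( defined add(0, 0), 'returns a defined value for zeros' );
```

Declaring the number of tests up front lets the harness notice a test
script that exits early.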

=head2 Packaging

Modules should be packaged using one of the standard packaging tools.
Currently you have the choice between ExtUtils::MakeMaker and the
more platform independent Module::Build, allowing modules to be installed in a
consistent manner.
When using ExtUtils::MakeMaker, you can use "make dist" to create your
package. Tools exist to help you to build your module in a MakeMaker-friendly
style. These include ExtUtils::ModuleMaker and h2xs.  See also L<perlnewmod>.

=head2 Licensing

Make sure that your module has a license, and that the full text of it
is included in the distribution (unless it's a common one and the terms
of the license don't require you to include it).

If you don't know what license to use, dual licensing under the GPL
and Artistic licenses (the same as Perl itself) is a good idea.
See L<perlgpl> and L<perlartistic>.

=head1 COMMON PITFALLS


=head2 Reinventing the wheel

There are certain application spaces which are already very, very well
served by CPAN.  One example is templating systems, another is date and
time modules, and there are many more.  While it is a rite of passage to
write your own version of these things, please consider carefully
whether the Perl world really needs you to publish it.

=head2 Trying to do too much

Your module will be part of a developer's toolkit.  It will not, in
itself, form the B<entire> toolkit.  It's tempting to add extra features
until your code is a monolithic system rather than a set of modular
building blocks.

=head2 Inappropriate documentation

Don't fall into the trap of writing for the wrong audience.  Your
primary audience is a reasonably experienced developer with at least 
a moderate understanding of your module's application domain, who's just 
downloaded your module and wants to start using it as quickly as possible.

Tutorials, end-user documentation, research papers, FAQs etc are not 
appropriate in a module's main documentation.  If you really want to 
write these, include them as sub-documents such as C<My::Module::Tutorial> or
C<My::Module::FAQ> and provide a link in the SEE ALSO section of the
main documentation.  

=head1 SEE ALSO

=over 4

=item L<perlstyle>

General Perl style guide

=item L<perlnewmod>

How to create a new module

=item L<perlpod>

POD documentation

=item L<podchecker>

Verifies your POD's correctness

=item Packaging Tools

L<ExtUtils::MakeMaker>, L<Module::Build>

=item Testing tools

L<Test::Simple>, L<Test::Inline>, L<Carp::Assert>, L<Test::More>, L<Test::MockObject>

=item http://pause.perl.org/

Perl Authors Upload Server.  Contains links to information for module

=item Any good book on software engineering

=back


=head1 AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>

--- NEW FILE: perlboot.pod ---
=head1 NAME

perlboot - Beginner's Object-Oriented Tutorial


If you're not familiar with objects from other languages, some of the
other Perl object documentation may be a little daunting, such as
L<perlobj>, a basic reference in using objects, and L<perltoot>, which
introduces readers to the peculiarities of Perl's object system in a
tutorial way.

So, let's take a different approach, presuming no prior object
experience. It helps if you know about subroutines (L<perlsub>),
references (L<perlref> et seq.), and packages (L<perlmod>), so become
familiar with those first if you haven't already.

=head2 If we could talk to the animals...

Let's let the animals talk for a moment:

    sub Cow::speak {
      print "a Cow goes moooo!\n";
    }
    sub Horse::speak {
      print "a Horse goes neigh!\n";
    }
    sub Sheep::speak {
      print "a Sheep goes baaaah!\n"
    }

    Cow::speak;
    Horse::speak;
    Sheep::speak;


This results in:

    a Cow goes moooo!
    a Horse goes neigh!
    a Sheep goes baaaah!

Nothing spectacular here.  Simple subroutines, albeit from separate
packages, and called using the full package name.  So let's create
an entire pasture:

    # Cow::speak, Horse::speak, Sheep::speak as before
    @pasture = qw(Cow Cow Horse Sheep Sheep);
    foreach $animal (@pasture) {
      &{$animal."::speak"};
    }

This results in:

    a Cow goes moooo!
    a Cow goes moooo!
    a Horse goes neigh!
    a Sheep goes baaaah!
    a Sheep goes baaaah!

Wow.  That symbolic coderef de-referencing there is pretty nasty.
We're counting on C<no strict refs> mode, certainly not recommended
for larger programs.  And why was that necessary?  Because the name of
the package seems to be inseparable from the name of the subroutine we
want to invoke within that package.

Or is it?

=head2 Introducing the method invocation arrow

For now, let's say that C<< Class->method >> invokes subroutine
C<method> in package C<Class>.  (Here, "Class" is used in its
"category" meaning, not its "scholastic" meaning.) That's not
completely accurate, but we'll do this one step at a time.  Now let's
use it like so:

    # Cow::speak, Horse::speak, Sheep::speak as before
    Cow->speak;
    Horse->speak;
    Sheep->speak;

And once again, this results in:

    a Cow goes moooo!
    a Horse goes neigh!
    a Sheep goes baaaah!

That's not fun yet.  Same number of characters, all constant, no
variables.  Yet the parts are separable now.  Watch:

    $a = "Cow";
    $a->speak; # invokes Cow->speak

Ahh!  Now that the package name has been parted from the subroutine
name, we can use a variable package name.  And this time, we've got
something that works even when C<use strict refs> is enabled.
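
The difference can be sketched under C<use strict>.  In this sketch
C<Cow::speak> returns its text instead of printing it, purely so that the
result is easy to inspect; that change is an assumption of the example.

```perl
use strict;
use warnings;

sub Cow::speak { return "a Cow goes moooo!" }

my $animal = "Cow";

# The method arrow accepts a class name held in a variable, even under strict.
my $via_arrow = $animal->speak;

# The symbolic call from the earlier section is refused by "strict refs".
my $via_symbol = eval { &{ $animal . "::speak" }() };
my $refused = defined $via_symbol ? 0 : 1;

print "$via_arrow\n";
print "symbolic call refused under strict\n" if $refused;
```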

=head2 Invoking a barnyard

Let's take that new arrow invocation and put it back in the barnyard
example:

    sub Cow::speak {
      print "a Cow goes moooo!\n";
    }
    sub Horse::speak {
      print "a Horse goes neigh!\n";
    }
    sub Sheep::speak {
      print "a Sheep goes baaaah!\n"
    }

    @pasture = qw(Cow Cow Horse Sheep Sheep);
    foreach $animal (@pasture) {
      $animal->speak;
    }

There!  Now we have the animals all talking, and safely at that,
without the use of symbolic coderefs.

But look at all that common code.  Each of the C<speak> routines has a
similar structure: a C<print> operator and a string that contains
common text, except for two of the words.  It'd be nice if we could
factor out the commonality, in case we decide later to change it all
to C<says> instead of C<goes>.

And we actually have a way of doing that without much fuss, but we
have to hear a bit more about what the method invocation arrow is
actually doing for us.

=head2 The extra parameter of method invocation

The invocation of:

    Class->method(@args)


attempts to invoke subroutine C<Class::method> as:

    Class::method("Class", @args);

(If the subroutine can't be found, "inheritance" kicks in, but we'll
get to that later.)  This means that we get the class name as the
first parameter (the only parameter, if no arguments are given).  So
we can rewrite the C<Sheep> speaking subroutine as:

    sub Sheep::speak {
      my $class = shift;
      print "a $class goes baaaah!\n";
    }

And the other two animals come out similarly:

    sub Cow::speak {
      my $class = shift;
      print "a $class goes moooo!\n";
    }
    sub Horse::speak {
      my $class = shift;
      print "a $class goes neigh!\n";
    }

In each case, C<$class> will get the value appropriate for that
subroutine.  But once again, we have a lot of similar structure.  Can
we factor that out even further?  Yes, by calling another method in
the same class.
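
To see the extra parameter directly, a small sketch can return whatever
it was called with.  The C<show_args> routine is hypothetical and exists
only for this demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper that simply reports what it was called with.
sub Sheep::show_args { return join ", ", @_ }

# The arrow puts the class name in front of the explicit arguments.
my $seen = Sheep->show_args("grass", 3);
print "$seen\n";
```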

=head2 Calling a second method to simplify things

Let's call out from C<speak> to a helper method called C<sound>.
This method provides the constant text for the sound itself.

    { package Cow;
      sub sound { "moooo" }
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n"
      }
    }

Now, when we call C<< Cow->speak >>, we get a C<$class> of C<Cow> in
C<speak>.  This in turn selects the C<< Cow->sound >> method, which
returns C<moooo>.  But how different would this be for the C<Horse>?

    { package Horse;
      sub sound { "neigh" }
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n"
      }
    }

Only the name of the package and the specific sound change.  So can we
somehow share the definition for C<speak> between the Cow and the
Horse?  Yes, with inheritance!

=head2 Inheriting the windpipes

We'll define a common subroutine package called C<Animal>, with the
definition for C<speak>:

    { package Animal;
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n"
      }
    }

Then, for each animal, we say it "inherits" from C<Animal>, along
with the animal-specific sound:

    { package Cow;
      @ISA = qw(Animal);
      sub sound { "moooo" }
    }

Note the added C<@ISA> array.  We'll get to that in a minute.

But what happens when we invoke C<< Cow->speak >> now?

First, Perl constructs the argument list.  In this case, it's just
C<Cow>.  Then Perl looks for C<Cow::speak>.  But that's not there, so
Perl checks for the inheritance array C<@Cow::ISA>.  It's there,
and contains the single name C<Animal>.

Perl next checks for C<speak> inside C<Animal> instead, as in
C<Animal::speak>.  And that's found, so Perl invokes that subroutine
with the already frozen argument list.

Inside the C<Animal::speak> subroutine, C<$class> becomes C<Cow> (the
first argument).  So when we get to the step of invoking
C<< $class->sound >>, it'll be looking for C<< Cow->sound >>, which
gets it on the first try without looking at C<@ISA>.  Success!
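
The lookup just described can be observed with the built-in C<can>
method, which returns a reference to the subroutine a method call would
reach.  In this sketch C<speak> returns its text rather than printing it,
and C<our @ISA> is used so the code runs under C<use strict>; both are
assumptions of the example.

```perl
use strict;
use warnings;

{ package Animal;
  sub speak {
    my $class = shift;
    return "a $class goes " . $class->sound . "!";
  }
}
{ package Cow;
  our @ISA = qw(Animal);
  sub sound { "moooo" }
}

# Cow has no speak of its own, so the lookup lands on Animal::speak ...
my $found_in_animal = Cow->can("speak") == \&Animal::speak;

# ... yet $class is still "Cow", so sound is found in Cow directly.
my $line = Cow->speak;
print "$line\n";
```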

=head2 A few notes about @ISA

This magical C<@ISA> variable (pronounced "is a" not "ice-uh") has
declared that C<Cow> "is a" C<Animal>.  Note that it's an array,
not a simple single value, because on rare occasions, it makes sense
to have more than one parent class searched for the missing methods.

If C<Animal> also had an C<@ISA>, then we'd check there too.  The
search is recursive, depth-first, left-to-right in each C<@ISA>.
Typically, each C<@ISA> has only one element (multiple elements means
multiple inheritance and multiple headaches), so we get a nice tree of

When we turn on C<use strict>, we'll get complaints on C<@ISA>, since
it's not a variable containing an explicit package name, nor is it a
lexical ("my") variable.  We can't make it a lexical variable though
(it has to belong to the package to be found by the inheritance mechanism),
so there are a couple of straightforward ways to handle that.

The easiest is to just spell the package name out:

    @Cow::ISA = qw(Animal);

Or allow it as an implicitly named package variable:

    package Cow;
    use vars qw(@ISA);
    @ISA = qw(Animal);

If you're bringing in the class from outside, via an object-oriented
module, you change:

    package Cow;
    use Animal;
    use vars qw(@ISA);
    @ISA = qw(Animal);

into just:

    package Cow;
    use base qw(Animal);

And that's pretty darn compact.
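
The depth-first, left-to-right search described earlier in this section
can be checked with a small sketch.  The C<Mule> and C<Donkey> classes
and the C<describe> method are hypothetical, and C<our @ISA> is a further
way of declaring the package variable under C<use strict> that the text
above does not cover.

```perl
use strict;
use warnings;

# Hypothetical classes: Mule lists two parents, and Horse has a parent too.
{ package Animal; sub describe { "Animal" } }
{ package Horse;  our @ISA = qw(Animal); }
{ package Donkey; sub describe { "Donkey" } }
{ package Mule;   our @ISA = qw(Horse Donkey); }

# Depth-first, left-to-right: Horse, then Horse's parent Animal, and
# only after that Donkey.
my $winner = Mule->describe;
print "$winner\n";
```

The method found through C<Horse> wins even though C<Donkey> is a direct
parent, which is one of the headaches of multiple inheritance.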

=head2 Overriding the methods

Let's add a mouse, which can barely be heard:

    # Animal package from before
    { package Mouse;
      @ISA = qw(Animal);
      sub sound { "squeak" }
      sub speak {
        my $class = shift;
        print "a $class goes ", $class->sound, "!\n";
        print "[but you can barely hear it!]\n";
      }
    }

    Mouse->speak;


which results in:

    a Mouse goes squeak!
    [but you can barely hear it!]

Here, C<Mouse> has its own speaking routine, so C<< Mouse->speak >>
doesn't immediately invoke C<< Animal->speak >>.  This is known as
"overriding".  In fact, we didn't even need to say that a C<Mouse> was
an C<Animal> at all, since all of the methods needed for C<speak> are
completely defined with C<Mouse>.

But we've now duplicated some of the code from C<< Animal->speak >>,
and this can once again be a maintenance headache.  So, can we avoid
that?  Can we say somehow that a C<Mouse> does everything any other
C<Animal> does, but add in the extra comment?  Sure!

First, we can invoke the C<Animal::speak> method directly:

    # Animal package from before
    { package Mouse;
      @ISA = qw(Animal);
      sub sound { "squeak" }
      sub speak {
        my $class = shift;
        Animal::speak($class);
        print "[but you can barely hear it!]\n";
      }
    }

Note that we have to include the C<$class> parameter (almost surely
the value of C<"Mouse">) as the first parameter to C<Animal::speak>,
since we've stopped using the method arrow.  Why did we stop?  Well,
if we invoke C<< Animal->speak >> there, the first parameter to the
method will be C<"Animal"> not C<"Mouse">, and when time comes for it
to call for the C<sound>, it won't have the right class to come back
to this package.

Invoking C<Animal::speak> directly is a mess, however.  What if
C<Animal::speak> didn't exist before, and was being inherited from a
class mentioned in C<@Animal::ISA>?  Because we are no longer using
the method arrow, we get one and only one chance to hit the right

Also note that the C<Animal> classname is now hardwired into the
subroutine selection.  This is a mess if someone maintaining the code
changes C<@ISA> for C<Mouse> and doesn't notice C<Animal> there in
C<speak>.  So, this is probably not the right way to go.

=head2 Starting the search from a different place

A better solution is to tell Perl to search from a higher place
in the inheritance chain:

    # same Animal as before
    { package Mouse;
      # same @ISA, &sound as before
      sub speak {
        my $class = shift;
        $class->Animal::speak;
        print "[but you can barely hear it!]\n";
      }
    }

Ahh.  This works.  Using this syntax, we start with C<Animal> to find
C<speak>, and use all of C<Animal>'s inheritance chain if not found
immediately.  And yet the first parameter will be C<$class>, so the
found C<speak> method will get C<Mouse> as its first entry, and
eventually work its way back to C<Mouse::sound> for the details.

But this isn't the best solution.  We still have to keep the C<@ISA>
and the initial search package coordinated.  Worse, if C<Mouse> had
multiple entries in C<@ISA>, we wouldn't necessarily know which one
had actually defined C<speak>.  So, is there an even better way?

=head2 The SUPER way of doing things

By changing the C<Animal> class to the C<SUPER> class in that
invocation, we get a search of all of our super classes (classes
listed in C<@ISA>) automatically:

    # same Animal as before
    { package Mouse;
      # same @ISA, &sound as before
      sub speak {
        my $class = shift;
        $class->SUPER::speak;
        print "[but you can barely hear it!]\n";
      }
    }

So, C<SUPER::speak> means look in the current package's C<@ISA> for
C<speak>, invoking the first one found. Note that it does I<not> look in
the C<@ISA> of C<$class>.
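
The last point can be demonstrated with a subclass of C<Mouse>.  The
C<FieldMouse> class is hypothetical, and the methods return their text
instead of printing it so that the result can be inspected.

```perl
use strict;
use warnings;

{ package Animal;
  sub speak {
    my $class = shift;
    return "a $class goes " . $class->sound . "!\n";
  }
}
{ package Mouse;
  our @ISA = qw(Animal);
  sub sound { "squeak" }
  sub speak {
    my $class = shift;
    # Resolved through @Mouse::ISA, because this code is compiled in Mouse.
    return $class->SUPER::speak() . "[but you can barely hear it!]\n";
  }
}
{ package FieldMouse;              # hypothetical subclass
  our @ISA = qw(Mouse);
}

my $text = FieldMouse->speak;
print $text;
```

Had C<SUPER> consulted C<@FieldMouse::ISA> instead, it would have found
C<Mouse::speak> again and recursed without end.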

=head2 Where we're at so far...

So far, we've seen the method arrow syntax:

  Class->method(@args);

or the equivalent:

  $a = "Class";
  $a->method(@args);

which constructs an argument list of:

  ("Class", @args)

and attempts to invoke

  Class::method("Class", @args);

However, if C<Class::method> is not found, then C<@Class::ISA> is examined
(recursively) to locate a package that does indeed contain C<method>,
and that subroutine is invoked instead.

Using this simple syntax, we have class methods, (multiple)
inheritance, overriding, and extending.  Using just what we've seen so
far, we've been able to factor out common code, and provide a nice way
to reuse implementations with variations.  This is at the core of what
objects provide, but objects also provide instance data, which we
haven't even begun to cover.

=head2 A horse is a horse, of course of course -- or is it?

Let's start with the code for the C<Animal> class
and the C<Horse> class:

  { package Animal;
    sub speak {
      my $class = shift;
      print "a $class goes ", $class->sound, "!\n"
    }
  }
  { package Horse;
    @ISA = qw(Animal);
    sub sound { "neigh" }
  }

This lets us invoke C<< Horse->speak >> to ripple upward to
C<Animal::speak>, calling back to C<Horse::sound> to get the specific
sound, and the output of:

  a Horse goes neigh!

But all of our Horse objects would have to be absolutely identical.
If I add a subroutine, all horses automatically share it.  That's
great for making horses the same, but how do we capture the
distinctions about an individual horse?  For example, suppose I want
to give my first horse a name.  There's got to be a way to keep its
name separate from the other horses.

We can do that by drawing a new distinction, called an "instance".
An "instance" is generally created by a class.  In Perl, any reference
can be an instance, so let's start with the simplest reference
that can hold a horse's name: a scalar reference.

  my $name = "Mr. Ed";
  my $talking = \$name;

So now C<$talking> is a reference to what will be the instance-specific
data (the name).  The final step in turning this into a real instance
is with a special operator called C<bless>:

  bless $talking, Horse;

This operator stores information about the package named C<Horse> into
the thing pointed at by the reference.  At this point, we say
C<$talking> is an instance of C<Horse>.  That is, it's a specific
horse.  The reference is otherwise unchanged, and can still be used
with traditional dereferencing operators.
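
A short sketch makes both claims visible.  It uses the built-in C<ref>
function, which this tutorial has not introduced, to report the package
a reference was blessed into, and it quotes the class name so that the
code also runs under C<use strict>.

```perl
use strict;
use warnings;

my $name    = "Mr. Ed";
my $talking = \$name;

bless $talking, "Horse";    # quoted class name, so this also runs under strict

my $class     = ref $talking;     # the package the referent was blessed into
my $still_ref = $$talking;        # ordinary scalar dereference still works

print "$class: $still_ref\n";
```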

=head2 Invoking an instance method

The method arrow can be used on instances, as well as names of
packages (classes).  So, let's get the sound that C<$talking> makes:

  my $noise = $talking->sound;

To invoke C<sound>, Perl first notes that C<$talking> is a blessed
reference (and thus an instance).  It then constructs an argument
list, in this case from just C<($talking)>.  (Later we'll see that
arguments will take their place following the instance variable,
just like with classes.)

Now for the fun part: Perl takes the class in which the instance was
blessed, in this case C<Horse>, and uses that to locate the subroutine
to invoke the method.  In this case, C<Horse::sound> is found directly
(without using inheritance), yielding the final subroutine invocation:

  Horse::sound($talking)


Note that the first parameter here is still the instance, not the name
of the class as before.  We'll get C<neigh> as the return value, and
that'll end up as the C<$noise> variable above.

If C<Horse::sound> had not been found, we'd be wandering up the
C<@Horse::ISA> list to try to find the method in one of the
superclasses, just as for a class method.  The only difference between
a class method and an instance method is whether the first parameter
is an instance (a blessed reference) or a class name (a string).
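
The lookup described above can be sketched as follows.  The C<Mule>
class and the fallback C<Animal::sound> are hypothetical additions
made only for this illustration; they show the arrow falling back to
C<@ISA> when the class itself has no such method:

  { package Animal;
    sub sound { "generic noise" }   # hypothetical fallback method
  }
  { package Horse;
    our @ISA = qw(Animal);
    sub sound { "neigh" }
  }
  { package Mule;                   # hypothetical: defines no sound of its own
    our @ISA = qw(Animal);
  }
  my $name = "Mr. Ed";
  my $talking = bless \$name, "Horse";
  print $talking->sound, "\n";          # instance method: prints "neigh"
  print Horse->sound, "\n";             # class method: prints "neigh"
  print Horse::sound($talking), "\n";   # what the arrow resolves to: "neigh"
  print Mule->sound, "\n";              # found via @Mule::ISA: "generic noise"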

=head2 Accessing the instance data

Because we get the instance as the first parameter, we can now access
the instance-specific data.  In this case, let's add a way to get at
the name:

  { package Horse;
    @ISA = qw(Animal);
    sub sound { "neigh" }
    sub name {
      my $self = shift;
      $$self;
    }
  }

Now we call for the name:

  print $talking->name, " says ", $talking->sound, "\n";

Inside C<Horse::name>, the C<@_> array contains just C<$talking>,
which the C<shift> stores into C<$self>.  (It's traditional to shift
the first parameter off into a variable named C<$self> for instance
methods, so stay with that unless you have strong reasons otherwise.)
Then, C<$self> gets de-referenced as a scalar ref, yielding C<Mr. Ed>,
and we're done with that.  The result is:

  Mr. Ed says neigh

=head2 How to build a horse

Of course, if we constructed all of our horses by hand, we'd most
likely make mistakes from time to time.  We're also violating one of
the properties of object-oriented programming, in that the "inside
guts" of a Horse are visible.  That's good if you're a veterinarian,
but not if you just like to own horses.  So, let's let the Horse class
build a new horse:

  { package Horse;
    @ISA = qw(Animal);
    sub sound { "neigh" }
    sub name {
      my $self = shift;
      $$self;
    }
    sub named {
      my $class = shift;
      my $name = shift;
      bless \$name, $class;
    }
  }

Now with the new C<named> method, we can build a horse:

  my $talking = Horse->named("Mr. Ed");

Notice we're back to a class method, so the two arguments to
C<Horse::named> are C<Horse> and C<Mr. Ed>.  The C<bless> operator
not only blesses C<$name>, it also returns the reference to C<$name>,
so that's fine as a return value.  And that's how to build a horse.

We've called the constructor C<named> here, so that it quickly denotes
the constructor's argument as the name for this particular C<Horse>.
You can use different constructors with different names for different
ways of "giving birth" to the object (like maybe recording its
pedigree or date of birth).  However, you'll find that most people
coming to Perl from more limited languages use a single constructor
named C<new>, with various ways of interpreting the arguments to
C<new>.  Either style is fine, as long as you document your particular
way of giving birth to an object.  (And you I<were> going to do that,
right?)

=head2 Inheriting the constructor

But was there anything specific to C<Horse> in that method?  No.  Therefore,
it's also the same recipe for building anything else that inherited from
C<Animal>, so let's put it there:

  { package Animal;
    sub speak {
      my $class = shift;
      print "a $class goes ", $class->sound, "!\n"
    }
    sub name {
      my $self = shift;
      $$self;
    }
    sub named {
      my $class = shift;
      my $name = shift;
      bless \$name, $class;
    }
  }
  { package Horse;
    @ISA = qw(Animal);
    sub sound { "neigh" }
  }

Ahh, but what happens if we invoke C<speak> on an instance?

  my $talking = Horse->named("Mr. Ed");
  $talking->speak;

We get a debugging value:

  a Horse=SCALAR(0xaca42ac) goes neigh!

Why?  Because the C<Animal::speak> routine is expecting a classname as
its first parameter, not an instance.  When the instance is passed in,
we'll end up using a blessed scalar reference as a string, and that
shows up as we saw it just now.

=head2 Making a method work with either classes or instances

All we need is for a method to detect if it is being called on a class
or called on an instance.  The most straightforward way is with the
C<ref> operator.  This returns a string (the classname) when used on a
blessed reference, and an empty string when used on a string (like a
classname).  Let's modify the C<name> method first to notice the change:

  sub name {
    my $either = shift;
    ref $either
      ? $$either # it's an instance, return name
      : "an unnamed $either"; # it's a class, return generic
  }

Here, the C<?:> operator comes in handy to select either the
dereference or a derived string.  Now we can use this with either an
instance or a class.  Note that I've changed the first parameter
holder to C<$either> to show that this is intended:

  my $talking = Horse->named("Mr. Ed");
  print Horse->name, "\n"; # prints "an unnamed Horse\n"
  print $talking->name, "\n"; # prints "Mr. Ed\n"

and now we'll fix C<speak> to use this:

  sub speak {
    my $either = shift;
    print $either->name, " goes ", $either->sound, "\n";
  }

And since C<sound> already worked with either a class or an instance,
we're done!
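
To see the test that such a method relies on, here is a small sketch
(the variable names are chosen only for this illustration): C<ref>
yields the class name for a blessed reference and a false value for a
plain string such as a class name.

  my $name = "Mr. Ed";
  my $talking = bless \$name, "Horse";
  print "instance: [", ref($talking), "]\n";   # prints "instance: [Horse]"
  print "class: [", ref("Horse"), "]\n";       # prints "class: []"
  print ref($talking) ? "true\n" : "false\n";  # prints "true"
  print ref("Horse")  ? "true\n" : "false\n";  # prints "false"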

=head2 Adding parameters to a method

Let's train our animals to eat:

  { package Animal;
    sub named {
      my $class = shift;
      my $name = shift;
      bless \$name, $class;
    }
    sub name {
      my $either = shift;
      ref $either
        ? $$either # it's an instance, return name
        : "an unnamed $either"; # it's a class, return generic
    }
    sub speak {
      my $either = shift;
      print $either->name, " goes ", $either->sound, "\n";
    }
    sub eat {
      my $either = shift;
      my $food = shift;
      print $either->name, " eats $food.\n";
    }
  }
  { package Horse;
    @ISA = qw(Animal);
    sub sound { "neigh" }
  }
  { package Sheep;
    @ISA = qw(Animal);
    sub sound { "baaaah" }
  }

And now try it out:

  my $talking = Horse->named("Mr. Ed");
  $talking->eat("hay");
  Sheep->eat("grass");

which prints:

  Mr. Ed eats hay.
  an unnamed Sheep eats grass.

An instance method with parameters gets invoked with the instance,
and then the list of parameters.  So that first invocation is like:

  Animal::eat($talking, "hay");

=head2 More interesting instances

What if an instance needs more data?  Most interesting instances are
made of many items, each of which can in turn be a reference or even
another object.  The easiest way to store these is often in a hash.
The keys of the hash serve as the names of parts of the object (often
called "instance variables" or "member variables"), and the
corresponding values are, well, the values.

But how do we turn the horse into a hash?  Recall that an object was
any blessed reference.  We can just as easily make it a blessed hash
reference as a blessed scalar reference, as long as everything that
looks at the reference is changed accordingly.

Let's make a sheep that has a name and a color:

  my $bad = bless { Name => "Evil", Color => "black" }, Sheep;

so C<< $bad->{Name} >> has C<Evil>, and C<< $bad->{Color} >> has
C<black>.  But we want to make C<< $bad->name >> access the name, and
that's now messed up because it's expecting a scalar reference.  Not
to worry, because that's pretty easy to fix up:

  ## in Animal
  sub name {
    my $either = shift;
    ref $either ?
      $either->{Name} :
      "an unnamed $either";
  }

And of course C<named> still builds a scalar sheep, so let's fix that
as well:

  ## in Animal
  sub named {
    my $class = shift;
    my $name = shift;
    my $self = { Name => $name, Color => $class->default_color };
    bless $self, $class;
  }

What's this C<default_color>?  Well, if C<named> has only the name,
we still need to set a color, so we'll have a class-specific initial color.
For a sheep, we might define it as white:

  ## in Sheep
  sub default_color { "white" }

And then to keep from having to define one for each additional class,
we'll define a "backstop" method that serves as the "default default",
directly in C<Animal>:

  ## in Animal
  sub default_color { "brown" }

Now, because C<name> and C<named> were the only methods that
referenced the "structure" of the object, the rest of the methods can
remain the same, so C<speak> still works as before.
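
Put together, the hash-based version might look like the following
sketch.  It only assembles the pieces discussed in this section; the
sheep's name "Dolly" is made up for the example, and the direct
C<< ->{Color} >> lookups are there just to show the stored data:

  { package Animal;
    sub named {
      my $class = shift;
      my $name = shift;
      my $self = { Name => $name, Color => $class->default_color };
      bless $self, $class;
    }
    sub name {
      my $either = shift;
      ref $either ? $either->{Name} : "an unnamed $either";
    }
    sub default_color { "brown" }   # the "default default"
    sub speak {
      my $either = shift;
      print $either->name, " goes ", $either->sound, "\n";
    }
  }
  { package Sheep;
    our @ISA = qw(Animal);
    sub sound { "baaaah" }
    sub default_color { "white" }   # overrides the backstop in Animal
  }
  { package Horse;
    our @ISA = qw(Animal);
    sub sound { "neigh" }
  }
  my $dolly = Sheep->named("Dolly");
  $dolly->speak;                                 # prints "Dolly goes baaaah"
  print $dolly->{Color}, "\n";                   # prints "white"
  print Horse->named("Mr. Ed")->{Color}, "\n";   # prints "brown"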

=head2 A horse of a different color

But having all our horses be brown would be boring.  So let's add a
method or two to get and set the color.

  ## in Animal
  sub color {
    $_[0]->{Color}
  }
  sub set_color {
    $_[0]->{Color} = $_[1];
  }

Note the alternate way of accessing the arguments: C<$_[0]> is used
in-place, rather than with a C<shift>.  (This saves us a bit of time
for something that may be invoked frequently.)  And now we can fix
that color for Mr. Ed:

  my $talking = Horse->named("Mr. Ed");
  $talking->set_color("black-and-white");
  print $talking->name, " is colored ", $talking->color, "\n";

which results in:

  Mr. Ed is colored black-and-white

=head2 Summary

So, now we have class methods, constructors, instance methods,
instance data, and even accessors.  But that's still just the
beginning of what Perl has to offer.  We haven't even begun to talk
about accessors that double as getters and setters, destructors,
indirect object notation, subclasses that add instance data, per-class
data, overloading, "isa" and "can" tests, C<UNIVERSAL> class, and so
on.  That's for the rest of the Perl documentation to cover.
Hopefully, this gets you started, though.

=head1 SEE ALSO

For more information, see L<perlobj> (for all the gritty details about
Perl objects, now that you've seen the basics), L<perltoot> (the
tutorial for those who already know objects), L<perltooc> (dealing
with class data), L<perlbot> (for some more tricks), and books such as
Damian Conway's excellent I<Object Oriented Perl>.

Some modules which might prove interesting are Class::Accessor,
Class::Class, Class::Contract, Class::Data::Inheritable,
Class::MethodMaker and Tie::SecureHash.


=head1 COPYRIGHT

Copyright (c) 1999, 2000 by Randal L. Schwartz and Stonehenge
Consulting Services, Inc.  Permission is hereby granted to distribute
this document intact with the Perl distribution, and in accordance
with the licenses of the Perl distribution; derived documents must
include this copyright notice intact.

Portions of this text have been derived from Perl Training materials
originally appearing in the I<Packages, References, Objects, and
Modules> course taught by instructors for Stonehenge Consulting
Services, Inc. and used with permission.

Portions of this text have been derived from materials originally
appearing in I<Linux Magazine> and used with permission.

--- NEW FILE: perlreref.pod ---
=head1 NAME

perlreref - Perl Regular Expressions Reference


=head1 DESCRIPTION

This is a quick reference to Perl's regular expressions.
For full information see L<perlre> and L<perlop>, as well
as the L</"SEE ALSO"> section in this document.


=head2 OPERATORS

  =~ determines to which variable the regex is applied.
     In its absence, $_ is used.

        $var =~ /foo/;

  !~ determines to which variable the regex is applied,
     and negates the result of the match; it returns
     false if the match succeeds, and true if it fails.

        $var !~ /foo/;

  m/pattern/igmsoxc searches a string for a pattern match,
     applying the given options.

        i  case-Insensitive
        g  Global - all occurrences
        m  Multiline mode - ^ and $ match internal lines
        s  match as a Single line - . matches \n
        o  compile pattern Once
        x  eXtended legibility - free whitespace and comments
        c  don't reset pos on failed matches when using /g

     If 'pattern' is an empty string, the last I<successfully> matched
     regex is used. Delimiters other than '/' may be used for both this
     operator and the following ones.

  qr/pattern/imsox lets you store a regex in a variable,
     or pass one around. Modifiers are as for m//, and are stored
     within the regex.

  s/pattern/replacement/igmsoxe substitutes matches of
     'pattern' with 'replacement'. Modifiers as for m//
     with one addition:

        e  Evaluate replacement as an expression

     'e' may be specified multiple times. 'replacement' is interpreted
     as a double quoted string unless a single-quote (') is the delimiter.

  ?pattern? is like m/pattern/ but matches only once. No alternate
      delimiters can be used. Must be reset with L<reset|perlfunc/reset>.
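
A few of these operators in use, as a brief sketch (the strings and
variable names are made up for the example):

  my $text = "Foo bar foo";
  my @hits = $text =~ /foo/ig;            # list context with /g: ("Foo", "foo")
  my $re = qr/ba./i;                      # compiled regex kept in a variable
  print "found\n" if $text =~ $re;        # prints "found"
  print "absent\n" if $text !~ /quux/;    # prints "absent"
  (my $copy = $text) =~ s/foo/baz/ig;     # $copy is "baz bar baz"
  (my $sum = "2 3") =~ s/(\d+) (\d+)/$1 + $2/e;   # /e evaluates: $sum is "5"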

=head2 SYNTAX

   \       Escapes the character immediately following it
   .       Matches any single character except a newline (unless /s is used)
   ^       Matches at the beginning of the string (or line, if /m is used)
   $       Matches at the end of the string (or line, if /m is used)
   *       Matches the preceding element 0 or more times
   +       Matches the preceding element 1 or more times
   ?       Matches the preceding element 0 or 1 times
   {...}   Specifies a range of occurrences for the element preceding it
   [...]   Matches any one of the characters contained within the brackets
   (...)   Groups subexpressions for capturing to $1, $2...
   (?:...) Groups subexpressions without capturing (cluster)
   |       Matches either the subexpression preceding or following it
   \1, \2 ...  The text from the Nth group


=head3 ESCAPE SEQUENCES

These work as in normal strings.

   \a       Alarm (beep)
   \e       Escape
   \f       Formfeed
   \n       Newline
   \r       Carriage return
   \t       Tab
   \037     Any octal ASCII value
   \x7f     Any hexadecimal ASCII value
   \x{263a} A wide hexadecimal value
   \cx      Control-x
   \N{name} A named character

   \l  Lowercase next character
   \u  Titlecase next character
   \L  Lowercase until \E
   \U  Uppercase until \E
   \Q  Disable pattern metacharacters until \E
   \E  End case modification

For Titlecase, see L</Titlecase>.

This one works differently from normal strings:

   \b  An assertion, not backspace, except in a character class


=head2 CHARACTER CLASSES

   [amy]    Match 'a', 'm' or 'y'
   [f-j]    Dash specifies "range"
   [f-j-]   Dash escaped or at start or end means 'dash'
   [^f-j]   Caret indicates "match any character _except_ these"

The following sequences work within or without a character class.
The first six are locale aware, all are Unicode aware.  The default
character class equivalents are given.  See L<perllocale> and
L<perlunicode> for details.

   \d      A digit                     [0-9]
   \D      A nondigit                  [^0-9]
   \w      A word character            [a-zA-Z0-9_]
   \W      A non-word character        [^a-zA-Z0-9_]
   \s      A whitespace character      [ \t\n\r\f]
   \S      A non-whitespace character  [^ \t\n\r\f]

   \C      Match a byte (with Unicode, '.' matches a character)
   \pP     Match P-named (Unicode) property
   \p{...} Match Unicode property with long name
   \PP     Match non-P
   \P{...} Match lack of Unicode property with long name
   \X      Match extended unicode sequence

POSIX character classes and their Unicode and Perl equivalents:

   alnum   IsAlnum              Alphanumeric
   alpha   IsAlpha              Alphabetic
   ascii   IsASCII              Any ASCII char
   blank   IsSpace  [ \t]       Horizontal whitespace (GNU extension)
   cntrl   IsCntrl              Control characters
   digit   IsDigit  \d          Digits
   graph   IsGraph              Alphanumeric and punctuation
   lower   IsLower              Lowercase chars (locale and Unicode aware)
   print   IsPrint              Alphanumeric, punct, and space
   punct   IsPunct              Punctuation
   space   IsSpace  [\s\ck]     Whitespace
           IsSpacePerl   \s     Perl's whitespace definition
   upper   IsUpper              Uppercase chars (locale and Unicode aware)
   word    IsWord   \w          Alphanumeric plus _ (Perl extension)
   xdigit  IsXDigit [0-9A-Fa-f] Hexadecimal digit

Within a character class:

    POSIX       traditional   Unicode
    [:digit:]       \d        \p{IsDigit}
    [:^digit:]      \D        \P{IsDigit}

=head2 ANCHORS

All are zero-width assertions.

   ^  Match string start (or line, if /m is used)
   $  Match string end (or line, if /m is used) or before newline
   \b Match word boundary (between \w and \W)
   \B Match except at word boundary (between \w and \w or \W and \W)
   \A Match string start (regardless of /m)
   \Z Match string end (before optional newline)
   \z Match absolute string end
   \G Match where previous m//g left off


=head2 QUANTIFIERS

Quantifiers are greedy by default -- match the B<longest> leftmost.

   Maximal Minimal Allowed range
   ------- ------- -------------
   {n,m}   {n,m}?  Must occur at least n times but no more than m times
   {n,}    {n,}?   Must occur at least n times
   {n}     {n}?    Must occur exactly n times
   *       *?      0 or more times (same as {0,})
   +       +?      1 or more times (same as {1,})
   ?       ??      0 or 1 time (same as {0,1})

There is no quantifier {,n} -- that gets understood as a literal string.
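
The difference between the maximal and minimal forms can be seen in
this short sketch (sample strings are made up for the example):

  my ($max)    = "aaaa" =~ /(a{2,3})/;          # maximal: "aaa"
  my ($min)    = "aaaa" =~ /(a{2,3}?)/;         # minimal: "aa"
  my ($greedy) = "<b>bold</b>" =~ /(<.+>)/;     # "<b>bold</b>"
  my ($lazy)   = "<b>bold</b>" =~ /(<.+?>)/;    # "<b>"
  print "$max $min $greedy $lazy\n";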


=head2 EXTENDED CONSTRUCTS

   (?#text)         A comment
   (?imxs-imsx:...) Enable/disable option (as per m// modifiers)
   (?=...)          Zero-width positive lookahead assertion
   (?!...)          Zero-width negative lookahead assertion
   (?<=...)         Zero-width positive lookbehind assertion
   (?<!...)         Zero-width negative lookbehind assertion
   (?>...)          Grab what we can, prohibit backtracking
   (?{ code })      Embedded code, return value becomes $^R
   (??{ code })     Dynamic regex, return value used as regex
   (?(cond)yes|no)  cond being integer corresponding to capturing parens
   (?(cond)yes)        or a lookaround/eval zero-width assertion
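
Three of these constructs in use, as a small sketch with a made-up
sample string:

  my $str = "cost: 42 USD, 17 EUR";
  my ($usd) = $str =~ /(\d+)(?= USD)/;        # lookahead: $usd is "42"
  my ($eur) = $str =~ /(?<=, )(\d+)/;         # lookbehind: $eur is "17"
  my ($bar) = "foobar" =~ /(?i:FOO)(bar)/;    # /i applies to FOO only: "bar"
  print "$usd $eur $bar\n";                   # prints "42 17 bar"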


=head2 VARIABLES

   $_    Default variable for operators to use
   $*    Enable multiline matching (deprecated; not in 5.9.0 or later)

   $&    Entire matched string
   $`    Everything prior to matched string
   $'    Everything after matched string

The use of those last three will slow down B<all> regex use
within your program. Consult L<perlvar> for C<@LAST_MATCH_START>
to see equivalent expressions that won't cause slow down.
See also L<Devel::SawAmpersand>.

   $1, $2 ...  hold the Xth captured expr
   $+    Last parenthesized pattern match
   $^N   Holds the most recently closed capture
   $^R   Holds the result of the last (?{...}) expr
   @-    Offsets of starts of groups. $-[0] holds start of whole match
   @+    Offsets of ends of groups. $+[0] holds end of whole match

Captured groups are numbered according to their I<opening> paren.
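
A short sketch of the capture variables and offset arrays, using a
made-up date string:

  my $str = "2006-12-04";
  if ($str =~ /((\d+)-(\d+))-(\d+)/) {
      print $1, "\n";                    # prints "2006-12" (its paren opens first)
      print "$2 $3 $4\n";                # prints "2006 12 04"
      print $+, "\n";                    # prints "04" (last parenthesized match)
      print $-[0], " ", $+[0], "\n";     # prints "0 10" (span of whole match)
      print $-[4], " ", $+[4], "\n";     # prints "8 10" (span of group 4)
  }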


=head2 FUNCTIONS

   lc          Lowercase a string
   lcfirst     Lowercase first char of a string
   uc          Uppercase a string
   ucfirst     Titlecase first char of a string

   pos         Return or set current match position
   quotemeta   Quote metacharacters
   reset       Reset ?pattern? status
   study       Analyze string for optimizing matching

   split       Use regex to split a string into parts

The first four of these are like the escape sequences C<\L>, C<\l>,
C<\U>, and C<\u>.  For Titlecase, see L</Titlecase>.
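
Some of these functions side by side with their escape-sequence
counterparts, as a brief sketch (the strings are made up for the
example):

  print lc("HELLO"), " ", "\LHELLO\E", "\n";      # prints "hello hello"
  print ucfirst("hello"), " ", "\uhello", "\n";   # prints "Hello Hello"
  print quotemeta("a.b"), "\n";                   # prints "a\.b"
  my @fields = split /\s*,\s*/, "a , b,c";        # ("a", "b", "c")
  my $str = "aXbX";
  if ($str =~ /X/g) {
      print pos($str), "\n";                      # prints "2"
  }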


=head2 TERMINOLOGY

=head3 Titlecase

Unicode concept which most often is equal to uppercase, but for
certain characters like the German "sharp s" there is a difference.

=head1 AUTHOR

Iain Truskett.

This document may be distributed under the same terms as Perl itself.

=head1 SEE ALSO

=over 4

=item *

L<perlretut> for a tutorial on regular expressions.

=item *

L<perlrequick> for a rapid tutorial.

=item *

L<perlre> for more details.

=item *

L<perlvar> for details on the variables.

=item *

L<perlop> for details on the operators.

=item *

L<perlfunc> for details on the functions.

=item *

L<perlfaq6> for FAQs on regular expressions.

=item *

The L<re> module to alter behaviour and aid debugging.

=item *

L<perldebug/"Debugging regular expressions">

=item *

L<perluniintro>, L<perlunicode>, L<charnames> and L<locale>
for details on regexes and internationalisation.

=item *

I<Mastering Regular Expressions> by Jeffrey Friedl
(F<http://regex.info/>) for a thorough grounding and
reference on the topic.


=head1 THANKS

David P.C. Wollmann,
Richard Soderberg,
Sean M. Burke,
Tom Christiansen,
Jim Cromie,
Jeffrey Goff
for useful advice.


--- NEW FILE: perlrequick.pod ---
=head1 NAME

perlrequick - Perl regular expressions quick start


=head1 DESCRIPTION

This page covers the very basics of understanding, creating and
using regular expressions ('regexes') in Perl.

=head1 The Guide

=head2 Simple word matching

The simplest regex is simply a word, or more generally, a string of
characters.  A regex consisting of a word matches any string that
contains that word:

    "Hello World" =~ /World/;  # matches

In this statement, C<World> is a regex and the C<//> enclosing
C</World/> tells perl to search a string for a match.  The operator
C<=~> associates the string with the regex match and produces a true
value if the regex matched, or false if the regex did not match.  In
our case, C<World> matches the second word in C<"Hello World">, so the
expression is true.  This idea has several variations.

Expressions like this are useful in conditionals:

    print "It matches\n" if "Hello World" =~ /World/;

The sense of the match can be reversed by using the C<!~> operator:

    print "It doesn't match\n" if "Hello World" !~ /World/;

The literal string in the regex can be replaced by a variable:

    $greeting = "World";
    print "It matches\n" if "Hello World" =~ /$greeting/;

If you're matching against C<$_>, the C<$_ =~> part can be omitted:

    $_ = "Hello World";
    print "It matches\n" if /World/;

Finally, the C<//> default delimiters for a match can be changed to
arbitrary delimiters by putting an C<'m'> out front:

    "Hello World" =~ m!World!;   # matches, delimited by '!'
    "Hello World" =~ m{World};   # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
                                 # '/' becomes an ordinary char

Regexes must match a part of the string I<exactly> in order for the
statement to be true:

    "Hello World" =~ /world/;  # doesn't match, case sensitive
    "Hello World" =~ /o W/;    # matches, ' ' is an ordinary char
    "Hello World" =~ /World /; # doesn't match, no ' ' at end

perl will always match at the earliest possible point in the string:

    "Hello World" =~ /o/;       # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/; # matches 'hat' in 'That'

Not all characters can be used 'as is' in a match.  Some characters,
called B<metacharacters>, are reserved for use in regex notation.
The metacharacters are

    {}[]()^$.|*+?\


A metacharacter can be matched by putting a backslash before it:

    "2+2=4" =~ /2+2/;    # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;   # matches, \+ is treated like an ordinary +
    'C:\WIN32' =~ /C:\\WIN/;                       # matches
    "/usr/bin/perl" =~ /\/usr\/bin\/perl/;  # matches

In the last regex, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regex.

Non-printable ASCII characters are represented by B<escape sequences>.
Common examples are C<\t> for a tab, C<\n> for a newline, and C<\r>
for a carriage return.  Arbitrary bytes are represented by octal
escape sequences, e.g., C<\033>, or hexadecimal escape sequences,
e.g., C<\x1B>:

    "1000\t2000" =~ m(0\t2);        # matches
    "cat"        =~ /\143\x61\x74/; # matches, but a weird way to spell cat

Regexes are treated mostly as double quoted strings, so variable
substitution works:

    $foo = 'house';
    'cathouse' =~ /cat$foo/;   # matches
    'housecat' =~ /${foo}cat/; # matches

With all of the regexes above, if the regex matched anywhere in the
string, it was considered a match.  To specify I<where> it should
match, we would use the B<anchor> metacharacters C<^> and C<$>.  The
anchor C<^> means match at the beginning of the string and the anchor
C<$> means match at the end of the string, or before a newline at the
end of the string.  Some examples:

    "housekeeper" =~ /keeper/;         # matches
    "housekeeper" =~ /^keeper/;        # doesn't match
    "housekeeper" =~ /keeper$/;        # matches
    "housekeeper\n" =~ /keeper$/;      # matches
    "housekeeper" =~ /^housekeeper$/;  # matches

=head2 Using character classes

A B<character class> allows a set of possible characters, rather than
just a single character, to match at a particular point in a regex.
Character classes are denoted by brackets C<[...]>, with the set of
characters to be possibly matched inside.  Here are some examples:

    /cat/;            # matches 'cat'
    /[bcr]at/;        # matches 'bat', 'cat', or 'rat'
    "abc" =~ /[cab]/; # matches 'a'

In the last statement, even though C<'c'> is the first character in
the class, the earliest point at which the regex can match is C<'a'>.

    /[yY][eE][sS]/; # match 'yes' in a case-insensitive way
                    # 'yes', 'Yes', 'YES', etc.
    /yes/i;         # also match 'yes' in a case-insensitive way

The last example shows a match with an C<'i'> B<modifier>, which makes
the match case-insensitive.

Character classes also have ordinary and special characters, but the
sets of ordinary and special characters inside a character class are
different than those outside a character class.  The special
characters for a character class are C<-]\^$> and are matched using an
escape:

   /[\]c]def/; # matches ']def' or 'cdef'
   $x = 'bcr';
   /[$x]at/;   # matches 'bat', 'cat', or 'rat'
   /[\$x]at/;  # matches '$at' or 'xat'
   /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat'

The special character C<'-'> acts as a range operator within character
classes, so that the unwieldy C<[0123456789]> and C<[abc...xyz]>
become the svelte C<[0-9]> and C<[a-z]>:

    /item[0-9]/;  # matches 'item0' or ... or 'item9'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit

If C<'-'> is the first or last character in a character class, it is
treated as an ordinary character.

The special character C<^> in the first position of a character class
denotes a B<negated character class>, which matches any character but
those in the brackets.  Both C<[...]> and C<[^...]> must match a
character, or the match fails.  Then

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary

Perl has several abbreviations for common character classes:

=over 4

=item *

\d is a digit and represents

    [0-9]

=item *

\s is a whitespace character and represents

    [\ \t\r\n\f]

=item *

\w is a word character (alphanumeric or _) and represents

    [0-9a-zA-Z_]

=item *

\D is a negated \d; it represents any character but a digit

    [^0-9]

=item *

\S is a negated \s; it represents any non-whitespace character

    [^\s]

=item *

\W is a negated \w; it represents any non-word character

    [^\w]

=item *

The period '.' matches any character but "\n"

=back


The C<\d\s\w\D\S\W> abbreviations can be used both inside and outside
of character classes.  Here are some in use:

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'

The S<B<word anchor> > C<\b> matches a boundary between a word
character and a non-word character C<\w\W> or C<\W\w>:

    $x = "Housecat catenates house and cat";
    $x =~ /\bcat/;  # matches cat in 'catenates'
    $x =~ /cat\b/;  # matches cat in 'Housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

In the last example, the end of the string is considered a word
boundary.

=head2 Matching this or that

We can match different character strings with the B<alternation>
metacharacter C<'|'>.  To match C<dog> or C<cat>, we form the regex
C<dog|cat>.  As before, perl will try to match the regex at the
earliest possible point in the string.  At each character position,
perl will first try to match the first alternative, C<dog>.  If
C<dog> doesn't match, perl will then try the next alternative, C<cat>.
If C<cat> doesn't match either, then the match fails and perl moves to
the next position in the string.  Some examples:

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

Even though C<dog> is the first alternative in the second regex,
C<cat> is able to match earlier in the string.

    "cats"          =~ /c|ca|cat|cats/; # matches "c"
    "cats"          =~ /cats|cat|ca|c/; # matches "cats"

At a given character position, the first alternative that allows the
regex match to succeed will be the one that matches. Here, all the
alternatives match at the first string position, so the first matches.

=head2 Grouping things and hierarchical matching

The B<grouping> metacharacters C<()> allow a part of a regex to be
treated as a single unit.  Parts of a regex are grouped by enclosing
them in parentheses.  The regex C<house(cat|keeper)> means match
C<house> followed by either C<cat> or C<keeper>.  Some more examples are

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere

    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.

    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

=head2 Extracting matches

The grouping metacharacters C<()> also allow the extraction of the
parts of a string that matched.  For each grouping, the part that
matched inside goes into the special variables C<$1>, C<$2>, etc.
They can be used just as ordinary variables:

    # extract hours, minutes, seconds
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;

In list context, a match C</regex/> with groupings will return the
list of matched values C<($1,$2,...)>.  So we could rewrite it as

    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);

If the groupings in a regex are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
etc.  For example, here is a complex regex and the matching variables
indicated below it:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

Associated with the matching variables C<$1>, C<$2>, ... are
the B<backreferences> C<\1>, C<\2>, ...  Backreferences are
matching variables that can be used I<inside> a regex:

    /(\w\w\w)\s\1/; # find sequences like 'the the' in string

C<$1>, C<$2>, ... should only be used outside of a regex, and C<\1>,
C<\2>, ... only inside a regex.
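
Both ideas can be tried out with a short sketch (the sample strings
are made up for the example): a backreference inside the regex finds
a doubled word, and the matching variables afterwards follow the
order of the opening parentheses.

    my $text = "Paris in the the spring";
    if ($text =~ /\b(\w+)\s+\1\b/) {
        print "doubled word: $1\n";    # prints "doubled word: the"
    }
    if ("2006-12" =~ /((\d+)-(\d+))/) {
        print "$1\n";                  # prints "2006-12" (outer group opens first)
        print "$2 $3\n";               # prints "2006 12"
    }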

=head2 Matching repetitions

The B<quantifier> metacharacters C<?>, C<*>, C<+>, and C<{}> allow us
to determine the number of repeats of a portion of a regex we
consider to be a match.  Quantifiers are put immediately after the
character, character class, or grouping that we want to specify.  They
have the following meanings:

=over 4

=item *

C<a?> = match 'a' 1 or 0 times

=item *

C<a*> = match 'a' 0 or more times, i.e., any number of times

=item *

C<a+> = match 'a' 1 or more times, i.e., at least once

=item *

C<a{n,m}> = match at least C<n> times, but not more than C<m>

=item *

C<a{n,}> = match at least C<n> times

=item *

C<a{n}> = match exactly C<n> times

=back


Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
                     # any number of digits
    /(\w+)\s+\1/;    # match doubled words of arbitrary length
    $year =~ /\d{2,4}/;  # make sure year is at least 2 but not more
                         # than 4 digits
    $year =~ /\d{4}|\d{2}/;    # better match; throw out 3 digit dates

These quantifiers will try to match as much of the string as possible,
while still allowing the regex to match.  So we have

    $x = 'the cat in the hat';
    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 matches)

The first quantifier C<.*> grabs as much of the string as possible
while still having the regex match. The second quantifier C<.*> has
no string left to it, so it matches 0 times.

=head2 More matching

There are a few more things you might want to know about matching
operators.  In the code

    $pattern = 'Seuss';
    while (<>) {
        print if /$pattern/;
    }

perl has to re-evaluate C<$pattern> each time through th