1. Perl on Windows - ActivePerl

The ActivePerl distribution is developed by ActiveState Tool Corporation. We need Perl for Win32, the binary of the core Perl distribution. The ActivePerl binary comes as a self-extracting executable that uses the standard Win32 InstallShield setup wizard to guide you through the installation process. By default, Perl is installed into the directory C:\Perl\version, where version is the current version number (e.g., 5.005).

2. The Perl Manpages – online documentation

Perl comes with lots of online documentation. Run ‘man perl’ or ‘perldoc perl’ to read the top-level page, which in turn directs you to more specific pages. To make life easier, the manpages have been divided into separate sections. If you know which section you want, you can go directly there with ‘man perlvar’ or ‘perldoc perlvar’. A partial list of sections:

Section / Description
perlre / Regular expressions
perlfunc / Builtin functions
perlvar / Predefined variables

3. Executing Perl

  • The perl interpreter (i.e., the perl executable) is usually located in /usr/bin/perl. Accordingly, the first line of a script is: #!/usr/bin/perl. On Unix systems, this #! line tells the shell to look for the /usr/bin/perl program and pass the rest of the file to that program for execution.
  • strict is a pragma for doing strict error checking: (1) Generates a runtime error if you use any symbolic references. (2) Generates a compile-time error if you use a bareword identifier that's not a predeclared subroutine. (3) Generates a compile-time error if you access a variable that wasn't declared via my, or wasn't imported.
  • Comments within a program are indicated by #. Everything following a pound sign to the end of the line is interpreted as a comment.
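The first strict check can be seen in a small script. This is only a sketch: the variable names $counter and $name are made up for illustration, and eval is used here just to trap the runtime error so the script can report it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $counter = 10;        # package variable (hypothetical name)
my  $name    = "counter"; # an ordinary string that happens to name it

# Using the string as a variable name is a symbolic reference.
# Under strict this dies at run time; eval traps the error.
my $value = eval { ${$name} };
my $error = $@;

print "strict stopped it: $error" if $error;
```

Without the use strict line, ${$name} would quietly read $counter, which is the kind of accident the pragma is meant to catch.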

To run Perl, pass Perl the name of your script as the first parameter: > perl testpgm.pl. Alternatively, you may write your script with -e switches on the command line:

> perl -e 'print "Hello, world\n"' #Unix

> perl -e "print \"Hello, world\n\"" #Win32

4. Variables: Scalars ($), Arrays (@), Hashes (%)

- Lexical scoping: (as in C) use my

A. Scalar

- Default Initialization: an uninitialized scalar is undef, which acts as 0 in numeric context, "" in string context, and false in logical context

- Examples:

my $answer = 42; # an integer

my $pi = 3.14159265; # a "real" number

my $avocados = 6.02e23; # scientific notation

my $pet = "Camel"; # string

my $sign = "I love my $pet"; # string with interpolation (variables and backslash escapes)

my $cost = 'It costs $100'; # string without interpolation

my $h = $w; # assignment

my $val = $x * $y; # expression

my $camels = "123";

print $camels + 1, "\n"; #prints 124, number context!

Operators

+ - (addition, subtraction)

* / % ** (multiply, divide, modulus, exponentiation)

++ -- (autoincrement, autodecrement)

= += -= *= etc. (assignment operators)

. (string concatenation)

bits: << >> (left bit-shift, right bit-shift), & | ^ (bit-and, bit-or, bit-xor)

logical: or ||, and &&, not !

Comparison / Numeric / String
Equal / == / eq
Not equal / != / ne
Less than / < / lt
Greater than / > / gt
Less than or equal to / <= / le
Greater than or equal to / >= / ge

Examples:

$x = 12;

--$x; # $x is now 11

$y = $x--; # $y is 11, and $x is now 10

$str = $str . " "; # append a space to $str
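The difference between the numeric and the string comparison operators shows up when the same values are compared both ways. A minimal sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $num_equal = (10 == 10.0)     ? "true" : "false";  # same number
my $str_equal = ("10" eq "10.0") ? "true" : "false";  # different text
my $num_less  = (9 < 10)         ? "true" : "false";  # 9 is the smaller number
my $str_less  = ("9" lt "10")    ? "true" : "false";  # "9" sorts after "1"

print "10 == 10.0     : $num_equal\n";   # true
print "\"10\" eq \"10.0\" : $str_equal\n";   # false
print "9 < 10         : $num_less\n";    # true
print "\"9\" lt \"10\"    : $str_less\n";    # false
```

Choosing the operator is what selects the context; the values themselves are converted as needed.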

B. Array

- Default Initialization: empty array.

- examples:

my @home = ("couch", "chair", "table", "stove");

my($potato, $lift, $tennis, $pipe) = @home;

($alpha,$omega) = ($omega,$alpha); #switch alpha and omega values

$home[0] = "couch";

$home[1] = "chair";

$home[2] = "table";

$home[3] = "stove";

print $#home; # prints 3 (index of the last element)

$home[++$#home] = $bath; # grows the array by one and assigns to the new last element, $home[4]

$length = @home; # an array in scalar context returns its length (5)

@my_home = @array; # copy array

@a = $b; # equal to @a=($b)

push (@a,$val); # pushes $val at the end of @a

$last_val = pop(@a); # removes the last element of @a and returns it

unshift (@a,$val); # pushes $val at the beginning of @a

$first_val = shift(@a); # removes the first element of @a and returns it

@b=reverse(@a); # reversed order

@b=sort(@a); # sorts as strings (ASCII order); for numbers use sort { $a <=> $b } @a

chomp(@a); # removing a trailing \n, if present, from each element

chop(@a); # removing last char from each element

$str = "This is a pet";

@a = split " ", $str; # @a becomes ("This","is","a","pet")

$newstr=join (" ", @a); # $newstr equals $str.

# split accepts any regular expression /…/ as its separator; join takes a plain string
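Two of the array functions above are easy to misuse, so here is a short sketch: sort compares its elements as strings unless given a comparison block, and split can take a regular expression as the separator. The sample data is invented for the example.

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

my @as_strings = sort @nums;                # default: string comparison
my @as_numbers = sort { $a <=> $b } @nums;  # numeric, ascending
my @descending = sort { $b <=> $a } @nums;  # numeric, descending

print "@as_strings\n";   # 1 10 100 9
print "@as_numbers\n";   # 1 9 10 100
print "@descending\n";   # 100 10 9 1

# split on a comma with optional whitespace around it
my @fields = split /\s*,\s*/, "red , green,blue";
print join("|", @fields), "\n";   # red|green|blue
```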

C. Hash

%longday = ("Sun", "Sunday", "Mon", "Monday", "Tue", "Tuesday",

"Wed", "Wednesday", "Thu", "Thursday", "Fri",

"Friday", "Sat", "Saturday");

$longday{"Sun"}="Sunday"; # adding key Sun and value Sunday.

delete $longday{"Sun"}; # deleting key and its value

if(exists ($longday{"Sun"})){…} # check if Sun key exists.

%longday = (); # cleaning the hash (also default initialization)

%new_hash = %longday; #copy

@a = keys %longday; # @a holds "Sun","Mon","Tue","Wed","Thu","Fri","Sat" (in no guaranteed order)

@a = values %longday; # @a holds "Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday" (in the same order as keys)

foreach $key(sort keys %longday){ print $key.":".$longday{$key}."\n"; }

#sorted by keys

foreach $val(sort values %longday){ print $val."\n"; }

#sorted by values

%merged = (%a,%b);
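Two points about hashes are worth a sketch: printing key and value pairs ordered by value needs a sort over the keys that compares the values, and when two hashes are merged the later one wins for any shared key. The hash names and data below are hypothetical.

```perl
use strict;
use warnings;

my %age = ("Tom", 30, "Ann", 25, "Bob", 35);

# Sort the keys by the values they map to (numeric comparison).
my @by_age = sort { $age{$a} <=> $age{$b} } keys %age;
foreach my $name (@by_age) {
    print $name . ":" . $age{$name} . "\n";
}

# Merging: for a key present in both hashes, the value from
# the hash listed last is kept.
my %defaults = ("color", "red", "size", "M");
my %custom   = ("size", "L");
my %merged   = (%defaults, %custom);

print "size is $merged{size}, color is $merged{color}\n";
```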

5. Statements

Block - A sequence of statements. A block is delimited by { }.

Conditionals

The if and unless statements execute blocks of code depending on whether a condition is met. These statements take the following forms:

if (expression) {block} else {block}

unless (expression) {block} else {block}

if (expression1) {block}

elsif (expression2) {block}

...

elsif (lastexpression) {block}

else {block}

while loops - The while statement repeatedly executes a block as long as its conditional expression is true.

$i = 1;

while ($i < 10) {

...

$i++;

}

while (<INFILE>) {

print OUTFILE $_; # no comma after the filehandle; $_ still ends with the newline that was read

}

The while statement has an optional extra block on the end called a continue block. This block is executed before every successive iteration of the loop, even if the main while block is exited early by the loop control command next. However, the continue block is not executed if the main block is exited by a last statement. The continue block is always executed before the conditional is evaluated again.
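The interaction of next, last and the continue block can be traced with a small loop. In this sketch the counters $ticks and @seen are only there to record what happened.

```perl
use strict;
use warnings;

my $i     = 0;
my $ticks = 0;   # how many times the continue block ran
my @seen;        # values that reached the end of the main block

while ($i < 10) {
    next if $i % 2;   # odd: skip the rest, the continue block still runs
    last if $i > 6;   # leave the loop, the continue block is skipped
    push @seen, $i;
}
continue {
    $i++;
    $ticks++;
}

print "seen: @seen\n";          # seen: 0 2 4 6
print "i=$i ticks=$ticks\n";    # i=8 ticks=8
```

Putting the increment in the continue block keeps next from turning the loop into an infinite one.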

for loops - The for loop has three semicolon-separated expressions: the initialization, the condition, and the re-initialization of the loop.

for ($i = 1; $i < 10; $i++) {

...

}

foreach loops - The foreach loop iterates over a list value and sets the control variable (VAR) to be each element of the list in turn:

foreach VAR (LIST) {

...

}

If VAR is omitted, $_ is used.

If LIST is an actual array, you can modify each element of the array by modifying VAR inside the loop. That's because the foreach loop index variable is an alias for each item in the list that you're looping over.

foreach $elem (@elements) { # multiply by 2

$elem *= 2;

}

foreach $key (sort keys %hash) { # sorting keys

print "$key => $hash{$key}\n";

}

Loop control
The last command is like the break statement in C (as used in loops): exits the loop. The next command is like the continue statement in C: skips the rest of the current iteration and starts the next iteration of the loop. Any block can be given a label (by convention, in uppercase) which identifies the loop. For example:

WID: foreach $this (@ary1) {

JET: foreach $that (@ary2) {

next WID if $this > $that;

$this += $that;

}

}
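A runnable variant of the labeled loops may make the effect of next LABEL easier to see. The labels ROW and COL and the data are invented for this sketch.

```perl
use strict;
use warnings;

my @rows = (1, 2, 3);
my @cols = (4, 5, 6);
my @pairs;

ROW: foreach my $r (@rows) {
    COL: foreach my $c (@cols) {
        next ROW if $c == 5;          # abandon this row, go to the next $r
        push @pairs, join("-", $r, $c);
    }
}

print "@pairs\n";   # 1-4 2-4 3-4
```

A plain next inside the inner loop would only skip the column 5 and still visit column 6; naming the outer label skips the remainder of the whole row.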

Example 1: union and intersection of arrays

@a = (1, 3, 5, 6, 7, 8);

@b = (2, 3, 5, 7, 9);

@union_arr = @isect_arr = ();

%union_hash = %isect_hash = ();

foreach $e (@a) { $union_hash{$e} = 1 }

foreach $e (@b) {

if ($union_hash{$e} ) { $isect_hash{$e} = 1 }

$union_hash{$e} = 1;

}

@union_arr = keys %union_hash;

@isect_arr = keys %isect_hash;

Example 2: Find common keys in hashes

my @common = ();

foreach $c (keys %hash1) {

if (exists ($hash2{$c}) ) {

push(@common, $c);

}

}

6. Subroutines:

sub NAME BLOCK # A declaration and a definition.

NAME(LIST); # calling the subroutine directly

Arguments arrive in @_; its elements are aliases for the caller's values, so modifying $_[0] changes the first argument.

$bestday = max($mon,$tue,$wed,$thu,$fri);

sub max {

my(@values)=@_; #warning: this is copy, use references to save time.

my($max) = shift(@values);

foreach my $foo (@values) {

if($max < $foo){

$max = $foo;

}

}

return $max;

}
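The two ways of handling arguments mentioned above can be sketched as follows: changing @_ directly affects the caller's variables, and passing a reference to an array avoids copying its elements. The subroutine names double_in_place and max_of are hypothetical.

```perl
use strict;
use warnings;

# Elements of @_ are aliases, so this changes the caller's variables.
sub double_in_place {
    $_ *= 2 for @_;
}

# Takes a reference to an array; the elements are not copied.
sub max_of {
    my ($values_ref) = @_;
    my $max = $values_ref->[0];
    foreach my $v (@$values_ref) {
        $max = $v if $v > $max;
    }
    return $max;
}

my ($x, $y) = (3, 4);
double_in_place($x, $y);
print "$x $y\n";                 # 6 8

my @temps   = (12, 31, 7, 25);
my $hottest = max_of(\@temps);
print "$hottest\n";              # 31
```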

7. Command line arguments: in @ARGV. For example:

#!/usr/bin/perl

my($param1,$param2,$param3)=@ARGV;

print "got $param2 \n";

------

> perl myscript.pl a b c

got b

8. Basic I/O

$a = <STDIN>; # read the next line

@a = <STDIN>; # all remaining lines as a list, until ctrl-D

while ($line=<STDIN>) {

chomp($line);

# other operations with $line here

}

open(FILEHANDLE,"somename"); #opens the filehandle for reading

open(OUT, ">outfile"); #opens the filehandle for writing

open(LOGFILE, ">>mylogfile"); #opens the filehandle for append

close(LOGFILE); #finished with a filehandle

example:

open (OUTFILE, ">./dir/out.txt") or die "cannot open output file: $!";

open (INFILE, $ARGV[0]) or die "cannot open $ARGV[0]: $!";

while ($line = <INFILE>){

chomp($line);

print OUTFILE "saw $line here \n";

}

close(INFILE);

close(OUTFILE);
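The same write, append and read steps can also be written with lexical filehandles and the three-argument form of open, which keeps the mode separate from the file name. This is a sketch, not the form used in these notes; the temporary file from the core File::Temp module stands in for a real path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for a real path (assumption for this sketch).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "first\n";
close($out);

open($out, '>>', $path) or die "Cannot append to $path: $!";
print $out "second\n";
close($out);

open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;      # list context: all lines at once
close($in);
chomp(@lines);

print "read: @lines\n";   # read: first second
```

Checking the result of open with or die reports a missing file or directory immediately instead of silently printing to an unopened handle.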

9. Pattern Matching

while ($line = <FILE>) {

if($line =~ /http:/) # match operator // pattern binding operator =~

{print $line;} # prints all lines from FILE that include substring http:

}

$italiano =~ s/butter/olive oil/; # substitution operator s///

# substitutes the first butter with olive oil in $italiano (add the g modifier to replace all)

$a =~ s/x//; #delete the first x

$a =~ s/x//g; #delete all x characters (g is modifier for global substitution)

@arr = split /aaa/, "jaaalaaah"; # @arr=("j","l","h")

anchors: $ matches at the end of the string, ^ matches at the beginning of the string

/a$/ match "abba" but not "abb"

/^http:/ match "http:/../...", but not "located in http:..."

quantifiers:

* 0 or more times

+ 1 or more times

? 0 or 1 times

{min,max} between min and max times. {min,} at least min times. {0,max} at most max times ({,max} is accepted only from Perl 5.34 on)

{num} exactly num

/abc*/ match ab, abc , abcc , abccc

/abc+/ match abc , abcc , abccc but not ab

/(abc){2,}/ match abcabc and abcabcabcabc but not abccabc

/(abc){3}/ match abcabcabc (and any string containing it, such as abcabcabcabc) but not abcabc; use /^(abc){3}$/ to match exactly three repetitions

. match any single character except \n

/Frodo./ match "Frodon" but not "Frodo\n"

/Frodo\./ match only "Frodo." (\ removes the special meaning of a metacharacter)

metasymbols:

\t tab

\w word character[a-zA-Z_0-9]

\s whitespace [ \t\n\r\f]

\d digit[0-9]

/^\w+\s+\w+$/ match exactly two words

/a\t/ match a followed by a tab. /a\\t/ match a followed by a literal backslash and the letter t

alternations:

/I am (Fred|Wilma|Pebbles) Flintstone/ match exactly one of the names

capture and clustering:

while($line =<STDIN>){

if($line =~ /^(.*):(.*)$/) {

$hash{$1}=$2;

}}

s/^(\w+) (\w+)/$2 $1/ swaps the first two words

/\s(\w+)\s\1\s/ match a word that is immediately repeated (\1 refers back to the first capture)
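The capture, substitution and backreference patterns above can be tried on concrete strings. A minimal sketch with made-up sample text:

```perl
use strict;
use warnings;

# Capturing: $1 and $2 hold what the two groups matched.
my $line = "name: Frodo Baggins";
my ($key, $value);
if ($line =~ /^(\w+):\s*(.*)$/) {
    ($key, $value) = ($1, $2);
}
print "$key => $value\n";        # name => Frodo Baggins

# Swapping the first two words.
my $swapped = "hello world again";
$swapped =~ s/^(\w+) (\w+)/$2 $1/;
print "$swapped\n";              # world hello again

# Backreference: \1 must repeat whatever group 1 matched.
my $text     = "this is is a test";
my $repeated = ($text =~ /\s(\w+)\s\1\s/) ? $1 : "";
print "repeated word: $repeated\n";   # repeated word: is

# With and without the g modifier.
my $butter = "butter and more butter";
(my $first_only = $butter) =~ s/butter/oil/;
(my $all        = $butter) =~ s/butter/oil/g;
print "$first_only\n";           # oil and more butter
print "$all\n";                  # oil and more oil
```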

10. References and Data structures:

Creating reference:

$scalarref=\$scalar;

$arrref=\@arr;

$hashref=\%hash;

$arrref=["a","b","c","d"]; #anonymous data

$hashref={"red",1,"green",2}; #anonymous data

Dereferencing:

$val = $$scalarref;

@arr = @$arrref;

$val = $arrref->[2];

%hash = %$hashref;

$val = $hashref->{"red"};
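One consequence of the forms above is that @$arrref on the right-hand side of an assignment copies the array, while operations through the reference change the original. A short sketch with hypothetical variable names:

```perl
use strict;
use warnings;

my @colors = ("red", "green");
my $ref    = \@colors;      # reference to the existing array
my @copy   = @$ref;         # dereferencing into a new array copies it

push @$ref, "blue";         # changes @colors itself
$copy[0] = "pink";          # changes only the copy

print "@colors\n";          # red green blue
print "@copy\n";            # pink green
print $ref->[2], "\n";      # blue
```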

Data structures – examples:

@arr = (1,2,{"red","flowers","blue","sky"},["a","b","c","d"]); #creating ds

$val = $arr[2]->{"blue"}; # accessing ds: $val="sky". Same as $arr[2]{"blue"}

$arr[3] = []; # $arr[3] now refers to a new empty array; the old ["a","b","c","d"] is discarded
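The same building blocks give a hash whose values are arrays. In this sketch the hash %pets and its contents are invented; the point is the @{ ... } form needed to treat a stored reference as an array.

```perl
use strict;
use warnings;

# Each value is a reference to an anonymous array.
my %pets = ("dogs", ["Rex", "Fido"], "cats", ["Tom"]);

push @{ $pets{"cats"} }, "Felix";           # append through the reference

my $second_dog = $pets{"dogs"}[1];          # same as $pets{"dogs"}->[1]
my $cat_count  = scalar @{ $pets{"cats"} }; # number of elements

foreach my $kind (sort keys %pets) {
    print "$kind: @{ $pets{$kind} }\n";
}
# cats: Tom Felix
# dogs: Rex Fido
```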

Array of arrays of all STDIN words:

my(@LoL);

while ($line = <STDIN>) {

my @tmp = split " ", $line;

push @LoL, [ @tmp ];

}

… or you may use only: push @LoL, [split " ", $line ];

foreach $i (0..$#LoL){

$row_ref = $LoL[$i];

foreach $j (0.. $#{$row_ref}){

print "element $i $j is $LoL[$i][$j] \n";

}

}