.Open Notes on Perl
. Introduction
Perl inherits many ideas from $UNIX tools but rejects the $UNIX philosophy that several simple tools are better than a single complex tool. It combines ideas from the Bourne shell and the following programs: tr, sed, awk, ed, grep, egrep, plus the programming language C. The result is available for most operating systems and is popular for writing scripts that respond to requests for WWW pages through the Common Gateway Interface. See
.See http://perl.oreilly.com/news/importance_0498.html
for a discussion of the importance of Perl.

Perl is a block structured language with dynamic scoping ("local") and, since Perl 5, static scoping ("my"). It uses special characters ($, @, %, &) to distinguish different types of data (scalar, array, associative array, subprogram).
. PERL on Windows
Question From Alumnus (Anonymized)
.Box
Hello, I hope everything is going good at the school for you both. I have continued to work at XYZ Systems since I graduated and we've run into a problem that requires us to use PERL 5.8.4 and I have been unable to locate a pre-compiled or source code with compilation instruction for a windows system.
.Close.Box
Answer From Dr. Gomez
.Box
Try:
.See http://win32.perl.org/wiki/index.php?title=Main_Page
Also:
.See http://www.perl.com/download.csp#win32
These guys have 5.8.8 binaries for windows, but this will work only if you are not sensitive to minor version number:
.See http://www.activestate.com/Products/activeperl/index.mhtml
.Close.Box
. Warnings for UNIX Experts
Look out when assigning to scalars:
.As_is $var = whatever
is ok in Perl, but abnormal in 'sh'.
Operations on an element in array @`a` or associative array %`a` use $`a`.
Don't use the `sed` \(....\) brackets for subpatterns. Use (....).
Outside the pattern itself, don't use the `sed` \1..\9 but $1..$9 for strings that match subpatterns.
Parentheses like ( ... ) are not like in `egrep`.
Not $`i` but $ARGV[`i`] for command line arguments, and $_[`i`] for arguments to subprograms.
1/2 is 0.5 not 0!
Security gotcha: Strings handed to a shell are reinterpreted and not quoted.
This lets malicious users execute unexpected commands on your server.
. Lexicon
Perl uses the American Standard Code for Information Interchange or:
|-$ASCII
(ASCII)|-
.Box
white_space::= $ASCII.SP | $ASCII.HT | ..., -- space, tab, etc.
doublequote::="\"",
quote::="'",
backquote::="`",
backslash::="\\".
semicolon::=";".
colon::=":".
dollar::="$".
left_brace::="{".
right_brace::="}".
EOLN::=`End of line`.
.Close.Box
type_indicator::= $dollar | "@" | "%".
variable::= $all_variable | $scalar_variable | $array_variable | $associative_array_variable.
all_variable::= "*" identifier, -- "*"`i` indicates "$"`i`, "%"`i`, and "@"`i`.
scalar_variable::= $dollar identifier.
array_variable::= "@" identifier.
associative_array_variable::= "%" identifier.
subprogram::= "&" identifier.
get_line_of_standard_input::= "<>", also see the
.See read_while_file f doing p
cliche.
default_variable::= "$_" | "@_" | "%_".
environment_variable::= "%ENV".
()|- environment_variable ==>associative_array_variable.
arguments::= "@ARGV".
()|- arguments ==>array_variable.
name_formal_arguments::= "local(" $List($variable) ")" "=" $arguments ";".
string::= $single_quoted_string | $double_quoted_string | $back_quoted_string.
single_quoted_string::= $quote #non($quote) $quote.
double_quoted_string::= $doublequote #( non($doublequote|$backslash) | escape ) $doublequote.
back_quoted_string::= $backquote #non($backquote) $backquote.
comment::= $shell_comment, -- Perl has no C-style "/*" ... "*/" comments.
shell_comment::= "#" #non_end_of_line.
. Perl Patterns
perl_sub_string::= "(" $regular_expression ")", -- like $UNIX "\(....\)" but nesting is ok and the UNIX/sed/ed backslashes must not be used.
sub_string_variable::= $dollar digit.
perl_anchors::= "\\" ( "b" | "B" ),
.As_is b word boundary (between \w and \W)
.As_is B non-word boundary.
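The sub-string and anchor notation above can be illustrated with a minimal sketch. The sample text and the variable names ($text, @whole_words, $outer, $inner) are invented for this example and are not part of the notes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample text, invented for this sketch.
my $text = "cat concatenate cat's";

# \b matches between a word character and a non-word character,
# so the "cat" buried inside "concatenate" is not counted.
my @whole_words = ($text =~ /\bcat\b/g);

# Plain ( ... ) marks a sub-pattern, with no sed-style backslashes.
# Sub-patterns may nest: $1 is the outer group, $2 the inner one.
my ($outer, $inner) = ('', '');
if ("key=value" =~ /^((\w+)=\w+)$/) {
    ($outer, $inner) = ($1, $2);
}

print scalar(@whole_words), " whole-word matches\n";
print "outer=$outer inner=$inner\n";
```

Groups are numbered by the position of their opening parenthesis, which is why the enclosing group is $1 here.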
perl_wild_characters::= "\\" ("d" | "D" | "w" | "W" | "s" | "S"),
.As_is d decimal digit [0-9]
.As_is D non-digit
.As_is w word-character [0-9a-zA-Z_]
.As_is W non-word character
.As_is s whitespace [ \t\n\r\f]
.As_is S non-whitespace character.
. Some Special Variables
.As_is $| -- controls output buffering, set to 1 to flush output after every write or print.
.As_is $%, $=, $-, $~, $^ -- used in page and column layout for reports
.As_is $1 .. $9 -- Parts of matched patterns in parentheses
.As_is $& -- the string matched by the last pattern
.As_is $`,$&,$' -- Before match, match, after match
.As_is $$ -- UNIX Process Id
.As_is $? -- status report from pipe, sub-shell, etc
.As_is $* -- set to 1 to do multiline matches, 0 for efficient one line matching
.As_is $/ -- end of input record separator, set to "" to treat paragraphs as a record.
.As_is $0 -- name of perl script
.As_is $[ -- base of arrays, defaults to 0.
.As_is $; -- separates dimensions in multidimensional index
.As_is $! -- error number
.As_is $@ -- Perl error message from the last eval
.As_is $<, $>,$(, $) -- user and group ids on UNIX
.As_is $: -- Characters after which a string can be broken when wrapping words in a format
.As_is $#A -- index of the last scalar in array @A (one less than the number of scalars when $[ is 0)
.As_is ARGV -- Command line arguments
.As_is <ARGV> -- file_handle that becomes each argument in turn
.As_is $ARGV -- name of the current file being read through <ARGV>
.As_is @ARGV -- array of arguments
.As_is %ENV -- environment handed to perl program by operating system.
.As_is STDERR -- Standard error output
.As_is STDIN -- Standard input
.As_is STDOUT -- Standard output
. Some Perl Operators
infix_operator::= "," | "=" | ".." | "||" | "&&" | "|" | "^" | "&" | "<<" | ">>" | "+" | "-" | "." | "*" | "/" | "%" | "x" | "**".
relational_operator::= "==" | "eq" | "!=" | "ne" | "<=>" | "cmp" | "<" | "lt" | ">" | "gt" | ">=" | "ge" | "<=" | "le".
pattern_binding_operator::= "=~" | "!~".
assignment_operator::= "=" | $infix_operator "=".
prefix::= "!" | "~" | "++" | "--" | "-".
postfix::="++" | "--".
. Some Perl Functions
"If it looks like a function call then it `is` a function call."[$Camel]
(math): atan2(y,x), cos(r), exp(x), log (base e), rand(expr) (in 0..(_)), sin(r), sqrt, srand.
chop::=`removes the last character in a string and returns it. Often used to remove the newline character(s) from input. Defaults to operate on "$_"`.
defined::=`true if $lvalue has been given a value, else false`.
delete $a{k}::= `remove item with key k from associative array a`.
die(s)::=`exit from eval or perl and produce error message s as $@ (from eval) or on STDERR`.
dump::$statement, produce core dump.
each::$associative_array->iterator, returns the next item in an associative array as an array of a key and a value.
eval::$expression->statement, treat result of expression as a program and execute it as a subprogram -- dangerous if the expression is unchecked.
exec(s)::$statement, replace this perl program by the program in list s -- dangerous when used with unchecked (`tainted`) user supplied data.
exit(n)::$statement, terminate perl program ahead of time.
goto(l)::$statement, transfer control -- inefficient but used for translating from 'sed'.
grep(e,l)::=`Sets $_ to each element in list l and returns an array of those that make e true`.
hex(s)::=`the number that string s represents in hexadecimal notation`.
index(s,ss,p)::= `return first position at or after p in s which starts with string ss, if any, else returns $[-1`.
index(s,ss)::=index(s,ss,$[).
int(e)::=`integer part of expression e, truncated towards zero`, so int(-3.7) is -3.
join(e,l) ::= `concatenate items in l with e as a separator`.
keys::associative_array->array=`list of keys in associative array`.
length(s)::=`number of characters in s`.
oct(s)::= `the number that string s represents in octal notation`.
ord(e)::=`$ASCII code value for first character in value of expression e`.
pop::$statement, remove and return the last element of an array.
print(e,...)::$statement, outputs one or more expressions as strings to a file or output.
printf(f,e,...)::$statement, formatted $print -- like $C.
push(a, l)::=`put l at end of array a`.
(range): e1 .. e2 = `an array of numbers/characters between e1 and e2 inclusive`.
reverse(a)::=`reverse scalar(string) or order of items in an array`.
(rindex): see $index.... but in reverse order.
s(p,r)::=`$substitute string r for pattern p`.
scalar(e)::=`evaluate expression e in scalar context`.
shift(a)::=`remove and return the first item of array a, moving the remaining items down one place`.
sort(s,l)::=`reorder items in l so that whenever $a comes before $b in the result, s($a,$b)<=0`.
splice(a,p,n,l)::=`remove items p .. p+n-1 from a and replace by l`.
split(/$pattern/, e, limit)::=`value of expression e is split at occurrences of pattern (into at most limit pieces)...`.
substr(s, p, l)::=`substring s[p].. s[p+l-1]`.
tr::$statement, $translate strings.
undef(v)::$statement, make value of v undefined.
unshift(a, l)::=`put l at the start of array a`, opposite of $shift.
values::associative_array->array, values found in an associative array.
vec(s,o,b)::=`treat string s as a vector of b-bit unsigned integers and return element o`.
write::$statement, writes a formatted record, see $Formats below.
.Open Syntax
There is a good but informal description at
.See http://www.mincom.com/mtr/perl/pod/perlsyn.html
What follows are some incomplete (but formal) jottings. I've been scratching down the syntax of the various languages I use like this since roughly 1966. I found the following quote from the Perl $FAQ a good excuse for not trying a complete description:
.Box
In the words of Chaim Frenkel: 'Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors.'
.Close.Box
element_in_array::= $dollar identifier "[" numeric_expression "]".
element_in_associative_array::= $dollar identifier "{" expression "}".
. String Handling
pattern_bind::= $variable ("=~" | "!~" ) $pattern.
pattern::= "/" $regular_expression "/".
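As a hedged sketch of the pattern_bind rule above, the following fragment applies "=~" and "!~" to a named variable rather than to the default "$_". The input lines and the names @lines, $line, and @kept are assumptions made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input lines, invented for this sketch.
my @lines = ("alpha 1", "# a comment", "beta 22");

my @kept;
foreach my $line (@lines) {
    # "!~" is true when the pattern does NOT match the variable.
    next if $line !~ /\d/;
    # "=~" binds the pattern to $line instead of the default $_.
    if ($line =~ /^(\w+) (\d+)$/) {
        push(@kept, "$1:$2");
    }
}

print join(",", @kept), "\n";
```

The line without a digit is skipped by the "!~" test, and the two remaining lines are rewritten from their captured sub-strings.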
regular_expression::= `the standard $UNIX $RE plus some special Perl features, $perl_anchors, $perl_wild_characters, $perl_sub_string, minus \(...\)`.
.As_is . any single character except end of line
.As_is * any number of previous including none
.As_is + one or more of the previous
.As_is ? zero or one of the previous ("optional" -- compare $O)
.As_is [abc] One of the listed elements
.As_is [^ab] one of the unlisted elements
.As_is ...
translate::= |[t:char](tr t pattern_with_no(t) t #non(t) t tr_options).
substitute::= |[t:char](s t pattern_with_no(t) t #non(t) t s_options).
match::= |[t:char](m t pattern_with_no(t) t m_options).
pattern_with_no(t)::=`pattern with no unescaped t`.
chop_of_end_of_line::= "chop(" expression ")", the easiest way to remove a "\n" at the end of a string.
. File Handling
file_operations::= "open(" $filehandle "," expression ")" | "close(" $filehandle ")" | "print" $filehandle expression | ..., -- the expression given to open holds the mode and the $filename, for example ">out.txt".
file_tests ::= "-" ("r" | "w" | "x" | ...)
.Set
-r readable
-w writable
-x executable
...
.Close.Set
get_next_line_of_input_on(f)::= "<" f ">".
read_while_file f doing p::= "while(<" f ">)" p,
|- perl(`while(<f>)p`) = perl(`while($_=<f>) p`),
|- perl(`while(<>)p`) = perl(`while($_=<ARGV>) p`).
. Subprograms
subprogram_declaration::= "sub" identifier ";".
subprogram_definition::= "sub" identifier $block, note that the formal parameters are not declared as part of the subprogram. They are implicitly available as "@_". The perl way to define a subprogram with three arguments x,y,z is to write:
.As_is sub foo{
.As_is local($x,$y,$z)=@_;
.As_is ...
.As_is }
local_declaration::= "local(" $List(variable) ")" $O( initialization ) -- dynamic scoping.
private_declaration::= "my(" $List(variable) ")" $O( initialization ) -- static scoping, added in Perl 5.
dangerous_shell_escape::="system(" os_command_line ")".
possibly_safe_shell_escape::="system(" os_command "," arguments ")" -- the list form runs the command without a shell, so the arguments are not reinterpreted.
evaluate_string::= "eval" "(" expression ")".
. Control Structure
selection::= $simple_selection | "if" "(" expression ")" $block #( "elsif" "(" expression ")" $block ) $O( "else" $block).
simple_selection::= statement $O(("if" | "unless") expression).
if_statement::= "if" "(" expression ")" $block.
if_then_else_statement::="if" "(" expression ")" $block "else" $block.
block::= "{" #$statement "}".
loop::=$simple_loop | $O($label ":") $O($loop_clause) $block.
simple_loop::=statement $O(("while" | "until") expression).
loop_clause::= ("while"|"until") "(" $condition ")" | $C_for_loop | "foreach" $variable "(" $array ")".
C_for_loop::= "for" "(" $O(expression) ";" $O(expression) ";" $O(expression) ")".
while_statement::= "while" "(" expression ")" $block.
inner_loop_control::= ("next" | "last" | "redo" ) $O($label), note that redo and last can be used inside any (labeled) block.
statement::= $block | $loop | $selection | $local_declaration | $private_declaration | $inner_loop_control | expression ";" |... .
. Long Literal
Perl inherits an interesting idea from the $UNIX Bourne shell: that the data for an operation can be supplied by a series of lines:
long_string ::= "<<" ( $single_quoted_string | $double_quoted_string | identifier ) ";" #line line_equal_to_content_of_string.
.As_is $example = << 'ENDIT';
.As_is Even the longest string
.As_is must end it somewhere.
.As_is ENDIT
When the terminating string is double quoted or unquoted, each line can contain variables that are interpreted and placed in the string. With a single quoted terminator, as in the example above, the lines are taken literally. This is very handy for form letters...
. Formats
Perl has a very special technique for formatting data into reports. For example all write statements that go to a particular $filehandle can be forced to fit a given page layout using:
page_header::="format" "top" "=" #line dot_line.
formatted_output::= "format" file_handle "=" #(format_line data_line) dot_line.
. Modules
Files and directories form a module hierarchy as of Perl 5 (a bit like Java here!) with the C++ double-colon symbol:
.As_is directory::file
. Objects and Structures
Added in Perl 5 and very incomplete.
.As_is $variable={}; -- assign a reference to an empty anonymous associative array (an empty object)
.As_is $variable->{FIELD}; -- access field/method...
.Close Syntax
. Glossary
lvalue::="left hand value" -- an expression that can be put meaningfully on the left hand of an assignment, from BCPL.
filehandle::="an identifier that a program uses as an internal name for an open file", cf $filename.
filename::="a string that is used by an operating system to identify a file".
. Notation
O::=`optional (_)`.
non::=`any character except those in (_)`.
List::= (_) #("," (_)).
# ::= `any number including none of (_)`
|- #X = $O( X #X ).
. See Also
Online documentation for perl can often be accessed by running the command:
.As_is perldoc perl
(Camel): Larry Wall & Randal L Schwartz, Programming perl, O'Reilly & Associates (Nutshell book).
(FAQ):
.See ftp://ftp.flirble.org/pub/languages/perl/CPAN/doc/manual/html/pod/perlfaq.html
(reference_on_WWW):
.See http://reference.perl.com/
(smith99): B. Smith posted the following on comp.software_eng in April 1999:
.Box
(home_page):
.See http://www.perl.com/perl
(archives):
.See http://www.perl.org/CPAN/CPAN.html
(CPAN archive, comprehensive collection of Perl programs and modules including most recent versions of Perl itself)
.Close.Box
(usenet): The following place is full of rather rude experts:
.As_is comp.lang.perl.misc
so don't forget to try this command
.As_is perldoc perl
to see if your system has the documentation and then (the newsgroup archive),
.See http://www.dejanews.com/
to search to see if your question/problem has already been addressed, or at least get some background. Then ask your question at
.See news://comp.lang.perl.misc
ASCII::=http://www.csci.csusb.edu/dick/samples/comp.text.ASCII.html.
C::=http://www.csci.csusb.edu/dick/samples/c.html.
MATHS::=http://www.csci.csusb.edu/dick/maths/.
regular_expression::=http://www.csci.csusb.edu/dick/samples/regular_expressions.html.
UNIX::=`an operating system`, See
.See http://www.csci.csusb.edu/cs360/
.See http://www.csci.csusb.edu/dick/samples/unix.syntax.html
.See http://www.csci.csusb.edu/dick/samples/unix.commands.html
.Close Notes on Perl