Saturday, November 04, 2006

Perl Tips

AUTOLOAD

Just try the following script:

sub AUTOLOAD {
    my $program = our $AUTOLOAD;    # fully qualified name, e.g. main::date
    $program =~ s/.*:://;           # strip the package prefix
    system($program, @_);           # run it as an external command
}

date();
ls('-l');
whoami();
tree();
# Any other command-line program on your PATH can be called the same way

When Perl encounters a call to a function that is not defined, it looks for a function called AUTOLOAD in the same package. If one exists, it's called with the same arguments as the original function would have had, and the fully qualified name of the original function is placed in the package variable $AUTOLOAD.
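To watch the mechanism without running any external commands, here is a minimal sketch that only records what AUTOLOAD receives. The names greet, frobnicate and @calls are made up for illustration, and the DESTROY check is a common precaution rather than something the script above needs.

```perl
use strict;
use warnings;

our $AUTOLOAD;
my @calls;    # every call that fell through to AUTOLOAD (illustrative name)

sub AUTOLOAD {
    my $name = $AUTOLOAD;            # e.g. "main::greet"
    $name =~ s/.*:://;               # strip the package prefix
    return if $name eq 'DESTROY';    # do not treat object destruction as a call
    push @calls, "$name(" . join(', ', @_) . ")";
    return scalar @_;                # assumed return value: the argument count
}

greet('world');          # not defined anywhere, so AUTOLOAD handles it
frobnicate(1, 2, 3);     # same here

print "$_\n" for @calls;
# prints:
# greet(world)
# frobnicate(1, 2, 3)
```

The arguments arrive in @_ exactly as they were passed, which is why the original script can hand them straight to system().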

Compare big strings with bitwise exclusive-OR

Suppose there are two strings, $string1 and $string2, that are almost identical but differ in a few places. If you would like to know the character position where the two strings first differ, try the following:

my $string_xor = ("$string1" ^ "$string2");
$string_xor =~ /^(\0*)/;
print "Position = " . length($1) ."\n";

The XOR operator (^) returns a string in which every matching byte is null and every mismatching byte has at least one bit set. The second statement captures the run of null characters at the beginning. Its length is the zero-based position where the two strings first differ. If the strings are identical, it is simply the length of the strings.
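The same idea can be wrapped in a small subroutine. This is only a sketch: the name first_diff and the choice of returning -1 for identical strings are assumptions made here, not part of the original tip.

```perl
use strict;
use warnings;

# Returns the zero-based offset of the first differing byte,
# or -1 if the two strings are identical (an assumed convention).
sub first_diff {
    my ($s1, $s2) = @_;
    return -1 if $s1 eq $s2;
    my $xor = "$s1" ^ "$s2";    # quoting forces string (not numeric) XOR
    $xor =~ /^(\0*)/;           # capture the leading run of null bytes
    return length $1;
}

print first_diff('perl tips', 'perl taps'), "\n";    # 6
print first_diff('same', 'same'), "\n";              # -1
print first_diff('short', 'shorter'), "\n";          # 5
```

When the strings have different lengths, Perl treats the shorter one as padded with null bytes, so the extra characters of the longer string show up as a difference.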

When both operands are strings, the ^ operator performs a bitwise exclusive-OR on each pair of corresponding bytes. For each pair of bits, the truth table looks like

a b output
0 0 0
0 1 1
1 0 1
1 1 0
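To see the truth table at work on real bytes, here is a small sketch that XORs two short strings and prints the result in hex and binary. The sample strings 'cat' and 'car' are just for illustration.

```perl
use strict;
use warnings;

my $xor = 'cat' ^ 'car';             # string XOR, byte by byte
my $hex = unpack('H*', $xor);        # hex dump of the result
print "$hex\n";                      # 000006

# The last byte: 't' is 01110100 and 'r' is 01110010
my $bits = sprintf '%08b', ord('t') ^ ord('r');
print "$bits\n";                     # 00000110
```

The first two bytes match, so they XOR to null. Only the bits that differ between 't' and 'r' are set in the last byte.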

Of course, there are also other ways of comparing two strings. For instance, you can do something like this:

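my $strcnt = 0;
# Stop at the end of the shorter string so identical strings cannot loop forever
my $max = length($string1) < length($string2) ? length($string1) : length($string2);
$strcnt++ while $strcnt < $max && substr($string1, $strcnt, 1) eq substr($string2, $strcnt, 1);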

It is up to you to choose your flavor.

Simple but cheeky array manipulations

my @array = (1, 2, 3, 4, 5, 6, 7, 8);

# To retain just the first 5 elements and delete the rest,
# set the last index to 4 (indices start at 0)
$#array = 4;
print join("\n", @array), "\n";

# To chop the last two elements
$#array -= 2;
print join("\n", @array), "\n";
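Since $#array holds the index of the last element, it is always one less than the element count. Here is a short sketch of that relationship, with splice shown as an alternative way to chop elements off the end. The variable names are illustrative.

```perl
use strict;
use warnings;

my @array = (1 .. 8);
print "last index: $#array, count: ", scalar(@array), "\n";
# last index: 7, count: 8

$#array = 4;            # keep indices 0..4, i.e. the first 5 elements
print "@array\n";       # 1 2 3 4 5

splice(@array, -2);     # alternative: remove the last two elements
print "@array\n";       # 1 2 3
```

Assigning to $#array can also grow an array. The new slots are filled with undef.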

Perl XML

Don't know which XML module will suit your needs?

Here is a very good FAQ: http://perl-xml.sourceforge.net/faq/

A module to explain your regular expressions!

Confused by regular expressions? Reading someone else's code and don't know what a particular regular expression does? Here is a cool module that might help you: YAPE::Regex::Explain

Say you have the regular expression "([^>]+?)\..*", and want to know what it will match. Then try this:

use YAPE::Regex::Explain;
my $myregex = '([^>]+?)\..*';
print YAPE::Regex::Explain->new($myregex)->explain;

The output of the above code will be:

The regular expression:

(?-imsx:([^>]+?)\..*)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^>]+?                   any character except: '>' (1 or more
                             times (matching the least amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \.                       '.'
----------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

Cool, right?
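To check that explanation against real input, here is a small sketch that runs the same regular expression on two sample strings. The inputs and the @results array are made up for illustration, and this part needs no extra module.

```perl
use strict;
use warnings;

my $myregex = '([^>]+?)\..*';
my @results;    # one line of report per input (illustrative name)

for my $input ('index.html.bak', 'no dots here') {
    if ($input =~ /$myregex/) {
        push @results, "'$input' captured '$1'";
    }
    else {
        push @results, "'$input' did not match";
    }
}

print "$_\n" for @results;
# prints:
# 'index.html.bak' captured 'index'
# 'no dots here' did not match
```

The non-greedy [^>]+? stops at the first '.', so only 'index' lands in $1, while the greedy .* swallows the rest of the line.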

Perl - BEGIN... END

You will have seen BEGIN, END, etc. in some Perl code. Here is what they do (from chapter 18 of Programming Perl, http://www.oreilly.com/catalog/pperl3/chapter/ch18.html):

These four block types run in this order:

BEGIN: Runs ASAP (as soon as parsed) whenever encountered during compilation, before compiling the rest of the file.

CHECK: Runs when compilation is complete, but before the program starts. (CHECK can mean "checkpoint" or "double-check" or even just "stop".)

INIT: Runs at the beginning of execution right before the main flow of your program starts.

END: Runs at the end of execution right after the program finishes.

If you declare more than one of these by the same name, even in separate modules, the BEGINs all run before any CHECKs, which all run before any INITs, which all run before any ENDs--which all run dead last, after your main program has finished. Multiple BEGINs and INITs run in declaration order (FIFO), and the CHECKs and ENDs run in inverse declaration order (LIFO).
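The ordering described above can be watched with a small sketch that declares two blocks of each kind and records when each one runs. The @order array and the labels are illustrative. Note that @order is declared without an assignment, because an assignment would run at run time and wipe out what the compile-time blocks had already pushed.

```perl
use strict;
use warnings;

my @order;    # no "= ()" here, so compile-time pushes survive

BEGIN { push @order, 'BEGIN 1' }
END   { push @order, 'END 1'; print join(', ', @order), "\n" }
INIT  { push @order, 'INIT 1' }
CHECK { push @order, 'CHECK 1' }
BEGIN { push @order, 'BEGIN 2' }
END   { push @order, 'END 2' }
INIT  { push @order, 'INIT 2' }
CHECK { push @order, 'CHECK 2' }

push @order, 'main';

# When run as a standalone script, the first END block declared runs last
# and prints:
# BEGIN 1, BEGIN 2, CHECK 2, CHECK 1, INIT 1, INIT 2, main, END 2, END 1
```

CHECK and INIT blocks are only run when the file is compiled as the main program, so this should be saved to a file and run with perl rather than passed through a string eval.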
