Interpreted Languages: Perl, PHP, Python, Ruby (Sheet Two)

a side-by-side reference sheet

sheet one: grammar and invocation | variables and expressions | arithmetic and logic | strings | regexes | dates and time
                 arrays | dictionaries | functions | execution control

sheet two: file handles | files | directories | processes and environment | libraries and namespaces | objects | reflection
                 net and web | unit tests | debugging and profiling | java interop | contact

file handles
perl php python ruby
standard file handles STDIN STDOUT STDERR only set by CLI; not set when reading script from standard in:
STDIN STDOUT STDERR
sys.stdin sys.stdout sys.stderr $stdin $stdout $stderr
read line from stdin
 
$line = <STDIN>; $line = fgets(STDIN); line = sys.stdin.readline() line = gets
end-of-file behavior returns string without newline or undef returns string without newline or FALSE returns string without newline or '' returns string without newline or nil
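The Python entry above can be checked with a short loop. This is a minimal sketch for Python 3, assuming an in-memory io.StringIO stream stands in for sys.stdin; read_lines is a hypothetical helper name, not a library function.

```python
import io


def read_lines(stream):
    """Collect lines until readline() signals end-of-file with ''."""
    lines = []
    while True:
        line = stream.readline()
        if line == '':
            # '' means end-of-file; a blank line would be '\n'
            break
        lines.append(line.rstrip('\r\n'))
    return lines


# the last line has no trailing newline, as can happen at end-of-file
demo = io.StringIO('first\n\nlast')
result = read_lines(demo)
print(result)
```

An empty line in the input arrives as '\n', so it is not mistaken for end-of-file.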
chomp
 
chomp $line; $line = rtrim($line, "\r\n"); line = line.rstrip('\r\n') line.chomp!
write line to stdout
 
print "Hello, World!\n"; echo "Hello, World!\n"; print('Hello, World!') puts "Hello, World!"
write formatted string to stdout use Math::Trig 'pi';

printf("%.2f\n", pi);
printf("%.2f\n", M_PI); import math

print('%.2f' % math.pi)
printf("%.2f\n", Math::PI)
open file for reading
 
open my $f, "/etc/hosts" or die; $f = fopen("/etc/hosts", "r"); f = open('/etc/hosts') f = File.open("/etc/hosts")
open file for writing
 
open my $f, ">/tmp/test" or die; $f = fopen("/tmp/test", "w"); f = open('/tmp/test', 'w') f = File.open("/tmp/test", "w")
open file for appending open my $f, ">>/tmp/err.log" or die; $f = fopen("/tmp/err.log", "a"); f = open('/tmp/err.log', 'a') f = File.open("/tmp/err.log", "a")
close file
 
close $f or die; fclose($f); f.close() f.close
close file implicitly {
  open(my $f, ">/tmp/test") or die;
  print $f "lorem ipsum\n";
}
none with open('/tmp/test', 'w') as f:
  f.write('lorem ipsum\n')
File.open("/tmp/test", "w") do |f|
  f.puts("lorem ipsum")
end
i/o errors return false value return false value and write warning to stderr raise IOError exception raise IOError or subclass of SystemCallError exception
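As a sketch of the Python behavior in the row above: opening a missing file raises an exception rather than returning a false value. The helper name read_or_none and the temporary path are assumptions made for illustration. In Python 3, IOError is an alias of OSError.

```python
import os
import tempfile


def read_or_none(path):
    """Return the file's contents, or None if the file cannot be opened."""
    try:
        with open(path) as f:
            return f.read()
    except IOError as err:  # alias of OSError in Python 3
        print('open failed: %s' % err.strerror)
        return None


# a path inside a fresh, empty temporary directory does not exist yet
missing = os.path.join(tempfile.mkdtemp(), 'no-such-file')
result = read_or_none(missing)
```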
read line
 
$line = <$f>; $line = fgets($f); f.readline() f.gets
iterate over file by line
 
while ($line = <$f>) {
  print $line;
}
while (!feof($f)) {
  $line = fgets($f);
  echo $line;
}
for line in f:
  print(line)
f.each do |line|
  print(line)
end
read file into array of strings @a = <$f>; $a = file("/etc/hosts"); a = f.readlines() a = f.readlines
read file into string $s = do { local $/; <$f> }; $s = file_get_contents(
  "/etc/hosts");
s = f.read() s = f.read
write string
 
print $f "lorem ipsum"; fwrite($f, "lorem ipsum"); f.write('lorem ipsum') f.write("lorem ipsum")
write line
 
print $f "lorem ipsum\n"; fwrite($f, "lorem ipsum\n"); f.write('lorem ipsum\n') f.puts("lorem ipsum")
flush file handle
 
use IO::Handle;

$f->flush();
CLI output isn't buffered
fflush($f);
f.flush() f.flush
end-of-file test
 
eof($f) feof($f) none f.eof?
get and set file handle position tell($f)
seek($f, 0, SEEK_SET);
ftell($f)
fseek($f, 0);
f.tell()
f.seek(0)
f.tell
f.seek(0)

f.pos
f.pos = 0
open temporary file use File::Temp;

$f = File::Temp->new();

print $f "lorem ipsum\n";

print "tmp file: ";
print $f->filename . "\n";

close $f or die;

file is removed when file handle goes out of scope
$f = tmpfile();

fwrite($f, "lorem ipsum\n");

# no way to get file name

fclose($f);

file is removed when file handle is closed
import tempfile

f = tempfile.NamedTemporaryFile()

f.write('lorem ipsum\n')

print("tmp file: %s" % f.name)

f.close()

file is removed when file handle is closed
require 'tempfile'

f = Tempfile.new('')

f.puts "lorem ipsum"

puts "tmp file: #{f.path}"

f.close

file is removed when file handle is garbage-collected or interpreter exits
in memory file my ($f, $s);
open($f, ">", \$s);
print $f "lorem ipsum\n";
$s;
$meg = 1024 * 1024;
$mem = "php://temp/maxmemory:$meg";
$f = fopen($mem, "r+");
fputs($f, "lorem ipsum");
rewind($f);
$s = fread($f, $meg);
from StringIO import StringIO

f = StringIO()
f.write('lorem ipsum\n')
s = f.getvalue()

Python 3 moved StringIO to the io module
require 'stringio'

f = StringIO.new
f.puts("lorem ipsum")
f.rewind
s = f.read
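The Python cell in the in-memory file row notes that Python 3 moved StringIO to the io module. A minimal Python 3 sketch of the same steps, under that assumption:

```python
import io

# io.StringIO holds text; io.BytesIO is the counterpart for bytes
f = io.StringIO()
f.write('lorem ipsum\n')

# getvalue() returns everything written so far, regardless of position
s = f.getvalue()

# reading back through the file interface requires rewinding first
f.seek(0)
first_line = f.readline()
```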
files
perl php python ruby
file test, regular file test
 
-e "/etc/hosts"
-f "/etc/hosts"
file_exists("/etc/hosts")
is_file("/etc/hosts")
os.path.exists('/etc/hosts')
os.path.isfile('/etc/hosts')
File.exist?("/etc/hosts")
File.file?("/etc/hosts")
file size
 
-s "/etc/hosts" filesize("/etc/hosts") os.path.getsize('/etc/hosts') File.size("/etc/hosts")
is file readable, writable, executable -r "/etc/hosts"
-w "/etc/hosts"
-x "/etc/hosts"
is_readable("/etc/hosts")
is_writable("/etc/hosts")
is_executable("/etc/hosts")
os.access('/etc/hosts', os.R_OK)
os.access('/etc/hosts', os.W_OK)
os.access('/etc/hosts', os.X_OK)
File.readable?("/etc/hosts")
File.writable?("/etc/hosts")
File.executable?("/etc/hosts")
set file permissions
 
chmod 0755, "/tmp/foo"; chmod("/tmp/foo", 0755); os.chmod('/tmp/foo', 0o755) File.chmod(0755, "/tmp/foo")
copy file, remove file, rename file use File::Copy;

copy("/tmp/foo", "/tmp/bar");
unlink "/tmp/foo";
move("/tmp/bar", "/tmp/foo");
copy("/tmp/foo", "/tmp/bar");
unlink("/tmp/foo");
rename("/tmp/bar", "/tmp/foo");
import shutil

shutil.copy('/tmp/foo', '/tmp/bar')
os.remove('/tmp/foo')
shutil.move('/tmp/bar', '/tmp/foo')
require 'fileutils'

FileUtils.cp("/tmp/foo", "/tmp/bar")
FileUtils.rm("/tmp/foo")
FileUtils.mv("/tmp/bar", "/tmp/foo")
create symlink, symlink test, readlink symlink "/etc/hosts", "/tmp/hosts";
-l "/tmp/hosts"
readlink "/tmp/hosts"
symlink("/etc/hosts", "/tmp/hosts");
is_link("/tmp/hosts")
readlink("/tmp/hosts")
os.symlink('/etc/hosts',
  '/tmp/hosts')
os.path.islink('/tmp/hosts')
os.path.realpath('/tmp/hosts')
File.symlink("/etc/hosts",
  "/tmp/hosts")
File.symlink?("/tmp/hosts")
Ruby 1.9:
File.realpath("/tmp/hosts")
generate unused file name use File::Temp;

$f = File::Temp->new(DIR=>"/tmp",
  TEMPLATE=>"fooXXXXX",
  UNLINK=>0);
$path = $f->filename;
$path = tempnam("/tmp", "foo");
$f = fopen($path, "w");
import tempfile

f, path = tempfile.mkstemp(
  prefix='foo',
  dir='/tmp')
directories
perl php python ruby
working directory use Cwd;

my $old_dir = cwd();

chdir("/tmp");
$old_dir = getcwd();

chdir("/tmp");
old_dir = os.path.abspath('.')

os.chdir('/tmp')
old_dir = Dir.pwd

Dir.chdir("/tmp")
build pathname use File::Spec;

File::Spec->catfile("/etc", "hosts")
"/etc" . DIRECTORY_SEPARATOR . "hosts" os.path.join('/etc', 'hosts') File.join("/etc", "hosts")
dirname and basename use File::Basename;

print dirname("/etc/hosts");
print basename("/etc/hosts");
dirname("/etc/hosts")
basename("/etc/hosts")
os.path.dirname('/etc/hosts')
os.path.basename('/etc/hosts')
File.dirname("/etc/hosts")
File.basename("/etc/hosts")
absolute pathname
and tilde expansion
use Cwd;

# symbolic links are resolved:
Cwd::abs_path("foo")
Cwd::abs_path("/foo")
Cwd::abs_path("../foo")
Cwd::abs_path(".")
# no function for tilde expansion
# file must exist; symbolic links are
# resolved:

realpath("foo")
realpath("/foo")
realpath("../foo")
realpath("./foo")
# no function for tilde expansion
# symbolic links are not resolved:
os.path.abspath('foo')
os.path.abspath('/foo')
os.path.abspath('../foo')
os.path.abspath('./foo')
os.path.expanduser('~/foo')
# symbolic links are not resolved:
File.expand_path("foo")
File.expand_path("/foo")
File.expand_path("../foo")
File.expand_path("./foo")
File.expand_path("~/foo")
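The comments in the row above distinguish functions that resolve symbolic links from those that do not. A small Python 3 sketch of the difference, assuming a platform where os.symlink is permitted; the directory and link names are made up for the example.

```python
import os
import tempfile

# resolve the temp dir itself first, in case the system temp path is a symlink
base = os.path.realpath(tempfile.mkdtemp())
target = os.path.join(base, 'target')
link = os.path.join(base, 'link')

os.mkdir(target)
os.symlink(target, link)

# abspath only normalizes the string; the link name is kept
abs_path = os.path.abspath(link)

# realpath follows the link to the directory it points at
real_path = os.path.realpath(link)

print(abs_path)
print(real_path)
```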
iterate over directory by file opendir(my $dh, $ARGV[0]);

while (my $file = readdir($dh)) {
  print $file . "\n";
}

closedir($dh);
if ($dir = opendir("/etc")) {
  while ($file = readdir($dir)) {
    echo "$file\n";
  }
  closedir($dir);
}
for filename in os.listdir('/etc'):
  print(filename)
Dir.open("/etc").each do |file|
  puts file
end
glob paths while ( </etc/*> ) {
  print $_ . "\n";
}
foreach (glob("/etc/*") as $file) {
  echo "$file\n";
}
import glob

for path in glob.glob('/etc/*'):
  print(path)
Dir.glob("/etc/*").each do |path|
  puts path
end
make directory use File::Path 'make_path';

make_path "/tmp/foo/bar";
mkdir("/tmp/foo/bar", 0755, TRUE); dirname = '/tmp/foo/bar'
if not os.path.isdir(dirname):
  os.makedirs(dirname)
require 'fileutils'

FileUtils.mkdir_p("/tmp/foo/bar")
recursive copy # cpan -i File::Copy::Recursive
use File::Copy::Recursive 'dircopy';

dircopy "/tmp/foodir",
  "/tmp/bardir";
none import shutil

shutil.copytree('/tmp/foodir',
  '/tmp/bardir')
require 'fileutils'

FileUtils.cp_r("/tmp/foodir",
  "/tmp/bardir")
remove empty directory rmdir "/tmp/foodir"; rmdir("/tmp/foodir"); os.rmdir('/tmp/foodir') Dir.rmdir("/tmp/foodir")
remove directory and contents use File::Path 'remove_tree';

remove_tree "/tmp/foodir";
none import shutil

shutil.rmtree('/tmp/foodir')
require 'fileutils'

FileUtils.rm_rf("/tmp/foodir")
directory test
 
-d "/tmp" is_dir("/tmp") os.path.isdir('/tmp') File.directory?("/tmp")
generate unused directory use File::Temp qw(tempdir);

$path = tempdir(DIR=>"/tmp",
  CLEANUP=>0);
import tempfile

path = tempfile.mkdtemp(dir='/tmp',
  prefix='foo')
require 'tmpdir'

path = Dir.mktmpdir("foo", "/tmp")
system temporary file directory use File::Spec;

File::Spec->tmpdir
sys_get_temp_dir() import tempfile

tempfile.gettempdir()
require 'tmpdir'

Dir.tmpdir
processes and environment
perl php python ruby
command line arguments
and script name
@ARGV
$0
$argv
$_SERVER["SCRIPT_NAME"]
sys.argv[1:]
sys.argv[0]
ARGV
$0
command line options
boolean option, option with argument, usage
use Getopt::Long;

my ($file, $help, $verbose);

my $usage =
  "usage: $0 [-f FILE] [-v] [ARG ...]\n";

if (!GetOptions("file=s" => \$file,
               "help" => \$help,
               "verbose" => \$verbose)) {
  print $usage;
  exit 1;
}

if ($help) {
  print $usage;
  exit 0;
}

# After call to GetOptions() only
# positional arguments are in @ARGV.
#
# Options can follow positional arguments.
#
# Long options can be preceded by one or two
# hyphens. Single letters can be used if
# only one long option begins with that
# letter.
#
# Single letter options cannot be bundled
# after a single hyphen.
#
# Single letter options must be separated
# from an argument by a space or =.
$usage = "usage: " .
  $_SERVER["SCRIPT_NAME"] .
  " [-f FILE] [-v] [ARG ...]\n";

$opts = getopt("f:hv",
  array("file:", "help", "verbose"));

if (array_key_exists("h", $opts) ||
    array_key_exists("help", $opts)) {
  echo $usage;
  exit(0);
}

$file = $opts["f"] ? $opts["f"] :
  $opts["file"];

if (array_key_exists("v", $opts) ||
    array_key_exists("verbose", $opts)) {
  $verbose = TRUE;
}

# Processing stops at first positional
# argument.
#
# Unrecognized options are ignored.
# An option declared to have an argument
# is ignored if the argument is not
# provided on the command line.
#
# getopt() does not modify $argv or
# provide means to identify positional
# arguments.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('positional_args',
  nargs='*',
  metavar='ARG')
parser.add_argument('--file', '-f',
  dest='file')
parser.add_argument('--verbose', '-v',
  dest='verbose',
  action='store_true')

args = parser.parse_args()

the_file = args.file
verbose = args.verbose

# The flags -h and --help and the
# usage message are generated
# automatically.
#
# Positional arguments are in
# args.positional_args
#
# Options can follow positional arguments.
require 'optparse'

options = {}
OptionParser.new do |opts|
  opts.banner =
    "usage: #{$0} [OPTIONS] [ARG ...]"

  opts.on("-f", "--file FILE") do |arg|
    options[:file] = arg
  end

  opts.on("-v", "--verbose") do |arg|
    options[:verbose] = arg
  end
end.parse!

file = options[:file]
verbose = options[:verbose]

# The flags -h and --help and the
# usage message are generated
# automatically.
#
# After calling OptionParser.parse! only
# positional arguments are in ARGV.
#
# Options can follow positional args.
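The notes in the Python column can be tried without a shell by passing an explicit argument list to parse_args. This sketch reuses the parser from the row above; the sample arguments 'a' and 'out.txt' are made up for the example.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('positional_args',
                    nargs='*',
                    metavar='ARG')
parser.add_argument('--file', '-f',
                    dest='file')
parser.add_argument('--verbose', '-v',
                    dest='verbose',
                    action='store_true')

# options placed after a positional argument are still recognized
args = parser.parse_args(['a', '-v', '--file', 'out.txt'])

print(args.positional_args)
print(args.file)
print(args.verbose)
```

Calling parse_args() with no argument reads sys.argv[1:] instead, as in the original cell.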
get and set environment variable
 
$ENV{"HOME"}

$ENV{"PATH"} = "/bin";
getenv("HOME")

putenv("PATH=/bin");
os.getenv('HOME')

os.environ['PATH'] = '/bin'
ENV["HOME"]

ENV["PATH"] = "/bin"
get pid, parent pid $$
getppid
posix_getpid()
posix_getppid()
os.getpid()
os.getppid()
Process.pid
Process.ppid
get user id and name $<
getpwuid($<)
$uid = posix_getuid();
$uinfo = posix_getpwuid($uid);
$username = $uinfo["name"];
import getpass

os.getuid()
getpass.getuser()
require 'etc'

Process.uid
Etc.getpwuid(Process.uid)["name"]
exit
 
exit 0; exit(0); sys.exit(0) exit(0)
set signal handler
 
$SIG{INT} = sub {
  die "exiting...\n";
};
import signal

def handler(signo, frame):
  print('exiting...')
  sys.exit(1)

signal.signal(signal.SIGINT, handler)
Signal.trap("INT",
  lambda do |signo|
    puts "exiting..."
    exit 1
  end
)
executable test
 
-x "/bin/ls" is_executable("/bin/ls") os.access('/bin/ls', os.X_OK) File.executable?("/bin/ls")
external command
 
system("ls -l /tmp") == 0 or
  die "ls failed";
system("ls -l /tmp", $retval);
if ($retval) {
  throw new Exception("ls failed");
}
if os.system('ls -l /tmp'):
  raise Exception('ls failed')
unless system("ls -l /tmp")
  raise "ls failed"
end
escaped external command
 
$path = <>;
chomp($path);
system("ls", "-l", $path) == 0 or
  die "ls failed";
$path = chop(fgets(STDIN));
$safe = escapeshellarg($path);
system("ls -l " . $safe, $retval);
if ($retval) {
  throw new Exception("ls failed");
}
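import subprocess
import sys

# arguments passed as a list are not interpreted by a shell
path = sys.stdin.readline().rstrip('\r\n')
cmd = ['ls', '-l', path]
if subprocess.call(cmd):
  raise Exception('ls failed')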
path = gets
path.chomp!
unless system("ls", "-l", path)
  raise "ls failed"
end
backticks
 
my $files = `ls -l /tmp`; or
my $files = qx(ls);
$files = `ls -l /tmp`; import subprocess

cmd = ['ls', '-l', '/tmp']
files = subprocess.check_output(cmd)
files = `ls -l /tmp`
unless $?.success?
  raise "ls failed"
end

files = %x(ls)
unless $?.success?
  raise "ls failed"
end
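The Ruby cells above check $?.success? after running a command. In Python, subprocess.check_output reports a non-zero exit status by raising CalledProcessError. A minimal Python 3 sketch, assuming the current interpreter (sys.executable) is used as the child command so the example does not depend on ls being installed:

```python
import subprocess
import sys

# capture standard output of a child process; bytes in Python 3
out = subprocess.check_output(
    [sys.executable, '-c', 'print("lorem ipsum")'])
text = out.decode().strip()

# a non-zero exit status raises instead of returning
code = None
try:
    subprocess.check_output(
        [sys.executable, '-c', 'import sys; sys.exit(3)'])
except subprocess.CalledProcessError as err:
    code = err.returncode

print(text)
print(code)
```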
libraries and namespaces
perl php python ruby
load library
 
require 'Foo.pm';

# searches @INC for Foo.pm:
require Foo;
require_once("foo.php"); # searches sys.path for foo.pyc or foo.py:
import foo
require 'foo.rb'

# searches $LOAD_PATH for foo.rb, foo.so,
# foo.o, foo.dll:

require 'foo'
load library in subdirectory require 'Foo/Bar.pm';

require Foo::Bar;
require_once('foo/bar.php'); # foo must contain __init__.py file
import foo.bar
require 'foo/bar.rb'

require 'foo/bar'
hot patch
 
do 'Foo.pm'; require("foo.php"); reload(foo) load 'foo.rb'
load error fatal error if library not found or if last expression in library does not evaluate as true; fatal error parsing library propagates to client require and require_once raise fatal error if library not found; include and include_once emit warnings raises ImportError if library not found; exceptions generated when parsing library propagate to client raises LoadError if library not found; exceptions generated when parsing library propagate to client
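For the Python entry in the load error row, a missing library surfaces as an ImportError that the client can catch. A minimal sketch; try_import is a hypothetical helper and the missing module name is invented for the example.

```python
import importlib


def try_import(name):
    """Return the imported module, or None if it cannot be found."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


found = try_import('json')
missing = try_import('no_such_module_xyz_123')

print(found is not None)
print(missing is None)
```

In Python 3.6 and later a missing module raises ModuleNotFoundError, which is a subclass of ImportError, so the same handler applies.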
main routine in library unless (caller) {
  code
}
none if __name__ == '__main__':
  code
if $0 == __FILE__
  code
end
library path @INC

push @INC, "/some/path";
$libpath = ini_get("include_path");

ini_set("include_path",
  $libpath . ":/some/path");
sys.path

sys.path.append('/some/path')
# $: is synonym for $LOAD_PATH:
$LOAD_PATH

$LOAD_PATH
<< "/some/path"
library path environment variable $ PERL5LIB=~/lib perl foo.pl none $ PYTHONPATH=~/lib python foo.py $ RUBYLIB=~/lib ruby foo.rb
library path command line option $ perl -I ~/lib foo.pl none none $ ruby -I ~/lib foo.rb
simple global identifiers none variables defined outside of functions or with global keyword built-in functions variables which start with $
multiple label identifiers all identifiers not declared with my classes, interfaces, functions, and constants modules constants, classes, and modules
label separator
 
Foo::Bar::baz(); \Foo\Bar\baz(); foo.bar.baz() Foo::Bar.baz
root namespace definition # outside of package or in package main:
our $foo = 3;

# inside package:
$::foo = 3;
$main::foo = 3;
\foo none # outside of class or module; only
# constants in root namespace:

FOO = 3

# inside class or module:
::FOO = 3
namespace declaration
 
package Foo;
require Exporter;
our @ISA = ("Exporter");
our @EXPORT_OK = qw(bar baz);
namespace Foo; put declarations in foo.py class Foo
  # class definition
end

module Foo
  # module definition
end
child namespace declaration package Foo::Bar; namespace Foo\Bar; foo must be in sys.path:
$ mkdir foo
$ touch foo/__init__.py
$ touch foo/bar.py
module Foo::Bar
  # module definitions
end

module Foo
  module Bar
    # module definitions
  end
end

# classes can nest inside classes or
# modules; modules can nest in classes
namespace alias use Foo as Fu; import foo as fu Fu = Foo.dup

include Fu
unqualified import of namespace
 
# imports symbols in @EXPORT:
use Foo;
none, but a long module name can be shortened from foo import * # inside class or module:
include Foo
unqualified import of all subnamespaces # subnamespaces in list __all__ of
# foo/__init__.py are imported

from foo import *
unqualified import of definitions
 
# bar and baz must be in
# @EXPORT or @EXPORT_OK:

use Foo qw(bar baz);
only class names can be imported from foo import bar, baz none
list installed packages, install a package
 
$ perldoc perllocal
$ cpan -i Moose
$ pear list
$ pear install Math_BigInteger
$ pip freeze
$ pip install jinja2
$ gem list
$ gem install rails
package specification format in setup.py:

#!/usr/bin/env python

from distutils.core import setup

setup(
  name='foo',
  author='Joe Foo',
  version='1.0',
  description='a package',
  py_modules=['foo'])
in foo.gemspec:

spec = Gem::Specification.new do |s|
  s.name = "foo"
  s.authors = "Joe Foo"
  s.version = "1.0"
  s.summary = "a gem"
  s.files = Dir["lib/*.rb"]
end
objects
perl php python ruby
define class
 
package Int;
use Moose;

has value => (is => 'rw',
  default => 0,
  isa => 'Int');
no Moose;
1;
class Int
{
  public $value;
  function __construct($int=0)
  {
    $this->value = $int;
  }
}
class Int:
  def __init__(self, v=0):
    self.value = v
class Int
  attr_accessor :value
  def initialize(i=0)
    @value = i
  end
end
create object
 
my $i = new Int(); # or
my $i = Int->new();
my $i2 = new Int(value => 7);
$i = new Int();
$i2 = new Int(7);
i = Int()
i2 = Int(7)
i = Int.new
i2 = Int.new(7)
get and set attribute
 
my $v = $i->value;
$i->value($v+1);
$v = $i->value;
$i->value = $v+1;
v = i.value
i.value = v+1
v = i.value
i.value = v+1
instance variable accessibility private by default must be declared public; attributes starting with underscore private by convention private by default; use attr_reader, attr_writer, attr_accessor to make public
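The Python convention in the row above (a leading underscore marks an attribute as private, but nothing enforces it) might look like this. The property-based read accessor is one common choice, not the only one.

```python
class Int:
    def __init__(self, v=0):
        # leading underscore: private by convention only
        self._value = v

    @property
    def value(self):
        """Read-only public view of the underlying attribute."""
        return self._value


i = Int(7)
public_read = i.value

# the interpreter does not block access to the underscore name
direct_read = i._value

# assigning through the read-only property fails
try:
    i.value = 8
    assignable = True
except AttributeError:
    assignable = False
```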
define method
 
# in package:
sub plus {
  my $self = shift;
  $self->value + $_[0];
}
function plus($i)
{
  return $this->value + $i;
}
def plus(self,v):
  return self.value + v
def plus(i)
  value + i
end
invoke method
 
$i->plus(7) $i->plus(7) i.plus(7) i.plus(7)
destructor
 
# in package:
sub DEMOLISH {
  my $self = shift;
  my $v = $self->value;
  print "bye, $v\n";
}
function __destruct()
{
  echo "bye, $this->value\n";
}
def __del__(self):
  print('bye, %d' % self.value)
val = i.value
ObjectSpace.define_finalizer(i) {
  puts "bye, #{val}"
}
method missing
 
# in package:
our $AUTOLOAD;
sub AUTOLOAD {
  my $self = shift;
  my $argc = scalar(@_);
  print "no def: $AUTOLOAD"
    . " arity: $argc\n";
}
function __call($name, $args)
{
  $argc = count($args);
  echo "no def: $name " .
    "arity: $argc\n";
}
def __getattr__(self, name):
  s = 'no def: '+name+' arity: %d'
  return lambda *a: print(s % len(a))
def method_missing(name, *a)
  puts "no def: #{name}" +
    " arity: #{a.size}"
end
inheritance
 
package Counter;
use Moose;

extends 'Int';

my $instances = 0;
sub BUILD {
  $instances += 1;
}
sub incr {
  my $self = shift;
  my $v = $self->value;
  $self->value($v + 1);
}
sub instances {
  $instances;
}
no Moose;
class Counter extends Int
{
  private static $instances = 0;
  function __construct($int=0)
  {
    Counter::$instances += 1;
    parent::__construct($int);
  }
  function incr()
  {
    $this->value++;
  }
  static function getInstances()
  {
    return self::$instances;
  }
}
class Counter(Int):

  instances = 0

  def __init__(self, v=0):
    Counter.instances += 1
    Int.__init__(self, v)

  def incr(self):
    self.value += 1
class Counter < Int

  @@instances = 0

  def initialize
    @@instances += 1
    super
  end

  def incr
    self.value += 1
  end

  def self.instances
    @@instances
  end
end
define class method @classmethod
def get_instances(cls):
  return Counter.instances
invoke class method
 
Counter->instances(); Counter::getInstances() Counter.get_instances() Counter.instances
operator overloading class Fixnum
  def /(n)
    self.fdiv(n)
  end
end
method alias class Point

  attr_reader :x, :y, :color

  alias_method :colour, :color

  def initialize(x, y, color=:black)
    @x, @y = x, y
    @color = color
  end
end
reflection
perl php python ruby
object id
 
id(o) o.object_id
inspect type
 
ref([]) eq "ARRAY"

returns empty string if argument not a reference; returns package name for objects
gettype(array()) == "array"

returns object for objects
type([]) == list [].class == Array
basic types SCALAR
ARRAY
HASH
CODE
REF
GLOB
LVALUE
FORMAT
IO
VSTRING
Regexp
NULL
boolean
integer
double
string
array
object
resource
unknown type
NoneType
bool
int
long
float
str
SRE_Pattern
datetime
list
array
dict
object
file
NilClass
TrueClass
FalseClass
Fixnum
Bignum
Float
String
Regexp
Time
Array
Hash
Object
File
inspect class ref($o) eq "Foo" returns FALSE if not an object:
get_class($o) == "Foo"
o.__class__ == Foo
isinstance(o, Foo)
o.class == Foo
o.instance_of?(Foo)
inspect class hierarchy get_parent_class($o) o.__class__.__bases__ o.class.superclass
o.class.included_modules
has method?
 
$o->can("reverse") method_exists($o, "reverse") hasattr(o, 'reverse') o.respond_to?("reverse")
message passing
 
for $i (0..10) {
  $meth = "phone$i";
  $o->$meth(undef);
}
for ($i = 1; $i <= 10; $i++) {
  call_user_func(array($o,
    "phone$i"), NULL);
}
for i in range(1,10):
  getattr(o, 'phone'+str(i))(None)
(1..9).each do |i|
  o.send("phone#{i}=", nil)
end
eval
 
while(<>) {
  print ((eval), "\n");
}
eval evaluates to argument of return statement or NULL:
while ($line = fgets(STDIN)) {
  echo eval($line) . "\n";
}
argument of eval must be an expression:
while True:
  print(eval(sys.stdin.readline()))
loop do
  puts eval(gets)
end
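The Python note in the eval row says the argument must be an expression. A short Python 3 sketch of what that means in practice; using exec with an explicit namespace dict for statements is an illustrative choice.

```python
# an expression evaluates to a value
value = eval('1 + 2 * 3')

# an assignment is a statement, so eval rejects it
try:
    eval('x = 5')
    statement_ok = True
except SyntaxError:
    statement_ok = False

# exec runs statements; results land in the supplied namespace
namespace = {}
exec('x = 5', namespace)

print(value)
print(statement_ok)
print(namespace['x'])
```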
list object methods
 
get_class_methods($o) [m for m in dir(o)
  if callable(getattr(o,m))]
o.methods
list object attributes
 
keys %$o; get_object_vars($o) dir(o) o.instance_variables
list loaded libraries # relative to directory in lib path:
keys %INC

# absolute path:
values %INC
# relative to directory in lib path:
$LOADED_FEATURES
$"
list loaded namespaces grep { $_ =~ /::/ } keys %:: dir() Class.constants.select do |c|
  Module.const_get(c).class == Class
end
inspect namespace keys %URI:: import urlparse

dir(urlparse)
require 'uri'

URI.constants
URI.methods
URI.class_variables
pretty print
 
use Data::Dumper;

%d = (lorem=>1, ipsum=>[2, 3]);

print Dumper(\%d);
$d = array("lorem"=>1,
  "ipsum"=>array(2,3));

print_r($d);
import pprint

d = {'lorem':1, 'ipsum':[2,3]}

pprint.PrettyPrinter().pprint(d)
require 'pp'

d = {"lorem"=>1, "ipsum"=>[2,3]}

pp d
source line number and file name __LINE__
__FILE__
__LINE__
__FILE__
import inspect

cf = inspect.currentframe()
cf.f_lineno
cf.f_code.co_filename
__LINE__
__FILE__
command line documentation $ perldoc Math::Trig none $ pydoc math
$ pydoc math.atan2
$ ri Math
$ ri Math.atan2
net and web
perl php python ruby
get local hostname, dns lookup, reverse dns lookup use Sys::Hostname;
use IO::Socket;

$host = hostname;
$ip = inet_ntoa(
  (gethostbyname(hostname))[4]);
$host2 = (gethostbyaddr(
    inet_aton("10.45.234.23"),
    AF_INET))[0];
$host = gethostname();
$ip = gethostbyname($host);
$host2 = gethostbyaddr($ip);
import socket

host = socket.gethostname()
ip = socket.gethostbyname(host)
host2 = socket.gethostbyaddr(ip)[0]
require 'socket'

hostname = Socket.gethostname

ip = Socket.getaddrinfo(
  Socket.gethostname,
  "echo")[0][3]

host2 = Socket.gethostbyaddr(ip)[0]
http get
 
use LWP::UserAgent;

$url = "http://www.google.com";
$r = HTTP::Request->new(GET=>$url);
$ua = LWP::UserAgent->new;
$resp = $ua->request($r);
my $s = $resp->content();
$url = 'http://www.google.com';
$s = file_get_contents($url);
import httplib

url = 'www.google.com'
f = httplib.HTTPConnection(url)
f.request("GET", '/')
s = f.getresponse().read()
require 'net/http'

url = "www.google.com"
r = Net::HTTP.start(url, 80) do |f|
  f.get("/")
end
s = r.body
serve working directory none PHP 5.4:
$ php -S localhost:8000
$ python -m SimpleHTTPServer 8000 $ ruby -rwebrick -e \
'WEBrick::HTTPServer.new(:Port => 8000, '\
':DocumentRoot => Dir.pwd).start'
absolute url
from base and relative url
use URI;

URI->new_abs("analytics",
  "http://google.com");
none import urlparse

urlparse.urljoin('http://google.com',
  'analytics')
require 'uri'

URI.join("http://google.com", "analytics")
parse url use URI;

$url = "http://google.com:80/foo?q=3#bar";
$up = URI->new($url);

$protocol = $up->scheme;
$hostname = $up->host;
$port = $up->port;
$path = $up->path;
$query_str = $up->query;
$fragment = $up->fragment;

# flat list of alternating keys and values:
@params = $up->query_form();
$url = "http://google.com:80/foo?q=3#bar";
$up = parse_url($url);

$protocol = $up["scheme"];
$hostname = $up["host"];
$port = $up["port"];
$path = $up["path"];
$query_str = $up["query"];
$fragment = $up["fragment"];

# $params is associative array; if keys
# are reused, later values overwrite
# earlier values

parse_str($query_str, $params);
# Python 3 location: urllib.parse
import urlparse

url = 'http://google.com:80/foo?q=3#bar'
up = urlparse.urlparse(url)

protocol = up.scheme
hostname = up.hostname
port = up.port
path = up.path
query_str = up.query
fragment = up.fragment

# returns dict of lists:
params = urlparse.parse_qs(query_str)
require 'uri'

url = "http://google.com:80/foo?q=3#bar"
up = URI(url)

protocol = up.scheme
hostname = up.host
port = up.port
path = up.path
query_str = up.query
fragment = up.fragment

# Ruby 1.9; returns array of pairs:
params = URI.decode_www_form(query_str)
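The Python cell above notes that Python 3 moved these functions to urllib.parse and that parse_qs returns a dict of lists. A Python 3 sketch under that assumption; the repeated q parameter is added here to show why the values are lists.

```python
from urllib.parse import urlparse, parse_qs

url = 'http://google.com:80/foo?q=3&q=4#bar'
up = urlparse(url)

protocol = up.scheme
hostname = up.hostname
port = up.port          # an int, or None if the URL has no port
path = up.path
fragment = up.fragment

# each key maps to a list so that repeated keys are preserved
params = parse_qs(up.query)

print(params)
```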
url encode/decode
 
use CGI;

CGI::escape("lorem ipsum?")
CGI::unescape("lorem%20ipsum%3F")
urlencode("lorem ipsum?")
urldecode("lorem+ipsum%3F")
# Python 3 location: urllib.parse
import urllib

urllib.quote_plus("lorem ipsum?")
urllib.unquote_plus("lorem+ipsum%3F")
require 'cgi'

CGI::escape("lorem ipsum?")
CGI::unescape("lorem+ipsum%3F")
base64 encode/decode use MIME::Base64;

open my $f, "<", "foo.png";
my $s = do { local $/; <$f> };
my $b64 = encode_base64($s);
my $s2 = decode_base64($b64);
$s = file_get_contents("foo.png");
$b64 = base64_encode($s);
$s2 = base64_decode($b64);
import base64

s = open('foo.png', 'rb').read()
b64 = base64.b64encode(s)
s2 = base64.b64decode(b64)
require 'base64'

s = File.open("foo.png").read
b64 = Base64.encode64(s)
s2 = Base64.decode64(b64)
json generate/parse # cpan -i JSON
use JSON;

$raw = {t => 1, f => 0};
$json = JSON->new->allow_nonref;
$s = $json->encode($raw);
$d = $json->decode($s);
$a = array("t" => 1, "f" => 0);
$s = json_encode($a);
$d = json_decode($s, TRUE);
import json

s = json.dumps({'t': 1, 'f': 0})
d = json.loads(s)
# Ruby 1.8: sudo gem install json
require 'json'

s = {'t' => 1,'f' => 0}.to_json
d = JSON.parse(s)
generate xml # cpan -i XML::Writer
use XML::Writer;

my $writer = XML::Writer->new;
$writer->startTag("a");
$writer->startTag("b");
$writer->characters("foo");
$writer->endTag("b");
$writer->endTag("a");
$writer->end;
$xml = "<a></a>";
$sxe = new SimpleXMLElement($xml);
$sxe->addChild("b", "foo");
echo $sxe->asXML();
import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()
builder.start('a', {})
builder.start('b', {})
builder.data('foo')
builder.end('b')
builder.end('a')

et = builder.close()
print(ET.tostring(et))
# gem install builder
require 'builder'

builder = Builder::XmlMarkup.new
xml = builder.a do |child|
  child.b("foo")
end
puts xml
parse xml
all nodes matching xpath query; first node matching xpath query
# cpan -i XML::XPath
use XML::XPath;

my $xml = "<a><b><c ref='3'>foo</c></b></a>";

# fatal error if XML not well-formed
my $doc = XML::XPath->new(xml => $xml);

my $nodes = $doc->find("/a/b/c");
print $nodes->size . "\n";

# node positions are 1-based:
$node = $nodes->get_node(1);
print $node->string_value . "\n";
print $node->getAttribute("ref") . "\n";
$xml = "<a><b><c ref='3'>foo</c></b></a>";

# returns NULL and emits warning if not
# well-formed:

$doc = simplexml_load_string($xml);

$nodes = $doc->xpath("/a/b/c");
echo count($nodes);
echo $nodes[0];

$node = $nodes[0];
echo $node;
echo $node["ref"];
from xml.etree import ElementTree

xml = '<a><b><c ref="3">foo</c></b></a>'

# raises xml.etree.ElementTree.ParseError
# if not well-formed:

doc = ElementTree.fromstring(xml)

nodes = doc.findall('b/c')
print(len(nodes))
print(nodes[0].text)

node = doc.find('b/c')
print(node.text)
print(node.attrib['ref'])
require 'rexml/document'
include REXML

xml = "<a><b><c ref='3'>foo</c></b></a>"

# raises REXML::ParseException if
# not well-formed:

doc = Document.new(xml)

nodes = XPath.match(doc,"/a/b/c")
puts nodes.size
puts nodes[0].text

node = XPath.first(doc,"/a/b/c")
puts node.text
puts node.attributes["ref"]
parse html # cpan -i Mojo::DOM
use Mojo::DOM;
$html = file_get_contents("foo.html");
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$nodes = $xpath->query("//a/@href");
foreach($nodes as $href) {
  echo $href->nodeValue;
}
# pip install beautifulsoup4
import bs4

html = open('foo.html').read()
doc = bs4.BeautifulSoup(html, 'html.parser')

for link in doc.find_all('a'):
  print(link.get('href'))
# gem install nokogiri
require 'nokogiri'

html = File.open("foo.html").read
doc = Nokogiri::HTML(html)
doc.xpath("//a").each do |link|
  puts link["href"]
end
unit tests
perl php python ruby
test class # cpan -i Test::Class Test::More
package TestFoo;
use Test::Class;
use Test::More;
use base qw(Test::Class);

sub test_01 : Test {
  ok(1, "not true!");
}

1;
import unittest

class TestFoo(unittest.TestCase):
  def test_01(self):
    self.assertTrue(True, 'not True!')

if __name__ == '__main__':
  unittest.main()
require 'test/unit'

class TestFoo < Test::Unit::TestCase
  def test_01
    assert(true, "not true!")
  end
end
run tests, run test method $ cat TestFoo.t
use TestFoo;
Test::Class->runtests;
$ perl ./TestFoo.t
$ python test_foo.py
$ python test_foo.py TestFoo.test_01
$ ruby test_foo.rb
$ ruby test_foo.rb -n test_01
equality assertion my $s = "do re me";
is($s, "do re me");
s = 'do re me'
self.assertEqual('do re me',
  s,
  's: {}'.format(s))
s = "do re me"
assert_equal("do re me", s)
approximate assertion x = 10.0 * (1.0 / 3.0)
y = 10.0 / 3.0

# by default the difference is rounded to
# 7 decimal places; delta replaces that check
self.assertAlmostEqual(x, y, delta=0.1**6)
x = 10.0 * (1.0 / 3.0)
y = 10.0 / 3.0

# default for delta is 0.001
assert_in_delta(x, y, 0.1**6)
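A sketch of the two comparison modes of assertAlmostEqual in Python's unittest: with no extra arguments the difference is rounded to 7 decimal places, while delta sets an explicit tolerance. The class name TestApprox and the sample values are assumptions for the example; output is sent to an in-memory stream to keep the run quiet.

```python
import io
import unittest


class TestApprox(unittest.TestCase):
    def test_default_places(self):
        x = 10.0 * (1.0 / 3.0)
        y = 10.0 / 3.0
        # passes when round(x - y, 7) == 0
        self.assertAlmostEqual(x, y)

    def test_explicit_delta(self):
        # passes when abs(1.05 - 1.0) <= 0.1
        self.assertAlmostEqual(1.05, 1.0, delta=0.1)

    def test_outside_default(self):
        # a difference of 0.001 does not round to zero at 7 places
        self.assertNotAlmostEqual(1.001, 1.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestApprox)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```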
regex assertion my $s = "lorem ipsum";
like($s, qr/lorem/);
s = 'lorem ipsum'
# uses re.search, not re.match:
self.assertRegexpMatches(s, 'lorem')
s = "lorem ipsum"
assert_match(/lorem/, s)
exception assertion use Test::Fatal;

ok(exception { 1 / 0 });
a = []
with self.assertRaises(IndexError):
  a[0]
assert_raises(ZeroDivisionError) do
  1 / 0
end
mock method # pip install mock
import mock

foo = Foo()
foo.run = mock.MagicMock(return_value=7)

self.assertEqual(7, foo.run(13))
foo.run.assert_called_once_with(13)
# gem install mocha
require 'mocha'

foo = mock()
foo.expects(:run).returns(7).with(13).once

foo.run(13)
setup # in class TestFoo:
sub make_fixture : Test(setup) {
  print "setting up";
};
# in class TestFoo:
def setUp(self):
  print('setting up')
# in class TestFoo:
def setup
  puts "setting up"
end
teardown # in class TestFoo:
sub teardown : Test(teardown) {
  print "tearing down";
};
# in class TestFoo:
def tearDown(self):
  print("tearing down")
# in class TestFoo:
def teardown
  puts "tearing down"
end
debugging and profiling
perl php python ruby
check syntax
 
$ perl -c foo.pl $ php -l foo.php import py_compile

# precompile to bytecode:
py_compile.compile('foo.py')
$ ruby -c foo.rb
flags for stronger and strongest warnings $ perl -w foo.pl
$ perl -W foo.pl
none $ python -t foo.py
$ python -3t foo.py
$ ruby -w foo.rb
$ ruby -W2 foo.rb
lint $ perl -MO=Lint foo.pl $ sudo pip install pylint
$ pylint foo.py
source cleanup $ sudo pip install pep8
$ pep8 foo.py
run debugger $ perl -d foo.pl $ python -m pdb foo.py $ sudo gem install ruby-debug
$ rdebug foo.rb
debugger commands h l n s b c T ?? ?? p q h l n s b c w u d p q h l n s b c w u down p q
benchmark code use Benchmark qw(:all);

$t = timeit(1_000_000, '$i += 1;');
print timestr($t);
import timeit

timeit.timeit('i += 1',
  'i = 0',
  number=1000000)
require 'benchmark'

n = 1_000_000
i = 0
puts Benchmark.measure {
  n.times { i += 1 }
}
profile code $ perl -d:DProf foo.pl
$ dprofpp
$ python -m cProfile foo.py $ sudo gem install ruby-prof
$ ruby-prof foo.rb
java interoperation
perl php python ruby
version
 
Jython 2.5 JRuby 1.4
repl
 
$ jython $ jirb
interpreter
 
$ jython $ jruby
compiler
 
none in 2.5.1 $ jrubyc
prologue
 
import java none
new
 
rnd = java.util.Random() rnd = java.util.Random.new
method
 
rnd.nextFloat() rnd.next_float
import
 
from java.util import Random

rnd = Random()
java_import java.util.Random
rnd = Random.new
non-bundled java libraries
 
import sys

sys.path.append('path/to/mycode.jar')
import MyClass
require 'path/to/mycode.jar'
shadowing avoidance
 
import java.io as javaio module JavaIO
  include_package "java.io"
end
convert native array to java array
 
import jarray

jarray.array([1,2,3],'i')
[1,2,3].to_java(Java::int)
are java classes subclassable?
 
yes yes
are java classes open?
 
no yes
__________________________________________ __________________________________________ __________________________________________ __________________________________________

File Handles

standard file handles

The names for standard input, standard output, and standard error.

read line from stdin

How to read a line from standard input.

The illustrated functions read the standard input stream until an end-of-line marker is found or the end of the stream is encountered. Only in the former case will the returned string be terminated by an end-of-line marker.

php:

fgets takes an optional second parameter to specify the maximum line length. If the length limit is encountered before a newline, the string returned will not be newline terminated.

ruby:

gets takes an optional parameter to specify the maximum line length. If the length limit is encountered before a newline, the string returned will not be newline terminated.

end-of-file behavior

What happens when attempting to read a line and the seek point is after the last newline or at the end of the file.

chomp

Remove a newline, carriage return, or carriage return newline pair from the end of a line if there is one.

php:

chop removes all trailing whitespace. It is an alias for rtrim.

perl:

chomp modifies its argument, which thus must be a scalar, not a string literal.

python:

Python strings are immutable. rstrip returns a modified copy of the string. rstrip('\r\n') is not identical to chomp because it removes all contiguous carriage returns and newlines at the end of the string.
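
The difference can be seen in a small sketch. The helper name chomp below is hypothetical, not part of the Python standard library; it removes at most one trailing line terminator, whereas rstrip('\r\n') removes all of them:

```python
def chomp(line):
    """Return line without its final line terminator, if it has one."""
    # Check '\r\n' first so that the pair is removed as a unit.
    for terminator in ('\r\n', '\n', '\r'):
        if line.endswith(terminator):
            return line[:-len(terminator)]
    return line

stripped = 'lorem\n\n'.rstrip('\r\n')   # removes both newlines
chomped = chomp('lorem\n\n')            # removes only the last newline
```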

ruby:

chomp! modifies the string in place. chomp returns a modified copy.

write line to stdout

How to write a line to standard out. The line will be terminated by an operating system appropriate end of line marker.

python:

print appends a newline to the output. To suppress this behavior in Python 3, pass the keyword argument end=''; in Python 2, put a trailing comma after the last argument. If given multiple arguments, print joins them with spaces.

In Python 2 print parses as a keyword and parentheses are not required:

print "Hello, World!"

ruby:

puts appends a newline to the output. print does not.

write formatted string to stdout

How to format variables and write them to standard out.

The function printf from the C standard library is a familiar example. It has a notation for format strings which uses percent signs %. Many other languages provide an implementation of printf.

open file for reading

How to open a file for reading.

perl:

Before Perl 5.6 one would create and store file handles in bare words:

open FIN, "/etc/hosts";
open FOUT, ">/tmp/test";

my $line = <FIN>;
print FOUT $line;

close FIN;
close FOUT;

To pass a bare word file handle to a function one would use a combination of the typeglob operator and the reference operator to store the file handle in a scalar:

sub log_msg {
  my $fh = shift;
  my $msg = shift;

  print $fh $msg;
}

open(LOG, ">>/tmp/test.log");

log_msg(\*LOG, "log opened for append\n");

In Perl 5.6 and later, the open function, when given an undefined scalar (or a scalar declaration) as a first argument will create and store the file handle in that scalar.

Using scalars to manipulate file handles is preferred.

  • Scalars can be passed to functions.
  • Scalars can have local scope whereas bare words are always global.
  • A file handle stored in a scalar is implicitly closed when the scalar goes out of scope.
  • Under 'use strict', misspelled scalar names are caught by the interpreter, but misspelled bare words are not.

ruby:

When File.open is given a block, the file is closed when the block terminates.

open file for writing

How to open a file for writing. If the file exists its contents will be overwritten.

open file for appending

How to open a file with the seek point at the end of the file. If the file exists its contents will be preserved.

close file

How to close a file.

close file implicitly

How to have a file closed when a block is exited.

python:

File handles are closed when the variable holding them is garbage collected, but there is no guarantee when or if a variable will be garbage collected.

ruby:

File handles are closed when the variable holding them is garbage collected, but there is no guarantee when or if a variable will be garbage collected.

i/o errors

How I/O errors are treated.

read line

How to read up to the next newline in a file.

iterate over file by line

How to iterate over a file line by line.

read file into array of string

How to put the lines of a file into an array of strings.

read file into string

How to put the contents of a file into a single string.

write string

How to write a string to a file handle.

write line

How to write a line to a file handle. An operating system appropriate end-of-line marker is appended to the output.

php:

Newlines in strings are translated to the operating system appropriate line terminator unless the file handle was opened with a mode string that contained 'b'.

perl:

Newlines in strings are translated to the operating system appropriate line terminator unless the file handle has been set to binary mode with the binmode function.

python:

When file handles are opened with the mode strings 'r', 'w', or 'a', the file handle is in text mode. In text mode the operating system line terminator is translated to '\n' when reading and '\n' is translated back to the operating system line terminator when writing. The standard file handles sys.stdin, sys.stdout, and sys.stderr are opened in text mode.

When file handles are opened with the mode strings 'rb', 'wb', or 'ab', the file handle is in binary mode and line terminator translation is not performed. The operating system line terminator is available in os.linesep.
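
A minimal sketch of the translation, assuming Python 3 and a scratch file obtained from tempfile.mkstemp: a line written in text mode is read back in binary mode so that the bytes stored on disk can be compared with os.linesep.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

# Text mode: '\n' is translated to os.linesep on the way out.
with open(path, 'w') as f:
    f.write('lorem ipsum\n')

# Binary mode: the bytes come back exactly as stored on disk.
with open(path, 'rb') as f:
    raw = f.read()

os.remove(path)
expected = b'lorem ipsum' + os.linesep.encode('ascii')
```

On Unix raw ends in a single newline byte; on Windows it should end in a carriage return newline pair.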

flush file handle

How to flush a file handle that has been written to.

end-of-file test

How to test whether the seek point of a file handle is at the end of the file.

get and set file handle position

How to get or set the file handle seek point.

The seek point is where the next read on the file handle will begin. The seek point is measured in bytes starting from zero.

open temporary file

How to get a file handle to a file that will be removed automatically sometime between when the file handle is closed and the interpreter exits.

The file is guaranteed not to have existed before it was opened.

The file handle is opened for both reading and writing so that the information written to the file can be recovered by seeking to the beginning of the file and reading from the file handle.

On POSIX operating systems it is possible to unlink a file after opening it. The file is removed from the directory but continues to exist as long as the file handle is open. This guarantees that no other process will be able to read or modify the file contents.

php:

Here is how to create a temporary file with a name:

$path = tempnam(sys_get_temp_dir(), "");
$f = fopen($path, "w+");

perl:

How to unlink a temporary file on open:

use File::Temp;

$f = File::Temp->new(UNLINK=>1);

python:

To unlink a temporary file on open, use TemporaryFile instead of NamedTemporaryFile:

import tempfile

f = tempfile.TemporaryFile()
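
The write, seek, and read cycle described above might look like the following sketch. The mode 'w+' is an assumption made here so that the file accepts text; the default mode is binary.

```python
import tempfile

# mode='w+' opens the file for reading and writing as text.
with tempfile.TemporaryFile(mode='w+') as f:
    f.write('lorem ipsum\n')
    f.seek(0)               # rewind before reading
    contents = f.read()
# The file is closed and removed once the with block exits.
```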

in memory file

How to create a file descriptor which writes to an in-memory buffer.

python:

StringIO also supports the standard methods for reading input. To use them the client must first seek to the beginning of the in-memory file:

from io import StringIO

f = StringIO()
f.write('lorem ipsum\n')
f.seek(0)
f.read()

Files

file test, regular file test

How to test whether a file exists; how to test whether a file is a regular file (i.e. not a directory, special device, or named pipe).

file size

How to get the file size in bytes.

is file readable, writable, executable

How to test whether a file is readable, writable, or executable.

python:

The flags can be or'ed to test for multiple permissions:

os.access('/etc/hosts', os.R_OK | os.W_OK | os.X_OK)

set file permissions

How to set the permissions on the file.

For Perl, Python, and Ruby, the mode argument is in the same format as the one used with the Unix chmod command. It uses bitmasking to get the various permissions which is why it is normally an octal literal.

The mode argument should not be provided as a string such as "0755". Python and Ruby will raise an exception if a string is provided. Perl will convert "0755" to 755 and not 0755 which is equal to 493 in decimal.
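
When the mode arrives as a string, for example from a configuration file, it can be parsed as octal before use. This is a sketch in Python 3 syntax; the scratch file comes from tempfile.mkstemp and exists only for illustration.

```python
import os
import stat
import tempfile

mode = int('0755', 8)       # parse the string as octal: 493 in decimal

fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, mode)
# On POSIX systems this reads back the permission bits just set.
perms = stat.S_IMODE(os.stat(path).st_mode)
os.remove(path)
```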

copy file, remove file, rename file

How to copy a file; how to remove a file; how to rename a file.

create symlink, symlink test, readlink

How to create a symlink; how to test whether a file is a symlink; how to get the target of a symlink.

generate unused file name

How to generate an unused file name. The file is created to avoid a race condition with another process looking for an unused file name.

The file is not implicitly deleted.
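
In Python this is the behavior of tempfile.mkstemp. The sketch below uses a hypothetical prefix and suffix and shows that the file outlives the descriptor and must be removed by the caller:

```python
import os
import tempfile

# mkstemp creates the file and returns an open descriptor plus the path.
fd, path = tempfile.mkstemp(prefix='foo', suffix='.txt')
os.close(fd)
existed = os.path.exists(path)   # still there after closing

os.remove(path)                  # removal is the caller's job
```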

Directories

working directory

How to get and set the working directory.

build pathname

How to construct a pathname without hard coding the system file separator.

dirname and basename

How to extract the directory portion of a pathname; how to extract the non-directory portion of a pathname.

absolute pathname

How to get the absolute pathname for a pathname. If the pathname is relative, the working directory will be prepended.

In the examples provided, if /foo/bar is the working directory and .. is the relative path, then the return value is /foo.

perl:

File::Spec->rel2abs is similar to Cwd::abs_path except that paths with ".." will not be simplified. If the working directory is /foo/bar then the output of File::Spec->rel2abs("..") is

/foo/bar/..
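
The same distinction can be sketched in Python. posixpath is used here instead of os.path so that the result does not depend on the platform or on the actual working directory; /foo/bar stands in for the working directory. os.path.abspath behaves like the join followed by normpath.

```python
import posixpath

joined = posixpath.join('/foo/bar', '..')   # like File::Spec->rel2abs
simplified = posixpath.normpath(joined)     # ".." is resolved lexically
```

normpath works on the string only; it does not consult the file system, so symlinks are not resolved.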

iterate over directory by file

How to iterate through the files in a directory.

In PHP, Perl, and Ruby, the files representing the directory itself . and the parent directory .. are returned.

php:

The code in the example will stop if a filename which evaluates as FALSE is encountered. One such filename is "0". A safer way to iterate through the directory is:

if ($dir = opendir("/etc")) {
  while (FALSE !== ($file = readdir($dir))) {
    echo "$file\n";
  }
  closedir($dir);
}

python:

In Python 2, file() is the file handle constructor. file can be used as a local variable name but doing so hides the constructor. It can still be invoked by the synonym open(), however. file() was removed in Python 3.

os.listdir() does not return the special files . and .. which represent the directory itself and the parent directory.

glob paths

How to iterate over files using a glob pattern.

Glob patterns employ these special characters:

* matches zero or more characters, the first of which is not . and none of which is /
? matches one character
[ ] matches one character from the list inside the brackets
\ escapes one of the previous characters

Use glob patterns instead of simple directory iteration when

  • dot files, including the directory itself (.) and the parent directory (..), should be skipped
  • a subset of the files in a directory, where the subset can be specified with a glob pattern, is desired
  • files from multiple directories, where the directories can be specified with a glob pattern, are desired
  • the full pathnames of the files are desired

php:

glob takes a second argument for flags. The flag GLOB_BRACE enables brace notation.

python:

glob.glob returns a list. glob.iglob accepts the same arguments and returns an iterator.
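
A short sketch comparing directory iteration with globbing. The scratch directory and the file names in it are made up for the example.

```python
import glob
import os
import shutil
import tempfile

d = tempfile.mkdtemp()
for name in ('a.txt', 'b.txt', '.hidden'):
    open(os.path.join(d, name), 'w').close()

listed = sorted(os.listdir(d))                        # bare names, includes .hidden
globbed = sorted(glob.glob(os.path.join(d, '*')))     # full paths, skips .hidden
lazy = sorted(glob.iglob(os.path.join(d, '*.txt')))   # iterator, consumed by sorted

shutil.rmtree(d)
```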

ruby:

Ruby globs support brace notation.

A brace expression matches any of the comma separated strings inside the braces.

Dir.glob("/{bin,etc,usr}/*").each do |path|
  puts path
end

make directory

How to create a directory.

If needed, the examples will create more than one directory.

No error will result if a directory at the pathname already exists. An exception will be raised if the pathname is occupied by a regular file, however.
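
In Python 3.2 and later one way to get this behavior is the exist_ok parameter of os.makedirs. The sketch below works in a scratch directory; the names foo, bar, and file are placeholders.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, 'foo', 'bar')

os.makedirs(target, exist_ok=True)   # creates foo and foo/bar
os.makedirs(target, exist_ok=True)   # second call is a no-op
created = os.path.isdir(target)

# A regular file at the pathname still causes an exception.
blocker = os.path.join(base, 'file')
open(blocker, 'w').close()
try:
    os.makedirs(blocker, exist_ok=True)
    raised = False
except FileExistsError:
    raised = True

shutil.rmtree(base)
```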

recursive copy

How to perform a recursive copy. If the source is a directory, then the directory and all its contents will be copied.

remove empty directory

How to remove an empty directory. The operation will fail if the directory is not empty.

remove directory and contents

How to remove a directory and all its contents.

directory test

How to determine if a pathname is a directory.

generate unused directory

How to generate an unused directory. The directory is created to avoid a race condition with another process looking for an unused directory.

The directory is not implicitly deleted.

ruby:

When Dir.mktmpdir is provided with a block the directory is deleted after the block finishes executing:

require 'tmpdir'
require 'fileutils'

Dir.mktmpdir("foo") do |path|
  puts path
  FileUtils.cp("/etc/hosts", "#{path}/hosts")
end

system temporary file directory

The name of the system provided directory for temporary files.

On Linux the directory is often /tmp, and the operating system is often configured to delete the contents of /tmp at boot.

Processes and Environment

command line arguments

How to access arguments provided at the command line when the script was run; how to get the name of the script.

command line options

How to process command line options.

We describe the style used by getopt_long from the GNU C library. The characteristics of this style are:

  • Options can be short or long. Short options are a single character preceded by a hyphen. Long options are a word preceded by two hyphens.
  • A double hyphen by itself can be used to terminate option processing. Arguments after the double hyphen are treated as positional arguments and can start with a hyphen.
  • Options can be declared to be with or without argument. Options without argument are used to set a boolean value to true.
  • Short options without argument can share a hyphen.
  • Long options can be separated from their argument by a space or an equals sign (=). Short options can be separated from their argument by nothing, a space, or an equals sign (=).

The option processing function should identify the positional arguments. These are the command line arguments which are not options, option arguments, or the double hyphen used to terminate option processing. getopt_long permits options to occur after positional arguments.

perl:

The type of an option argument can be specified as a string, integer, or float. If the argument cannot be converted to an integer or float a warning is written to standard error and GetOptions() returns a false value.

if (GetOptions("file=s" => \$file,
               "count=i" => \$count,
               "ratio=f" => \$ratio)) {
  ...

python:

The type of an argument can be specified using the named parameter type:

parser.add_argument('--count', '-c', dest='count', type=int)
parser.add_argument('--ratio', '-r', dest='ratio', type=float)

If the argument cannot be converted to the type, the script prints out a usage statement and exits with a non-zero value.

The default value is None, but this can be changed using the named parameter default:

parser.add_argument('--file', '-f', dest='file', default='tmpfile')
parser.add_argument('--count', '-c', dest='count', type=int, default=1)
parser.add_argument('--ratio', '-r', dest='ratio', type=float, default=0.5)
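
Putting the pieces together, a sketch of a complete parser. The --verbose flag and the positional name paths are additions made for this example, and the argument list is passed explicitly so the result does not depend on how the script was invoked.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--file', '-f', dest='file', default='tmpfile')
parser.add_argument('--count', '-c', dest='count', type=int, default=1)
parser.add_argument('--verbose', '-v', action='store_true')   # option without argument
parser.add_argument('paths', nargs='*')                       # positional arguments

# A double hyphen ends option processing, so '-dash.txt' is positional.
args = parser.parse_args(
    ['-v', '--count=3', '-f', 'out.txt', '--', '-dash.txt'])
```

Called with no argument, parse_args reads sys.argv[1:] instead.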

get and set environment variable

How to get and set an environment variable. If an environment variable is set the new value is inherited by child processes.
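
Inheritance by child processes can be checked with a sketch like the following. GREETING is a hypothetical variable name, and the child is a second Python interpreter started with subprocess.

```python
import os
import subprocess
import sys

os.environ['GREETING'] = 'hello'   # hypothetical variable name

# The child interpreter sees the value set by the parent.
child_code = "import os; print(os.environ['GREETING'])"
output = subprocess.check_output([sys.executable, '-c', child_code])
child_value = output.decode().strip()
```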

php:

putenv returns a boolean indicating success. The call can fail because, when PHP is running in safe mode, only some environment variables are writable. Safe mode was removed in PHP 5.4.

get pid, parent pid

How to get the process id of the interpreter process; how to get the id of the parent process.

perl:

When use English is in effect, the interpreter pid is in the variables $PID and $PROCESS_ID.

ruby:

The process pid is also available in the global variable $$.

get user id and name

How to get the user id of the interpreter process; how to get the username associated with the user id.

When writing a setuid application on Unix, there is a distinction between the real user id and the effective user id. The code examples return the real user id.

The process may be able to determine the username by inspecting environment variables. A POSIX system is required to set the environment variable LOGNAME at login. Unix systems often set USER at login, and Windows systems set %USERNAME%. There is nothing to prevent the user from altering any of these environment variables after login. The methods illustrated in the examples are thus more secure.

perl:

When use English is in effect the user id is also available in the variables $UID and $REAL_USER_ID.

The effective user id is available in $>. When use English is in effect the effective user id is also available in $EUID and $EFFECTIVE_USER_ID.

python:

How to get the effective user id:

os.geteuid()

ruby:

How to get the effective user id:

Process.euid

exit

python:

It is possible to register code to be executed upon exit:

import atexit
atexit.register(print, "goodbye")

It is possible to terminate a script without executing registered exit code by calling os._exit.

ruby:

It is possible to register code to be executed upon exit:

at_exit { puts "goodbye" }

The script can be terminated without executing registered exit code by calling exit!.

set signal handler

How to register a signal handling function.

executable test

How to test whether a file is executable.

external command

How to execute an external command.

escaped external command

How to prevent shell injection.
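
A sketch of two common approaches in Python 3.3 or later. The untrusted string is made up, and the child process is a Python interpreter that echoes its first argument. Passing the command as a list avoids the shell entirely; shlex.quote is for cases where a shell command string cannot be avoided.

```python
import shlex
import subprocess
import sys

untrusted = 'foo; echo injected'

# Argument list: no shell is involved, so the string is passed through verbatim.
echo_code = 'import sys; print(sys.argv[1])'
output = subprocess.check_output([sys.executable, '-c', echo_code, untrusted])
received = output.decode().strip()

# If a shell command string is unavoidable, quote each untrusted piece.
quoted = shlex.quote(untrusted)
```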

backticks

How to invoke an external command and read its output into a variable.

The use of backticks for this operation goes back to the Bourne shell (1977).

perl:

The qx operator can be used with any delimiter. If the opening delimiter is (, [, or {, the closing delimiter must be ), ], or }.

python:

A more concise solution is:

file = os.popen('ls -l /tmp').read()

os.popen was marked as deprecated in Python 2.6 but it is still available in Python 2.7 and Python 3.2.

ruby:

%x can be used with any delimiter. If the opening delimiter is (, [, or {, the closing delimiter must be ), ], or }.

Libraries and Namespaces

Terminology used in this sheet:

  • library: code in its own file that can be included, loaded, or linked by client code.
  • client: code which calls code in a separate file.
  • top-level file or top-level script: the file containing the code in the program which executes first.
  • load: to add definitions in a file to the text of a running process.
  • namespace: a set of names that can be imported as a unit.
  • import: to add definitions defined elsewhere to a scope.
  • unqualified import: to add definitions to a scope using the same identifiers as where they are defined.
  • qualified import: to add definitions to a scope. The identifiers in the scope are derived from the original identifiers in a formulaic manner. Usually the name of the namespace is added as a prefix.
  • label: one of the parts of a qualified identifier.
  • alias import: to add a definition to a scope under an identifier which is specified in the import statement.
  • package: one or more libraries that can be installed by a package manager.

load library

Execute the specified file. Normally this is used on a file which only contains declarations at the top level.

perl:

The last expression in a perl library must evaluate to true.

php:

include_once behaves like require_once except that it is not fatal if an error is encountered executing the library.

load library in subdirectory

How to load a library in a subdirectory of the library path.

hot patch

How to reload a library. Altered definitions in the library will replace previous versions of the definition.

php:

Also include.

load error

How errors which are encountered while loading libraries are handled.

main routine in library

How to put code in a library which executes only when the file is run as a top-level script.

library path

The library path is a list of directory paths which are searched when loading libraries.

library path environment variable

How to augment the library path by setting an environment variable before invoking the interpreter.

library path command line option

How to augment the library path by providing a command line option when invoking the interpreter.

simple global identifiers

multiple label identifiers

label separator

The punctuation used to separate the labels in the full name of a subnamespace.

root namespace definition

namespace declaration

How to declare a section of code as belonging to a namespace.

subnamespace declaration

How to declare a section of code as belonging to a subnamespace.

import namespace

import subnamespace

import all definitions in namespace

How to import all the definitions in a namespace.

import definitions

How to import specific definitions from a namespace.

list installed packages, install a package

How to show the installed 3rd party packages, and how to install a new 3rd party package.

perl:

cpanm is an alternative to cpan which is said to be easier to use.

How to use cpan to install cpanm:

$ sudo cpan -i App::cpanminus

How to install a module with cpanm:

$ sudo cpanm Moose

python:

Two ways to list the installed modules and the modules in the standard library:

$ pydoc modules
$ python
>>> help('modules')

Most 3rd party Python code is packaged using distutils, which is in the Python standard library. The code is placed in a directory with a setup.py file. The code is installed by running the Python interpreter on setup.py:

$ sudo python setup.py install

package specification format

The format of the file used to specify a package.

python:

distutils.core reference

How to create a Python package using distutils. Suppose that the file foo.py contains the following code:

def add(x, y):
    return x+y

In the same directory as foo.py create setup.py with the following contents:

#!/usr/bin/env python

from distutils.core import setup

setup(name='foo',
      version='1.0',
      py_modules=['foo'],
     )

Create a tarball of the directory for distribution:

$ tar cf foo-1.0.tar foo
$ gzip foo-1.0.tar

To install from the tarball, run the following:

$ tar xf foo-1.0.tar.gz
$ cd foo
$ sudo python setup.py install

If you want people to be able to install the package with pip, upload the tarball to the Python Package Index.

ruby:

gemspec attributes

For an example of how to create a gem, create a directory called foo. Inside it create a file called lib/foo.rb which contains:

def add(x, y)
  x + y
end

Then create a file called foo.gemspec containing:

spec = Gem::Specification.new do |s|
  s.name = 'foo'
  s.authors = 'Joe Foo'
  s.version = '1.0'
  s.summary = 'a gem'
  s.files = Dir['lib/*.rb']
end

To create the gem, run this command:

$ gem build foo.gemspec

A file called foo-1.0.gem is created. To install foo.rb run this command:

$ gem install foo-1.0.gem

Objects

define class

php:

Properties (i.e. instance variables) must be declared public, protected, or private. Methods can optionally be declared public, protected, or private. Methods without a visibility modifier are public.

perl:

The sheet shows how to create objects using the CPAN module Moose. To the client of an object, Moose objects and traditional Perl objects are largely indistinguishable. Moose provides convenience functions to aid in the definition of a class, and as a result a Moose class definition and a traditional Perl class definition look quite different.

The most common keywords used when defining a Moose class are has, extends, subtype.

The before, after, and around keywords are used to define method modifiers. The with keyword indicates that a Moose class implements a role.

The no Moose; statement at the end of a Moose class definition removes class definition keywords, which would otherwise be visible to the client as methods.

Here is how to define a class in the traditional Perl way:

package Int;

sub new {
  my $class = shift;
  my $v = $_[0] || 0;
  my $self = {value => $v};
  bless $self, $class;
  $self;
}

sub value {
  my $self = shift;
  if ( @_ > 0 ) {
    $self->{'value'} = shift;
  }
  $self->{'value'};
}

sub add {
  my $self = shift;
  $self->value + $_[0];
}

sub DESTROY {
  my $self = shift;
  my $v = $self->value;
  print "bye, $v\n";
}

python:

As of Python 2.2, classes are of two types: new-style classes and old-style classes. The class type is determined by the type of class(es) the class inherits from. If no superclasses are specified, then the class is old-style. As of Python 3.0, all classes are new-style.

New-style classes have these features which old-style classes don't:

  • universal base class called object.
  • descriptors and properties. Also the __getattribute__ method for intercepting all attribute access.
  • change in how the diamond problem is handled. If a class inherits from multiple parents which in turn inherit from a common grandparent, then when checking for an attribute or method, all parents will be checked before the grandparent.
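
The lookup order for a diamond can be inspected through the __mro__ attribute. The class names below are placeholders:

```python
class A(object):
    def who(self):
        return 'A'

class B(A):
    pass

class C(A):
    def who(self):
        return 'C'

class D(B, C):
    pass

# Both parents B and C come before the common grandparent A.
order = [cls.__name__ for cls in D.__mro__]
answer = D().who()
```

Because C is checked before A, the call finds the method defined in C.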

create object

How to create an object.

get and set attribute

How to get and set an attribute.

perl:

Other getters:

$i->value()
$i->{'value'}

Other setters:

$i->{'value'} = $v;

python:

Defining explicit setters and getters in Python is considered poor style. If it becomes necessary to add extra logic to attribute access, this can be achieved without disrupting the clients of the class by creating a property:

def getValue(self):
  print("getValue called")
  return self.__dict__['value']
def setValue(self,v):
  print("setValue called")
  self.__dict__['value'] = v
value = property(fget=getValue, fset = setValue)
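
The same idea as a complete class, written with the decorator form of property. This is a sketch: the class name Int follows the other examples on this sheet, while the calls list and the _value attribute are illustrative choices.

```python
class Int(object):
    def __init__(self, v=0):
        self.calls = []      # records which accessor ran
        self._value = v

    @property
    def value(self):
        self.calls.append('get')
        return self._value

    @value.setter
    def value(self, v):
        self.calls.append('set')
        self._value = v

i = Int(3)
i.value = i.value + 4    # plain attribute syntax; both accessors run
```

Clients keep using i.value as if it were an ordinary attribute.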

instance variable accessibility

How instance variable access works.

define method

How to define a method.

invoke method

How to invoke a method.

perl:

If the method does not take any arguments, the parens are not necessary to invoke the method.

destructor

How to define a destructor.

perl:

Perl destructors are called when the garbage collector reclaims the memory for an object, not when all references to the object go out of scope. In traditional Perl OO, the destructor is named DESTROY, but in Moose OO it is named DEMOLISH.

python:

A Python destructor is not guaranteed to be called when all references to an object go out of scope. CPython uses reference counting, so in practice it usually calls the destructor as soon as the last reference is dropped.

ruby:

Ruby lacks a destructor. It is possible to register a block to be executed before the memory for an object is released by the garbage collector. A ruby interpreter may exit without releasing memory for objects that have gone out of scope and in this case the finalizer will not get called. Furthermore, if the finalizer block holds on to a reference to the object, it will prevent the garbage collector from freeing the object.

method missing

How to handle when a caller invokes an undefined method.

php:

Define the method __callStatic to handle calls to undefined class methods.

python:

__getattr__ is invoked when an attribute (instance variable or method) is missing. By contrast, __getattribute__, which is available in new-style classes (all classes in Python 3), is always invoked, and can be used to intercept access to attributes that exist. __setattr__ and __delattr__ are invoked whenever an attribute is set or deleted, whether or not it already exists. The del statement is used to delete an attribute.

ruby:

Define the method self.method_missing to handle calls to undefined class methods.

inheritance

How to use inheritance.

perl:

Here is how inheritance is handled in traditional Perl OO:

package Counter;

our @ISA = "Int";

my $instances = 0;
our $AUTOLOAD;

sub new {
  my $class = shift;
  my $self = Int->new(@_);
  $instances += 1;
  bless $self, $class;
  $self;
}

sub incr {
  my $self = shift;
  $self->value($self->value + 1);
}
sub instances {
  $instances;
}

sub AUTOLOAD {
  my $self = shift;
  my $argc = scalar(@_);
  print "undefined: $AUTOLOAD " .
    "arity: $argc\n";
}

define class method

invoke class method

How to invoke a class method.

operator overloading

How to define the behavior of the binary operators.

method alias

How to create an alias for a method.

ruby:

Ruby provides the keyword alias and the method alias_method in the class Module. Inside a class body they behave identically. When called from inside a method alias has no effect but alias_method works as expected. Hence some recommend always using alias_method.

Reflection

object id

How to get an identifier for an object or a value.

inspect type

php:

The PHP manual says that the strings returned by gettype are subject to change and advises using the following predicates instead:

is_null
is_bool
is_numeric
is_int
is_float
is_string
is_array
is_object
is_resource

perl:

ref returns the empty string when the argument is not a scalar containing a reference. If the argument is a reference, ref returns the package name if it points to a blessed object. Otherwise it returns the name of the built-in type.

basic types

php:

All possible return values of gettype are listed.

perl:

All the built-in types are listed.

inspect class

How to get the class of an object.

inspect class hierarchy

has method?

perl:

$a->can() returns a reference to the method if it exists, otherwise it returns undef.

python:

hasattr(o,'reverse') will return True if there is an instance variable named 'reverse'.
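
To rule out instance variables, the attribute can also be tested with callable. The helper has_method and the class Doc below are hypothetical names used only for this sketch:

```python
class Doc(object):
    def __init__(self):
        self.reverse = 'not a method'   # instance variable, not a method

def has_method(o, name):
    """True only if the attribute exists and can be called."""
    return callable(getattr(o, name, None))

plain_check = hasattr(Doc(), 'reverse')        # fooled by the instance variable
strict_check = has_method(Doc(), 'reverse')
list_check = has_method([], 'reverse')         # list.reverse is a real method
```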

message passing

eval

How to interpret a string as code and return its value.

php:

The value of the string is the value of the return statement that terminates execution. If execution falls off the end of the string without encountering a return statement, the eval evaluates as NULL.

python:

The argument of eval must be an expression or a SyntaxError is raised. The Python version of the mini-REPL is thus considerably less powerful than the versions for the other languages. It cannot define a function or even create a variable via assignment.
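
Statements can be run with exec instead. The sketch below, in Python 3 syntax, shows eval accepting an expression and rejecting an assignment, and exec running the assignment in a dictionary used as a namespace:

```python
value = eval('1 + 1')            # an expression: fine

try:
    eval('x = 3')                # a statement: rejected
    assignment_ok = True
except SyntaxError:
    assignment_ok = False

namespace = {}
exec('x = 3', namespace)         # exec accepts statements
```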

list object methods

list object attributes

perl:

keys %$a assumes the blessed object is a hash reference.

python:

dir(o) returns methods and instance variables.
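
One way to separate the two kinds of names, sketched in Python 3. The class Point is hypothetical, and the filter assumes that names beginning with a double underscore are not of interest:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


p = Point(3, 4)

# Skip the double-underscore names that every object inherits.
names = [n for n in dir(p) if not n.startswith("__")]

methods = [n for n in names if callable(getattr(p, n))]
variables = sorted(vars(p))  # instance variables live in p.__dict__

print(methods)    # ['norm_squared']
print(variables)  # ['x', 'y']
```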

pretty print

How to display the contents of a data structure for debugging purposes.
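
A small Python 3 illustration using the standard pprint module. The sample data is made up, and the width of 40 is an arbitrary choice that forces the output onto several lines:

```python
import pprint

data = {"name": "example", "values": list(range(3)), "nested": {"a": [1, 2], "b": None}}

# pformat returns the formatted text; pprint.pprint would print it directly.
formatted = pprint.pformat(data, width=40)
print(formatted)
```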

source line number and file name

How to get the current line number and file name of the source code.
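
A Python 3 sketch using the standard inspect module. The helper current_location is a hypothetical name; it reports the position of the code that called it:

```python
import inspect


def current_location():
    """Return (file name, line number) of the caller."""
    caller = inspect.currentframe().f_back
    return caller.f_code.co_filename, caller.f_lineno


filename, lineno = current_location()
print("%s:%d" % (filename, lineno))
```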

command line documentation

How to get documentation from the command line.

ruby:

Searching for Math.atan2 will return either class method or instance method documentation. If there is documentation for both one can be specific with the following notation:

$ ri Math::atan2
$ ri Math#atan2

Net and Web

get local hostname, dns lookup, reverse dns lookup

How to get the hostname and the ip address of the local machine without connecting to a socket.

The operating system should provide a method for determining the hostname. Linux provides the uname system call.

A DNS lookup can be performed to determine the IP address for the local machine. This may fail if the DNS server is unaware of the local machine or if the DNS server has incorrect information about the local host.

A reverse DNS lookup can be performed to find the hostname associated with an IP address. This may fail for the same reasons a forward DNS lookup might fail.
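
The three steps can be sketched in Python 3 with the standard socket module. The function name local_addresses is hypothetical, and returning None when a lookup fails is just one way to handle the failure cases described above:

```python
import socket


def local_addresses():
    """Return (hostname, ip, reverse_name); ip or reverse_name is None on failure."""
    hostname = socket.gethostname()  # asks the operating system, no DNS involved
    try:
        ip = socket.gethostbyname(hostname)  # forward lookup
    except OSError:
        return hostname, None, None
    try:
        reverse_name = socket.gethostbyaddr(ip)[0]  # reverse lookup
    except OSError:
        reverse_name = None
    return hostname, ip, reverse_name


hostname, ip, reverse_name = local_addresses()
print(hostname, ip, reverse_name)
```

The output depends on the machine and on how its resolver is configured.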

http get

How to make an HTTP GET request and read the response into a string.

serve working directory

A command line invocation to start a single process web server which serves the working directory at http://localhost:8000.

perl:

The following webserver serves files but does not provide directory listings:

$ sudo cpan -i IO::All

$ perl -MIO::All -e 'io(":8000")->fork->accept->(sub { $_[0] < io(-x $1 ? "./$1 |" : $1) if /^GET \/(.*) / })'

absolute url

How to construct an absolute URL from a base URL and a relative URL as documented in RFC 1808.

When constructing the absolute URL, the rightmost path component of the base URL is removed unless it ends with a slash /. The query string and fragment of the base URL are always removed.

If the relative URL starts with a slash / then the entire path of the base URL is removed.

If the relative URL starts with one or more occurrences of ../ then one or more path components are removed from the base URL.

The base URL and the relative URL will be joined by a single slash / in the absolute URL.
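
These rules can be checked with urljoin from the Python 3 standard library. The example.com URLs are placeholders:

```python
from urllib.parse import urljoin

base = "http://example.com/a/b/c?q=1#top"

# Rightmost path component, query string and fragment of the base are dropped.
print(urljoin(base, "d"))                       # http://example.com/a/b/d
# A base path ending in a slash keeps all of its components.
print(urljoin("http://example.com/a/b/", "d"))  # http://example.com/a/b/d
# A leading slash replaces the whole base path.
print(urljoin(base, "/d"))                      # http://example.com/d
# Each ../ removes one more path component.
print(urljoin(base, "../d"))                    # http://example.com/a/d
```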

php:

Here is a PHP function which computes absolute urls.

parse url

How to extract the protocol, host, port, path, query string, and fragment from a URL. How to extract the parameters from the query string.

python:

urlparse can also be used to parse FTP URLs:

import urlparse  # in Python 3: from urllib import parse as urlparse

up = urlparse.urlparse('ftp://foo:bar@google.com/baz;type=binary')

# 'foo'
up.username

# 'bar'
up.password

# 'type=binary'
up.params
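
A Python 3 sketch of the general case, extracting each component of an HTTP URL and the query string parameters. The URL is a made-up example:

```python
from urllib.parse import urlparse, parse_qs

url = "http://example.com:8080/path/page.html?lang=en&tag=a&tag=b#section"
parts = urlparse(url)

print(parts.scheme)    # http
print(parts.hostname)  # example.com
print(parts.port)      # 8080
print(parts.path)      # /path/page.html
print(parts.query)     # lang=en&tag=a&tag=b
print(parts.fragment)  # section

# parse_qs maps each parameter name to a list, since names may repeat.
params = parse_qs(parts.query)
print(params)          # {'lang': ['en'], 'tag': ['a', 'b']}
```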

ruby:

How to parse an FTP URL:

require 'uri'

up = URI('ftp://foo:bar@google.com/baz;type=binary')

# "foo"
up.user

# "bar"
up.password

# "binary"
up.typecode

url encode/decode

How to URL encode and URL unencode a string.

URL encoding, also called percent encoding, is described in RFC 3986. It replaces all characters except for the letters, digits, and a few punctuation marks with a percent sign followed by their two digit hex encoding. The characters which are not escaped are:

A-Z a-z 0-9 - _ . ~

URL encoding can be used to encode UTF-8, in which case each byte of a UTF-8 character is encoded separately.

When form data is sent from a browser to a server via an HTTP GET or an HTTP POST, the data is percent encoded but spaces are replaced by plus signs + instead of %20. The MIME type for form data is application/x-www-form-urlencoded.

perl:

CGI::escape replaces spaces with %20. CGI::unescape will replace both + and %20 with a space, however.

python:

In Python 3 the functions quote_plus, unquote_plus, quote, and unquote moved from urllib to urllib.parse.

urllib.quote replaces a space character with %20.

urllib.unquote does not replace + with a space character.
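
The difference between the two pairs of functions, sketched with the Python 3 names from urllib.parse. The sample strings are arbitrary:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "lorem ipsum?"

print(quote(text))       # lorem%20ipsum%3F
print(quote_plus(text))  # lorem+ipsum%3F

print(unquote("lorem+ipsum%3F"))       # lorem+ipsum?
print(unquote_plus("lorem+ipsum%3F"))  # lorem ipsum?

# Non-ASCII text is encoded as UTF-8 first, one escape per byte.
print(quote("\u00e9"))  # %C3%A9
```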

base64 encode/decode

How to encode binary data in ASCII using the Base64 encoding scheme.

A popular Base64 encoding is the one defined by RFC 2045 for MIME. Every 3 bytes of input is mapped to 4 of these characters: [A-Za-z0-9/+].
If the input does not consist of a multiple of three bytes, then the output is padded with one or two equals signs: =.

Whitespace can be inserted freely into Base64 output; this is necessary to support transmission by email. When converting Base64 back to binary, whitespace is ignored.
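
Padding and whitespace handling can be observed with the Python 3 base64 module. The inputs are arbitrary short byte strings:

```python
import base64

# 3 bytes map to 4 characters; shorter tails are padded with '='.
print(base64.b64encode(b"abc"))  # b'YWJj'
print(base64.b64encode(b"ab"))   # b'YWI='
print(base64.b64encode(b"a"))    # b'YQ=='

# By default b64decode discards characters outside the alphabet,
# so an embedded line break is ignored.
print(base64.b64decode(b"YW\nJj"))  # b'abc'
```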

json generate/parse

How to encode JSON data in a JSON string; how to decode such a string.

JSON data consists of objects, arrays, and JSON values. Objects are dictionaries in which the keys are strings and the values are JSON values. Arrays contain JSON values. JSON values can be objects, arrays, strings, numbers, true, false, or null.

A JSON string is JSON data encoded using the corresponding literal notation used by JavaScript source code.

JSON strings are sequences of Unicode characters. The following backslash escape sequences are supported:

  \" \\ \/ \b \f \n \r \t \uhhhh.
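
A round trip through the Python 3 json module, as a sketch. The dictionary is made-up sample data; note that json.dumps escapes non-ASCII characters as \uhhhh by default:

```python
import json

data = {"name": "caf\u00e9", "tags": ["a", "b"], "count": 3, "ok": True, "extra": None}

encoded = json.dumps(data)
print(encoded)
# {"name": "caf\u00e9", "tags": ["a", "b"], "count": 3, "ok": true, "extra": null}

decoded = json.loads(encoded)
print(decoded == data)  # True
```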

generate xml

How to build an XML document.

An XML document can be constructed by concatenating strings, but the techniques illustrated here guarantee the result to be well-formed XML.
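
A Python 3 sketch of building a document from element objects, here with the standard xml.etree.ElementTree module. The element names are arbitrary; the library escapes special characters when serializing:

```python
import xml.etree.ElementTree as ET

root = ET.Element("a")
child = ET.SubElement(root, "b", attrib={"id": "1"})
child.text = "x < y & z"  # special characters are escaped on output

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)  # <a><b id="1">x &lt; y &amp; z</b></a>
```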

parse xml

How to parse XML and extract nodes using XPath.

ruby:

Another way of handling an XPath expression which matches multiple nodes:

XPath.each(doc,"/a/b/c") do |node|
  puts node.text
end
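
A comparable loop in Python 3, as a sketch. The standard xml.etree.ElementTree module supports only a subset of XPath, and paths are relative to the element being searched; the sample document is made up:

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<a><b><c>foo</c></b><b><c>bar</c></b></a>")

# doc is the root element <a>, so the path is relative to it.
for node in doc.findall("./b/c"):
    print(node.text)

texts = [node.text for node in doc.findall("./b/c")]
print(texts)  # ['foo', 'bar']
```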

parse html

How to parse an HTML document.

Unit Tests

test class

How to define a test class and make a truth assertion.

The argument of a truth assertion is typically an expression. It is a good practice to include a failure message as a second argument which prints out variables in the expression.
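
A minimal Python 3 sketch with the standard unittest module. The class name TestFoo and the tested expression are hypothetical; the suite is run programmatically here so that the result can be inspected:

```python
import unittest


class TestFoo(unittest.TestCase):
    def test_positive(self):
        x = 3
        # The message is shown only if the assertion fails.
        self.assertTrue(x > 0, "expected x > 0, got x = %r" % x)


# Run the class programmatically; unittest.main() is the usual choice in a script.
truth_suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFoo)
truth_result = unittest.TextTestRunner().run(truth_suite)
print(truth_result.wasSuccessful())  # True
```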

perl:

If there is more than one assertion in a test, then set the Test attribute appropriately in the test method signature to silence a warning:

sub test_01 : Test(2) {
  ok(1);
  ok(2);
}

run tests; run test method

How to run all the tests in a test class; how to run a single test from the test class.

equality assertion

How to test for equality.

python:

When the assertion fails, assertEqual reports the values of its first two arguments. A third argument can be used to provide a more informative failure message. assertEquals is a deprecated alias of assertEqual.

approximate assertion

How to assert that two floating point numbers are approximately equal.

regex assertion

How to test that a string matches a regex.

exception assertion

How to test whether an exception is raised.
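
The approximate, regex, and exception assertions can be sketched together in Python 3 with unittest. The class name TestAssertions and the tested values are illustrative:

```python
import unittest


class TestAssertions(unittest.TestCase):
    def test_approximate(self):
        # Passes when the difference rounds to zero at 7 decimal places (the default).
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

    def test_regex(self):
        self.assertRegex("lorem ipsum", r"^lorem\s+\w+$")

    def test_exception(self):
        with self.assertRaises(ZeroDivisionError):
            1 / 0


assertion_suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAssertions)
assertion_result = unittest.TextTestRunner().run(assertion_suite)
print(assertion_result.testsRun, assertion_result.wasSuccessful())  # 3 True
```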

mock method

How to create a mock method.

A mock method is used when calling the real method from a unit test would be undesirable. The method that is mocked is not in the code that is being tested, but rather a library which is used by that code. Mock methods can raise exceptions if the test fails to invoke them or if the wrong arguments are provided.

python:

assert_called_once_with takes the same number of arguments as the method being mocked.

If the mock method was called multiple times, the method assert_called_with can be used in place of assert_called_once_with to make an assertion about the arguments that were used in the most recent call.

A mock method which raises an exception:

foo = Foo()
foo.run = mock.Mock(side_effect=KeyError('foo'))

with self.assertRaises(KeyError):
  foo.run(13)

foo.run.assert_called_with(13)
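
A self-contained Python 3 sketch of the pattern described above, in which the mocked method belongs to a library used by the code under test. The names Mailer, send, and notify are hypothetical:

```python
from unittest import mock


class Mailer:
    """Stand-in for a library class whose real method should not run in a test."""

    def send(self, address, body):
        raise RuntimeError("would send real mail")


def notify(mailer, address):
    """Code under test: it calls into the library."""
    mailer.send(address, "hello")


mailer = Mailer()
mailer.send = mock.Mock(return_value=None)  # replace the real method

notify(mailer, "a@example.com")

# Raises AssertionError unless send was called exactly once with these arguments.
mailer.send.assert_called_once_with("a@example.com", "hello")
print(mailer.send.call_count)  # 1
```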

ruby:

The with method takes the same number of arguments as the method being mocked.

Other methods are available for use in the chain which defines the assertion. The once method can be replaced by never or twice. If there is uncertainty about how often the method will be called, one can use at_least_once, at_least(m), at_most_once, or at_most(n) to set lower or upper bounds. times(m..n) takes a range to set both the lower and upper bound.

A mock method which raises an exception:

    foo = mock()
    foo.expects(:run).
      raises(exception = RuntimeError, message = 'bam!').
      with(13).
      once

    assert_raises(RuntimeError) do
      foo.run(13)
    end

There is also a method called yields which can be used in the chain which defines the assertion. It makes the mock method yield to a block. It takes as arguments the arguments it passes to the block.

setup

How to define a setup method which gets called before every test.

teardown

How to define a cleanup method which gets called after every test.
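
A Python 3 sketch of both hooks in unittest, where they are named setUp and tearDown. The class name TestWithTempFile and the use of a temporary file are illustrative choices:

```python
import os
import tempfile
import unittest


class TestWithTempFile(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        fd, self.path = tempfile.mkstemp()
        os.close(fd)

    def tearDown(self):
        # Runs after every test method whose setUp succeeded, pass or fail.
        os.remove(self.path)

    def test_file_exists(self):
        self.assertTrue(os.path.exists(self.path))

    def test_file_is_empty(self):
        self.assertEqual(os.path.getsize(self.path), 0)


fixture_suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestWithTempFile)
fixture_result = unittest.TextTestRunner().run(fixture_suite)
print(fixture_result.testsRun, fixture_result.wasSuccessful())  # 2 True
```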

Debugging and Profiling

check syntax

How to check the syntax of code without executing it.
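
From inside Python 3, the built-in compile function parses source without running it. A sketch, with syntax_ok as a hypothetical helper name:

```python
def syntax_ok(source, filename="<string>"):
    """Compile source without running it; report whether it parsed."""
    try:
        compile(source, filename, "exec")
        return True
    except SyntaxError:
        return False


print(syntax_ok("print('never executed')"))  # True
print(syntax_ok("def broken(:\n    pass"))   # False
```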

flags for stronger and strongest warnings

Flags to increase the warnings issued by the interpreter.

perl:

The

use warnings;

pragma is the same as the -w flag except that warnings are only issued for constructs in the current scope.

python:

The -t flag warns about inconsistent use of tabs in the source code. The -3 flag is a Python 2.X option which warns about syntax which is no longer valid in Python 3.X.

lint

A lint tool.

source cleanup

A tool which detects or removes semantically insignificant variation in the source code.

run debugger

How to run a script under the debugger.

debugger commands

A selection of commands available when running the debugger. The gdb commands are provided for comparison.

cmd | perl -d | python -m pdb | rdebug | gdb
help | h | h | h | h
list | l [first, last] | l [first, last] | l [first, last] | l [first, last]
next statement | n | n | n | n
step into function | s | s | s | s
set breakpoint | b | b [file:]line; b function | b [file:]line; b class[.method] | b [file:]line
list breakpoints | L | b | info b | i b
delete breakpoint | B num | cl num | del num | d num
continue | c | c | c | c
show backtrace | T | w | w | bt
move up stack | | u | u | u
move down stack | | d | down | do
print expression | p expr | p expr | p expr | p expr
(re)run | R | restart [arg1[, arg2 …]] | restart [arg1[, arg2 …]] | r [arg1[, arg2 …]]
quit debugger | q | q | q | q

benchmark code

How to run a snippet of code repeatedly and get the user, system, and total wall clock time.
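
One possible approach in Python 3, assuming os.times for user and system CPU time and time.perf_counter for wall clock time. The function name benchmark and the repeat count are arbitrary:

```python
import os
import time


def benchmark(func, repeat=100000):
    """Call func repeat times; return (user, system, wall) seconds."""
    start_cpu = os.times()
    start_wall = time.perf_counter()
    for _ in range(repeat):
        func()
    end_wall = time.perf_counter()
    end_cpu = os.times()
    return (end_cpu.user - start_cpu.user,
            end_cpu.system - start_cpu.system,
            end_wall - start_wall)


user, system, wall = benchmark(lambda: sum(range(100)))
print("user %.3f  system %.3f  wall %.3f" % (user, system, wall))
```

The figures vary from run to run and from machine to machine.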

profile code

How to run the interpreter on a script and get the number of calls and total execution time for each function or method.

perl:

perl -d:DProf writes the profiling information to the file tmon.out in the current directory. dprofpp reads that file.
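
For comparison, a Python 3 sketch that profiles a function call from inside a script with the standard cProfile and pstats modules. The function fib is a made-up workload, and the report is limited to five entries:

```python
import cProfile
import io
import pstats


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


profiler = cProfile.Profile()
profiler.enable()
fib(15)
profiler.disable()

# Format the collected statistics, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Each line of the report gives the number of calls and the total and cumulative time for one function.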

Java Interoperation

Both Python and Ruby have JVM implementations. It is possible to compile both Python code and Ruby code to Java bytecode and run it on the JVM. It is also possible to run a version of the Python interpreter or the Ruby interpreter on the JVM which reads Python code or Ruby code, respectively.

version

Version of the scripting language JVM implementation used in this reference sheet.

repl

Command line name of the repl.

interpreter

Command line name of the interpreter.

compiler

Command line name of the tool which compiles source to java byte code.

prologue

Code necessary to make java code accessible.

new

How to create a java object.

method

How to invoke a java method.

import

How to import names into the current namespace.

import non-bundled java library

How to import a non-bundled Java library.

shadowing avoidance

How to import Java names which are the same as native names.

convert native array to java array

How to convert a native array to a Java array.

are java classes subclassable?

Can a Java class be subclassed?

are java classes open?

Can a Java class be monkey patched?

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License