Tardate 2016: 10/01/2008

Monday, October 27, 2008

Synchronising two directory trees - tree-sync 2.4

tree-sync.pl has had a little update thanks to a contribution from Dave Stafford.

Added option 'diff' to view only changes

Added option 'ignore' to exclude certain extensions

Added option 'brief' to remove src/dest directories in report view to make cleaner output

In the 2.4 release I've also added the 'force' option to control read-only file handling.

tree-sync.pl is a bit of a geeky tool, suitable for cases where you want detailed control over a file-based copy/mirror/backup process from the command-line. There are many flashier solutions available these days, but if tree-sync scratches your itch - go for it!

tree-sync.pl continues to be available on CPAN, but from the 2.4 release I have moved the source to github. If you have changes to suggest or bugs to fix, please go ahead and join this project.

Tuesday, October 21, 2008

Rolling Project Euler on Ruby

I first heard about Project Euler last week on the stackoverflow podcast. Michael Pryor (fogcreek co-founder) makes a quick side reference in discussion with Joel Spolsky, Jeff Atwood and the rest of the SO team.

Well I checked it out last week, got hooked and spent most of the weekend earning my "level 1" badge;-)

Aside from dusting off some long-forgotten and arcane knowledge from my youth, I found it a fantastic opportunity to stretch my fundamental ruby chops. In fact, I'd recommend a few questions at Project Euler as a right-of-passage whenever you are learning a new programming language.

I've only been using ruby for a year or so, and in that time thought I had picked up a fair bit. But I was still able to amaze myself at just how many of the problems were knocked over in just 1 or 2 lines with a bit of duck punching and liberal use of blocks with Enumerables.

I'm late to the Project Euler craze, so you will already find many people posting hints for specific questions if you just google. I thought I'd share some of the "common code" I've been building up as I go through the questions.

I put a recent copy of the source up on github for anyone who is interested (MyMath.rb), but what follows a sampling of some of the more interesting pieces.

First thing you will note is that I have written all these "common" routines as extensions to some of the fundamental classes in the ruby library: Integer, Array, String.

It doesn't have to be this way, and for less trivial activities you might be right to be concerned about messing with the behaviour of the standard classes. Maybe I'm still enjoying my ruby honeymoon period, but I do get a thrill out of being able to write things like

1551.palindrome?
=> true

Integer Extensions

It's just so easy to code up simple calculation and introspection routines..


class Integer
  # @see project euler #15,20,34
  def factorial
    (2..self).inject(1) { |prod, n| prod * n }
  end

  # sum of digits in the number, expressed as a decimal
  # @see project euler #16, 20
  def sum_digits
    self.to_s.split('').inject(0) { |memo, c| memo + c.to_i }
  end

  # num of digits in the number, expressed as a decimal
  # @see project euler #25
  def num_digits
    self.to_s.length
  end
  
  # tests if all the base10 digits in the number are odd
  # @see project euler #35
  def all_digits_odd?
    self.to_s.split('').inject(0) { |memo, s| memo + ( s.to_i%2==0 ? 1 : 0 ) } == 0
  end
  
  # generates triangle number for this integer
  # http://en.wikipedia.org/wiki/Triangle_number
  # @see project euler #42
  def triangle
    self * ( self + 1 ) / 2
  end
end

Prime numbers feature heavily on Project Euler, and I think calculating a prime series was my first lesson on why you can't brute-force everything;-) Enter the Sieve of Eratosthenes and related goodness..

class Integer 
  # http://en.wikipedia.org/wiki/Prime_factor
  # @see project euler #12
  def prime_factors
    primes = Array.new
    d = 2  
    n = self      
    while n > 1
     if n%d==0
        primes << d
        n/=d
      else
        d+=1
      end
    end
    primes
  end
  
  # http://en.wikipedia.org/wiki/Divisor_function
  # @see project euler #12
  def divisor_count
    primes = self.prime_factors
    primes.uniq.inject(1) { |memo, p| memo * ( ( primes.find_all {|i| i == p} ).length + 1) }
  end
  
  #
  # @see project euler #12, 21, 23
  def divisors
    d = Array.new
    (1..self-1).each { |n| d << n if self % n == 0 }
    d
  end

  # @see project euler #
  def prime?
    divisors.length == 1 # this is a brute force check
  end
  
  # prime series up to this limit, using Sieve of Eratosthenes method
  # http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
  # @see project euler #7, 10, 35
  def prime_series
    t = self
    limit = Math.sqrt(t)
    a = (2..t).to_a
    n = 2
    while (n < limit) do
      x = n*2
      begin
        a[x-2]=2
        x+=n
      end until (x > t )
      begin
        n+=1
      end until ( a[n-2] != 2 )
    end
    a.uniq!
  end

  # @see project euler #23
  def perfect?
    self == divisors.sum
  end

  # @see project euler #23
  def deficient?
    self > divisors.sum
  end

  # @see project euler #23
  def abundant?
    self < divisors.sum
  end
end

Next we visit the Collatz conjecture and an interesting routine to make numbers "speak english"..

class Integer     
  # http://en.wikipedia.org/wiki/Collatz_conjecture
  # @see project euler #14
  def collatz_series
    a = Array.new
    a << n = self
    while n > 1
      if n % 2 == 0
        n /= 2
      else
        n = 3*n + 1
      end
      a << n
    end
    a  
  end

  # express integer as an english phrase
  # @see project euler #17
  def speak
    case
    when self <20
      ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten",
       "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen", "seventeen", "eighteen", "nineteen" ][self]
    when self > 19 && self < 100 
      a = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"][self / 10 - 2]
      r = self % 10
      if r == 0
        a
      else
        a + "-" + r.speak
      end
    when self > 99 && self < 1000
      a = (self / 100).speak + " hundred"
      r = self % 100
      if r == 0
        a
      else
        a + " and " + r.speak
      end      
    when self > 999 && self < 10000
      a = (self / 1000).speak + " thousand"
      r = self % 1000
      if r == 0
        a
      else
        a + ( r <100 ? " and " : " " ) + r.speak
      end      
    else
      self
    end
  end
end

Calculating integer partitions is one of my favourites ... a nice, super-fast recursive algorithm. For problems like "how many ways to make $2 in change?"

class Integer 

  # calculates integer partitions for given number using array of elements 
  # http://en.wikipedia.org/wiki/Integer_partition
  # @see project euler #31
  def integer_partitions(pArray, p=0)
    if p==pArray.length-1
      1
    else
      self >= 0 ? (self - pArray[p]).integer_partitions(pArray ,p) + self.integer_partitions(pArray,p+1) : 0
    end
  end   
end

Finally, rotations and palindromes (base 2 or 10): methods that rely on some underlying String routines that come later...

class Integer 
  # returns an array of all the base10 digit rotations of the number
  # @see project euler #35
  def rotations
    self.to_s.rotations.collect { |s| s.to_i }
  end

  # @see project euler #4, 36, 91
  def palindrome?(base = 10)
    case base 
    when 2
      sprintf("%0b",self).palindrome?
    else
      self.to_s.palindrome?
    end
  end
end

Array Manipulations

Array handling is particularly important. Start with some simple helpers, then move onto greatest common factor and a couple of least-common multiple implementations. My favourite here - lexicographic permutations.

class Array

  # sum elements in the array
  def sum
    self.inject(0) { |sum, n| sum + n }
  end
  
  # sum of squares for elements in the array
  # @see project euler #6
  def sum_of_squares
    self.inject(0) { |sos, n| sos + n**2 }
  end
  
  # @see project euler #17
  def square_of_sum
    ( self.inject(0) { |sum, n| sum + n } ) ** 2
  end
    
  # index of the smallest item in the array
  def index_of_smallest
    value, index  = self.first, 0
    self.each_with_index {| obj, i | value, index = obj, i if obj<value  }
    index
  end

  # removes numbers from the array that are factors of other elements in the array
  # @see project euler #5
  def remove_factors
    a=Array.new
    self.each do | x | 
      a << x if 0 == ( self.inject(0) { | memo, y | memo + (x!=y && y%x==0 ? 1 : 0)  } )
    end
    a
  end

  # http://utilitymill.com/edit/GCF_and_LCM_Calculator
  # @see project euler #5
  def GCF
    t_val = self[0]
    for cnt in 0...self.length-1
      num1 = t_val
      num2 = self[cnt+1]
      num1,num2=num2,num1 if num1 < num2
      while num1 - num2 > 0
        num3 = num1 - num2 
        num1 = [num2,num3].max
        num2 = [num2,num3].min
      end
      t_val = num1
    end
    t_val
  end

  # http://utilitymill.com/edit/GCF_and_LCM_Calculator
  # @see project euler #5
  def LCM
    a=self.remove_factors
    t_val = a[0]
    for cnt in 0...a.length-1
      num1 = t_val
      num2 = a[cnt+1]
      tmp = [num1,num2].GCF
      t_val = tmp * num1/tmp * num2/tmp
    end
    t_val  
  end

  # brute force method:
  # http://www.cut-the-knot.org/Curriculum/Arithmetic/LCM.shtml
  # @see project euler #5
  def lcm2
    a=self.remove_factors
    c=a.dup
    while c.uniq.length>1
      index  = c.index_of_smallest  
      c[index]+=a[index]
    end
    c.first
  end

  # returns the kth Lexicographical permutation of the elements in the array
  # http://en.wikipedia.org/wiki/Permutation#Lexicographical_order_generation
  # @see project euler #24
  def lexicographic_permutation(k)
    k -= 1
    s = self.dup
    n = s.length
    n_less_1_factorial = (n - 1).factorial # compute (n - 1)!
    
    (1..n-1).each do |j|
      tempj = (k / n_less_1_factorial) % (n + 1 - j)

      s[j-1..j+tempj-1]=s[j+tempj-1,1]+s[j-1..j+tempj-2] unless tempj==0
      n_less_1_factorial = n_less_1_factorial / (n- j)
    end
    s
  end
  
  # returns ordered array of all the lexicographic permutations of the elements in the array
  # http://en.wikipedia.org/wiki/Permutation#Lexicographical_order_generation
  # @see project euler #24
  def lexicographic_permutations
    a=Array.new
    (1..self.length.factorial).each { |i| a << self.lexicographic_permutation(i) }
    a
  end
    
end

String Helpers

Last but not least, some String methods that just make things so much easier...

class String

  # sum of digits in the number
  # @see project euler #16, 20
  def sum_digits
    self.split('').inject(0) { |memo, c| memo + c.to_i }
  end

  # product of digits in the number
  # @see project euler #8
  def product_digits
    self.split('').inject(1) { |memo, c| memo * c.to_i }
  end
  
  #
  # @see project euler #4, 36, 91
  def palindrome?
    self==self.reverse
  end 

  # returns an array of all the character rotations of the string
  # @see project euler #35
  def rotations
    s = self
    rots = Array[s]
    (1..s.length-1).each do |i|
      s=s[1..s.length-1]+s[0,1]
      rots << s
    end
    rots 
  end
    
end

With all the above in place - and with the aid of a few brain cells - some deceptively complicated questions (like "How many different ways can £2 be made using any number of coins?") are essentially one-liners:

require 'benchmark'
require 'MyMath'

Benchmark.bm do |r|
  r.report {
    answer = 200.integer_partitions([200,100,50,20,10,5,2,1])
  }
end

Love it;-)

Tuesday, October 07, 2008

Best Oracle OpenWorld 2008 quote so far..

WSJ blogger Ben Worthen on the HP Oracle Exadata Storage Server:

Oracle now has a device that is kind of like an iPod except that it is a lot bigger and a lot more expensive and not as cool.

"... priced at US$4,000 per terabyte of storage, plus the database license costs" Oh my.

Sunday, October 05, 2008

On context paths in Java EE

I was recently involved (tangentially) in an exercise to migrate a tomcat-based JSP application to Java EE packaging, which had me taking a fresh look at the concept of context paths and considering best practices for handling them.

When you package and deploy a Java EE web application, it has a context-root which effectively becomes the path on your application server that the application is available under. For example, the following module definition would setup MyApp to be available under http://server.domain/myapp/

<module>
<web>
<web-uri>MyApp.war</web-uri>
<context-root>myapp</context-root>
</web>
</module>

Context paths make it possible to host many applications on the one server as long as you keep the paths unique. As advised in Build to Spec! Part II:

Always specify a unique context root for every Web application you deploy to avoid naming collisions with applications already deployed.

It is possible to install an application with a context root of "/" but there are two considerations to bear in mind:

Applications servers will usually have a default application already installed under "/" which would need to be removed first.

The reason why the default application exists is to host resources that are not packaged as a web application (which may or may not be a concern, depending on what you are serving).

Problems arise when applications are coded with implicit assumption or explicit reference to the context path they will run under.

This is often the case - as I discovered - when migrating simple JSP applications to Jave EE packaging. Either the app assumes it will run from "/", or it has hard-coded paths under the root.

It can also occur in applications designed to be packaged as a .war or .ear file, if the developer assumes that the context-path will remain the same and does some hard-coded shortcuts. This breaks the Java EE separation of duties design, and prevents the system administrator from chosing to deploy the application on another context path (if, for example, there is a conflict with another application).

What the Specs Say

The JSR 53: JavaTM Servlet 2.3 and JavaServer PagesTM 1.2 Specifications define the context path concept, and some relevant API features.

Firstly there is getContextPath() which allows you to obtain the context path.

There's always been some doubt as to how sendRedirect() should behave though, but now that is cleared up in the 2.3 spec. From Servlet 2.3: New features exposed

And finally, after a lengthy debate by a group of experts, Servlet API 2.3 has clarified once and for all exactly what happens on a res.sendRedirect("/index.html") call for a servlet executing within a non-root context. The issue is that Servlet API 2.2 requires an incomplete path like "/index.html" to be translated by the servlet container into a complete path, but doesn't say how context paths are handled. If the servlet making the call is in a context at the path "/contextpath," should the redirect URI translate relative to the container root (http://server:port/index.html) or the context root (http://server:port/contextpath/index.html)? For maximum portability, it's imperative to define the behavior; after lengthy debate, the experts chose to translate relative to the container root. For those who want context relative, you can prepend the output from getContextPath() to your URI.

How to Retrofit Correct Context Path Handling

Migrating an application to Java EE packaging can be a it of a nightmare if url references are not nicely relative, and avoid any assumptions about the full path to the application.

Obviously, in this situation you probably can't avoid going in to clean up the code at some point.

But there are some tricks that can be used to delay that activity.

I've experimented with using servlet filters to do rewrites on the oubound HTML, and that seems to work fine. The filter intercepts all the output of the application, and can be used to fixup text/html or css using regex replacements, and even change the sendRedirect behaviour if desired. But it does introduce some overhead, and I wouldn't see it as a permanent solution. (see EnforceContextRootFilter-1.0-src.zip for all the lowdown on the servlet filter approach, including code you can use and adapt if it meets your needs)

Saturday, October 04, 2008

restful_authentication and OpenID with Rails 2

Note to self: Yes! managed to navigate the various OpenID resources for rails and managed to successfully setup OpenID with restful_authentication on Rails 2.1.

There are a few tricks to be aware of. Prompted by a question on stackoverflow.com, I thought I would post my notes here.

Installing restful_authentication

Usually, you will be plugging open_id_authentication into your app's identity management framework. One of the most popular - and the one I've been using of late - is restful_authentication.

Getting restful_authentication working seems to require a few tweaks however - at least for the latest version found up on github.

Firstly, I've had no success directly installing it using the plugin script like this:

ruby script/plugin install git://github.com/technoweenie/restful-authentication.git

Rather, I've had to manually clone the github repository and manually setup the plugin directory:

$ git clone git://github.com/technoweenie/restful-authentication.git
$ mv restful-authentication/ myapp/vendor/plugins/restful_authentication
$ rm vendor/plugins/restful_authentication/.gitignore
$ rm -fR vendor/plugins/restful_authentication/.git

Note that I'm cleaning up some of the git bits, and also changing the plugin directory name to use an underscore rather than a hyphen (Rails usually complains to me otherwise).

Setting up restful_authentication is then pretty straight-forward:

$ mkdir lib
$ ruby script/generate authenticated user sessions
$ rake db:migrate

Note: if you don't already have the 'lib' directory in your application, it must be created first, otherwise the generator will fail.

Installing open_id_authentication

Ryan Bates' Railscast on Openid is the best thing I've found to follow. Even though it was recorded with Rails 1.2.3, I've been able to successfully follow the tutorial with Rails 2.1.0. The only point to note is that for:

gem install ruby-openid

I installed 2.1.2, rather than the 1.1.4 used in railscast.

Installing open_id_authentication is then a doddle.

$ ruby script/plugin install open_id_authentication

Follow the Railscast from this point to integrate OpenID with restful_authentication.

Using restful_authentication on heroku

The plugin installation problems mentioned above also mean that you need to use a few tricks to get it working in a heroku-hosted application. I've found it best to clone your heroku app and add the restful_authentication plugin locally, and then git push it back to heroku when done.

my recent reads..