Friday, December 26, 2008

Why don't Array.first() and Array.last() accept blocks?

It would be nice if Array.first and Array.last took block parameters. That way you could get the first or last item in the array based on a block condition. Example:
>> [1,2,3].first {|x| x > 1 }
=> 2

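As with most things in Ruby, it's super easy to implement:
class Array
 # Note: this redefinition drops the built-in first(n) / last(n) forms.
 def first
   each do |x|
     return x if !block_given? || yield(x)
   end
   nil # nothing matched (or the array is empty); otherwise each would return self
 end
 def last
   reverse_each do |x|
     return x if !block_given? || yield(x)
   end
   nil # nothing matched (or the array is empty)
 end
end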

a = [1,2,3]
puts a.first            # => 1
puts a.first {|x| x>1 } # => 2
puts a.last             # => 3
puts a.last  {|x| x<2 }  # => 1
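
For comparison, much the same effect is available without reopening Array: Enumerable#find (also available as detect) returns the first element for which the block is truthy, and calling it on reverse_each gives the last match. A minimal sketch, with variable names chosen only for illustration:

```ruby
a = [1, 2, 3]

# First element for which the block is truthy, or nil if none matches.
first_match = a.find { |x| x > 1 }

# Last matching element: walk the array backwards and take the first hit.
last_match = a.reverse_each.find { |x| x < 3 }

# Nothing matches, so find returns nil.
no_match = a.find { |x| x > 10 }

puts first_match.inspect # => 2
puts last_match.inspect  # => 2
puts no_match.inspect    # => nil
```

The block-taking first/last above read a little nicer, but find works on any Enumerable and leaves the core class alone.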


Here's how to implement this in the MRI interpreter (not as beautiful):
static VALUE
rb_ary_first(argc, argv, ary) 
    int argc;
    VALUE *argv;
    VALUE ary; 
{
    if (argc == 0) { 
        if (rb_block_given_p()) {
                long i;
                RETURN_ENUMERATOR(ary, 0, 0);
                for (i=0; i<RARRAY(ary)->len; i++) {
                        if (RTEST(rb_yield(RARRAY(ary)->ptr[i]))) {
                                return RARRAY(ary)->ptr[i];
                        }
                }
                return Qnil;
        } else {
                if (RARRAY(ary)->len == 0) return Qnil;
                return RARRAY(ary)->ptr[0];
        }
    }    
    else {
        return ary_shared_first(argc, argv, ary, Qfalse);
    }
}

Quicksort

Array#sort works great, but it's still fun to write a quicksort in Ruby!
class Array
 def quicksort
   return self if length <= 1

   pivot, *rest = self # destructure instead of shift, which would mutate the receiver
   less, greater = rest.partition {|x| x <= pivot}

   less.quicksort + [pivot] + greater.quicksort
 end
end

puts [8,99,4,1000,1,2,3,100,5,6].quicksort.join(',')
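
A quick way to gain some confidence in a hand-rolled sort is to compare it against Array#sort on random input. The sketch below assumes a standalone quicksort(items) method (a hypothetical variant of the code above, written as a plain method so the check runs on its own) and also verifies that the input array is left untouched:

```ruby
# Hypothetical standalone variant, so the check does not depend on reopening Array.
def quicksort(items)
  return items.dup if items.length <= 1

  pivot, *rest = items # destructuring leaves the caller's array untouched
  less, greater = rest.partition { |x| x <= pivot }

  quicksort(less) + [pivot] + quicksort(greater)
end

rng = Random.new(42) # fixed seed so any failure is reproducible
100.times do
  input = Array.new(rng.rand(0..20)) { rng.rand(-50..50) }
  snapshot = input.dup

  raise "mismatch for #{input.inspect}" unless quicksort(input) == input.sort
  raise "input was mutated" unless input == snapshot
end

puts "quicksort agrees with Array#sort"
```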

Wednesday, December 10, 2008

Paged Enumerable

I was scraping some pages with WWW::Mechanize and found myself needing to download multiple pages, so I wrote a paged enumerable module. PagedEnumerable does everything Enumerable does, except that instead of implementing a meaningful each you implement a meaningful each_page, which yields an array of items per page.

To use the paged enumerable, you would write a class like this:
class MultiplePageSearch
  include PagedEnumerable

  def each_page
    10.times { |page| yield download_page(page) } # simulate 10 pages
  end 

private
  def download_page(page)
    puts "downloading page #{page}..."
    sleep 1 # simulate slooow operation
    start = page*10

    start...(start+10)
  end 
end

paged = MultiplePageSearch.new
puts paged.any? {|x| x > 50} # stops after 6 pages (0 through 5)
paged.each {|x| puts x} # will process all pages

The implementation of PagedEnumerable is quite simple:
module PagedEnumerable  
  def self.included(obj)
    obj.send :include, Enumerable
  end 

  def each(&blk)
    each_page { |page| page.each(&blk) }
  end
end

In most cases you would also cache the pages, so that a second pass does not download them again.
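
One way the caching could look is sketched below. This is only an illustration, not the code I actually used: CachedPageSearch, fetch_page and the downloads counter are made-up names, and the module is repeated so the snippet runs on its own.

```ruby
# Same module as above, repeated so this sketch is self-contained.
module PagedEnumerable
  def self.included(base)
    base.send :include, Enumerable
  end

  def each(&blk)
    each_page { |page| page.each(&blk) }
  end
end

# Hypothetical example class: fetch_page stands in for a slow download.
class CachedPageSearch
  include PagedEnumerable

  attr_reader :downloads

  def initialize
    @cache = {}    # page number => array of items
    @downloads = 0 # counts real fetches, for demonstration only
  end

  def each_page
    # Fetch a page only the first time it is requested.
    10.times { |page| yield(@cache[page] ||= fetch_page(page)) }
  end

  private

  def fetch_page(page)
    @downloads += 1
    start = page * 10
    (start...(start + 10)).to_a
  end
end

search = CachedPageSearch.new
search.any? { |x| x > 50 } # fetches pages 0 through 5, then stops
search.to_a                # reuses those six pages, fetches the remaining four
puts search.downloads      # => 10
```

Because the cache is filled lazily inside each_page, an early exit such as any? still only pays for the pages it actually touched.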

Wednesday, December 3, 2008

Installing Git on CentOS 5.2 from sources

Installing Git SCM on CentOS 5.2

Before running these commands, replace VERSION with the version you wish to install. The latest version will usually do fine.

yum install openssl-devel curl-devel expat-devel -y
wget http://www.kernel.org/pub/software/scm/git/git-VERSION.tar.gz
tar xvf git-VERSION.tar.gz
cd git-VERSION
make
make install