12
Apr
2012
MSc thesis poster and complete text

My initial thought was to start writing posts so as to explain and document my thesis. However  as a first step I decided to post here the thesis poster and upload the complete text of my thesis on arxiv.

Read more

18
Mar
2012
programming bread slicing

In the dark ages of programming, functions acted on data. To slice your bread, you passed a bread data structure to a slice function:

slice(bread);

Then came object oriented programming. Instead of having an external function slice our bread, we would ask the bread to slice itself by calling the slice method on a bread object:

bread.slice();

Obviously a vast improvement.
Now object oriented programming has become more refined. First we create a bread-slicing object and then we simply pass bread objects to the slice method on the bread-slicer:

BreadSlicer slicer = new BreadSlicer();
slicer.slice(bread);

via John Cook

update

There is an argument against the final improvement. Travis Griggs advices not to make objects with -er, which probably will act the same way functions act (applying a function over an argument). You can read about it here

06
Mar
2012
Memento decorator in python and fibonacci sequence

Computing fibonacci with the naive method takes exponential time. However, it’s known that by using dynamic programming you can compute fibonacci in linear time. The trick is to check if you have already computed the value of f(a), and if yes return the already-computed value instead of calling the function again (which is expensive).

Here is the naive implementation in python :

def fib( n ) :
	if n <= 1 :
		return 1
	return fib(n-1) + fib(n-2)

Calculating fibonacci sequence from 1 to 35 takes about 8 seconds (8.056s). So, let's how we can improve this by using decorators.

The Memento decorator will allow us to save time by not calling a function which we've already called.

Here is an implementation of Memento decorator:

class Memento :
	__mem__ = {}
	def __init__(self, f) :
		self.f = f

	def __call__(self, arg) :
		if arg not in self.__mem__ :
			self.__mem__[arg] = self.f(arg)
		return self.__mem__[arg]

Let's see what it does. The __init__ function is pretty obvious. It assigns the function f to an object variable. The __call__ function is triggered every time the decorated function is called. By using a dictionary we check and return the result of a function call with the same argument (if we've already computed that). The cost of querying the dictionary is almost O(1).

By just applying this decorator to our function

@Memento
def fib( n ) :
	if n <= 1 :
		return 1
	return fib(n-1) + fib(n-2)

we manage to compute the first 1000 fibonacci numbers in less than a second.

Here is a gist of this example https://gist.github.com/1988734 .

03
Mar
2012
Loop detection in O(n) using Floyd’s algorithm

Given a linked list find whether the list contains a loop or not.

There are several solutions to this but the following is considered one of the best. Also it’s dead simple. The algorithm is also known as “Tortoise and Hare Algorithm

def toirtoise_and_hare(l) :
	tortoise = l.head
	hare = l.head
	steps = 0
	while True:
		if hare == None :
			return (steps, 'No loop found')
		hare = hare.next
		if hare == None :
			return (steps, 'No loop found')
		hare = hare.next
		tortoise = tortoise.next
		if hare == tortoise:
			#print hare, tortoise
			return (steps, 'Loop found')
		steps += 1

Read more

02
Mar
2012
Review: Collaborative Filtering for Implicit Feedback Datasets

Introduction

Review of “Collaborative Filtering For Implicit Feedback Datasets

Collaborative Filtering is a method for providing recommendations based on preferences fetched from users about items (products, shows, tracks, etc. ). Recommender systems utilise such techniques for personalised suggestions using past history of users. For example, Amazon use a type of Collaborative Filtering, called item-based collaborative filtering. According this method, relationships between pair of items are being extracted and combined with user’s data (purchase history) so as to infer recommendations.

As we saw in the previous example, users provide feedback to Amazon about items by rating and buying. This (explicit) feedback is characterised by the fact that the users explicitly define what they like (by buying it or rating positively) and what they don’t (by giving low ratings). However, there are cases that this kind of feedback is not available. For example, when a user is listening to a track or an album, without rating it, it is difficult to know if he liked it or not. This kind of feedback is known as implicit feedback and its main characteristic is that it describes users’ behaviours. As the authors argue, there are several implications that arise from the lack of explicit feedback. Read more

21
Jan
2012
name2gender in python

Given a name or an email address how can you guess the gender ? The answer is as simple as that. It’s all about data.

US Social Security Administration office http://www.ssa.gov/ created a service where you can request the number of births of a babies per name per gender since 1880. However, their pages are not easily accessible. After searching a while I found this page http://www.infochimps.com/datasets/popular-baby-names-by-year-top-1000-us-social-security-administr where is given a method of how to scrape SAS name list with a bash script.

!#/bin/sh
url=http://www.ssa.gov/cgi-bin/popularnames.cgi
mkdir -p names
for ((yr=1879 ; $yr <= 2010 ; yr++)) ; do
	echo $yr
	curl -d "year=$yr&top=1000&number=n" $url > names/$yr.html
done

Alternatively you can just download the data from infochimp’s website.

However this is raw information and thus not very helpful. What we need is to have them in a data structure which will enable us to query the gender per name (ideally a distribution).

I used BeautifulSoup to parse the page and extract records.

import glob
from BeautifulSoup import BeautifulSoup

files = glob.glob('names/*.html')

for f in files :
	html_data = open( f ,'r').read()

	soup = BeautifulSoup(html_data)
	year = soup.find(id="yob")['value']
	tables = soup.findAll('table')
	trs = tables[2].findAll('tr')
	for tr in trs[1:-1]:
		tds = tr.findAll('td')
		print "%s,%s,%s,%s" % (
                       tds[1].contents[0],
                       tds[0].contents[0].replace(',',''),
                       'male',
                       year)
		print "%s,%s,%s,%s" % (
                       tds[3].contents[0],
                       tds[2].contents[0].replace(',',''),
                       'female',
                       year)

I stored the output to names.csv .

Then, to transform this information to a probability distribution per name I used the following code.

import json 

def prob( m, f ) :
    s = m + f
    return {'male':m/(1.0*s), 'female':f/(1.0*s)}

def load_data( file ) :
    names = {}
    f = open( file, 'r' )
    for l in f :
        d = l.rstrip().split(',')

        name = d[0]
        counter = d[1]
        gender = d[2]
        year = d[3]

        if name not in names :
            names[name] = { 'male':0, 'female':0 }

        if gender == 'male' :
            names[name]['male'] += int(counter)
        else :
            names[name]['female'] += int(counter)

    return names

db = load_data('names.csv')

names = {}
for d in db:
    p = prob( db[d]['male'], db[d]['female'])
    if p['male'] > p['female'] :
        gender = 'male'
    elif p['female'] > p['male'] :
        gender = 'female'
    else:
        gender = 'both'

    names[d] = gender

print json.dumps( names )

I saved this to names.json.

Finally, to query the gender of a name you just compare the probabilities

f = open('names.json','r')
names = json.loads( f.read() );
print names
def check_name( name ):
	if name in names :
                if names[name]['male']>names[name]['female']:
                         return 'male';
                elif names[name]['male']>names[name]['female']:
                         return 'female';
                else :
                         return 'unknown';
	else :
		return 'unknown'

The probability approach gives the flexibility to compute also a confidence of the gender for a given name [TODO].

28
May
2011
Robot rock

After three weeks of development we finally finished our project in robotics. The task was to construct a robot from lego parts from the lego NXT kit. It wasn’t so straighforward how to achieve that since a bad design could cause a lot of trouble in the end. Finally, we ended up with a simple tripod.

Then, the software part was to programm the robot to perform a taks, known as global localization. In simple words, imagine that you are given a map but you have no idea about the position of the robot in the map. Then, by using a ultrasound sensor you have to find the position and the using this information to drive the robot to a target. As you see, an error in the estimation can be disastrous.

The method we used for the localization part is know as Monte Carlo Localization. This method (uniformly) samples a number of hypotheses (particles) and then by comparing the measurements with the hypothetical measurements we re-sample hypotheses by weightening them so as the more consistent a measurement is, the more probabilities are that a hypothesis will be re-chosen.

After a serveral iterations of the MCL, an estimation of the position is found. Then, using this estimation, with the hope that it is also the correct position, we guide the robot to the target. So, how do we chose how to get from a position to another ? A simple and safe solution to that was to compute the Delauney triangulation of the map and then compute the dual graph of it, G'. Then we find the closest node of the dual graph to the robot and to the target. So, we have a starting position, a node (A) of G' that is close to the starting point, a node (B) of G' that is close to the target and a path from A to B on G'. Voila, this is the path that we have to follow so as to be as safe as possible that we won’t hit on a wall.

Well, here it is on action.

That was fun :D

15
May
2011
Sol Robeson wise words

Sol Robeson: Have you met Archimedes? The one with the black spots, you see? You remember Archimedes of Syracuse, eh? The king asks Archimedes to determine if a present he’s received is actually solid gold. Unsolved problem at the time. It tortures the great Greek mathematician for weeks – insomnia haunts him and he twists and turns in his bed for nights on end. Finally, his equally exhausted wife – she’s forced to share a bed with this genius – convinces him to take a bath to relax. While he’s entering the tub, Archimedes notices the bath water rise. Displacement, a way to determine volume, and that’s a way to determine density – weight over volume. And thus, Archimedes solves the problem. He screams “Eureka” and he is so overwhelmed he runs dripping naked through the streets to the king’s palace to report his discovery

Sol Robeson: Now, what is the moral of the story?

Maximillian Cohen: That a breakthrough will come.

Sol Robeson: Wrong! The point of the story is the wife. You listen to your wife, she will give you perspective, meaning. You need a break, you have to take a bath or you will get nowhere.

From Aronovsky’s Pi. I’ve forgotten how much I enjoyed that movie.

24
Feb
2011
Real-time jam session system

I’ve always been fascinated by arts and science. But, what was getting me excited most was their combination and one of my ambitions is to achieve that. As a computer scientist I couldn’t find a better way to fulfill this ambition. I finally chose my MSc project which combines my two passions. Algorithms and music.

As the title suggests, my project is to research and develop a system which will make use of machine learning and music information retrieval so as to understand and then generate music automatically, simulating an improvising musician. In short, a system which will improvise in a jam session.

There are several ideas to begin with. The first milestone is to construct a system which will use simple midi input and fixed tempo, like the system suggested in [1]. However, the goal is to be able to use real instruments and beat to be extracted by rhythm instruments (take a look at [2] and [3] for real-time feature extraction).

Here is a list of the tools and languages i am planning to use:

Feel free to leave a comment, to make a suggestion or just tell me what you think.

References

1. Kitahara, Tetsuro, Naoyuki Totani, R. Tokuami, and H. Katayose. 2010. €œBayesianBand: Jam Session System Based on Mutual Prediction by User and System.€ Entertainment Computing €“ICEC 2009

2. Stark, A.M., and M.D. Plumbley. 2009. “Real-time chord recognition for live performance.”in Proceedings of International Computer Music Conference.

3. Stark, Adam M, Matthew E P Davies, and Mark D Plumbley. 2009. “REAL-TIME BEAT-SYNCHRONOUS ANALYSIS OF MUSICAL AUDIO Centre for Digital Music Queen Mary University of London London , United Kingdom.” Analysis 1-6.

 


18
Feb
2011
Simple CAPTCHA solver in python

This is a very simple method for exploiting very simple CAPTCHAs  like those proposed here and here .

In this example we are going to use the following images.


test image

It’s easy to observe the followings. First of all, a fixed size (monospace) font has been used. This makes extracting all the letters and using them as masks to check each digit, one by one, very easy. Also, the alphabet is simple lowercase hexadecimal letters. Thus, we had to extract only 16 letters.

The first part was to to extract all the letters. To achieve that, first of all we sampled several images so as to be sure that the images we have contains all the 16 letters. Then, using a simple image editor we cropped all the letters, one by one. We had to be careful so all the letters be aligned properly. Here is the final mask.


mask

As you can notice there is some noise which we have to remove. After playing with several techniques we finally ended to the following. We turned the image to greyscale. Then we used a threshold to remove some of the noise. Here is the example after the filtering (cropping also applied).


after filtering

So, now we have the image almost cleared and some letters to play with.

Procedure

Move each letter across the image and take the difference of the pixels for each position and sum them. Thus for each position we have a score of how much the letter (mask) fits the letter behind it. Then, store for each letter the position where the maximum score found. Then sort by score, take the top five results (our captcha is five letters) and finally sort by position. The result is the CAPTCHA text.

More formally

Let

d(I,l,o)=\sum_{0\leq i \leq W \\ 0 \leq j \leq H}{[I(o+i, j)-l(i,j)]}

be the distance of the image I, with the letter l in position o

Then

p(I,l) = \arg\max_{o}d(I,l,o)

Thus, we need 5 letters l_{1},l_{2},l_{3},l_{4},l_{5} with maximum d(l_{i},I,o) ordered by p(l_{i}, I).

def p(img, letter):
		A = img.load()
		B = letter.load()
		mx = 1000000
		max_x = 0
		x = 0
		for x in xrange(img.size[0]-letter.size[0]):
			sum = 0
			for i in xrange(letter.size[0]):
			    for j in xrange(letter.size[1]):
					sum = sum + abs(A[x+i, j][0] - B[i, j][0])
			if sum < mx :
				mx = sum
				max_x = x
		return (mx, max_x)

Here is the code which implements this method. You can browse and download everything from https://github.com/ptigas/simple-CAPTCHA-solver

from PIL import Image

def ocr(im, threshold = 200, aplhabet = "0123456789abcdef"):
	img = Image.open(im)
	img = img.convert("RGB")
	box = (8, 8, 58, 18)
	img = img.crop(box)
	pixdata = img.load()

	letters = Image.open('letters.bmp')
	ledata = letters.load()

	# Clean the background noise, if color != black, then set to white.
	for y in xrange(img.size[1]):
	    for x in xrange(img.size[0]):
			if not(pixdata[x, y][0] > threshold \
			and pixdata[x, y][1] > threshold \
			and pixdata[x, y][2] > threshold):
				pixdata[x, y] = (0, 0, 0, 255)
			else:
				pixdata[x, y] = (255, 255, 255, 255)

	counter = 0;
	old_x = -1;

	letterlist = []

	for x in xrange(letters.size[0]):
		black = True
		for y in xrange(letters.size[1]):
			if ledata[x, y][0] <> 0 :
				black = False
				break
		if black :
			if True :
				box = (old_x+1, 0, x, 10)
				letter = letters.crop(box)
				t = p(img, letter);
				print counter, x, t
				letterlist.append((t[0],aplhabet[counter], t[1]))
			old_x = x
			counter = counter + 1

	box = (old_x+1, 0, 140, 10)
	letter = letters.crop(box)
	t = p(img, letter)
	letterlist.append((t[0],aplhabet[counter], t[1]))

	t = sorted(letterlist)
	t = t[0:5] # 5-letter captcha

	final = sorted(t, key=lambda x: x[2])
	answer = ""
	for l in final:
		answer = answer + l[1]
	return answer

print ocr('test.jpg')

[update]

Today I found this. Very nice tutorial for CAPTCHA solving using python and vector space searching.