My initial thought was to start writing posts so as to explain and document my thesis. However as a first step I decided to post here the thesis poster and upload the complete text of my thesis on arxiv.
My initial thought was to start writing posts so as to explain and document my thesis. However as a first step I decided to post here the thesis poster and upload the complete text of my thesis on arxiv.
In the dark ages of programming, functions acted on data. To slice your bread, you passed a bread data structure to a slice function:
slice(bread);Then came object oriented programming. Instead of having an external function slice our bread, we would ask the bread to slice itself by calling the slice method on a bread object:
bread.slice();Obviously a vast improvement.
Now object oriented programming has become more refined. First we create a bread-slicing object and then we simply pass bread objects to the slice method on the bread-slicer:BreadSlicer slicer = new BreadSlicer(); slicer.slice(bread);
via John Cook
update
There is an argument against the final improvement. Travis Griggs advices not to make objects with -er, which probably will act the same way functions act (applying a function over an argument). You can read about it here
Computing fibonacci with the naive method takes exponential time. However, it’s known that by using dynamic programming you can compute fibonacci in linear time. The trick is to check if you have already computed the value of , and if yes return the already-computed value instead of calling the function again (which is expensive).
Here is the naive implementation in python :
def fib( n ) : if n <= 1 : return 1 return fib(n-1) + fib(n-2)
Calculating fibonacci sequence from 1 to 35 takes about 8 seconds (8.056s). So, let's how we can improve this by using decorators.
The Memento decorator will allow us to save time by not calling a function which we've already called.
Here is an implementation of Memento decorator:
class Memento :
__mem__ = {}
def __init__(self, f) :
self.f = f
def __call__(self, arg) :
if arg not in self.__mem__ :
self.__mem__[arg] = self.f(arg)
return self.__mem__[arg]
Let's see what it does. The __init__ function is pretty obvious. It assigns the function f to an object variable. The __call__ function is triggered every time the decorated function is called. By using a dictionary we check and return the result of a function call with the same argument (if we've already computed that). The cost of querying the dictionary is almost .
By just applying this decorator to our function
@Memento def fib( n ) : if n <= 1 : return 1 return fib(n-1) + fib(n-2)
we manage to compute the first 1000 fibonacci numbers in less than a second.
Here is a gist of this example https://gist.github.com/1988734 .
Given a linked list find whether the list contains a loop or not.
There are several solutions to this but the following is considered one of the best. Also it’s dead simple. The algorithm is also known as “Tortoise and Hare Algorithm”
def toirtoise_and_hare(l) : tortoise = l.head hare = l.head steps = 0 while True: if hare == None : return (steps, 'No loop found') hare = hare.next if hare == None : return (steps, 'No loop found') hare = hare.next tortoise = tortoise.next if hare == tortoise: #print hare, tortoise return (steps, 'Loop found') steps += 1
Review of “Collaborative Filtering For Implicit Feedback Datasets”
Collaborative Filtering is a method for providing recommendations based on preferences fetched from users about items (products, shows, tracks, etc. ). Recommender systems utilise such techniques for personalised suggestions using past history of users. For example, Amazon use a type of Collaborative Filtering, called item-based collaborative filtering. According this method, relationships between pair of items are being extracted and combined with user’s data (purchase history) so as to infer recommendations.
As we saw in the previous example, users provide feedback to Amazon about items by rating and buying. This (explicit) feedback is characterised by the fact that the users explicitly define what they like (by buying it or rating positively) and what they don’t (by giving low ratings). However, there are cases that this kind of feedback is not available. For example, when a user is listening to a track or an album, without rating it, it is difficult to know if he liked it or not. This kind of feedback is known as implicit feedback and its main characteristic is that it describes users’ behaviours. As the authors argue, there are several implications that arise from the lack of explicit feedback. Read more
Given a name or an email address how can you guess the gender ? The answer is as simple as that. It’s all about data.
US Social Security Administration office http://www.ssa.gov/ created a service where you can request the number of births of a babies per name per gender since 1880. However, their pages are not easily accessible. After searching a while I found this page http://www.infochimps.com/datasets/popular-baby-names-by-year-top-1000-us-social-security-administr where is given a method of how to scrape SAS name list with a bash script.
!#/bin/sh url=http://www.ssa.gov/cgi-bin/popularnames.cgi mkdir -p names for ((yr=1879 ; $yr <= 2010 ; yr++)) ; do echo $yr curl -d "year=$yr&top=1000&number=n" $url > names/$yr.html done
Alternatively you can just download the data from infochimp’s website.
However this is raw information and thus not very helpful. What we need is to have them in a data structure which will enable us to query the gender per name (ideally a distribution).
I used BeautifulSoup to parse the page and extract
import glob
from BeautifulSoup import BeautifulSoup
files = glob.glob('names/*.html')
for f in files :
html_data = open( f ,'r').read()
soup = BeautifulSoup(html_data)
year = soup.find(id="yob")['value']
tables = soup.findAll('table')
trs = tables[2].findAll('tr')
for tr in trs[1:-1]:
tds = tr.findAll('td')
print "%s,%s,%s,%s" % (
tds[1].contents[0],
tds[0].contents[0].replace(',',''),
'male',
year)
print "%s,%s,%s,%s" % (
tds[3].contents[0],
tds[2].contents[0].replace(',',''),
'female',
year)
I stored the output to names.csv .
Then, to transform this information to a probability distribution per name I used the following code.
import json
def prob( m, f ) :
s = m + f
return {'male':m/(1.0*s), 'female':f/(1.0*s)}
def load_data( file ) :
names = {}
f = open( file, 'r' )
for l in f :
d = l.rstrip().split(',')
name = d[0]
counter = d[1]
gender = d[2]
year = d[3]
if name not in names :
names[name] = { 'male':0, 'female':0 }
if gender == 'male' :
names[name]['male'] += int(counter)
else :
names[name]['female'] += int(counter)
return names
db = load_data('names.csv')
names = {}
for d in db:
p = prob( db[d]['male'], db[d]['female'])
if p['male'] > p['female'] :
gender = 'male'
elif p['female'] > p['male'] :
gender = 'female'
else:
gender = 'both'
names[d] = gender
print json.dumps( names )
I saved this to names.json.
Finally, to query the gender of a name you just compare the probabilities
f = open('names.json','r')
names = json.loads( f.read() );
print names
def check_name( name ):
if name in names :
if names[name]['male']>names[name]['female']:
return 'male';
elif names[name]['male']>names[name]['female']:
return 'female';
else :
return 'unknown';
else :
return 'unknown'
The probability approach gives the flexibility to compute also a confidence of the gender for a given name [TODO].
After three weeks of development we finally finished our project in robotics. The task was to construct a robot from lego parts from the lego NXT kit. It wasn’t so straighforward how to achieve that since a bad design could cause a lot of trouble in the end. Finally, we ended up with a simple tripod.
Then, the software part was to programm the robot to perform a taks, known as global localization. In simple words, imagine that you are given a map but you have no idea about the position of the robot in the map. Then, by using a ultrasound sensor you have to find the position and the using this information to drive the robot to a target. As you see, an error in the estimation can be disastrous.
The method we used for the localization part is know as Monte Carlo Localization. This method (uniformly) samples a number of hypotheses (particles) and then by comparing the measurements with the hypothetical measurements we re-sample hypotheses by weightening them so as the more consistent a measurement is, the more probabilities are that a hypothesis will be re-chosen.
After a serveral iterations of the MCL, an estimation of the position is found. Then, using this estimation, with the hope that it is also the correct position, we guide the robot to the target. So, how do we chose how to get from a position to another ? A simple and safe solution to that was to compute the Delauney triangulation of the map and then compute the dual graph of it, . Then we find the closest node of the dual graph to the robot and to the target. So, we have a starting position, a node (A) of
that is close to the starting point, a node (B) of
that is close to the target and a path from A to B on
. Voila, this is the path that we have to follow so as to be as safe as possible that we won’t hit on a wall.
Well, here it is on action.
That was fun
Sol Robeson: Have you met Archimedes? The one with the black spots, you see? You remember Archimedes of Syracuse, eh? The king asks Archimedes to determine if a present he’s received is actually solid gold. Unsolved problem at the time. It tortures the great Greek mathematician for weeks – insomnia haunts him and he twists and turns in his bed for nights on end. Finally, his equally exhausted wife – she’s forced to share a bed with this genius – convinces him to take a bath to relax. While he’s entering the tub, Archimedes notices the bath water rise. Displacement, a way to determine volume, and that’s a way to determine density – weight over volume. And thus, Archimedes solves the problem. He screams “Eureka” and he is so overwhelmed he runs dripping naked through the streets to the king’s palace to report his discovery
Sol Robeson: Now, what is the moral of the story?
Maximillian Cohen: That a breakthrough will come.
Sol Robeson: Wrong! The point of the story is the wife. You listen to your wife, she will give you perspective, meaning. You need a break, you have to take a bath or you will get nowhere.
From Aronovsky’s Pi. I’ve forgotten how much I enjoyed that movie.
I’ve always been fascinated by arts and science. But, what was getting me excited most was their combination and one of my ambitions is to achieve that. As a computer scientist I couldn’t find a better way to fulfill this ambition. I finally chose my MSc project which combines my two passions. Algorithms and music.
As the title suggests, my project is to research and develop a system which will make use of machine learning and music information retrieval so as to understand and then generate music automatically, simulating an improvising musician. In short, a system which will improvise in a jam session.
There are several ideas to begin with. The first milestone is to construct a system which will use simple midi input and fixed tempo, like the system suggested in [1]. However, the goal is to be able to use real instruments and beat to be extracted by rhythm instruments (take a look at [2] and [3] for real-time feature extraction).
Here is a list of the tools and languages i am planning to use:
Feel free to leave a comment, to make a suggestion or just tell me what you think.
References
1. Kitahara, Tetsuro, Naoyuki Totani, R. Tokuami, and H. Katayose. 2010. BayesianBand: Jam Session System Based on Mutual Prediction by User and System. Entertainment Computing ICEC 2009
2. Stark, A.M., and M.D. Plumbley. 2009. “Real-time chord recognition for live performance.”in Proceedings of International Computer Music Conference.
3. Stark, Adam M, Matthew E P Davies, and Mark D Plumbley. 2009. “REAL-TIME BEAT-SYNCHRONOUS ANALYSIS OF MUSICAL AUDIO Centre for Digital Music Queen Mary University of London London , United Kingdom.” Analysis 1-6.
This is a very simple method for exploiting very simple CAPTCHAs like those proposed here and here .
In this example we are going to use the following images.
![]()
test image
It’s easy to observe the followings. First of all, a fixed size (monospace) font has been used. This makes extracting all the letters and using them as masks to check each digit, one by one, very easy. Also, the alphabet is simple lowercase hexadecimal letters. Thus, we had to extract only 16 letters.
The first part was to to extract all the letters. To achieve that, first of all we sampled several images so as to be sure that the images we have contains all the 16 letters. Then, using a simple image editor we cropped all the letters, one by one. We had to be careful so all the letters be aligned properly. Here is the final mask.

mask
As you can notice there is some noise which we have to remove. After playing with several techniques we finally ended to the following. We turned the image to greyscale. Then we used a threshold to remove some of the noise. Here is the example after the filtering (cropping also applied).

after filtering
So, now we have the image almost cleared and some letters to play with.
Move each letter across the image and take the difference of the pixels for each position and sum them. Thus for each position we have a score of how much the letter (mask) fits the letter behind it. Then, store for each letter the position where the maximum score found. Then sort by score, take the top five results (our captcha is five letters) and finally sort by position. The result is the CAPTCHA text.
More formally
Let
be the distance of the image , with the letter
in position
Then
Thus, we need 5 letters with maximum
ordered by
.
def p(img, letter): A = img.load() B = letter.load() mx = 1000000 max_x = 0 x = 0 for x in xrange(img.size[0]-letter.size[0]): sum = 0 for i in xrange(letter.size[0]): for j in xrange(letter.size[1]): sum = sum + abs(A[x+i, j][0] - B[i, j][0]) if sum < mx : mx = sum max_x = x return (mx, max_x)
Here is the code which implements this method. You can browse and download everything from https://github.com/ptigas/simple-CAPTCHA-solver
from PIL import Image
def ocr(im, threshold = 200, aplhabet = "0123456789abcdef"):
img = Image.open(im)
img = img.convert("RGB")
box = (8, 8, 58, 18)
img = img.crop(box)
pixdata = img.load()
letters = Image.open('letters.bmp')
ledata = letters.load()
# Clean the background noise, if color != black, then set to white.
for y in xrange(img.size[1]):
for x in xrange(img.size[0]):
if not(pixdata[x, y][0] > threshold \
and pixdata[x, y][1] > threshold \
and pixdata[x, y][2] > threshold):
pixdata[x, y] = (0, 0, 0, 255)
else:
pixdata[x, y] = (255, 255, 255, 255)
counter = 0;
old_x = -1;
letterlist = []
for x in xrange(letters.size[0]):
black = True
for y in xrange(letters.size[1]):
if ledata[x, y][0] <> 0 :
black = False
break
if black :
if True :
box = (old_x+1, 0, x, 10)
letter = letters.crop(box)
t = p(img, letter);
print counter, x, t
letterlist.append((t[0],aplhabet[counter], t[1]))
old_x = x
counter = counter + 1
box = (old_x+1, 0, 140, 10)
letter = letters.crop(box)
t = p(img, letter)
letterlist.append((t[0],aplhabet[counter], t[1]))
t = sorted(letterlist)
t = t[0:5] # 5-letter captcha
final = sorted(t, key=lambda x: x[2])
answer = ""
for l in final:
answer = answer + l[1]
return answer
print ocr('test.jpg')
[update]
Today I found this. Very nice tutorial for CAPTCHA solving using python and vector space searching.