Archive for October 2009
Netflix prize winners at McGill
Almost every Friday at 3pm there is a talk at McGill’s Computer Science department. Today’s was definitely one of the coolest: two members of the team that won the Netflix prize gave a talk on their machine learning techniques. Martin Chabbert and Martin Piotte, graduates of the Ecole Polytechnique de Montreal, were members of BellKor’s Pragmatic Chaos, the team that won the $1 million, 20 minutes before the deadline. One of the most interesting lessons from their fascinating talk was that, by the end, they had combined about 800 predictors using machine learning techniques and clever approximations. As they said, though, it became evident towards the end of the competition that the only way to win, that is, to hit the mark of a 10% improvement over Netflix’s movie recommendation algorithm, was to join forces with other teams, namely BellKor and BigChaos, which explains the weird team name. This meant sharing technical expertise, sharing predictors to be combined into an ensemble, and of course sharing the money.
Here are the technical documents that explain how they did it: Pragmatic Theory, BigChaos, BellKor.
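To get a feel for what "combining predictors" can mean, here is a minimal sketch of linear blending: given the ratings predicted by a few models, it finds the weights whose weighted sum has the smallest squared error. This is not the winning team's actual method (that is in the documents above); the toy ratings, the function names and the tiny ridge term are all assumptions made up for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_blend_weights(predictions, targets, ridge=1e-6):
    """Least-squares weights for a list of per-predictor rating lists.

    The small ridge term (an assumption, not from the talk) keeps the
    normal equations solvable when predictors are nearly identical.
    """
    k = len(predictions)
    gram = [[sum(p * q for p, q in zip(predictions[i], predictions[j]))
             + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * t for p, t in zip(predictions[i], targets))
           for i in range(k)]
    return solve_linear_system(gram, rhs)


def blend(predictions, weights):
    """Weighted sum of the individual predictions, one value per rating."""
    n = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions))
            for i in range(n)]


def rmse(pred, targets):
    """Root mean squared error, the metric used in the Netflix prize."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, targets))
                     / len(targets))


if __name__ == "__main__":
    # Made-up ratings: one predictor tends to overshoot, the other to undershoot.
    targets = [4.0, 3.0, 5.0, 2.0, 1.0, 4.0]
    pred_a = [4.4, 3.6, 5.3, 2.5, 1.2, 4.6]
    pred_b = [3.7, 2.5, 4.6, 1.4, 0.9, 3.5]
    weights = fit_blend_weights([pred_a, pred_b], targets)
    print("weights:", weights)
    print("RMSE a / b / blend:", rmse(pred_a, targets), rmse(pred_b, targets),
          rmse(blend([pred_a, pred_b], weights), targets))
```

Because the two toy predictors err in opposite directions, the blend does better than either one alone, which is the basic reason ensembles of many imperfect models can pay off. In practice the weights would be fitted on held-out ratings rather than on the data the predictors were trained on.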
Support vector machines
Yet another classification method used in machine learning. Here is the most accessible tutorial I found on this topic, by Tristan Fletcher at UCL. It might also be useful to see how you can use SVMs for regression (to predict continuous variables instead of classes). This technical report, by Steve Gunn at U of Southampton, gave me the most intuition among the tutorials I found on SVMs, along with Geoff Hinton’s notes. And if you are interested in a library that implements different SVMs, you might want to take a look at Shogun. It provides interfaces to MATLAB, R, Octave and Python. It seems like a pretty neat library, at least from what I can see on its website.
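For intuition only, here is a minimal sketch of a linear soft-margin SVM trained by subgradient descent on the hinge loss. It is not how the tutorials above or Shogun solve the problem (they work with the quadratic program and kernels); the function names, the hyperparameters and the toy points are assumptions chosen to keep the example short.

```python
def train_linear_svm(points, labels, lam=0.01, epochs=200, lr=0.1):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).

    Labels must be +1 or -1. Uses plain full-batch subgradient descent.
    """
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x by the side of the hyperplane w.x + b = 0 it falls on."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Two made-up, linearly separable clusters.
    points = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (3.0, 2.0),
              (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0), (-3.0, -2.0)]
    labels = [1, 1, 1, 1, -1, -1, -1, -1]
    w, b = train_linear_svm(points, labels)
    print("w =", w, "b =", b)
    print("prediction for (4, 1):", predict(w, b, (4.0, 1.0)))
```

Only the points with margin below 1 contribute to the update, which mirrors the idea that the solution depends on the support vectors rather than on every training point.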
Bagging and boosting
Bagging and boosting are two methods introduced in machine learning that aim to combine different classifiers into one classifier that outperforms its components. AdaBoost seems to be the most widely used boosting algorithm, or at least the one profs seem to be most excited about. A very detailed description of how and why it works can be found here. However, I find Chris Bishop’s chapter 14 in Pattern Recognition and Machine Learning much more accessible if you haven’t seen this material before.
I have the feeling that the more algorithms I am introduced to in my machine learning class, the fewer ways I know to solve problems, or at least, the less confident I am about which method is most appropriate. I guess that will come with time and experience, but still, I have to admit I have more respect now for statisticians.
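As a rough illustration of the boosting idea, here is a minimal sketch of AdaBoost with one-dimensional decision stumps as the weak classifiers. The data set, the function names and the choice of stumps are assumptions for the example; the references above cover the algorithm and its justification properly.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: `polarity` for x below the threshold, else the opposite."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0] + [(a + b) / 2
                                      for a, b in zip(values, values[1:])]
    best = None
    for thr in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, thr, polarity) != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best


def train_adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weak classifiers."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, thr, polarity = best_stump(xs, ys, weights)
        if err >= 0.5:  # no stump beats random guessing; stop early
            break
        err = max(err, 1e-12)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, thr, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, thr, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, thr, polarity)
                for alpha, thr, polarity in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Made-up labels that no single threshold can separate.
    xs = [float(i) for i in range(10)]
    ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
    for rounds in (1, 3):
        model = train_adaboost(xs, ys, rounds)
        mistakes = sum(ensemble_predict(model, x) != y for x, y in zip(xs, ys))
        print(rounds, "round(s):", mistakes, "training mistakes")
```

On this toy data a single stump cannot label every point correctly, while the weighted vote of three stumps can, because each round concentrates on the points the previous stumps got wrong.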
Thankfully, Prof. Aaron Hertzmann’s notes for CSC411 at UofT are really helpful and accessible for people who visit the machine learning jungle for the first time. Maybe after some years I’ll become a machine learning Tarzan, but for the moment I have to watch out not to fall prey to next week’s ML midterm.
The backpropagation algorithm
One of the lectures in my Machine Learning class briefly described the backpropagation algorithm used in neural networks. I wasn’t very satisfied with how the lecture notes described it, and there were many details I didn’t understand. Fortunately, this chapter from Raul Rojas’ Neural Networks: a Systematic Introduction does a much better and more detailed job of describing the algorithm. Hope it helps!
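For readers who like to see the algorithm in code, here is a minimal sketch of backpropagation for a network with one hidden layer, sigmoid units and a squared-error loss, together with a finite-difference check of the result. The network shape, the parameter layout and the function names are assumptions made for this example and are not taken from the chapter.

```python
import copy
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return (hidden activations, scalar output) for input vector x."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output


def loss(params, x, target):
    """Squared error 0.5 * (output - target)^2 for one example."""
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2


def backprop(params, x, target):
    """Gradients of the loss with respect to (w1, b1, w2, b2)."""
    w1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    # Error signal at the output unit: dL/dz = (o - t) * o * (1 - o).
    delta_out = (output - target) * output * (1.0 - output)
    grad_w2 = [delta_out * h for h in hidden]
    grad_b2 = delta_out
    # Propagate the error back through w2 and each hidden sigmoid.
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(w2, hidden)]
    grad_w1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = list(delta_hidden)
    return grad_w1, grad_b1, grad_w2, grad_b2


def numerical_grad_w1(params, x, target, i, j, eps=1e-6):
    """Central-difference estimate of dL/dw1[i][j], used as a sanity check."""
    plus = copy.deepcopy(params)
    minus = copy.deepcopy(params)
    plus[0][i][j] += eps
    minus[0][i][j] -= eps
    return (loss(plus, x, target) - loss(minus, x, target)) / (2 * eps)


if __name__ == "__main__":
    # Arbitrary starting weights for a 2-input, 2-hidden-unit, 1-output net.
    params = ([[0.1, -0.2], [0.4, 0.3]], [0.05, -0.05], [0.7, -0.6], 0.1)
    x, target = [1.0, 0.5], 1.0
    grad_w1, _, _, _ = backprop(params, x, target)
    print("backprop  dL/dw1[0][1]:", grad_w1[0][1])
    print("numerical dL/dw1[0][1]:", numerical_grad_w1(params, x, target, 0, 1))
```

Comparing the analytic gradient with a finite-difference estimate is a common way to catch mistakes in a hand-written backward pass before using it for training.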
Interested in going to Copenhagen for the Climate Change Convention (COP15)?
That’s the title of a Facebook message that arrived in my inbox today. It was sent by the University of Toronto’s Student Union (UTSU) and it basically urged interested students to apply for a position on UofT’s student delegation to the Copenhagen Climate Change Conference in December. The message included a link to UofT’s Environmental Resource Network, which explained all the details of the application process. Extended application deadline: October 5th (that is tomorrow!).
While I am not a UofT student any more, I couldn’t help but wonder how many students were notified of this opportunity at the beginning of September, when they were choosing their courses. The convention takes place from December 5th to 18th, 2009, which is exactly the exam period for most North American universities. The application page mentions that students who wish to participate will probably have to ask their registrar’s office to defer their December exams and take them in February, a request that might or might not be granted.
This brings up some interesting questions: how many students will realistically be able to attend the conference, and is the University willing to make it easier for them to attend? Will the university grant exam deferrals for this reason? As far as I know, exam deferrals are granted for medical (or other) emergencies, but I might be mistaken. What other support will this delegation need from the university?
And, since every time you ask for something you’d better have something to “bring to the table,” what role does the UofT delegation hope to play in the convention? Will they participate in peaceful demonstrations? Are they going to meet with delegations from other universities? Are they going to act as journalists, providing the latest news to the UofT community through blogging? Are they going to be given the opportunity to actually talk to any politicians? (I’m not referring to heads of state, of course, but perhaps to some of their representatives…) How many people does this delegation need to have in order to accomplish its goals?
I have no idea how things like this are organized, or how the voices of all these delegations are going to reach the ears of politicians, but I sincerely hope that they have some sort of impact, and that everyone who participates will be committed enough not to regard this trip solely as an opportunity for tourism. There are far better places to visit in December than Copenhagen.
PS: the Climate Thinkers Blog might be of interest.