I absolutely recommend Russ Tedrake‘s online course on Underactuated Robotics (mostly nonlinear control) and Magnus Egerstedt’s course on linear control. The latter is not a formal prerequisite of the former, but it helps if you know some of the concepts. Russ’ course is just amazing, and extremely rich in mathematical concepts and algorithmic tools. He insists on providing visual intuition for almost every single concept that he introduces, which is great. I’ve recommended it to everyone who cared to listen. Some of my friends and I are currently taking the course.
This post might be a bit specialized, yet vague and hand-wavy, but it reflects some of the discussions that I have been having in the last couple of days. The discussions were prompted by the papers of Jonathan Kelly, who was visiting McGill to give a research talk on his vision + IMU state estimation work. In this line of work, one deals with the system of a monocular camera + IMU. The state of this system usually includes the following: the position of the IMU in some global frame of reference, the rotation of the IMU in that global frame, the two bias vectors of the IMU (one for the gyroscope and one for the accelerometer), and the velocity of the IMU in the global frame. The measurements of the system are the IMU’s linear acceleration and angular velocity, and the feature matches in consecutive images recorded from the camera. This system has been shown to be observable, in the sense that the outputs of the system are sufficient to estimate the (hidden) state of the system.
This is what is still unclear to me: how many measurements do we need and what should they be to “excite all the degrees of freedom”? How can we characterize the quality of the measurements that we have gotten with respect to how much they enable us to infer and not infer about the system?
To me this sounds like a sampling and coverage problem, where the measurements (the samples) need to be selected in such a way that they cover some space associated with the dynamics of the system. Each measurement would be associated with an information gain, and we would say that no more measurements are necessary when all the “information space” has been almost covered. Something like this would have strong links with sampling-based path planning. In fact, there have been papers that have considered problems such as path planning for a robot so that the path gives rise to the lowest positional uncertainty at the end, but I’m not sure that this is the same as saying that the “information space” has been covered.
Or planning that takes into account velocity, acceleration, momentum, force bounds, as well as the dynamics of the vehicle. For example, lateral motion is possible for an omnidirectional robot, but not for a car. This sub-field of planning is very closely related to and borrows ideas from optimal control, however, it always takes into consideration the presence of static or dynamic obstacles in the environment.
- One issue with existing solutions to this problem, such as kinodynamic RRT and RRT* is that they assume a Steer() function which essentially does optimal control from one state to another nearby one (often without taking into account obstacles). As far as I know, this is hard to do even for Dubins and Reeds-Shepp car dynamics. It would be interesting to see if we could eliminate the need for solving an optimal control problem within a larger-scale optimal control problem. I wonder if there is a way to make RRTs keep their Voronoi bias even in the case where we do not sample a state uniformly, but we appropriately sample from the control space in such a way that the forward integration to the end state results in a uniformly sampled new state. In Lavalle and Kuffner’s Fig. 3 this sampling from the control space was dismissed quickly, because it lead to a random tree that preferred extensions near the already-explored space, but I wonder how they sampled the controls.
- Another issue is that both RRT and RRT* require a NearestNeighbour() function that returns the closest state to a given state. In the case of omnidirectional robots, one can use a kd-tree to partition the euclidean metric space into regions, such that nearest neighbour queries take O(logn) instead of n in the case of brute-force search. In the case of non-holonomic robots you might have distance functions that are non-euclidean, in which case kd-trees will not work. So, what are some good data structures to use for nearest neighbour queries in non-euclidean spaces? Or, a better question perhaps, can we modify RRT so that it doesn’t need nearest neighbour queries at all?
- A third issue is that multi-robot kinodynamic planning becomes a bit cumbersome if we think about it in terms of the current formulation of RRT and RRT*. Consider for example the notion of a terminating time: if you are planning the trajectories of multiple robots in a way that the planning of one robot cannot be separated from the others’, then you will need to consider a different terminating time for each robot if you extend the tree by sampling in state space. On the other hand, if you sampled in control space and decided to extend the state by a time duration dt then this would be avoided.
- A fourth issue, which sometimes bothers me, is whether long-term planning is really necessary for robots in the field. I agree that it is very useful and in fact mandatory for robotic hands in assembly lines or in labs. But, for robots that operate in natural, unstructured environments it seems like to make an optimal long-term plan and then blindly execute it is foolish, because unexpected circumstances will most likely appear. Plans will have to adapt to incoming streams of sensor feedback. So, why optimize for the optimal plan in a perfectly controlled environment when we should probably be optimizing for the most “adaptable” plan in a “noisy” environment? Finally, if we insist to plan optimal paths for a known environment, how long should the planning horizon be?
Friday night in Montreal, with the streets being almost blocked by a snowstorm that left us half a meter of snow between our door and the outside world. It is also a night in which my next-door neighbours have decided to test the full extent of their speaker’s bass capabilities — currently playing a remix of Lana Del Rey’s Blue Jeans. It’s impressive to hear that they are accompanying the song during its high notes; perhaps it’s time to throw that neighbourhood karaoke party after all.
What’s also impressive is that this note has diverged so early on, from its first paragraph. This note is actually about a literature review of a set of pattern recognition and robotics papers, whose existence I think is important to know. I wrote this review as a partial requirement for my upcoming Ph.D. comprehensive exam at McGill. The purpose of the literature review is to summarize ~20 papers related to your research area, and the committee is free to ask whatever question they wish around these papers and their references, trying to see exactly what are the boundaries of the student’s knowledge and how this knowledge has been acquired: by memorization or by actual understanding. If the student is unclear about a method or an algorithm, this committee, which consists of some of the most experienced professors in the department, will most likely detect it and dig deep enough with questions to see exactly what the student does not understand and why.
My review document covers some methods in these broad topics:
- Bayesian filtering and estimation
- Sampling-based path planning
- Classification and ensemble methods
- Active vision
The summary is quite dense, and admittedly one could spend an entire document focusing on one of these areas. I though it was going to be more fun though to write a compact document. I should clarify that this document has not been reviewed yet by the committee, so I haven’t received much feedback on it, and I don’t know exactly how many errors it contains. Notwithstanding that, I thought it is useful enough to share.
This post is a series of instructions on how to use your own version of the opencv library with ROS. If you’ve had trouble doing this then you’re probably familiar with these instructions — while they are good in general, they are misleading because they assume that you are using the latest version of ROS, which I’d guess is not true for most people. Anyway, here is the story in more detail if you have no idea what these terms refer to:
opencv is an open source computer vision library. It provides implementations of the most popular computer vision algorithms that have appeared in the vision research literature. Needless to say that it evolves very quickly. The latest version is 2.2 and provides support for a large number of feature detectors and descriptors, aside from the classical SIFT and SURF. In case you didn’t know, it also provides face detection and recognition classifiers, image segmentation, object recognition, tracking, and many other goodies.
ROS stands for Robot Operating System. I guess it’s fair to say that its aim is to become for robots what Linux became for computers. ROS is the software platform running on the PR2. It provides support for many sensors and offers abstraction layers that allow roboticists to share code more easily. Code in ROS is organized into entities that are called “nodes,” which communicate with each other by exchanging standardized messages. For instance, ROS offers a node for firewire cameras, which publishes image messages. An image viewing program listens for image messages and when it gets a message it displays it on the screen. The point here is that the communication between the publisher and the listener relies on a standard image message interface, which means that I can write the former and you can write the latter, and we won’t have problems sharing our code. ROS is designed to be distributed, its nodes may run on different machines in the same network. This is all good news, because for the first time (?) the robotics community has a software platform that is very well designed, takes care of most details, and is not going to die soon because of low contribution.
But anyway, I have digressed. This is not a fanboy post about ROS and opencv. My problem was that the latest stable distribution of ROS, called “cturtle,” comes with opencv 2.1, while I wanted 2.2. Since many nodes in ROS depend on opencv, changing its version means that you have to recompile a big part of the system. This page helped a lot, but it didn’t work for me, because it assumed that I was running the latest (unstable?) version of ROS. As a result, compilation was unsuccessful. Here’s what worked at the end:
- Create a file called something like cturtle_overlay.rosconfig and paste the following in it:
- svn: uri: https://code.ros.org/svn/ros-pkg/stacks/vision_opencv/tags/cturtle local-name: vision_opencv - svn: uri: https://code.ros.org/svn/ros-pkg/stacks/image_transport_plugins/tags/cturtle local-name: image_transport_plugins - svn: uri: https://code.ros.org/svn/ros-pkg/stacks/image_pipeline/tags/cturtle local-name: image_pipeline - svn: uri: https://code.ros.org/svn/wg-ros-pkg/stacks/web_interface/tags/cturtle local-name: web_interface
- Create a directory named something like cturtle_overlay on your local workspace.
- Run “rosinstall ~/some-path/cturtle_overlay /opt/ros/cturtle ~/some-path/cturtle_overlay.rosconfig”
- Edit “~/some-path/cturtle_overlay/vision_opencv/opencv2/Makefile” in order to specify the version of opencv source you’d like to fetch. In my case it was:
SVN_URL = https://code.ros.org/svn/opencv/branches/2.2/opencv SVN_REVISION = -r4351
- Append “source ~/some-path/cturtle_overlay/setup.sh” in your bashrc file, and make sure that the paths in setup.sh are correct.
- Run “rosmake -a”
Hopefully, everything will go smoothly after this.
TED videos are fun and educational to watch in general, and this is another excellent example:
One of the few things I didn’t like at UofT’s computer science department was the lack of robotics courses in the undergraduate curriculum. This becomes all the more puzzling if you take into consideration the fact that UofT excels in teaching and researching artificial intelligence: computer vision, check, machine learning, double check, cognitive robotics, check, multi-agent systems, also check. It seems to me that the only thing missing is a couple of professors whose primary focus would be to create the necessary hardware that would provide a testbed for all those research efforts. Until that hardware becomes a reality though, some friends of mine and I thought that it would be a good idea to try out participating in RoboCup’s Simulation League.
RoboCup is a robotics competition whose aim is to promote artificial intelligence by creating robots, either actual moving hardware or just software-simulated agents, that will play soccer against each other. It’s (ambitious) goal is that by the year 2050 a team of humanoid robots will be able to beat the World Cup champions in soccer (say Brazil for instance), proving that the field of artificial intelligence has advanced to the point where machines can beat humans in an activity that requires elaborate motor skills, team strategy, and coordination. Personally, I find the fact that humans are striving to build machines that will beat them in a popular sport quite scary (and pointless) — I think Garry Kasparov won’t disagree. Nonetheless, until (and if) that happens, I believe that it is absolutely worthwhile to participate in RoboCup, simply because one can learn so much by the the engineering and artificial intelligence challenges involved in that effort.
Long story short, we convinced one of our professors, Steve Engels to supervise a fourth-year course whose topic would be the creation of a team for RoboCup’s Simulation League. About fifteen students were interested in taking this (summer) course, some of them for credit and others as volunteers. The students were split into four teams, each of which competes against the others every week. This way provides a way to set a benchmark by which to compare the progress of different teams. We used the official RoboCup soccer simulator and its incredibly detailed and helpful documentation. Fortunately, despite of the lack of previous experience of everyone who is involved, the course is going pretty well. See for example what one of the teams has been up to:
The bulletin board for the course is here and, admittedly, it doesn’t do the course much justice but I’ll do my best to make it a bit more informative. A lot of emails went back and forth discussing documentation, the pluses and minuses of available libraries for C++, Java and Python as well as other issues. Those notes will be very helpful for future teams who are considering the possibility of participating in the competition, so I’ll make sure to include them in the above board. As for the future of the course at UofT, it seems that it has attracted enough interest from students, so it will be continued, at least for another semester in a different format. Greg is working on it.