I recently stumbled upon the Visual Structure From Motion code from Changchang Wu. It’s a beautiful GUI frontend to a collection of utilities that include, among others, multicore bundle adjustment, SIFT on the GPU, and other functionality. Instructions on how to install it on Ubuntu are here. Its output on this dataset of high-res images of a statue is very similar to what is shown in this video. In addition to being easy to use, VisualSFM does not require camera calibration information, although you could configure it to do so. The only disadvantage that I can see is that it does not make it possible to enter known position and translation transformations between pairs of camera poses, which would be very useful for stereo imagery. In other words, due to using monocular camera data, the rotation information is accurate but the translation is not; at best the distances are scaled. That said, it seems to work very well, and the meshes that it produces look great, particularly if you visually cover the entire object.
If you haven’t come across specialized, technical podcasts then Talking Machines is going to pleasantly surprise you. It is a show devoted to machine learning. Unlike podcasts such as Radiolab, which have a generic mission to make science accessible to the public, TM has a much more precise focus and assumes that you already have a basic background in the field. It manages to provide sophisticated overviews of different sub-fields in machine learning and never shies away from technical details. Its creators, Katherine Gorman and Ryan Adams (who’s a machine learning prof), are able to simplify concepts without dumbing them down. They’ve done an excellent job and I hope they continue to make this gem for a long time.
I absolutely recommend Russ Tedrake‘s online course on Underactuated Robotics (mostly nonlinear control) and Magnus Egerstedt’s course on linear control. The latter is not a formal prerequisite of the former, but it helps if you know some of the concepts. Russ’ course is just amazing, and extremely rich in mathematical concepts and algorithmic tools. He insists on providing visual intuition for almost every single concept that he introduces, which is great. I’ve recommended it to everyone who cared to listen. Some of my friends and I are currently taking the course.
This post might be a bit specialized, yet vague and hand-wavy, but it reflects some of the discussions that I have been having in the last couple of days. The discussions were prompted by the papers of Jonathan Kelly, who was visiting McGill to give a research talk on his vision + IMU state estimation work. In this line of work, one deals with the system of a monocular camera + IMU. The state of this system usually includes the following: the position of the IMU in some global frame of reference, the rotation of the IMU in that global frame, the two bias vectors of the IMU (one for the gyroscope and one for the accelerometer), and the velocity of the IMU in the global frame. The measurements of the system are the IMU’s linear acceleration and angular velocity, and the feature matches in consecutive images recorded from the camera. This system has been shown to be observable, in the sense that the outputs of the system are sufficient to estimate the (hidden) state of the system.
This is what is still unclear to me: how many measurements do we need and what should they be to “excite all the degrees of freedom”? How can we characterize the quality of the measurements that we have gotten with respect to how much they enable us to infer and not infer about the system?
To me this sounds like a sampling and coverage problem, where the measurements (the samples) need to be selected in such a way that they cover some space associated with the dynamics of the system. Each measurement would be associated with an information gain, and we would say that no more measurements are necessary when all the “information space” has been almost covered. Something like this would have strong links with sampling-based path planning. In fact, there have been papers that have considered problems such as path planning for a robot so that the path gives rise to the lowest positional uncertainty at the end, but I’m not sure that this is the same as saying that the “information space” has been covered.
Or planning that takes into account velocity, acceleration, momentum, force bounds, as well as the dynamics of the vehicle. For example, lateral motion is possible for an omnidirectional robot, but not for a car. This sub-field of planning is very closely related to and borrows ideas from optimal control, however, it always takes into consideration the presence of static or dynamic obstacles in the environment.
- One issue with existing solutions to this problem, such as kinodynamic RRT and RRT* is that they assume a Steer() function which essentially does optimal control from one state to another nearby one (often without taking into account obstacles). As far as I know, this is hard to do even for Dubins and Reeds-Shepp car dynamics. It would be interesting to see if we could eliminate the need for solving an optimal control problem within a larger-scale optimal control problem. I wonder if there is a way to make RRTs keep their Voronoi bias even in the case where we do not sample a state uniformly, but we appropriately sample from the control space in such a way that the forward integration to the end state results in a uniformly sampled new state. In Lavalle and Kuffner’s Fig. 3 this sampling from the control space was dismissed quickly, because it lead to a random tree that preferred extensions near the already-explored space, but I wonder how they sampled the controls.
- Another issue is that both RRT and RRT* require a NearestNeighbour() function that returns the closest state to a given state. In the case of omnidirectional robots, one can use a kd-tree to partition the euclidean metric space into regions, such that nearest neighbour queries take O(logn) instead of n in the case of brute-force search. In the case of non-holonomic robots you might have distance functions that are non-euclidean, in which case kd-trees will not work. So, what are some good data structures to use for nearest neighbour queries in non-euclidean spaces? Or, a better question perhaps, can we modify RRT so that it doesn’t need nearest neighbour queries at all?
- A third issue is that multi-robot kinodynamic planning becomes a bit cumbersome if we think about it in terms of the current formulation of RRT and RRT*. Consider for example the notion of a terminating time: if you are planning the trajectories of multiple robots in a way that the planning of one robot cannot be separated from the others’, then you will need to consider a different terminating time for each robot if you extend the tree by sampling in state space. On the other hand, if you sampled in control space and decided to extend the state by a time duration dt then this would be avoided.
- A fourth issue, which sometimes bothers me, is whether long-term planning is really necessary for robots in the field. I agree that it is very useful and in fact mandatory for robotic hands in assembly lines or in labs. But, for robots that operate in natural, unstructured environments it seems like to make an optimal long-term plan and then blindly execute it is foolish, because unexpected circumstances will most likely appear. Plans will have to adapt to incoming streams of sensor feedback. So, why optimize for the optimal plan in a perfectly controlled environment when we should probably be optimizing for the most “adaptable” plan in a “noisy” environment? Finally, if we insist to plan optimal paths for a known environment, how long should the planning horizon be?
I’ve been trying to simulate a circuit I found online using the MultiSim circuit simulator, which I’ve come to appreciate and find pretty cool. One of the components in this circuit is a 78L10AC voltage regulator, which outputs 10V if the current drawn by the load circuit is up to 100mA. Unfortunately, MultiSim does not have any representative in the 78L10 family of voltage regulators. To be fair, it does have regulators for 5V, 8V, 9V, 12V, and many others, but not 10V. I spent some time searching online for a SPICE model for 78L10, but nothing promising came up. I also tried inferring the SPICE model from the 78Lxx datasheet from Texas Instruments, but unfortunately, the datasheet leaves most resistor values unspecified, most likely to account for the different models of the 78Lxx family, so, there were too many unknown variables to replicate the original design. Then, Anqi pointed me to the configurable LM317 voltage regulator and in particular to this fun tutorial by Sparkfun’s Pete Dokter on how to use it. It turns out that you can use LM317 and its low-power version, LM317L, both of which are supported in MultiSim, to output a wide range of voltages. So, here’s a 10V voltage regulator with output current up to 100mA:
Friday night in Montreal, with the streets being almost blocked by a snowstorm that left us half a meter of snow between our door and the outside world. It is also a night in which my next-door neighbours have decided to test the full extent of their speaker’s bass capabilities — currently playing a remix of Lana Del Rey’s Blue Jeans. It’s impressive to hear that they are accompanying the song during its high notes; perhaps it’s time to throw that neighbourhood karaoke party after all.
What’s also impressive is that this note has diverged so early on, from its first paragraph. This note is actually about a literature review of a set of pattern recognition and robotics papers, whose existence I think is important to know. I wrote this review as a partial requirement for my upcoming Ph.D. comprehensive exam at McGill. The purpose of the literature review is to summarize ~20 papers related to your research area, and the committee is free to ask whatever question they wish around these papers and their references, trying to see exactly what are the boundaries of the student’s knowledge and how this knowledge has been acquired: by memorization or by actual understanding. If the student is unclear about a method or an algorithm, this committee, which consists of some of the most experienced professors in the department, will most likely detect it and dig deep enough with questions to see exactly what the student does not understand and why.
My review document covers some methods in these broad topics:
- Bayesian filtering and estimation
- Sampling-based path planning
- Classification and ensemble methods
- Active vision
The summary is quite dense, and admittedly one could spend an entire document focusing on one of these areas. I though it was going to be more fun though to write a compact document. I should clarify that this document has not been reviewed yet by the committee, so I haven’t received much feedback on it, and I don’t know exactly how many errors it contains. Notwithstanding that, I thought it is useful enough to share.
If you are searching for a new activity this semester, and you think you want to give horseback riding a chance, then you can go to the McGill Athletics website and sign up for 1 of 12 available slots. The lessons take place at a school called Equitation Elysee, in a farm in St. Lazare, one-hour drive away from downtown Montreal.
Commuting there by public transportation is next to impossible, but if you don’t have a car, the instructor will put you in contact with other students who do, so you can share the ride. The students range from 5-year olds to 70-year olds, and a previous background is not assumed. The instructor, Jo Sweet, is very experienced, she will tell you when you’re doing things wrong and will help you individually to improve your technique. A typical lesson starts with the student grooming the horse in the stables before the ride, then guiding the horse to the arena for the exercises. I should mention that the purpose of the lessons is to control the gait and speed of the horse, and eventually perform jumps; not to simply go riding along trails. The arena is cool though, and there is classical music playing while you’re training. The school runs all year round, and you can contact Jo even if you’re not a McGill student.