I recently stumbled upon the Visual Structure From Motion code from Changchang Wu. It’s a beautiful GUI frontend to a collection of utilities that include, among others, multicore bundle adjustment, SIFT on the GPU, and other functionality. Instructions on how to install it on Ubuntu are here. Its output on this dataset of high-res images of a statue is very similar to what is shown in this video. In addition to being easy to use, VisualSFM does not require camera calibration information, although you could configure it to do so. The only disadvantage that I can see is that it does not make it possible to enter known position and translation transformations between pairs of camera poses, which would be very useful for stereo imagery. In other words, due to using monocular camera data, the rotation information is accurate but the translation is not; at best the distances are scaled. That said, it seems to work very well, and the meshes that it produces look great, particularly if you visually cover the entire object.
If you haven’t come across specialized, technical podcasts then Talking Machines is going to pleasantly surprise you. It is a show devoted to machine learning. Unlike podcasts such as Radiolab, which have a generic mission to make science accessible to the public, TM has a much more precise focus and assumes that you already have a basic background in the field. It manages to provide sophisticated overviews of different sub-fields in machine learning and never shies away from technical details. Its creators, Katherine Gorman and Ryan Adams (who’s a machine learning prof), are able to simplify concepts without dumbing them down. They’ve done an excellent job and I hope they continue to make this gem for a long time.
I absolutely recommend Russ Tedrake‘s online course on Underactuated Robotics (mostly nonlinear control) and Magnus Egerstedt’s course on linear control. The latter is not a formal prerequisite of the former, but it helps if you know some of the concepts. Russ’ course is just amazing, and extremely rich in mathematical concepts and algorithmic tools. He insists on providing visual intuition for almost every single concept that he introduces, which is great. I’ve recommended it to everyone who cared to listen. Some of my friends and I are currently taking the course.
This post might be a bit specialized, yet vague and hand-wavy, but it reflects some of the discussions that I have been having in the last couple of days. The discussions were prompted by the papers of Jonathan Kelly, who was visiting McGill to give a research talk on his vision + IMU state estimation work. In this line of work, one deals with the system of a monocular camera + IMU. The state of this system usually includes the following: the position of the IMU in some global frame of reference, the rotation of the IMU in that global frame, the two bias vectors of the IMU (one for the gyroscope and one for the accelerometer), and the velocity of the IMU in the global frame. The measurements of the system are the IMU’s linear acceleration and angular velocity, and the feature matches in consecutive images recorded from the camera. This system has been shown to be observable, in the sense that the outputs of the system are sufficient to estimate the (hidden) state of the system.
This is what is still unclear to me: how many measurements do we need and what should they be to “excite all the degrees of freedom”? How can we characterize the quality of the measurements that we have gotten with respect to how much they enable us to infer and not infer about the system?
To me this sounds like a sampling and coverage problem, where the measurements (the samples) need to be selected in such a way that they cover some space associated with the dynamics of the system. Each measurement would be associated with an information gain, and we would say that no more measurements are necessary when all the “information space” has been almost covered. Something like this would have strong links with sampling-based path planning. In fact, there have been papers that have considered problems such as path planning for a robot so that the path gives rise to the lowest positional uncertainty at the end, but I’m not sure that this is the same as saying that the “information space” has been covered.
Or planning that takes into account velocity, acceleration, momentum, force bounds, as well as the dynamics of the vehicle. For example, lateral motion is possible for an omnidirectional robot, but not for a car. This sub-field of planning is very closely related to and borrows ideas from optimal control, however, it always takes into consideration the presence of static or dynamic obstacles in the environment.
- One issue with existing solutions to this problem, such as kinodynamic RRT and RRT* is that they assume a Steer() function which essentially does optimal control from one state to another nearby one (often without taking into account obstacles). As far as I know, this is hard to do even for Dubins and Reeds-Shepp car dynamics. It would be interesting to see if we could eliminate the need for solving an optimal control problem within a larger-scale optimal control problem. I wonder if there is a way to make RRTs keep their Voronoi bias even in the case where we do not sample a state uniformly, but we appropriately sample from the control space in such a way that the forward integration to the end state results in a uniformly sampled new state. In Lavalle and Kuffner’s Fig. 3 this sampling from the control space was dismissed quickly, because it lead to a random tree that preferred extensions near the already-explored space, but I wonder how they sampled the controls.
- Another issue is that both RRT and RRT* require a NearestNeighbour() function that returns the closest state to a given state. In the case of omnidirectional robots, one can use a kd-tree to partition the euclidean metric space into regions, such that nearest neighbour queries take O(logn) instead of n in the case of brute-force search. In the case of non-holonomic robots you might have distance functions that are non-euclidean, in which case kd-trees will not work. So, what are some good data structures to use for nearest neighbour queries in non-euclidean spaces? Or, a better question perhaps, can we modify RRT so that it doesn’t need nearest neighbour queries at all?
- A third issue is that multi-robot kinodynamic planning becomes a bit cumbersome if we think about it in terms of the current formulation of RRT and RRT*. Consider for example the notion of a terminating time: if you are planning the trajectories of multiple robots in a way that the planning of one robot cannot be separated from the others’, then you will need to consider a different terminating time for each robot if you extend the tree by sampling in state space. On the other hand, if you sampled in control space and decided to extend the state by a time duration dt then this would be avoided.
- A fourth issue, which sometimes bothers me, is whether long-term planning is really necessary for robots in the field. I agree that it is very useful and in fact mandatory for robotic hands in assembly lines or in labs. But, for robots that operate in natural, unstructured environments it seems like to make an optimal long-term plan and then blindly execute it is foolish, because unexpected circumstances will most likely appear. Plans will have to adapt to incoming streams of sensor feedback. So, why optimize for the optimal plan in a perfectly controlled environment when we should probably be optimizing for the most “adaptable” plan in a “noisy” environment? Finally, if we insist to plan optimal paths for a known environment, how long should the planning horizon be?
I’ve been trying to simulate a circuit I found online using the MultiSim circuit simulator, which I’ve come to appreciate and find pretty cool. One of the components in this circuit is a 78L10AC voltage regulator, which outputs 10V if the current drawn by the load circuit is up to 100mA. Unfortunately, MultiSim does not have any representative in the 78L10 family of voltage regulators. To be fair, it does have regulators for 5V, 8V, 9V, 12V, and many others, but not 10V. I spent some time searching online for a SPICE model for 78L10, but nothing promising came up. I also tried inferring the SPICE model from the 78Lxx datasheet from Texas Instruments, but unfortunately, the datasheet leaves most resistor values unspecified, most likely to account for the different models of the 78Lxx family, so, there were too many unknown variables to replicate the original design. Then, Anqi pointed me to the configurable LM317 voltage regulator and in particular to this fun tutorial by Sparkfun’s Pete Dokter on how to use it. It turns out that you can use LM317 and its low-power version, LM317L, both of which are supported in MultiSim, to output a wide range of voltages. So, here’s a 10V voltage regulator with output current up to 100mA: