This post might be a bit specialized, yet vague and hand-wavy, but it reflects some of the discussions that I have been having in the last couple of days. The discussions were prompted by the papers of Jonathan Kelly, who was visiting McGill to give a research talk on his vision + IMU state estimation work. In this line of work, one deals with the system of a monocular camera + IMU. The state of this system usually includes the following: the position of the IMU in some global frame of reference, the rotation of the IMU in that global frame, the two bias vectors of the IMU (one for the gyroscope and one for the accelerometer), and the velocity of the IMU in the global frame. The measurements of the system are the IMU’s linear acceleration and angular velocity, and the feature matches in consecutive images recorded from the camera. This system has been shown to be observable, in the sense that the outputs of the system are sufficient to estimate the (hidden) state of the system.
This is what is still unclear to me: how many measurements do we need and what should they be to “excite all the degrees of freedom”? How can we characterize the quality of the measurements that we have gotten with respect to how much they enable us to infer and not infer about the system?
To me this sounds like a sampling and coverage problem, where the measurements (the samples) need to be selected in such a way that they cover some space associated with the dynamics of the system. Each measurement would be associated with an information gain, and we would say that no more measurements are necessary when all the “information space” has been almost covered. Something like this would have strong links with sampling-based path planning. In fact, there have been papers that have considered problems such as path planning for a robot so that the path gives rise to the lowest positional uncertainty at the end, but I’m not sure that this is the same as saying that the “information space” has been covered.