My first two lectures on computer graphics left me puzzled, with some unanswered questions about basic projective geometry and its transformations. I’ll try to collect my thoughts and some links and write them down, primarily to summarize them and reduce my own confusion.
Basically, both in computer vision and in computer graphics the first geometric problem that we study is that of projecting points from a 3D scene to a 2D image plane (think of it as a photograph), using the pinhole camera model. The projection looks like this:
An important thing to notice in this picture is that any point that lies on one of the dotted lines will be projected to the same image location. This leads us to write $(X, Y, Z) \sim \lambda (X, Y, Z)$ for any nonzero $\lambda$, where $(X, Y, Z)$ are coordinates in the Euclidean coordinate system that we are accustomed to. This “equality” holds in the sense of equivalence classes, not in the sense of Euclidean geometry.
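To make the equivalence concrete for myself, here is a minimal Python sketch. It assumes a scene point is written as a triple (X, Y, Z) with the camera center at the origin; the function name `same_ray` and the tolerance `tol` are my own illustrative choices, not something from the lectures. Two nonzero points are scalar multiples of each other exactly when their cross product vanishes, so that is what the check tests.

```python
def same_ray(p, q, tol=1e-9):
    """Return True if nonzero 3D points p and q are scalar multiples of each other.

    Hypothetical helper: p and q are (X, Y, Z) triples and the camera center
    is assumed to sit at the origin. The test uses an absolute tolerance,
    which is good enough for a small illustration.
    """
    # The cross product p x q is the zero vector iff p and q are parallel.
    cx = p[1] * q[2] - p[2] * q[1]
    cy = p[2] * q[0] - p[0] * q[2]
    cz = p[0] * q[1] - p[1] * q[0]
    return abs(cx) <= tol and abs(cy) <= tol and abs(cz) <= tol


# (1, 2, 4) and 2.5 * (1, 2, 4) lie on the same line through the origin.
print(same_ray((1, 2, 4), (2.5, 5, 10)))
# Changing only the depth moves the point onto a different line.
print(same_ray((1, 2, 4), (1, 2, 5)))
```

Note that a negative multiple also passes the test, which matches the "any nonzero" factor in the equivalence.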
Using a similar-triangles argument we can prove that the pixel on the image is equal to $(x, y) = (fX/Z, fY/Z)$, where $f$ is the focal length. However, $(x, y)$ can also be viewed as the point $(x, y, f)$ of the 3D scene, provided that $Z \neq 0$. The correspondence between image pixels and representatives of the equivalence classes of points $(X, Y, Z)$ under nonzero scaling is one-to-one. The equivalence class corresponding to an image pixel is called the homogeneous coordinates of that pixel. Many of the articles that I found online assumed $f = 1$ for the sake of simplicity and mentioned that the mapping $(x, y) \mapsto (x, y, 1)$ is “just a trick” to be followed blindly because
What happens when $Z = 0$? In that case $(X, Y, 0)$ does not correspond to any point in the Euclidean space, but to a unique point-at-infinity along the direction $(X, Y)$. The proof is in the first link.
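Here is a small Python sketch of how I picture the whole thing, under my own assumptions: the camera center is at the origin, the image plane sits at depth `f` (the focal length, defaulting to 1), and the names `project` and `to_homogeneous` are hypothetical helpers rather than anything from a library. It shows that scaling a scene point does not change its pixel, and that letting the depth shrink toward zero pushes the pixel away along a fixed direction, which is the intuition behind the point-at-infinity.

```python
def project(point, f=1.0):
    """Pinhole projection of a scene point (X, Y, Z) to a pixel (x, y).

    Assumes the camera center is the origin and the image plane is at depth f.
    """
    X, Y, Z = point
    if Z == 0:
        # No finite pixel exists: this class is a point-at-infinity.
        raise ValueError("Z = 0 corresponds to a point-at-infinity")
    return (f * X / Z, f * Y / Z)


def to_homogeneous(pixel, f=1.0):
    """Lift a pixel (x, y) to the representative (x, y, f) on the image plane."""
    x, y = pixel
    return (x, y, f)


# Two points on the same line through the origin give the same pixel.
print(project((1, 2, 4)), project((2, 4, 8)))

# Lifting the pixel and projecting again returns the same pixel.
print(project(to_homogeneous((0.25, 0.5))))

# As the depth shrinks, the pixel runs off along the direction (1, 2).
for Z in (1.0, 0.1, 0.001):
    x, y = project((1, 2, Z))
    print(Z, x, y, y / x)
```

The ratio `y / x` in the last loop stays close to 2 while both coordinates grow without bound, so the limit is not a pixel but a direction in the image plane.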