Building your own eye tracker for dirt cheap. How hard can it be? Turns out the basics are surprisingly simple! On this page, I’ll try to keep you posted on how the project advances. If you want to play around with the source code, feel free to grab if off GitHub.
11 October 2013
This video shows the result of my first attempt at pupil tracking. A lot of work needs to be done before this is actually useable, but I’m already quite happy with the results. The software is, of course, based on PyGaze. It uses the new webcam library. Image analysis is done on the fly, using PyGame. The webcam I use is a Trust Cuby model (retail price: 15 Euro), from which I removed the infrared filter. All regular lights in my office were off during tracking; I used a bunch of infrared LEDs to illuminate my face.
25 November 2013
I’ve added two new videos, showing the current state of the project. As you can see, I made quite a bit of progress! A Graphical User Interface (GUI) has been added, to allow users to set their own system up in an easier way (previously, it was done via code and keyboard shortcuts). Furthermore, the pupil’s edges (indicated by the green rectangle) are now detected, which means the software can now be used for pupilometry (the science of measuring pupil size). The pupil centre is still indicated by the red dot, but now there is the option to only locate a potential pupil inside of a user-defined enclosure. This is useful, because the pupil detection is done based on a rather simple principle: “find the largest dark bit in the picture, this must be the pupil!”. As you can see, my eyebrows and hair are pretty dark as well, therefore oftentimes falsely recognized as a pupil. The blue rectangle you see in the screen that says “select pupil and set pupil detection bounds” is the enclosure outside of which no pupil detection is attempted. As you can see in the video, this pupil enclosure moves along with the pupil, so you don’t have to worry about moving your head. Note that the lighting conditions were pretty normal in the current videos: I was simply sitting in my office, with all the lights and two monitors on, without the infrared LEDs that I used for the previous video.
My software works in a relatively straightforward way. Every image that the webcam produces (30 per second!) is analyzed to find the dark bits in the image. This makes sense, because your pupil usually is one of the darkest parts of an image of your face (go check for yourself by looking through your Facebook profile pics). Of course, how dark exactly your pupil is can differ depending on the environment you are in. Therefore, you will have to tell my software what ‘dark’ actually is. You do so by setting a threshold value: a single number below which everything is considered ‘dark’. If you are now wondering “how can any part of an image be ‘lower than a single number’?”, you’re on the right track!
A computer does not see a picture in the way we humans do. Quite frankly, it doesn’t really ‘see’ the picture at all! For your computer, your profile pic is simple a collection of pixels, the tiniest parts of an image. If you zoom in real close, you should be able to see them in any image: small squares, consisting of only a single colour. To a computer a colour isn’t a colour in the way we perceive colours. To a computer, a colour consists of three numbers: one value for the amount of red, one for the amount of green, and one for the amount of blue. In our case, these numbers range from 0 to 255, where 0 means ‘none of this colour at all, please’ and 255 means ‘maximal colour!’. To give you some examples: (0,0,0) is black (no colour for all) and (255,255,255) is white (maximal colour for all). By now, you might be able to guess what the brightest red is to a computer: (255,0,0).
Now you know how a computer reads out the webcam’s images, it’s time to return to pupil detection. As I mentioned before, you can select a single threshold value for what my software must consider ‘dark’ (and therefore potentially the pupil!). My software then looks at every single pixel in the image, and checks if any of the values for red, green and blue are below that threshold. Only if ALL the values for that pixel are below the threshold value, that pixels is considered ‘dark’. In the movies below, you can see a blue-black screen, where all the dark parts of the webcam’s video stream are black and all other pixels are blue. As you can see: the pupil is one of the darkest parts in the video stream. In essence, this is what we want. But what about all other dark parts of the images? We don’t want the software to mistake my eyebrows for my pupil!
The easiest way to prevent incorrect pupil detection, is by specifying where in each image my software is allowed to look. Basically, it needs to know where the pupil is, and how big the area around the pupil is in which it can look for the pupil. The easiest way to achieve this, is by directly telling my software where your pupil is. If you’re thinking “Wait, hang on Mr. Genius. Your software is supposed to tell me where the pupil is, not the other way around!”, you might have a point. Luckily, you will only have to tell my software where your pupil is once. After that, it’s perfectly capable of telling you where your pupil is. Alternatively, I could try to write some sophisticated face detection algorithm, that finds your face in an image, and then knows where to look for your pupil. This, however, has the disadvantage that an entire face would have to be present in the image, which is not necessarily the case (see the videos under First steps and Progress). On top of this, even the world’s best programmer wouldn’t be able to write face detection software that surpassed your ability to recognize an eye in an image, because you’re just so darn good at it! Therefore, I chose to let you tell the software where your pupil is, by clicking on it with the mouse.
After indicating the pupil location, you can increase or reduce the size of the ‘pupil bounding rect’, the enclosure outside of which my software ignores everything. You can set its limits to anything (and you can even deactivate it), but I’d recommend a bounding rect that encloses the entire eye, and maybe even a bit around it. The larger the bounding rect, the higher the risk of false pupil detection, but the smaller the bounding rect, the higher the risk of losing pupil detection if you move too fast. After setting the rect, you can test if your settings are good by moving and gazing around a bit, and you can adjust the threshold if needed (or go back to any earlier step). Please see the videos above for some nice demo’s!