by Gooly (Li Yang Ku)
It’s always good to go back, once in a while, to the reason that lured you into computer vision. Mine was to understand the brain, after I realized with astonishment, while studying EE as an undergrad, that computers have no intelligence. In fact, if my mother language had used the translation “computer” instead of “electrical brain”, I would probably have been better off.
Anyway, I am currently revisiting some of the first computer vision papers I read, and to tell the truth I still learn a lot from reading stuff I have read several times before, which you can also interpret as me never having actually understood a paper.
So back to the papers,
Simoncelli, Eero P., and Bruno A. Olshausen. “Natural image statistics and neural representation.” Annual review of neuroscience 24.1 (2001): 1193-1216.
Olshausen, Bruno A., and David J. Field. “Sparse coding with an overcomplete basis set: A strategy employed by V1?.” Vision research 37.23 (1997): 3311-3326.
Olshausen, Bruno A., and David J. Field. “Emergence of simple-cell receptive field properties by learning a sparse code for natural images.” Nature 381.6583 (1996): 607-609.
These 3 papers are essentially the same; the first two are spin-offs of the third, which was published in Nature. I personally prefer the second paper for reading.
In this paper, Bruno explains from a statistical point of view why overcomplete sparse coding is essential for human vision. The goal is to obtain a set of basis functions (the basis functions are filters) that can be used to regenerate an image. This can be viewed as an image encoding problem, but instead of having an encoder that only compresses the image to the minimum size, the goal is to also maintain sparsity, which means that only a small number of basis functions are used for any given image compared to the whole pool. Sparsity has obvious biological advantages, such as saving energy, but Bruno conjectured that sparsity is also essential to vision and originates from the sparse structure of natural images.
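To make the encoding idea concrete, here is a rough way to write it down. The symbols are my own shorthand for this post (I for the image patch, \phi_i for the basis functions, a_i for their coefficients, N for the number of pixels, K for the number of basis functions), so treat it as a sketch rather than a quote from the paper.

```latex
% An image patch is regenerated as a weighted sum of basis functions:
I(x, y) \approx \sum_{i=1}^{K} a_i \, \phi_i(x, y)

% Overcomplete: more basis functions than pixels in the patch.
K > N

% Sparse: for any given image, most coefficients are (close to) zero.
\#\{\, i : a_i \neq 0 \,\} \ll K
```

Because K > N, there are many coefficient vectors that regenerate the same patch, and the sparsity requirement is what picks one of them.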
In order to obtain this sparse set of basis functions, a sparsity constraint is added to the energy function being optimized. The final result is a set of basis functions (image above) that interestingly look very similar to Gabor filters, which resemble the receptive fields found in the visual cortex. This somehow supports the idea that sparseness is essential in the evolution of human vision.
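For readers who want to play with the idea, below is a minimal sketch of this kind of optimization in Python with NumPy. It is not the paper's implementation. I assume an L1 penalty on the coefficients as the sparsity term and use ISTA (iterative soft thresholding) to find them, which is a common stand-in, and the function names (`infer_coefficients`, `learn_basis`, `energy`) and parameters (`lam`, `lr`) are made up for illustration. The energy being minimized is the reconstruction error plus `lam` times the sum of absolute coefficient values.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink values toward zero; anything with magnitude <= t becomes 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def energy(patches, basis, coeffs, lam):
    """Reconstruction error plus the (assumed) L1 sparsity penalty."""
    recon_error = 0.5 * np.sum((patches - coeffs @ basis) ** 2)
    return recon_error + lam * np.sum(np.abs(coeffs))


def infer_coefficients(patches, basis, lam=0.1, n_steps=50):
    """ISTA: find sparse coefficients for fixed basis functions.

    patches: (n_patches, n_pixels), basis: (n_basis, n_pixels).
    Returns coeffs with shape (n_patches, n_basis).
    """
    # Largest eigenvalue of basis @ basis.T bounds the gradient's curvature.
    lipschitz = np.linalg.norm(basis @ basis.T, 2)
    coeffs = np.zeros((patches.shape[0], basis.shape[0]))
    for _ in range(n_steps):
        grad = (coeffs @ basis - patches) @ basis.T
        coeffs = soft_threshold(coeffs - grad / lipschitz, lam / lipschitz)
    return coeffs


def learn_basis(patches, n_basis, lam=0.1, n_iters=30, lr=0.1, seed=0):
    """Alternate between sparse inference and a gradient step on the basis."""
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((n_basis, patches.shape[1]))
    basis /= np.linalg.norm(basis, axis=1, keepdims=True)
    for _ in range(n_iters):
        coeffs = infer_coefficients(patches, basis, lam)
        residual = patches - coeffs @ basis
        # Move each basis function to better explain what is left over.
        basis += lr * (coeffs.T @ residual) / patches.shape[0]
        # Keep unit norm so coefficients cannot shrink by inflating the basis.
        basis /= np.linalg.norm(basis, axis=1, keepdims=True)
    return basis
```

To use it, pass an array of flattened image patches (one patch per row) and ask for more basis functions than pixels, for example `learn_basis(patches, n_basis=2 * patches.shape[1])`. Gabor-like filters should only be expected with whitened natural image patches and many more iterations than the defaults here; random noise as input will not produce them.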