Last night, I headed down to the beautiful Bell House in the Gowanus section of Brooklyn for this month's Secret Science Club lecture, featuring Dr Rob Fergus of NYU and Facebook's Artificial Intelligence lab. Dr Fergus' lecture concerned the development of visual systems for artificial intelligences.
The lecture began with a question- can computers see and make sense of their surroundings? Dr Fergus' project is to build machines that can see with deep learning. The goal is to build intelligent machines... such machines need to be able to perceive- they need visual recognition and understanding of that which is perceived. Until recently, this problem was unsolved; interpreting images is not straightforward for a computer. The human visual system is complex, involving not only the eyes and optic nerve, but multiple parts of the brain along the pathway from the eyes to the decision-making areas.
AI developers aren't copying natural visual systems because the understanding of the brain is still vague, and the "architecture" of the brain isn't the best model. Compared to computer processors, the human brain is made up of slow but parallel systems, while computers have fast but serial systems. The ideal artificial visual processor would be able to outperform nature's designs and constraints. Convolutional neural networks are networks with special connectivity designed for computer visual systems.
The first problem of developing a visual system is image classification. Can the content of an image be matched to a single label? The key to image description is making the best prediction for an image. The pixels of each image have to receive a class label, one per image, and the images and their labels together form a data set.
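To make the image-and-label idea concrete, here's a minimal sketch in Python. It is not anything Dr Fergus showed- the tiny four-pixel "images", the label names, and the hand-written toy_scores function are all made up for illustration. It only shows the shape of the problem: a data set of (pixels, label) pairs, and a prediction made by picking the best-scoring label.

```python
# Toy data set: each entry pairs a flattened "image" (a list of pixel
# brightness values) with exactly one class label.
dataset = [
    ([0.9, 0.8, 0.1, 0.2], "bright-top"),
    ([0.1, 0.2, 0.9, 0.7], "bright-bottom"),
]


def toy_scores(pixels):
    """Hypothetical hand-written scorer: mean brightness of each half.

    A real system would learn this mapping from data instead.
    """
    half = len(pixels) // 2
    top = sum(pixels[:half]) / half
    bottom = sum(pixels[half:]) / (len(pixels) - half)
    return {"bright-top": top, "bright-bottom": bottom}


def predict(scores):
    """Return the label with the highest score (the 'best prediction')."""
    return max(scores, key=scores.get)


predictions = [predict(toy_scores(pixels)) for pixels, _ in dataset]
true_labels = [label for _, label in dataset]
```

In this toy case the predictions line up with the true labels, but only because the scorer was written by hand to suit the data.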
Training has to take place- a model needs to be chosen to map images to the labels- the training involves incrementally updating the parameters of the model to reduce a loss function. After the training is accomplished, the model needs to be checked against test data it has not seen- overly complex models can fit the training data and still do poorly on the test data.
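Here's a rough sketch of what "incrementally adjusting parameters to reduce a loss" can look like. I'm assuming a very simple model (logistic regression on four-pixel images) trained by plain gradient descent, which is far simpler than the networks discussed in the lecture. The data, the learning rate, and the step count are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_prob(weights, bias, pixels):
    """Probability that the image belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, pixels)) + bias)


def loss(weights, bias, data):
    """Mean cross-entropy loss over (pixels, label) pairs, label in {0, 1}."""
    total = 0.0
    for pixels, label in data:
        p = predict_prob(weights, bias, pixels)
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(data)


def train(data, steps=200, lr=0.5):
    """Gradient descent: nudge the parameters a little on every step."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    history = []
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for pixels, label in data:
            err = predict_prob(weights, bias, pixels) - label
            for i, x in enumerate(pixels):
                grad_w[i] += err * x
            grad_b += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
        history.append(loss(weights, bias, data))
    return weights, bias, history


# Made-up data: label 1 means the first two pixels are bright.
train_data = [([0.9, 0.8, 0.1, 0.2], 1), ([0.8, 0.9, 0.2, 0.1], 1),
              ([0.1, 0.2, 0.9, 0.8], 0), ([0.2, 0.1, 0.8, 0.9], 0)]
# Held-out test data the model never sees during training.
test_data = [([0.7, 0.9, 0.3, 0.1], 1), ([0.3, 0.1, 0.7, 0.9], 0)]

weights, bias, history = train(train_data)
correct = sum(int((predict_prob(weights, bias, px) > 0.5) == bool(label))
              for px, label in test_data)
test_accuracy = correct / len(test_data)
```

The history list records the loss after each step, so comparing its first and last entries shows whether training is making progress. The test_accuracy figure is computed only on the held-out images.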
In order to achieve Deep Learning, models with hierarchical structures need to be built. These hierarchies become increasingly complex, layer by layer, building to the desired output. The initial "layer" is a simple filtered image, and each subsequent layer extracts features from the previous layer. The filtered pixels are passed through a non-linearity, and the filters themselves are "learned" during training. Multiple filters, hundreds or thousands in practice, are used to create "feature maps".
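A small sketch may help with the filtering step. Below, a filter slides over a tiny image and the result goes through a non-linearity to give a feature map. I'm assuming ReLU as the non-linearity, and the two filters are hand-picked for illustration- in a real network they would be learned, and there would be far more of them.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    This is the cross-correlation form commonly used in convolutional
    networks: multiply overlapping values and sum them.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Non-linearity (assumed here): negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# A 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical hand-picked filters; real ones are learned during training.
filters = {
    "vertical_edge": [[-1, 1], [-1, 1]],
    "horizontal_edge": [[-1, -1], [1, 1]],
}

feature_maps = {name: relu(convolve2d(image, kernel))
                for name, kernel in filters.items()}
```

The vertical-edge filter responds only in the middle column, where dark meets bright, while the horizontal-edge filter stays at zero everywhere because this image has no horizontal edges. Each filter produces its own feature map.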
Pooling of the feature maps then occurs, serving to create invariant output despite varying inputs... ideally, small changes in the input won't result in changes to the output. As pooling increases, smaller local models accumulate, with higher convolution layers adding up to a whole picture.
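The invariance idea can be sketched with max pooling, which is one common kind of pooling- I'm assuming it here, the lecture notes don't say which kind was meant. The two feature maps below are invented: the second is the first with its strong responses nudged over by a pixel.

```python
def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value per window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, cols - size + 1, size)]
            for r in range(0, rows - size + 1, size)]


# Made-up feature map with two strong responses.
original = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 3],
]

# The same responses, each shifted by one pixel within its 2x2 window.
shifted = [
    [5, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 3, 0],
    [0, 0, 0, 0],
]

pooled_original = max_pool(original)
pooled_shifted = max_pool(shifted)
```

Both inputs pool down to the same 2x2 output, which is the "changes in the input don't change the output" effect in miniature. It only holds while the shift stays inside a pooling window- a bigger shift would change the result.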
Dr Fergus then gave us a brief history of this field, from 1989 to 2012. One major breakthrough occurred in 2012 on the ImageNet database, which includes about 14 million images from about twenty thousand classes. Another breakthrough was the implementation of Graphics Processing Units in visual systems. Current visual models have filters which can be retrained late in the filtering process to improve performance. The CLARIFAI image recognition system is able to autotag processed images.
The talk then proceeded to a demonstration of the different layers of filters in a hierarchy. Due to the visual component of this part of the talk, it's hard to encapsulate it in a blog post. Luckily, here's a video by Dr Fergus- the camera-work insufficiently covers the slides, but a viewer should get some idea of the different filter values:
After the lecture, some bastard asked Dr Fergus what sort of progress has been made in AIs' ability to process novel images. He indicated that this is a subject which is just now being broached- getting AI visual processing up to its present standard has been difficult enough even with labelling.
Once again, the Secret Science Club presented a fine, fine lecture. Kudos to Dr Fergus, Secret Science goddesses Dorian and Margaret, and the staff of the beautiful Bell House.