By Andrew Jeavons
One approach to Artificial Intelligence (AI) has been around since the early 1940s: it is called “Artificial Neural Networks”, or ANNs for short. In 1943 McCulloch and Pitts wrote a paper describing how the way nerves in the brain are organized could serve as a basis for logical operations. Over the years ANNs grew in importance, and in the last 20 years or so they have become one of the main AI approaches. It’s always tempting to believe that AI is a new concept, but it isn’t. The use of AI in MR may be new, but as a discipline AI has been around for nearly 80 years.
What has happened in the last 20 years is that computing power has grown to a point where we can start to use ANNs on large-scale problems. ANNs, as the name implies, work by mimicking the way neurones work in the brain. Neurones have an interesting way of passing on information: they work on an “all or none” principle. A neurone will not send information on unless the input it receives crosses a basic threshold, so what neurones output is not proportional to their input. ANNs copy this property of neurones, but in software. By connecting thousands of artificial software neurones together we can create an artificial neuronal network.
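That “all or none” behaviour can be sketched in a few lines of code. This is a minimal illustration, not how any particular ANN library implements it; the weights and threshold below are made-up values chosen only to show the firing behaviour.

```python
# A single artificial neurone: sum the weighted inputs and fire
# ("all or none") only if the total crosses a threshold.
def neurone(inputs, weights, threshold=1.0):
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# One active input (0.6) is below the threshold: silent.
print(neurone([1, 0, 0], [0.6, 0.6, 0.6]))  # 0
# Two active inputs (1.2) cross the threshold: fires.
print(neurone([1, 1, 0], [0.6, 0.6, 0.6]))  # 1
```

Notice that doubling the input did not double the output; the neurone went from silent to firing, which is the non-proportional behaviour described above.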
The human brain has billions and billions of neurones. Current ANN systems have a tiny number of artificial neurones compared to the human brain, but they can still be used for problems where traditional analytical techniques may not produce the best answers. Until relatively recently the way ANNs were organized was not regarded as biologically plausible; that is, they were not set up in the way we might expect the brain to be organized. This has begun to change as we have gained a deeper understanding of the brain and as computers have become faster.
So what are ANNs useful for? The first uses of ANNs were as predictive systems: you build an ANN that can predict an outcome based on a set of input data. This might be used as a market modeling tool, to predict how products will behave in a marketplace. Sadly it does not appear to work for lottery numbers. ANNs can be useful when there is a complex relationship between the input variables and the output value. A disadvantage of ANNs is that they can’t “explain” how they produce an output value. Traditional statistics allows you to see how input variables affect an outcome variable; ANNs don’t have this capability unless you run a lot of simulations. But by first using traditional statistics on a problem to get a good idea of the relationship between the input variables and the output, an ANN can then be used as an optimal way to model that relationship.
One area where ANNs have become very popular is image recognition; a certain kind of network called a convolutional neural network is popular for this type of problem. The way this network is configured has some parallels with the way the brain is organized, a change from previous ANNs, which were not biologically plausible. Convolutional neural networks are complex and require a lot of computing power, but they are being used increasingly.
Both of the examples of ANNs given so far process data and produce a prediction, either a value or the identity of an image. A newer generation of ANNs has a different function. Google has produced a neural network architecture that processes text and reduces the content of a piece of text to a list of numbers, usually 300 to 400 of them. This list is called a “vector”, hence the name “document to vector” for the algorithm (doc2vec). What is interesting is that these numbers can be compared to give a value which indicates how close two documents are to each other. A document can be anything, from a sentence to a 10-page article. If trained on the right data, the doc2vec algorithm can tell you that the sentence “the cat sat on the mat” is closer to the sentence “the dog sat on the mat” than to the sentence “the chair sat on the mat”. Note that I said the “right data”. Having a big enough and representative enough dataset is critical for any analytic technique, and ANNs are no different.
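The comparison step can be sketched without any neural network at all. Once each document has been reduced to a vector, closeness is commonly measured with cosine similarity. The three-number vectors below are invented purely for illustration; real doc2vec vectors have 300 to 400 dimensions and would come from a trained model.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by the
    # product of their lengths. 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical (made-up) document vectors:
cat = [0.9, 0.1, 0.3]    # "the cat sat on the mat"
dog = [0.8, 0.2, 0.3]    # "the dog sat on the mat"
chair = [0.1, 0.9, 0.2]  # "the chair sat on the mat"

# The cat sentence is closer to the dog sentence than to the chair one.
print(cosine_similarity(cat, dog) > cosine_similarity(cat, chair))  # True
```

The vectors here are toys, but the arithmetic is exactly what happens to real document vectors: two documents about similar things end up pointing in similar directions.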
Finally, there is a movie written by an ANN. It’s worth watching, and reading the article about how it was made. This link has the movie and article. It won’t win any Oscars, that is for sure.
ANNs won’t be taking over anyone’s job anytime soon, but they do provide a unique approach to problems where traditional analytical techniques may not perform well. We will have to wait for SKYNET a bit longer.
Andrew Jeavons, Mass Cognition