In Holland’s model for adaptation and learning in complex adaptive systems (CASs), which I introduced in Part 78, the bulletin-board messages must compete for domination and survival. Moreover, since in the game for survival in real life, certain players form syntheses or alliances for mutual benefit, Holland added this feature in his model by assigning to each classifier a certain plausibility or probability factor. The classifiers with high plausibility stood a higher chance of being selected for bulletin-board display in a given cycle; the rest were likely to be ignored or eliminated. The Hebbian reinforcement and feedback mechanism (cf. Part 74) was used for determining the plausibility factor for a classifier. The plausibility was determined by performance. If the selection of a classifier resulted in better performance, as indicated by a positive feedback from the environment, its plausibility factor was enhanced. If the performance was poor, the plausibility factor was diminished, just like the weakening of the synaptic strength in the Hebbian model of the brain.
A classifier (cf. Part 78) interacts with many others, and is influenced by them. Therefore, if it does well, the credit is not entirely its own; a chain of classifier actions leads to the successful action of the final classifier. To take note of this, Holland deviated from the general Hebbian approach and, instead, drew inspiration from how a market economy functions. The market is driven by the powerful profit motive. Let us see what this marketplace metaphor is.
In a market, there is buying and selling. If a product (say a raw material, or a service) is good, it fetches a good price from the buyer. The act of buying depletes the funds of the buyer, but the buying is done with the expectation that, by doing value-addition to the product purchased, the buyer can sell it in the market at a good price, and thus not only recover the cost of the original product, but also make a profit. There is a chain of buyers and sellers at various levels, leading to the manufacture of the end product, ready for buying or consumption by the consumer, and the profit motive operates at each level of transaction. Thus, for a successful end product, although the payment of price is made by the consumer only to the final seller, all the players in the game get rewarded. And if an end product fails, nobody buys it and the loss (or absence of profit) percolates down the line to all the persons or firms involved.
Holland used similar ideas for his computer simulation of adaptation for evolving the plausibility factors for the classifiers. The messages competing for being posted on the bulletin board are like the goods and services for sale, and the classifiers generating those messages are like the firms that produce the goods and services. The classifiers not only sell, they also buy, because they scan all the messages on the board, and get activated if the corresponding IF condition is satisfied by a displayed message. When a classifier buys, it is like a firm making a bid, and the highest bidder wins the bid. And how does the classifier ‘pay’ for the purchase? By transferring some of its plausibility strength to the supplying classifier. And since the classifiers are a highly interconnected system, the profits and losses get dispersed, just like in the market economy. Since the environment (likened to the consumer of the end product) provides a Hebbian reinforcement for good performance by feedback, the ‘understanding’ evolves at all levels that if you make a good intermediate product, you get rewarded with a profit, and if your product in not good you stand to lose. Finally the plausibility factor of each classifier evolves to a value matching its true worth to the environment, and this happens without any top-down instructions.
Holland called this part of his modelled CAS the ‘bucket brigade’ algorithm: It passes the reward from each classifier to the previous classifier down the chain. The idea is similar to how synapses get strengthened in the Hebbian model of the brain. Something similar also happens when reinforcement of synaptic weights occurs when an artificial neural network (ANN) is trained for achieving the desired performance via gradual learning.
Exploitation vs. exploration
The bucket-brigade algorithm was lacking in one additional aspect of evolution of learning. It lacked the novelty feature. It strengthened the classifiers that the system already possessed. It lacked the ability to explore new possibilities. It had the exploitation feature, but lacked the exploration feature. This was easy to handle. All that Holland had to do was to bring in his genetic algorithms (GAs). Classifiers evolved via the GA approach. The crossover feature of GAs ensured that new possibilities were explored and, once in a while, fitter classifiers arose which could replace weaker ones.
To summarize the previous post and this one, the rule-based system, along with the bucket-brigade approach and the GA, not only learned from experience, but was also innovative and creative. The bottom-up formalism eliminated the need for a deductive approach wherein rules have to be imposed for deciding what is consistent and desirable, and what is not. Instead, processes of inference, learning, and discovery emerged by induction through the bottom-up route. This was probably the first successful model of emergence in a CAS.
This formalism holds the promise of a theory of cognition.
A number of successful computer simulations have been reported. The one that impressed Holland the most in those days was the work of his student Goldberg (1989). Goldberg’s Ph. D. thesis work demonstrated how Holland’s adaptive-systems formalism could be used successfully for controlling a simulated gas pipeline, an extremely complex problem.