Discussion 2: The quest for an architecture continues

The genesis and definition of biological abstraction will be given shortly. In the meantime, as the exploration for a design architecture continued, the next approach considered was the modeling form known as neural networks. More specifically, these computing systems, called Artificial Neural Networks (ANNs), are built from a collection of connected units called artificial neurons, whose behavior and connection patterns are (very) loosely based on biological neurons.

The property that distinguishes neural networks from the symbolic models that are employed in conventional knowledge engineering research concerns the manner in which representations are manipulated and transformed.

In symbolic systems, representations are governed by rules that are themselves represented symbolically. The rules in a symbolic system operate on representations in a manner that is sensitive to their constituent structure. That is, symbolic representations can have a complex structure in that they are composed of symbols that stand in syntactic relations to other symbols. Rules can operate on a symbolic representation by virtue of its syntactic structure alone.

In artificial neural network models, the relation between representations and rules is very different. Representations are not viewed as formal objects, and therefore they are not subject to direct manipulation. The rules in a neural network operate at the level of the individual processing units: they specify how units compute their activation, how activation is transmitted, and how connection “strengths” are modified. Consequently, a representation does not act as a unit to affect other representations. Rather, individual processing units interact and causally affect the activation of other units. Behavior emerges from the way in which the interacting units are connected; it does not arise from the direct interaction and transformation of representations.
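To make the unit-level rules concrete, the following is a minimal sketch in Python of how one unit might compute its activation and how activation might be transmitted to a receiving group of units. The logistic squashing function and the names unit_activation and propagate are assumptions chosen for illustration; they are not taken from the evaluation described in this discussion.

```python
import math

def unit_activation(inputs, weights, bias=0.0):
    """Activation of one unit: weighted sum of inputs passed through a logistic function.

    The logistic function is an illustrative assumption; other squashing
    functions would serve equally well for this sketch.
    """
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))

def propagate(activations, weight_rows, biases):
    """Transmit activation: each receiving unit reads the activations of the sending units."""
    return [unit_activation(activations, row, b)
            for row, b in zip(weight_rows, biases)]

# Two sending units feed two receiving units (hypothetical values).
sending = [1.0, 0.0]
receiving = propagate(sending, [[2.0, 0.0], [0.0, 5.0]], [0.0, 0.0])
```

Note that nothing in the sketch manipulates a representation as a whole. The only operations are local to individual units and their connections, which is the point of the contrast with symbolic systems.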

Because ANNs incorporate at least some basic properties of biological networks, their representations consist of patterns of activation across populations of connected units. An individual unit may participate in several distinct representations. Additionally, individual units in ANNs perform computations in parallel, somewhat like biological networks. (Biological neuron computations are asynchronous, not temporally parallel.) The activation of each unit changes constantly in response to the activity of the other units to which it is connected.

Also, ANNs learn from experience. The models discover a set of connection strengths that capture the internal structure of a domain. These connection strengths are typically found by means of error-driven adjustments in the connections among units.
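As an illustration of error-driven adjustment, the sketch below applies the delta rule (also known as the Widrow-Hoff rule) to a single linear unit. The choice of this particular rule, the training data, the learning rate and the function names are all assumptions made for the example; the discussion does not specify which learning rule was used.

```python
def predict(weights, inputs):
    """Output of a single linear unit."""
    return sum(w * x for w, x in zip(weights, inputs))

def train_delta_rule(samples, n_inputs, rate=0.1, epochs=500):
    """Adjust connection strengths in proportion to the output error (delta rule)."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, inputs)
            # Each connection changes by: rate * error * activity of its input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Hypothetical domain whose internal structure is: target = 2*x1 - x2
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
learned = train_delta_rule(samples, n_inputs=2)
```

In this toy domain the learned strengths approach 2 and -1, the coefficients that generated the targets, although the unit was never told those values directly.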

ANNs provide good platforms for modeling pattern recognition, motor control and associative learning systems. Such tasks appear to require mechanisms for processing massive amounts of information in parallel while satisfying large sets of probabilistic constraints, much like the supercomputer simulation. The connection between two units is a constraint in that it determines whether or not one unit should be activated when another unit is activated. This constraint is “soft” in that it can be easily overridden by the input of other units. The network behaves in a way that balances a number of these soft constraints, and therefore a network “settles” into stable representational patterns.
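The idea of a network settling under soft constraints can be sketched with a small Hopfield-style network. This specific model is an assumption introduced here for illustration and is not named in the discussion. Each symmetric connection acts as a soft constraint between two units, and repeated unit-by-unit updates move the network into a stable pattern.

```python
def train_hopfield(patterns):
    """Build symmetric connection strengths from +1/-1 patterns (Hebbian outer product)."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w

def settle(w, state, max_sweeps=10):
    """Update units one at a time until no unit changes (a stable pattern)."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            net = sum(w[i][j] * state[j] for j in range(n))
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

# Two hypothetical stored patterns, and a probe that is p1 with its first unit flipped.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hopfield([p1, p2])
probe = [-1, 1, 1, 1, -1, -1, -1, -1]
settled = settle(w, probe)
```

The flipped unit in the probe is overridden by the combined input of the other units, which is the sense in which each individual constraint is "soft".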

Finally, ANNs are alluring alternatives to symbolic AI models because they exhibit spontaneous generalization. These models can respond appropriately to novel stimuli once they have learned the internal structure of a domain. And generalization in any manner is considered the first step to gestalt abstraction.

With all of the attractive qualities of neural networks, it would seem at first that they would be an excellent candidate for a model of the Organon Sutra architecture, and indeed, they were extensively evaluated. Although they are the darling of much current AI research, they too, like the supercomputer simulation, suffered from a number of disqualifying characteristics.

Most of these disadvantages stem from the many differences between the models used for ANNs and the true functionality of biological neurons, differences that each seemed surmountable when analyzed individually. Take for instance a common dilemma experienced in many neural network implementations: it is often difficult to balance the dynamic range of a network against its minimum signal sensitivity. If the network is tuned to handle large inputs, in an effort to extend its dynamic range, it tends to ignore small inputs, and the minimum sensitivity suffers. Conversely, if the network is sensitive to small inputs, large inputs tend to saturate the system and dynamic range suffers. This situation is akin to the sensitivity of our biological eyes. Our optic reception must have good minimum sensitivity in any light environment, whether we are reading by the light of a single candle at night, or looking at a road sign backlit by the bright sun during full daylight. Approaches have been developed to achieve this balance in neural networks, but this gain also has a factor cost.
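The dilemma can be seen in a single saturating unit. The sketch below is only an illustration under stated assumptions: a tanh response, a hypothetical gain parameter, and a simple measure of how well the unit distinguishes two input levels. It does not reproduce any of the balancing approaches mentioned above.

```python
import math

def response(x, gain):
    """Saturating response of one unit; tanh and the gain parameter are illustrative choices."""
    return math.tanh(gain * x)

def discriminability(a, b, gain):
    """How different the unit's responses are for two input levels."""
    return abs(response(a, gain) - response(b, gain))

# High gain: sensitive to small inputs, but large inputs saturate.
small_high = discriminability(0.01, 0.02, gain=10.0)
large_high = discriminability(5.0, 10.0, gain=10.0)

# Low gain: large inputs stay distinguishable, but small inputs are nearly ignored.
small_low = discriminability(0.01, 0.02, gain=0.1)
large_low = discriminability(5.0, 10.0, gain=0.1)
```

With a single fixed gain, improving one of the two figures degrades the other, which is why any remedy must add machinery beyond the basic unit.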

Similarly, many artificial network models suffer from a phenomenon called “catastrophic interference”. This occurs when a model rapidly forgets well-learned conditioning once it is trained on a new set of training data. It happens primarily in situations involving the rapid acquisition of arbitrary associations between inputs and outputs, when networks are required to model a variety of memory phenomena. Like the previous disadvantage, mitigating this phenomenon also has a factor cost.
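A very small case of this forgetting can be sketched with a single linear unit trained sequentially on two associations that share inputs. The unit, the delta-rule training, the two tasks and all names below are assumptions chosen to keep the example short; real cases of catastrophic interference involve much larger networks.

```python
def predict(weights, inputs):
    """Output of a single linear unit."""
    return sum(w * x for w, x in zip(weights, inputs))

def train(weights, samples, rate=0.1, epochs=200):
    """Error-driven (delta rule) training on the given samples only."""
    weights = list(weights)
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights

# Two hypothetical associations with overlapping input patterns.
task_a = [((1.0, 1.0), 1.0)]
task_b = [((1.0, 0.5), -1.0)]

w_after_a = train([0.0, 0.0], task_a)
error_a_before = abs(1.0 - predict(w_after_a, (1.0, 1.0)))

# Train on task B alone, with no further exposure to task A.
w_after_b = train(w_after_a, task_b)
error_a_after = abs(1.0 - predict(w_after_b, (1.0, 1.0)))
error_b_after = abs(-1.0 - predict(w_after_b, (1.0, 0.5)))
```

Task A is learned essentially perfectly, yet after training on task B its error becomes large, because both associations are carried by the same connection strengths.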

There is also the reality that current network technology limits the applications of neural networks to small versions of real problems. However, small systems may not be adequate to model the environments we may find our artificial agent interacting with. In order to scale networks up to sizes that could model real world environments, several theoretical and computational limitations must be overcome, vastly increasing the already growing factor costs.

Neural networks store patterns or learned derivations by a process of distributed encoding. They superimpose pattern information on the many node connections between artificial neurons. Distributed encoding enables neural networks to compute partial patterns and filter unwanted signals from noisy input.

But neural networks pay a price for distributed encoding: crosstalk. Distributed encoding produces crosstalk or interference between stored patterns. Similar patterns clump together. New patterns may crowd out older learned patterns. Older patterns may distort new patterns.
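Crosstalk can be shown with a simple linear associative memory, in which every stored pair is superimposed on the same set of connections. The model, the stored pairs and the function names are assumptions made for this sketch and are not drawn from the evaluation described in this discussion.

```python
def store(pairs, n_in, n_out):
    """Superimpose every (key, value) pair on one shared weight matrix (Hebbian outer product)."""
    w = [[0.0] * n_in for _ in range(n_out)]
    for key, value in pairs:
        for i in range(n_out):
            for j in range(n_in):
                w[i][j] += value[i] * key[j]
    return w

def recall(w, key):
    """Read out the value associated with a key."""
    return [sum(wij * kj for wij, kj in zip(row, key)) for row in w]

v1, v2 = [1.0, 0.0], [0.0, 1.0]
k1 = [1.0, 0.0, 0.0, 0.0]

# Dissimilar (orthogonal) keys: recall is clean.
k2_orthogonal = [0.0, 1.0, 0.0, 0.0]
clean = recall(store([(k1, v1), (k2_orthogonal, v2)], 4, 2), k1)

# Similar (overlapping) keys: the second pattern leaks into recall of the first.
k2_similar = [0.6, 0.8, 0.0, 0.0]
noisy = recall(store([(k1, v1), (k2_similar, v2)], 4, 2), k1)
```

In this sketch the amount of leakage equals the overlap between the two keys, which is one way to see why similar patterns clump together.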

Granted, at the cellular level, even biological neurons suffer from crosstalk. However, in a later part of the dialog, the discussion will relate how Nature has crafted a remedy to this devilish property, in a way that has confounded contemporary neural network architectures.

Curiously, this property of distributed encoding, although it introduces the negative factor of crosstalk, also presented the most appealing aspect of neural networks. The distributed encoding seen in neural network models bears a strong resemblance to some aspects of the native behavior described as gestalt abstraction in the Introduction. Perhaps this is the reason ANNs are so popular with AI researchers. It was recognized that ANNs would provide the most intuitive platform for implementing this emergent behavior, and with an almost fervent desire, this recognition drove extensive experimentation with various forms of networks. But no combination of learning paradigms or connection models, with any assortment of convolutional, pooling or hidden layers, could demonstrate the necessary segregation of global generalizations from local generalizations (considered to be another step in gestalt abstraction needed for the architecture).

The evaluation of ANNs was confined to software simulations of neural networks in an object-oriented C++ environment, which gave the evaluation greater flexibility to experiment and to assess whether further experimentation was warranted in a more robust, non-simulated environment. But even this software appraisal was enough to demonstrate that the architecture was insufficient to provide the foundational functionality required by the Organon Sutra.

Although purported to share many characteristics with biological neurons, ANNs have great difficulties in forming and manipulating structured representations precisely because they do not exhibit several other, more crucial characteristics of their biological analog. Many of these characteristics will be detailed in a later discussion, but the most telling divergence can be seen in their networked architecture. Artificial neural ‘networks’ are so named because their units are organized into layers, with the outputs of one layer feeding the inputs of the next. In artificial neural networks, the output of a “neuron” in a layer (beyond the interface “input” layer) is typically driven by the states of all of the units comprising the preceding layer. In biological systems, the threshold for activation of a given neuron can be reached by only a subset of the total inputs to that neuron. Although Google has made exciting progress with network forms based on convolution theory, this model foundation still does not exhibit the many necessary behaviors of generalization. Because of the many distinctions from biological neurons, artificial neural networks really should be referred to as connectionist models, to better reflect their true functional nature.



Copyright © 2019 All rights reserved