Self-Organizing Multilayered Neural Networks of Optimal Complexity
Introduction
The well-known principles of self-organization are used to synthesize neural networks on unrepresentative learning sets and to eliminate the a priori uncertainty in their structures. Self-organization can be realized under two conditions: first, if various structures of the neural network can be generated and, second, if the best of them can be selected by a criterion of their efficiency. The complexity of the learned neural network is optimal if its variety is adequate to the task under the minimal number of nodes and synaptic connections.
For F. Rosenblatt's well-known perceptron, which consists of input (sensor), associative, and adjustable layers of nodes, the complexity is not optimal because the synaptic links between its layers are defined randomly and redundantly. It has been shown that adding new layers to its structure improves its recognition capability. The redundancy of the neural network structure can be reduced by random search methods that select the structural modifications which decrease the value of a loss function.
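To make the generate-and-select principle concrete, the following Python sketch repeatedly generates structural variants of a network and keeps any variant that decreases the loss. The mutate and evaluate routines and the toy connection-set task are illustrative assumptions, not the methods of this paper.

```python
import random

def self_organize(seed_net, mutate, evaluate, n_candidates=20, n_rounds=10):
    """Generate-and-select loop: repeatedly generate structural variants of
    the current best network and keep any variant with a lower loss on the
    learning set. A sketch of the principle, not the paper's exact procedure."""
    best_net, best_loss = seed_net, evaluate(seed_net)
    for _ in range(n_rounds):
        # Condition 1: various structures of the network can be generated.
        for net in (mutate(best_net) for _ in range(n_candidates)):
            # Condition 2: the best structure is selected by an efficiency criterion.
            loss = evaluate(net)
            if loss < best_loss:  # keep only modifications that decrease the loss
                best_net, best_loss = net, loss
    return best_net, best_loss

# Toy usage: a "network" is a frozenset of active connections out of 8 possible;
# the loss counts mismatches with a target wiring (purely illustrative).
target = frozenset({0, 3, 5})
mutate = lambda net: net ^ frozenset({random.randrange(8)})  # flip one connection
evaluate = lambda net: len(net ^ target)
print(self_organize(frozenset({0, 1, 2}), mutate, evaluate))
```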
Criteria of Efficiency
The behavior of a neural network with m input variables x1, ..., xm and one output y is described by a function f(x1, ..., xm). The self-organization of the neural network is performed on an unrepresentative learning set composed of a small number n of independent instances for which the classification y is known. Within heuristic self-organization, the desired neural network is described by the F. Rosenblatt scheme that consists of the sensor and associative layers. The synthesis of the associative layers is made with a reference function g(u1, ..., up) of p arguments u1, ..., up, typically p = 2. The reference function g() can belong to an arbitrary class of functions (e.g., the Kolmogorov-Gabor polynomials, the logical functions).
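As an illustration, a reference function of p = 2 arguments may be taken as the second-degree Kolmogorov-Gabor polynomial g(u1, u2) = a0 + a1*u1 + a2*u2 + a3*u1*u2 + a4*u1^2 + a5*u2^2, a common choice in GMDH-type networks. The sketch below fits its coefficients by least squares on the learning set; the polynomial form and the estimation method are assumptions for illustration, since the paper does not fix them here.

```python
import numpy as np

def fit_reference_function(u1, u2, y):
    """Fit g(u1, u2) = a0 + a1*u1 + a2*u2 + a3*u1*u2 + a4*u1**2 + a5*u2**2
    by least squares on the learning set (an illustrative assumption)."""
    A = np.column_stack([np.ones_like(u1), u1, u2, u1 * u2, u1**2, u2**2])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

def apply_reference_function(coeffs, u1, u2):
    """Evaluate the fitted reference function on argument vectors u1, u2."""
    A = np.column_stack([np.ones_like(u1), u1, u2, u1 * u2, u1**2, u2**2])
    return A @ coeffs

# Toy learning set of n = 6 instances (values are assumptions):
u1 = np.array([0.1, 0.4, 0.7, 1.0, 0.3, 0.9])
u2 = np.array([1.0, 0.8, 0.5, 0.2, 0.6, 0.1])
y  = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 1.0])
a = fit_reference_function(u1, u2, y)
print(apply_reference_function(a, u1, u2))
```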
Criteria of Self-Organization
For self-organizing a neural network of optimal complexity, we suggest exterior criteria that are free of the above-mentioned drawbacks. These criteria are realized by computing the value ε of an empirical function introduced to evaluate the loss of neural network accuracy over the whole learning set.
Statement 1. Let the values εi, εj, and εk of the loss be known for the neural network fi^r of layer r, the neural network fj^(r-1) of the previous layer, and the feature xk, respectively, all of which are used to solve the task of dichotomy classification. Note that fj^0 = xj, j ≠ k = 1, ..., m. Then, to select the best neural networks generated in layer r, it is sufficient to verify the condition

εi < min(εj, εk). (7)

This condition holds when the structural modifications that the reference function g(fj^(r-1), xk) brings into the neural network of layer r are new ones which do not yet belong to the previous neural network fj^(r-1). By definition, these modifications form the exterior addition that the feature xk brings to the neural network fj^(r-1).
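A minimal sketch of the selection rule (7): a candidate network of layer r survives only if its loss is lower than both the loss of its parent network from layer r-1 and the loss of the feature xk it adds. The function names and the candidate tuple layout are assumptions of the sketch, not the paper's data structures.

```python
def passes_selection(loss_candidate, loss_parent, loss_feature):
    """Condition (7): the candidate fi^r = g(fj^(r-1), xk) is kept only if its
    loss is below both the parent network's loss and the feature's own loss,
    i.e. the modification forms a genuine exterior addition."""
    return loss_candidate < min(loss_parent, loss_feature)

def select_layer(candidates):
    """Filter one generated layer. Each candidate is a tuple
    (network, loss_candidate, loss_parent, loss_feature); this layout
    is an assumption for illustration."""
    return [net for net, li, lj, lk in candidates if passes_selection(li, lj, lk)]
```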
Conclusion
The vertical lines, fed by the sensor nodes z1, ..., z8, carry Boolean variables. The horizontal lines, which feed the hidden nodes, are described by reference functions gi(u1, u2) depicted as solid circles. A circle placed at the intersection of a vertical and a horizontal line denotes that this sensor line is connected to an input of the formal neuron implementing the reference function gi(u1, u2). Note that the number marked near a solid circle identifies one of the 10 logical functions of two Boolean variables. Thus, the horizontal lines are activated under certain states of the vertical lines. According to rule (9), only two connections may be placed on each horizontal line of the first layer. Where the horizontal lines intersect further vertical lines below, the next layers are formed. Starting with the second layer, rule (9) allows only one connection at the points of intersection. Additionally, the horizontal lines can be split, as for the second one in Figure 1. Finally, the horizontal lines of the second layer form the variables y1, ..., y9, which are the outputs of a collective consisting of 9 equally efficient neural networks.
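One possible reading of the "10 logical functions of two Boolean variables" is the set of two-input functions that depend on both arguments (16 truth tables minus the 2 constants and the 4 one-argument functions). The sketch below enumerates them under that assumption and evaluates a small two-layer network; the function numbering and the wiring are illustrative, not those of Figure 1.

```python
from itertools import product

def two_input_functions():
    """Enumerate the Boolean functions of two variables that depend on both
    arguments: of the 16 truth tables, the 2 constants and the 4 one-argument
    functions are excluded, leaving 10. The numbering below is an assumption,
    not the paper's indexing of the 10 functions."""
    funcs = {}
    for table in product([0, 1], repeat=4):
        f = lambda u1, u2, t=table: t[2 * u1 + u2]  # truth-table lookup
        depends_u1 = any(f(0, u2) != f(1, u2) for u2 in (0, 1))
        depends_u2 = any(f(u1, 0) != f(u1, 1) for u1 in (0, 1))
        if depends_u1 and depends_u2:
            funcs[len(funcs)] = f
    return funcs

FUNCS = two_input_functions()  # 10 functions, indexed 0..9

# Hypothetical wiring: each hidden node is (function index, input indices).
# First-layer nodes connect to two sensor lines; the rest of the wiring
# is an example, not the network of Figure 1.
layer1 = [(0, (0, 1)), (5, (2, 3))]
layer2 = [(9, (0, 1))]

def eval_layer(layer, inputs):
    """Apply each node's logical function to its two selected inputs."""
    return [FUNCS[fi](inputs[a], inputs[b]) for fi, (a, b) in layer]

z = [1, 0, 1, 1]  # states of the sensor (vertical) lines z1..z4
print(eval_layer(layer2, eval_layer(layer1, z)))
```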