Noise and Random-like Behavior of Perceptrons: Theory and Application to Protein Structure Prediction
COMPIANI, Mario
1997-01-01
Abstract
In the first part of this paper we study the performance of a single-layer perceptron that is expected to classify patterns into classes in the case where the mapping to be learned is corrupted by noise. Extending previous results concerning the statistical behavior of perceptrons, we distinguish two mutually exclusive kinds of noise (I noise and R noise) and study their effect on the statistical information that can be drawn from the output. In the presence of I noise, the learning stage results in the convergence of the output to the probabilities that the input occurs in each class. R noise, on the contrary, perturbs the learning of probabilities to the extent that the performance of the perceptron deteriorates and the network becomes equivalent to a random predictor. We derive an analytical expression for the efficiency of classification of inputs affected by strong R noise. We argue that, from the standpoint of the efficiency score, the network is equivalent to a device performing biased random flights in the space of the weights, which are ruled by the statistical information stored by the network during the learning stage. The second part of the paper is devoted to the application of our model to the prediction of protein secondary structures where one has to deal with the effects of R noise. Our results are shown to be consistent with data drawn from experiments and simulations of the folding process. In particular, the existence of coding and noncoding traits of the protein is properly rationalized in terms of R-noise intensity. In addition, our model provides a justification of the seeming existence of a relationship between the prediction efficiency and the amount of R noise in the sequence-to-structure mapping. Finally, we define an entropylike parameter that is useful as a measure of R noise.
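The statement that, under I noise, the outputs converge to the probabilities that an input occurs in each class can be illustrated with a small numerical sketch. It is not the paper's model: it assumes that the noise amounts to the same input pattern being assigned to different classes with fixed probabilities, that inputs are one-hot coded, and that a sigmoid perceptron is trained by batch gradient descent on the squared error. All names (`make_samples`, `forward`, `train`, `empirical_frequencies`) and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_samples(class_probs, n_samples, seed=0):
    """Draw (one-hot input, label) pairs; pattern i is labeled from class_probs[i]."""
    rng = random.Random(seed)
    n_patterns = len(class_probs)
    samples = []
    for k in range(n_samples):
        i = k % n_patterns  # cycle so that every pattern is equally frequent
        x = [1.0 if j == i else 0.0 for j in range(n_patterns)]
        classes = range(len(class_probs[i]))
        label = rng.choices(classes, weights=class_probs[i])[0]
        samples.append((x, label))
    return samples


def forward(w, x):
    """Outputs of a single-layer perceptron with one sigmoid unit per class."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(row, x))) for row in w]


def train(samples, n_inputs, n_classes, epochs=800, lr=5.0):
    """Batch gradient descent on the mean squared error (assumed training rule)."""
    w = [[0.0] * n_inputs for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_inputs for _ in range(n_classes)]
        for x, label in samples:
            out = forward(w, x)
            for c in range(n_classes):
                target = 1.0 if c == label else 0.0
                delta = (out[c] - target) * out[c] * (1.0 - out[c])
                for i, xi in enumerate(x):
                    grad[c][i] += delta * xi
        for c in range(n_classes):
            for i in range(n_inputs):
                w[c][i] -= lr * grad[c][i] / len(samples)
    return w


def empirical_frequencies(samples, n_patterns, n_classes):
    """Fraction of times each one-hot pattern was labeled with each class."""
    counts = [[0] * n_classes for _ in range(n_patterns)]
    for x, label in samples:
        counts[x.index(1.0)][label] += 1
    return [[c / sum(row) for c in row] for row in counts]


if __name__ == "__main__":
    # Hypothetical class probabilities for three input patterns.
    probs = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
    data = make_samples(probs, 300)
    weights = train(data, n_inputs=3, n_classes=3)
    freqs = empirical_frequencies(data, 3, 3)
    for i in range(3):
        pattern = [1.0 if j == i else 0.0 for j in range(3)]
        print(i, [round(o, 3) for o in forward(weights, pattern)], freqs[i])
```

In this simplified setting each pattern has its own weight per output unit, so the squared error is minimized when each output equals the frequency with which that pattern was seen in the corresponding class. The sketch does not attempt to reproduce R noise, the efficiency expression, or the entropylike parameter, whose definitions are not given in the abstract.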