# Capacity and complexity in learning binary perceptrons

#### Abstract

A basic neural model for Boolean computation is examined in the context of learning from examples. The binary perceptron, with weights constrained to vertices of the cube $\{-1,1\}^n$ and computing a linear threshold function of its inputs with a fixed threshold of zero, is shown to be sufficient as a basis for Boolean computation. Given a set of positive training examples drawn from vertices of the cube, the immediate goal is loading: seeking a binary perceptron whose mapping behavior is consistent on the training set. The loading problem, equivalent to binary integer programming, is shown to be intractable in general. Three algorithmic approaches to this theoretical bottleneck are pursued: Harmonic Update is a randomized on-line learning algorithm with linear time complexity; Directed Drift, also randomized, operates on-line in its simplest incarnation but extends readily to batch mode; and Majority Rule is an off-line learning algorithm that executes deterministically in linear time. Each algorithm's capability in loading randomly drawn training examples is quantified by its capacity, a probability threshold function that intuitively captures the size of the largest typical training set the learning algorithm can load with high confidence. An analogous probability threshold, the complexity function, is proposed and used to characterize each algorithm's capability in achieving generalization, the ultimate aim of learning from examples. In this setting, training examples are assumed to be randomly generated and labeled by an underlying target perceptron, and the objective is to identify the target perceptron from the examples. Intuitively, the complexity function captures the size of the smallest typical training set the learning algorithm requires to achieve perfect generalization with high confidence.
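As a concrete illustration of the model and the loading condition described above, the following minimal Python sketch (with hypothetical names, not code from the dissertation) builds a binary perceptron with weights at a vertex of $\{-1,1\}^n$, labels cube-vertex examples with it, and checks consistency on the training set:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16          # input dimension (illustrative choice)
m = 8           # number of training examples (illustrative choice)

# Hypothetical target binary perceptron: weights at a vertex of {-1,1}^n.
target_w = rng.choice([-1, 1], size=n)

def perceptron(w, x):
    """Binary perceptron output: +1 if w . x >= 0, else -1 (threshold zero)."""
    return 1 if np.dot(w, x) >= 0 else -1

# Training examples drawn from vertices of the cube, labeled by the target.
X = rng.choice([-1, 1], size=(m, n))
y = np.array([perceptron(target_w, x) for x in X])

def loads(w, X, y):
    """Loading condition: w's mapping is consistent on the whole training set."""
    return all(perceptron(w, x) == yi for x, yi in zip(X, y))

assert loads(target_w, X, y)  # the target perceptron always loads its own labels
```

The loading problem is to find *some* weight vector satisfying `loads`; since the weights range over the $2^n$ vertices of the cube, this is a binary integer program, which is why the problem is intractable in general.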
Analytical techniques employed to estimate the capacity and complexity functions of the algorithms include multivariate large-deviation normal approximation and Poisson approximation. Computer simulations are performed to corroborate the analytical results. Together, the notions of capacity and complexity provide fundamental theoretical means to characterize the loading and generalization capabilities of learning algorithms for neural models of computation in the context of distribution-dependent learning. A progression of possible research extensions based on the approach and results of the present investigation is outlined.
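For flavor, one plausible reading of an off-line, deterministic, linear-time majority-vote learner in this setting (an illustrative assumption, not necessarily the dissertation's Majority Rule algorithm) sets each weight to the majority sign, across the training set, of label times input coordinate:

```python
import numpy as np

def majority_rule(X, y):
    """Hypothetical majority-vote learner (illustrative assumption, not taken
    verbatim from the dissertation): each weight is the majority sign, over
    the training set, of label times input coordinate; ties go to +1.
    Runs off-line and deterministically, in time linear in the size of X."""
    votes = (y[:, None] * X).sum(axis=0)   # per-coordinate vote tally
    return np.where(votes >= 0, 1, -1)

# Tiny example: three labeled cube vertices in dimension 2.
X = np.array([[1, 1], [1, -1], [1, 1]])
y = np.array([1, 1, 1])
w = majority_rule(X, y)   # vote tally is [3, 1], so w = [1, 1]
```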

#### Subject Area

Electrical engineering|Computer science

#### Recommended Citation

Fang, Shao Chieh, "Capacity and complexity in learning binary perceptrons" (1995). *Dissertations available from ProQuest*. AAI9532171.

https://repository.upenn.edu/dissertations/AAI9532171