Fig 1.
Two nonlinear recurrent networks consisting of input, hidden, and output neurons possess an identical input-output relationship, but behave differently in the presence of noise.
(a-b) A single input u (teal) drives two recurrent neurons x1 (yellow) and x2 (red), which project to a single output y (purple). The connectivity patterns of the two networks are related by the task-preserving transformation (4). Line thickness denotes synaptic strength. (c) Input trajectory, fed to both networks. The horizontal axis in all panels is time. (d) Output trajectory, produced by both networks when run according to the deterministic dynamics (1). (e-f) Hidden-unit activity under deterministic dynamics, for networks (a) and (b) respectively. The trajectories are identical up to a per-neuron scale factor, determined by the parameters of the task-preserving transformation. (g-h) Hidden-unit activity for networks (a) and (b) respectively when additive Gaussian noise is injected into the neural dynamics (1); 50 trials are shown. (i-j) Output activity in the noisy case.
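To make the rescaling concrete, here is a minimal numpy sketch. It assumes the task-preserving transformation (4) takes the diagonal form W → G W G⁻¹ with G = diag(e^h), where h holds the per-neuron log scale factors mentioned above, and it uses a generic leaky rate dynamics with a ReLU nonlinearity as a stand-in for (1); all parameter values and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2  # two hidden neurons, as in panels (a)-(b)

# Hypothetical parameters of the original network: recurrent weights W,
# input weights b, and readout weights w_out.
W = rng.normal(size=(N, N))
b = rng.normal(size=N)
w_out = rng.normal(size=N)

def run(W, b, w_out, u, dt=0.1):
    """Generic discretized rate dynamics standing in for (1):
    x <- x + dt * (-x + relu(W x + b u))."""
    x = np.zeros(W.shape[0])
    ys = []
    for u_t in u:
        x = x + dt * (-x + np.maximum(0.0, W @ x + b * u_t))
        ys.append(w_out @ x)
    return np.array(ys)

# Per-neuron log scale factors h define G = diag(exp(h)). Because
# exp(h) > 0 and ReLU is positively homogeneous, rescaling the hidden
# units (and the input/readout weights to match) leaves the
# input-output map unchanged.
h = np.array([0.7, -0.4])
G, G_inv = np.diag(np.exp(h)), np.diag(np.exp(-h))
W2, b2, w_out2 = G @ W @ G_inv, G @ b, G_inv @ w_out

u = rng.normal(size=200)  # an arbitrary input trajectory, as in panel (c)
assert np.allclose(run(W, b, w_out, u), run(W2, b2, w_out2, u))
```

The hidden trajectories of the two networks differ by the factors exp(h) per neuron, as in panels (e-f), while the outputs coincide exactly, as in panel (d).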
Fig 2.
The local computations underlying synaptic balancing.
Each synapse (indicated in teal) calculates a cost as a function of synaptic strength, as in (13). Neuron k receives signals of incoming synaptic cost ckj and outgoing synaptic cost cik (teal arrows from synapses to soma) and computes the difference gk as in (17). The signal gk then propagates outwards (purple arrows from soma to synapses) to modify the strength of incoming and outgoing synaptic connections, as in (18), such that the total incoming and total outgoing costs are eventually balanced in every neuron.
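A minimal sketch of these local computations, assuming a quadratic synaptic cost as a stand-in for (13) and a multiplicative Euler step as an assumed form of the update (18); the network itself is an arbitrary illustrative example.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6

# Hypothetical recurrent weight matrix; W[i, k] is the synapse from
# neuron k to neuron i, with no self-connections.
W = rng.normal(size=(N, N)) * (rng.random((N, N)) < 0.5)
np.fill_diagonal(W, 0.0)

def neuron_signals(W):
    """g_k = total incoming cost minus total outgoing cost, as in (17),
    using the quadratic cost c_ik = w_ik**2 as a stand-in for (13)."""
    C = W ** 2
    return C.sum(axis=1) - C.sum(axis=0)

# Euler steps of an assumed form of the local update (18): each synapse
# is scaled multiplicatively by the g-signals of its two endpoint
# neurons, which weakens the incoming (and strengthens the outgoing)
# synapses of any neuron carrying surplus incoming cost. Note that the
# product w_ik * w_ki of each reciprocal pair is preserved.
eps = 0.01
for _ in range(20000):
    g = neuron_signals(W)
    W *= np.exp(-0.5 * eps * (g[:, None] - g[None, :]))

print(np.abs(neuron_signals(W)).max())  # near zero once costs balance
```

For the quadratic cost this update is gradient descent on the total cost as a function of h, so every neuron's incoming and outgoing costs are driven toward equality, as the caption describes.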
Fig 3.
Network topology determines the geometry of the task-preserving manifold and the dynamics of synaptic balancing.
Top row: ReLU network with two hidden units connected by a single synapse, corresponding to initial synaptic cost c12. Bottom row: ReLU network with two hidden units connected reciprocally, with initial synaptic costs c12 and c21. (a-b) Network diagrams showing topology and initial synaptic costs, indicated by line thickness. Input and output neurons are not shown. (c-d) Trajectories of synaptic costs over the course of synaptic balancing. Line colors match synapse colors in panels (a-b). Panel (c) matches (31) and panel (d) matches (29). (e-f) The feasible set of h satisfying (22), with flow lines indicating the trajectory of synaptic balancing. Panel (f): the red point indicates the (finite) minimizer of the total cost. (g-h) Tradeoff between c12 and c21 as a function of h2 − h1, with the red point indicating the optimal value of the total cost C = c12 + c21. (i-j) Total cost C as a function of h2 − h1. (i) The optimal cost is zero, attained at an infinite value of h. (j) The optimal cost is positive, attained at finite h.
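The cost curves in panels (g-j) can be reproduced in a few lines, assuming the quadratic cost c_ij = w_ij² as a stand-in for (13) and the rescaling w_ij → e^(h_i − h_j) w_ij implied by the task-preserving transformation; the weight values are illustrative.

```python
import numpy as np

# Two hidden neurons; w12 is the weight from neuron 2 to neuron 1 and
# w21 the reverse (bottom-row network). Values are illustrative.
w12, w21 = 2.0, 0.5

# Under the assumed rescaling w_ij -> exp(h_i - h_j) * w_ij, with the
# quadratic cost c_ij = w_ij**2 standing in for (13), both costs depend
# on h only through delta = h2 - h1:
delta = np.linspace(-3.0, 3.0, 601)
c12 = w12**2 * np.exp(-2.0 * delta)  # weakens as h2 - h1 grows
c21 = w21**2 * np.exp(+2.0 * delta)  # strengthens as h2 - h1 grows

C_reciprocal = c12 + c21                   # panels (h), (j): finite minimum
delta_star = 0.5 * np.log(abs(w12 / w21))  # minimizer, where c12 == c21

# Top-row network: with w21 = 0 the total cost decays monotonically in
# delta, so the optimum C = 0 is only attained as delta -> infinity,
# as in panels (g), (i).
C_single = w12**2 * np.exp(-2.0 * delta)
```

At delta_star the two costs are equal (both sqrt(c12 · c21) initially, i.e. |w12 · w21|), which is exactly the balance condition for this two-neuron network.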
Fig 4.
The balance condition in trained networks.
(a) ReLU networks with N = 256 neurons are trained via gradient descent on a context-dependent integration (CDI) task modeled after [23], with varying levels of ℓ2 penalty λ. The norm of neural gradients, ‖g‖, is shown over the course of training. When λ = 0, neural gradients remain fixed under gradient-descent dynamics. When λ > 0, trained solutions tend towards the balance condition (21), with ‖g‖ → 0. Histograms at right denote the empirical null distribution of ‖g‖ under permutation of the rows of C at the final training iteration. For positive values of λ, the actual value of ‖g‖ (dotted line) falls significantly below the null distribution. (b) Synaptic balancing with the robustness cost function (13) is applied to several ReLU networks trained with λ = 0.3 (corresponding to the purple curve in panel a). The original and balanced networks are run on the CDI task at varying levels ε of Gaussian noise injected into the hidden dynamics. For each network pair, the ratio of the task loss of the balanced network to that of the original network is plotted as a function of ε. As the dynamics become more noisy, the performance of the original networks (as measured by task loss) degrades faster than that of the corresponding balanced networks. Inset: total cost C (12) of the original vs. balanced networks.
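A sketch of the permutation test behind the histograms in panel (a); the cost matrix here is a random illustrative stand-in rather than a trained network's, and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def g_norm(C):
    """||g|| for a synaptic cost matrix C, where g_k is the row sum
    (incoming cost) minus the column sum (outgoing cost), as in (17);
    the balance condition (21) is g = 0."""
    return np.linalg.norm(C.sum(axis=1) - C.sum(axis=0))

# Illustrative stand-in for the cost matrix of a trained N = 256 network.
C = rng.random((256, 256))

actual = g_norm(C)
# Empirical null: permuting the rows of C destroys any per-neuron
# alignment between incoming and outgoing costs; recompute ||g|| for
# each permutation to build the null distribution.
null = np.array([g_norm(C[rng.permutation(256), :]) for _ in range(1000)])
p_value = (null <= actual).mean()
```

An actual ‖g‖ falling far below this null, as in panel (a) for λ > 0, indicates that training has aligned each neuron's incoming and outgoing costs far beyond what the marginal cost statistics alone would produce.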
Fig 5.
Dynamics of synaptic balancing in a 12-neuron ring network with a single perturbed synapse.
(a) Left: A ring network at an initial equilibrium C*i with all synaptic costs equal to 1. Nodes indicate neurons; arrows indicate directed synapses. Center: Synapse A, from neuron j to neuron i, is instantaneously potentiated to a larger synaptic cost. Right: Synaptic balancing relaxes the network to a new equilibrium C*f. Synapse colors and thicknesses indicate synaptic cost. Neuron colors indicate the value of h*f according to the color scheme of panel (c). (b) Time course of synaptic balancing following the perturbation. Incoming synapses to neuron i (A and D) are weakened, and outgoing synapses from neuron i (B and C) are strengthened. Incoming synapses to neuron j (B and E) are strengthened, and outgoing synapses from neuron j (A and F) are weakened. Synapses distant from the site of perturbation (H and G) respond more slowly than proximate synapses, though they reach the same equilibrium values. (c) Three eigenmodes v3, v5, v7, with eigenvalues λ3, λ5, λ7, of the Laplacian matrix L0 corresponding to the conductance matrix at the moment of perturbation. Color indicates the mode value at each neuron. (d) The dynamics of h approximately decompose in the basis of Laplacian eigenmodes. The scalar projection of h onto each mode v is shown along with the quadratic approximation (39), using L0 as the Laplacian. Line colors match the eigenvalue colors in panel (c).
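A sketch of the eigenmode analysis in panels (c-d), assuming L0 is the ordinary graph Laplacian of a conductance matrix K; the symmetrized unit-strength ring used for K here is illustrative, and the paper's exact construction of K from synaptic costs may differ.

```python
import numpy as np

N = 12  # ring size, as in panel (a)

# Assumed conductance matrix K at the moment of perturbation: a
# symmetrized unit-strength ring (illustrative stand-in).
K = np.zeros((N, N))
for k in range(N):
    K[k, (k + 1) % N] = K[(k + 1) % N, k] = 1.0

L0 = np.diag(K.sum(axis=1)) - K    # graph Laplacian of K
lams, modes = np.linalg.eigh(L0)   # eigenvalues lam_k, eigenmodes v_k

def mode_projection(h_traj, k):
    """Scalar projection of each h(t) (rows of h_traj) onto mode v_k,
    as plotted in panel (d)."""
    return h_traj @ modes[:, k]

# Under the quadratic approximation (39), each projection relaxes
# roughly independently at a rate set by its eigenvalue (on the order
# of exp(-lam_k * t)); smooth, small-eigenvalue modes decay slowest,
# which is why synapses far from the perturbation respond more slowly
# in panel (b).
```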