# Analogue building blocks for neural-inspired circuits

Steve Hall, University of Liverpool, s.hall@liv.ac.uk

Liam McDaid, University of Ulster, lj.mcdaid@ulster.ac.uk



## Some facts about the brain as a PC...

- The brain has ~100 billion neurons $(10^{11})$, each about 30 µm in size
  - Neuron fan-in ~ $10^3$ to $10^4$ (logic gates: 2 to 4!)
  - Complex dynamics, including several time constants
  - Maintains a more complex internal state
  - Output is a time series of action potentials, or 'spikes': no information in amplitude!
- Massively parallel in nature
  - Typically 10<sup>15</sup> interconnections
  - Total computation rate of about 10<sup>16</sup> complex operations/sec (cf. 10 PFLOPS)
- Millisecond time frame of 'events'
- Low level function: 'reasonably well understood'.
- High level function.....???????







## Some other brains

- A fly (1 grain of sugar a day to feed it!): 250k neurons
- Honeybee (fantastic navigator!): 1 million neurons
- Rat (pretty smart animal): 55 million neurons
- But how do the following work:
  - The arithmetic
  - Fault tolerance
  - The parallelism (beats Moore's Law hands down)

This is the inspiration!

But we must find a simpler, scalable, low-power approach

#### Synapses and neurons: spike-timing dependent plasticity





If a spike at $t_1$ causes neuron N to fire ($t_2 - t_1$ small),

weight $W_1$ may be increased

and $W_2$ etc. decreased
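The rule on this slide can be sketched in a few lines of Python. The exponential window shape, the amplitudes `a_plus` and `a_minus`, and the time constant `tau_ms` are assumptions (a commonly used textbook form), not values taken from the circuits described in this talk.

```python
import math

def stdp_delta_w(t_pre, t_post, a_plus=0.05, a_minus=0.05, tau_ms=20.0):
    """Weight change for one pre/post spike pair (times in ms).

    Assumed exponential STDP window; all parameter values are illustrative.
    """
    dt = t_post - t_pre  # t2 - t1 in the slide's notation
    if dt > 0:
        # The input spike shortly preceded the neuron firing: increase the weight.
        return a_plus * math.exp(-dt / tau_ms)
    if dt < 0:
        # The input spike arrived after the neuron fired: decrease the weight.
        return -a_minus * math.exp(dt / tau_ms)
    return 0.0

# Synapse 1 spiked 2 ms before the neuron fired; synapse 2 spiked 5 ms after.
t_fire = 10.0
dw1 = stdp_delta_w(t_pre=8.0, t_post=t_fire)
dw2 = stdp_delta_w(t_pre=15.0, t_post=t_fire)
```

Here `dw1` is positive and `dw2` is negative, and the size of the change shrinks as the two spikes move further apart in time.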

#### **Motivation**

Create building blocks that can emulate biological functionality

Implement in mixed signal CMOS (cheap!)

Assess layout / scalability / systems functionality

Circuits that can learn!

- Plasticity / decision circuits (STDP) / FG weight storage
- Build large, useful electronic systems; learn more about 'brain computation' .....

#### **Circuit Challenges**

- Store and update weights
- Detect timing $(t_2 - t_1)$
- Axonal delay
- Low power operation
- Scale to VLSI
- Learn!





Dowrick et al., IEEE Trans. on Neural Networks and Learning Systems, 23(10), p. 1513 (2012)





#### Fan-in: theory

Consider transients of capacitive nodes



Dowrick et al., Neurocomputing, vol. 314, pp. 78-85, https://doi.org/10.1016/j.neucom.2018.06.065 (2018)

#### Fan-in

$$V_{PSPMAX} = m_p V_t \frac{C_{IN}(n)}{C_{PSP}} \ln \left[ 1 + \frac{I_{Op}}{m_p V_t C_{IN}(n)} \exp\left(\frac{V_{IN}'(0)}{m_p V_t}\right) \tau_R \right] - \frac{I_{M6}}{C_{PSP}} \tau_R$$



Conclusion: Fan-in intrinsic limit > 10<sup>5</sup> ! Practical limit is set by layout / interconnect
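As a rough aid to reading the $V_{PSPMAX}$ expression, the sketch below evaluates it numerically. The slide does not define the symbols or give component values, so everything numeric here is hypothetical: the defaults for `m_p` and `v_t`, the currents and capacitances, and in particular the model `c_in = n * c_syn` for how $C_{IN}(n)$ grows with fan-in $n$.

```python
import math

def v_psp_max(c_in, c_psp, i_op, i_m6, v_in0, tau_r, m_p=1.5, v_t=0.0259):
    """Evaluate the V_PSPMAX expression from the slide (SI units).

    m_p and v_t defaults are assumed values, not taken from the source.
    """
    a = m_p * v_t
    log_arg = 1.0 + (i_op / (a * c_in)) * math.exp(v_in0 / a) * tau_r
    return a * (c_in / c_psp) * math.log(log_arg) - (i_m6 / c_psp) * tau_r

# Hypothetical sweep: input capacitance assumed proportional to fan-in n.
c_syn = 10e-15  # assumed capacitance contributed per synapse (F)
sweep = {
    n: v_psp_max(c_in=n * c_syn, c_psp=100e-15, i_op=1e-12,
                 i_m6=1e-12, v_in0=0.3, tau_r=1e-6)
    for n in (1, 10, 100, 1000)
}
```

With real device parameters, a sweep of this kind is one way to see how the peak post-synaptic potential changes as more synapses load the input node.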

Dowrick et al., Neurocomputing, vol. 314, pp. 78-85, https://doi.org/10.1016/j.neucom.2018.06.065 (2018)

#### Compact decision circuits (STDP)

Weight increase (WI) circuit block, output buffers and SIFGNVM device



How it works: WI block operation for a pre-post spiking event

- When a presynaptic spike occurs ($V_{pre}$):
  - $V_1$ is pulled up to $3\,\mathrm{V} - V_{TMpre}(V_1)$ as $C_1$ charges via $M_{pre}$
  - $C_1$ slowly discharges via the sub-threshold $M_{leak}$
  - $V_{post}$ triggers the sample/hold at some time $t$ after $V_{pre}$
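A behavioural sketch of this sequence is given below. It is not a model of the actual circuit: it assumes $C_1$ charges instantly, that the leak is a constant current (so $V_1$ ramps down linearly), and that nothing is stored if the post-synaptic spike comes first. The threshold voltage, capacitance and leak current are hypothetical values.

```python
def wi_sample(t_post_ms, t_pre_ms=0.0, v_dd=3.0, v_tn=0.7,
              c1_pf=1.0, i_leak_pa=50.0):
    """Voltage held by the sample/hold when V_post fires (volts).

    Assumptions: instant charging to v_dd - v_tn, constant leak current,
    and no stored charge when the post spike precedes the pre spike.
    """
    dt_ms = t_post_ms - t_pre_ms
    if dt_ms < 0:
        return 0.0
    v_peak = v_dd - v_tn
    # dV/dt = I/C; pA/pF gives V/s, so divide by 1000 to get V/ms.
    slope_v_per_ms = (i_leak_pa / c1_pf) / 1000.0
    return max(0.0, v_peak - slope_v_per_ms * dt_ms)

# The sampled voltage falls as the pre-to-post interval grows.
samples = [wi_sample(t) for t in (0.0, 10.0, 20.0, 100.0)]
```

The sampled value is largest when the post-synaptic spike closely follows the presynaptic one, which is the timing information a weight-increase decision needs.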



#### Axonal delay



#### **Pulse burst creation**



#### Integrate axon delays (A) into paths



Dowrick et al., Neurocomputing, http://dx.doi.org/10.1016/j.neucom.2012.12.004 (2012)

#### Add feedback (M<sub>10</sub>): define pulse trains



#### Scaling

- Two solutions: sum voltages or sum currents





Scalability: easier to sum currents

Transmit voltage steps and re-create spikes for long interconnect

But added complexity!

### Scaling: circuit issues

Large synapse fan-out problem: non-uniform spike inputs due to parasitics; non-linearities occur in currents





Hope it all comes out in the wash! Nature is messy as well

#### Neurons with excitatory and inhibitory synapses

[Figure: neuron with excitatory synapses (inputs Vin\_m to Vin\_n, weights W\_m[5..1] to W\_n[5..1], currents Ipsc\_m to Ipsc\_n) and inhibitory synapses (inputs Vin\_p to Vin\_q, weights W\_p[5..1] to W\_q[5..1], currents Ipsc\_p to Ipsc\_q). Simulated traces against time (0 to 1.0 ms): excitatory spikes (V), inhibitory spikes (V), EPSC (µA), neuron state (µA), step output and spike output (V).]

Firing condition: $\sum_{m..n} Ipsc_{ex} - \sum_{p..q} Ipsc_{inh} - Ith \ge 0$
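The firing condition on this slide (summed excitatory currents, minus summed inhibitory currents, minus the threshold current Ith, must be non-negative) can be sketched as follows. The current values are hypothetical and only illustrate the comparison.

```python
def neuron_fires(i_psc_ex, i_psc_inh, i_th):
    """True when sum(excitatory) - sum(inhibitory) - threshold >= 0.

    All currents share one unit (uA here); values are illustrative.
    """
    return sum(i_psc_ex) - sum(i_psc_inh) - i_th >= 0.0

# Two excitatory inputs overcome one inhibitory input and the threshold.
fires_with_both = neuron_fires(i_psc_ex=[60.0, 50.0], i_psc_inh=[30.0], i_th=50.0)

# A single excitatory input does not.
fires_with_one = neuron_fires(i_psc_ex=[60.0], i_psc_inh=[30.0], i_th=50.0)
```

Summing currents at a single node makes this comparison cheap in hardware, which fits the earlier remark that currents are easier to sum than voltages.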

#### Programmable weights

- Analog weight
  - Good: Continuous weight value, compact analog storage circuit
  - Bad: inaccurate; requires a bias reference circuit and a complex control circuit for high resolution; also requires a high-voltage rail and an undocumented technology feature
- Digital weight
  - Good: accurate, mature digital memory technology, easy to program
  - Bad: discrete (quantised) weight values; requires more space
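The cost of a discrete weight can be illustrated with a short quantisation sketch. The 5-bit width is an assumption, read from the `W[5..1]` bus labels in the neuron figure; the normalised range `w_max` is hypothetical.

```python
def quantise_weight(w, n_bits=5, w_max=1.0):
    """Map an analogue weight in [0, w_max] to the nearest n-bit code.

    Returns (code, reconstructed_weight). n_bits=5 is an assumed width.
    """
    levels = (1 << n_bits) - 1          # 31 steps for 5 bits
    w_clipped = min(max(w, 0.0), w_max)  # keep the weight inside the range
    code = round(w_clipped / w_max * levels)
    return code, code * w_max / levels

code, w_q = quantise_weight(0.40)
error = abs(0.40 - w_q)  # never more than half a step, w_max / (2 * levels)
```

Each extra bit halves the worst-case error but adds storage and switching hardware per synapse, which is the accuracy versus space trade-off listed above.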

#### Programmable weight



### Embrace: an alternative approach

- Network-on-chip addresses the issues of scalability and connectivity between components.
- Low-area/power spiking neuron cells with associated training provide neural computing capability.



- 2-dimensional array of interconnected neural tiles + I/O blocks.
- Neural tiles connected in North, East, South and West.
- Tile can be programmed to realise neuron-level functions.
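The tile connectivity described above can be sketched as a plain 2-D grid. This is only an illustration of North/East/South/West links: the coordinate convention (row 0 at the north edge), the absence of wrap-around links and the function names are assumptions, and the I/O blocks are not modelled.

```python
def tile_neighbours(row, col, n_rows, n_cols):
    """Return {direction: (row, col)} for the tiles linked to (row, col)."""
    candidates = {
        "N": (row - 1, col),
        "E": (row, col + 1),
        "S": (row + 1, col),
        "W": (row, col - 1),
    }
    # Drop links that would fall outside the array (edge and corner tiles).
    return {
        d: (r, c)
        for d, (r, c) in candidates.items()
        if 0 <= r < n_rows and 0 <= c < n_cols
    }

def min_hops(src, dst):
    """Fewest tile-to-tile hops between two tiles using only N/E/S/W links."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

corner = tile_neighbours(0, 0, n_rows=4, n_cols=4)
centre = tile_neighbours(1, 1, n_rows=4, n_cols=4)
```

Because every tile has at most four links regardless of array size, the wiring per tile stays fixed as the array grows, at the cost of multi-hop paths between distant tiles.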

Harkin et al., Int. Jnl of Reconfigurable Computing, doi:10.1155/2009/908740 (2009)

Slide courtesy of Jim Harkin

### **Evaluation**

- Learning in software (calculate weight values)
  - Fit the experimental synapse results
- Solve benchmark problems
  - Wisconsin breast cancer (WBC) dataset
  - IRIS dataset
- Temporally encoded input values
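One simple way to encode input values temporally is latency coding, where each feature becomes a single spike whose timing carries the value. The slides do not say which encoding was used, so the linear mapping, the 10 ms window and the feature ranges below are assumptions for illustration only.

```python
def encode_latency(value, v_min, v_max, t_window_ms=10.0):
    """Map a feature value to a spike time: larger values spike earlier.

    Assumed linear latency code; not necessarily the scheme used in the talk.
    """
    if v_max <= v_min:
        raise ValueError("v_max must be greater than v_min")
    clipped = min(max(value, v_min), v_max)
    x = (clipped - v_min) / (v_max - v_min)  # normalise to [0, 1]
    return (1.0 - x) * t_window_ms

# A four-feature sample (IRIS-like) with illustrative per-feature ranges.
sample = [5.1, 3.5, 1.4, 0.2]
ranges = [(4.3, 7.9), (2.0, 4.4), (1.0, 6.9), (0.1, 2.5)]
spike_times_ms = [encode_latency(v, lo, hi) for v, (lo, hi) in zip(sample, ranges)]
```

Each input neuron then fires once per sample, and the network sees the sample as a pattern of spike times rather than amplitudes.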







Ghani et al., Neurocomputing, 83, pp. 188–197 (2012)

## Circuits fabricated in AMS 0.35 µm mixed-signal CMOS







Breslin et al, PLoS Computational Biology, doi.org/10.1371/journal.pcbi.1006151, May (2018)

#### **Endocannabinoid Mediated Self-Repair**



Wade, McDaid et al., Frontiers in Computational Neuroscience, vol. 6, Article 76 (2012)

#### Astrocytes mediate self-repair



- Astrocyte 'forces' synapses to 'work harder'
- Opens up the STDP window: restarts learning

Wade, McDaid et al., Frontiers in Computational Neuroscience, vol. 6, Article 76 (2012)

#### What we learnt..

- Can build compact analogue circuits that emulate aspects of biology with a degree of success (better than in software? – potentially much faster)
- Getting them to learn is another matter..
  - Need feedback
  - Weight update
  - Starts to get very complicated...
- A lot of redundancy once the circuit has 'learnt'
- Scaling soon results in a huge amount of interconnect

Need software/hardware combination – learning in software

#### Still some way to go before....



#### Thanks to

J Harkin *(jg.harkin@ulster.ac.uk)* 

T Dowrick (PhD) A Smith (PhD) S Chen (PhD) S Zhang (post-doc) A Ghani (post-doc)

**Funding:** EPSRC, NAP, EPSRC-DTA awards, Dorothy Hodgkin scholarship

https://www.riverpublishers.com/book\_details.php?book\_id=693