Highly Trainable Neural Network Configuration

Patent No. US10984320 (titled "Highly Trainable Neural Network Configuration") was filed by Tesseract Systems LLC on May 1, 2017.

What is this patent about?

The ’320 patent relates to the field of computer-based neural networks, specifically addressing the challenge of training very deep networks. Traditional deep neural networks, while theoretically powerful, suffer from optimization difficulties as the number of layers increases. A key culprit is the vanishing-gradient problem: the error signal diminishes as it propagates back through the network during training, which hinders learning, especially in the early layers.

The underlying idea behind ’320 is to introduce a gating mechanism within each neuron that controls the flow of information. Instead of always applying a non-linear transformation to the input signal, the neuron can selectively pass the original input through unmodified, pass a transformed version of it, or pass a mix of both. This is achieved using transform and carry gates, which determine the weighting between the transformed and non-transformed components.

The claims of ’320 focus on a computer-based method, a neural network architecture, and a computer-readable medium implementing this architecture. The core element is a neuron that receives an input signal, applies a first non-linear transform to produce a 'plain' signal, and then uses two additional non-linear transforms (gates) to produce a 'transform' signal and a 'carry' signal. The neuron then calculates a weighted sum of the original input signal and the 'plain' signal, where the weights are determined by the 'transform' and 'carry' signals.
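As an illustration (not the patent's own code), the claimed neuron computation can be sketched in NumPy. The function name, parameter names, and the choice of tanh and sigmoid for the three non-linear transforms are assumptions for this example; the patent's claims only require that each transform be non-linear:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_neuron(x, W_h, b_h, W_t, b_t, W_c, b_c):
    """Sketch of the claimed neuron: a weighted sum of the original input
    and the 'plain' signal, with weights given by two learned gates."""
    plain = np.tanh(W_h @ x + b_h)        # first non-linear transform -> 'plain' signal
    transform = sigmoid(W_t @ x + b_t)    # second non-linear transform (first gate) -> 'transform' signal
    carry = sigmoid(W_c @ x + b_c)        # third non-linear transform (second gate) -> 'carry' signal
    return plain * transform + x * carry  # weighted plain signal + weighted input signal component

# Toy usage with random parameters
rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal(d)
weights = [rng.standard_normal((d, d)) * 0.1 for _ in range(3)]
biases = [np.zeros(d) for _ in range(3)]
y = highway_neuron(x, weights[0], biases[0], weights[1], biases[1], weights[2], biases[2])
```

The last line mirrors the claim language directly: the 'transform' signal weights the 'plain' signal, and the 'carry' signal weights the non-transformed input component.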

In practice, this architecture allows for the creation of 'information highways' through the network. By initializing the network to favor the 'carry' behavior, the original input signal can propagate through many layers without significant attenuation. This makes it easier to train very deep networks because the error signal can propagate back more effectively, even through many layers. The transform gate learns to regulate information flow, allowing the network to dynamically choose between transforming the input or simply passing it through.
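A minimal sketch of such an initialization, under the same assumptions as above (sigmoid gates; all parameter names hypothetical): biasing the transform gate strongly negative and the carry gate strongly positive makes the neuron start out close to an identity mapping, so the input propagates with little attenuation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d = 4
rng = np.random.default_rng(1)
W_h = rng.standard_normal((d, d)) * 0.1
W_t = rng.standard_normal((d, d)) * 0.1
W_c = rng.standard_normal((d, d)) * 0.1
b_h = np.zeros(d)
b_t = np.full(d, -4.0)  # transform gate starts near 0 (sigmoid(-4) ~ 0.018)
b_c = np.full(d, 4.0)   # carry gate starts near 1 (sigmoid(4) ~ 0.982)

x = rng.standard_normal(d)
plain = np.tanh(W_h @ x + b_h)
y = plain * sigmoid(W_t @ x + b_t) + x * sigmoid(W_c @ x + b_c)

# With this initialization the neuron behaves close to the identity,
# so y stays near x; training can then shift the gate biases as needed.
print(np.max(np.abs(y - x)))
```

During training, the gate parameters are free to move away from this starting point wherever transforming the signal is more useful than carrying it.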

This approach differs from prior solutions that rely on careful initialization schemes or complex training techniques to overcome the optimization challenges of deep networks. By introducing the gating mechanism, ’320 provides a more robust and flexible way to train networks with virtually arbitrary depth, using standard Stochastic Gradient Descent (SGD) with momentum. This allows for the exploration of deeper architectures and potentially more complex problem-solving capabilities without the limitations imposed by traditional training difficulties.
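For reference, the SGD-with-momentum update rule mentioned above can be sketched on a toy one-dimensional objective; the function, learning rate, and momentum value here are illustrative choices, not taken from the patent:

```python
import numpy as np

def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update: the velocity accumulates a decaying
    sum of past gradients, and the parameter moves along the velocity."""
    velocity = momentum * velocity - lr * grad
    return param + velocity, velocity

# Toy example: minimize f(w) = w^2, whose gradient is 2w
w, v = np.array([5.0]), np.array([0.0])
for _ in range(200):
    w, v = sgd_momentum_step(w, 2 * w, v, lr=0.05, momentum=0.9)
```

The point of the patent's gating mechanism is that this plain optimizer suffices even for very deep stacks, without bespoke initialization schemes or multi-stage training.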

How does this patent fit in bigger picture?

Technical landscape at the time

In the mid-2010s, when ’320 was filed, deep architectures were the standard approach to neural networks, but training very deep networks remained non-trivial. Optimization was difficult, prompting research into careful initialization techniques and multi-stage training approaches.

Novelty and Inventive Step

The examiner allowed the claims because amendments made by the applicant overcame previous rejections. Specifically, the examiner was persuaded that the claims, as amended, contained additional elements (applying several non-linear transforms to signals) that were significantly more than the abstract ideas recited in the claims, thus rendering the claims statutory under 35 U.S.C. 101. The examiner also noted that the prior art did not teach the specific transforms and calculations in the context of the methods, neural network, and computer-readable medium as claimed.

Claims

This patent includes 28 claims, with independent claims 1, 15, 26, and 28. The independent claims generally focus on a computer-based method, a computer-based neural network, a computer-readable medium, and another method, all related to facilitating training in a computer-based neural network using non-linear transforms and weighted sums within neurons. The dependent claims generally elaborate on and refine the specifics of the independent claims, adding details and limitations to the method, network, and medium.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term: Carry signal (Claim 1, Claim 15, Claim 26, Claim 28)
Support in specification: “In yet another aspect, a training method is disclosed that applies to a computer-based neural network. The computer-based neural network includes a plurality of layers of neurons, where each layer of neurons comprises a plurality of neurons. One or more of the neurons includes a means for applying a first non-linear transform to an input signal to produce a plain signal, a first gate configured to apply a second non-linear transform to the input signal to produce a transform signal, and a second gate configured to apply a third non-linear transform to the input signal to produce a carry signal, and a means for calculating a weighted sum of a first component of the input signal and the plain signal.”
Interpretation: A signal produced by applying a third non-linear transform to the input signal at a second gate in the neuron.

Term: Plain signal (Claim 1, Claim 15, Claim 26, Claim 28)
Support in specification: “In one aspect, a computer-based method includes receiving an input signal at a neuron in a computer-based neural network that comprises a plurality of neuron layers, applying a first non-linear transform to the input signal at the neuron to produce a plain signal, and calculating a weighted sum of a first component of the input signal and the plain signal at the neuron.”
Interpretation: A signal produced by applying a first non-linear transform to the input signal at a neuron.

Term: Transform signal (Claim 1, Claim 15, Claim 26, Claim 28)
Support in specification: “In yet another aspect, a training method is disclosed that applies to a computer-based neural network. The computer-based neural network includes a plurality of layers of neurons, where each layer of neurons comprises a plurality of neurons. One or more of the neurons includes a means for applying a first non-linear transform to an input signal to produce a plain signal, a first gate configured to apply a second non-linear transform to the input signal to produce a transform signal, and a second gate configured to apply a third non-linear transform to the input signal to produce a carry signal, and a means for calculating a weighted sum of a first component of the input signal and the plain signal.”
Interpretation: A signal produced by applying a second non-linear transform to the input signal at a first gate in the neuron.

Term: Weighted input signal component (Claim 1, Claim 15, Claim 26, Claim 28)
Support in specification: “The training method includes initializing parameters associated with each of the first, second and/or third of the non-linear transforms for each of a plurality of the neurons in the computer-based neural network, analyzing an input by passing an associated input signal through the computer-based neural network in a forward direction to produce an output signal at an output of the computer-based neural network, propagating an error associated with the output signal back through the computer-based neural network, and determining whether to adjust any parameters (e.g., weights) associated with the computer-based neural network.”
Interpretation: A product of the non-transformed first component of the input signal and the carry signal.

Term: Weighted plain signal (Claim 1, Claim 15, Claim 26, Claim 28)
Support in specification: “The training method includes initializing parameters associated with each of the first, second and/or third of the non-linear transforms for each of a plurality of the neurons in the computer-based neural network, analyzing an input by passing an associated input signal through the computer-based neural network in a forward direction to produce an output signal at an output of the computer-based neural network, propagating an error associated with the output signal back through the computer-based neural network, and determining whether to adjust any parameters (e.g., weights) associated with the computer-based neural network.”
Interpretation: A product of the plain signal and the transform signal.

Patent Family

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history, including filings, correspondence, and decisions made by patent offices, and are crucial for understanding the patent's legal journey and any challenges it faced during examination.


US10984320
TESSERACT SYSTEMS LLC
Application Number: US15582831
Filing Date: May 1, 2017
Status: Granted
Expiry Date: Jan 20, 2040
External Links: Slate, USPTO, Google Patents