Neural Networks For Embedded Devices

Patent No. US11562231 (titled "Neural Networks For Embedded Devices") was filed by Deepscale Inc on Sep 3, 2019.

What is this patent about?

The ’231 patent relates to the field of deep neural networks (DNNs), specifically their deployment on resource-constrained devices such as those found in Internet-of-Things (IoT) applications. The background involves the challenge of implementing complex neural networks, which typically require high-precision floating-point arithmetic, on devices with limited processing power and memory that are often restricted to lower-bit integer arithmetic (e.g., 8-bit). This necessitates architectural innovations to reduce computational load and prevent data overflow.
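As a toy illustration of the overflow problem (simple arithmetic, not taken from the patent): the product of two values that each fit in 8 bits can need up to 16 bits, so even a single multiply can exceed an 8-bit register.

```python
# Toy illustration: two values that each fit in an unsigned 8-bit register (0..255).
activation, weight = 200, 150
product = activation * weight
print(product)                   # 30000, which needs 15 bits to represent
print(product > (1 << 8) - 1)    # True: the product overflows an 8-bit register
```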

The underlying idea behind ’231 is to design a neural network architecture, named StarNet, that can operate efficiently on reduced-bit processors by carefully managing the size and quantization of filter weights and input activations. The key inventive insight is to constrain the dimensionality of filters and the bit-widths of activations and weights to prevent arithmetic overflow during calculations, while also incorporating a novel "star-conv" filter structure to reduce the number of elements per filter.
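To make the overflow constraint concrete, here is a back-of-the-envelope feasibility check, assuming unsigned quantization (an illustrative assumption; the patent's actual bin selection may use signed ranges). An n-term dot product of bw-bit weights and ba-bit activations is bounded by n * (2^bw - 1) * (2^ba - 1), which must stay within the register:

```python
def max_filter_elements(weight_bits: int, act_bits: int, register_bits: int = 8) -> int:
    """Largest filter size whose worst-case dot product still fits the register.

    Assumes unsigned quantization: every weight is at most 2**weight_bits - 1
    and every activation is at most 2**act_bits - 1, so an n-term dot product
    is bounded by n * max_weight * max_activation.
    """
    max_weight = (1 << weight_bits) - 1
    max_activation = (1 << act_bits) - 1
    max_register = (1 << register_bits) - 1
    return max_register // (max_weight * max_activation)

# Under this model, 1-bit weights and 3-bit activations admit up to 36 elements
# per filter on an 8-bit register, in the same ballpark as the 32-element
# filter limit discussed below.
print(max_filter_elements(weight_bits=1, act_bits=3))  # 36
```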

The claims of ’231 cover a method, a non-transitory computer-readable medium, and a system for generating a neural network structure tailored to devices with limited bit-length registers. The core of the claims involves determining appropriate integer representations for input layers and filters, and then generating dimensionalities for these layers and filters such that the output of their combination does not exceed the register's bit length. A key feature is the use of star-shaped filters, which use only the non-diagonal elements of a 3x3 grid, reducing the number of computations.

In practice, the StarNet architecture achieves efficient computation by employing a combination of techniques. First, the number of elements in each filter is limited (e.g., to 32 elements). Second, linear quantization is applied to both filter weights and input activations, mapping floating-point values to lower-bit integer representations. The bit-widths are chosen such that the maximum possible output value of a convolution operation remains within the representable range of the 8-bit registers, preventing overflow. A "star-shuffle block," consisting of 1x1 convolutions, ReLU activations, star convolutions, and shuffle layers, is used as a recurring module.
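A minimal sketch of such a linear quantization step, assuming numpy and the min/max binning the specification describes (the function names are illustrative, not the patent's):

```python
import numpy as np

def quantize_linear(x: np.ndarray, bits: int):
    """Bin floating-point values into a low-bit-width integer representation.

    The bin range is taken from the observed min/max of the tensor, so the
    integer codes span [0, 2**bits - 1].
    """
    lo, hi = float(x.min()), float(x.max())
    levels = (1 << bits) - 1                      # e.g. 15 for 4-bit codes
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((x - lo) / scale).astype(np.int32)
    return codes, scale, lo

def dequantize_linear(codes: np.ndarray, scale: float, lo: float) -> np.ndarray:
    """Recover approximate floating-point values from integer codes."""
    return codes.astype(np.float32) * scale + lo
```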

The "star-conv" filter is a key differentiator from traditional convolutional layers. By using only the immediate top, bottom, left, and right neighbors of a pixel, instead of all nine pixels in a 3x3 grid, the number of elements per filter is reduced. This allows for more bits to be used to represent the weights and activations, improving accuracy. Furthermore, the "shuffle" layer interleaves the ordering of channels to enable communication across channels, addressing the reduction in representational power that can occur when using group convolutions with a group-length greater than 1.
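A hypothetical numpy sketch of both pieces. The cross-shaped mask assumes the center element is kept along with the four non-diagonal neighbors, and the shuffle follows the usual group-interleaving pattern; none of the names come from the patent:

```python
import numpy as np

# Star-shaped 3x3 mask: zero weights in the corners, non-zero elsewhere.
STAR_MASK = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]])

def star_conv2d(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Single-channel 3x3 convolution with the corner weights forced to zero."""
    w = w * STAR_MASK                        # only 5 of the 9 elements survive
    h, wd = x.shape
    out = np.zeros((h - 2, wd - 2), dtype=np.result_type(x, w))
    for i in range(h - 2):
        for j in range(wd - 2):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * w)
    return out

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave the channel ordering so later group convolutions mix groups."""
    c, h, w = x.shape
    return (x.reshape(groups, c // groups, h, w)
             .transpose(1, 0, 2, 3)
             .reshape(c, h, w))
```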

How does this patent fit into the bigger picture?

Technical landscape at the time

In the late 2010s, when ’231 was filed, deep neural networks were typically implemented using floating-point arithmetic on powerful processors. Because implementations commonly relied on 32-bit operations, running neural networks on low-cost, low-power devices restricted to reduced-bit arithmetic and storage was non-trivial.

Novelty and Inventive Step

The examiner allowed the application primarily because the claims recite specific details of a neural network structure: determining a bit length of the registers used for arithmetic operations, determining ranges of integer values, generating dimensionalities of the input layers and filters such that combining elements of an input layer with elements of a corresponding filter does not cause overflow, and using star-shaped filters with zero weight values in the corners of a 3x3 rectangle. These features, in combination with the other recited elements, were not found in the prior art.

Claims

This patent contains 17 claims, with independent claims 1, 7, and 13. The independent claims are directed to a method, a computer-readable medium, and a system for generating a neural network structure by determining dimensionalities of input layers and filters based on register bit length to avoid overflow. The dependent claims generally elaborate on the quantization process, the inclusion of a shuffle layer, and the bit length of the registers.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term: Bit length (Claim 1, Claim 7, Claim 13)
Support in Specification: “Further, while computer implementations of neural networks commonly use 32-bit arithmetic, low-power computer processors often run most efficiently (i.e. lowest power usage and/or highest throughput) using 8-bit arithmetic.”
Interpretation: The number of bits of the registers used to perform arithmetic operations.

Term: Dimensionalities (Claim 1, Claim 7, Claim 13)
Support in Specification: “To effectively implement a neural network in reduced-bit architecture, the DNN is structured to have fewer elements per filter, such as 32 elements per filter. In one embodiment, the StarNet DNN architecture for 8-bit arithmetic and 8-bit storage has a maximum of 32 elements per filter.”
Interpretation: The size and shape of the input layers and filters, determined to prevent overflow.

Term: Integer representation (Claim 1, Claim 7, Claim 13)
Support in Specification: “In one implementation, linear quantization is used to bin floating-point values of filters and activations into a low-bit width integer representation. In one linear quantization scheme, the range of values of the linear bins is determined by analyzing the maximum and minimum numerical values that are observed in tensors of the neural network, looking at the dimensions of the filters, and then selecting maximum and minimum values for the bins such that the output cannot overflow.”
Interpretation: A representation of the input layers and filters associated with a range of integer values.

Term: Neural network structure (Claim 1, Claim 7, Claim 13)
Support in Specification: “As used herein, a deep neural network is a computer model that generates a set of outputs based on a set of inputs using a plurality of processing layers between the inputs and outputs. These processing layers may be “hidden” in the sense that the processing layers are not directly exposed during use, and represent arithmetic processes that together generate the set of outputs from the set of inputs. As an example, to process an image, the pixels of an image may be represented as an input layer.”
Interpretation: A network configured to ingest one or more images.

Term: Star-shaped filters (Claim 1, Claim 7, Claim 13)
Support in Specification: “Rather than a 3×3 filter, the idea is to use a “star-shaped” filter. (See summary of the StarNet-A DNN architecture that is described in the following.) With the exception of the first convolution layer in StarNet-A, all layers of StarNet-A can be implemented using only 8-bit arithmetic and 8-bit storage.”
Interpretation: Filters comprising non-zero weight values only for non-diagonal elements of a 3x3 rectangle.

Litigation Cases

Latest US litigation cases involving this patent.

Case Number: 2:25-cv-00742
Filing Date: Jul 23, 2025
Title: Perceptive Automata LLC v. Tesla, Inc.

Patent Family

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history, including filings, correspondence, and decisions made by the patent office, and are crucial for understanding the patent's legal journey and any challenges it faced during examination.

US11562231

Applicant: DEEPSCALE INC
Application Number: US16559483
Filing Date: Sep 3, 2019
Status: Granted
Expiry Date: May 19, 2041
External Links: Slate, USPTO, Google Patents