Systems And Methods For Hardware-Based Pooling

Patent No. US12307350 (titled "Systems And Methods For Hardware-Based Pooling") was filed by Tesla Inc on Jan 4, 2018.

’350 relates to convolutional neural networks (CNNs), specifically to improving the efficiency of pooling layers. CNNs are used for image classification and object recognition, employing multiple layers to extract features from input images. Pooling layers down-sample feature maps, which reduces the computational load on later layers. However, traditional pooling methods can themselves be computationally intensive and limit overall CNN efficiency.

The underlying idea behind ’350 is to implement a hardware-based pooling architecture that directly processes the output of a convolution engine. Instead of storing the convolution output in memory and then performing pooling, the invention reformats the data on the fly into a grid-like structure. Pooling operations such as max-pooling or average-pooling can then be applied without complex intermediate steps or extensive data storage.

The claims of ’350 focus on a pooling unit that reformats input data into multiple rows to create a pooling array. The input data is a linearized array representing an output channel from a convolutional layer. The rows are shifted relative to each other over several arithmetic cycles to align data for pooling. The pooling unit then applies pooling operations to these aligned rows to generate a pooling output, emitting a pooling value once every N arithmetic cycles, where N corresponds to the stride value.
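The timing described in the claims can be sketched in software. The Python below is a minimal illustration, not the patented circuit: the function name `pool_stream`, its parameters, and the assumption that one column of the pooling array becomes available per arithmetic cycle are all hypothetical choices made for this example.

```python
def pool_stream(linear, width, kernel, stride, op=max):
    """Pool a linearized feature map, emitting one value every `stride` cycles.

    Assumption (not taken from the patent text): one column of the pooling
    array becomes available per arithmetic cycle.
    """
    if width <= 0 or len(linear) % width != 0:
        raise ValueError("linear length must be a multiple of width")
    height = len(linear) // width
    outputs = []
    for top in range(0, height - kernel + 1, stride):
        # Reformat: `kernel` consecutive feature-map rows form the pooling
        # array, so vertical neighbours share the same column index.
        rows = [linear[(top + r) * width:(top + r + 1) * width]
                for r in range(kernel)]
        for cycle in range(width):
            filled = cycle - (kernel - 1)  # cycles since the first full window
            if filled >= 0 and filled % stride == 0:
                window = [rows[r][c]
                          for r in range(kernel)
                          for c in range(filled, cycle + 1)]
                outputs.append(op(window))
    return outputs


feature_map = list(range(16))  # a 4x4 output channel, linearized row by row
max_pooled = pool_stream(feature_map, width=4, kernel=2, stride=2)
avg_pooled = pool_stream(feature_map, width=4, kernel=2, stride=2,
                         op=lambda w: sum(w) / len(w))
print(max_pooled)  # [5, 7, 13, 15]
print(avg_pooled)  # [2.5, 4.5, 10.5, 12.5]
```

With a stride of 2, a value is appended on every second cycle once the first window is full, which mirrors the "one output per stride-many cycles" behavior in this simplified model.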

In practice, the pooling unit receives a linearized array from a matrix processor, which represents a feature map. The row aligner then arranges this data into rows, effectively creating a grid where neighboring values are aligned vertically and horizontally. This alignment allows the pooler to easily extract values for pooling calculations. The stride value determines how often pooling data is output, controlling the sliding window's movement across the feature map.
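The effect of the row alignment can be illustrated with a short Python sketch. It is only a software analogy under stated assumptions: `to_grid` and `neighborhood` are hypothetical helper names, and the feature-map width is assumed to be known in advance.

```python
def to_grid(linear, width):
    """Arrange a linearized feature map into rows of `width` values."""
    if width <= 0 or len(linear) % width != 0:
        raise ValueError("linear length must be a multiple of width")
    return [linear[i:i + width] for i in range(0, len(linear), width)]


def neighborhood(grid, top, left, kernel):
    """Return the kernel x kernel block whose top-left corner is (top, left)."""
    return [row[left:left + kernel] for row in grid[top:top + kernel]]


linear = list(range(16))        # values 0..15 of a 4x4 feature map
grid = to_grid(linear, width=4)
block = neighborhood(grid, top=0, left=0, kernel=2)
print(block)                                  # [[0, 1], [4, 5]]
print(max(v for row in block for v in row))   # 5
```

In the linearized array the four values of this neighborhood sit at indices 0, 1, 4, and 5. After reformatting they are adjacent horizontally and vertically, so a pooling operation can read them as one small block.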

This approach differs from prior solutions by avoiding intermediate memory storage of the convolution result. By directly processing the output channel from the matrix processor, the pooling unit reduces computation time and improves overall efficiency. The on-the-fly reformatting and pooling operations enable faster processing of CNN layers, leading to improved performance in image classification and object recognition tasks. The hardware-based implementation allows for parallel processing and optimized data flow, further enhancing efficiency compared to software-based pooling methods.
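To make the storage argument concrete, the sketch below pools a stream while holding only a bounded window of recent values instead of the full feature map. It uses a generic line-buffer technique chosen for illustration, not the row aligner described in the patent; the name `streaming_max_pool` and the buffer sizing are assumptions of this example.

```python
from collections import deque


def streaming_max_pool(stream, width, kernel, stride):
    """Max-pool values as they arrive, keeping only a bounded buffer.

    Returns the pooled values and the buffer capacity that was needed.
    """
    # Enough history to reach back (kernel - 1) rows plus `kernel` columns.
    capacity = (kernel - 1) * width + kernel
    buf = deque(maxlen=capacity)
    outputs = []
    for i, value in enumerate(stream):
        buf.append(value)
        row, col = divmod(i, width)
        # The newest value is the bottom-right corner of a candidate window.
        top, left = row - kernel + 1, col - kernel + 1
        if top >= 0 and left >= 0 and top % stride == 0 and left % stride == 0:
            window = [buf[-1 - ((kernel - 1 - r) * width + (kernel - 1 - c))]
                      for r in range(kernel) for c in range(kernel)]
            outputs.append(max(window))
    return outputs, capacity


pooled, capacity = streaming_max_pool(iter(range(16)), width=4, kernel=2, stride=2)
print(pooled)    # [5, 7, 13, 15]
print(capacity)  # 6
```

For this 4x4 example the buffer holds at most 6 values, compared with 16 if the whole convolution result were written to memory before pooling.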

How does this patent fit in the bigger picture?

Technical landscape at the time

When ’350 was filed in the late 2010s, convolutional neural networks were increasingly used for image classification and object recognition. Systems commonly relied on techniques like weight sharing to improve the performance of convolutional layers, but pooling layers were often neglected due to architectural constraints. Hardware or software constraints made efficient hardware implementations of pooling operations non-trivial.

Novelty and Inventive Step

The examiner allowed the application because the prior art references, taken individually or together, did not disclose the specific steps for hardware-based pooling. These steps include reformatting data into rows and shifting those rows based on a stride defined by a number of arithmetic cycles, which dictates how often the pooling data is output. These features, combined with the other elements of the claimed invention, distinguish it from the prior art.

Claims

This patent contains 18 claims, of which claims 1, 11, and 16 are independent. The independent claims are directed to a pooling unit and to methods of using a pooling unit and a hardware-based pooling system, focusing on reformatting input data and applying pooling operations. The dependent claims elaborate on specific features of the pooling unit and methods described in the independent claims.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term (Source)	Support in Specification	Interpretation
Arithmetic cycles (Claim 1, Claim 11, Claim 16)	“While great progress has been achieved in improving the performance of convolutional layers by sharing of weights and improving arithmetic logic unit utilization, pooling layers, which are similarly computationally intensive, have been neglected mainly due to constraints inherent to existing neural network architectures.”	Cycles of an arithmetic unit, used to shift rows relative to each other and to define the stride, dictating how often pooling values are output.
Grid format (Claim 1, Claim 11, Claim 16)	“Pooling generally involves sliding a pooling window, e.g., a two-dimensional square of multiple pixels in width and multiple pixels in height, stepwise across small, non-overlapping areas (i.e., receptive field) of the output of a preceding convolution layer. Aggregating the values of the group of neurons in that area provides single output values (e.g., integers) for each group in a local neighborhood.”	A format in which the pooling array or set of data arrays are arranged such that elements of an individual group of neighborhood values are aligned in a horizontal and vertical direction.
Pooling array (Claim 1)	“As applied to images, a pooling layer allows the network to determine a feature map and learn a set of features for the image. Pooling is viewed as a form of nonlinear sub-sampling or down-sampling that uses a nonlinear function, such as max-pooling or average-pooling, to reduce the number of neurons when progressing from layer to layer through the network; thereby, reducing the amount of computation and further increasing computational performance.”	A reformatted version of the input data, arranged into rows, where the number of rows corresponds to the size of a pooling kernel. The array is in grid format such that elements of a group of neighborhood values are aligned horizontally and vertically.
Pooling kernel (Claim 1, Claim 11)	“A convolution layer may use several filters known as kernels or activation functions that apply to the pixels of a convolution window of an image a set of weights. The weights have been learned by the CNN during a training phase to generate an activation value associated with that window. For each filter, the convolution layer may have, for each pixel, one node, i.e., neuron, that outputs an activation value that is calculated based on the set of weights.”	A kernel defining the size of the pooling window, where the number of rows in the pooling array corresponds to the size of the pooling kernel.
Pooling window (Claim 1, Claim 11, Claim 16)	“Pooling generally involves sliding a pooling window, e.g., a two-dimensional square of multiple pixels in width and multiple pixels in height, stepwise across small, non-overlapping areas (i.e., receptive field) of the output of a preceding convolution layer. Aggregating the values of the group of neurons in that area provides single output values (e.g., integers) for each group in a local neighborhood.”	A window that shifts according to a stride, where the stride value is defined by the number of arithmetic cycles and dictates how often pooling values are output. The number of rows corresponds to the size of the pooling window.

Litigation Cases

Latest US litigation cases involving this patent.

Case Number	Filing Date	Title
2:25-cv-00742	Jul 23, 2025	Perceptive Automata LLC v. Tesla, Inc.

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history - including filings, correspondence, and decisions made by patent offices - and are crucial for understanding the patent's legal journey and any challenges it may have faced during examination.

US12307350

TESLA INC

Application Number: US15862369
Filing Date: Jan 4, 2018
Status: Granted
Expiry Date: May 15, 2041