Optimizing Neural Network Structures For Embedded Systems

Patent No. US11636333 (titled "Optimizing Neural Network Structures For Embedded Systems") was filed by Tesla Inc on Jul 25, 2019.

’333 is related to the field of autonomous control systems, specifically the generation and deployment of machine-learned models for embedded systems within vehicles. These systems analyze the surrounding environment to guide vehicles safely, often using computer models trained on large datasets. However, training these models is computationally expensive and time-consuming, especially when exploring different model architectures for various target platforms.

The underlying idea behind ’333 is to create a model training pipeline that efficiently selects and trains machine learning models tailored to the specific capabilities of individual embedded systems. This is achieved by generating an intermediate representation of the model, evaluating its performance *before* full training, and iteratively refining the model architecture based on these pre-training performance estimates. This approach avoids wasting resources on training models that are unsuitable for the target platform.

The claims of ’333 focus on a method, system, and non-transitory computer readable media for generating a machine-learned model. The process involves generating an untrained model, creating an intermediate representation of it compatible with a virtual machine, and evaluating its performance in a target system. This evaluation includes determining latency, frequency of application, resource usage, and power consumption. The process iteratively generates and evaluates new untrained models, selects a subset based on performance, trains them, evaluates their accuracies, and selects a final model for deployment.
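The claimed sequence (generate untrained candidates, evaluate them before training, derive new candidates from earlier ones, train only a selected subset, then pick a model by accuracy) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `Candidate`, `estimate_latency_ms`, `mutate`, `search`, `train_and_score`, and `select_for_deployment` are hypothetical names, the cost model is a toy multiply-accumulate count, and the "training" step is a placeholder formula. Only latency is estimated here; frequency, resource usage, and power are left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical untrained model description: layer widths only."""
    widths: tuple


def estimate_latency_ms(c):
    """Toy pre-training estimate (assumption): 1000 multiply-accumulates per ms."""
    macs = sum(a * b for a, b in zip(c.widths, c.widths[1:]))
    return macs / 1000.0


def mutate(c, rng):
    """Derive a new untrained candidate by resizing one hidden layer."""
    widths = list(c.widths)
    i = rng.randrange(1, len(widths) - 1)  # hidden layers only
    widths[i] = max(4, widths[i] + rng.choice([-8, 8]))
    return Candidate(tuple(widths))


def search(seed, budget_ms, rounds, keep, rng):
    """Iteratively generate and evaluate untrained candidates, then return
    up to `keep` distinct candidates whose estimate fits the latency budget."""
    pool = [seed]
    for _ in range(rounds):
        # New candidates are generated from the best previous candidates.
        pool.sort(key=estimate_latency_ms)
        parents = pool[:keep]
        pool = parents + [mutate(p, rng) for p in parents for _ in range(2)]
    feasible = {c for c in pool if estimate_latency_ms(c) <= budget_ms}
    return sorted(feasible, key=estimate_latency_ms)[:keep]


def train_and_score(c):
    """Placeholder for real training plus accuracy evaluation (toy formula)."""
    capacity = sum(c.widths[1:-1])
    return capacity / (capacity + 64.0)


def select_for_deployment(subset):
    """Pick the model with the highest (placeholder) accuracy."""
    return max(subset, key=train_and_score)


rng = random.Random(0)
subset = search(Candidate((16, 64, 64, 4)), budget_ms=10.0, rounds=3, keep=3, rng=rng)
print(select_for_deployment(subset))
```

The point of the sketch is the ordering of the steps: the expensive `train_and_score` call runs only on the small subset that survives the cheap pre-training estimate.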

The invention works by first generating a model and compiling it into an intermediate representation that is independent of the target platform's specific hardware. This representation is then fed into a virtual machine, which translates it into machine code optimized for the target processor (CPU, GPU, or DSP). A performance evaluator estimates the model's performance based on both the intermediate representation and the generated machine code, considering factors like memory usage, computational complexity, and power consumption. This allows for rapid exploration of different model architectures and configurations without the overhead of full training.
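As a rough illustration of the platform-independent representation described above, the sketch below lowers a model description to a flat list of operations and lets a per-target "virtual machine" map those operations to target kernels and estimate latency. Everything here is assumed for illustration: the `Op` format, the `to_ir` lowering, the `KERNEL_COST_US` table and its numbers, and the `VirtualMachine` class are hypothetical, and the sketch maps operations to kernel names instead of emitting real machine code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One platform-independent instruction (hypothetical IR format)."""
    name: str      # "dense" or "relu"
    in_dim: int
    out_dim: int


def to_ir(widths):
    """Lower a list of layer widths to a flat, target-agnostic op list."""
    ir = []
    for a, b in zip(widths, widths[1:]):
        ir.append(Op("dense", a, b))
        ir.append(Op("relu", b, b))
    return ir


# Assumed per-target kernel costs in microseconds per unit of work.
KERNEL_COST_US = {
    "cpu": {"dense": 0.002, "relu": 0.001},
    "dsp": {"dense": 0.001, "relu": 0.001},
}


class VirtualMachine:
    """Binds the same IR to one target's kernel set."""

    def __init__(self, target):
        if target not in KERNEL_COST_US:
            raise ValueError(f"no kernel set for target {target!r}")
        self.target = target

    def lower(self, ir):
        """Map each IR op to the name of a target-specific kernel."""
        return [f"{self.target}.{op.name}" for op in ir]

    def estimate_latency_us(self, ir):
        """Sum per-kernel cost estimates without running or training the model."""
        costs = KERNEL_COST_US[self.target]
        total = 0.0
        for op in ir:
            work = op.in_dim * op.out_dim if op.name == "dense" else op.out_dim
            total += costs[op.name] * work
        return total


ir = to_ir([8, 16, 4])
for target in ("cpu", "dsp"):
    vm = VirtualMachine(target)
    print(target, vm.lower(ir), vm.estimate_latency_us(ir))
```

The same `ir` list is reused for both targets; only the kernel table changes, which mirrors the idea that a platform can run the model as long as a virtual machine and kernel set exist for it.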

A key differentiation from prior approaches is the emphasis on pre-training performance estimation. Instead of fully training every model candidate, the pipeline uses the intermediate representation and virtual machine to predict performance characteristics *before* training. This allows the system to quickly discard unsuitable models and focus resources on training only the most promising candidates. Furthermore, the use of an intermediate representation enables deployment across a variety of embedded platforms, as long as a suitable virtual machine and kernel set are available for that platform.

How does this patent fit in the bigger picture?

Technical landscape at the time

In the late 2010s, when ’333 was filed, machine learning models were increasingly being deployed on embedded systems whose hardware constraints made efficient model execution non-trivial. Model training often occurred on powerful servers, while inference was performed on resource-constrained devices. Translating complex models into a format suitable for embedded systems was a significant challenge.

Novelty and Inventive Step

The examiner allowed the application because the prior art did not teach or render obvious the combination required by the amended independent claims: generating an untrained model; generating an intermediate representation of that model that is compatible with a virtual machine; evaluating the untrained model's performance; iteratively generating and evaluating new untrained models, with individual new untrained models generated based on the performance of one or more previous untrained models; selecting a subset of models based on the performance of the generated models; training the selected subset; evaluating the respective accuracies of the subset; and selecting a particular model of the subset for deployment to the target system based on those accuracies. Under the claims, evaluating the performance of an untrained model includes at least one of determining a latency in applying the untrained model in a target system, determining a frequency at which the untrained model can be applied in the target system, determining an amount of resources used by the untrained model, and determining an amount of power consumed by the target system using the untrained model.

Claims

This patent contains 24 claims, of which claims 1, 9, and 17 are independent. The independent claims are directed to a method, a system, and computer readable media for generating a machine-learned model. The dependent claims generally elaborate on the specifics of the model generation, evaluation, and selection processes described in the independent claims.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term (Source)	Support in Specification	Interpretation
Intermediate representation (Claim 1, Claim 9, Claim 17)	“During the model generation stage, the pipeline translates the description of the model together with the model parameters into an intermediate representation in a language that is compatible with a virtual machine. The intermediate representation is agnostic or independent to the configuration of the target platform. That is, as long as a virtual machine is designed for a platform, the platform is able to apply the model by executing the intermediate representation of the model through the virtual machine.”	A representation of a model in a language that is compatible with a virtual machine, allowing the model to be executed on different target platforms.
Respective accuracies (Claim 1, Claim 9, Claim 17)	“In some embodiments, different performance parameters are tested after the models have been trained. For example, the trained models are evaluated based on their accuracy in addition to their performance characteristics. That is, if models for identifying road hazards is being tested, the accuracy of the models for detecting various road hazards in test images.”	The degree to which each model in the selected subset correctly performs its intended task, such as identifying road hazards.
Target system (Claim 1, Claim 9, Claim 17)	“A model training and implementation pipeline trains models for individual embedded systems by generating an intermediate representation of a model for interpretation on the embedded system. The pipeline iterates through multiple models and estimates the performance of the models to determine if the models are able to be applied by the target platform. Based on the analysis of the performance of the trained models, a single model is selected for deployment to the target platform.”	The specific embedded system or platform on which the machine-learned model will be deployed and used.
Untrained model (Claim 1, Claim 9, Claim 17)	“During the model performance estimation stage, the pipeline evaluates the performance of the models without training the models. For instance, the model is generated using default or randomized parameters. Based on the analysis of the performance of the untrained models, a subset of models that perform within the specified performance are selected.”	A model that has been generated but not yet trained with data to refine its parameters.
Virtual machine (Claim 1, Claim 9, Claim 17)	“The intermediate representation is agnostic or independent to the configuration of the target platform. That is, as long as a virtual machine is designed for a platform, the platform is able to apply the model by executing the intermediate representation of the model through the virtual machine. The virtual machine is a software module that enables a computer to run or execute programs that are written in the language of the intermediate representation.”	A software module that enables a computer to run programs written in the intermediate language by translating the intermediate representation into machine code.

Litigation Cases

Latest US litigation cases involving this patent.

Case Number	Filing Date	Title
2:25-cv-00742	Jul 23, 2025	Perceptive Automata LLC v. Tesla, Inc.

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history - including filings, correspondence, and decisions made by patent offices - and are crucial for understanding the patent's legal journey and any challenges it may have faced during examination.

US11636333

TESLA INC

Application Number: US16522411
Filing Date: Jul 25, 2019
Status: Granted
Expiry Date: Nov 26, 2041
