Real-Time Accent Conversion Model

Patent No. US11948550 (titled "Real-Time Accent Conversion Model") was filed by Canadian Imperial Bank Of Commerce on Aug 27, 2021.

The ’550 patent relates to speech processing, specifically accent conversion. Its background is the difficulty of communicating across different accents, even among speakers of the same language. Existing solutions, such as voice conversion methods that adjust audio characteristics or speech-to-text-to-speech (STT-TTS) approaches, are limited in how well they capture pronunciation nuances and sustain real-time performance.

The underlying idea behind ’550 is to use a machine-learning pipeline to convert speech from one accent to another in real time. An automatic speech recognition (ASR) engine first derives a linguistic representation of the input speech, and a voice conversion (VC) engine then synthesizes audio data with the target accent. The key is to operate on a non-text linguistic representation, which preserves nuances and minimizes latency.

The claims of ’550 focus on a system, a non-transitory computer-readable medium, and a method for real-time accent conversion. The core process involves training a first machine-learning algorithm with audio data from multiple speakers of a first accent, applying this algorithm to received speech to derive a non-text linguistic representation, synthesizing audio data with a second accent using a second machine-learning algorithm, and converting the synthesized audio data into a synthesized version of the received speech with the second accent. A key aspect is mapping phonemes from one accent to another at the linguistic representation level.

In practice, the system uses an ASR engine trained on multiple speakers of the input accent to create a generalized linguistic representation. This representation is then fed into a VC engine, which has been trained to map the input accent's linguistic features to those of the target accent. The VC engine synthesizes audio data, which is then converted into speech by a vocoder. This approach allows for real-time conversion because it avoids the latency associated with converting speech to text and back.
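The three stages described above can be sketched as a simple chain of functions. This is only an illustrative toy, not the patented implementation: the class name `AccentConversionPipeline`, the type aliases, and the three `toy_*` stand-ins are hypothetical, and the trained models are replaced by trivial arithmetic so the data flow is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = List[float]             # a short window of audio samples
LinguisticRep = List[float]     # non-text linguistic features for one frame
AcousticFeatures = List[float]  # synthesized audio data, before the vocoder


@dataclass
class AccentConversionPipeline:
    """Chains the stages named in the summary: ASR engine, VC engine, vocoder."""
    asr_engine: Callable[[Frame], LinguisticRep]
    vc_engine: Callable[[LinguisticRep], AcousticFeatures]
    vocoder: Callable[[AcousticFeatures], Frame]

    def convert(self, frames: Sequence[Frame]) -> List[Frame]:
        # No text is produced at any point; each frame flows through all stages.
        return [self.vocoder(self.vc_engine(self.asr_engine(frame))) for frame in frames]


# Hypothetical stand-ins for the trained models (assumptions for illustration only).
def toy_asr(frame: Frame) -> LinguisticRep:
    # Summarize a frame as (mean, range) in place of learned linguistic features.
    return [sum(frame) / len(frame), max(frame) - min(frame)]


def toy_vc(rep: LinguisticRep) -> AcousticFeatures:
    # A fixed shift stands in for the learned source-to-target accent mapping.
    return [rep[0] + 0.5, rep[1]]


def toy_vocoder(features: AcousticFeatures) -> Frame:
    # Emit two samples per frame in place of real waveform synthesis.
    return [features[0], features[0]]


pipeline = AccentConversionPipeline(toy_asr, toy_vc, toy_vocoder)
converted = pipeline.convert([[0.0, 1.0], [2.0, 4.0]])
print(converted)  # [[1.0, 1.0], [3.5, 3.5]]
```

Keeping each stage behind a plain callable makes it clear that the only thing passed between the ASR and VC stages is a feature vector, never a text transcript.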

The differentiation from prior approaches lies in the use of a non-text linguistic representation as an intermediate step. Traditional methods either adjust audio characteristics directly or rely on STT-TTS, both of which have drawbacks. By operating on a linguistic level, the invention can capture subtle pronunciation differences and maintain low latency, making it suitable for real-time communication applications. The system also trains the ASR engine with multiple speakers to create a more robust and generalized representation of the input accent.
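To make the low-latency point concrete, the sketch below shows chunked streaming and a simple latency check. The 300 ms budget is the figure quoted from the specification under Key Claim Terms; everything else (the 16 kHz sample rate, the 100 ms chunk, the 60 ms compute time, and the function names) is an assumption chosen for illustration, not a value taken from the patent.

```python
from typing import Iterator, List, Sequence

LATENCY_BUDGET_MS = 300.0  # "300 milliseconds or less", per the quoted specification text

# Assumed values for illustration only.
SAMPLE_RATE_HZ = 16_000
CHUNK_MS = 100.0


def stream_chunks(samples: Sequence[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield fixed-size chunks so conversion can start before the utterance ends."""
    for start in range(0, len(samples), chunk_size):
        yield list(samples[start:start + chunk_size])


def end_to_end_latency_ms(chunk_ms: float, compute_ms: float) -> float:
    """Worst-case delay for a sample: wait for its chunk to fill, then process it."""
    return chunk_ms + compute_ms


chunk_size = int(SAMPLE_RATE_HZ * CHUNK_MS / 1000)  # samples per chunk
audio = [0.0] * SAMPLE_RATE_HZ                      # one second of silence as dummy input
chunks = list(stream_chunks(audio, chunk_size))

latency = end_to_end_latency_ms(CHUNK_MS, compute_ms=60.0)  # 60 ms is a made-up compute time
within_budget = latency <= LATENCY_BUDGET_MS
print(len(chunks), latency, within_budget)  # 10 160.0 True
```

Under these assumed numbers the chunk duration dominates the delay, which is why a pipeline that needs a full sentence of text before it can synthesize anything struggles to meet a budget of this size.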

How does this patent fit into the bigger picture?

Technical landscape at the time

In the early 2020s, when ’550 was filed, machine-learning models were increasingly used for speech processing tasks, and speech-to-text and text-to-speech systems commonly relied on large datasets and significant computational resources. Accent conversion, while recognized as a desirable feature, was typically implemented with simpler voice conversion techniques that adjusted audio characteristics, because hardware and software constraints made real-time, nuanced accent transformation non-trivial.

Novelty and Inventive Step

The examiner allowed the application because the cited prior art (Dirac, Hwang, and Peng) did not teach the limitations of the claims. Specifically, the examiner found that the prior art disclosed neither the training and application of the machine-learning model nor the synthesizing of the fourth audio data (the audio representing the received speech content in the second accent) in the manner recited in the claims. The examiner therefore concluded that the prior art, alone or in combination, did not teach the combination of limitations found in the claims.

Claims

This patent contains 22 claims, with independent claims 1, 11, and 19. The independent claims are directed to a system, a computer-readable medium, and a method, respectively, all generally focused on converting speech content from a first accent to a second accent using machine learning. The dependent claims generally elaborate on and refine the features and functionalities described in the independent claims, such as specific mappings, data characteristics, user inputs, and real-time processing aspects.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term (Source)	Support for Specification	Interpretation
First machine-learning algorithm (Claim 1, Claim 11, Claim 19)	“To address these and other problems with existing solutions for performing accent conversion, disclosed herein is new software technology that utilizes machine-learning models to receive input speech in a first accent and then output a synthesized version of the input speech in a second accent, all with very low latency (e.g., 300 milliseconds or less).”	A machine-learning algorithm that is trained with audio data comprising speech content captured from a plurality of different speakers having a first accent. It is used to derive a non-text linguistic representation of the received speech content.
Non-text linguistic representation (Claim 1, Claim 11, Claim 19)	“Some other approaches have involved a speech-to-text (STT) conversion of input speech as a midpoint, followed by a text-to-speech (TTS) conversion to generate the output audio content. However, this type of STT-TTS approach cannot capture many of the nuances of input speech that can provide information beyond the meaning of the words themselves, such as the prosody or emotion of the speaker.”	A representation of speech content derived by the first machine-learning algorithm, which is not in text form. It is based on the set of phonemes associated with a first pronunciation of the received speech content.
Second machine-learning algorithm (Claim 1, Claim 11, Claim 19)	“To address these and other problems with existing solutions for performing accent conversion, disclosed herein is new software technology that utilizes machine-learning models to receive input speech in a first accent and then output a synthesized version of the input speech in a second accent, all with very low latency (e.g., 300 milliseconds or less).”	A machine-learning algorithm trained with (i) second audio data comprising the first accent and (ii) third audio data comprising a second accent. It is used to synthesize fourth audio data representative of the received speech content having the second accent.
Synthesized version (Claim 1, Claim 11, Claim 19)	“To address these and other problems with existing solutions for performing accent conversion, disclosed herein is new software technology that utilizes machine-learning models to receive input speech in a first accent and then output a synthesized version of the input speech in a second accent, all with very low latency (e.g., 300 milliseconds or less).”	The final output, which is the received speech content converted to the second accent. It comprises the updated set of phonemes associated with the second pronunciation of the received speech content.
Updated set of phonemes (Claim 1, Claim 11, Claim 19)	“For example, many accents of the English language, such as Indian English and Irish English do not pronounce the phoneme for the digraph “th” found in Standard American English (SAE), instead replacing it with a “d” or “t” sound (sometimes referred to as th-stopping).”	A set of phonemes associated with a second pronunciation of the received speech content that is different from the first pronunciation of the received speech content. It is the result of mapping a first non-text linguistic representation of a first phoneme to a second non-text linguistic representation of a second phoneme.
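The "updated set of phonemes" entry can be illustrated with the th-stopping example from the quoted specification text. The sketch below is a toy under loud assumptions: the 2-D vectors are made-up stand-ins for non-text linguistic representations, the phoneme labels use ARPAbet symbols, and the single hand-written mapping entry replaces what would in practice be a learned, context-dependent model (not every "t" sound should become "th").

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, float]

# Hypothetical 2-D stand-ins for non-text linguistic representations of phonemes.
SOURCE_REPS: Dict[str, Vector] = {
    "T": (0.9, 0.1), "IH": (0.2, 0.7), "NG": (0.4, 0.4), "K": (0.8, 0.3),
}
TARGET_REPS: Dict[str, Vector] = {
    "TH": (0.6, 0.1), "IH": (0.2, 0.7), "NG": (0.4, 0.4), "K": (0.8, 0.3),
}


def map_representation(seq: List[Vector], mapping: Dict[Vector, Vector]) -> List[Vector]:
    """Replace each source representation that has a mapped counterpart; keep the rest."""
    return [mapping.get(vec, vec) for vec in seq]


def nearest_label(vec: Vector, table: Dict[str, Vector]) -> str:
    """Label a vector with the closest phoneme in the table (squared Euclidean distance)."""
    return min(table, key=lambda name: sum((a - b) ** 2 for a, b in zip(vec, table[name])))


# "think" pronounced with th-stopping: the first phoneme is T rather than TH.
first_pronunciation = ["T", "IH", "NG", "K"]
source_seq = [SOURCE_REPS[p] for p in first_pronunciation]

# Toy mapping from the first phoneme's representation to the second phoneme's representation.
mapping = {SOURCE_REPS["T"]: TARGET_REPS["TH"]}

updated_seq = map_representation(source_seq, mapping)
updated_phonemes = [nearest_label(vec, TARGET_REPS) for vec in updated_seq]
print(updated_phonemes)  # ['TH', 'IH', 'NG', 'K']
```

The phoneme labels appear only to make the result readable; the mapping itself operates on the vectors, which mirrors the claim language about mapping one non-text representation to another.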

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history - including filings, correspondence, and decisions made by patent offices - and are crucial for understanding the patent's legal journey and any challenges it may have faced during examination.

Date	Description
Apr 2, 2024	Digitally signed official patent eGrant document
Apr 2, 2024	eGrant day-of Notification
Mar 13, 2024	Issue Notification
Feb 26, 2024	Response to Amendment under Rule 312
Feb 26, 2024	Amendment After Final or under 37CFR 1.312, initialed by the examiner.
Feb 14, 2024	Amendment after Notice of Allowance (Rule 312)
Feb 14, 2024	Amendment after Notice of Allowance (Rule 312)
Feb 14, 2024	Claims
Feb 14, 2024	Applicant Arguments/Remarks Made in an Amendment
Feb 14, 2024	Issue Fee Payment (PTO-85B)
Feb 14, 2024	Electronic Fee Payment
Feb 14, 2024	Electronic Filing System Acknowledgment Receipt
Feb 9, 2024	Notice of Allowance and Fees Due (PTOL-85)
Feb 9, 2024	List of references cited by examiner
Feb 9, 2024	Issue Information including classification, examiner, name, claim, renumbering, etc.
Feb 9, 2024	Search information including classification, databases and other search related notes
Feb 9, 2024	Index of Claims
Feb 9, 2024	Examiner's search strategy and results
Feb 9, 2024	Examiner's search strategy and results
Feb 9, 2024	Bibliographic Data Sheet
Nov 22, 2023	Appeal Brief Filed
Nov 22, 2023	Electronic Filing System Acknowledgment Receipt
Nov 22, 2023	Appeal Brief Filed
Nov 22, 2023	Claims
Sep 28, 2023	Notice of Appeal Filed
Sep 28, 2023	Fee Worksheet (SB06)
Sep 28, 2023	Electronic Filing System Acknowledgment Receipt
Sep 18, 2023	Final Rejection
Sep 18, 2023	List of references cited by examiner
Sep 18, 2023	Index of Claims
Sep 18, 2023	Search information including classification, databases and other search related notes
Sep 18, 2023	Examiner's search strategy and results
Sep 18, 2023	Examiner's search strategy and results
Sep 11, 2023	Fee Worksheet (SB06)
Sep 6, 2023	Amendment/Request for Reconsideration-After Non-Final Rejection
Sep 6, 2023	Claims
Sep 6, 2023	Applicant Arguments/Remarks Made in an Amendment
Sep 6, 2023	Electronic Filing System Acknowledgment Receipt
Sep 6, 2023	Fee Worksheet (SB06)
Sep 1, 2023	Examiner Interview Summary Record (PTOL - 413)
Sep 1, 2023	Office Action Appendix
Aug 17, 2023	Electronic Request for Interview with Examiner
Jul 14, 2023	Non-Final Rejection
Jul 14, 2023	List of references cited by examiner
Jul 14, 2023	Index of Claims
Jul 14, 2023	Search information including classification, databases and other search related notes
Jul 14, 2023	Examiner's search strategy and results
Jul 14, 2023	Examiner's search strategy and results
Apr 17, 2023	Request for Continued Examination (RCE)
Apr 17, 2023	Fee Worksheet (SB06)
Apr 17, 2023	Electronic Filing System Acknowledgment Receipt
Apr 17, 2023	Amendment Submitted/Entered with Filing of Continued Prosecution Application (CPA)/Request for Continued Examination(RCE)
Apr 17, 2023	Claims
Apr 17, 2023	Applicant Arguments/Remarks Made in an Amendment
Apr 17, 2023	Fee Worksheet (SB06)
Apr 14, 2023	Communication - Re: Power of Attorney (PTOL-308)
Apr 14, 2023	Communication - Re: Power of Attorney (PTOL-308)
Apr 4, 2023	Power of Attorney
Apr 4, 2023	Electronic Filing System Acknowledgment Receipt
Mar 20, 2023	Advisory Action (PTOL-303)
Mar 20, 2023	Examiner Interview Summary Record (PTOL - 413)
Mar 20, 2023	List of references cited by examiner
Mar 20, 2023	After Final Consideration Program Decision
Mar 20, 2023	Amendment After Final or under 37CFR 1.312, initialed by the examiner.
Mar 3, 2023	Response After Final Action
Mar 3, 2023	After Final Consideration Program Request
Mar 3, 2023	Claims
Mar 3, 2023	Response After Final Action
Mar 3, 2023	Applicant Arguments/Remarks Made in an Amendment
Mar 3, 2023	Electronic Filing System Acknowledgment Receipt
Mar 3, 2023	Applicant Arguments/Remarks Made in an Amendment
Mar 3, 2023	Fee Worksheet (SB06)
Feb 28, 2023	Examiner Interview Summary Record (PTOL - 413)
Feb 28, 2023	Office Action Appendix
Jan 23, 2023	Final Rejection
Jan 23, 2023	List of references cited by examiner
Jan 23, 2023	Index of Claims
Jan 23, 2023	Search information including classification, databases and other search related notes
Jan 23, 2023	List of References cited by applicant and considered by examiner
Jan 23, 2023	Examiner's search strategy and results
Jan 23, 2023	Examiner's search strategy and results
Jan 5, 2023	Amendment/Request for Reconsideration-After Non-Final Rejection
Jan 5, 2023	Claims
Jan 5, 2023	Applicant Arguments/Remarks Made in an Amendment
Jan 5, 2023	Fee Worksheet (SB06)
Jan 5, 2023	Electronic Filing System Acknowledgment Receipt
Jan 4, 2023	Applicant Arguments/Remarks Made in an Amendment
Jan 4, 2023	Fee Worksheet (SB06)
Nov 10, 2022	Notice of Publication
Sep 21, 2022	Examiner Interview Summary Record (PTOL - 413)
Sep 21, 2022	Office Action Appendix
Sep 21, 2022	Information Disclosure Statement (IDS) Form (SB08)
Sep 21, 2022	Foreign Reference
Sep 21, 2022	Foreign Reference
Sep 21, 2022	Foreign Reference
Sep 21, 2022	Foreign Reference
Sep 21, 2022	Other reference-Patent/Application/Search Documents
Sep 21, 2022	Electronic Filing System Acknowledgment Receipt
Aug 10, 2022	Non-Final Rejection
Aug 10, 2022	List of references cited by examiner
Aug 10, 2022	Search information including classification, databases and other search related notes
Aug 10, 2022	Index of Claims
Aug 10, 2022	Examiner's search strategy and results
Aug 10, 2022	Examiner's search strategy and results
May 5, 2022	Request for Continued Examination (RCE)
May 5, 2022	Amendment Submitted/Entered with Filing of Continued Prosecution Application (CPA)/Request for Continued Examination(RCE)
May 5, 2022	Claims
May 5, 2022	Applicant Arguments/Remarks Made in an Amendment
May 5, 2022	Fee Worksheet (SB06)
May 5, 2022	Electronic Filing System Acknowledgment Receipt
May 4, 2022	Fee Worksheet (SB06)
Apr 11, 2022	Advisory Action (PTOL-303)
Apr 11, 2022	Examiner Interview Summary Record (PTOL - 413)
Apr 11, 2022	List of references cited by examiner
Apr 11, 2022	After Final Consideration Program Decision
Apr 11, 2022	Amendment After Final or under 37CFR 1.312, initialed by the examiner.
Apr 1, 2022	Fee Worksheet (SB06)
Apr 1, 2022	After Final Consideration Program Request
Apr 1, 2022	Response After Final Action
Apr 1, 2022	Claims
Apr 1, 2022	Applicant Arguments/Remarks Made in an Amendment
Apr 1, 2022	Electronic Filing System Acknowledgment Receipt
Mar 23, 2022	Examiner Interview Summary Record (PTOL - 413)
Mar 23, 2022	Office Action Appendix
Feb 4, 2022	Final Rejection
Feb 4, 2022	List of references cited by examiner
Feb 4, 2022	Search information including classification, databases and other search related notes
Feb 4, 2022	Index of Claims
Feb 4, 2022	Examiner's search strategy and results
Feb 4, 2022	Examiner's search strategy and results
Jan 25, 2022	Amendment/Request for Reconsideration-After Non-Final Rejection
Jan 25, 2022	Specification
Jan 25, 2022	Claims
Jan 25, 2022	Applicant Arguments/Remarks Made in an Amendment
Jan 25, 2022	Electronic Filing System Acknowledgment Receipt
Jan 14, 2022	Examiner Interview Summary Record (PTOL - 413)
Jan 14, 2022	Office Action Appendix
Oct 25, 2021	Non-Final Rejection
Oct 25, 2021	List of references cited by examiner
Oct 25, 2021	Index of Claims
Oct 25, 2021	Search information including classification, databases and other search related notes
Oct 25, 2021	Examiner's search strategy and results
Oct 25, 2021	List of References cited by applicant and considered by examiner
Oct 25, 2021	Bibliographic Data Sheet
Oct 25, 2021	Examiner's search strategy and results
Oct 1, 2021	Track One request Granted
Sep 10, 2021	Filing Receipt
Sep 10, 2021	Fee Worksheet (SB06)
Aug 31, 2021	Information Disclosure Statement (IDS) Form (SB08)
Aug 31, 2021	Electronic Filing System Acknowledgment Receipt
Aug 28, 2021	Power of Attorney
Aug 28, 2021	Authorization for Extension of Time all replies
Aug 28, 2021	Oath or Declaration filed
Aug 28, 2021	Oath or Declaration filed
Aug 28, 2021	Drawings-other than black and white line drawings
Aug 28, 2021	Specification
Aug 28, 2021	Claims
Aug 28, 2021	Abstract
Aug 28, 2021	Fee Worksheet (SB06)
Aug 28, 2021	Electronic Filing System Acknowledgment Receipt
Aug 28, 2021	Track One Request
Aug 28, 2021	Application Data Sheet
Aug 27, 2021	Placeholder sheet indicating presence of supplemental content in Supplemental Complex Repository for Examiners(SCORE)
Aug 27, 2021	Drawings-black and white line and/or other drawings

US11948550

CANADIAN IMPERIAL BANK OF COMMERCE

Application Number: US17460145
Filing Date: Aug 27, 2021
Status: Granted
Expiry Date: Aug 27, 2041