Systems And Methods For Dynamically Improving User Intelligibility Of Synthesized Speech In A Work Environment

Patent No. US11817078 (titled "Systems And Methods For Dynamically Improving User Intelligibility Of Synthesized Speech In A Work Environment") was filed by Vocollect Inc on Jun 2, 2023.

What is this patent about?

The ’078 patent relates to speech-based systems, specifically the challenge of maintaining speech intelligibility in dynamic work environments. Existing systems often rely on static text-to-speech (TTS) engine settings, which can become inadequate because of ambient noise, unfamiliar prompts, or user fatigue, leading to errors and reduced efficiency in speech-directed tasks such as inventory management.

The underlying idea behind ’078 is to dynamically adjust the operational parameters of a TTS engine in response to real-time environmental conditions. Instead of relying on fixed settings, the system monitors factors that affect speech intelligibility and automatically modifies parameters like speed, pitch, or volume to optimize the user's comprehension of synthesized speech. This adaptation aims to improve accuracy and reduce frustration in speech-driven workflows.

The claims of ’078 focus on a communication system incorporating a speech recognition system and a TTS engine, where processing circuitry accesses an inventory management system to provide tasks audibly to a user. The key element is the system's ability to monitor an environmental condition, such as ambient noise or the user's experience level with the system, and then modify an operational parameter of either the TTS engine or the speech recognition system based on that condition.

In practice, the system might use sensors to detect high ambient noise levels in a warehouse. Upon detecting such noise, the processing circuitry could automatically increase the volume of the TTS engine to ensure the user can clearly hear instructions. Alternatively, if the system detects that a user is relatively new to a particular task or application area, it might automatically slow down the speech rate of the TTS engine to improve comprehension.
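
The two adjustments described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `TtsSettings` type, the `adapt_settings` function, and every threshold value are hypothetical names and numbers chosen for the example, since the summary does not give concrete figures.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TtsSettings:
    """Hypothetical container for adjustable TTS parameters."""
    volume_pct: int  # 0 (silent) to 100 (maximum)
    rate_wpm: int    # speech rate in words per minute


# Assumed values for illustration only; the source gives no numbers.
NOISE_THRESHOLD_DB = 75.0  # above this, the environment counts as noisy
VOLUME_STEP_PCT = 20       # how much to raise the volume when noisy
NOVICE_RATE_PCT = 80       # new users hear speech at 80% of the base rate


def adapt_settings(base: TtsSettings, ambient_noise_db: float,
                   user_is_new: bool) -> TtsSettings:
    """Return a copy of `base` adjusted for the monitored conditions."""
    volume = base.volume_pct
    rate = base.rate_wpm
    if ambient_noise_db > NOISE_THRESHOLD_DB:
        # Raise the volume, capped at the maximum.
        volume = min(100, volume + VOLUME_STEP_PCT)
    if user_is_new:
        # Slow the speech rate for a less experienced user.
        rate = rate * NOVICE_RATE_PCT // 100
    return replace(base, volume_pct=volume, rate_wpm=rate)


if __name__ == "__main__":
    base = TtsSettings(volume_pct=50, rate_wpm=180)
    print(adapt_settings(base, ambient_noise_db=85.0, user_is_new=True))
```

Returning a modified copy rather than mutating the base settings keeps the user's normal configuration intact, so the adjustment can lapse once the condition passes.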

This approach differentiates itself from prior solutions by providing an automated, context-aware adaptation of TTS parameters. Instead of requiring users to manually adjust settings, the system proactively optimizes speech intelligibility based on real-time conditions. This dynamic adjustment ensures that users receive clear and understandable instructions, even in challenging environments, leading to improved efficiency and reduced errors in speech-directed work.

How does this patent fit in the bigger picture?

Technical landscape at the time

In the early 2010s, when the technology behind ’078 appears to have originated (the application itself was filed Jun 2, 2023, but its expiry date of May 18, 2032 suggests a priority date around 2012), speech recognition and text-to-speech (TTS) systems were becoming more prevalent in mobile computing, particularly for hands-free operation in work environments. Such systems commonly relied on wireless communication between a central computer and wearable devices, and adjusting TTS engine parameters often required navigating through multiple menus, which was cumbersome and time-consuming for users experiencing temporary changes in their environment or task.

Novelty and Inventive Step

The examiner allowed the claims because they were deemed non-obvious over Kirsch et al. in view of Veprek et al., both of which were already of record. The examiner stated that the prior art failed to teach or fairly suggest the limitations recited in the independent claims, as specifically amended. The dependent claims were also considered allowable because they further limited the allowable independent claims.

Claims

This patent contains 30 claims, with independent claims numbered 1, 10, 17, and 24. The independent claims are directed to a communication system and a method for managing tasks within an inventory system, using speech recognition and text-to-speech technology, and adjusting operational parameters based on environmental conditions. The dependent claims generally elaborate on and refine the specifics of the communication system and method described in the independent claims.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term: Environmental condition
Source: Claim 1, Claim 10, Claim 17, Claim 24
Support in Specification: “Processing circuitry monitors an environmental condition related to intelligibility of an output of the text-to-speech engine, and modifies the one or more adjustable operational parameters of the text-to-speech engine in response to the monitored environmental condition. The device 12 may also determine that the user 13 is in close proximity to other users 13 (and thus subjected to higher levels of background noise or talking) by monitoring Bluetooth® signals to detect other nearby devices 12 of other users. For example, the total number of hours logged by a specific user 13 may determine the level of user experience (e.g., a less experienced user may require a slower setting in the text-to-speech engine) with a text-to-speech engine, or the level of experience with an area of a task application, or the level of experience with a specific task application.”
Interpretation: A condition that is monitored and used to modify an operational parameter of the text-to-speech engine or speech recognition system. Examples include ambient noise level and user experience level.

Term: Inventory management system
Source: Claim 1, Claim 10, Claim 17, Claim 24
Support in Specification: “Inventory-driven industries rely on computerized inventory management systems for performing various diverse tasks, such as food and retail product distribution, manufacturing, and quality control. An overall integrated management system typically includes a combination of a central computer system for tracking and management, and the people who use and interface with the computer system in the form of order fillers and other users. In one scenario, the users handle the manual aspects of the integrated management system under the command and control of information transmitted from the central computer system to the wireless mobile device and to the user through a speech-driven interface.”
Interpretation: A system that provides one or more tasks that are audibly output to the user via the text-to-speech engine.

Term: Operational parameter
Source: Claim 1, Claim 10, Claim 17, Claim 24
Support in Specification: “The speech synthesizer or TTS engine operating in the system or on the device translates the system messages into speech, and typically provides the user with adjustable operational parameters or settings such as audio volume, speed, and pitch. As users become more experienced with the operation of the inventory management system, they will typically increase the speech rate and/or pitch of the TTS engine.”
Interpretation: A setting of the text-to-speech engine or speech recognition system that can be modified based on a monitored environmental condition.

Term: Speech recognition system
Source: Claim 1, Claim 10, Claim 17, Claim 24
Support in Specification: “Speech recognition has simplified many tasks in the workplace by permitting hands-free communication with a computer as a convenient alternative to communication via conventional peripheral input/output devices. A user may enter data and commands by voice using a device having a speech recognizer. Speech from the user is captured by the headset and converted using speech recognition into data used by the central computer system.”
Interpretation: A system that receives speech from a user and converts it into text.

Term: Text-to-speech engine
Source: Claim 1, Claim 10, Claim 17, Claim 24
Support in Specification: “Commands, instructions, or other information may also be communicated to the user by a speech synthesizer. Generally, the synthesized speech is provided by a text-to-speech (TTS) engine. Similarly, instructions from the central computer or mobile device in the form of text are delivered to the user as voice prompts generated by the TTS engine and played through the headset speaker.”
Interpretation: A component that provides audible output to the user.
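
The specification excerpt for "Environmental condition" mentions two inputs: other users' devices detected nearby over Bluetooth, and the total hours a user has logged. A minimal Python sketch of turning those inputs into a monitored condition might look like the following. The names `EnvironmentalCondition`, `classify_experience`, and `assess_condition`, and all cutoff values, are hypothetical; the actual Bluetooth scan is outside the sketch and is represented by a plain list of device identifiers.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EnvironmentalCondition:
    """Hypothetical summary of the monitored condition."""
    crowded: bool    # enough other users' devices detected nearby
    experience: str  # "novice", "intermediate", or "experienced"


# Assumed cutoffs for illustration only; the source gives no numbers.
CROWD_DEVICE_COUNT = 3     # nearby devices needed to count as crowded
NOVICE_HOURS = 40.0        # below this many logged hours: novice
EXPERIENCED_HOURS = 400.0  # at or above this: experienced


def classify_experience(hours_logged: float) -> str:
    """Map total logged hours to a coarse experience level."""
    if hours_logged < NOVICE_HOURS:
        return "novice"
    if hours_logged < EXPERIENCED_HOURS:
        return "intermediate"
    return "experienced"


def assess_condition(nearby_device_ids: Iterable[str], own_device_id: str,
                     hours_logged: float) -> EnvironmentalCondition:
    """Combine a device scan result and logged hours into one condition."""
    # Deduplicate the scan results and ignore the user's own device.
    others = set(nearby_device_ids) - {own_device_id}
    return EnvironmentalCondition(
        crowded=len(others) >= CROWD_DEVICE_COUNT,
        experience=classify_experience(hours_logged),
    )


if __name__ == "__main__":
    print(assess_condition(["a", "b", "c", "me"], "me", hours_logged=12.0))
```

A crowded result could then stand in for higher background noise, and a novice result for a slower speech setting, in line with the examples quoted in the excerpt.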

Patent Family

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history - including filings, correspondence, and decisions made by patent offices - and are crucial for understanding the patent's legal journey and any challenges it may have faced during examination.

Patent No.: US11817078
Applicant: VOCOLLECT INC
Application Number: US18328189
Filing Date: Jun 2, 2023
Status: Granted
Expiry Date: May 18, 2032