Systems And Methods For Common Data Ingestion

Patent No. US11379416 (titled "Systems And Methods For Common Data Ingestion") was filed by Jpmorgan Chase Bank Na on Mar 15, 2017.

What is this patent about?

’416 is related to the field of data processing, specifically addressing the challenge of efficiently ingesting data from diverse sources into a unified system. Organizations grapple with the complexity and cost of handling data that arrives in various formats and structures. The patent aims to streamline this process, reducing both monetary expenses and the man-hours required for data integration.

The underlying idea behind ’416 is to create a standardized data ingestion pipeline that automates the process of receiving, transforming, and storing data. This involves registering projects and their associated data, then using file-watcher threads to monitor a common drop zone for incoming data files matching predefined patterns. Upon detection, a data processing job is triggered, which includes classifying personal information, establishing security policies, and routing the data to appropriate data zones within a data reservoir.

The claims of ’416 focus on a method implemented in an information processing apparatus for common data ingestion. This involves registering a project with multiple data files, each having a specific characteristic and adhering to a predefined naming pattern. The claims also cover registering multiple data zones within a data reservoir, each zone associated with a different data characteristic like format, retention, and compression. The method includes receiving data files into a common drop zone, using file-watcher threads to monitor for predefined patterns , triggering a data processing job based on pattern matching, classifying personal information, establishing security policies, routing data to appropriate zones, and exporting the processed data to a target.

In practice, the system establishes a data reservoir logically partitioned into zones like archive, conformed, and semantic data zones. The archive zone stores raw data, the conformed zone stores reformatted data in a common format, and the semantic zone stores business-friendly data. The file watcher threads monitor the common drop zone, and upon detecting a file matching a registered project's pattern, a data ingestion service is triggered. This service then performs tasks like renaming the file with a timestamp to prevent collisions, classifying personal information, and routing the data to the appropriate data zone based on its characteristics.

The differentiation from prior approaches lies in the automated and standardized nature of the data ingestion process. Instead of manually handling each data source and format, the system uses a registration process and file-watcher threads to automatically trigger data processing jobs. The inclusion of personal information classification and security policy enforcement within the ingestion pipeline also enhances data governance and compliance. This self-service onboarding procedure allows data providers and consumers to easily integrate their data into the system, reducing the need for manual intervention and improving overall efficiency.

How does this patent fit in bigger picture?

Technical landscape at the time

In the mid-2010s when ’416 was filed, data ingestion systems were typically implemented using a combination of file systems, databases, and ETL (Extract, Transform, Load) processes. At a time when systems commonly relied on batch processing rather than real-time streaming for many data ingestion tasks, hardware or software constraints made efficient and scalable data ingestion pipelines non-trivial. It was also a time when security and compliance requirements were becoming increasingly important, making it necessary to incorporate security policies and personal information classification into data ingestion workflows.

Novelty and Inventive Step

The examiner approved the application because the prior art of record, including Neiman, Herrnstadt, Lietz, and Earhart, did not explicitly teach "a plurality of project data files, each project data file having a source file name that adheres to a predefined pattern; instantiating a plurality of file-watcher threads at a data ingestion service of the processing apparatus, wherein each of the file-watcher threads is configured to monitor for a corresponding pattern, monitoring, by one file watcher thread of the plurality of file-watcher threads, for incoming data from the common drop zone that is associated with the predefined pattern; determining, by the one file-watcher thread of the plurality of file watcher threads, that the project data files are associated with the predefined pattern".

Claims

This patent includes 13 claims, with claim 1 being independent. Independent claim 1 is directed to a method for common data ingestion. The dependent claims elaborate on and add detail to the method described in the independent claim, further defining aspects such as project registration, data zones, routing processes, and system resource requirements.

Key Claim Terms New

Definitions of key terms used in the patent claims.

Term (Source)Support for SpecificationInterpretation
Common drop zone
(Claim 1)
“In one embodiment, the project data may be received at a common drop zone, and the step of receiving project data associated with the project may include: the server associating a unique job identifier for each incoming file in the project data; renaming each incoming file with a name comprising a time stamp; and writing the renamed file to an archive.”A single location where project data files are received.
Data reservoir
(Claim 1)
“In one embodiment, the method may further include storing the project data in a data reservoir. In one embodiment, the data reservoir may include an archive data zone, a conformed data zone, and a semantic data zone. The archive data zone may store the project data in a format in which it was received. The conformed data zone may store reformatted project data in a common format.”A storage location containing multiple data zones with different data characteristics.
Data zones
(Claim 1)
“In one embodiment, the data reservoir may include an archive data zone, a conformed data zone, and a semantic data zone. The archive data zone may store the project data in a format in which it was received. The conformed data zone may store reformatted project data in a common format.”Distinct areas within a data reservoir, each associated with a different data characteristic such as data format, retention setting, and compression state.
File-watcher threads
(Claim 1)
“In one embodiment, the method may further include using a file watcher thread to monitor incoming data for at least one pattern.”Multiple threads configured to monitor for specific patterns in incoming data.
Project data files
(Claim 1)
“Organizations receive data from many sources and of many different types. Processing this data is expensive, in both monetary cost and in man-hours.”Data files associated with a specific project, each having a project data characteristic and a source file name that adheres to a predefined pattern.

Litigation Cases New

US Latest litigation cases involving this patent.

Case NumberFiling DateTitle
2:25-cv-00595Apr 18, 2025Digitaldoors, Inc. v. SouthPoint Bank
8:25-cv-00002Jan 1, 2025Digital Doors, Inc. v. Sandy Spring Bank

Patent Family

Patent Family

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history - including filings, correspondence, and decisions made by patent offices - and are crucial for understanding the patent's legal journey and any challenges it may have faced during examination.

  • Date

    Description

  • Get instant alerts for new documents

US11379416

JPMORGAN CHASE BANK NA
Application Number
US15459374
Filing Date
Mar 15, 2017
Status
Granted
Expiry Date
Jun 29, 2038
External Links
Slate, USPTO, Google Patents