Multi-Core Computer Systems With Private/Shared Cache Line Indicators

Patent No. US11068410 (titled "Multi-Core Computer Systems With Private/Shared Cache Line Indicators") was filed by Array Cache Technologies LLC on Mar 4, 2019.

What is this patent about?

’410 is related to the field of cache coherence in multiprocessor systems, particularly those employing clustered cache hierarchies. Modern processors often group cores and their caches into clusters to reduce network congestion and improve scalability. However, maintaining cache coherence across these clusters introduces complexity, especially with traditional invalidation-based protocols that require extensive signaling.

The underlying idea behind ’410 is to simplify cache coherence in clustered systems by recognizing that data can be shared entirely within a cluster while remaining private from the perspective of other clusters. This is achieved by determining a common shared level (CSL) for each data block, which represents the highest level in the cache hierarchy where the block is shared within a cluster. By identifying this level, coherence operations can be localized, reducing the need for complex recursive protocols.
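As a rough illustration of the idea, the sketch below computes a CSL for the set of cores that touch a block. It assumes a regular tree-shaped hierarchy described by a hypothetical `fanouts` list (level 0 is a core's private cache, and each higher level groups neighbouring clusters under one shared cache); these names and the numbering scheme are assumptions for illustration, not taken from the patent.

```python
from typing import Iterable, Sequence


def cluster_id(core: int, level: int, fanouts: Sequence[int]) -> int:
    """Index of the cluster that contains `core` at `level`.

    Assumption: level 0 is the private cache of a single core, and level k
    groups fanouts[0] * ... * fanouts[k-1] neighbouring cores under one
    shared cache.
    """
    size = 1
    for fanout in fanouts[:level]:
        size *= fanout
    return core // size


def common_shared_level(cores: Iterable[int], fanouts: Sequence[int]) -> int:
    """Lowest level at which every accessing core sits under the same cache."""
    accessing = set(cores)
    if not accessing:
        raise ValueError("at least one accessing core is required")
    for level in range(len(fanouts) + 1):
        if len({cluster_id(c, level, fanouts) for c in accessing}) == 1:
            return level
    raise ValueError("cores do not share any level of this hierarchy")


# Hypothetical machine: pairs of cores share an L2, four L2s share an L3.
FANOUTS = [2, 4]
print(common_shared_level([0], FANOUTS))     # one core: private everywhere
print(common_shared_level([0, 1], FANOUTS))  # shared inside one L2 cluster
print(common_shared_level([0, 2], FANOUTS))  # shared only at the L3
```

Under these assumptions a block touched by a single core gets level 0, and the level only rises as far as the first cache that all sharers have in common.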

The claims of ’410 focus on a computer system with multiple processor cores and a clustered cache memory hierarchy. Each cache line includes a bit indicating whether it is private or shared. A key aspect is identifying the CSL for a memory block, based on where sharing occurs between cores. The CSL dictates which cache coherence operation is selected and performed when a coherence event arises, effectively tailoring the coherence protocol based on the data's sharing scope.
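The selection step described above might look like the following minimal sketch. The `CacheLine`, `Event`, `Op`, and `Action` types, and the mapping from events to operations, are hypothetical and chosen only to show how a private/shared bit and a CSL value could drive the choice; the patent claims do not prescribe this code.

```python
from dataclasses import dataclass
from enum import Enum


class Event(Enum):
    WRITE = "write"
    SYNC = "sync"  # assumed: a synchronization point such as an acquire


class Op(Enum):
    NONE = "none"
    WRITE_THROUGH = "write_through"
    SELF_INVALIDATE = "self_invalidate"


@dataclass(frozen=True)
class CacheLine:
    tag: int
    shared: bool  # the private/shared bit: False = private, True = shared


@dataclass(frozen=True)
class Action:
    op: Op
    scope_level: int  # hierarchy level that bounds the operation (the CSL)


def select_operation(line: CacheLine, event: Event, csl: int) -> Action:
    """Pick a coherence operation from the line's bit, the event, and the CSL."""
    if not line.shared or csl == 0:
        # Private data needs no coherence action.
        return Action(Op.NONE, 0)
    if event is Event.WRITE:
        # Propagate the new value no further than the common shared level.
        return Action(Op.WRITE_THROUGH, csl)
    # At a sync point, drop possibly stale copies held below the CSL.
    return Action(Op.SELF_INVALIDATE, csl)


print(select_operation(CacheLine(0x40, shared=True), Event.WRITE, csl=1))
```

The point of the sketch is that the CSL is the only hierarchy-aware input: the operations themselves stay simple and are merely bounded by it.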

In practice, the invention involves tracking the CSL for each data block (or page) and using this information to optimize coherence operations. When a core attempts to access a data block, the system determines the CSL and then applies coherence mechanisms (like self-invalidation or write-through) only within the cluster defined by that CSL. This localization reduces the overhead associated with maintaining coherence across the entire system.
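To make the localization concrete, the sketch below works out which cores fall inside the cluster that a given CSL defines, and therefore which private caches a write could affect. It again assumes a regular hierarchy described by a hypothetical `fanouts` list, and the function names and the returned dictionary are illustrative assumptions rather than anything specified in the patent.

```python
from typing import Dict, List, Sequence, Union


def cluster_cores(core: int, csl: int, fanouts: Sequence[int]) -> List[int]:
    """Cores whose private caches sit under the same level-`csl` cache as `core`.

    Assumption: level 0 is a private cache, and level k covers
    fanouts[0] * ... * fanouts[k-1] consecutive core ids.
    """
    if not 0 <= csl <= len(fanouts):
        raise ValueError("csl is outside the hierarchy")
    size = 1
    for fanout in fanouts[:csl]:
        size *= fanout
    first = (core // size) * size
    return list(range(first, first + size))


def handle_write(core: int, csl: int,
                 fanouts: Sequence[int]) -> Dict[str, Union[int, List[int]]]:
    """Describe the scope of coherence work triggered by a write from `core`."""
    peers = [c for c in cluster_cores(core, csl, fanouts) if c != core]
    return {
        # The written value is pushed up only as far as the CSL.
        "write_through_to_level": csl,
        # Only these cores may hold stale copies to self-invalidate.
        "self_invalidate_peers": peers,
    }


# Hypothetical machine: pairs of cores share an L2, four L2s share an L3.
print(handle_write(core=5, csl=1, fanouts=[2, 4]))
```

In this model a write with a CSL of 1 involves one neighbouring core, whereas a system-wide protocol would have to consider every core; cores outside the cluster never see the traffic.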

This approach differs from prior solutions that often rely on complex hierarchical directory protocols. Those protocols require intermediate nodes to behave as both root and leaf caches, which leads to a large number of states and increased implementation complexity. By encapsulating the hierarchical complexity in the simple function of determining the CSL, ’410 enables the use of simpler coherence mechanisms, such as self-invalidation and write-through, while restricting their operation to the scope defined by the CSL. The result is a more efficient and scalable coherence protocol.

How does this patent fit in the bigger picture?

Technical landscape at the time

In the mid-2010s, when ’410 was filed, cache coherence protocols were essential for maintaining data consistency in multiprocessor systems, and architectures commonly relied on explicit invalidation or update mechanisms to manage shared data across multiple cache levels. Hardware and software constraints made minimizing coherence traffic non-trivial, especially in clustered cache hierarchies where processors and their caches were grouped together to improve scalability.

Novelty and Inventive Step

The examiner approved the claims because the combination of limitations, particularly the identification of a common shared level among intermediary cache memory and shared memory based on which memory is shared between cores, was not found in the prior art. Specifically, prior art references failed to teach that each cache line has a bit signifying whether it is private or shared, and that a cache coherence operation is selected based on this common shared level.

Claims

There are 11 claims in this patent, with independent claims 1, 4, and 8. The independent claims are directed to a computer system and a method involving multiple processor cores and a clustered cache memory hierarchy, focusing on identifying a common shared level and selecting a cache coherence operation. The dependent claims generally specify details and limitations related to the private/shared bit and cache coherence operations within the computer system and method.

Key Claim Terms

Definitions of key terms used in the patent claims.

Term: Cache coherence operation
Source: Claim 1, Claim 4, Claim 8
Support in specification: “According to an embodiment, a method for cache coherence in a computer system having a clustered cache hierarchy, includes the steps of storing a common shared level (CSL) value for a data block stored in the clustered cache hierarchy; and when the data block is written, using a coherence mechanism to update the status of the data block for one or more caches within a cache cluster indicated by the CSL value and treating the data block as private for one or more caches outside of the cache cluster indicated by the CSL value.”
Interpretation: An action taken to maintain data consistency across multiple caches in the system.

Term: Clustered cache memory hierarchy
Source: Claim 4, Claim 8
Support in specification: “Recently, architectures have been introduced where processors (or cores), and their respective cache memory devices, are grouped together into clusters. This can reduce network congestion by localizing traffic among several hierarchical levels, potentially enabling much higher scalability.”
Interpretation: A memory system organization where caches are grouped into clusters, with local caches associated with cores and shared memories accessible by subsets of cores.

Term: Common shared level
Source: Claim 1, Claim 4, Claim 8
Support in specification: “According to an embodiment, a method for cache coherence in a computer system having a clustered cache hierarchy, includes the steps of storing a common shared level (CSL) value for a data block stored in the clustered cache hierarchy; and when the data block is written, using a coherence mechanism to update the status of the data block for one or more caches within a cache cluster indicated by the CSL value and treating the data block as private for one or more caches outside of the cache cluster indicated by the CSL value.”
Interpretation: A level in the computer system's memory hierarchy where a bit's value for a memory block transitions from private to shared as one moves away from the local cache memory.

Term: Private/shared bit
Source: Claim 1, Claim 4, Claim 8
Support in specification: “According to another embodiment, a computer system includes multiple processor cores, a clustered cache memory hierarchy including: at least one local cache memory associated with and operatively coupled to each core for storing one or more cache lines accessible only by the associated core; and a shared memory, the shared memory being operatively coupled to other shared memories or the local cache memories and accessible by a subset of cores that are transitively coupled to said shared memory via any number of local memories and intermediate shared memories, the shared memory being capable of storing a plurality of cache lines, wherein each cache line has a private/shared bit that signifies whether this cache line is private or shared in said shared memory.”
Interpretation: A bit associated with each cache line in shared memory that indicates whether the cache line is private or shared.

Patent Family

File Wrapper

The dossier documents provide a comprehensive record of the patent's prosecution history - including filings, correspondence, and decisions made by patent offices - and are crucial for understanding the patent's legal journey and any challenges it may have faced during examination.

US11068410

ARRAY CACHE TECHNOLOGIES LLC
Application Number: US16291154
Filing Date: Mar 4, 2019
Status: Granted
Expiry Date: Mar 17, 2036