Matches in SemOpenAlex for { <https://semopenalex.org/work/W1004303424> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W1004303424 abstract "In modern computer systems, the memory system plays a key role in determining the computer system's overall performance and power consumption. However, the memory system is also the most vulnerable component in the system, and it directly impacts the system's overall manufacturing costs and run-time reliability. As fabrication process technologies scale into the deep nanometer regime, both the frequency and scale of manufacturing defects (mostly caused by variability errors) and run-time errors (mostly caused by soft errors and wear-out) will increase. These errors will cause high manufacturing costs, information losses, and physical failures. However, conventional memory protection techniques such as error-correcting codes (ECC) and memory redundancy cannot handle errors that occur at such increasing frequency and cannot scale without incurring high VLSI overheads. This thesis proposes two-dimensional (2D) memory protection techniques for building highly reliable, available, and serviceable memory systems while maintaining low manufacturing costs and high yields. The key innovation of 2D memory protection is to take reconstruction of large-scale information loss off the critical path of normal operation, keeping it distinct from the low-overhead small-scale error detection and correction mechanisms. 2D memory protection can be applied at various levels of the memory system, from on-chip memory arrays to off-chip memory modules and nodes. This thesis proposes and evaluates three distinct applications of 2D memory protection techniques: 2D error coding, 2D erasure coding, and RunFlat memory, to combat multi-bit errors, variability errors, and node failures, respectively.
This thesis first proposes 2D error coding, a scalable multi-bit error protection technique applied 'within' embedded memory arrays, which combines in-line small-scale error correction and off-line large-scale error correction to detect and correct large-scale information losses (e.g., multi-bit upsets) at minimum VLSI overheads. This thesis evaluates this scheme in the cache hierarchies of two chip multiprocessor designs and shows that 2D error coding can correct clustered errors up to 32x32 bits during run time with significantly smaller performance, area, and power overheads than conventional techniques. Next, this thesis investigates how this increased resilience can be traded off for higher-density bitcells, higher cell performance, greater cell stability, and lower-power design by correcting variability-induced manufacture-time hard errors in embedded memory arrays, while maintaining ∼100% yield. By conducting a series of Monte Carlo simulations of scaled cell models with device variability, this thesis first identifies a strong potential for using multi-bit ECC for variability tolerance, and then proposes 2D erasure coding, a low-overhead multi-bit ECC designed to correct variability-induced manufacture-time hard errors at the speed of conventional single-bit ECC by making use of an erasure coding algorithm. The proposed scheme, when combined with a small amount of row redundancy, significantly improves memory access latency, power, and stability, while maintaining ∼100% yield and run-time reliability. Finally, this thesis proposes RunFlat memory, a highly reliable, available, and serviceable (RAS) distributed shared-memory (DSM) system that survives large-scale run-time hard errors such as node failures. RunFlat memory applies 2D protection 'across' off-chip memory arrays by combining a conventional block-level protection (e.g., ECC, 2D coding) and a node-level memory RAID protection.
RunFlat memory, combined with a hardware-based on-line memory reconfiguration mechanism, can detect and correct entire node failures, enable continued operation, and allow on-line repair service, while preserving the system's original performance and protection. Full-system simulations of a 16-node DSM server show that RunFlat memory incurs a negligible performance overhead in error-free mode and significantly reduced performance overheads when operating with a failed node." @default.
- W1004303424 created "2016-06-24" @default.
- W1004303424 creator A5050452730 @default.
- W1004303424 date "2008-01-01" @default.
- W1004303424 modified "2023-09-28" @default.
- W1004303424 title "Two-dimensional memory system protection" @default.
- W1004303424 hasPublicationYear "2008" @default.
- W1004303424 type Work @default.
- W1004303424 sameAs 1004303424 @default.
- W1004303424 citedByCount "0" @default.
- W1004303424 crossrefType "journal-article" @default.
- W1004303424 hasAuthorship W1004303424A5050452730 @default.
- W1004303424 hasConcept C100660578 @default.
- W1004303424 hasConcept C103088060 @default.
- W1004303424 hasConcept C111919701 @default.
- W1004303424 hasConcept C11413529 @default.
- W1004303424 hasConcept C115874739 @default.
- W1004303424 hasConcept C119907115 @default.
- W1004303424 hasConcept C127413603 @default.
- W1004303424 hasConcept C138885662 @default.
- W1004303424 hasConcept C149635348 @default.
- W1004303424 hasConcept C152124472 @default.
- W1004303424 hasConcept C176649486 @default.
- W1004303424 hasConcept C18131444 @default.
- W1004303424 hasConcept C200601418 @default.
- W1004303424 hasConcept C201995342 @default.
- W1004303424 hasConcept C41008148 @default.
- W1004303424 hasConcept C41895202 @default.
- W1004303424 hasConcept C53838383 @default.
- W1004303424 hasConcept C57863822 @default.
- W1004303424 hasConcept C63511323 @default.
- W1004303424 hasConcept C87907426 @default.
- W1004303424 hasConcept C92855701 @default.
- W1004303424 hasConcept C93446704 @default.
- W1004303424 hasConcept C9390403 @default.
- W1004303424 hasConcept C98986596 @default.
- W1004303424 hasConceptScore W1004303424C100660578 @default.
- W1004303424 hasConceptScore W1004303424C103088060 @default.
- W1004303424 hasConceptScore W1004303424C111919701 @default.
- W1004303424 hasConceptScore W1004303424C11413529 @default.
- W1004303424 hasConceptScore W1004303424C115874739 @default.
- W1004303424 hasConceptScore W1004303424C119907115 @default.
- W1004303424 hasConceptScore W1004303424C127413603 @default.
- W1004303424 hasConceptScore W1004303424C138885662 @default.
- W1004303424 hasConceptScore W1004303424C149635348 @default.
- W1004303424 hasConceptScore W1004303424C152124472 @default.
- W1004303424 hasConceptScore W1004303424C176649486 @default.
- W1004303424 hasConceptScore W1004303424C18131444 @default.
- W1004303424 hasConceptScore W1004303424C200601418 @default.
- W1004303424 hasConceptScore W1004303424C201995342 @default.
- W1004303424 hasConceptScore W1004303424C41008148 @default.
- W1004303424 hasConceptScore W1004303424C41895202 @default.
- W1004303424 hasConceptScore W1004303424C53838383 @default.
- W1004303424 hasConceptScore W1004303424C57863822 @default.
- W1004303424 hasConceptScore W1004303424C63511323 @default.
- W1004303424 hasConceptScore W1004303424C87907426 @default.
- W1004303424 hasConceptScore W1004303424C92855701 @default.
- W1004303424 hasConceptScore W1004303424C93446704 @default.
- W1004303424 hasConceptScore W1004303424C9390403 @default.
- W1004303424 hasConceptScore W1004303424C98986596 @default.
- W1004303424 hasLocation W10043034241 @default.
- W1004303424 hasOpenAccess W1004303424 @default.
- W1004303424 hasPrimaryLocation W10043034241 @default.
- W1004303424 hasRelatedWork W135608641 @default.
- W1004303424 hasRelatedWork W1493494529 @default.
- W1004303424 hasRelatedWork W1994524702 @default.
- W1004303424 hasRelatedWork W2010401603 @default.
- W1004303424 hasRelatedWork W2013590087 @default.
- W1004303424 hasRelatedWork W201936821 @default.
- W1004303424 hasRelatedWork W2051614723 @default.
- W1004303424 hasRelatedWork W2109240571 @default.
- W1004303424 hasRelatedWork W2119482888 @default.
- W1004303424 hasRelatedWork W2133454107 @default.
- W1004303424 hasRelatedWork W2303883103 @default.
- W1004303424 hasRelatedWork W2444018127 @default.
- W1004303424 hasRelatedWork W2566916692 @default.
- W1004303424 hasRelatedWork W2626846295 @default.
- W1004303424 hasRelatedWork W2760274952 @default.
- W1004303424 hasRelatedWork W2769363361 @default.
- W1004303424 hasRelatedWork W2789543534 @default.
- W1004303424 hasRelatedWork W3138493339 @default.
- W1004303424 hasRelatedWork W3162700383 @default.
- W1004303424 hasRelatedWork W2994085976 @default.
- W1004303424 isParatext "false" @default.
- W1004303424 isRetracted "false" @default.
- W1004303424 magId "1004303424" @default.
- W1004303424 workType "article" @default.
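
The abstract above describes 2D error coding as a cheap in-line check combined with an off-line reconstruction step. A minimal sketch of that general idea, not the thesis's actual scheme, might look like the following. It assumes a plain even-parity bit per row as the in-line check and a single row of column parities for reconstruction; the function names (`encode`, `detect`, `reconstruct`, `scrub`) are hypothetical. A single parity bit misses an even number of flips within a row, so a real design would presumably use a stronger horizontal code.

```python
from functools import reduce
from operator import xor


def row_parity(row):
    """Even-parity bit over one row of bits."""
    return reduce(xor, row, 0)


def encode(data):
    """Return (horizontal parity per row, vertical parity per column)."""
    horizontal = [row_parity(row) for row in data]
    vertical = [reduce(xor, col, 0) for col in zip(*data)]
    return horizontal, vertical


def detect(data, horizontal):
    """In-line check: indices of rows whose stored parity no longer matches."""
    return [i for i, row in enumerate(data) if row_parity(row) != horizontal[i]]


def reconstruct(data, vertical, bad_row):
    """Off-line recovery: XOR the vertical parity with every other row."""
    rebuilt = list(vertical)
    for i, row in enumerate(data):
        if i != bad_row:
            rebuilt = [a ^ b for a, b in zip(rebuilt, row)]
    return rebuilt


def scrub(data, horizontal, vertical):
    """Return a repaired copy of data; only one faulty row is recoverable here."""
    bad = detect(data, horizontal)
    if not bad:
        return [list(row) for row in data]
    if len(bad) > 1:
        raise ValueError("more than one faulty row; this sketch cannot recover")
    fixed = [list(row) for row in data]
    fixed[bad[0]] = reconstruct(data, vertical, bad[0])
    return fixed
```

In this sketch the per-row check runs on every access, while `reconstruct` touches the whole array and is only invoked after a mismatch, which mirrors the abstract's point about keeping large-scale recovery off the critical path.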