October 5, 2023
This paper presents SweepCache, a new compiler/architecture co-design scheme that equips energy harvesting systems with a volatile cache in a performant yet lightweight way. Unlike prior just-in-time checkpointing designs, which persist volatile data just before a power failure and must therefore reserve additional energy for that purpose, SweepCache partitions a program into a series of recoverable regions and persists stores at region granularity, so that harvested energy is fully used for computation. In particular, SweepCache introduces a persist buffer, a redo buffer resident in nonvolatile memory (NVM), to keep main memory consistent across power failures while persisting each region's stores in a failure-atomic manner. Specifically, for writebacks during region execution, SweepCache saves their cachelines to the persist buffer. At each region end, SweepCache first flushes dirty cachelines to the buffer, allowing the next region to start with a clean cache, and then moves all buffered cachelines to their corresponding NVM locations. In this way, no matter when a power failure occurs, either the buffer contents or their memory locations always remain intact, which serves as the basis for correct recovery. To hide the persistence delay, SweepCache speculatively starts a region right after the prior region finishes its execution, as if the prior region's stores were already persisted, with each of the two regions having its own persist buffer (i.e., dual buffering). This region-level parallelism helps SweepCache achieve the full potential of a high-performance data cache. The experimental results show that, compared to the original cache-free nonvolatile processor, SweepCache delivers speedups of 14.60x and 14.86x for two representative energy harvesting power traces, respectively, outperforming the state-of-the-art work by 3.47x and 3.49x.
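The region-level persistence described above can be illustrated with a small software model. The following Python sketch is only a toy under stated assumptions, not SweepCache's actual hardware or compiler design: the class name `ToySweepCache`, its methods, the `committed` flag, and the recovery policy (discard an incomplete region's buffer, redo a complete one) are all hypothetical choices made for illustration. It also models a single persist buffer, so the dual-buffering overlap is left out.

```python
class ToySweepCache:
    """Toy model (hypothetical): volatile cache + NVM redo buffer + NVM memory."""

    def __init__(self, initial_memory):
        self.nvm = dict(initial_memory)  # nonvolatile main memory: addr -> value
        self.persist_buffer = {}         # nonvolatile redo buffer: addr -> value
        self.committed = False           # assumed nonvolatile flag: buffer holds a whole region
        self.cache = {}                  # volatile dirty lines, lost on power failure

    def store(self, addr, value):
        # Stores only touch the volatile cache during region execution.
        self.cache[addr] = value

    def evict(self, addr):
        # A mid-region writeback goes to the persist buffer, never directly to NVM.
        self.persist_buffer[addr] = self.cache.pop(addr)

    def end_region(self, drain_limit=None):
        # Step 1: flush all dirty lines to the buffer so the next region
        # starts with a clean cache, then mark the buffer as complete.
        self.persist_buffer.update(self.cache)
        self.cache.clear()
        self.committed = True
        # Step 2: move buffered lines to their NVM locations.
        return self._drain(drain_limit)

    def _drain(self, limit=None):
        # `limit` simulates a power failure after `limit` lines were written.
        for i, (addr, value) in enumerate(list(self.persist_buffer.items())):
            if limit is not None and i >= limit:
                return False             # failure: buffer is still intact
            self.nvm[addr] = value
        self.persist_buffer.clear()
        self.committed = False
        return True

    def recover(self):
        # After power failure the volatile cache is gone.
        self.cache.clear()
        if self.committed:
            self._drain()                # redo: replaying the buffer is idempotent
        else:
            self.persist_buffer.clear()  # incomplete region: NVM was never touched


# Case 1: power fails in the middle of a region.
sc = ToySweepCache({0: 1, 1: 1})
sc.store(0, 10)
sc.evict(0)
sc.store(1, 20)
sc.recover()
after_mid_region_failure = dict(sc.nvm)

# Case 2: power fails while the buffer is being moved to NVM.
sc = ToySweepCache({0: 1, 1: 1})
sc.store(0, 10)
sc.store(1, 20)
sc.end_region(drain_limit=1)
sc.recover()
after_drain_failure = dict(sc.nvm)
```

In the first case main memory still holds the pre-region values, because no store reached NVM before the failure; in the second case the intact buffer is replayed, so main memory ends up with all of the region's stores. Either outcome is a consistent state from which execution can resume at a region boundary.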
About Yuchen Zhou
Yuchen Zhou is a third-year Ph.D. student in the Department of Computer Science at Purdue University, under the guidance of Professor Changhee Jung. His research mainly focuses on energy harvesting systems and whole system persistence. In this work, he often engages in compiler and architecture co-design to reduce hardware complexity while achieving high performance.