Readings

About

This reading list collects papers, articles, books, tutorials, videos, and other material for research purposes. Items are categorized by topic and keyword.


Table of Contents

Recent

LLM Serving

Categories

Erasure Coding

Erasure Coding (basics)

Venue Title Link / Summary Brief
Summary Must-know concepts Summary EC basic concepts and keywords
Manuscript An Introduction to Galois Fields and Reed-Solomon Coding Link Intro to finite fields and RS codes (communication) from Clemson Univ.
Manuscript Reed-Solomon Codes Link Intro to RS codes from Duke Univ.
USENIX Login'13 Erasure Codes for Storage Systems: A Brief Primer Summary Plank, EC basics
FAST Tutorial'13 Tutorial: Erasure Coding for Storage Systems Summary Plank, EC tutorial
FAST'09 A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage Summary Plank, EC computation evaluation

Network Coding (basics)

Venue Title Link / Summary Brief
FAST'11 Poster Repairing Erasure Codes Link Network coding for storage (poster)
IEEE Survey'11 A Survey on Network Codes Summary Network coding for storage (survey)
TIT'10 Network Coding for Distributed Storage Systems Summary Network coding for storage video, report
PPT Regenerating codes for distributed storage Link Network Coding, intro

Erasure Codes

Venue Title Link / Summary Brief
FAST'23 Practical Design Considerations for Wide Locally Recoverable Codes (LRCs) Summary Uniform Cauchy LRC, wide stripe, LRC
SRDS'22 XHR-Code: An Efficient Wide Stripe Erasure Code to Reduce Cross-Rack Overhead in Cloud Storage Systems Summary XHR-Code, repair, wide stripe, hierarchical settings, multiple failures
MSST'19 AZ-Code: An Efficient Availability Zone Level Erasure Code to Provide High Fault Tolerance in Cloud Storage Systems Link AZ-Code
ISIT'18 Codes with Combined Locality and Regeneration Having Optimal Rate, d_min and Linear Field Size Link Local Regenerating Codes, LRC, regenerating codes
DSN'18 Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments Link Alpha Entanglement Codes, multiple failures
ATC'18 On Fault Tolerance, Locality, and Optimality in Locally Repairable Codes Summary LRC, comparison, Ceph
FAST'18 RAID+: Deterministic and Balanced Data Distribution for Large Disk Enclosures Link RAID+, load balancing
FAST'18 Clay Codes: Moulding MDS Codes to Yield an MSR Code Summary Clay codes, MSR codes
TIT'17 Explicit constructions of high-rate MDS array codes with optimal repair bandwidth Link Ye-Barg codes, MSR codes
ISIT'16 Double Regenerating Codes for hierarchical data centers Link DRC, MSR codes, hierarchical settings
STOC'16 Repairing Reed-Solomon Codes Link RS codes, repair, sub-packetization
FAST'16 Opening the Chrysalis: On the Real Repair Performance of MSR Codes Summary Butterfly codes, MSR codes
FAST'15 Having Your Cake and Eating It Too: Jointly Optimal Erasure Codes for I/O, Storage, and Network-bandwidth Summary PM-RBT codes, MSR codes
TOS'14 Sector-Disk (SD) Erasure Codes for Mixed Failure Modes in RAID Systems Link Sector-Disk (SD) codes, sector-disk failures
TIT'14 A family of optimal locally recoverable codes Summary Optimal LRCs, LRC
TIT'14 Locally Repairable Codes Link LRC
TIT'14 Codes With Local Regeneration and Erasure Correction Summary Local Regenerating Codes, LRC, multiple failures
TIT'14 Repair locality with multiple erasure tolerance Link LRC, multiple failures
SIGCOMM'14 A “Hitchhiker’s” Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers Summary Hitchhiker codes, regenerating codes, piggybacking codes
FAST'14 STAIR Codes: A General Family of Erasure Codes for Tolerating Device and Sector Failures in Practical Storage Systems Summary STAIR Codes, sector-disk failures
PVLDB'13 XORing Elephants: Novel Erasure Codes for Big Data Summary Xorbas codes, LRC
HotStorage'13 A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster Link Piggybacking codes
TIT'13 Zigzag Codes: MDS Array Codes With Optimal Rebuilding Link Zigzag Codes, regenerating codes
ISIT'13, TIT'17 A Piggybacking Design Framework for Read-and Download-efficient Distributed Storage Codes Link Piggybacking codes
TOS'12 Generalized X-code: An efficient RAID-6 code for arbitrary size of disk array Summary Generalized X-codes
TIT'12 On the Locality of Codeword Symbols Link Theory of LRCs
ATC'12 Erasure Coding in Windows Azure Storage Summary Azure-LRC
INFOCOM'12 Simple regenerating codes: Network coding for cloud storage Link Simple regenerating code
ISIT'10, TIT'11 Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction Link Product-Matrix Code
TOS'09 GRID codes: Strip-based erasure codes with high fault tolerance for storage systems Summary GRID codes
NCA'07 Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems Summary, Summary (older, for TOS'13) Pyramid Codes, LRC
FAST'04 Improving Storage System Availability with D-GRAID Link D-GRAID codes, RAID
FAST'04 Row-Diagonal Parity for Double Disk Failure Correction Link RDP codes, array codes, RAID
ATC'1996 AFRAID - A Frequently Redundant Array of Independent Disks Link AFRAID, RAID
ISCA'1994, TC'1995 EVENODD: an optimal scheme for tolerating double disk failures in RAID architectures Link EVENODD codes, array codes, RAID
SIGMOD'1988 A Case for Redundant Arrays of Inexpensive Disks (RAID) Link RAID
SIGMETRICS Perf Eval. Review'1995 Striping in a RAID level 5 disk array Link RAID striping, RAID
SIAM'1960 Polynomial Codes Over Certain Finite Fields Summary RS codes (the original version)
Monograph from Prof. P. Vijay Kumar Codes for Distributed Storage Link EC theory basics and survey (including RS, MSR, LRC, etc.)

Redundancy Transitioning

Venue Title Link / Summary Brief
ISIT'23 Locally Repairable Convertible Codes: Erasure Codes for Efficient Repair and Conversion Summary LRC conversion, code conversion, LRC
OSDI'22 Tiger: disk-adaptive redundancy without placement restrictions Summary Tiger, redundancy transitioning, disk heterogeneity
ISIT'22 Bandwidth Cost of Code Conversions in the Split Regime Link Convertible codes: bandwidth, code conversion, theory
ISIT'21, TIT'23 Bandwidth Cost of Code Conversions in Distributed Storage: Fundamental Limits and Optimal Constructions Link Convertible codes: bandwidth, code conversion, theory
INFOCOM'22 Optimal Data Placement for Stripe Merging in Locally Repairable Codes Summary LRC stripe merging, code conversion, LRC
ICDCS'21 StripeMerge: Efficient Wide-Stripe Generation for Large-Scale Erasure-Coded Storage Summary StripeMerge, wide stripe, code conversion
OSDI'20 Pacemaker: avoiding HeART attacks in storage clusters with disk-adaptive redundancy Summary PACEMAKER, redundancy transitioning, disk heterogeneity
SRDS'20 Enabling I/O-Efficient Redundancy Transitioning in Erasure-Coded KV Stores via Elastic Reed-Solomon Codes Summary Elastic Reed-Solomon (ERS) codes, redundancy transitioning
INFOCOM'20 On the Optimal Repair-Scaling Trade-off in Locally Repairable Codes Summary LRC Repair-Scaling Tradeoff, redundancy transitioning, LRC
IEEE Access'20 Efficient Storage Scaling for MBR and MSR Codes Summary MSR codes, scaling
ITCS'20, TIT'22 Convertible Codes: New Class of Codes for Efficient Conversion of Coded Data in Distributed Storage Summary Convertible Codes: I/O, code conversion
ISIT'20 Access-optimal Linear MDS Convertible Codes for All Parameters Summary Access-optimal Convertible Codes
FAST'19 Cluster storage systems gotta have HeART: improving storage efficiency by exploiting disk-reliability heterogeneity Summary HeART, disk heterogeneity, redundancy transitioning
ISIT'18 Generalized Optimal Storage Scaling via Network Coding Summary Network coding, scaling
INFOCOM'18, TPDS'22 Toward Optimal Storage Scaling via Network Coding: From Theory to Practice Summary NCScale, scaling, network coding
TPDS'16 I/O-Efficient Scaling Schemes for Distributed Storage Systems with CRS Codes Link CRS, scaling
DSN'15, TPDS'17 Enabling Efficient and Reliable Transition from Replication to Erasure Coding for Clustered File Systems Link Replication to EC, redundancy transitioning
FAST'15 A Tale of Two Erasure Codes in HDFS Summary HACFS, redundancy transitioning
TC'15 Accelerate RDP RAID-6 Scaling by Reducing Disk I/Os and XOR Operations Link RAID, scaling
TPDS'14 An Efficient Scaling Scheme for RS-Coded Storage Clusters Summary Scale-RS, scaling
ICPP'12 GSR: A Global Stripe-Based Redistribution Approach to Accelerate RAID-5 Scaling Link GSR, RAID, scaling (C. Wu)
FAST'11 Accelerate RAID Scaling by Minimizing Data Migration Link FastScale, RAID, scaling
TOCS'1996 The HP AutoRAID Hierarchical Storage System Link AutoRAID, replication to RAID

Erasure Coding Reliability Analysis

Venue Title Link / Summary Brief
SRDS'17, TPDS'19 SimEDC: A Simulator for the Reliability Analysis of Erasure-Coded Data Centers Link SimEDC
HotStorage'10 Mean time to meaningless: MTTDL, Markov models, and storage system reliability Link MTTDL Meaningless
OSDI'09 Availability in Globally Distributed Storage Systems Summary Google Availability
I2TS'08 When MTTDLs Are Not Good Enough: Providing Better Estimates of Disk Array Reliability Link Calculation of MTTDL (1)
SNAPI'07 Outshining Mirrors: MTTDL of Fixed-Order SSPiRAL Layouts Link Calculation of MTTDL (2)

Techniques for Erasure Coding

Venue Title Link / Summary Brief
ATC'23 Explore Data Placement Algorithm for Balanced Recovery Load Distribution Summary Recovery, data placement
IPDPS'23 Boosting Multi-Block Repair in Cloud Storage Systems with Wide-Stripe Erasure Coding Summary Multiple repair, wide stripe
ICPP'23 Toward Optimal Repair and Load Balance in Locally Repairable Codes Summary LRC, repair, load balancing
ICDCS'22 PivotRepair: Fast Pipelined Repair for Erasure-Coded Hot Storage Link repair
ICPP'22 Exploiting Parallelism of Disk Failure Recovery via Partial Stripe Repair for an Erasure-Coded High-Density Storage Server Link repair, high density storage
ICPP'22 Repair-Optimal Data Placement for Locally Repairable Codes with Optimal Minimum Hamming Distance Summary LRC, repair, data placement
ATC'21 Boosting Full-Node Repair in Erasure-Coded Storage Summary RepairBoost, full-node recovery
SOSP'21 Geometric Partitioning: Explore the Boundary of Optimal Erasure Code Repair Link Geometric Partitioning
FAST'21 Exploiting Combined Locality for Wide-Stripe Erasure Coding in Distributed Storage Summary, Summary (earlier) ECWide, repair, LRC, wide stripe
ICPP'21 Multi-level Forwarding and Scheduling Repair Technique in Heterogeneous Network for Erasure-coded Clusters Link repair, heterogeneous
IWQoS'21 EC-Scheduler: A Load-Balanced Scheduler to Accelerate the Straggler Recovery for Erasure Coded Storage Systems Summary repair, load balancing
IPDPS'20 EC-Fusion: An Efficient Hybrid Erasure Coding Framework to Improve Both Application and Recovery Performance in Cloud Storage Systems Link EC-Fusion, multiple erasure codes
HotStorage'20 SelectiveEC: Selective Reconstruction in Erasure-coded Storage Systems Summary SelectiveEC, load balancing
EuroSys'20 RAIDP: replication with intra-disk parity Summary RAID-P
FAST'20 CRaft: An Erasure-coding-supported Version of Raft for Reducing Storage Cost and Network Cost Link CRaft
FAST'19 Fast Erasure Coding for Data Storage: A Comprehensive Study of the Acceleration Techniques Summary repair acceleration
DSN'19 Fast Predictive Repair in Erasure-Coded Storage Summary FastPR, repair, parallelization
ICPP'19 Fast Recovery Techniques for Erasure-coded Clusters in Non-uniform Traffic Network Link multiple failure repair
ATC'17 Repair Pipelining for Erasure-Coded Storage Summary ECPipe, repair, parallelization
ATC'17 PARIX: Speculative Partial Writes in Erasure-Coded Systems Link Parix
EuroSys'16 Partial-Parallel-Repair (PPR): A Distributed Technique for Repairing Erasure Coded Storage Summary PPR, repair, parallelization
MSST'13 CORE: Augmenting Regenerating-Coding-Based Recovery for Single and Concurrent Failures in Distributed Storage Systems Link CORE, repair, multi-failure
SYSTOR'14 Lazy Means Smart: Reducing Repair Bandwidth Costs in Erasure-coded Distributed Storage Link Lazy recovery
TC'14 Boosting Degraded Reads in Heterogeneous Erasure-Coded Storage Systems Summary degraded read, heterogeneous network
FAST'14 Parity Logging with Reserved Space: Towards Efficient Updates and Recovery in Erasure-coded Clustered Storage Link CodFS
MSST'12 On the speedup of single-disk failure recovery in XOR-coded storage systems: Theory and practice Summary Zhu, replace recovery algorithms for XOR-based codes
FAST'12 Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads Summary Khan, RotatedRS, repair I/O improvement

Erasure-coded Systems

Venue Title Link / Summary Brief
NSDI'22 C2DN: How to Harness Erasure Codes at the Edge for Efficient Content Delivery Summary C2DN
FAST'22 Hydra: Resilient and Highly Available Remote Memory Link Hydra, RDMA
FAST'22 DEPART: Replica Decoupling for Distributed Key-Value Storage Link DEPART, distributed KVStore, EC
NSDI'20 Near-Optimal Latency Versus Cost Tradeoffs in Geo-Distributed Storage Summary PANDO, consensus, EC
SC'20 INEC: Fast and Coherent In-Network Erasure Coding Link INEC, RDMA
SC'19 TriEC: tripartite graph based erasure coding NIC offload Link TriEC, RDMA
SoCC'19 Coupling Decentralized Key-Value Stores with Erasure Coding Summary ECHash, KVStore
HPDC'19 UMR-EC: A Unified and Multi-Rail Erasure Coding Library for High-Performance Distributed Storage Systems Link UMR-EC, RDMA
FAST'19 OpenEC: Toward Unified and Configurable Erasure Coding Management in Distributed Storage Systems Summary OpenEC
ICDCS'17 High-Performance and Resilient Key-Value Store with Online Erasure Coding for Big Data Workloads Link RDMA
ATC'17 Giza: Erasure Coding Objects across Global Data Centers Link Giza, consensus
FAST'16 Efficient and Available In-memory KV-Store with Hybrid Erasure Coding and Replication Link Cocytus, KVStore
OSDI'16 EC-Cache: Load-Balanced, Low-Latency Cluster Caching with Online Erasure Coding Summary EC-Cache
OSDI'14 Pelican: A Building Block for Exascale Cold Data Storage Summary Pelican, cold DSS
FAST'12 NCCloud: A Network-Coding-Based Storage System in a Cloud-of-Clouds Summary NCCloud, network coding

Miscellaneous

Venue Title Link / Summary Brief
IPTPS'02 Erasure coding vs. replication: a quantitative comparison Link EC vs replication

Storage Systems and Cloud

Venue Title Link / Summary Brief
ATC'19 Dayu: Fast and Low-interference Data Recovery in Very-large Storage Systems Link Dayu, recovery
SYSTOR'19 Kurma: Secure Geo-Distributed Multi-Cloud Storage Gateways Summary Kurma
ATC'14 SCFS: A Shared Cloud-backed File System Summary SCFS, Depsky extension
SoCC'14 Hybris: Robust Hybrid Cloud Storage Summary Hybris
SOSP'13 SPANStore: Cost-Effective Geo-Replicated Storage Spanning Multiple Cloud Services Summary SPANStore
OSDI'12 Flat Datacenter Storage Link Flat Datacenter Storage
EuroSys'11 DEPSKY: A High-Availability and Integrity Layer for Cloud Storage Summary Depsky
SoCC'10 RACS: a case for cloud storage diversity Summary RACS

Blockchain

Venue Title Link / Summary Brief
Bitcoin white paper Bitcoin: A Peer-to-Peer Electronic Cash System Summary Bitcoin white paper
Ethereum yellow paper Ethereum: A secure decentralised generalised transaction ledger Link Ethereum yellow paper
GitHub Repo Self-maintained blockchain paper list Repo 1, Repo 2 -
Tutorial Blockchain tutorial from Liao Xuefeng Link -
FAST'24 COLE: A Column-based Learned Storage for Blockchain Systems Summary COLE
SIGMOD'24 ChainKV: A Semantics-Aware Key-Value Store for Ethereum System Link ChainKV
Frontiers of CS Dynamic-EC: an efficient dynamic erasure coding method for permissioned blockchain systems Summary Dynamic-EC
HPCA'24 Rapper: A Parameter-Aware Repair-in-Memory Accelerator for Blockchain Storage Platform Link Blockchain, EC
ACM Computing Survey'24 Scaling Blockchains with Error Correction Codes: A Survey on Coded Blockchains Link Blockchain, coding
TC'24 BFT-DSN: A Byzantine Fault-Tolerant Decentralized Storage Network Link BFT, EC
IOTJ'24 TORR: A Lightweight Blockchain for Decentralized Federated Learning Link Blockchain, EC, AI
TKDE'23 PartitionChain: A Scalable and Reliable Data Storage Strategy for Permissioned Blockchain Summary PartitionChain
TC'23 Efficient Integrity Auditing Mechanism With Secure Deduplication for Blockchain Storage Link Blockchain, security, deduplication
ICPADS'23 DW-LRC: A Dynamic Wide-stripe LRC Codes for Blockchain Data Under Malicious Node Scenarios Link Blockchain, EC, LRC
IOTJ'23 On Min–Max Storage for Resource-Restricted Clients in Coded Blockchain Systems Link Blockchain, coding
TDSC'22 Enabling Secure and Efficient Decentralized Storage Auditing With Blockchain Link Blockchain, security, coding
ISIT'22 Polar Coded Merkle Tree: Improved Detection of Data Availability Attacks in Blockchain Systems Link Blockchain, Merkle tree, Coding
IOTJ'22 Proof of Continuous Work for Reliable Data Storage Over Permissionless Blockchain Link Permissionless blockchain, EC
COMNET'22 Speeding up block propagation in Bitcoin network: Uncoded and coded designs Link Bitcoin, coding
TCOM'22 Overcoming Data Availability Attacks in Blockchain Systems: Short Code-Length LDPC Code Design for Coded Merkle Tree Link Blockchain, Merkle tree, coding
SmartWorld'22 A Lightweight Locally Repairable Code-based Storage Architecture for Blockchains Link Blockchain, coding, LRC
WCNC'22 Secure and Private Fountain Code based Architecture for Blockchains Link Blockchain, coding
IEEE S&P (Oakland)'21 Red Belly: A Secure, Fair and Scalable Open Blockchain Link Red Belly
TIFS'21 PolyShard: Coded Sharding Achieves Linearly Scaling Efficiency and Security Simultaneously Link Polyshard, blockchain, sharding
TON'21 Coding for Scalable Blockchains via Dynamic Distributed Storage Link Blockchain, EC
TKDE'21 Distributed Error Correction Coding Scheme for Low Storage Blockchain Systems Link Erasure coding, blockchain
ISIT'21 Low Latency Cross-Shard Transactions in Coded Blockchain Link Blockchain, coding, sharding
ITW'21 Communication-Efficient LDPC Code Design for Data Availability Oracle in Side Blockchains Link Blockchain, coding
ICDE'20 BFT-Store: Storage Partition for Permissioned Blockchain via Erasure Coding Summary BFT-Store
SIGMOD'20 Demo A Byzantine Fault Tolerant Storage for Permissioned Blockchain Link Erasure coding, permissioned blockchain
JPDC'20 Blockchain-based verification framework for data integrity in edge-cloud storage Link Blockchain, verification, coding
ICDCS'20 Towards Privacy-assured and Lightweight On-chain Auditing of Decentralized Storage Link Blockchain, verification, auditing
Blockchain'20 Secure Regenerating Codes for Reducing Storage and Bootstrap Costs in Sharded Blockchains Link Blockchain, EC, regenerating codes
IOTJ'20 Distributed Error Correction Coding Scheme for Low Storage Blockchain Systems Link Erasure coding, blockchain
AFT'19 SoK: Sharding on Blockchain Link Sharding
CCS'18 RapidChain: Scaling Blockchain via Full Sharding Link RapidChain, blockchain, sharding
iThings'18 Erasure code-based low storage blockchain node Link (highly cited reference) Erasure coding, blockchain
TrustCom'18 A Blockchain-based Decentralized Data Storage and Access Framework for PingER Link Bitcoin, coding
ICPADS'18 Blockchain Based Data Integrity Verification in P2P Cloud Storage Link Blockchain, verification, coding
PODC'07 Verifying Distributed Erasure-Coded Data Link EC, verification
DSN'04 Efficient Byzantine-tolerant erasure-coded storage Link BFT, erasure coding

Large Language Models

LLM Serving

Venue Title Link / Summary Brief
OSDI'24 ServerlessLLM: Low-Latency Serverless Inference for Large Language Models Link LLM, serverless
OSDI'24 Fairness in Serving Large Language Models Link LLM, Fairness

Database

Indexing

Venue Title Link / Summary Brief
SIGMOD'21 FoundationDB Summary FoundationDB, Apple
SIGMOD'18 The Case for Learned Index Structures Link Learned Index

Security

Venue Title Link / Summary Brief
SYSTOR'18 How to Best Share a Big Secret Link Secret sharing
Communications of the ACM'1979 How to Share a Secret Link Secret sharing

Edge

Venue Title Link / Summary Brief
HotEdge'20 Sharing and Caring of Data at the Edge Summary Edge storage survey (including a list of papers, must read)
JPDC'20 EdgeKV: Decentralized, scalable, and consistent storage for the edge Summary EdgeKV

SEC (Symposium on Edge Computing) Paper List

Venue Title Link / Summary Brief
SEC'17 EdgeCourier: An Edge-hosted Personal Service for Low-bandwidth Document Synchronization in Mobile Cloud Storage Services --- ---
SEC'17 CloudPath: A Multi-Tier Cloud Computing Framework --- ---
SEC'17 LAVEA: Latency-aware Video Analytics on Edge Computing Platform --- ---
SEC'17 Fast Transparent Virtual Machine Migration in Distributed Edge Clouds --- ---
SEC'17 A Vehicle-based Edge Computing Platform for Transit and Human Mobility Analytics --- ---
SEC'18 VideoEdge: Processing Camera Streams using Hierarchical Clusters --- ---
SEC'18 Extend Cloud to Edge with KubeEdge --- ---
SEC'19 Sandpaper: mitigating performance interference in CDN edge proxies --- ---
SEC'19 Real-time traffic estimation at vehicular edge nodes --- ---
SEC'19 Infrastructure fault detection and prediction in edge cloud environments --- ---
SEC'19 Why cloud applications are not ready for the edge (yet) --- ---

Deduplication

Venue Title Link / Summary Brief
ATC'15 Toward Reliable, Secure, and Cost-Efficient Cloud Storage via Convergent Dispersal Summary CDStore

Consensus

Venue Title Link / Summary Brief
ATC'14 In Search of an Understandable Consensus Algorithm Summary Raft
OSDI'1999 Practical Byzantine Fault Tolerance Summary PBFT

Stream Processing

Venue Title Link / Summary Brief
ICDCS'20 Toward Adaptive Disk Failure Prediction via Stream Mining Summary StreamDFP

Graph Processing

Venue Title Link / Summary Brief
OSDI'16 Gemini: A Computation-Centric Distributed Graph Processing System Summary Gemini
SIGMOD'19 Nanosecond Indexing of Graph Data With Hash Maps and VLists Summary Nanosecond

Scheduling

Venue Title Link / Summary Brief
SOSP'1973 Polynomial Complete Scheduling Problems Summary Scheduling proof
Communications of the ACM'1974 Scheduling independent tasks to reduce mean finishing time Summary Scheduling algorithms
JACM'1976 Exact and Approximate Algorithms for Scheduling Nonidentical Processors Summary Scheduling algorithms
JACM'1977 Heuristic Algorithms for Scheduling Independent Tasks on Nonidentical Processors Summary Performance analysis on scheduling heuristics
MP'1990 Approximation Algorithms for Scheduling Unrelated Parallel Machines Summary Scheduling algorithms and proofs

Graph Theory

Venue Title Link / Summary Brief
JALG'06 Semi-matchings for bipartite graphs and load balancing Summary Semi-matching on unweighted bipartite
IPL'06 An approximation algorithm for the load-balanced semi-matching problem in weighted bipartite graphs Summary Semi-matching for jobs with identical processing times
IPL'09 A note on "An approximation algorithm for the load-balanced semi-matching problem in weighted bipartite graphs" Summary Corrections of bounds for IPL'06
IPSJ'07 Optimal Balanced Semi-Matchings for Weighted Bipartite Graphs Summary Optimal Semi-matching proof

Networking

Software Defined Network (SDN)

Venue Title Link / Summary Brief
Book Software-Defined Networks: A Systems Approach Reading notes: Ch.1, Ch.2, Ch.3, Ch.4, Ch.5, Ch.6, Ch.7, Ch.8 SDN Book
White paper Cisco SD-WAN white paper Link Cisco SD-WAN
IEEE Communications Surveys & Tutorials'14 A Survey of Software-Defined Networking: Past, Present, and Future of Programmable Networks Link SDN Survey
ICCCN'21 Software-Defined Wide Area Network (SD-WAN): Architecture, Advances and Opportunities Link SD-WAN Survey
SIGCOMM'13 B4: Experience with a Globally-Deployed Software Defined WAN Link B4
NSDI'14 Network Virtualization in Multi-tenant Datacenters Link Network Virtualization
SIGCOMM'13 Achieving High Utilization with Software-Driven WAN Link (Not done) Software-Driven WAN
SIGCOMM'08 OpenFlow: Enabling Innovation in Campus Networks Link (Not done) OpenFlow

Network Measurement

Venue Title Link / Summary Brief
SIGCOMM'18 SketchLearn: Relieving User Burdens in Approximate Measurement with Automated Statistical Inference Summary SketchLearn

TOS (ACM Transactions on Storage) Paper List

Erasure Coding (TOS)

Venue Title Link / Summary Brief
TOS'09 GRID codes: Strip-based erasure codes with high fault tolerance for storage systems Link ---
TOS'12 Generalized X-code: An efficient RAID-6 code for arbitrary size of disk array Link ---
TOS'13 Exploiting Redundancies and Deferred Writes to Conserve Energy in Erasure-Coded Storage Clusters Link ---
TOS'13 Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems Link ---
TOS'14 STAIR Codes: A General Family of Erasure Codes for Tolerating Device and Sector Failures Link ---
TOS'14 Sector-Disk (SD) Erasure Codes for Mixed Failure Modes in RAID Systems Link ---
TOS'15 Low-Complexity Implementation of RAID Based on Reed-Solomon Codes Link ---
TOS'17 High-Performance General Functional Regenerating Codes with Near-Optimal Repair Bandwidth Link ---
TOS'17 Optimal Repair Layering for Erasure-Coded Data Centers: From Theory to Practice Link ---
TOS'17 Systematic Erasure Codes with Optimal Repair Bandwidth and Storage Link ---
TOS'20 On Fault Tolerance, Locality, and Optimality in Locally Repairable Codes Link ---
TOS'20 Fast Erasure Coding for Data Storage: A Comprehensive Study of the Acceleration Techniques Link ---
TOS'20 PBS: An Efficient Erasure-Coded Block Storage System Based on Speculative Partial Writes Link ---

RAID (TOS)

Venue Title Link / Summary Brief
TOS'05 Improving storage system availability with D-GRAID Link ---
TOS'05 Reliability and security of RAID storage systems and D2D archives using SATA disk drives Link ---
TOS'07 PARAID: A gear-shifting power-aware RAID Link ---
TOS'08 A new intra-disk redundancy scheme for high-reliability RAID storage systems in the presence of unrecoverable errors Link ---
TOS'09 Higher reliability redundant disk arrays: Organization, operation, and coding Link ---
TOS'10 Differential RAID: Rethinking RAID for SSD reliability Link ---
TOS'11 A Hybrid Approach to Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation Link ---
TOS'11 Minimum density RAID-6 codes Link ---
TOS'11 Online availability upgrades for parity-based RAIDs through supplementary parity augmentations Link ---
TOS'11 Reducing Repair Traffic in P2P Backup Systems: Exact Regenerating Codes on Hierarchical Codes Link ---
TOS'11 Disk Scrubbing Versus Intradisk Redundancy for RAID Storage Systems Link ---
TOS'14 Beyond MTTDL: A Closed-Form RAID 6 Reliability Equation Link ---
TOS'15 RAIDShield: Characterizing, Monitoring, and Proactively Protecting Against Disk Failures Link ---
TOS'15 An Energy-Efficient and Reliable Storage Mechanism for Data-Intensive Academic Archive Systems Link ---
TOS'15 Rebuttal to “Beyond MTTDL: A Closed-Form RAID-6 Reliability Equation” Link ---
TOS'16 LoneStar RAID: Massive Array of Offline Disks for Archival Systems Link ---
TOS'16 H-Scale: A Fast Approach to Scale Disk Arrays via Hybrid Stripe Deployment Link ---
TOS'19 Determining Data Distribution for Large Disk Enclosures with 3-D Data Templates Link RAID+

Data Placement (TOS)

Venue Title Link / Summary Brief
TOS'14 Random Slicing: Efficient and Scalable Data Placement for Large-Scale Storage Systems Link ---

Flash-memory (TOS)

Venue Title Link / Summary Brief
TOS'18 An Analysis of Flash Page Reuse With WOM Codes Link ---

Backup (TOS)

Venue Title Link / Summary Brief
TOS'12 Efficient cooperative backup with decentralized trust management Link ---

Storage System (TOS)

Venue Title Link / Summary Brief
TOS'05 DISP: Practical, efficient, secure and fault-tolerant distributed data storage Link ---
TOS'09 POTSHARDS—a secure, recoverable, long-term archival storage system Link ---
TOS'11 PRESIDIO: A Framework for Efficient Archival Data Storage Link ---
TOS'13 DepSky: Dependable and Secure Storage in a Cloud-of-Clouds Summary ---
TOS'17 Hybris: Robust Hybrid Cloud Storage Summary ---
TOS'17 Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to File-System Faults Link ---
TOS'19 Liquid Cloud Storage Link ---
TOS'20 The Case for Custom Storage Backends in Distributed Storage Systems Link ---

KV-Store (TOS)

Venue Title Link / Summary Brief
TOS'17 Efficient and Available In-Memory KV-Store with Hybrid Erasure Coding and Replication Link ---

Benchmark (TOS)

Venue Title Link / Summary Brief
TOS'07 Understanding disk failure rates: What does an MTTF of 1,000,000 hours mean to you? Link ---
TOS'08 A nine year study of file system and storage benchmarking Link ---

Techniques (TOS)

Venue Title Link / Summary Brief
TOS'12 Efficient software implementations of large finite fields GF(2^n) for secure storage applications Link ---
TOS'16 Tools for Predicting the Reliability of Large-Scale Storage Systems Link ---

File System (TOS)

Venue Title Link / Summary Brief
TOS'14 A Study of Linux File System Evolution Link ---
TOS'20 Everyone Loves File: Oracle File Storage Service Link ---