NCCloud: A Network-Coding-Based Storage System in a Cloud-of-Clouds
This paper presents a proxy-based storage system called NCCloud, which is built on top of a network-coding-based storage scheme called FMSR. Compared with traditional RAID-6 (Reed Solomon Codes), NCCloud can save repair traffic by close to 50% when $n$ is large. While other regenerating code like EMSR code keeps the same storage size as in RAID-6 codes but saves repair traffic, it still requires storage nodes to perform encoding during repair. FMSR codes achieves the same repair traffic but relax the encoding requirement of storage nodes, so it ‘s portable to any cloud storage service which only need to support the standard read/write functionalities. However, FMSR codes are non-systematic and are acceptible for long-term archival applications, where the read frequency is typically low.
Fig 1 illustrates the repair operation single node failure for different codes. In this example, a file in size $M$ are stored in $n = 4$ nodes and allows $k = 2$ node failure at most. Corresponding statistics are listed in the following table.
Codes | Chunk Size | Total Storage Size | Repair Traffic | Other Requirements |
---|---|---|---|---|
RS | $M / 2$ | 2M | M | |
EMSR | $M / 4$ | 2M | 0.75M | Encoding in Storage Nodes |
FMSR | $M / 4$ | 2M | 0.75M |
Regenerate code chunks that are not necessarily identical to those originally stored in the failed node
Two phase checking scheme
devide file $F$ into $k(n-k)$ equal-size native chunks = $F_i$
encode native chunks to $n(n-k)$ equal-size code chunks = $P_i$
let EM = $[a_{i,j}]$ be $n(n-k) * k(n-k)$ encoding matrix. Row vector in size $k(n-k)$ of EM is called encoding coefficient vector (ECV)
$P_i = \sum_{j=1}^{k(n-k)} a_{i,j}F_j$, product of ECV_i and all the native chunks $F_1$ … $F_{k(n-k)}$.
Whole EM is stored in a metadata object then replicated to all storage nodes.
Many ways to construct EM, as long as it passes two-phase checking.
select any $k$ of $n$ storage nodes, download $k(n-k)$ code chunks.
ECVs form a $k(n-k) * k(n-k)$ matrix $EM_{repair}$.
If the MDS property is maintained, $EM_{repair}^{-1}$ exists.
native chunks can be recovered by using all downloaded code chunks by $F = EM_{repair}^{-1}P$.
Ensure the MDS property still holds even after iterative repairs.
Two phase-checking scheme: suppose $(r-1)^{th}$ repair successful, then in $(r)^{th}$ repair, we first check if the new set of chunks in all storage nodes satisfies the MDS property in $(r)^{th}$ repair, then check if the MDS property still holds in $(r+1)^{th}$ repair.
repair steps
Step 1: Download EM from one node.
Step 2: Select one ECV from each of the $n - 1$ surviving nodes, which means randomly select one chunk from all (n-k) chunks in each storage nodes.
Step 3: Generate repair matrix randomly.
Step 4: Compute the ECVs for the new code chunks and reproduce a new encoding matrix $EM^{'}$.
Step 5: check if both MDS and rMDS are satisfied for $EM^{'}$. If either one fails, return to Step 2. Note that overhead doesnot depend on the chunk size, since only deal with ECVs.
Step 6: Download actual chunk data and regenerate new chunk data. We proceed to download the $n - 1$ chunks that correspond to the selected ECVs in Step 2 from the n - 1 surviving nodes. Wrong selection of ECVs may violate the MDS property thus fail in recovery.
Some analysis of two-phase checking and system reliability, including an counter example, simulations.
Generalization of FMSR codes for concurrent node failures
Different metrics other than MTTDL
Degraded reads
Double fault-tolerance
Save repair traffic (up to 50% for large n)
Thin-cloud interface
Large overhead of encode/decode operations. Two-phase checking causes generating new code chunks.
Non-systematic, not a good choice for frequent read(modify) operation. Partial read requires complete file download.