155 lines
5.7 KiB
Markdown
155 lines
5.7 KiB
Markdown
# cm256cc
|
|
Fast GF(256) Cauchy MDS Block Erasure Codec in C++
|
|
|
|
This is the rewrite in (as much as possible) clean C++ of [cm256](https://github.com/f4exb/cm256). In some contexts like Qt programs and plugins the original cm256 library does not work.
|
|
|
|
cm256cc performance is on par or even better than cm256. This is particularly true for armv7 architecture (Raspberry Pi 2 and 3) and is the most significant with Raspberry Pi 2.
|
|
|
|
cm256cc is a simple library for erasure codes. From given data it generates
|
|
redundant data that can be used to recover the originals.
|
|
|
|
Currently only g++ is supported, other versions of MSVC than Visual Studio 2013 may work. Optimizations for both SSE3 (x86_64) and Neon (armv7) are available.
|
|
|
|
The original data should be split up into equally-sized chunks. If one of these chunks
|
|
is erased, the redundant data can fill in the gap through decoding.
|
|
|
|
The erasure code is parameterized by three values (`OriginalCount`, `RecoveryCount`, `BlockBytes`). These are:
|
|
|
|
+ The number of blocks of original data (`OriginalCount`), which must be less than 256.
|
|
+ The number of blocks of redundant data (`RecoveryCount`), which must be no more than `256 - OriginalCount`.
|
|
|
|
For example, if a file is split into 3 equal pieces and sent over a network, `OriginalCount` is 3.
|
|
And if 2 additional redundant packets are generated, `RecoveryCount` is 2.
|
|
In this case up to 256 - 3 = 253 additional redundant packets can be generated.
|
|
|
|
|
|
##### Building: Quick Setup
|
|
|
|
This is a classical cmake project. Make sure cmake and g++ is installed in your system. create a `build` directory and cd into it. If you install the library in a custom location say `opt/install/cm256cc` use the following command line for cmake:
|
|
|
|
- `cmake -Wno-dev -DCMAKE_INSTALL_PREFIX=/opt/install/cm256cc ..`
|
|
|
|
Result:
|
|
|
|
- Library will be installed as `/opt/install/cm256cc/lib/libcm256cc.so`
|
|
- Include files will be installed in `/opt/install/cm256cc/include/cm256cc`
|
|
- Binary test programs will be installed in `/opt/install/cm256cc/bin`
|
|
|
|
##### Building: Use the library
|
|
|
|
Include the cm256cc library in your project and cm256.h header in your program. Have a look at example programs `cm256_test.cpp`, `transmit.cpp`and `receive.cpp` in the `unit_test` folder for usage. Consult the `cm256.h header` for details on the encoding / decoding method.
|
|
|
|
|
|
## Usage
|
|
|
|
Documentation is provided in the header file [cm256.h](https://github.com/catid/cm256/raw/master/cm256.h).
|
|
|
|
When your application starts up it should call `isInitialized()` to verify that the library is constructed properly:
|
|
|
|
~~~
|
|
#include "cm256.h"
|
|
|
|
CM256 cm256;
|
|
|
|
if (!cm256.isInitialized()) {
|
|
// library not initialized
|
|
exit(1);
|
|
}
|
|
~~~
|
|
|
|
To generate redundancy, use the `cm256_encode` function. To solve for the original data use the `cm256_decode` function.
|
|
|
|
Example usage:
|
|
|
|
~~~
|
|
bool ExampleFileUsage()
|
|
{
|
|
CM256 cm256;
|
|
|
|
if (!cm256.isInitialized()) {
|
|
// library not initialized
|
|
exit(1);
|
|
}
|
|
|
|
CM256::cm256_encoder_params params;
|
|
|
|
// Number of bytes per file block
|
|
params.BlockBytes = 4321;
|
|
|
|
// Number of blocks
|
|
params.OriginalCount = 33;
|
|
|
|
// Number of additional recovery blocks generated by encoder
|
|
params.RecoveryCount = 12;
|
|
|
|
// Size of the original file
|
|
static const int OriginalFileBytes = params.OriginalCount * params.BlockBytes;
|
|
|
|
// Allocate and fill the original file data
|
|
uint8_t* originalFileData = new uint8_t[OriginalFileBytes];
|
|
memset(originalFileData, 1, OriginalFileBytes);
|
|
|
|
// Pointers to data
|
|
CM256::cm256_block blocks[256];
|
|
for (int i = 0; i < params.OriginalCount; ++i)
|
|
{
|
|
blocks[i].Block = originalFileData + i * params.BlockBytes;
|
|
}
|
|
|
|
// Recovery data
|
|
uint8_t* recoveryBlocks = new uint8_t[params.RecoveryCount * params.BlockBytes];
|
|
|
|
// Generate recovery data
|
|
if (cm256.cm256_encode(params, blocks, recoveryBlocks))
|
|
{
|
|
exit(1);
|
|
}
|
|
|
|
// Initialize the indices
|
|
for (int i = 0; i < params.OriginalCount; ++i)
|
|
{
|
|
blocks[i].Index = CM256::cm256_get_original_block_index(params, i);
|
|
}
|
|
|
|
//// Simulate loss of data, subsituting a recovery block in its place ////
|
|
blocks[0].Block = recoveryBlocks; // First recovery block
|
|
blocks[0].Index = cm256_get_recovery_block_index(params, 0); // First recovery block index
|
|
//// Simulate loss of data, subsituting a recovery block in its place ////
|
|
|
|
if (cm256.cm256_decode(params, blocks))
|
|
{
|
|
exit(1);
|
|
}
|
|
|
|
// blocks[0].Index will now be 0.
|
|
|
|
delete[] originalFileData;
|
|
delete[] recoveryBlocks;
|
|
|
|
return true;
|
|
}
|
|
~~~
|
|
|
|
The example above is just one way to use the `cm256_decode` function.
|
|
|
|
This API was designed to be flexible enough for UDP/IP-based file transfer where
|
|
the blocks arrive out of order.
|
|
|
|
|
|
#### Comparisons with Other Libraries
|
|
|
|
The approach taken in CM256 is similar to the Intel Storage Acceleration Library (ISA-L) available here:
|
|
|
|
https://01.org/intel%C2%AE-storage-acceleration-library-open-source-version/downloads
|
|
|
|
ISA-L more aggressively optimizes the matrix multiplication operation, which is the most expensive step of encoding.
|
|
|
|
CM256 takes better advantage of the m=1 case and the first recovery symbol, which is also possible with the Vandermonde matrices supported by ISA-L.
|
|
|
|
ISA-L uses a O(N^3) Gaussian elimination solver for decoding. The CM256 decoder solves the linear system using a fast O(N^2) LDU-decomposition algorithm from "Pivoting and Backward Stability of Fast Algorithms for Solving Cauchy Linear Equations" (T. Boros, T. Kailath, V. Olshevsky), which was hand-optimized for memory accesses.
|
|
|
|
|
|
#### Credits
|
|
|
|
This software was written entirely by Christopher A. Taylor <mrcatid@gmail.com> and converted to clean C++ code by myself Edouard M. Griffiths <f4exb06@gmail.com>.
|