From 019806b601ae9601510a9051c56551ae922e3d1f Mon Sep 17 00:00:00 2001
From: Mathis Rosenhauer <rosenhauer@dkrz.de>
Date: Thu, 21 Jan 2021 09:48:25 +0100
Subject: [PATCH] doc: update to new CCSDS standard 121.0-B-3

---
 CHANGELOG.md | 14 ++++++++++++
 README.md    | 61 ++++++++++++++++++++++++++--------------------------
 src/encode.c |  4 +++-
 src/libaec.h | 11 ++++++----
 4 files changed, 55 insertions(+), 35 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e77c61c..e828ab3 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,20 @@
 # libaec Changelog
 All notable changes to libaec will be documented in this file.
 
+## [1.0.5]
+
+### Changed
+- Updated documentation to new 121.0-B-3 standard. The new standard
+  mainly clarifies and explicitly defines special cases which could be
+  ambiguous or misleading in previous revisions.
+
+  These changes did *not* require any substantial changes to libaec.
+  Existing compressed data is still compatible with this version of
+  the library and compressed data produced by this version can be
+  uncompressed with previous versions.
+
+- Improvements to the build process and further documentation fixes.
+
 ## [1.0.4] - 2019-02-11
 
 ### Added
diff --git a/README.md b/README.md
index 12649b2..4ea9f70 100644
--- a/README.md
+++ b/README.md
@@ -8,19 +8,24 @@ simulations. While floating point representations are not directly
 supported, they can also be efficiently coded by grouping exponents
 and mantissa.
 
-Libaec implements
+## Scope
+
+Libaec implements extended
 [Golomb-Rice](http://en.wikipedia.org/wiki/Golomb_coding) coding as
-defined in the Space Data System Standard documents [121.0-B-2][1] and
-[120.0-G-2][2].
+defined in the CCSDS recommended standard [121.0-B-3][1]. The library
+covers the adaptive entropy coder and the preprocessor discussed in
+sections 1 to 5.2.6 of the [standard][1].
 
 ## Downloads
 
 Source code and binary installer can be [downloaded here](https://gitlab.dkrz.de/k202009/libaec/tags) [or here](https://github.com/MathisRosenhauer/libaec).
 
-## Patent
+## Patent considerations
+
+As stated in section A3 of the current [standard][1]:
 
-In [patent.txt](doc/patent.txt) a statement on potentially
-applying intellectual property rights is given.
+> At time of publication, the specifications of this Recommended
+> Standard are not known to be the subject of patent rights.
 
 ## Installation
 
@@ -36,8 +41,8 @@ In this context efficiency refers to the size of the encoded
 data. Performance refers to the time it takes to encode data.
 
 Suppose you have an array of 32 bit signed integers you want to
-compress. The pointer pointing to the data shall be called `*source`,
-output goes into `*dest`.
+compress. The pointer pointing to the data shall be called `source`,
+output goes into `dest`.
 
 ```c
 #include <libaec.h>
@@ -91,11 +96,12 @@ compression to adapt more rapidly to changing source statistics.
 Larger blocks create less overhead but can be less efficient if source
 statistics change across the block.
 
-`rsi` sets the reference sample interval. A large RSI will improve
-performance and efficiency. It will also increase memory requirements
-since internal buffering is based on RSI size. A smaller RSI may be
-desirable in situations where each RSI will be packetized and possible
-error propagation has to be minimized.
+`rsi` sets the reference sample interval in blocks. A large RSI will
+improve performance and efficiency. It will also increase memory
+requirements since internal buffering is based on RSI size. A smaller
+RSI may be desirable in situations where errors could occur in the
+transmission of encoded data and the resulting propagation of errors
+in decoded data has to be minimized.
 
 ### Flags:
 
@@ -108,9 +114,7 @@ error propagation has to be minimized.
   uncorrelated.
 
 * `AEC_DATA_MSB`: input data is stored most significant byte first
-  i.e. big endian. You have to specify `AEC_DATA_MSB` even if your host
-  architecture is big endian. Default is little endian on all
-  architectures.
+  i.e. big endian. Default is little endian on all architectures.
 
 * `AEC_DATA_3BYTE`: the 17 to 24 bit input data is stored in three
   bytes. This flag has no effect for other sample sizes.
@@ -118,11 +122,6 @@ error propagation has to be minimized.
 * `AEC_RESTRICTED`: use a restricted set of code options. This option is
   only valid for `bits_per_sample` <= 4.
 
-* `AEC_PAD_RSI`: assume that the encoded RSI is padded to the next byte
-  boundary while decoding. The preprocessor macro `ENABLE_RSI_PADDING`
-  needs to be defined while compiling for the encoder to honour this
-  flag.
-
 ### Data size:
 
 The following rules apply for deducing storage size from sample size
@@ -138,7 +137,7 @@ The following rules apply for deducing storage size from sample size
 
 If a sample requires less bits than the storage size provides, then
 you have to make sure that unused bits are not set. Libaec does not
-check this for performance reasons and will produce undefined output
+enforce this for performance reasons and will produce undefined output
 if unused bits are set. All input data must be a multiple of the
 storage size in bytes. Remaining bytes which do not form a complete
 sample will be ignored.
@@ -211,15 +210,17 @@ The actual values of coding parameters are in fact only relevant for
 efficiency and performance. Data integrity only depends on consistency
 of the parameters.
 
+The exact length of the original data is not preserved and must also be
+transmitted out of band. The decoder can produce additional output
+depending on whether the original data ended on a block boundary or on
+zero blocks. The output data must therefore be truncated to the
+correct length. This can also be achieved by providing an output
+buffer of just the correct length.
 
 ## References
 
-[Consultative Committee for Space Data Systems. Lossless Data
-Compression. Recommendation for Space Data System Standards, CCSDS
-121.0-B-2. Blue Book. Issue 2. Washington, D.C.: CCSDS, May 2012.][1]
-[1]: https://public.ccsds.org/Pubs/121x0b2ec1.pdf
+[Lossless Data Compression. Recommendation for Space Data System
+Standards, CCSDS 121.0-B-3. Blue Book. Issue 3. Washington, D.C.:
+CCSDS, August 2020.][1]
 
-[Consultative Committee for Space Data Systems. Lossless Data
-Compression. Recommendation for Space Data System Standards, CCSDS
-120.0-G-3. Green Book. Issue 3. Washington, D.C.: CCSDS, April 2013.][2]
-[2]: https://public.ccsds.org/Pubs/120x0g3.pdf
+[1]: https://public.ccsds.org/Pubs/121x0b3.pdf
diff --git a/src/encode.c b/src/encode.c
index 1cf98d2..47facb3 100644
--- a/src/encode.c
+++ b/src/encode.c
@@ -653,7 +653,7 @@ static int m_get_rsi_resumable(struct aec_stream *strm)
     /**
        Get RSI while input buffer is short.
 
-       Let user provide more input. Once we got all input pad buffer
+       Let user provide more input. Once we got all input, pad buffer
        to full RSI.
     */
 
@@ -668,6 +668,8 @@ static int m_get_rsi_resumable(struct aec_stream *strm)
                     state->blocks_avail = state->i / strm->block_size - 1;
                     if (state->i % strm->block_size)
                         state->blocks_avail++;
+                    /* Pad raw buffer with last sample. Only
+                     * blocks_avail blocks will be encoded later. */
                     do
                         state->data_raw[state->i] =
                             state->data_raw[state->i - 1];
diff --git a/src/libaec.h b/src/libaec.h
index 44505e4..afb166a 100644
--- a/src/libaec.h
+++ b/src/libaec.h
@@ -83,7 +83,7 @@ struct aec_stream {
     /* block size in samples */
     unsigned int block_size;
 
-    /* Reference sample interval, the number of Coded Data Sets
+    /* Reference sample interval, the number of blocks
      * between consecutive reference samples (up to 4096). */
     unsigned int rsi;
 
@@ -113,7 +113,9 @@ struct aec_stream {
 /* Use restricted set of code options */
 #define AEC_RESTRICTED 16
 
-/* Pad RSI to byte boundary. Only for decoding CCSDS sample data. */
+/* Pad RSI to byte boundary. Only used for decoding some CCSDS sample
+ * data. Do not use this to produce new data as it violates the
+ * standard. */
 #define AEC_PAD_RSI 32
 
 /* Do not enforce standard regarding legal block sizes. */
@@ -140,8 +142,9 @@ struct aec_stream {
  * set AEC_FLUSH to drain all output.
  *
  * It is not possible to continue encoding of the same stream after it
- * has been flushed because the last byte may be padded with fill
- * bits. */
+ * has been flushed. For one, the last block may be padded with zeros
+ * after preprocessing. Secondly, the last encoded byte may be padded
+ * with fill bits. */
 #define AEC_FLUSH 1
 
 /*********************************************/
-- 
GitLab