Stanislaw Ogorkis - education MSc

WARSAW UNIVERSITY OF TECHNOLOGY

Faculty of Electronics and Information Technology

2011-2012

Degree: Master of Science in Engineering
Field: Computer Science
Major: Computer Information System Engineering
Final grade: excellent
Date: 8 October 2012
Thesis: Video compression in Linux (details below)

Thesis: Video compression in Linux

Original title (Polish): Kompresja sekwencji wideo w systemie Linux

Abstract:

The main goal of this thesis is to work out and document a method of creating a video codec in the Linux operating system. The thesis describes the existing video libraries and tools available on Linux. The implementation uses the FFmpeg library and is integrated with various applications and libraries. In addition, the theoretical introduction covers video compression basics, motion compensation, signal transforms, quantization and lossless encoding methods.

The thesis contains detailed instructions for adding a new video codec to the FFmpeg library and integrating it with other applications. The developed codec is evaluated for compression efficiency and speed.

Codec overview

The developed lossy codec is a simplified variation of the DPCM/DCT video coding scheme. It does not include motion estimation or compensation; instead, temporal redundancy is reduced by using residual frames. Frames are divided into blocks and transformed with a block DCT. The block coefficients are then quantized and encoded with an arithmetic encoder. The codec, named msc, is implemented in the FFmpeg library. Image 1 shows all the steps performed by the encoder.

Image 1. MSC codec encoding scheme

Data formatting

The first step of the encoding process is data formatting. The video sequence is processed frame by frame in the I420p pixel format. This step includes temporal redundancy reduction, which is done by extracting residual frames. Every 25th frame is coded as a key frame and the rest are coded as residual frames.
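The key/residual split can be illustrated with a minimal sketch. It assumes that a residual frame is the sample-wise difference from the previous original frame and that a frame is a flat list of sample values; the function names are hypothetical and the actual msc implementation may differ (for example, a lossy encoder would typically subtract the previously reconstructed frame).

```python
KEY_FRAME_INTERVAL = 25  # from the text: every 25th frame is a key frame


def to_residuals(frames):
    """Return a list of (frame_type, samples) tuples.

    Assumption: a residual frame stores the difference between the
    current frame and the previous original frame.
    """
    encoded = []
    for index, frame in enumerate(frames):
        if index % KEY_FRAME_INTERVAL == 0:
            # Key frames are stored as-is.
            encoded.append(("key", list(frame)))
        else:
            previous = frames[index - 1]
            diff = [cur - prev for cur, prev in zip(frame, previous)]
            encoded.append(("residual", diff))
    return encoded


def from_residuals(encoded):
    """Rebuild the frames by adding each residual to the previous frame."""
    frames = []
    for frame_type, samples in encoded:
        if frame_type == "key":
            frames.append(list(samples))
        else:
            frames.append([prev + diff for prev, diff in zip(frames[-1], samples)])
    return frames
```

In a static scene most residual samples are zero or close to zero, which is what makes the later transform and entropy coding steps effective.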

DCT transform

Key and residual frames are then divided into 8x8 blocks, and the Discrete Cosine Transform (DCT) is applied to each block. Because of the chroma subsampling applied in the data formatting step, each chroma component has four times fewer coefficients than the luma component. Blocks are processed in macroblocks, each containing 4 luma blocks and 2 chroma blocks.
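For reference, the 8x8 block transform can be written directly from the DCT-II definition. This is a plain, unoptimized sketch of the orthonormal 2-D DCT, not the routine used in msc; production codecs use fast factorized versions.

```python
import math

N = 8  # block size used in the text


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        # Normalization factors that make the transform orthonormal.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out
```

For a flat block all of the energy ends up in the top-left (DC) coefficient, which is the property the later quantization and scanning steps rely on.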

Quantization

Blocks of DCT coefficients are then quantized with a quantization matrix. Several quantization matrices were tested during codec implementation. The best results were obtained with the quantization matrix from the JPEG standard.
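A sketch of this step is shown below. It assumes the example luminance table from Annex K of the JPEG standard, applied without any quality scaling; the text does not say which JPEG table or scaling msc uses, so treat the table choice and the function names as illustrative.

```python
# Example luminance quantization table from the JPEG standard (Annex K).
# Assumption: used unscaled; the thesis may scale it or use another table.
JPEG_LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=JPEG_LUMA_Q):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[int(round(coeffs[r][c] / table[r][c])) for c in range(8)]
            for r in range(8)]


def dequantize(levels, table=JPEG_LUMA_Q):
    """Approximate the original coefficients from the quantized levels."""
    return [[levels[r][c] * table[r][c] for c in range(8)]
            for r in range(8)]
```

The larger divisors in the bottom-right of the table push most high-frequency coefficients to zero, which is where the loss (and most of the compression) comes from.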

RLE encoding with zig-zag scanning

The quantized coefficients in each block are coded with run-length encoding (RLE) combined with zig-zag scanning. Because the energy is concentrated in the top-left coefficients of a block, this scanning order groups the zeros at the end of the sequence, so the trailing zeros of each block can be omitted.
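The scan and the run-length step can be sketched as follows. The (run, value) pair format and the end-of-block marker are assumptions made for illustration; the exact symbol layout used by msc is not described here.

```python
EOB = (0, 0)  # hypothetical end-of-block marker


def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals are walked bottom-to-top
        order.extend((r, s - r) for r in rows)
    return order


def rle_encode(block):
    """Encode a block as (zero_run, value) pairs, dropping trailing zeros."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    while scanned and scanned[-1] == 0:
        scanned.pop()  # trailing zeros are implied by the EOB marker
    pairs = []
    run = 0
    for value in scanned:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append(EOB)
    return pairs
```

A block with only a few non-zero low-frequency coefficients is reduced to a handful of pairs instead of 64 values.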

Arithmetic coding

In the last step, the coefficient stream is encoded with an arithmetic encoder adapted from the article listed under Resources. The arithmetic encoder uses several statistical models, selected according to the coefficient values in each block. For instance, if the maximum value in a block is 20, it is better to use a model with values from 0 to 31 than a model with values from 0 to 63. Using several statistical models leads to a significant gain in compression efficiency.
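The model selection idea can be sketched without the arithmetic coder itself. Only the 0 to 31 and 0 to 63 ranges come from the text; the other limits, the use of magnitudes, and the function names below are assumptions for illustration.

```python
import math

# Hypothetical set of model ranges: model i covers values 0..MODEL_LIMITS[i].
MODEL_LIMITS = (1, 3, 7, 15, 31, 63, 127, 255)


def select_model(block_values):
    """Pick the smallest model whose range covers the block's peak magnitude."""
    peak = max(abs(v) for v in block_values)
    for index, limit in enumerate(MODEL_LIMITS):
        if peak <= limit:
            return index, limit
    raise ValueError("value out of range for every model")


def uniform_bits_per_symbol(limit):
    """Cost of one symbol if all values 0..limit were equally likely."""
    return math.log2(limit + 1)
```

With a peak value of 20 the 0 to 31 model is chosen. Even under a flat distribution this costs 5 bits per symbol instead of 6 for the 0 to 63 model, and an adaptive model spreads its probability over fewer unused symbols.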

Implementation and integration

The designed codec was implemented in the FFmpeg library. One of the main purposes of the thesis was to prepare instructions for adding a video codec on the Linux operating system and integrating it with various libraries and applications. The resulting tutorial covers adding a video codec to the FFmpeg library and integrating it with MPlayer, ffmpegthumbnailer and the GStreamer library.

Results

The developed codec was evaluated for compression efficiency and speed. Compression efficiency was measured as the relation between bitrate and PSNR. Tests were conducted on a set of standard test sequences from Derf's collection. The msc codec was compared with the MJPEG and MPEG4 codecs. Results for some of the test sequences are shown below.
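For readers unfamiliar with the two metrics, a minimal sketch of how they are commonly computed is given below. It assumes 8-bit samples (peak value 255) and frames given as flat lists; the helper names are hypothetical and this is not the measurement code used in the thesis.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    if len(original) != len(decoded) or not original:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def bitrate_kbps(total_bytes, frame_count, fps):
    """Average bitrate of an encoded sequence in kilobits per second."""
    return total_bytes * 8 * fps / frame_count / 1000.0
```

Plotting PSNR against bitrate for several quality settings gives a rate-distortion curve for each codec; a curve that lies higher at the same bitrate indicates better compression efficiency.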

Image 2. Results for Akiyo test sequence

Image 3. Results for Bowing test sequence

Image 4. Results for Foreman test sequence

Summary

The developed msc codec achieved better compression efficiency than the simple MJPEG codec. Reducing temporal redundancy and using an arithmetic encoder with multiple statistical models results in a significant gain for static scenes such as Akiyo. In sequences with lower temporal redundancy (e.g. Foreman) the gain is smaller, but the msc codec still performs better.

Moreover, the FFmpeg video codec tutorial can be used by anyone interested in developing their own codecs. The integration instructions for MPlayer, ffmpegthumbnailer and GStreamer show how new codecs can be used in various Linux applications.

Resources

http://ogorkis.net/ffmpeg - FFmpeg video codec tutorial
https://github.com/sogorkis/FFmpeg - GitHub fork of FFmpeg with the source code of the developed msc codec
http://media.xiph.org/video/derf/ - test sequences used in codec verification
http://marknelson.us/1991/02/01/arithmetic-coding-statistical-modeling-data-compression - article describing arithmetic encoding