Data Compression Cheatsheet: Algorithms & Techniques

Quick Reference Table

Algorithm	Type	Primary Use Case	Typical Ratio	Speed (Compress/Decompress)	Key Characteristic	Common Extensions
RLE	Lossless	Simple graphics, faxes	Low-Med	Fast/Fast	Simple, good for repeated data	(BMP, TIFF)
Huffman	Lossless	Text; entropy-coding stage in other formats	Med	Med/Med	Optimal prefix code per symbol	(JPEG, Deflate)
LZ77/LZ78	Lossless	General purpose, text	Med-High	Med/Med	Dictionary-based, adaptive	(ZIP, GZIP)
LZW	Lossless	GIF images, `compress`	Med	Med/Med	Dictionary-based	.gif
bzip2	Lossless	General file compression	High	Slow/Slow	BWT, good ratio but slow	.bz2
LZMA/LZMA2	Lossless	Archives (7z, xz)	Very High	Slow/Med	Excellent ratio, high memory	.7z, .xz
Deflate	Lossless	General (ZIP, GZIP, PNG)	Med-High	Med/Fast	LZ77 + Huffman, good balance	.zip, .gz, .png
Arithmetic	Lossless	Entropy-coding stage in other formats	Med-High	Slow/Med	Near-optimal, complex	(JPEG2000)
Brotli	Lossless	Web content (text, fonts)	High-V.High	Slow/Fast	Excellent for text	.br
Zstandard (Zstd)	Lossless	General, databases, real-time	High-V.High	V.Fast/V.Fast	Fast, flexible, modern	.zst
FLAC	Lossless	Audio archival	Med (audio)	Med/Fast	Lossless audio	.flac
ALAC	Lossless	Audio (Apple ecosystem)	Med (audio)	Med/Fast	Apple lossless audio	.m4a
PPM	Lossless	Text, high ratio	Very High	Slow/Slow	Context modeling, high ratio	-
JPEG	Lossy	Photographic images	High	Med/Fast	Widely supported for photos	.jpg, .jpeg
HEIC	Lossy (primarily)	Images (Apple default)	High	Med/Fast	Efficient, supports advanced features	.heic, .heif
WebP	Lossy/Lossless	Web images	High (lossy)	Med/Fast	Versatile, animation, transparency	.webp
AVIF	Lossy/Lossless	Web images (next-gen)	Very High	Slow/Med	Excellent ratio/quality, HDR	.avif
JPEG 2000	Lossy/Lossless	Medical/archival images	V.High	Slow/Med	Better quality than JPEG, scalable	.jp2, .j2k
ProRes	Lossy (visually)	Professional video editing	Low-Med	Fast/Fast (editing)	High quality, edit-friendly	.mov
H.264/AVC	Lossy	Video (Blu-ray, streaming)	V.High	Med/Fast	Excellent quality/ratio	.mp4, .mkv
H.265/HEVC	Lossy	Video (4K/UHD, streaming)	V.High	Slow/Fast	~2x efficiency of H.264	.mp4, .mkv
VP9	Lossy	Video (web streaming, YouTube)	V.High	Med/Fast	Royalty-free, H.265 competitor	.webm, .mp4
AV1	Lossy	Video (web streaming)	V.High	V.Slow/Med	Royalty-free, excellent compression	.mkv, .mp4
MP3	Lossy	Audio (music)	High	Fast/V.Fast	Ubiquitous for music	.mp3
AAC	Lossy	Audio (streaming, Apple)	V.High	Fast/Fast	Better than MP3 at same bitrate	.aac, .m4a
Opus	Lossy	Audio (VoIP, streaming, web)	V.High	Fast/Fast	Royalty-free, versatile, low latency	.opus
Vorbis	Lossy	Audio (open-source applications)	High	Fast/Fast	Royalty-free, good quality	.ogg, .oga

Note: Ratio & Speed are relative and can vary based on data, settings, and implementation.

I. Foundational Theory

Core Concepts

Fundamental ideas like "What is compression?", its necessity, entropy, and types of redundancy.

What is Data Compression?

The process of reducing the size of data (number of bits) to store or transmit it more efficiently.

Why Compress?

Storage Savings: Store more data in the same space.
Faster Data Transmission: Reduce time and bandwidth needed to transfer data.
Reduced Costs: Lower expenses for storage and bandwidth.

Information Theory Basics:

Entropy: A measure of the inherent randomness or uncertainty in data. For a given source model, it is the theoretical lower bound on the average number of bits per symbol that lossless compression can achieve.
Redundancy: Information that is repeated or predictable. Types include:
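- Spatial Redundancy: Neighboring samples or pixels tend to have similar values.
- Temporal Redundancy: Consecutive frames or samples change little over time.
- Statistical/Symbol Redundancy: Some symbols or patterns occur more often than others.
- Perceptual Redundancy: Detail that human senses cannot perceive (exploited only by lossy methods).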
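
As a rough illustration of entropy, the sketch below computes the order-0 (per-byte) Shannon entropy of a byte string. The function name `shannon_entropy` is illustrative. An order-0 model ignores context, so real compressors that exploit repeated patterns can beat this figure on data such as `abababab`.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Order-0 Shannon entropy in bits per symbol (each byte is a symbol)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # H = sum over symbols of p * log2(1/p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())

print(shannon_entropy(b"aaaaaaaa"))        # a single repeated symbol carries no information
print(shannon_entropy(b"abababab"))        # two equally likely symbols
print(shannon_entropy(bytes(range(256))))  # all 256 byte values equally likely
```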

Limits of Compression

Understanding incompressibility and the theoretical boundaries set by Rate-Distortion Theory for lossy compression.

Incompressibility

Truly random data (or data that appears random, like encrypted or already well-compressed data) cannot be significantly compressed further by lossless methods. Applying a lossless compressor might even slightly increase size due to overhead.
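
A quick way to see this effect, assuming Python's standard `zlib` module as the lossless compressor and `os.urandom` as a stand-in for encrypted or already-compressed data:

```python
import os
import zlib

random_data = os.urandom(100_000)     # stand-in for encrypted or already-compressed data
repetitive_data = b"ABCD" * 25_000    # same length, highly redundant

for name, data in [("random", random_data), ("repetitive", repetitive_data)]:
    packed = zlib.compress(data, 9)
    print(f"{name}: {len(data)} -> {len(packed)} bytes")
```

The random input typically comes out a few bytes larger than it went in (container overhead), while the repetitive input shrinks to a tiny fraction of its size.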

Rate-Distortion Theory (Lossy)

A mathematical framework defining the trade-off between compression rate (bits used) and distortion (loss of fidelity). It sets the minimum achievable rate for a given distortion level, guiding lossy codec design.
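
The trade-off can be made concrete with a toy experiment. This sketch quantizes a sine wave with a uniform quantizer at several step sizes and reports the empirical entropy of the quantizer indices (a proxy for rate) alongside the mean squared error (distortion). It does not compute the theoretical rate-distortion bound; the function name and the test signal are illustrative choices.

```python
import math
from collections import Counter

def rate_and_distortion(samples, step):
    """Return (bits per sample, mean squared error) for a uniform quantizer."""
    indices = [round(s / step) for s in samples]
    reconstructed = [i * step for i in indices]
    mse = sum((s - r) ** 2 for s, r in zip(samples, reconstructed)) / len(samples)
    counts = Counter(indices)
    n = len(indices)
    # Empirical entropy of the quantizer output approximates the achievable rate
    rate = sum((c / n) * math.log2(n / c) for c in counts.values())
    return rate, mse

signal = [math.sin(2 * math.pi * t / 100) for t in range(1000)]
for step in (0.01, 0.1, 0.5):
    rate, mse = rate_and_distortion(signal, step)
    print(f"step={step}: ~{rate:.2f} bits/sample, MSE={mse:.6f}")
```

Coarser steps lower the rate and raise the distortion, which is the curve a lossy codec's quality setting moves along.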

Classifications

Key distinctions: Lossless vs. Lossy compression, and the concept of Near-Lossless compression.

Lossless vs. Lossy

Feature	Lossless	Lossy
Reconstruction	Perfect	Imperfect (Approximation)
Info Loss	None	Yes (irreversible)
Typical Ratio	Moderate (2:1-4:1)	High (10:1-100:1+)
Use Cases	Text, code, archives	Multimedia (images, audio, video)

Near-Lossless Compression

A specialized category where decompressed data isn't identical, but differences are strictly bounded and often imperceptible (e.g., some scientific data, medical imaging).

Evaluation Metrics

How compression algorithms are measured: Ratio, Speed, Cost, Fidelity, Asymmetry, Robustness.

Compression Ratio/Savings: Ratio = Original Size / Compressed Size (e.g., 4:1); Space Savings = (1 - Compressed Size / Original Size) * 100%.
Compression/Decompression Speed: Rate of data processing (e.g., MB/s).
Computational Cost: CPU, memory usage.
Fidelity/Quality (Lossy): Objective (PSNR, SSIM) or subjective perception.
Asymmetry: Difference in computational cost between compression and decompression.
Robustness to Errors: How well a compressed stream recovers from bit errors.
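
The ratio, savings, and PSNR metrics above can be written out as follows. This is a minimal sketch; the function names are illustrative, and `psnr` assumes 8-bit samples (peak value 255) passed as flat sequences of equal length.

```python
import math

def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Original size divided by compressed size (4.0 means 4:1)."""
    return original_size / compressed_size

def space_savings(original_size: int, compressed_size: int) -> float:
    """Percentage of the original size that was removed."""
    return (1 - compressed_size / original_size) * 100

def psnr(original, decoded, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

print(compression_ratio(1000, 250))  # 4.0, i.e. 4:1
print(space_savings(1000, 250))      # 75.0 percent saved
```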

Basic Principles/Techniques

Underlying methods used in many algorithms: Dictionary-based, Statistical, Transform, RLE, Predictive, Context Modeling, BWT, Delta Coding.

Dictionary-Based: Replaces sequences with references (e.g., LZ77, LZW).
Diagram: Sliding Window & Dictionary
Statistical Modeling: Shorter codes for frequent symbols (e.g., Huffman, Arithmetic).
Diagram: Huffman Tree Example
Transform Coding: Converts data to a more compressible domain (e.g., DCT in JPEG).
Diagram: DCT Block Transformation
Run-Length Encoding (RLE): Replaces sequences of identical symbols (e.g., AAAAA -> 5A).
Predictive Coding: Encodes difference from predicted value.
Context Modeling: Estimates symbol probability based on preceding symbols.
Burrows-Wheeler Transform (BWT): Reversible transform grouping similar characters (used in bzip2).
Delta Coding: Stores difference between consecutive data elements.
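
Delta coding is the simplest of these to show in code. In the sketch below (function names are illustrative), slowly changing values such as timestamps turn into small differences, which a later entropy or dictionary stage can code more compactly.

```python
def delta_encode(values):
    """Store the first value as-is, then each value minus its predecessor."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas

def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    total = 0
    values = []
    for delta in deltas:
        total += delta
        values.append(total)
    return values

timestamps = [1000, 1002, 1003, 1007]
print(delta_encode(timestamps))  # [1000, 2, 1, 4]
```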

II. Lossless Algorithms

Run-Length Encoding (RLE) Classic

Simple technique replacing consecutive identical data values with a count and the value.

Core Idea:

Replaces runs of identical data with a count and the single data value (e.g., WWWWBB -> 4W2B).
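
A minimal sketch of the idea, representing the output as (count, symbol) pairs rather than any particular file format's byte layout:

```python
from itertools import groupby

def rle_encode(text: str) -> list[tuple[int, str]]:
    """Collapse each run of identical characters into a (count, symbol) pair."""
    return [(len(list(group)), symbol) for symbol, group in groupby(text)]

def rle_decode(pairs) -> str:
    """Expand (count, symbol) pairs back into the original string."""
    return "".join(symbol * count for count, symbol in pairs)

print(rle_encode("WWWWBB"))  # [(4, 'W'), (2, 'B')]
print(rle_encode("ABCD"))    # four pairs for four characters: the output grows
```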

Use Cases:

Simple graphics, icons, fax transmissions, bitmap images (BMP), TIFF.

Strengths:

Very simple, computationally inexpensive, fast.

Weaknesses:

Inefficient for data without long runs; can even increase file size.

File Extensions:

Used within BMP, TIFF, PDF.

Huffman Coding Classic

Assigns variable-length codes based on symbol frequencies; more frequent symbols get shorter codes.

Core Idea:

Builds a prefix code tree where more frequent symbols have shorter paths (codes).
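
One way to build such a code, sketched with a priority queue. The function name and the dictionary-based tree representation are illustrative; real codecs usually store only the code lengths and derive canonical codes from them.

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict[str, str]:
    """Return a {symbol: bitstring} prefix code built from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}  # a lone symbol still needs one bit
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # the two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
print(codes)  # 'a' gets a 1-bit code, 'b' and 'c' get 2-bit codes
```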

Use Cases:

Component in Deflate (ZIP, GZIP), JPEG, MP3, PNG.

Strengths:

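Optimal among prefix codes that assign a whole number of bits to each symbol; relatively simple.

Weaknesses:

Requires symbol frequencies beforehand (or two passes). Not adaptive by itself. Because each code is at least one bit long, it is inefficient when one symbol is overwhelmingly likely.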

Diagram: Simple Huffman Tree Example

LZ77 & LZ78 Classic

Dictionary-based algorithms. LZ77 uses a sliding window; LZ78 builds an explicit dictionary.

Core Idea:

Replace repeated sequences with references to previously seen data.
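
A toy LZ77-style coder, assuming a greedy brute-force match search and (offset, length, next byte) triples as output. The window and match-length limits are arbitrary illustration values, and real implementations use hash chains or similar indexes instead of scanning the whole window.

```python
def lz77_encode(data: bytes, window: int = 255, max_len: int = 15):
    """Greedy LZ77: emit (offset, length, next_byte) triples."""
    i, out = 0, []
    while i < len(data):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlap), which encodes long runs.
            # The last byte is always kept back so it can serve as next_byte.
            while (length < max_len and i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decode(triples) -> bytes:
    out = bytearray()
    for offset, length, next_byte in triples:
        for _ in range(length):
            out.append(out[-offset])  # byte-by-byte copy handles overlapping matches
        out.append(next_byte)
    return bytes(out)

sample = b"abracadabra abracadabra"
triples = lz77_encode(sample)
print(len(sample), "bytes ->", len(triples), "triples")
```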

Use Cases:

General-purpose text/data. Basis for Deflate (ZIP, GZIP).

Strengths:

Adaptive, good compression ratios.

Weaknesses:

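Match searching makes naive compression slow; practical implementations rely on hash tables or similar indexes. The achievable ratio is limited by the window or dictionary size.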

LZW (Lempel-Ziv-Welch) Classic

Builds a dictionary from data; outputs dictionary codes. Used in GIF.

Core Idea:

Builds a string translation table from input data; outputs codes for encountered sequences.
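
A compact sketch of LZW over bytes, assuming an unbounded table and a plain list of integer codes as output. Real formats such as GIF pack codes into variable-width bit fields and reset the table when it fills, which this example omits.

```python
def lzw_compress(data: bytes) -> list[int]:
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])         # emit the longest known string
            table[candidate] = len(table)        # learn the new string
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes: list[int]) -> bytes:
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]      # code defined by the step that uses it
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[len(table)] = previous + entry[:1]  # mirror the encoder's table growth
        previous = entry
    return bytes(out)

message = b"TOBEORNOTTOBEORTOBEORNOT"
print(len(message), "bytes ->", len(lzw_compress(message)), "codes")
```

The decoder rebuilds the same table from the code stream alone, so no dictionary needs to be transmitted.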

Use Cases:

GIF images, compress utility, TIFF, PDF.

Strengths:

Simple, fast decompression.

Weaknesses:

Patents (now expired). Less efficient than modern LZ variants.

File Extensions:

.gif (internally)

bzip2 Archive

Uses Burrows-Wheeler Transform (BWT) followed by MTF and Huffman coding. Good ratio, but slow.

Core Idea:

Reorders data using BWT to group similar characters, then compresses the transformed data.
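
The transform itself can be sketched naively by sorting all rotations of the input. This version is quadratic or worse in time and memory and uses a NUL character as an assumed end-of-string marker; production implementations such as bzip2 use far more efficient sorting and a different way of marking the original row.

```python
def bwt(text: str, eos: str = "\0") -> str:
    """Burrows-Wheeler transform: last column of the sorted rotation matrix."""
    if eos in text:
        raise ValueError("input must not contain the end-of-string marker")
    s = text + eos
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def inverse_bwt(last_column: str, eos: str = "\0") -> str:
    """Rebuild the rotation table column by column, then pick the original row."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(ch + row for ch, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(eos))
    return original[:-1]

transformed = bwt("banana")
print(repr(transformed))         # 'annb\x00aa': identical letters end up adjacent
print(inverse_bwt(transformed))  # banana
```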

Use Cases:

General file compression, software distribution (common on Linux/Unix).

Strengths:

Generally better compression ratios than Deflate.

Weaknesses:

Significantly slower compression and decompression than Deflate, Zstd.

File Extensions:

.bz2

LZMA / LZMA2 Archive

LZ77 variant with large dictionary and range encoding. Very high ratios, but can be slow and memory-intensive.

Core Idea:

Combines an LZ77-like dictionary coder with sophisticated probability modeling (range coder).
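
For a rough feel of how LZMA compares with Deflate and bzip2, Python's standard library ships bindings for all three (`zlib`, `bz2`, `lzma`). The sample data and the helper name `compare_codecs` below are illustrative, and results vary considerably with the input and the chosen levels.

```python
import bz2
import lzma
import zlib

def compare_codecs(data: bytes) -> dict[str, int]:
    """Compress the same input with three stdlib codecs and return the sizes."""
    codecs = {
        "zlib (Deflate)": (zlib.compress, zlib.decompress),
        "bz2 (bzip2)": (bz2.compress, bz2.decompress),
        "lzma (xz container)": (lzma.compress, lzma.decompress),
    }
    sizes = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        if decompress(packed) != data:  # lossless round trip must be exact
            raise RuntimeError(f"{name} round trip failed")
        sizes[name] = len(packed)
    return sizes

sample = b"".join(
    b"line %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(2000)
)
for name, size in compare_codecs(sample).items():
    print(f"{name}: {len(sample)} -> {size} bytes")
```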

Use Cases:

Default for 7-Zip (.7z), XZ Utils (.xz). Software distribution, large archives.

Strengths:

Very high compression ratios.

Weaknesses:

Slow compression, high memory usage. Decompression is faster but not top-tier.

File Extensions:

.7z, .xz

Deflate (LZ77 + Huffman) General

Combines LZ77 and Huffman coding. Widely used in ZIP, GZIP, PNG.

Core Idea:

Finds duplicate strings with LZ77, then compresses literals and LZ77 output with Huffman.
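
The same Deflate stream appears inside several containers. The sketch below uses the `wbits` argument of Python's `zlib.compressobj` to produce a raw Deflate stream, a zlib-wrapped stream, and a gzip-wrapped stream from one input; the helper name `deflate` is illustrative.

```python
import zlib

def deflate(data: bytes, wbits: int, level: int = 6) -> bytes:
    """Compress with Deflate; wbits selects the container format."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()

data = b"the rain in Spain stays mainly in the plain. " * 200

raw = deflate(data, -15)         # bare Deflate stream (as stored in ZIP entries)
zlib_stream = deflate(data, 15)  # zlib header + checksum (as used inside PNG)
gzip_stream = deflate(data, 31)  # gzip header + CRC trailer (.gz, HTTP gzip)

for name, blob in [("raw", raw), ("zlib", zlib_stream), ("gzip", gzip_stream)]:
    print(f"{name}: {len(blob)} bytes")
```

The size differences between the three outputs come from the wrappers' headers and checksums, not from the compressed payload.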

Use Cases:

.zip, .gz files, PNG images, HTTP compression.

Strengths:

Good balance of speed and ratio, widely adopted.

Weaknesses:

Outperformed in ratio and/or speed by modern algorithms like Zstd, Brotli.

File Extensions:

.zip, .gz, .png (internally)

Arithmetic Coding Advanced

Encodes entire message as a single fraction. Achieves near-optimal compression.

Core Idea:

Represents the input data as a single fractional number in the range [0,1). More efficient than Huffman for skewed probabilities.
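
An idealized sketch of the interval-narrowing idea, using exact fractions and a static model built from the message itself. Practical arithmetic and range coders work with fixed-precision integers and renormalization instead, and they transmit or adapt the model; the function names here are illustrative.

```python
from collections import Counter
from fractions import Fraction

def build_intervals(message: str) -> dict:
    """Give each symbol a slice of [0, 1) proportional to its frequency."""
    counts = Counter(message)
    total = len(message)
    intervals, low = {}, Fraction(0)
    for symbol in sorted(counts):
        width = Fraction(counts[symbol], total)
        intervals[symbol] = (low, low + width)
        low += width
    return intervals

def arithmetic_encode(message: str, intervals: dict) -> Fraction:
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        span = high - low
        sym_low, sym_high = intervals[symbol]
        low, high = low + span * sym_low, low + span * sym_high
    return (low + high) / 2  # any number inside [low, high) identifies the message

def arithmetic_decode(value: Fraction, intervals: dict, length: int) -> str:
    out = []
    for _ in range(length):
        for symbol, (sym_low, sym_high) in intervals.items():
            if sym_low <= value < sym_high:
                out.append(symbol)
                value = (value - sym_low) / (sym_high - sym_low)  # zoom into the slice
                break
    return "".join(out)

text = "AAAAAAAB"
model = build_intervals(text)
code = arithmetic_encode(text, model)
print(code, "->", arithmetic_decode(code, model, len(text)))
```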

Use Cases:

Component in JPEG 2000 and H.264/AVC (CABAC). The original bzip also used it; bzip2 replaced it with Huffman coding.

Strengths:

Higher compression efficiency than Huffman, especially for skewed probabilities.

Weaknesses:

More computationally complex, historically patent-encumbered.

Brotli Modern Web

Modern algorithm (LZ77 + Huffman + Context Modeling + Static Dictionary). Excellent for web text.

Core Idea:

Uses LZ77, Huffman coding, 2nd order context modeling, and a large pre-defined static dictionary.

Use Cases:

Web content (HTTP compression), WOFF2 fonts. Excels on text.

Strengths:

Excellent compression ratios, fast decompression.

Weaknesses:

Compression at high quality levels can be much slower than Gzip/Zstd, though quality levels (0-11) let you trade speed for ratio.

File Extensions:

.br (files), br (HTTP content encoding)

Zstandard (Zstd) Modern Fast

Modern algorithm (LZ77 variant + ANS/FSE). Very fast with good ratios. Highly flexible.

Core Idea:

Combines an LZ77-variant with a fast entropy stage (Finite State Entropy - FSE, an Asymmetric Numeral System variant).

Use Cases:

General-purpose, databases (MySQL, RocksDB), file systems (ZFS, Btrfs), real-time, archives (.tar.zst).

Strengths:

Very fast compression/decompression, flexible levels, good ratios, dictionary support.

Weaknesses:

Newer, so adoption still growing vs. Deflate (though rapidly).

File Extensions:

.zst

Prediction by Partial Matching (PPM) High Ratio

Adaptive statistical technique using context modeling and arithmetic coding. High ratios, but slow.

Core Idea:

Uses preceding symbols (context) to predict the next symbol's probability for arithmetic coding.
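
The benefit of context can be estimated without writing a full coder. The sketch below sums the ideal code length (-log2 p) that an arithmetic coder would spend under an adaptive order-N model. It is a simplification, not PPM itself: it uses add-one smoothing over an assumed 256-symbol alphabet instead of PPM's escape mechanism and blending of context orders.

```python
import math
from collections import defaultdict

def ideal_code_length(text: str, order: int) -> float:
    """Total bits under an adaptive order-N context model with add-one smoothing."""
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    totals = defaultdict(int)                       # context -> symbols seen
    bits = 0.0
    for i, symbol in enumerate(text):
        context = text[max(0, i - order):i]
        probability = (counts[context][symbol] + 1) / (totals[context] + 256)
        bits += -math.log2(probability)
        counts[context][symbol] += 1                # update after coding, as a decoder could
        totals[context] += 1
    return bits

sample_text = "abababab" * 50
for order in (0, 1, 2):
    print(f"order {order}: {ideal_code_length(sample_text, order):.0f} bits")
```

On this strictly alternating input, any order of 1 or more predicts the next symbol almost perfectly once the model has warmed up, so the estimated size drops well below the order-0 figure.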

Use Cases:

Text compression, general-purpose where ratio is paramount.

Strengths:

Very high compression ratios, adaptive.

Weaknesses:

Computationally expensive (CPU and memory), can be very slow.

FLAC Audio

Lossless audio compression using linear prediction and Golomb-Rice coding.

Core Idea:

Models audio signal with linear prediction, encodes residual error.
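
A simplified sketch of the predict-then-code-the-residual idea. It uses a fixed second-order predictor (predict each sample as 2*s[n-1] - s[n-2]) and a Rice code written as a bit string. The bit layout is illustrative and does not match FLAC's actual bitstream, and a real encoder also selects predictor order and Rice parameter per block.

```python
def residuals_order2(samples):
    """Keep two warm-up samples, then store the prediction error for the rest."""
    out = list(samples[:2])
    for n in range(2, len(samples)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        out.append(samples[n] - prediction)
    return out

def restore_samples(residuals):
    """Invert residuals_order2 exactly (lossless)."""
    samples = list(residuals[:2])
    for n in range(2, len(residuals)):
        samples.append(residuals[n] + 2 * samples[n - 1] - samples[n - 2])
    return samples

def rice_encode(value: int, k: int) -> str:
    """Rice code: zigzag-map to unsigned, unary quotient, k-bit remainder."""
    u = 2 * value if value >= 0 else -2 * value - 1
    quotient, remainder = u >> k, u & ((1 << k) - 1)
    bits = "1" * quotient + "0"
    return bits + format(remainder, f"0{k}b") if k else bits

ramp = [0, 3, 6, 9, 12]
print(residuals_order2(ramp))  # [0, 3, 0, 0, 0]: a linear ramp is predicted exactly
print([rice_encode(r, 2) for r in residuals_order2(ramp)])
```

Small residuals get short codes, which is where the size reduction comes from.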

Use Cases:

Archival of music, high-fidelity audio playback.

Strengths:

Good audio compression (30-60% reduction), royalty-free, widely supported.

Weaknesses:

Specifically for audio.

File Extensions:

.flac, .fla

ALAC (Apple Lossless) Audio Apple

Apple's lossless audio codec, also using linear prediction.

Core Idea:

Similar to FLAC, uses linear prediction with different parameters and entropy coding.

Use Cases:

Used within Apple's ecosystem (Apple Music Lossless, iTunes libraries).

Strengths:

Similar compression to FLAC, well-integrated in Apple products.

Weaknesses:

Less universal support outside Apple ecosystem compared to FLAC.

File Extensions:

.m4a (when containing ALAC)

III. Lossy Algorithms

JPEG Image Classic

Widely used for photographic images. Uses DCT, quantization, and Huffman/Arithmetic coding.

Core Idea:

Transforms 8x8 pixel blocks using Discrete Cosine Transform (DCT), quantizes coefficients, then entropy codes.
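
The transform and quantization steps can be sketched in pure Python. This uses the orthonormal 2D DCT-II on one 8x8 block and a single uniform quantization step, which is a simplification: JPEG level-shifts samples by 128 first and uses an 8x8 quantization table with a different step per frequency. The function names are illustrative.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 2D DCT-II of an 8x8 block given as a list of 8 rows."""
    def alpha(u):
        return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

def quantize(coefficients, step):
    """Uniform quantization: this rounding is where information is discarded."""
    return [[round(c / step) for c in row] for row in coefficients]

flat_block = [[50] * N for _ in range(N)]
quantized = quantize(dct_2d(flat_block), 16)
print(quantized[0])  # only the first (DC) coefficient is non-zero for a flat block
```

Smooth blocks concentrate their energy in a few low-frequency coefficients, so most quantized values become zero and entropy-code cheaply.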

Use Cases:

Still images (photographs). Very common on the web.

Strengths:

Widely supported, good for photos at reasonable quality.

Weaknesses:

Blocking artifacts at low quality, not ideal for sharp lines/text.

Parameters:

Quality setting (1-100), chroma subsampling.

File Extensions:

.jpg, .jpeg

Diagram: JPEG 8x8 DCT Block Processing

HEIC (HEVC in the High Efficiency Image File Format, HEIF) Image Apple Default

Apple's default image format. Uses HEVC/H.265 for image data, offers better compression than JPEG.

Core Idea:

Stores HEVC-encoded image data within an HEIF container. Offers better compression than JPEG for similar quality.

Use Cases:

Default image capture on modern iPhones/iPads. Growing support on other platforms.

Strengths:

Often around 50% smaller than JPEG at similar quality (varies with content and encoder). Supports transparency, animations, depth maps, Live Photos.

Weaknesses:

Not as universally supported as JPEG yet, though adoption is increasing. Can have licensing considerations (HEVC patents).

Parameters:

Typically managed by capture device settings.

File Extensions:

.heic, .heif

Apple ProRes Video Pro Video

Family of high-quality, lossy (visually lossless to near-lossless) video codecs for professional post-production.

Core Idea:

Intra-frame DCT-based codecs optimized for editing performance and high image fidelity.

Use Cases:

Video acquisition (ProRes recording on recent iPhone Pro models and cinema cameras), professional video editing (Final Cut Pro), intermediate/mastering format.

Strengths:

Excellent image quality, robust editing performance, supports alpha channels (ProRes 4444), multiple data rates/quality levels.

Weaknesses:

Large file sizes compared to distribution codecs (H.264/HEVC). Not intended for final delivery to end-users.

Variants:

ProRes Proxy, LT, 422, 422 HQ, 4444, 4444 XQ, ProRes RAW.

File Extensions:

.mov (QuickTime container)

JPEG 2000 Image Specialized

Uses wavelet transform. Better quality than JPEG at high compression, supports lossless.

Core Idea:

Applies wavelet transform to entire image or large tiles, offering progressive decoding.

Use Cases:

Medical imaging (DICOM), digital cinema, archival.

Strengths:

Better quality than JPEG at high compression, lossless option, ROI coding.

Weaknesses:

More complex, less native software support than JPEG, computationally intensive.

Parameters:

Compression ratio/bitrate, quality layers, lossless/lossy.

File Extensions:

.jp2, .j2k

WebP Image Web

Google's image format. Lossy mode uses VP8-based prediction; also supports lossless, animation, transparency.

Core Idea:

Lossy uses intra-frame prediction from VP8 video; lossless uses different techniques (spatial prediction, LZ77).

Use Cases:

Web images, aiming to replace JPEG, PNG, GIF.

Strengths:

Better compression than JPEG (lossy) and PNG (lossless). Supports animation & alpha.

Weaknesses:

Lossy quality can be debated vs. highly optimized JPEGs or newer AVIF.

Parameters:

Quality setting (lossy), effort setting (lossless).

File Extensions:

.webp

AVIF (AV1 Image Format) Image Next-Gen Web

Image format using AV1 video intra-frame coding. Excellent efficiency, supports HDR.

Core Idea:

Leverages AV1 video compression techniques for still images, stored in HEIF container.

Use Cases:

Web images, aiming for superior quality/ratio over JPEG/WebP.

Strengths:

Significantly better compression than JPEG/WebP. Supports HDR, wide color gamut, lossless, animation. Royalty-free.

Weaknesses:

Newer, software/browser support still growing. Can be computationally demanding.

Parameters:

Quality setting (quantizer), speed/effort.

File Extensions:

.avif

MPEG Family (Video & Audio)

The Moving Picture Experts Group (MPEG) has developed a suite of widely adopted standards for audio and video compression. Key video codecs include H.264/AVC and H.265/HEVC, both developed jointly with ITU-T. Key audio codecs include MP3 and AAC. Details for prominent members are in individual cards below.

H.264/AVC (MPEG-4 Part 10) Video

Widely used video codec. Excellent balance of quality and compression. Used in Blu-ray, streaming.

Core Idea:

Advanced video coding with flexible macroblocks, improved prediction, in-loop deblocking filter.

Use Cases:

Blu-ray, streaming, video conferencing, most web video.

Strengths:

Excellent quality/ratio balance, wide hardware support.

Weaknesses:

More complex than MPEG-2, royalty-bearing.

Parameters:

Bitrate, profiles (Baseline, Main, High), levels, GOP settings.

File Extensions:

.mp4, .mkv, .mov

H.265/HEVC Video Apple Used

Successor to H.264, roughly 2x efficiency. Used for 4K/UHD content. Apple default video codec.

Core Idea:

Larger coding units (CTUs), improved prediction modes, Sample Adaptive Offset (SAO) filtering.

Use Cases:

4K/UHD Blu-ray, high-resolution streaming. Default video on modern iPhones/iPads.

Strengths:

Significantly better compression than H.264.

Weaknesses:

More complex, licensing was complicated (improving).

Parameters:

Bitrate, profiles (Main, Main10), tiers, levels.

File Extensions:

.mp4, .mkv, .mov

VP9 Video Web

Google's open and royalty-free video codec. Widely used by YouTube, competitor to H.265.

Core Idea:

Advanced video coding techniques, designed for web streaming and real-time communication.

Use Cases:

YouTube, WebRTC, other streaming services.

Strengths:

Good compression efficiency (comparable to early H.265), royalty-free, strong browser support.

Weaknesses:

Largely being superseded by AV1 for top-tier efficiency.

File Extensions:

Often in .webm, .mp4.

AV1 (AOMedia Video 1) Video Apple Support

Royalty-free, open-source video codec. Aims for better efficiency than HEVC. Growing in web streaming. Apple adds hardware decode support.

Core Idea:

Advanced techniques: larger superblocks, sophisticated prediction, CDEF/loop restoration filters.

Use Cases:

Web streaming (YouTube, Netflix, Twitch), real-time communications (WebRTC).

Strengths:

Excellent compression (better than HEVC), royalty-free. Apple hardware decoding from A17 Pro/M3 chips.

Weaknesses:

Very computationally intensive to encode (improving), decode can also be heavy without hardware support.

Parameters:

Bitrate, quality settings (CRF), speed presets.

File Extensions:

.mkv, .webm, .mp4 (with ISOBMFF)

MP3 (MPEG-1 Audio Layer III) Audio Classic

Ubiquitous audio codec. Uses psychoacoustic model, MDCT, quantization, Huffman.

Core Idea:

Discards parts of audio signal less perceptible to human hearing using psychoacoustic models.

Use Cases:

Digital audio, music files, podcasts.

Strengths:

Ubiquitous support, good quality at moderate bitrates.

Weaknesses:

Older, less efficient than AAC/Opus. Audible artifacts at low bitrates.

Parameters:

Bitrate (CBR/VBR, e.g., 128, 192, 320 kbps).

File Extensions:

.mp3

AAC (Advanced Audio Coding) Audio Apple Standard

Successor to MP3, better quality at same bitrate. Standard for Apple Music, iTunes.

Core Idea:

Improved psychoacoustic model, MDCT with better windowing, more efficient coding techniques (TNS, PNS).

Use Cases:

Apple Music/iTunes, YouTube, streaming, digital radio (DAB+).

Strengths:

Better quality than MP3 at same bitrate, especially lower bitrates.

Weaknesses:

Several variants (AAC-LC, HE-AAC) can cause confusion.

Parameters:

Bitrate, profiles (LC, HE, HEv2).

File Extensions:

.aac, .m4a, .mp4

Opus Audio Apple Support

Royalty-free, versatile (speech & music), low latency. Excellent for VoIP, streaming. Supported by Apple.

Core Idea:

Combines SILK (speech) and CELT (music) algorithms, dynamically switching or combining.

Use Cases:

VoIP, video conferencing (WebRTC default), game chat, streaming, audiobooks. Used by FaceTime audio.

Strengths:

Excellent quality across wide bitrate range, very low delay, royalty-free, adaptive.

Weaknesses:

Less ubiquitous for stored music vs. MP3/AAC (though growing).

Parameters:

Bitrate, application type (VoIP, Audio, Low-Delay).

File Extensions:

.opus (often in .ogg or .webm)

Vorbis (Ogg Vorbis) Audio Open Source

Open-source, patent-free audio format. Good quality, popular in open-source applications.

Core Idea:

Uses Modified Discrete Cosine Transform (MDCT), vector quantization, and codebook-based entropy encoding.

Use Cases:

Open-source software, indie games, some streaming (historically Spotify).

Strengths:

Good quality, royalty-free and open.

Weaknesses:

Less efficient than Opus/modern AAC at very low bitrates. Hardware support less widespread.

Parameters:

Quality level (q -1.0 to 10.0), average bitrate.

File Extensions:

.ogg, .oga

Psychovisual/Psychoacoustic Principles

How lossy codecs exploit human perception limits (auditory/frequency masking, luminance vs. chrominance sensitivity).

Psychoacoustics (Audio)

Lossy audio codecs (MP3, AAC, Opus) exploit auditory masking:

Frequency Masking: Louder sounds make quieter sounds at nearby frequencies inaudible.
Temporal Masking: A loud sound masks quieter sounds immediately before (pre-masking) or after (post-masking) it.

Codecs discard or heavily quantize information in masked regions.

Psychovisuals (Image/Video)

Lossy image/video codecs (JPEG, H.264) exploit Human Visual System (HVS) characteristics:

Luminance vs. Chrominance Sensitivity: Humans are more sensitive to brightness (luminance) than color (chrominance). Chroma subsampling reduces color info.
Frequency Sensitivity: Less sensitive to high-frequency details. Transform coding allows selective quantization.
Contrast Masking: Visual patterns can mask noise within those regions.
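
Chroma subsampling is the most direct of these to demonstrate. The sketch below converts RGB to YCbCr with the full-range BT.601 coefficients used by JFIF/JPEG, then averages each 2x2 block of a chroma plane (4:2:0). The helper names are illustrative, and `subsample_420` assumes even plane dimensions.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion: Y is brightness, Cb/Cr are color differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block, halving the resolution in both directions."""
    height, width = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, width, 2)]
        for y in range(0, height, 2)
    ]

cb_plane = [[100, 102, 110, 112],
            [101, 103, 111, 113],
            [120, 120, 130, 130],
            [120, 120, 130, 130]]
print(subsample_420(cb_plane))  # [[101.5, 111.5], [120.0, 130.0]]
```

With full-resolution luma and two quarter-resolution chroma planes, 4:2:0 stores 1.5 samples per pixel instead of 3, halving the data before any transform coding.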

IV. Practical Considerations

Choosing the Right Algorithm

Factors to consider: data type, loss tolerance, ratio, speed, resources, licensing, ecosystem.

Key Questions:

What data type? (Text, image, audio, video, binary)
Is loss acceptable? (Lossless vs. Lossy)
Primary goal? (Ratio, speed, quality, low cost)
Resource constraints? (CPU, RAM)
Target platform/ecosystem support?
Licensing/royalty concerns?
Energy consumption / battery life needs?
Standards compliance / interoperability needs?

Flowchart: Decision Tree for Algorithm Selection

General Guidelines:

Text/Code: Zstd, Brotli, Gzip.
Archives: Zstd (high levels), 7-Zip (LZMA2), XZ.
Web Photos: JPEG, WebP, AVIF, HEIC (where supported).

Tools, Libraries & Software

Common archivers (gzip, 7-Zip), libraries (zlib, FFmpeg), and software implementing these algorithms.

Command-Line Archivers:

gzip, zip, 7-Zip (7z), tar (with compressors)
brotli, zstd, bzip2, xz

Libraries for Developers:

zlib: (C library for Deflate)
libjpeg-turbo: (JPEG C library)
FFmpeg: (Audio/video codecs library & tool)

Applications:

Image Editors (GIMP, Photoshop, Pixelmator), Video Editors (DaVinci, Premiere, Final Cut Pro), Audio Editors (Audacity, Logic Pro).

Application Domains

Unique compression needs in databases, network traffic, medical imaging, genomics, archives, scientific data.

Databases: Columnar compression, delta encoding.
Network Traffic: HTTP compression (Gzip, Brotli), real-time (Opus).
Medical Imaging (DICOM): Lossless (JPEG-LS, RLE) or visually lossless (JPEG 2000).
Genomics Data (FASTQ, CRAM): Specialized algorithms.

Emerging Trends

AI/Neural Network-based compression, Versatile Video Coding (VVC), hardware acceleration, semantic compression, privacy-aware compression.

AI/Neural Network-Based Compression: Promising for images/video/audio, but often computationally expensive.
Versatile Video Coding (VVC/H.266): Successor to HEVC, developed jointly by MPEG and ITU-T; ~30-50% bitrate reduction over HEVC at similar quality.
Specialized Hardware Acceleration: For newer codecs (AV1, VVC).
Focus on Semantic Compression: Compressing based on data *meaning*.
Compression for Privacy: Emerging techniques.

Pre/Post-processing

Steps taken before compression (e.g., normalization, BWT) or after decompression (e.g., deblocking filters) to improve results.

Pre-processing Examples:

Normalization, noise removal (for lossy), data transformation (like BWT), reordering data fields.

Post-processing Examples:

Deblocking filters (common in video codecs), deringing filters, error concealment.

V. Standards Bodies

Key Organizations

Several organizations play crucial roles in developing and standardizing compression algorithms, ensuring interoperability and advancing the field. Key players include MPEG, ITU-T, IETF, AOMedia, ISO/IEC, and W3C.

MPEG (Moving Picture Experts Group): Develops standards for audio and video (e.g., MPEG-2, H.264, H.265, VVC, MP3, AAC). Part of ISO/IEC. JPEG comes from a separate committee, the Joint Photographic Experts Group.
ITU-T (International Telecommunication Union - Telecommunication Standardization Sector): Develops video coding standards, often jointly with MPEG (e.g., H.26x series).
IETF (Internet Engineering Task Force): Develops standards for internet protocols and publishes codec and format specifications as RFCs (e.g., Opus, Deflate, Brotli, Zstandard).
AOMedia (Alliance for Open Media): Consortium developing royalty-free video codecs like AV1.
ISO (International Organization for Standardization) & IEC (International Electrotechnical Commission): General standards bodies, often publishing MPEG work.
W3C (World Wide Web Consortium): Standardizes web technologies, including formats like PNG and font compression (WOFF/WOFF2). WebP is specified by Google rather than the W3C.