We now describe the DCT-based modes of encoding operation in more
detail. The following figure shows the key processing steps of the
encoder. It illustrates the special case of grayscale image compression.
Because the three components of RGB are highly correlated, the first
step is usually to convert the image from the RGB color space to the
YCbCr color space. Generally speaking, the eye is more sensitive to
luminance (brightness) and less sensitive to chrominance (color).
Therefore, storage can be saved by keeping more luminance data and
less chrominance data.
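The color conversion and the reduction of color data can be sketched as follows. This is a minimal illustration, not a reference implementation: the constants are the full-range YCbCr constants commonly used with JPEG files (an assumption, since the text above does not give a conversion matrix), and the function names `rgb_to_ycbcr` and `subsample_2x2` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    The constants are an assumption (JFIF-style); results are returned
    as floats and are neither rounded nor clamped to 0..255.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_2x2(plane):
    """Keep less color data: average each 2x2 group of a Cb or Cr plane.

    `plane` is a list of rows; width and height are assumed to be even.
    """
    out = []
    for row in range(0, len(plane), 2):
        out_row = []
        for col in range(0, len(plane[row]), 2):
            total = (plane[row][col] + plane[row][col + 1]
                     + plane[row + 1][col] + plane[row + 1][col + 1])
            out_row.append(total / 4.0)
        out.append(out_row)
    return out
```

A neutral gray pixel carries no color information, so both Cb and Cr come out at the midpoint 128 while Y equals the gray level.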
Discrete Cosine Transform (DCT)
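After the color space conversion, the source image samples are grouped
into 8x8 blocks, shifted from unsigned integers to signed integers, and
input to the DCT. The following equation is the idealized mathematical
definition of the 8x8 DCT:
$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}$$
where f(x,y) is the level-shifted sample at position (x,y), F(u,v) is
the DCT coefficient at frequency (u,v), and C(u), C(v) = 1/\sqrt{2} for
u, v = 0 and 1 otherwise.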
Each 8x8 block of source image samples is effectively a 64-point
discrete signal which is a function of the two spatial dimensions x and
y. The DCT takes such a signal as its input and decomposes it into the
64 unique two-dimensional "spatial frequencies" which comprise the input
signal's "spectrum." The output of the DCT is the set of 64 basis-signal
amplitudes (DCT coefficients), whose values can be regarded as the
relative amounts of the 2D spatial frequencies contained in the 64-point
input signal.
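A direct, unoptimized evaluation of the 8x8 forward DCT can be sketched as follows. It is meant only to make the decomposition concrete: it assumes 8-bit samples (so the level shift is 128), the function name `forward_dct_8x8` is hypothetical, and real encoders use fast factorizations rather than this quadruple loop.

```python
import math


def forward_dct_8x8(samples):
    """Level-shift an 8x8 block of 8-bit samples and apply the 2D DCT.

    samples[y][x] is the sample in row y, column x.
    Returns coeffs[v][u], where u is the horizontal and v the vertical
    spatial frequency; coeffs[0][0] is the zero-frequency term.
    """
    # Shift unsigned 0..255 samples to signed -128..127 (8-bit assumption).
    shifted = [[s - 128 for s in row] for row in samples]

    def c(k):
        # Normalization factor: 1/sqrt(2) at zero frequency, else 1.
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (shifted[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs
```

For a completely flat block every sample equals the block average, so all of the signal ends up in coeffs[0][0] and the other 63 amplitudes are zero up to floating-point error. With all samples equal to 255, the shifted value is 127 and the zero-frequency amplitude is 8 x 127 = 1016.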
The DCT coefficients are divided into the "DC coefficient" and the "AC
coefficients". The DC coefficient is the coefficient with zero frequency
in both dimensions, and the AC coefficients are the remaining 63
coefficients with non-zero frequencies. The DCT step concentrates most
of the signal in the lower spatial frequencies. In other words, most of
the spatial frequencies have zero or near-zero amplitude and need not
be encoded.
Because the DCT equations contain transcendental functions, no physical
implementation can compute them with perfect accuracy. Moreover, JPEG
does not specify a unique DCT algorithm in its proposed standard. This
makes compliance more difficult to confirm, since two compliant encoders
(or decoders) generally will not produce identical outputs given the
same input.
To achieve further compression, each of the 64 DCT coefficients is
uniformly quantized in conjunction with a 64-element Quantization Table,
which is specified by the application. The purpose of quantization is to
discard information which is not visually significant. Because
quantization is a many-to-one mapping, it is fundamentally lossy.
Moreover, it is the principal source of lossiness in a DCT-based encoder.
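Quantization is defined as division of each DCT coefficient by its
corresponding quantizer step size, followed by rounding to the nearest
integer, as in the following equation:
$$F^{Q}(u,v) = \mathrm{round}\!\left(\frac{F(u,v)}{Q(u,v)}\right)$$
where F(u,v) is a DCT coefficient and Q(u,v) is the corresponding entry
of the Quantization Table. The output is thus normalized by the
quantizer step size.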
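The division and rounding, together with the inverse step a decoder would apply, can be sketched as follows. The table `EXAMPLE_TABLE` is purely illustrative and is not taken from any standard (the table is chosen by the application); the function names are hypothetical, and rounding exact ties upward is an assumption, since the text only says "nearest integer".

```python
import math

# Illustrative table only: step sizes grow with spatial frequency so that
# high frequencies are quantized more coarsely. Not a standard table.
EXAMPLE_TABLE = [[16 + 4 * (u + v) for u in range(8)] for v in range(8)]


def quantize(coeffs, table):
    """Divide each coefficient by its step size and round to nearest."""
    return [[int(math.floor(c / q + 0.5)) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Undo the normalization by multiplying by the step size."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]
```

Because many coefficient values map to the same integer, `dequantize(quantize(...))` does not return the original values. The reconstruction error of each coefficient is at most half of its step size, which is why larger step sizes give more compression and more loss.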
Ideally, each quantization step size should be chosen as the perceptual
threshold, so that the image is compressed as much as possible without
visible artifacts. This threshold is also a function of the source image
characteristics, display characteristics and viewing distance.
The final processing step of the encoder is entropy coding. Before
entropy coding, the quantized coefficients go through a few preparatory
steps. Note that the DC coefficient is treated separately from the 63 AC
coefficients.
The DC coefficient is a measure of the average value of the 64 image
samples. Because there is usually strong correlation between the DC
coefficients of adjacent 8x8 blocks, the quantized DC coefficient is
encoded as the difference from the DC term of the previous block in the
encoding order. This is called Differential Pulse Code Modulation
(DPCM), and it is defined as follows:
DiffDC(i) = DC(i) - DC(i-1)
where i is the index of the block in encoding order and DC(0) = 0.
DPCM can usually achieve further compression
due to the smaller range of the coefficient values.
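The differencing of DC terms can be sketched as follows. The function names are hypothetical, and the predictor for the first block is taken to be 0, following the DC(0) = 0 convention in the text.

```python
def dpcm_encode(dc_values):
    """Replace each quantized DC term by its difference from the previous one."""
    diffs = []
    previous = 0  # predictor for the first block, i.e. DC(0) = 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs


def dpcm_decode(diffs):
    """Recover the DC terms by accumulating the differences."""
    dc_values = []
    previous = 0
    for diff in diffs:
        previous += diff
        dc_values.append(previous)
    return dc_values
```

For example, the DC sequence 64, 66, 65, 65 becomes 64, 2, -1, 0. Apart from the first value, the differences are small numbers clustered around zero, which an entropy coder can represent with short codes.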
The remaining AC coefficients are ordered
into the "zig-zag" sequence, which helps to facilitate entropy coding by
placing low-frequency coefficients before high-frequency coefficients.
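One way to generate the zig-zag order is to walk the anti-diagonals of the block and alternate direction on each one. The sketch below assumes the block is stored as rows of coefficients with the DC term at position (0, 0); the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            # Even diagonals run from bottom-left to top-right.
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block of coefficients into zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

The first element of the scan is the DC term and the remaining 63 are the AC coefficients. Since quantization tends to zero out the high-frequency terms, the scan usually ends in a long run of zeros.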
The outputs of DPCM and of the "zig-zag" scan can now be entropy coded
separately. Entropy coding encodes the quantized DCT coefficients more
compactly based on their statistical characteristics. The JPEG proposal
specifies two entropy coding methods: Huffman coding and arithmetic
coding.
Entropy coding can be considered as a
two-step process. The first step converts the zig-zag sequence of quantized
coefficients into an intermediate sequence of symbols. The second step
converts the symbols to a data stream in which the symbols no longer have
externally identifiable boundaries. The form and definition of the intermediate
symbols is dependent on both the DCT-based mode of operation and the entropy