Pictures are coded as luminance
and two colour difference components (Y, Cb and Cr, as in MPEG).
In H.263, there are five standardized picture formats: sub-QCIF, QCIF,
CIF, 4CIF and 16CIF. For each of these picture formats, the luminance
sampling structure is dx pixels per line and dy lines per picture in an orthogonal
arrangement. Each of the two colour difference components is sampled
at dx/2 pixels per line and dy/2 lines per picture, also orthogonally. The values
of dx, dy, dx/2 and dy/2 are as follows:
Picture Format   Luminance pixels (dx)   Luminance lines (dy)   Chrominance pixels (dx/2)   Chrominance lines (dy/2)
sub-QCIF         128                     96                     64                          48
QCIF             176                     144                    88                          72
CIF              352                     288                    176                         144
4CIF             704                     576                    352                         288
16CIF            1408                    1152                   704                         576
For each of the picture formats, colour difference samples
are sited such that their block boundaries coincide with luminance block
boundaries, as shown in Figure 3. All decoders shall be able to operate
using sub-QCIF and QCIF. Encoders determine which of these two formats
is used, and are not obliged to be able to operate with both.
Video source coding algorithm
The source coder is shown in generalized form in Figure 4. The main
elements are prediction, block transformation and quantization.
GOBs, macroblocks and blocks
Each picture is divided into groups of blocks (GOBs). A GOB comprises
k x 16 lines, depending on the picture format (k=1 for sub-QCIF, QCIF and
CIF; k=2 for 4CIF; k=4 for 16CIF). The number of GOBs per picture is 6
for sub-QCIF, 9 for QCIF, and 18 for CIF, 4CIF and 16CIF. GOBs are numbered
by a vertical scan, starting with the upper GOB (number 0) and ending with
the lower GOB. Data for each GOB consists of a GOB header
followed by data for macroblocks. Data for GOBs is transmitted per GOB
in increasing GOB number.
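The GOB counts quoted above follow directly from the luminance height and k. A minimal Python sketch of that arithmetic is given here; the names `LUMA_SIZE`, `GOB_K`, `gobs_per_picture` and `gob_line_range` are illustrative only and do not come from the Recommendation.

```python
from typing import Tuple

# Luminance dimensions (dx, dy) taken from the table of picture formats.
LUMA_SIZE = {
    "sub-QCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

# k as given in the text: a GOB covers k x 16 luminance lines.
GOB_K = {"sub-QCIF": 1, "QCIF": 1, "CIF": 1, "4CIF": 2, "16CIF": 4}


def gobs_per_picture(fmt: str) -> int:
    """Return the number of GOBs in one picture of the given format."""
    _, dy = LUMA_SIZE[fmt]
    return dy // (GOB_K[fmt] * 16)


def gob_line_range(fmt: str, gob_number: int) -> Tuple[int, int]:
    """Return (first_line, last_line) of the luminance lines covered by a GOB.

    GOB numbers start at 0 for the upper GOB, as in the text.
    """
    if not 0 <= gob_number < gobs_per_picture(fmt):
        raise ValueError("GOB number out of range for this format")
    lines = GOB_K[fmt] * 16
    first = gob_number * lines
    return first, first + lines - 1
```

For example, `gobs_per_picture("QCIF")` evaluates 144 // 16 and gives 9, in agreement with the text.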
Each GOB is divided into macroblocks. A macroblock relates to 16 pixels by 16 lines
of Y (four luminance blocks) and the spatially corresponding 8 pixels by
8 lines of Cb and Cr (two chrominance blocks). Each luminance or chrominance
block relates to 8 pixels by 8 lines of Y, Cb or Cr.
Macroblocks are numbered by a horizontal scan of the macroblock rows from
left to right, starting with the upper macroblock row and ending with the
lower macroblock row. Data for the macroblocks is transmitted per macroblock
in increasing macroblock number.
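The scan order can be illustrated with a small Python sketch that maps a macroblock number to the position of its top-left luminance pixel inside a GOB. It assumes zero-based macroblock numbers within each GOB, which the text does not state explicitly; the function name `macroblock_position` is hypothetical.

```python
def macroblock_position(mb_number, luma_width, k=1):
    """Map a macroblock number to (x, y) of its top-left luminance pixel.

    Assumptions (not stated in the text): numbering is zero-based and
    restarts in every GOB; y is measured from the first line of the GOB.
    A GOB of k x 16 lines holds k macroblock rows of luma_width // 16
    macroblocks each.
    """
    mbs_per_row = luma_width // 16
    if not 0 <= mb_number < mbs_per_row * k:
        raise ValueError("macroblock number out of range for this GOB")
    # Horizontal scan: left to right within a row, then the next row down.
    row, col = divmod(mb_number, mbs_per_row)
    return col * 16, row * 16
```

The matching chrominance position is obtained by halving both coordinates, since each macroblock carries 8 by 8 samples of Cb and of Cr.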
Prediction
The prediction is inter-picture
and may be augmented by motion compensation. The coding mode in which prediction
is applied is called INTER; the coding mode is called INTRA if no
prediction is applied. The INTRA coding mode can be signalled at
the picture level (INTRA for I-pictures or INTER for P-pictures) or at
the macroblock level in P-pictures. In the optional PB-frames mode, B-pictures
are always coded in INTER mode. The B-pictures are partly predicted bidirectionally
(see the following figure).
For subpixel prediction, half-pixel values are found using bilinear
interpolation, as described in the following figure.
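Since the figure is not reproduced here, the following Python sketch shows one way bilinear half-pixel interpolation can be written. It assumes rounding offsets of +1 (two-sample average) and +2 (four-sample average) before integer division; these offsets are an assumption of this sketch, and the figure in the Recommendation is authoritative. The function name `half_pel_sample` and the list-of-rows picture layout are illustrative.

```python
def half_pel_sample(ref, x2, y2):
    """Sample the reference picture at position (x2 / 2, y2 / 2).

    ref    -- list of rows of non-negative integer samples
    x2, y2 -- non-negative coordinates in half-pixel units
    Rounding offsets (+1, +2) are an assumption of this sketch.
    """
    x, y = x2 // 2, y2 // 2          # integer pixel to the upper left
    half_x, half_y = x2 % 2, y2 % 2  # 1 if the position lies between pixels
    a = ref[y][x]
    if not half_x and not half_y:
        return a                                  # integer position
    if half_x and not half_y:
        return (a + ref[y][x + 1] + 1) // 2       # between horizontal neighbours
    if half_y and not half_x:
        return (a + ref[y + 1][x] + 1) // 2       # between vertical neighbours
    # Centre of four pixels.
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4
```

With `ref = [[10, 20], [30, 41]]`, the centre position `(1, 1)` gives (10 + 20 + 30 + 41 + 2) // 4 = 25.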
Motion compensation
The decoder will accept
one vector per macroblock or, if the Advanced Prediction mode is used, one
or four vectors per macroblock. If the PB-frames mode is used, one additional
delta vector can be transmitted per macroblock for adaptation of the motion
vectors for prediction of the B-macroblock.
Both horizontal and vertical
components of the motion vectors have integer or half-integer values. In
the default prediction mode, these values are restricted to the range [-16,
15.5] (this is also valid for the forward and backward motion vector components
for B-pictures). In the Unrestricted Motion Vector mode, however, the maximum
range for vector components is [-31.5, 31.5], with the restriction that
only values that are within a range of [-16, 15.5] around the predictor
for each motion vector component can be reached if the predictor is in
the range [-15.5, 16]. If the predictor is outside [-15.5, 16], all values
within the range [-31.5, 31.5] with the same sign as the predictor, plus
the zero value, can be reached.
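The range rules above can be restated as a short Python sketch. The function names are illustrative, and vector components are represented as floats in pixel units (multiples of 0.5 are exact in binary floating point); a real codec would more likely work in integer half-pixel units.

```python
def is_half_pel(v):
    """True if v is an integer or half-integer value."""
    return v * 2 == int(v * 2)


def reachable_default(v):
    """Default prediction mode: components lie in [-16, 15.5]."""
    return is_half_pel(v) and -16 <= v <= 15.5


def reachable_umv(v, predictor):
    """Unrestricted Motion Vector mode, following the rule in the text."""
    if not is_half_pel(v) or not -31.5 <= v <= 31.5:
        return False
    if -15.5 <= predictor <= 16:
        # Only values within [-16, 15.5] around the predictor are reachable.
        return -16 <= v - predictor <= 15.5
    # Predictor outside [-15.5, 16]: same sign as the predictor, or zero.
    return v == 0 or (v > 0) == (predictor > 0)
```

For a predictor of 16, for instance, the reachable components run from 0 to 31.5, so `reachable_umv(31.5, 16)` holds while `reachable_umv(-0.5, 16)` does not.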
A positive value of the
horizontal or vertical component of the motion vector signifies that the
prediction is formed from pixels in the referenced picture which are spatially
to the right of or below the pixels being predicted.
Quantization
The number of quantizers
is 1 for the first coefficient of INTRA blocks and 31 for all other coefficients.
Within a macroblock the same quantizer is used for all coefficients except
the first one of INTRA blocks. The decision levels are not defined. The
first coefficient of INTRA blocks is nominally the transform dc value, uniformly
quantized with a stepsize of 8. Each of the other 31 quantizers uses equally
spaced reconstruction levels with a central dead-zone around zero and with
a stepsize of an even value in the range 2 to 62.
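The 31 even stepsizes from 2 to 62 can be indexed by a quantizer number running from 1 to 31, as in the Python sketch below. The index name `quant` and the function names are assumptions of this sketch. Because the decision levels are not defined, `quantize_dead_zone` is only one hypothetical encoder-side choice (truncation toward zero), shown to illustrate what a dead-zone around zero means.

```python
INTRA_DC_STEP = 8  # stepsize for the first coefficient of INTRA blocks


def step_size(quant):
    """Stepsize for quantizer index quant (1..31): 2, 4, ..., 62."""
    if not 1 <= quant <= 31:
        raise ValueError("quantizer index must be in 1..31")
    return 2 * quant


def reconstruct_intra_dc(level):
    """Nominal reconstruction of the INTRA dc coefficient."""
    return INTRA_DC_STEP * level


def quantize_dead_zone(coef, quant):
    """Hypothetical encoder rule: truncate toward zero.

    Every coefficient with |coef| < stepsize maps to level 0, which
    produces a dead-zone around zero. Not mandated by the text.
    """
    level = abs(coef) // step_size(quant)
    return level if coef >= 0 else -level
```

With `quant = 3` the stepsize is 6, so coefficients from -5 to 5 all quantize to level 0.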
Coding control
Several parameters may be varied
to control the rate of generation of coded video data. These include processing
prior to the source coder, the quantizer, block significance criterion
and temporal subsampling. The proportions of such measures in the overall
control strategy are not subject to recommendation.
A decoder can signal its
preference for a certain tradeoff between spatial and temporal resolution
of the video signal. The encoder shall signal its default tradeoff at the
beginning of the call and shall indicate whether it is capable of responding
to decoder requests to change this tradeoff. The transmission method for
these signals is by external means (for example, Recommendation H.245).
Forced updating
This function is achieved by
forcing the use of the INTRA mode of the coding algorithm. The update pattern
is not defined. To control the accumulation of inverse transform mismatch
error, each macroblock shall be coded in INTRA mode at least once every
132 times when coefficients are transmitted for this macroblock in P-pictures.
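Since the update pattern is not defined, an encoder is free to choose its own bookkeeping. One possible approach, sketched in Python below, keeps a per-macroblock count of INTER codings with transmitted coefficients and forces INTRA on the 132nd. The class and method names are hypothetical, and reading the rule as "at most 131 consecutive INTER codings" is this sketch's interpretation.

```python
class ForcedUpdateTracker:
    """Tracks when each macroblock must next be coded in INTRA mode."""

    LIMIT = 132  # from the text: INTRA at least once every 132 coded times

    def __init__(self, num_macroblocks):
        # INTER codings with coefficients since the last INTRA coding.
        self.coded_since_intra = [0] * num_macroblocks

    def must_code_intra(self, mb_index):
        """True if the next coded occurrence has to be INTRA."""
        return self.coded_since_intra[mb_index] >= self.LIMIT - 1

    def record(self, mb_index, intra, coefficients_sent=True):
        """Update the count after a macroblock has been encoded."""
        if intra:
            self.coded_since_intra[mb_index] = 0
        elif coefficients_sent:
            self.coded_since_intra[mb_index] += 1
```

An encoder loop would call `must_code_intra` before choosing the mode of a macroblock and `record` after coding it; macroblocks for which no coefficients are transmitted do not advance the count.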
Byte alignment of start codes
Byte alignment of start codes
is achieved by inserting a stuffing codeword consisting of fewer than 8
zero bits before the start code, such that the first bit of the start code
is the first (most significant) bit of a byte. A start code is therefore
byte aligned if the position of its most significant bit is a multiple
of 8 bits from the first bit in the H.263 bit stream. All picture start
codes shall be byte aligned; GOB and EOS start codes may be byte aligned.
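The length of the stuffing codeword follows from the current bit position. A minimal Python sketch is given below, with the bit stream modelled as a string of '0' and '1' characters purely for illustration; the function names are hypothetical and the start code in the example is a placeholder, not an actual H.263 start code.

```python
def stuffing_bits(bit_position):
    """Number of zero bits (0..7) needed to reach the next byte boundary.

    bit_position is the number of bits already written to the stream.
    """
    return (-bit_position) % 8


def append_aligned_start_code(bits, start_code):
    """Append start_code to bits so that it begins on a byte boundary.

    Both arguments are strings of '0' and '1' characters.
    """
    return bits + "0" * stuffing_bits(len(bits)) + start_code
```

After 13 bits have been written, three zero bits are inserted, so the start code begins at bit position 16; when the stream is already aligned, no stuffing is added.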