Source Coder 

Source format

        Pictures are coded as luminance and two colour difference components (Y, Cb and Cr, the same as in MPEG). In H.263, there are five standardized picture formats: sub-QCIF, QCIF, CIF, 4CIF and 16CIF. For each of these picture formats, the luminance sampling structure is dx pixels per line, dy lines per picture in an orthogonal arrangement. Each of the two colour difference components is sampled at dx/2 pixels per line, dy/2 lines per picture, also orthogonal. The values of dx, dy, dx/2 and dy/2 are as follows:
 
Picture Format   Number of pixels     Number of lines      Number of pixels for   Number of lines for
                 for luminance (dx)   for luminance (dy)   chrominance (dx/2)     chrominance (dy/2)
sub-QCIF         128                  96                   64                     48
QCIF             176                  144                  88                     72
CIF              352                  288                  176                    144
4CIF             704                  576                  352                    288
16CIF            1408                 1152                 704                    576
    For each of the picture formats, colour difference samples are sited such that their block boundaries coincide with luminance block boundaries, as shown in Figure 3. All decoders shall be able to operate using sub-QCIF and QCIF. Encoders determine which of these two formats is used, and are not obliged to be able to operate with both.
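The relationship between the luminance and chrominance dimensions can be checked with a short sketch. The names `PICTURE_FORMATS`, `plane_sizes` and `samples_per_picture` are hypothetical helpers introduced here for illustration; only the dx and dy values come from the table above.

```python
# Luminance dimensions (dx, dy) for the five standardized formats,
# copied from the table above.
PICTURE_FORMATS = {
    "sub-QCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def plane_sizes(fmt):
    """Return ((Y width, Y height), (chroma width, chroma height))."""
    dx, dy = PICTURE_FORMATS[fmt]
    # Each colour difference component is sampled at half the luminance
    # resolution in both directions.
    return (dx, dy), (dx // 2, dy // 2)


def samples_per_picture(fmt):
    """Total number of samples in one picture: Y plus Cb plus Cr."""
    (dx, dy), (cx, cy) = plane_sizes(fmt)
    return dx * dy + 2 * cx * cy
```

For QCIF, `plane_sizes("QCIF")` gives a 176 x 144 luminance plane and two 88 x 72 chrominance planes, so the two chrominance planes together add half as many samples as the luminance plane.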

Video source coding algorithm

The source coder is shown in generalized form in Figure 4. The main elements are prediction, block transformation and quantization.

GOBs, macroblocks and blocks
       Each picture is divided into groups of blocks (GOBs). A GOB comprises k x 16 lines, depending on the picture format (k=1 for sub-QCIF, QCIF and CIF; k=2 for 4CIF; k=4 for 16CIF). The number of GOBs per picture is 6 for sub-QCIF, 9 for QCIF, and 18 for CIF, 4CIF and 16CIF. The GOBs are numbered using a vertical scan, starting with the upper GOB (number 0) and ending with the lower GOB. Data for each GOB consists of a GOB header followed by data for macroblocks. Data for GOBs is transmitted per GOB in increasing GOB number.
          Each GOB is divided into macroblocks. A macroblock relates to 16 pixels by 16 lines of Y (four luminance blocks) and the spatially corresponding 8 pixels by 8 lines of Cb and Cr (two chrominance blocks). Each luminance or chrominance block relates to 8 pixels by 8 lines of Y, Cb or Cr.
          The macroblocks are numbered using a horizontal scan of the macroblock rows from left to right, starting with the upper macroblock row and ending with the lower macroblock row. Data for the macroblocks is transmitted per macroblock in increasing macroblock number.
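The GOB counts quoted above follow directly from the picture height and the factor k, which the following sketch illustrates. The helper names are hypothetical, and the zero-based macroblock index is an assumption: the text gives the starting number for GOBs (0) but not for macroblocks.

```python
# (dx, dy, k) per format: luminance size and the GOB height factor k.
FORMATS = {
    "sub-QCIF": (128, 96, 1),
    "QCIF": (176, 144, 1),
    "CIF": (352, 288, 1),
    "4CIF": (704, 576, 2),
    "16CIF": (1408, 1152, 4),
}


def gob_layout(fmt):
    """Return (number of GOBs per picture, macroblocks per GOB)."""
    dx, dy, k = FORMATS[fmt]
    gob_lines = k * 16            # a GOB comprises k x 16 lines
    num_gobs = dy // gob_lines
    mb_per_row = dx // 16         # a macroblock is 16 pixels wide
    mb_per_gob = mb_per_row * k   # k macroblock rows fit in one GOB
    return num_gobs, mb_per_gob


def macroblock_index(mb_row, mb_col, mb_per_row):
    """Horizontal scan, row by row from the top.

    Zero-based indexing is an assumption made for this sketch.
    """
    return mb_row * mb_per_row + mb_col
```

With these definitions, `gob_layout("QCIF")` yields 9 GOBs of 11 macroblocks each, and `gob_layout("4CIF")` yields 18 GOBs of 88 macroblocks each.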
 
Prediction
      The prediction is inter-picture and may be augmented by motion compensation. The coding mode in which prediction is applied is called INTER; the coding mode is called INTRA if no prediction is applied. The INTRA coding mode can be signalled at the picture level (INTRA for I-pictures or INTER for P-pictures) or at the macroblock level in P-pictures. In the optional PB-frames mode, B-pictures are always coded in INTER mode. The B-pictures are partly predicted bidirectionally (see the following figure).

 

        For subpixel prediction, half-pixel values are found using bilinear interpolation, as described in the following figure.
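Since the figure is not reproduced here, the sketch below shows one way bilinear half-pixel interpolation can be written. The rounding offsets (+1 before dividing by 2, +2 before dividing by 4) follow the usual H.263 formulation as recalled from the Recommendation and should be treated as an assumption; `half_pel_sample` is a hypothetical helper, and coordinates are assumed to be non-negative and inside the reference picture.

```python
def half_pel_sample(ref, x2, y2):
    """Fetch a prediction sample at a half-pixel position.

    ref    -- reference picture as a list of rows of integer samples
    x2, y2 -- position in half-pixel units (2 * pixel position),
              assumed non-negative and within the picture
    """
    x, y = x2 // 2, y2 // 2      # integer pixel position (sample A)
    fx, fy = x2 % 2, y2 % 2      # 1 if the position is a half-pixel one
    a = ref[y][x]
    if not fx and not fy:
        return a                                   # integer position
    if fx and not fy:
        return (a + ref[y][x + 1] + 1) // 2        # horizontal half
    if fy and not fx:
        return (a + ref[y + 1][x] + 1) // 2        # vertical half
    # Diagonal half position: average of the four surrounding samples.
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4
```

Working in half-pixel units keeps all arithmetic in integers, which matches how motion vectors with half-integer values are typically stored.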

 

 
Motion compensation
      The decoder will accept one vector per macroblock or, if the Advanced Prediction mode is used, one or four vectors per macroblock. If the PB-frames mode is used, one additional delta vector can be transmitted per macroblock for adaptation of the motion vectors for prediction of the B-macroblock.
        Both horizontal and vertical components of the motion vectors have integer or half-integer values. In the default prediction mode, these values are restricted to the range [-16, 15.5] (this also applies to the forward and backward motion vector components for B-pictures). In the Unrestricted Motion Vector mode, however, the maximum range for vector components is [-31.5, 31.5], with the restriction that only values within a range of [-16, 15.5] around the predictor for each motion vector component can be reached if the predictor is in the range [-15.5, 16]. If the predictor is outside [-15.5, 16], all values within the range [-31.5, 31.5] that have the same sign as the predictor, plus the zero value, can be reached.
        A positive value of the horizontal or vertical component of the motion vector signifies that the prediction is formed from pixels in the referenced picture which are spatially to the right or below the pixels being predicted.
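The range rules above can be expressed as a small validity check. This is a sketch under the assumption that vector components are stored in half-pixel units (the value multiplied by 2), so that 15.5 becomes 31 and 31.5 becomes 63; `mv_component_allowed` is a hypothetical helper name.

```python
def mv_component_allowed(value2, predictor2=0, unrestricted=False):
    """Check whether one motion vector component can be reached.

    value2, predictor2 -- component and its predictor in half-pixel
                          units (2 * value), an assumed representation
    unrestricted       -- True for the Unrestricted Motion Vector mode
    """
    if not unrestricted:
        # Default prediction mode: [-16, 15.5].
        return -32 <= value2 <= 31
    # Unrestricted mode: overall range is [-31.5, 31.5].
    if not -63 <= value2 <= 63:
        return False
    if -31 <= predictor2 <= 32:
        # Predictor in [-15.5, 16]: window of [-16, 15.5] around it.
        return -32 <= value2 - predictor2 <= 31
    # Predictor outside [-15.5, 16]: same sign as the predictor, or zero.
    if value2 == 0:
        return True
    return (value2 > 0) == (predictor2 > 0)
```

For example, with a predictor of 10.0 (20 in half-pixel units) the value 25.5 is reachable in the unrestricted mode, while 26.0 is not, because it lies 16.0 above the predictor.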

    Differential motion vectors

 
Quantization
      The number of quantizers is 1 for the first coefficient of INTRA blocks and 31 for all other coefficients. Within a macroblock, the same quantizer is used for all coefficients except the first one of INTRA blocks. The decision levels are not defined. The first coefficient of INTRA blocks is nominally the transform dc value, uniformly quantized with a stepsize of 8. Each of the other 31 quantizers uses equally spaced reconstruction levels with a central dead-zone around zero and a stepsize of an even value in the range 2 to 62.
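The reconstruction side of the quantizer can be sketched as follows. The text above only fixes the stepsizes, so the exact reconstruction formula used here is an assumption recalled from the inverse quantization clause of the Recommendation and should be verified against it; the quantizer parameter `quant` (1 to 31, giving a stepsize of 2 x quant) and the function name `dequantize` are likewise introduced for illustration. Clipping of the reconstructed value is omitted.

```python
def dequantize(level, quant, intra_dc=False):
    """Reconstruct a transform coefficient from its quantized level.

    level    -- quantized level (signed integer)
    quant    -- quantizer parameter 1..31; stepsize is 2 * quant
                (assumed parameterization, not stated in the text)
    intra_dc -- True for the first coefficient of an INTRA block
    """
    if intra_dc:
        # Uniform quantizer with a stepsize of 8.
        return 8 * level
    if not 1 <= quant <= 31:
        raise ValueError("quant must be in the range 1..31")
    if level == 0:
        return 0  # central dead-zone: zero reconstructs to zero
    # Assumed formula: equally spaced levels, 2 * quant apart.
    magnitude = quant * (2 * abs(level) + 1)
    if quant % 2 == 0:
        magnitude -= 1
    return magnitude if level > 0 else -magnitude
```

Consecutive nonzero levels reconstruct to values 2 x quant apart, which matches the even stepsizes from 2 to 62 described above. No forward quantizer is shown, since the decision levels are left to the encoder.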

Coding control

        Several parameters may be varied to control the rate of generation of coded video data. These include processing prior to the source coder, the quantizer, block significance criterion and temporal subsampling. The proportions of such measures in the overall control strategy are not subject to recommendation.
        A decoder can signal its preference for a certain tradeoff between spatial and temporal resolution of the video signal. The encoder shall signal its default tradeoff at the beginning of the call and shall indicate whether it is capable of responding to decoder requests to change this tradeoff. The transmission method for these signals is by external means (for example, Recommendation H.245).

Forced updating

        This function is achieved by forcing the use of the INTRA mode of the coding algorithm. The update pattern is not defined. To control the accumulation of inverse transform mismatch error, each macroblock shall be coded in INTRA mode at least once every 132 times when coefficients are transmitted for this macroblock in P-pictures.
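One way an encoder might enforce the 132-times rule is to keep a per-macroblock counter, as in the sketch below. The class `ForcedUpdateTracker` and its methods are hypothetical, and the interpretation is a conservative assumption: after 131 consecutive INTER codings with transmitted coefficients, the next coding of that macroblock is forced to INTRA. The update pattern itself remains an encoder choice.

```python
class ForcedUpdateTracker:
    """Counts INTER codings with coefficients since the last INTRA."""

    LIMIT = 132  # INTRA at least once every 132 coded occurrences

    def __init__(self, num_macroblocks):
        self.coded_since_intra = [0] * num_macroblocks

    def must_code_intra(self, mb):
        """True if the next coding of macroblock mb has to be INTRA."""
        return self.coded_since_intra[mb] >= self.LIMIT - 1

    def record(self, mb, intra, coefficients_sent):
        """Update the counter after macroblock mb has been coded."""
        if intra:
            self.coded_since_intra[mb] = 0
        elif coefficients_sent:
            # Only occurrences with transmitted coefficients count.
            self.coded_since_intra[mb] += 1
```

A typical use would be to call `must_code_intra` before choosing the mode for each macroblock in a P-picture and `record` once the macroblock has been coded.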

Byte alignment of start codes

        Byte alignment of start codes is achieved by inserting a stuffing codeword consisting of fewer than 8 zero-bits before the start code, such that the first bit of the start code is the first (most significant) bit of a byte. A start code is therefore byte aligned if the position of its most significant bit is a multiple of 8 bits from the first bit in the H.263 bit stream. All picture start codes shall be byte aligned; GOB and EOS start codes may be byte aligned.
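The number of stuffing bits follows from the current bit position, as the sketch below shows. The function names are hypothetical, and representing the bit stream as a Python list of 0/1 integers is an assumption made purely for readability.

```python
def stuffing_bits(bit_position):
    """Number of zero-bits (0..7) needed to reach the next byte boundary.

    bit_position -- number of bits already written to the bit stream
    """
    return (-bit_position) % 8


def align_for_start_code(bits):
    """Return a copy of the bit list padded with zero-bits to a byte boundary."""
    return bits + [0] * stuffing_bits(len(bits))
```

If 13 bits have been written, `stuffing_bits(13)` is 3, so the start code begins at bit 16; if the stream is already aligned, no stuffing is inserted.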