Ultimedia Services Version 2 for AIX: Programmer's Guide and Reference

UMSPCM16toG723 Object

This audio encoder object converts monophonic 16-bit PCM data, sampled at 8000 Hz, into G.723.1 compressed audio. The G.723.1 coder is an ITU Draft Recommendation that is part of the H.323/H.324 family of standards. The G.723.1 standard supports two compression rates, 5.3 Kbit per second and 6.3 Kbit per second. It is possible to switch between the rates at any 30 ms frame boundary.
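The frame arithmetic implied by these figures can be sketched as follows. This is an illustration only: the helper names samples_per_frame and compressed_frame_bytes are hypothetical and are not part of the UMS API, and the byte counts are derived from the nominal bit rates by rounding up to whole bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the BitRate enumeration of this object. */
typedef enum { Rate53K, Rate63K } BitRate;

enum {
    SAMPLE_RATE_HZ = 8000, /* input sampling rate */
    FRAME_MS       = 30    /* frame duration */
};

/* Samples per frame: 8000 samples/s * 30 ms = 240. */
size_t samples_per_frame(void)
{
    return (size_t)SAMPLE_RATE_HZ * FRAME_MS / 1000;
}

/* Bytes per compressed speech frame, derived from the nominal bit rate
 * (hypothetical helper): bits = rate * 30 ms, rounded up to whole bytes. */
size_t compressed_frame_bytes(BitRate rate)
{
    unsigned long bits_per_second;
    unsigned long bits_per_frame;

    assert(rate == Rate53K || rate == Rate63K);
    bits_per_second = (rate == Rate63K) ? 6300UL : 5300UL;
    bits_per_frame = bits_per_second * FRAME_MS / 1000UL;
    return (size_t)((bits_per_frame + 7UL) / 8UL);
}
```

Under these assumptions a frame holds 240 input samples, and a compressed speech frame occupies 24 bytes at 6.3 Kbit per second or 20 bytes at 5.3 Kbit per second.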

The UMSPCM16toG723 object is a subclass of the UMSFilter object. The UMSPCM16toG723 object contains all the methods of the UMSFilter object, plus the methods listed below.

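To learn more about the UMSPCM16toG723 object, see the Algorithm Overview, Default Settings, Enumeration Lists, and Method Descriptions sections below.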

For introductory information, see Audio Codec Objects.

Algorithm Overview

The G.723 coder is based on the principles of linear prediction analysis-by-synthesis coding and attempts to minimize a perceptually weighted error signal. The encoder operates on frames of 240 samples (30 ms). Each frame is first high-pass filtered to remove the DC component and then divided into four sub-frames of 60 samples each. For every sub-frame, a 10th-order linear prediction coder (LPC) filter is computed using the unprocessed input signal. The unquantized LPC coefficients are used to construct the short-term perceptual weighting filter, which is applied to the entire frame to obtain the perceptually weighted speech signal.
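The first two steps of this description (DC removal and division into sub-frames) can be illustrated with a minimal sketch. It uses a generic first-order DC-blocking filter, not the coder's actual filter. The DcBlocker type, the function names, and the alpha parameter are assumptions made for illustration and do not appear in the UMS API.

```c
#include <stddef.h>

#define FRAME_SAMPLES    240  /* 30 ms at 8000 Hz */
#define SUBFRAME_SAMPLES 60
#define SUBFRAMES        (FRAME_SAMPLES / SUBFRAME_SAMPLES)

/* Filter memory carried from one frame to the next (hypothetical type). */
typedef struct {
    double prev_in;
    double prev_out;
} DcBlocker;

/* Generic first-order DC-blocking high-pass filter:
 *   y[n] = x[n] - x[n-1] + alpha * y[n-1]
 * alpha (0 < alpha < 1) is an illustrative pole value, not a value
 * taken from this guide. */
void dc_block_frame(DcBlocker *st, const short *in, double *out, double alpha)
{
    size_t n;

    for (n = 0; n < FRAME_SAMPLES; n++) {
        double x = (double)in[n];
        double y = x - st->prev_in + alpha * st->prev_out;

        st->prev_in = x;
        st->prev_out = y;
        out[n] = y;
    }
}

/* Returns a pointer to sub-frame k (0 to SUBFRAMES - 1) of a frame. */
const double *subframe(const double *frame, size_t k)
{
    return frame + k * SUBFRAME_SAMPLES;
}
```

A constant (pure DC) input decays toward zero at the filter output, which is the behavior the high-pass stage is meant to provide. The filter state must be preserved between frames so that frame boundaries do not introduce discontinuities.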

The G.723 coder was specifically optimized for speech and gives a high-quality representation at either of the supported bit rates. Music and other audio signals are not reproduced with the same quality as speech. This standard is not recommended for compressing audio content that is primarily music.

Default Settings

Filter Mode: OFF

BitRate: 6.3K

channelEncode: MONO

Voice Activity Detector: OFF

Noise Type: Regular

Enumeration Lists

enum BitRate { 
   Rate53K, //Compressed bit-rate of 5.3 kilobits per second
   Rate63K //Compressed bit-rate of 6.3 kilobits per second
   };
enum FilterMode { 
   Filter_On, //High-pass filter ON
   Filter_Off //High-pass filter OFF
   };
enum channelEncode {
   MONO, //Monophonic input signal 
   STEREO //Stereo input signal (not supported)
   }; 
enum VadMode {
   Vad_On, //Voice activity detector ON
   Vad_Off //Voice activity detector OFF
   };
enum NoiseType {
   Regular, //Normal silence frame
   Truncated //Shortened silence frame
   };
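
As an illustration, the enumerations and the default settings listed above can be mirrored in client code. This is a sketch under stated assumptions: EncoderSettings, default_encoder_settings, and settings_supported are hypothetical helpers rather than UMS declarations, and the numeric values of the enumerators in the real language bindings may differ from those of these plain C enums.

```c
/* Plain C mirrors of the IDL enumerations (enumerator values assumed). */
typedef enum { Rate53K, Rate63K } BitRate;
typedef enum { Filter_On, Filter_Off } FilterMode;
typedef enum { MONO, STEREO } channelEncode;
typedef enum { Vad_On, Vad_Off } VadMode;
typedef enum { Regular, Truncated } NoiseType;

/* Hypothetical helper type; not part of the UMS API. */
typedef struct {
    BitRate       bit_rate;
    FilterMode    filter_mode;
    channelEncode channels;
    VadMode       vad_mode;
    NoiseType     noise_type;
} EncoderSettings;

/* Returns the documented defaults: filter OFF, 6.3K, MONO, VAD OFF,
 * Regular noise frames. */
EncoderSettings default_encoder_settings(void)
{
    EncoderSettings s;

    s.bit_rate    = Rate63K;
    s.filter_mode = Filter_Off;
    s.channels    = MONO;
    s.vad_mode    = Vad_Off;
    s.noise_type  = Regular;
    return s;
}

/* Returns 1 if the settings describe a supported configuration.
 * STEREO input is listed as not supported. */
int settings_supported(const EncoderSettings *s)
{
    return s->channels == MONO;
}
```

Setting each field explicitly avoids relying on zero initialization, which in these mirror enums would select Filter_On and Vad_On rather than the documented defaults.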

Method Descriptions

ReturnCode set_bit_rate(in BitRate bit_rate);

Description

A method for setting the compressed bit rate.

Arguments
in BitRate bit_rate An enum type variable that holds the bit rate setting.
Return Values

Success

InvalidArgument

ReturnCode get_bit_rate(inout BitRate bit_rate);

Description

A method to query the state of the compressed bit rate.

Arguments
inout BitRate bit_rate A pointer to an enum type variable that holds the bit rate setting.
Return Values

Success

InvalidArgument

ReturnCode set_filter_mode(in FilterMode mode);

Description

There are two different optional filters in the G.723 coder. For the encoder, if the high-pass filter is enabled, the DC component is removed from the signal. For the decoder, the post-filter, if enabled, improves the perceived speech quality.

Some of the reasons for not using the filter are: 1) it adds a small amount of computational complexity to the algorithm, so the encoder or decoder runs slightly slower; 2) if DTMF or other non-speech signals are contained in the data, the filtering may alter them.

Both decisions are "single-sided"; that is, the encoding side decides whether to filter without knowing anything about the decoder, and the decoder decides whether to use post-filtering independently of how the signal was encoded. The bit stream carries no information about the filter mode setting.

Arguments
in FilterMode mode An enum variable that turns the filter ON or OFF.
Return Values

Success

InvalidArgument

ReturnCode get_filter_mode(inout FilterMode mode);

Description

A method to query the state of the encoder filter.

Arguments
inout FilterMode mode A pointer to an enum variable that contains the state of the filter, ON or OFF.
Return Values

Success

InvalidArgument

ReturnCode set_vad_mode(in VadMode mode, in NoiseType type);

Description

A method to set the voice activity mode to either ON or OFF, and to set the type of silence frame to either Regular or Truncated. If the voice activity detector is set to ON, set the NoiseType parameter to Regular to simplify the protocol between encoder and decoder. Setting NoiseType to Truncated results in a slightly smaller, nonstandard bit stream that requires a more complex protocol.

If the voice activity detector is ON, a NoiseType of Regular gives a four-byte compressed frame that attempts to mimic the background noise level during nonspeech frames. A NoiseType of Truncated gives a one-byte frame that contains only two bits of identifier information. If the Truncated mode is selected in the encoder, the first frame of detected silence is a four-byte VAD frame, and the second and subsequent frames of silence are one-byte truncated frames. You must replace each truncated frame with the previous four-byte VAD frame before sending it to the decoder.
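
The replacement step described above can be sketched as follows. This guide does not say how the identifier bits are encoded, so the sketch assumes the G.723.1 framing convention in which the two low-order bits of the first byte of each frame select the frame type (0 for a 24-byte 6.3K frame, 1 for a 20-byte 5.3K frame, 2 for a four-byte VAD frame, 3 for a one-byte truncated frame). The function names are hypothetical and are not part of the UMS API.

```c
#include <stddef.h>
#include <string.h>

#define VAD_FRAME_BYTES 4

/* Assumed frame-type codes in the two low-order bits of the first byte. */
enum { TYPE_RATE63 = 0, TYPE_RATE53 = 1, TYPE_SILENCE = 2, TYPE_TRUNCATED = 3 };

/* Length in bytes of the frame that starts with first_byte. */
size_t frame_length(unsigned char first_byte)
{
    switch (first_byte & 0x03) {
    case TYPE_RATE63:  return 24;
    case TYPE_RATE53:  return 20;
    case TYPE_SILENCE: return VAD_FRAME_BYTES;
    default:           return 1; /* TYPE_TRUNCATED */
    }
}

/* Copies a sequence of encoder output frames from in to out, replacing
 * each one-byte truncated frame with the most recent four-byte VAD frame.
 * Returns the number of bytes written, or 0 on error (incomplete frame,
 * output buffer too small, or a truncated frame with no earlier VAD
 * frame to copy). */
size_t expand_truncated(const unsigned char *in, size_t in_len,
                        unsigned char *out, size_t out_cap)
{
    unsigned char last_vad[VAD_FRAME_BYTES];
    int have_vad = 0;
    size_t ip = 0, op = 0;

    while (ip < in_len) {
        size_t len = frame_length(in[ip]);

        if (len > in_len - ip)
            return 0;                       /* incomplete frame */
        if (len == 1) {                     /* truncated silence frame */
            if (!have_vad || out_cap - op < VAD_FRAME_BYTES)
                return 0;
            memcpy(out + op, last_vad, VAD_FRAME_BYTES);
            op += VAD_FRAME_BYTES;
        } else {
            if (out_cap - op < len)
                return 0;
            memcpy(out + op, in + ip, len);
            if (len == VAD_FRAME_BYTES) {   /* remember latest VAD frame */
                memcpy(last_vad, in + ip, VAD_FRAME_BYTES);
                have_vad = 1;
            }
            op += len;
        }
        ip += len;
    }
    return op;
}
```

Because every truncated frame grows from one byte to four, the output buffer can need up to four times the size of the input. Selecting a NoiseType of Regular avoids this step entirely.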

Arguments
in VadMode mode A variable that turns the voice activity detector, or silence detector, ON and OFF.
in NoiseType type A variable that sets the type of silence frame returned to either Regular (a four-byte frame) or Truncated (a one-byte frame that carries a two-bit identifier).
Return Values

Success

InvalidArgument

ReturnCode get_vad_mode(inout VadMode mode, inout NoiseType type);

Description

A method to query the voice activity mode and the NoiseType frame.

Arguments
inout VadMode mode A pointer to a variable that holds the voice activity detector state.
inout NoiseType type A pointer to a variable that holds the noise frame type.
Return Values

Success

InvalidArgument

ReturnCode get_version(out string version_number);

Description

A method to query the version of the encoder. The version of the encoder and decoder must match.

Arguments
out string version_number A pointer to a character string containing the version identifier.
Return Values

Success

InvalidArgument

ReturnCode get_minimum_input_size(inout long minimum_size);

Description

A method to query the number of input bytes that must be available before a frame can be compressed. This method assumes an input data format of 16-bit PCM, that is, 2 bytes per input sample.

Arguments
inout long minimum_size A pointer to a long variable that holds the number of input bytes needed per compressed frame.
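
A client that buffers PCM input can use the reported size to decide how many frames are ready for compression. The sketch below is illustrative: InputSplit and split_input are hypothetical helpers, and the constant of 480 bytes is an assumption derived from the 240-sample frame and 2 bytes per sample. A real client should use the value returned by get_minimum_input_size.

```c
#include <stddef.h>

/* Assumed value: 240 samples per frame * 2 bytes per 16-bit sample.
 * Prefer the value reported by get_minimum_input_size. */
#define ASSUMED_MIN_INPUT_BYTES (240 * 2)

/* Hypothetical helper type; not part of the UMS API. */
typedef struct {
    size_t frames;   /* whole frames that can be compressed now */
    size_t leftover; /* bytes to keep buffered until more input arrives */
} InputSplit;

/* Splits the buffered byte count into whole frames and a remainder. */
InputSplit split_input(size_t available_bytes, size_t min_input_bytes)
{
    InputSplit r;

    if (min_input_bytes == 0) { /* guard against division by zero */
        r.frames = 0;
        r.leftover = available_bytes;
        return r;
    }
    r.frames = available_bytes / min_input_bytes;
    r.leftover = available_bytes % min_input_bytes;
    return r;
}
```

For example, with 1000 buffered bytes and a minimum input size of 480 bytes, two frames can be compressed and 40 bytes remain buffered for the next call.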
Return Values

Success

InvalidArgument
