This audio encoder object converts monophonic 16-bit PCM data, sampled at 8000 Hz, into G.723.1 compressed audio. The G.723.1 coder is an ITU Draft Recommendation that is part of the H.323/H.324 family of standards. The G.723.1 standard supports two compression rates, 5.3 kbit per second and 6.3 kbit per second. It is possible to switch between the rates at any 30 ms frame boundary.
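As a rough illustration of what the two rates mean per 30 ms frame, the following sketch computes the number of bytes needed to hold one compressed frame from the nominal bit rate. The helper name `g7231_frame_bytes` is hypothetical and is not part of the UMSPCM16toG723 interface; the rounding to whole bytes is an assumption of this sketch.

```c
/* Frame duration stated in the text: 30 ms. */
#define G7231_FRAME_MS 30u

/* Hypothetical helper: bytes needed to hold one compressed frame at a
   nominal bit rate given in bits per second. The bit count is rounded
   up to a whole number of bytes. */
unsigned g7231_frame_bytes(unsigned bits_per_second)
{
    unsigned bits = bits_per_second * G7231_FRAME_MS / 1000u;
    return (bits + 7u) / 8u;
}
```

Under these assumptions, 5300 bit/s gives 159 bits (20 bytes) per frame and 6300 bit/s gives 189 bits (24 bytes) per frame.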
The UMSPCM16toG723 object is a subclass of the UMSFilter object. The UMSPCM16toG723 object contains all the methods of the UMSFilter object, plus the methods listed below.
For introductory information, see Audio Codec Objects.
The G.723 coder is based on the principles of linear prediction analysis-by-synthesis coding and attempts to minimize a perceptually weighted error signal. The encoder operates on frames of 240 samples (30 ms at 8000 Hz). Each frame is first high-pass filtered to remove the DC component and then divided into four sub-frames of 60 samples each. For every sub-frame, a 10th-order linear prediction coder (LPC) filter is computed using the unprocessed input signal. The unquantized LPC coefficients are used to construct the short-term perceptual weighting filter, which is applied to the entire frame to obtain the perceptually weighted speech signal.
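The frame layout described above (240 samples split into four sub-frames of 60 samples) can be sketched as follows. This is only an indexing illustration; the names `FRAME_SAMPLES`, `SUBFRAME_SAMPLES`, and `subframe_start` are hypothetical and do not come from the encoder object.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    FRAME_SAMPLES    = 240, /* one 30 ms frame at 8000 Hz */
    SUBFRAME_SAMPLES = 60,  /* each frame holds four sub-frames */
    SUBFRAMES        = FRAME_SAMPLES / SUBFRAME_SAMPLES
};

/* Hypothetical helper: return a pointer to the first sample of
   sub-frame `index` (0..3), or NULL if the arguments are invalid. */
const int16_t *subframe_start(const int16_t *frame, int index)
{
    if (frame == NULL || index < 0 || index >= SUBFRAMES)
        return NULL;
    return frame + (size_t)index * SUBFRAME_SAMPLES;
}
```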
The G.723 coder was specifically optimized for speech, and gives a high quality representation at either of the supported bit rates. Music and other audio signals are not reproduced with the same quality as speech. It is not recommended that this standard be used to compress audio content that is primarily music.
enum BitRate {
    Rate53K,    // Compressed bit rate of 5.3 kilobits per second
    Rate63K     // Compressed bit rate of 6.3 kilobits per second
};

enum FilterMode {
    Filter_On,  // High-pass filter ON
    Filter_Off  // High-pass filter OFF
};

enum channelEncode {
    MONO,       // Monophonic input signal
    STEREO      // Stereo input signal (not supported)
};

enum VadMode {
    Vad_On,     // Voice activity detector ON
    Vad_Off     // Voice activity detector OFF
};

enum NoiseType {
    Regular,    // Normal silence frame
    Truncated   // Shortened silence frame
};
A method for setting the compressed bit rate.
in BitRate bit_rate | An enum type variable that holds the bit rate setting. |
A method to query the state of the compressed bit rate.
inout BitRate bit_rate | A pointer to an enum type variable that holds the bit rate setting. |
There are two different optional filters in the G.723 coder. For the encoder, if the high-pass filter is enabled, the DC component is removed from the signal. For the decoder, the post-filter, if enabled, improves the perceived speech quality.
Reasons for not using a filter include: 1) it adds a little computational complexity to the algorithm, so the encoder or decoder runs slightly slower; 2) if DTMF or other non-speech signals are contained in the data, they may be altered by the filtering.
Both decisions are "single sided": the encoding side decides whether to filter without knowing anything about the decoder, and the decoder decides whether to use post-filtering independently of how the signal was encoded. No information about the filter mode setting is included in the bit stream.
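To make the encoder-side DC removal concrete, here is a minimal sketch of a generic first-order DC-blocking high-pass filter. It is not taken from the encoder object. The structure `DcBlocker`, the function names, and the pole value passed by the caller are assumptions for illustration, and the filter inside the actual coder may differ in form and precision.

```c
#include <stddef.h>

/* State of a first-order DC-blocking filter:
   y[n] = x[n] - x[n-1] + pole * y[n-1] */
typedef struct {
    double prev_in;   /* x[n-1] */
    double prev_out;  /* y[n-1] */
    double pole;      /* assumed value slightly below 1.0 */
} DcBlocker;

/* Reset the filter state and set the pole. */
void dc_blocker_init(DcBlocker *f, double pole)
{
    f->prev_in = 0.0;
    f->prev_out = 0.0;
    f->pole = pole;
}

/* Filter n 16-bit PCM samples from `in` into `out`. */
void dc_blocker_process(DcBlocker *f, const short *in, double *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        double x = (double)in[i];
        double y = x - f->prev_in + f->pole * f->prev_out;
        f->prev_in = x;
        f->prev_out = y;
        out[i] = y;
    }
}
```

With a constant (pure DC) input, the output of this sketch decays toward zero, which is the behavior the high-pass filter option is meant to provide.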
in FilterMode mode | An enum variable that turns the filter ON or OFF. |
A method to query the state of the encoder filter.
inout FilterMode mode | A pointer to an enum variable that contains the state of the filter, ON or OFF. |
A method to set the voice activity mode to either ON or OFF, and to set the type of silence frame to either Regular or Truncated. If the voice activity detector is set to ON, set the NoiseType parameter to Regular to simplify the protocol between encoder and decoder. Setting NoiseType to Truncated results in a slightly smaller, nonstandard bit stream that requires a more complex protocol.
If the voice activity detector is ON, a NoiseType of Regular gives a four-byte compressed frame that attempts to mimic the background noise level during nonspeech frames. A NoiseType of Truncated gives a one-byte frame that contains only two bits of identifier information. If the Truncated mode is selected in the encoder, the first frame of detected silence is a four-byte VAD frame, and the second and subsequent frames of silence are one-byte truncated frames. You must replace each truncated frame with the previous four-byte VAD frame before it can be sent to the decoder.
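The replacement step for Truncated mode can be sketched as below. This is an illustration under stated assumptions, not the library's own mechanism: the `SilenceExpander` type and `expand_frame` function are hypothetical, and the sketch tells frame types apart purely by length (4 bytes for a VAD frame, 1 byte for a truncated frame) rather than by inspecting the identifier bits.

```c
#include <stddef.h>
#include <string.h>

enum { VAD_FRAME_BYTES = 4, TRUNCATED_FRAME_BYTES = 1 };

/* Remembers the most recent four-byte VAD frame. */
typedef struct {
    unsigned char last_vad[VAD_FRAME_BYTES];
    int have_vad;  /* nonzero once a VAD frame has been seen */
} SilenceExpander;

/* Hypothetical helper: copy the frame that should go to the decoder
   into `out` and return its length in bytes, or 0 on error.
   A one-byte truncated frame is replaced by the stored VAD frame;
   every other frame is passed through unchanged. */
size_t expand_frame(SilenceExpander *s,
                    const unsigned char *in, size_t in_len,
                    unsigned char *out, size_t out_cap)
{
    if (in_len == TRUNCATED_FRAME_BYTES) {
        if (!s->have_vad || out_cap < VAD_FRAME_BYTES)
            return 0;  /* nothing to substitute, or no room */
        memcpy(out, s->last_vad, VAD_FRAME_BYTES);
        return VAD_FRAME_BYTES;
    }
    if (out_cap < in_len)
        return 0;
    if (in_len == VAD_FRAME_BYTES) {
        memcpy(s->last_vad, in, VAD_FRAME_BYTES);  /* remember for later */
        s->have_vad = 1;
    }
    memcpy(out, in, in_len);
    return in_len;
}
```

Keeping the last VAD frame in a small state object lets the caller run every encoder output frame through the same function, whatever its type.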
A method to query the voice activity mode and the NoiseType frame.
inout VadMode mode | A pointer to a variable that holds the voice activity detector state. |
inout NoiseType type | A pointer to a variable that holds the noise frame type. |
A method to query the version of the encoder. The version of the encoder and decoder must match.
out string version_number | A pointer to a character string containing the version identifier. |
A method to query the number of input bytes that must be available before a frame can be compressed. This method assumes an input data format of 16-bit PCM, that is, 2 bytes per input sample.
inout long minimum_size | A pointer to a long variable that holds the number of input bytes needed per compressed frame. |
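For orientation, the frame size given earlier (240 samples) and the 2-byte sample format imply 480 input bytes per frame. The sketch below uses that derived figure to count how many complete frames a buffer holds. The constant and function names are hypothetical, and in real code the value returned by the query method should be used instead of a hard-coded constant.

```c
#include <stddef.h>

enum {
    SAMPLES_PER_FRAME     = 240, /* 30 ms at 8000 Hz */
    BYTES_PER_SAMPLE      = 2,   /* 16-bit PCM */
    BYTES_PER_INPUT_FRAME = SAMPLES_PER_FRAME * BYTES_PER_SAMPLE /* 480 */
};

/* Hypothetical helper: number of complete input frames contained in
   a buffer of `available_bytes` bytes; leftover bytes are ignored. */
size_t complete_input_frames(size_t available_bytes)
{
    return available_bytes / BYTES_PER_INPUT_FRAME;
}
```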