AVI File Access

Ultimedia Services Version 2 for AIX: Programmer's Guide and Reference

The audio/video interleaved (AVI) file format is a RIFF file specification. It permits audio and video data to be interleaved in a file, so that data for the separate audio and video streams can be accessed in alternating chunks for playback or recording while maintaining sequential access on the file device.

The Ultimedia Services UMSAVIReadWrite object provides methods to locate, read, and write the data and headers of an AVI file. The object uses the UMSRiffReadWrite object to access the file but provides a method interface designed specifically for AVI file access. Sample programs that use AVI files are located in the UMS install directory.

To learn more about the UMSAVIReadWrite object, see:

For additional information on the structure of the AVI file, see:

"AVI Files," Microsoft Multimedia Technical Note, November 10, 1992.
"AVI JPEG File Format Proposal," C-Cube Microsystems, January 20, 1993.

For introductory information, see Programming with Formatted File Access Objects.

Note: At the time Ultimedia Services was under development, the definition for Joint Photographic Experts Group (JPEG) and Motion JPEG (MJPEG) in AVI was in draft form. This draft did not clearly define a standard location for the Huffman tables. The implementation is based on information available at the time and can be modified in future releases to comply with a final standard. This implementation includes the Huffman tables (the X'FF' DHT segment) in each frame of an AVI sequence rather than using an abbreviated JPEG format. Since this is suspected to be different from the MJPEG standard, the biCompression field of the BITMAPINFOHEADER is coded "mJPG" rather than "MJPG".
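
The FOURCC comparison that this note implies can be sketched as follows. This is a minimal illustration, assuming that the strf chunk of a video stream begins with the standard 40-byte Windows BITMAPINFOHEADER stored in LSB-first order; the function name video_compression is hypothetical and is not part of Ultimedia Services.

```python
import struct

# Assumed layout: the 40-byte Windows BITMAPINFOHEADER, little-endian.
# biCompression is the 4-byte field at offset 16 (sixth field).
BITMAPINFOHEADER = struct.Struct("<IiiHH4sIiiII")

def video_compression(strf_bytes):
    """Return the biCompression FOURCC of a video strf chunk as text."""
    fields = BITMAPINFOHEADER.unpack_from(strf_bytes)
    return fields[5].decode("ascii")

# Build a sample header for a 320x240, 24-bit stream coded "mJPG".
sample = BITMAPINFOHEADER.pack(40, 320, 240, 1, 24, b"mJPG", 0, 0, 0, 0, 0)

fourcc = video_compression(sample)
is_ums_style = fourcc == "mJPG"   # Huffman tables present in every frame
is_standard = fourcc == "MJPG"    # FOURCC comparison is case sensitive
print(fourcc, is_ums_style, is_standard)
```

Because the two codes differ only in the case of the first letter, a reader that compares FOURCC values without regard to case cannot distinguish them.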

File Format

The AVI file is a RIFF file that consists of an AVI RIFF chunk with a registered FOURCC of AVI followed by a space (a FOURCC is always four characters). The AVI RIFF chunk includes two mandatory list chunks with FOURCCs of hdrl and movi. It can also include an optional index chunk with a FOURCC of idx1.

The hdrl chunk is a header chunk that defines the format of the included data. The movi chunk contains the data, and the idx1 chunk is an optional index to the data. Specifics concerning the hdrl, strl, and movi list chunks and the idx1 chunk follow.

The expanded form of the AVI file is illustrated in the RIFF syntax that follows:

RIFF ( 'AVI '
        LIST ( 'hdrl'
                'avih' ( <Main AVI Header> )
                LIST ( 'strl'
                       'strh' ( <Stream 1 Header> )
                       'strf' ( <Stream 1 Format> )
                       [ 'strd' ( <Stream 1 optional codec data> ) ]
                )
                LIST ( 'strl'
                       'strh' ( <Stream 2 Header> )
                       'strf' ( <Stream 2 Format> )
                       [ 'strd' ( <Stream 2 optional codec data> ) ]
                )
                ...
        )
        LIST ( 'movi'
                {
                        '##dc' ( <compressed DIB> )
                        |
                        LIST ( 'rec '
                                '##dc' ( <compressed DIB> )
                                '##db' ( <uncompressed DIB> )
                                '##wb' ( <audio data> )
                                '##pc' ( <palette change> )
                                ...
                        )
                }
                ...
        )
        [ 'idx1' ( <AVI Index> ) ]
)
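
The nesting shown in the syntax above can be explored with a small chunk walker. The following is a minimal sketch that operates on bytes in memory and does not use the UMSAVIReadWrite or UMSRiffReadWrite objects; the helper names chunk and walk are hypothetical. It assumes the general RIFF rules that each chunk is a 4-byte identifier, a 4-byte LSB-first size, and the data, padded to an even length.

```python
import struct

def chunk(fourcc, payload):
    """Serialize one RIFF chunk: id, little-endian size, data, pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return fourcc + struct.pack("<I", len(payload)) + payload + pad

def walk(data, start=0, end=None, depth=0, out=None):
    """Collect (depth, id, form or list type, size) for every chunk."""
    out = [] if out is None else out
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        ckid, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if ckid in (b"RIFF", b"LIST"):
            kind = data[body:body + 4]          # 'AVI ', 'hdrl', 'movi', ...
            out.append((depth, ckid, kind, size))
            walk(data, body + 4, body + size, depth + 1, out)
        else:
            out.append((depth, ckid, None, size))
        pos = body + size + (size & 1)          # skip data and pad byte
    return out

# Build a tiny AVI-shaped file in memory (payloads are placeholders).
hdrl = chunk(b"LIST", b"hdrl" + chunk(b"avih", bytes(56)))
movi = chunk(b"LIST", b"movi" + chunk(b"00dc", b"\x01\x02\x03"))
avi = chunk(b"RIFF", b"AVI " + hdrl + movi + chunk(b"idx1", bytes(16)))

for depth, ckid, kind, size in walk(avi):
    label = ckid.decode() + (" " + repr(kind.decode()) if kind else "")
    print("  " * depth + label, size)
```

The walker treats only RIFF and LIST chunks as containers, which matches the syntax above: avih, strh, strf, strd, the stream data subchunks, and idx1 are leaf chunks.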

hdrl List Chunk

The hdrl list chunk contains the avih chunk (the main AVI header) and a strl list chunk for each data stream contained in the file. The avih chunk provides general information about the file, including:

Number of streams in the file (dwStreams)
Width of the video image (dwWidth)
Height of the video image (dwHeight)
Frame duration (dwMicroSecPerFrame)
Number of frames in the file (dwTotalFrames)
Whether the file has an index (AVIF_HASINDEX bit of dwFlags)
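
The fields listed above can be pulled out of a raw avih payload as in the following sketch. It assumes the main AVI header layout from the Microsoft AVI definition (14 LSB-first 32-bit fields, 56 bytes) and the value X'10' for AVIF_HASINDEX; neither is stated in this section, and the function name summarize_avih is hypothetical. When the UMSAVIReadWrite object is used, the get_avi_header method returns the header already byte swapped, so this decoding is needed only for raw file data.

```python
import struct

AVIF_HASINDEX = 0x00000010  # assumed value, from the Microsoft AVI definition

# Assumed layout: 14 little-endian DWORDs (56 bytes). Field order:
# 0 dwMicroSecPerFrame, 1 dwMaxBytesPerSec, 2 reserved, 3 dwFlags,
# 4 dwTotalFrames, 5 dwInitialFrames, 6 dwStreams,
# 7 dwSuggestedBufferSize, 8 dwWidth, 9 dwHeight, 10-13 reserved.
MAIN_AVI_HEADER = struct.Struct("<14I")

def summarize_avih(payload):
    """Decode the fields of interest from a raw avih chunk payload."""
    f = MAIN_AVI_HEADER.unpack_from(payload)
    return {
        "dwMicroSecPerFrame": f[0],
        "dwTotalFrames": f[4],
        "dwStreams": f[6],
        "dwWidth": f[8],
        "dwHeight": f[9],
        "has_index": bool(f[3] & AVIF_HASINDEX),
        # Frame duration in microseconds converts directly to a frame rate.
        "frames_per_second": 1_000_000 / f[0] if f[0] else None,
    }

# Sample header: 40000 microseconds per frame, 150 frames, 2 streams, 320x240.
sample = MAIN_AVI_HEADER.pack(40000, 0, 0, AVIF_HASINDEX, 150, 0, 2, 0,
                              320, 240, 0, 0, 0, 0)
info = summarize_avih(sample)
print(info)
```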

strl List Chunk

The stream list (strl) chunk contains information about the streams in the file. A strl list chunk contains a stream header (strh) chunk, a stream format (strf) chunk, and an optional stream data (strd) chunk.

The strh includes general information about the stream data, such as whether it is audio or video (fccType set to auds or vids) and an indication of the maximum stream data chunk size (dwSuggestedBufferSize).

The strf contains specific information about the stream data. This chunk is not a fixed size, and the data in it maps to different structure definitions depending on the data type and data compression.
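
As an illustration of the strh fields mentioned above, the following sketch classifies a stream and reads its suggested buffer size from a raw strh payload. It assumes the 1992 AVIStreamHeader layout (two FOURCCs followed by ten LSB-first 32-bit fields, with dwSuggestedBufferSize at offset 36); later definitions append further fields, which unpack_from ignores. The function name describe_stream is hypothetical.

```python
import struct

# Assumed layout (1992 AVIStreamHeader): fccType, fccHandler, dwFlags,
# dwReserved1, dwInitialFrames, dwScale, dwRate, dwStart, dwLength,
# dwSuggestedBufferSize, dwQuality, dwSampleSize.
STREAM_HEADER = struct.Struct("<4s4s10I")

def describe_stream(strh_bytes):
    """Return (stream kind, dwSuggestedBufferSize) for a raw strh payload."""
    f = STREAM_HEADER.unpack_from(strh_bytes)
    kinds = {b"auds": "audio", b"vids": "video"}
    return kinds.get(f[0], "unknown"), f[9]

# Sample video stream header with a 16 KB suggested buffer size.
sample = STREAM_HEADER.pack(b"vids", b"mJPG", 0, 0, 0, 1, 25, 0, 150,
                            16384, 0, 0)
kind, buffer_size = describe_stream(sample)
print(kind, buffer_size)
```

An application can use the largest dwSuggestedBufferSize value across the selected streams to size a single reusable read buffer.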

movi List Chunk

The movi list chunk contains the stream data in subchunks. The stream data subchunks can exist directly in the movi chunk or can be grouped into rec chunks. Grouping the subchunks into rec chunks is a means of defining the optimal read size for input/output (I/O) performance.

The stream data should be read in rec chunks. The stream data subchunks have 4-character codes that identify the data as a compressed DIB (##dc), uncompressed DIB (##db), or audio data (##wb).
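
The 4-character codes can be decoded as in the following sketch, which splits a code into its stream number and data type. It assumes that the ## prefix is a two-digit decimal stream number; the function name parse_stream_chunk_id is hypothetical. The pc (palette change) suffix is taken from the RIFF syntax shown earlier.

```python
KINDS = {
    "dc": "compressed DIB",
    "db": "uncompressed DIB",
    "wb": "audio data",
    "pc": "palette change",
}

def parse_stream_chunk_id(fourcc):
    """Split a code such as '01wb' into (stream number, data type)."""
    number, suffix = fourcc[:2], fourcc[2:]
    if len(fourcc) != 4 or not number.isdigit() or suffix not in KINDS:
        raise ValueError("not a stream data chunk id: %r" % fourcc)
    return int(number), KINDS[suffix]

print(parse_stream_chunk_id("00dc"))
print(parse_stream_chunk_id("01wb"))
```

Codes that do not follow this pattern, such as the rec list type, raise ValueError so that the caller can treat them as containers rather than stream data.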

idx1 Chunk

The optional idx1 chunk contains a sequence of AVIINDEXENTRY structures, one for each rec chunk and stream data subchunk in the movi chunk. The AVIINDEXENTRY structures identify each data chunk and its location in the file.
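
The index layout can be illustrated with the following sketch, which decodes a raw idx1 payload. It assumes the AVIINDEXENTRY layout from the Microsoft AVI definition (ckid, dwFlags, dwChunkOffset, and dwChunkLength, 16 bytes per entry, LSB first) and the value X'10' for the key-frame flag; the names parse_idx1 and AVIIF_KEYFRAME as used here are part of the illustration, not of the object interface.

```python
import struct

AVIIF_KEYFRAME = 0x00000010  # assumed value, from the Microsoft AVI definition

# Assumed layout: ckid, dwFlags, dwChunkOffset, dwChunkLength.
INDEX_ENTRY = struct.Struct("<4sIII")

def parse_idx1(payload):
    """Decode a raw idx1 payload into a list of index entries."""
    return [
        {"ckid": ckid.decode("ascii"), "dwFlags": flags,
         "dwChunkOffset": offset, "dwChunkLength": length}
        for ckid, flags, offset, length in INDEX_ENTRY.iter_unpack(payload)
    ]

# Two sample entries: a key-frame video chunk followed by an audio chunk.
payload = (INDEX_ENTRY.pack(b"00dc", AVIIF_KEYFRAME, 4, 3000)
           + INDEX_ENTRY.pack(b"01wb", 0, 3012, 1764))
entries = parse_idx1(payload)
for entry in entries:
    print(entry)
```

When the object is used, the get_index_size and get_index methods provide the same information without manual decoding.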

UMSAVIReadWrite Object Method Calls

UMSAVIReadWrite object method calls include:

open	Opens an AVI file.
close	Closes an AVI file.
flush	Flushes all data to file.
seek_to_frame	Seeks to a frame number.
put_avi_header	Writes main AVI header.
get_avi_header	Returns main AVI header.
put_stream_headers	Writes AVIStreamHeader structures.
get_stream_headers	Returns AVIStreamHeader structures.
put_strf_data	Writes strf data chunks.
get_strf_data	Returns the strf data chunks.
get_strf_sizes	Returns the sizes of each strf data chunk.
put_strd_data	Writes strd data chunks.
get_strd_data	Returns the strd data chunks.
get_strd_sizes	Returns the sizes of each strd data chunk.
put_chunk	Writes a chunk.
get_chunk	Returns the next audio or video chunk.
get_chunk_size	Returns the size of the next audio or video chunk.
get_audio_chunk	Returns the next audio chunk.
get_audio_chunk_size	Returns the size of the next audio chunk.
get_video_chunk	Returns the next video chunk.
get_video_chunk_size	Returns the size of the next video chunk.
status_ok	Checks the object status.
get_track_count	Returns the number of tracks.
get_current_frame_number	Returns the frame number of the next frame to read.
select_track	Selects tracks for get_chunk methods.
un_select_track	Unselects tracks.
get_chunk_fourcc	Returns the RIFF FOURCC value of the next chunk.
get_chunk_track	Returns the track number of the next chunk.
get_chunk_flags	Returns the dwFlags field in the AVIINDEXENTRY structure for the next chunk.
get_index_size	Returns the number of AVIINDEXENTRY structures.
get_index	Returns file index.

The AVI file format is defined in least-significant-byte-first (LSB-first) byte order. Where structures are well-defined, byte swapping is automatically provided by the object. When working with an AVI file, it is most convenient to keep the header structures in the application's data space. Most of the data in the main AVI header structure should be supplied when writing; however, the dwTotalFrames field is maintained by the object to be consistent with the file contents.
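
The effect of byte order on a 32-bit header field can be seen in the following sketch. It is an illustration only; the helper name swap32 is hypothetical and does not correspond to anything in the object interface. The same swap applies to 32-bit fields in the strf and strd data, which the object does not byte swap.

```python
import struct

def swap32(value):
    """Reverse the byte order of a 32-bit unsigned value."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")

raw = bytes([0x10, 0x27, 0x00, 0x00])       # 10000 as stored in an AVI file

as_lsb_first = struct.unpack("<I", raw)[0]  # correct interpretation
as_msb_first = struct.unpack(">I", raw)[0]  # what an MSB-first host sees

print(as_lsb_first, hex(as_msb_first), swap32(as_msb_first))
```

Reading the field without swapping on an MSB-first host yields X'10270000' instead of 10000; one swap restores the intended value.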

The AVIStreamHeader structures (or strh chunks) are read or written as a group and are accessed as an array of AVIStreamHeader structures. The object performs byte swapping for the stream header structures. The appropriate size of the array and value of count can be determined by the get_track_count method, which returns the number of tracks (or data streams) in the file.

The strf and strd chunks can assume various sizes and definitions. Consequently, these chunks are accessed with an array of pointers to the data structures, and the data is not byte swapped by the object. When reading these chunks, the application must preallocate the required space for them. The get_strf_sizes and get_strd_sizes methods can be used to determine the appropriate size of each structure. Using this size information and the number of tracks returned from the get_track_count method, the application allocates space for each strf or strd chunk and builds the array of pointers necessary for the read method. The data can then be read using the get_strf_data and get_strd_data methods. Similarly, the data can be written using the put_strf_data and put_strd_data methods.

Accessing the Stream Data

The stream data exists as subchunks within the movi list chunk. These subchunks may or may not be bundled into rec chunks. The following sections describe how to seek within these data chunks and read or write data chunks.

Seeking: Frames and Chunks

The following are the two stream-data boundary abstractions presented by the object:

chunk	A stream-data subchunk within the movi list chunk.
frame	A grouping of logically sequential subchunks from different data streams (or tracks). For example, if the movi list chunk consisted of two streams of alternating audio and video chunks, each audio and video pair is considered a frame.

Note: The definition of a frame becomes more complex when a stream has initial data. This complexity is not covered in this discussion but is described in the object specification.

Random seeking within the movi list chunk is done on frame boundaries with the seek_to_frame method. The initial data position is frame 0, which is the first frame. The current frame position can be determined by the get_current_frame_number method. The file position can be advanced sequentially by any of the methods that read or write the stream data chunks. The file position is then logically at the end of the data chunk just read or written.
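
The relationship between chunks and frames can be modeled as in the following sketch. It is a simplified model, not the algorithm used by the object: it assumes that a new frame begins whenever a track that already appears in the current frame appears again, and it ignores the initial-data case mentioned in the note. The function name group_into_frames is hypothetical.

```python
def group_into_frames(chunk_tracks):
    """Group chunk positions into frames, given each chunk's track number."""
    frames, current, seen = [], [], set()
    for position, track in enumerate(chunk_tracks):
        if track in seen:            # track repeats: the frame is complete
            frames.append(current)
            current, seen = [], set()
        current.append(position)
        seen.add(track)
    if current:
        frames.append(current)
    return frames

# Alternating video (track 0) and audio (track 1) chunks in file order.
tracks = [0, 1, 0, 1, 0, 1]
frames = group_into_frames(tracks)

# Seeking to frame 2 in this model positions at the first chunk of frame 2.
first_chunk_of_frame_2 = frames[2][0]
print(frames, first_chunk_of_frame_2)
```

In this model, frame 0 consists of chunks 0 and 1, and a seek to frame 2 positions the file at chunk 4.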

Selecting and Unselecting Tracks

An application may not be interested in all of the data streams that the file contains. The select_track and un_select_track methods are used to specify which tracks the application wants to have selected. The default is to have all tracks selected. If a track is not selected, its data chunks are essentially invisible; the data chunks for that stream are skipped when sequentially reading data chunks with the get_chunk, get_audio_chunk, or get_video_chunk methods.
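
The effect of track selection on sequential reading can be modeled as in the following sketch. It is a model of the behavior described above, not the object's implementation; the function name visible_chunks and the use of a set for the selection are assumptions made for illustration.

```python
def visible_chunks(chunks, selected):
    """Return the chunks a sequential reader would see, in file order."""
    return [c for c in chunks if c[0] in selected]

# (track number, chunk code) pairs in file order.
chunks = [(0, "00dc"), (1, "01wb"), (0, "00dc"), (1, "01wb")]

selected = {0, 1}                  # default: all tracks selected
all_visible = visible_chunks(chunks, selected)

selected.discard(1)                # analogous to unselecting track 1
video_only = visible_chunks(chunks, selected)

print(all_visible)
print(video_only)
```

With track 1 unselected, the reader sees only the two video chunks, as though the audio chunks were not in the file.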

Retrieving the Data Chunks

An application can sequentially retrieve all the data chunks within the movi list chunk by successively using the get_chunk method.

The size of the next chunk, as well as other characteristics of the next chunk, can be obtained with the following methods: get_chunk_size, get_chunk_fourcc, get_chunk_track, and get_chunk_flags, which return data about the next chunk that would be accessed with the get_chunk method. Typically, an application determines the size and characteristics of a chunk before retrieving the chunk data. If a track is not selected, the data chunks for it are skipped (essentially made invisible) to the get_chunk, get_chunk_size, get_chunk_fourcc, get_chunk_track, and get_chunk_flags methods.

Similarly, the get_audio_chunk_size and get_video_chunk_size methods return the size of the next audio chunk or video chunk, respectively. The get_audio_chunk and get_video_chunk methods retrieve the data in the next audio or video chunk, respectively.

The object always uses the file index if one exists, presenting the data in the order defined by the index.

Writing Data Chunks

The object provides the put_chunk method to write stream data chunks and their associated index data. This method inserts the chunk at the current position within the movi list chunk. An index structure for the chunk is added if the AVIF_HASINDEX bit of the dwFlags field is set in the MainAVIHeader.
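
The bookkeeping involved in writing a chunk can be modeled as in the following sketch, which appends a chunk to an in-memory movi buffer and adds an index entry only when the index flag is set. It is not the put_chunk implementation. The function name append_chunk is hypothetical, the 16-byte index entry layout and the X'10' flag value are assumed from the Microsoft AVI definition, and offsets are recorded relative to the start of the buffer as a simplification.

```python
import struct

AVIF_HASINDEX = 0x00000010  # assumed value, from the Microsoft AVI definition

def append_chunk(movi, index, ckid, data, main_flags, entry_flags=0):
    """Append one stream data chunk and, if enabled, its index entry."""
    offset = len(movi)
    movi += ckid + struct.pack("<I", len(data)) + data
    if len(data) % 2:
        movi += b"\x00"            # RIFF chunks are padded to an even length
    if main_flags & AVIF_HASINDEX:
        index += struct.pack("<4sIII", ckid, entry_flags, offset, len(data))
    return offset

movi, index = bytearray(), bytearray()
first = append_chunk(movi, index, b"00dc", b"\x01\x02\x03", AVIF_HASINDEX)
second = append_chunk(movi, index, b"01wb", b"\x00" * 4, AVIF_HASINDEX)
print(first, second, len(movi), len(index))
```

The pad byte is not counted in the chunk size field or in the index entry length, so a reader must round odd sizes up when advancing to the next chunk.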
