Outline of the Standard MIDI File Structure


A standard MIDI file is composed of "chunks". It starts with a header chunk and is followed by one or more track chunks. The header chunk contains data that pertains to the overall file. Each track chunk defines a logical track.

 
   SMF = <header_chunk> + <track_chunk> [+ <track_chunk> ...]

A chunk always has three components, similar to Microsoft RIFF files (the only difference is that SMF files are big-endian, while RIFF files are usually little-endian). The three parts to each chunk are:

  1. The chunk ID string, which is four characters long. For example, header chunk IDs are "MThd", and track chunk IDs are "MTrk".
  2. Next is a four-byte unsigned value that specifies the number of bytes in the data section of the chunk (part 3).
  3. Finally comes the data section of the chunk. Its size is given by the length field that follows the chunk ID (part 2).
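
The three-part layout above can be walked with a few lines of Python. This is a minimal sketch, not a complete parser: the function name `read_chunks` and the hand-built `example` bytes are illustrative assumptions, and no error handling for truncated files is attempted.

```python
import struct

def read_chunks(data: bytes):
    """Yield (chunk_id, payload) pairs from the raw bytes of an SMF file."""
    pos = 0
    while pos + 8 <= len(data):
        # Part 1 and part 2: 4-byte ASCII ID, then a 4-byte big-endian length
        chunk_id, length = struct.unpack_from(">4sI", data, pos)
        pos += 8
        # Part 3: the data section, exactly `length` bytes long
        yield chunk_id.decode("ascii"), data[pos:pos + length]
        pos += length

# Hand-built example: a header chunk plus one track holding only "End of track"
example = (b"MThd" + struct.pack(">IHHH", 6, 0, 1, 96)
           + b"MTrk" + struct.pack(">I", 4) + b"\x00\xff\x2f\x00")

for chunk_id, payload in read_chunks(example):
    print(chunk_id, len(payload))
```

Because every chunk carries its own length, a reader can skip chunk types it does not recognize without understanding their contents.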

Header Chunk

    The header chunk consists of a literal string denoting the header, a length indicator, the format of the MIDI file, the number of tracks in the file, and a timing value specifying delta time units. Numbers larger than one byte are placed most significant byte first.
     
       header_chunk = "MThd" + <header_length> + <format> + <n> + <division>
     
    "MThd" 4 bytes
    the literal string MThd, or in hexadecimal notation: 0x4d546864. These four characters at the start of the MIDI file indicate that this is a MIDI file.
    <header_length> 4 bytes
    length of the header data (always 6 bytes--the combined size of the next three fields, which make up the data section of the header chunk).
    <format> 2 bytes
    0 = single track file format
    1 = multiple track file format
    2 = multiple song file format (i.e., a series of type 0 files)
    <n> 2 bytes
    number of track chunks that follow the header chunk
    <division> 2 bytes
    unit of time for delta timing. If the value is positive, it gives the number of ticks per beat (quarter note). For example, +96 means 96 ticks per beat. If the value is negative, delta times are in SMPTE-compatible units.
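
As a rough illustration of the fields listed above, the header can be unpacked in one call. The function name `parse_header` and the dictionary keys are assumptions made for this sketch. The SMPTE branch also assumes the layout given in the SMF specification rather than in this outline: the high byte holds the negated frames per second and the low byte holds the ticks per frame.

```python
import struct

def parse_header(data: bytes) -> dict:
    """Decode the 14-byte header chunk at the start of an SMF file."""
    # ">4sIHHh": big-endian ID, length, format, track count, signed division
    chunk_id, length, fmt, ntracks, division = struct.unpack_from(">4sIHHh", data, 0)
    if chunk_id != b"MThd" or length != 6:
        raise ValueError("not a standard MIDI file header")
    info = {"format": fmt, "ntracks": ntracks}
    if division >= 0:
        info["ticks_per_beat"] = division
    else:
        raw = division & 0xFFFF                 # view the field as unsigned again
        info["smpte_frames_per_second"] = 256 - (raw >> 8)  # high byte is negative
        info["ticks_per_frame"] = raw & 0xFF
    return info

print(parse_header(b"MThd" + struct.pack(">IHHH", 6, 1, 2, 96)))
```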

Track Chunk

    A track chunk consists of a literal identifier string, a length indicator specifying the size of the track, and actual event data making up the track.
     
       track_chunk = "MTrk" + <length> + <track_event> [+ <track_event> ...]
     
    "MTrk" 4 bytes
    the literal string MTrk. This marks the beginning of a track.
    <length> 4 bytes
    the number of bytes in the track chunk following this number.
    <track_event>
    a sequenced track event.

    Track Event

    A track event consists of a delta time since the last event, and one of three types of events.
     
       track_event = <v_time> + (<midi_event> | <meta_event> | <sysex_event>)
     
    <v_time>
    a variable length value specifying the elapsed time (delta time) from the previous event to this event.
    <midi_event>
    any MIDI channel message such as note-on or note-off. Running status is used in the same manner as it is used between MIDI devices.
    <meta_event>
    an SMF meta event.
    <sysex_event>
    an SMF system exclusive event.
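
A reader can usually tell the three event types apart from the first byte after the delta time. The sketch below assumes the prefixes described in this outline (0xFF for meta events, 0xF0 or 0xF7 for system exclusive events) plus the standard MIDI rule that channel message status bytes lie in 0x80-0xEF. The function name `classify_event` is hypothetical.

```python
def classify_event(first_byte: int) -> str:
    """Name the kind of track event that begins with this byte."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    if first_byte == 0xFF:
        return "meta"
    if first_byte in (0xF0, 0xF7):
        return "sysex"
    if first_byte >= 0xF0:
        raise ValueError("status byte not expected inside a track")
    if first_byte >= 0x80:
        return "midi"
    # High bit clear: a data byte, so the previous status byte is reused
    return "midi (running status)"

print(classify_event(0x90))  # note-on status byte
print(classify_event(0x3C))  # data byte following an earlier status
```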

    Meta Event

    Meta events are non-MIDI data of various sorts consisting of a fixed prefix, a type indicator, a length field, and the actual event data.
     
       meta_event = 0xFF + <meta_type> + <v_length> + <event_data_bytes>
     
    <meta_type> 1 byte
    meta event types:
    Type   Event                      Type   Event
    0x00   Sequence number            0x20   MIDI channel prefix assignment
    0x01   Text event                 0x2F   End of track
    0x02   Copyright notice           0x51   Tempo setting
    0x03   Sequence or track name     0x54   SMPTE offset
    0x04   Instrument name            0x58   Time signature
    0x05   Lyric text                 0x59   Key signature
    0x06   Marker text                0x7F   Sequencer specific event
    0x07   Cue point
    <v_length>
    length of meta event data expressed as a variable length value.
    <event_data_bytes>
    the actual event data.
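
The meta event layout can be decoded along the following lines. This is only a sketch: `parse_meta_event` and `META_NAMES` are names invented here, the table is copied from the list of meta event types above, and the event data is returned as raw bytes without interpreting it.

```python
META_NAMES = {
    0x00: "Sequence number", 0x01: "Text event", 0x02: "Copyright notice",
    0x03: "Sequence or track name", 0x04: "Instrument name", 0x05: "Lyric text",
    0x06: "Marker text", 0x07: "Cue point", 0x20: "MIDI channel prefix assignment",
    0x2F: "End of track", 0x51: "Tempo setting", 0x54: "SMPTE offset",
    0x58: "Time signature", 0x59: "Key signature", 0x7F: "Sequencer specific event",
}

def parse_meta_event(buf: bytes, pos: int = 0):
    """Return (type name, data bytes, position just past the event)."""
    if buf[pos] != 0xFF:
        raise ValueError("meta events start with 0xFF")
    meta_type = buf[pos + 1]
    pos += 2
    # <v_length>: 7 bits per byte, high bit set on all but the last byte
    length = 0
    while True:
        byte = buf[pos]
        pos += 1
        length = (length << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    data = buf[pos:pos + length]
    return META_NAMES.get(meta_type, "Unknown"), data, pos + length

print(parse_meta_event(b"\xff\x03\x05Piano"))
```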

    System Exclusive Event

    A system exclusive event can take one of two forms:

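       sysex_event = 0xF0 + <v_length> + <data_bytes> + 0xF7
    or
       sysex_event = 0xF7 + <v_length> + <data_bytes> + 0xF7

    <v_length> is a variable length value giving the number of bytes that follow it, including the closing 0xF7.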

    In the first case, the resultant MIDI data stream would include the 0xF0. In the second case the 0xF0 is omitted.
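
The difference between the two forms can be shown by rebuilding the bytes that would be sent to a MIDI device. The sketch assumes, following the SMF specification, that a variable length byte count follows the 0xF0 or 0xF7 prefix and covers everything after it. The function name `sysex_to_stream` is hypothetical.

```python
def sysex_to_stream(event: bytes) -> bytes:
    """Turn one SMF system exclusive event into the bytes sent on the wire."""
    prefix = event[0]
    if prefix not in (0xF0, 0xF7):
        raise ValueError("not a system exclusive event")
    # Read the variable length byte count that follows the prefix
    pos, length = 1, 0
    while True:
        byte = event[pos]
        pos += 1
        length = (length << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    data = event[pos:pos + length]
    # First form: 0xF0 is part of the stream. Second form: it is omitted.
    return b"\xf0" + data if prefix == 0xF0 else data

print(sysex_to_stream(b"\xf0\x05\x7e\x7f\x09\x01\xf7").hex(" "))
print(sysex_to_stream(b"\xf7\x03\x43\x12\xf7").hex(" "))
```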


Variable Length Values

Several different values in SMF events are expressed as variable length quantities (e.g. delta time values). A variable length value uses the minimum number of bytes needed to hold the value, and in most circumstances this leads to some degree of data compression.

A variable length value uses the low order 7 bits of a byte to represent the value or part of the value. The high order bit is an "escape" or "continuation" bit. All but the last byte of a variable length value have the high order bit set. The last byte has the high order bit cleared. The bytes always appear most significant byte first.

Here are some examples:

   Variable length              Real value
   0x7F                         127 (0x7F)
   0x81 0x7F                    255 (0xFF)
   0x82 0x80 0x00               32768 (0x8000)
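
The encoding rule and the examples in the table can be checked with a short Python sketch. The function names `encode_vlq` and `decode_vlq` are illustrative assumptions, not part of any MIDI library.

```python
def encode_vlq(value: int) -> bytes:
    """Encode a non-negative integer as an SMF variable length value."""
    if value < 0:
        raise ValueError("variable length values are unsigned")
    groups = [value & 0x7F]            # last byte: high bit cleared
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: high bit set
        value >>= 7
    return bytes(reversed(groups))     # most significant group first

def decode_vlq(data: bytes, pos: int = 0):
    """Return (value, position just past the variable length value)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:            # high bit cleared marks the last byte
            return value, pos

for n in (127, 255, 32768):
    print(n, encode_vlq(n).hex(" "))
```

Decoding returns the position after the value so that a track parser can continue straight on to the event that follows a delta time.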


craig@ccrma.stanford.edu