Notes File Format for the
Expression Algorithm program (version 1.00)

The input Notes file for the Expression Algorithm software is a text file containing a list of all the notes to search for in the audio file. Each line represents a separate note in the score. In general, secondary notes generated from ornaments (such as trills or mordents) are not considered, since they require performance-specific information.

Below is an example Notes file. Comments in Notes files start with a percent sign (%). For those familiar with the numerical programs GNU Octave or MATLAB, the Notes file is just an array of data which can be loaded into either of these programs. The meaning of the numbers in each column is given below the example data.

Each line contains a list of numbers separated by a single tab character (multiple tabs or spaces may work as well). The first four numbers on a line are required:

  1. Estimated starting time of the note in milliseconds. The first note in the example file above starts at 723 milliseconds from the start of the audio file.
  2. Estimated duration of the note in milliseconds. The first note in the example file above has a duration of 563 milliseconds.
  3. The MIDI note number of the pitch. For example, the value 76 is equivalent to E5, or the pitch E which is a major tenth above middle C. Middle C is number 60. The note a semitone above middle C (C-sharp/D-flat) is 61, D is 62, and so on.
  4. The Expression Algorithm program runs in three stages: (a) localize the timing of beats, (b) localize the timing of off-beats, and (c) localize the timing of individual notes (such as when beats or off-beats contain chords). The fourth column is used to distinguish between rhythmic events in stages (a) and (b). Notes with a value less than 0 in this column are on an off-beat. Notes with a value of zero or higher in this column occur on beats. Notes with a value of one in this column occur on the first beat of a measure (however, this information is not utilized in version 1.00 of the software, and the values of 0 and +1 are treated equivalently).
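
As a rough illustration of columns 3 and 4, the following Python sketch converts a MIDI note number into a pitch name and classifies the fourth-column value. The function names are hypothetical and are not part of the Expression Algorithm software. The octave numbering assumes the convention used above, in which middle C (60) is C4 and 76 is E5.

```python
# Illustrative helpers; the names are hypothetical, not part of the software.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_name(number):
    """Return a pitch name such as 'E5' for a MIDI note number (60 = C4)."""
    octave = number // 12 - 1
    return f"{PITCH_CLASSES[number % 12]}{octave}"

def metric_level(value):
    """Classify the fourth column: negative = off-beat, zero or higher = beat."""
    return "off-beat" if value < 0 else "beat"

print(midi_to_name(76), metric_level(-1))
```

Sharps are used for all black keys here for simplicity; the Notes file itself stores only the number, so the enharmonic spelling is not recoverable from it.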

The next three columns of data are not utilized directly by version 1.00 of the software, but providing them is recommended. These three numbers are used primarily to make the data readable for proof-reading and for manual coordination of the data file with a written score. Here are the meanings of these three columns:

  1. The measure number in which the note occurs. The measure number is only for reference to the score, so you can label repeated sections with the same measure number for each repetition, or you can label the measures incrementally from the start of the performed score, regardless of how the repeats are performed.
  2. The absolute beat position of the note in the score. This value describes how many beats (or optionally quarter-notes for complicated meters) lie between the start of the performance and the current note. The first beat position in the score starts at 0 absolute beats, even if the note is on a pickup beat.
  3. The seventh column is a track number. A "1" represents a note played by the left hand and/or a note printed on the bottom staff of the grand staff. A "2" represents a note played by the right hand (or printed on the top staff of the grand staff).
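
The column layout described so far could be read with a short Python sketch like the one below. The field names are hypothetical labels chosen for this illustration, and the sample line is made up (only the values 723, 563 and 76 come from the text above). Stripping a comment that follows data on the same line is an assumption; the text only states that comments start with a percent sign.

```python
# Hypothetical field names for the first seven columns described above.
FIELDS = ["onset_ms", "duration_ms", "midi_pitch", "metric_level",
          "measure", "abs_beat", "track"]

def parse_notes(text):
    """Parse Notes-file text into a list of dicts, one per note line."""
    notes = []
    for line in text.splitlines():
        line = line.split("%", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        values = [float(v) for v in line.split()]  # tabs or spaces
        if len(values) < 4:
            raise ValueError(f"need at least four columns: {line!r}")
        # zip() stops at the shorter sequence, so any extra columns are ignored.
        notes.append(dict(zip(FIELDS, values)))
    return notes

# Made-up sample line for illustration only.
sample = "% onset\tdur\tpitch\tlevel\tmeasure\tabsbeat\ttrack\n723\t563\t76\t1\t1\t0\t2\n"
print(parse_notes(sample))
```

Lines with only the four required columns are accepted; the optional fields are then simply absent from the resulting dict.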

Columns 8 through 10 are no longer used in the Expression Algorithm program, and writing anything in these columns is not recommended, since their meaning may change in future versions of the software. Here are the original functions of the data in these columns:

  1. The minimum estimated time at which the note is expected to occur in the audio file of the performance (in milliseconds). This data is now estimated by measuring the time to the previous beat in the data file.
  2. The maximum estimated time at which the note is expected to occur in the audio file of the performance (in milliseconds). This data is now estimated by measuring the time to the next beat in the data file.
  3. The standard deviation of the expected location of the note (which is found in column one) expressed in milliseconds.

Further examples of Notes files used in the Mazurka project can be accessed from this page. For each entry on that page which has a link to processed data for a performance, there is a Notes file ending in the extension .notes. For example, for performance number pid52932-06 the Notes file is:

      http://mazurka.org.uk/info/revcond/pid52932-06/pid52932-06.notes

This file represents the music for Mazurka in A minor, Op. 17, No. 4, with estimated note timings for a recording performed by Charles Rosen (released on the CD label Globe 5028 in 1990).

Notes-file preparation

You can generate a Notes file any way you want, from any type of digital musical score and timing data you have access to. You could even type all of the data into a text editor or a Microsoft Excel spreadsheet (then save it as a tab-delimited text file), but doing things this way would be extremely time-consuming. The next sections describe how Notes files are prepared for use in the Expression Algorithm software for the Mazurka Project.

There are two main intermediate files to produce: (1) a file listing the estimated timings of beats in the audio file which contains a performance, and (2) a file containing the musical score. These two files must then be merged into a single file, which in turn is converted into a Notes file.

Tapping data

Use Sonic Visualiser, which runs on Windows, Apple, and Linux computers. After importing the audio file you want to work with, create a time-instant layer (or let one be created automatically). Next, start playing the audio file in Sonic Visualiser. As the audio file plays, tap to the beats with the number-pad Enter key or, on a laptop without a number pad, with the semicolon key (;). Each time you press the key, a time marker is inserted into a time-instant annotation layer.

It is recommended that you set up the tap labeling method found in the Edit menu of Sonic Visualiser to be a two-level cyclical counter. The primary counter counts the measures, and the secondary counter cycles through the beats (the beat cycle size is also set in the Edit menu). If you miss or add an extra beat while tapping to the performance, you can add/delete markers later. The markers can be relabeled sequentially from the Edit menu if changes are made.

After you have corrected any problems, you should save the tapping data by using the "export annotation" option in the File menu of Sonic Visualiser. Save the file with any name, but ending with .txt, and also set the save type as a "text file". The saved file contains two columns of data. The first number is the time in seconds for a tap, and the second column contains a text label for the tap (ideally in the form of "measure.beat"). Here is an example of what the tapping data should look like for pid52932-06. In this case, comment lines starting with (#) were added with a text editor after the file was saved, and the dot character between the measure number and the beat was changed in a text editor to be a colon (:).
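
A minimal Python sketch for reading such a tapping file is given below. It assumes the two-column layout just described (time in seconds, then a label), that every label has the form "measure.beat" or "measure:beat", and that comment lines begin with "#". The function name and the sample data are made up for illustration.

```python
def parse_taps(text):
    """Return (time_ms, measure, beat) tuples from exported tap annotations."""
    taps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # comments added by hand
            continue
        seconds, label = line.split(None, 1)
        # Accept either "measure.beat" or "measure:beat" labels.
        measure, beat = label.replace(":", ".").split(".")
        taps.append((round(float(seconds) * 1000), int(measure), int(beat)))
    return taps

# Made-up tap data for illustration only.
print(parse_taps("# taps\n0.723\t1:1\n1.286\t1:2\n"))
```

Checking that the beat numbers cycle without gaps is a quick way to spot a missed or extra tap before moving on.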

One further step is necessary in order to prepare the tap times to be aligned with the score. A short PERL script takes the SV text annotation data of the taps and converts it into this format. The PERL script can be run on Apple or Linux (and even Windows) computers, but you will probably have to understand how to use the Unix command-line in order to do that (or find someone who can explain it to you -- for example, try the computer-science department at your school).

The PERL script, called sv2revcond, takes up to four input arguments:

  1. the name of the Sonic Visualiser time-instant annotation-layer text file.
  2. optionally, the starting beat which is needed when the first beat is a pickup beat and not the downbeat of a measure.
  3. the number of beats in one measure (default is 3 for the Mazurka Project).
  4. the Humdrum rhythmic duration value of a beat. The default is "4" which represents a quarter note (view the script file for other rhythms).

The resulting file is in the Humdrum file format which is used by the Humdrum Toolkit for Music Research. The first column of data in the resulting file indicates the rhythmic value of each beat. In the case of the Mazurka project, all beats are equivalent to a quarter-note, so the first column always contains a "4". If your music were in 6/8 with two beats to a measure, then you would have to modify the PERL script to place "4." in the first column (representing a dotted quarter-note). Irregular beat durations can also be handled, but in that case you would probably have to set the rhythm of each beat manually in a text editor.

The second column contains the beat number of the beat in the measure. This value is not strictly necessary, but is used to proof-read the tapping data for errors, such as a missed or extra beat. The third column contains the time in milliseconds from the start of the audio file when the tap occurs. The fourth column contains the time in milliseconds to the previous beat in the list.
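
The four columns described above can be sketched in Python as follows. This is not the sv2revcond script and does not reproduce its exact Humdrum output; it only illustrates how the values relate to each other. The function name, the default starting beat of 1, and the use of None for the first delta time (where there is no previous tap) are assumptions.

```python
def beat_rows(times_ms, start_beat=1, beats_per_measure=3, rhythm="4"):
    """Build (rhythm, beat, abstime, deltatime) rows from tap times in ms."""
    rows = []
    for i, t in enumerate(times_ms):
        beat = (start_beat - 1 + i) % beats_per_measure + 1  # cycle 1..N
        delta = None if i == 0 else t - times_ms[i - 1]  # no previous tap
        rows.append((rhythm, beat, t, delta))
    return rows

# Made-up tap times; the performance starts with a pickup on beat 3.
for row in beat_rows([723, 1286, 1840, 2410], start_beat=3):
    print(row)
```

For music in 6/8 with two beats per measure, the same sketch would be called with beats_per_measure=2 and rhythm="4.".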

Score data

Next, a score must be prepared. For the Mazurka Project, the scores for Chopin mazurkas are available on this page by clicking on any of the "H" icons. The scores are in the Humdrum file format. Note that scores in any format usually have to be prepared for a particular performance, depending on the way the performer follows the repetition patterns in the music (does the performer follow the repeats as marked in the score, or take only the second endings?). For Humdrum files, this is accomplished using the thru/thrux command. Once you have realized the repeat structure taken by the performer, you are ready to combine the score with the tapping data (see the next section below).

It is possible to import a score created with Sibelius or Finale into the Humdrum format via MusicXML, but this takes somewhat advanced knowledge of the Humdrum Toolkit (which in turn takes somewhat advanced knowledge of the Unix command-line). In particular, it would involve using the xml2hum program, and possibly the assemble command as well for more rhythmically complicated scores. Most scores available on the kernscores website are generated via MusicXML output from the SharpEye music recognition software from scanned printed scores.

Combining the tapping data and the score

The score (example) and the Humdrumified tapping data from SV (example) are joined using another fairly short PERL script. This script automates the use of several Humdrum commands (minrhy, scordur, timebase, assemble, and rid) to combine the two files into a single score containing an extra column of timing information.

Here is an example of the combined file. Note that this file contains the tap times located on the same lines as the beats in the score. Another Humdrum program called gettime is used (with the -i option) to linearly interpolate (estimate) the timings for notes which occur off of the beats. Here is the result of that process. Note that the timing data in the first column of the new file contains time values for all lines of the score (both beats and off-beats).
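
The idea of linear interpolation between tapped beats can be sketched in Python as below. This is only an illustration of the principle, not the gettime program itself; the function name, the data layout (pairs of score position in beats and time in milliseconds), and the sample values are assumptions. Known positions are assumed to be strictly increasing.

```python
def interpolate_times(events):
    """Fill missing times in (score_position, time_ms or None) pairs.

    Positions are in beats; events before the first or after the last
    known time are left as None because there is nothing to interpolate.
    """
    known = [i for i, (_, t) in enumerate(events) if t is not None]
    result = [t for _, t in events]
    for left, right in zip(known, known[1:]):
        p0, t0 = events[left]
        p1, t1 = events[right]
        for i in range(left + 1, right):
            fraction = (events[i][0] - p0) / (p1 - p0)
            result[i] = t0 + fraction * (t1 - t0)
    return result

# Made-up data: beats at positions 0 and 1, two off-beat events between them.
print(interpolate_times([(0, 1000), (0.5, None), (0.75, None), (1, 2000)]))
```

An event halfway between two beats is therefore placed halfway between the two tap times, regardless of any tempo change the performer makes within the beat.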

Finally, a program called time2matlab is used to turn the combined score/timing file into a Notes file which is used as one of the inputs to the Expression Algorithm program.

Summary

Below is a flowchart of data processing used to generate Notes files for use in the Mazurka Project. Boxes represent data files, and ovals represent computer programs which process these files. The blackened items are the primary production line, while the grayed items are one of many possible methods to generate files (D) and (E).

  A. Sonic Visualiser: An audio annotation editor from The Centre for Digital Music, Queen Mary, University of London. In this program you can listen to the music and tap to the beats on a computer or MIDI keyboard, and the times of the taps will be displayed and editable in the program.
  B. The tapping data is then exported as a text-based annotation layer file (File->Export Annotation: save type as Text). [Example]
  C. The Sonic Visualiser annotation layer now needs to be given explicit rhythmic information, which is done with a simple PERL script, sv2revcond.
  D. The resulting file contains two important columns: (1) the rhythmic value of each beat (always "4", which means a quarter-note in 3/4 time mazurkas). Note that you don't need to tap to a constant rhythm, but in that case you would probably have to generate the rhythmic information manually. (2) The second required column is the **abstime column in the example file, which contains the absolute time in milliseconds of the tap since the start of the audio recording. The **beat and **deltatime columns are not necessary for the assembly process (K). [Example]
  E. You must prepare a Humdrum score containing **kern data for the music. There are several ways to create the score. On the KernScores website, there is a wide selection of musical scores in the **kern data format. Most of the scores on that website were created using the processing chain in the diagram of F->I->J. If you have a MIDI file of the score, then G->I->J or H->I->J would probably be the best route.
  F. SharpEye is a Music Recognition Program which will take scanned musical scores as input and output music in a variety of symbolic formats, such as MIDI and MusicXML.
  G. Sibelius is a music notation editor which can be used to graphically edit music. You may need to buy the Dolet plugin for Sibelius in order to save MusicXML data output from Sibelius.
  H. Finale is another music notation editor. Since 2006, it has had a built-in MusicXML exporter, so buying an additional plugin for MusicXML export is not necessary.
  I. MusicXML is a data representation for symbolic music in the XML-structured format, created by Michael Good at Recordare and based on an earlier data format called MuseData which was created by Walter Hewlett at CCARH. [Example]
  J. xml2hum is a free program for converting MusicXML data into Humdrum data.
  K. Once the tapping file and score have been prepared, they are assembled using this PERL script. This script uses several Humdrum-processing commands which are typically used to generate scores:
    • minrhy -- used to calculate the minimum rhythmic unit in a file which is then used in the timebase command.
    • scordur -- used to add or remove extra beats in the tapping file so that the durations of both input files match.
    • timebase -- preparatory program for input to the assemble command.
    • assemble -- actually joins the two data files into a single file.
    • extract -- used to extract just the abstime data from the input tapping data (other columns of information are thrown away).
    • rid -- used to remove temporary empty data lines created with the timebase command.
  L. The output from step K contains the beat times aligned with the respective beats in the score. Events in the score which do not occur on a beat have an undefined time. The gettime program, using the -i option, is used to estimate the times of events occurring on the off-beats. The output from gettime is a similar file, but with timing information at the start of every line, not just the lines which contain beats. [Example]
  M. Once every line in the score contains an absolute time value, the data can be converted into Notes data using a program called time2matlab.
  N. Output from time2matlab is in the final Notes data format for use with the Performance Expression Algorithm program. The last three columns of data in the example are not generated by the time2matlab program, but were appended in a separate process (these last three columns are now obsolete and are no longer needed).

    Typically, converting the data files D & E to generate file N can be done with a single command-line pipeline:

        makebtime Dfile Efile | gettime -i | time2matlab > Nfile
    

    Converting data files B & I to generate file N can also be done with a single command:

        xml2hum Ifile > Efile && sv2revcond Bfile 1 3 4 > Dfile && \
           makebtime Dfile Efile | gettime -i | time2matlab > Nfile
    
    However, it is better to convert B into D, and I into E separately in order to check for errors before creating the final Notes file.