Segmentation of field recordings

Goal

We are exploring automatic segmentation and labeling of ethnomusicological field recordings. Field recordings are integral documents of folk music performances and typically contain interviews with performers intertwined with actual performances. As these are live recordings of amateur folk musicians, they may contain interruptions, false starts, environmental noises or other interfering factors.

Our goal is to design robust automatic algorithms that approximate manual segmentation of field recordings and classify segments into a set of predefined classes.
The tools we developed are integrated into the Ethnomuse digital archive of Slovenian folk music and dances, as well as into our segmentation tool SeFiRe. The figure shows a visualization of a segmented field recording within the SeFiRe tool. The signal classes (speech, solo singing, choir singing, instrumental) are shown in different colors, and segment boundaries are shown as vertical lines. Users can manually adjust the detected boundaries, listen to the recording, and annotate its contents.
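To make the relationship between per-frame class decisions and segment boundaries concrete, here is a minimal sketch in Python. It is not the algorithm used in SeFiRe or in the MIREX submissions. It assumes a hypothetical upstream classifier has already assigned one of the four classes to each fixed-length frame, and it only shows a common post-processing step: majority-vote smoothing followed by merging runs of equal labels into segments. The function names, window size, and frame length are illustrative assumptions.

```python
from collections import Counter
from itertools import groupby

# Class names come from the text above; everything else is illustrative.
CLASSES = ("speech", "solo singing", "choir singing", "instrumental")


def smooth_labels(labels, window=5):
    """Majority-vote filter over a sliding window (hypothetical smoothing step).

    Removes isolated label flips, e.g. a single 'instrumental' frame inside speech.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighbourhood).most_common(1)[0][0])
    return smoothed


def labels_to_segments(labels, frame_seconds=1.0):
    """Merge runs of identical labels into (start, end, label) tuples in seconds."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((start * frame_seconds, (start + length) * frame_seconds, label))
        start += length
    return segments


# Toy input: made-up frame labels, not data from a real recording.
frames = ["speech"] * 4 + ["instrumental"] + ["speech"] * 3 + ["solo singing"] * 6
segments = labels_to_segments(smooth_labels(frames, window=5), frame_seconds=0.5)
for start, end, label in segments:
    print(f"{start:6.1f} {end:6.1f}  {label}")
```

In this toy input the single "instrumental" frame is absorbed into the surrounding speech, so only one boundary remains, between speech and solo singing. A boundary placed this way is the kind of result a user would then adjust manually in a tool such as SeFiRe.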

The algorithm can also be used for more general speech/music segmentation. Its robustness is demonstrated by its results in the MIREX 2015 Music/Speech Classification and Detection task.

Download

Our current algorithms, deep learning models and MIREX submissions are available on GitHub.
In addition, the SeFiRe field recording dataset, which contains approximately 7,000 labelled field recording excerpts, is also available.

References
  • M. Marolt, C. Bohak, A. Kavčič, and M. Pesek, "Automatic segmentation of ethnomusicological field recordings," Applied Sciences, vol. 9, no. 3, pp. 1-12, 2019.
    [Bibtex]
    @article{1538109123,
    author={Matija Marolt and Ciril Bohak and Alenka Kavčič and Matevž Pesek},
    year={2019},
    pages={1-12},
    volume={9},
    title={Automatic segmentation of ethnomusicological field recordings},
    journal={Applied sciences},
    number={3},
    }
  • M. Marolt, "Going deep with segmentation of field recordings," in Folk Music Analysis: 8th International Workshop, 26-29 June 2018, Thessaloniki, Greece, 2018, pp. 1-5.
    [Bibtex]
    @conference{1537828291,
    author={Matija Marolt},
    year={2018},
    pages={1-5},
    title={Going deep with segmentation of field recordings},
    booktitle={Folk music analysis : 8th International Workshop, 26-29 June 2018, Thessaloniki, Greece},
    }
  • M. Marolt, "Probabilistic segmentation and labeling of ethnomusicological field recordings," in ISMIR 2009: Proceedings of the 10th International Society for Music Information Retrieval Conference, October 26-30, 2009, Kobe, Japan, 2009, pp. 75-80.
    [Bibtex]
    @conference{7368532,
    author={Matija Marolt},
    year={2009},
    pages={75-80},
    title={Probabilistic segmentation and labeling of ethnomusicological field recordings},
    booktitle={ISMIR 2009 : proceedings of the 10th International Society for Music Information Retrieval Conference, October 26-30, 2009, Kobe, Japan},
    }

Visualization of a segmented field recording