Facebook AI just released textless NLP for expressive speech generation. The applications and intentions behind this research are obviously ethically questionable, but the approach of analyzing audio signals directly, rather than going through an intermediate representation or labeling, is interesting to me from a computer music perspective.

Music machine learning projects like Google Magenta largely rely on symbolic formats like MIDI, which, much as text does for speech, leave out a lot of what makes a performance expressive.

I could imagine some very interesting outcomes from using similar approaches to generate expressive melodies on monophonic instruments, by skipping the MIDI/MusicXML/ABC representation entirely and training on recordings of real performances.



It's been on HN for a few hours with minimal reception, which is a bit disappointing.

Again, putting aside the fact that Facebook is Facebook, there are some very compelling things being done here.

