Machine learning for music: Google’s Tone Transfer
Google quietly made Tone Transfer public this week. Built by two teams at Mountain View, Magenta and AI UX, the tool takes a tonal input (a voice or a line of melody) and re-renders it using instrument modelling.
The year-long collaboration between AI researchers, UX engineers and designers is built on Magenta’s Differentiable Digital Signal Processing engine (DDSP). It was created as an exercise in learning how people perceive music, machine learning and their own practice. Running on an early version of DDSP, it is deployed in the browser (on TensorFlow.js) and extracts pitch data using another Google Research project, SPICE.
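To make the "extract pitch, then re-render" idea concrete, here is a minimal Python sketch. It is not Tone Transfer's code: a plain autocorrelation estimate stands in for SPICE, and a fixed, hand-picked set of harmonic amplitudes stands in for a learned instrument model. The function names, the 16 kHz sample rate and the harmonic recipe are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 16000  # Hz; assumed for this sketch


def sine_tone(freq_hz, duration_s=0.1, sample_rate=SAMPLE_RATE):
    """Generate a pure sine tone as a list of float samples."""
    n = round(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_pitch(samples, sample_rate=SAMPLE_RATE, fmin=80.0, fmax=1000.0):
    """Estimate the fundamental frequency by autocorrelation.

    A simple stand-in for a learned pitch tracker: pick the lag whose
    shifted copy of the signal best matches the original.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def render_harmonics(freq_hz, harmonic_amps, duration_s=0.1,
                     sample_rate=SAMPLE_RATE):
    """Re-render a pitch as a sum of harmonics (additive synthesis).

    harmonic_amps[k] is the amplitude of harmonic k+1; the values here
    are a hypothetical 'timbre', not a trained instrument.
    """
    n = round(duration_s * sample_rate)
    total = sum(harmonic_amps)
    return [
        sum(a * math.sin(2 * math.pi * freq_hz * (k + 1) * i / sample_rate)
            for k, a in enumerate(harmonic_amps)) / total
        for i in range(n)
    ]


if __name__ == "__main__":
    source = sine_tone(200.0)           # the "sung" input
    f0 = estimate_pitch(source)         # step 1: extract pitch
    out = render_harmonics(f0, [1.0, 0.5, 0.25])  # step 2: new timbre
    print(f"estimated pitch: {f0:.1f} Hz, rendered {len(out)} samples")
```

The output keeps the melody's pitch while changing its harmonic content, which is the core of the effect. A real system tracks pitch and loudness frame by frame and lets a trained model decide how the harmonics evolve over time.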
At the moment, the tool has been opened up for experimentation by musicians and non-musicians who want to explore music creation. But Magenta made DDSP open source earlier this year and, while it hasn’t been expressly stated, there may be implications for business data collection.
As new routes to mining data are explored, and voice collection via VoIP services becomes more commonplace, there is scope to explore tonal approaches to voice data. Applying machine learning to voice data could help to parse language that data analysts might otherwise misinterpret. Sarcasm and humour are both common modes of speech that could create false positives if voice data is mined in the same way as text.
Magenta says: “We are excited with upcoming releases enabling you to easily train your own DDSP models and deploy them everywhere: a phone, an audio plugin or a website using the larger tensorflow lite and tensorflow.js ecosystem.”
Data scientists, always on the lookout for a new frontier, should prick up their ears.