Rochester (NY) – Scientists at the University of Rochester have taken a unique approach to compressing digital music files: rather than shrinking an existing digital music file through compression optimization, they decided to build a completely new file based on a simulation of humans playing music. The result is impressive – a 20-second clarinet solo can be stored in less than a kilobyte. What the file does not capture is the emotional dimension of the music.
Imagine a technology that compresses digital music 1000 times more efficiently than MP3, so that average songs are measured in kilobytes rather than megabytes. Suddenly, the gigantic music collections found on home computers today could shrink from gigabytes to just a few megabytes. Mark Bocko, professor of electrical and computer engineering at the University of Rochester, believes that the first step toward such a technology has been made.
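To put the claimed savings in perspective, here is a rough back-of-the-envelope comparison. The CD and MP3 figures below are standard reference values, not numbers from the researchers:

```python
# Rough storage comparison for a 20-second clip (illustrative figures only).

CLIP_SECONDS = 20

# Uncompressed CD audio: 44,100 samples/s, 16 bits (2 bytes), 2 channels.
cd_bytes = 44_100 * 2 * 2 * CLIP_SECONDS

# A typical 128 kbit/s MP3 of the same clip.
mp3_bytes = 128_000 // 8 * CLIP_SECONDS

# The reported model-based file: under one kilobyte.
model_bytes = 1_000

print(f"CD audio:   {cd_bytes / 1e6:.2f} MB")   # about 3.5 MB
print(f"MP3:        {mp3_bytes / 1e3:.0f} KB")  # about 320 KB
print(f"Model file: {model_bytes} bytes")
print(f"MP3-to-model ratio: ~{mp3_bytes // model_bytes}x")
```

Even against a compressed MP3, a sub-kilobyte file is a reduction of two to three orders of magnitude, which is where the "1000 times" headline figure comes from.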
Announced at the International Conference on Acoustics, Speech and Signal Processing, this new approach attempts to recreate music as a virtual performance. Just as physics simulations calculate thousands of particles in flowing water or exploding objects, this audio technology simulates humans playing music, for example by monitoring the performer's actions. And just as with physics simulations, much of the work of producing the audio output is computation done in the background by the CPU. “In replaying the music, a computer literally reproduces the original performance based on everything it knows about clarinets and clarinet playing,” the group said.
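The core idea – store a few performance parameters instead of audio samples, then synthesize the waveform on playback – can be sketched in miniature. The "instrument model" below is deliberately trivial (a sine oscillator driven by pitch and breath-amplitude envelopes); the real system uses a detailed acoustic model of the clarinet, and all parameter names here are illustrative assumptions:

```python
import math

SAMPLE_RATE = 44_100   # playback audio rate
CONTROL_RATE = 100     # performance parameters stored only 100x per second

def lerp(env, t):
    """Linearly interpolate a control envelope sampled at CONTROL_RATE."""
    i = min(int(t * CONTROL_RATE), len(env) - 2)
    frac = t * CONTROL_RATE - i
    return env[i] * (1 - frac) + env[i + 1] * frac

def synthesize(pitch_env, breath_env, seconds):
    """Render audio samples from the stored control envelopes."""
    samples, phase = [], 0.0
    for n in range(int(seconds * SAMPLE_RATE)):
        t = n / SAMPLE_RATE
        phase += 2 * math.pi * lerp(pitch_env, t) / SAMPLE_RATE
        samples.append(lerp(breath_env, t) * math.sin(phase))
    return samples

# One second of a 440 Hz tone that swells and fades: the stored data is
# just 2 envelopes x 101 values instead of 44,100 audio samples.
pitch = [440.0] * 101
breath = [min(i, 100 - i) / 50 for i in range(101)]
audio = synthesize(pitch, breath, 1.0)
```

The file on disk would hold only the envelopes; the waveform itself is recomputed every time the piece is played.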
So, how does it sound? You can compare the actual human performance of a 20-second clarinet solo with the simulated output for yourself. You don’t have to be a music expert to notice the difference: the simulated version sounds robotic, and the researchers have a long way to go before they can put musicians out of business.
“This is essentially a human-scale system of reproducing music,” said Bocko. “Humans can manipulate their tongue, breath, and fingers only so fast, so in theory we shouldn’t really have to measure the music many thousands of times a second like we do on a CD. As a result, I think we may have found the absolute least amount of data needed to reproduce a piece of music.”
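Bocko's argument can be illustrated with a toy data-rate comparison. Suppose (hypothetically – these parameters are not from the Rochester work) a performance is described by four slowly varying control signals, such as breath pressure, tonguing, and fingering, each sampled 100 times per second:

```python
# Toy comparison: audio-rate sampling vs. control-rate sampling.
# The choice of 4 signals at 100 Hz is an illustrative assumption,
# not the researchers' actual encoding.

SECONDS = 20

# CD audio: one value per channel, 44,100 times per second, 2 channels.
audio_rate_values = 44_100 * 2 * SECONDS

# Hypothetical performance encoding: 4 control signals at 100 Hz.
control_rate_values = 4 * 100 * SECONDS

print(audio_rate_values)    # values stored for CD audio
print(control_rate_values)  # values stored for the performance encoding
print(f"~{audio_rate_values / control_rate_values:.0f}x fewer values")
```

Because fingers, breath, and tongue simply cannot change state 44,100 times per second, sampling the gestures instead of the sound wave is where the enormous savings come from.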
The project group said that the current “results are a very close, though not yet a perfect, representation of the original sound.”
Bocko believes that the quality will continue to improve as the acoustic measurements and the resulting synthesis algorithms become more accurate. “Maybe the future of music recording lies in reproducing performers and not recording them,” Bocko said.
Based on the results made available so far, this may be a bit of an overstatement, as the simulated music published yesterday lacks any kind of emotion. And it is emotion, above all, that can make or break a performance; without it, a piece sounds stale or, to borrow one of Simon Cowell’s phrases, “forgettable.”
But we have no doubt that there will be some tools to simulate human emotions sometime in the future as well.