Max Plansky is a 16-year-old boy who was diagnosed at 6 months of age with a neurological disorder called cerebral palsy. Max’s vocal folds vibrate and make sound, but he cannot control the muscles in his face well enough to produce speech.
He is, however, able to speak by selecting words and sentences on an electronic device, which produces a computerized voice known as “Perfect Paul.” Perfect Paul is a common voice on such devices, and it makes Max sound like an adult male robot. Recent advances in computing, however, have made it possible to analyse the characteristics of a voice and create a less artificial-sounding one.
These advances have mainly been commercial, geared towards improving the synthetic voices used in automated customer service calls. However, researchers have started using this technology to improve the electronic speech devices that assist people such as Max, helping them sound more natural.
Speech is not just a means of communication; it is a window into the soul, says Matthew Aylett, chief scientific officer of CereProc, an Edinburgh-based company that sells technology for personalizing synthetic voices. To capture inflection and tone in a synthetic voice is an enormous challenge, Dr. Aylett says. People speak faster, louder or at a higher pitch when they are upset, or slower, softer and deeper when they are sad. The stronger the emotion, the harder it is to simulate, he says. Telling a joke is also tough. Another thing that is hard to capture, says Dr. Aylett, is the myriad ways a voice can be used to convey different meanings of the same word. “When the waiter asks if you have decided what you want,” he says, “it means something different if you say ‘Yes!’ or ‘Yessss.’”
Creating a personalised voice for Max, however, is more than just a technical issue. It also requires matching the characteristics of a voice to his personality and identity. Should it match his age now or what he will sound like later in life?
These are questions that Rupal Patel, a speech-technology professor at Northeastern University, is focused on addressing. Dr. Patel’s company, VocaliD, has been finding ways to extract an individual’s unique sounds and combine them with the closest match from the donor voices collected in the voice bank she is building. The aim is to capture the personality of the recipient as well as the clarity of the donor.
As with many startups, costs are initially high. Max’s voice will be one of the first created by the company. The family agreed to pay $10,000, which they raised with help from a generous uncle and aunt and other donors, in return for three voices Max could choose from and guaranteed delivery by the end of 2015. With more experience and infrastructure in place, VocaliD is now charging customers $1,249 plus an annual fee of $240 to tune and modify the voice over time.
Max recognizes that it will still be difficult for him to communicate; however, he will sound a lot more natural when speaking. He will control his new voice using the same system, through a switch mounted on his wheelchair headrest. All that remains is for Max to choose between the three voices created for him.
The first voice is clear but still a little robotic. The second is higher and boyish. Dr. Patel says it reminds her of Max as he is now, with his smooth face and wide smile. The third is deeper, as if it belongs to someone a little older, the person Max might be someday.
In keeping with his passion for basketball, Max has decided to use his new voice publicly for the first time at his school, after the basketball team finishes its practice.
The locker room fills with players, Max’s family and friends, Dr. Patel and her colleagues, among others. The players flank Max, towering over him. Then the buzz in the room settles for a moment. It is Max’s turn to speak: “Getting a voice from VocaliD is going to be a great opportunity for me,” he says. People start to clap and cheer.
Max has chosen a voice that speaks to the future, the voice of his older self. Dr. Patel described the technology by saying, “A new voice is born.” Perhaps this is why, after trying out his new voice in the locker room, Max reacts as many do after witnessing a birth: he stops talking and begins to cry.