Voice recognition is used in many modern devices. Mobile phones, TVs and computers can already be controlled with voice commands to fetch information, change channels or tell the time. But the human voice could be used for so much more, and the SHIFT 2018 Emotionally Intelligent Machines track showcased one innovative application: what if we could analyze a person’s voice to find out about their health?
Dr. Yoram Levanon is the Chief Science Officer of Beyond Verbal, a world leader in Emotion Analytics and vocal biomarkers. Beyond Verbal have developed technology for healthcare screening and remote monitoring of health and emotions based on speech recognition and AI solutions.
Before starting Beyond Verbal, Dr. Levanon worked on software that analyzes conversations in real time and provides information to each participant about the other's mood and emotions. Beyond Verbal was founded in 2012 to develop emotion detection technology, building on over 20 years of research and experience in the field. Their focus later shifted towards healthcare, and with their patented vocal biomarker technology and a large database of anonymized recordings, Beyond Verbal are now developing solutions for predicting and monitoring chronic conditions.
Speaking sounds easy, but it requires multiple parts of the body to work together. Our brain, muscles and respiratory system must all function just right to produce speech that others can easily recognize and understand. With all these different parts in the mix, small variations can and do occur, and things like hypernasality, slurring, a flattened monotone, changes in pitch or the distribution of pauses can all point to a range of underlying issues.
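To make one of these cues concrete, here is a minimal, illustrative sketch (not Beyond Verbal's actual method) of extracting a single vocal feature, the distribution of pauses, from a mono audio signal by thresholding short-time energy. All function names and threshold values here are hypothetical choices for the example.

```python
import math

def pause_segments(samples, sample_rate, frame_ms=25, energy_threshold=0.01):
    """Return (start_sec, end_sec) spans where the signal is near-silent."""
    frame_len = int(sample_rate * frame_ms / 1000)
    pauses, start = [], None
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(s * s for s in frame) / frame_len  # mean power of the frame
        silent = energy < energy_threshold
        if silent and start is None:
            start = i / sample_rate          # a pause begins
        elif not silent and start is not None:
            pauses.append((start, i / sample_rate))  # the pause ends
            start = None
    if start is not None:                    # signal ended mid-pause
        pauses.append((start, len(samples) / sample_rate))
    return pauses

# Toy usage: 1 s of "speech" (a loud tone), 1 s of silence, 1 s of speech.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * t / rate) for t in range(rate)]
silence = [0.0] * rate
signal = tone + silence + tone
print(pause_segments(signal, rate))  # -> [(1.0, 2.0)]
```

A real system would of course use far richer features (pitch contours, spectral measures, nasality cues) and robust voice-activity detection, but the shape of the task is the same: turn raw audio into measurable deviations.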
Analyzing these subtle acoustic deviations is only half of what building a database of vocal biomarkers entails. The other half has to do with context: choice of words, grammar, cultural differences, the speaker's history and emotional state must all be included in the analysis. While Beyond Verbal's database is anonymized, and only the vocal patterns are captured for later use, the database is compiled from real speech on different topics, and the reason for something like excessive pauses might in some cases be as simple as not wanting to talk about a certain topic.
As a result, what was a complex task to begin with suddenly requires analyzing and evaluating even more information to ensure correct results.
Trained with this massive database of voice recordings and correlated medical records, Beyond Verbal's system of AI, machine learning and deep learning techniques can now spot and understand these nuanced variations. Requiring only a short voice sample, the system can analyze anyone's speech and provide information about possible chronic diseases. By identifying novel vocal biomarkers and building new prediction algorithms, even more diseases could be detected in the future.
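The overall shape of such a pipeline, vocal features paired with labels drawn from correlated medical records, then used to classify new samples, can be sketched as follows. The nearest-centroid model and the toy feature and label data below are illustrative stand-ins, not Beyond Verbal's actual algorithms or data.

```python
def train_centroids(features, labels):
    """Average the feature vectors per label (a minimal 'model')."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, v in enumerate(vec):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [v / counts[lbl] for v in acc] for lbl, acc in sums.items()}

def predict(centroids, vec):
    """Assign the label whose centroid is nearest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, vec))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

# Hypothetical features per recording: [pitch_variation, pause_ratio].
features = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.6], [0.3, 0.7]]
labels = ["control", "control", "at_risk", "at_risk"]
model = train_centroids(features, labels)
print(predict(model, [0.25, 0.65]))  # -> at_risk
```

A production system would replace this with deep learning over far higher-dimensional features and much larger datasets, but the pairing of vocal patterns with medical outcomes is the core idea.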
The robust R&D platform enables deeper big data research and is rapidly growing and scalable. Collaboration with other research and technology providers, and commercializing the results, benefits everyone using the platform and allows more data to flow in. Vocal biomarkers for PTSD, traumatic brain injury, depression, autism, heart diseases and Parkinson's disease have already been identified in the database. As the database grows, markers denoting health conditions like cancer, hypertension and diabetes could be found.
As noted, one issue that the system already seeks to take into account is people themselves. We may lie, avoid talking about our issues (especially mental ones), or consciously try to influence the results. Because of this, mental health patients can easily be underdiagnosed. To counter this, Beyond Verbal's solution is non-intrusive, continuous and passive. Patients can record themselves when they want to and thus provide more representative samples, making diagnosis more accurate and easier for the patients. Recordings need to be of a certain quality, but the devices we use in our daily lives have already been shown to produce good results.
Each anonymized recording in the database correlates to a healthcare history. Although the database entries do not contain the actual content of the words, only vocal patterns, security is always a concern when handling sensitive personal information. On the other hand, a system like this can easily be seen as the natural evolution of the digitized healthcare databases already in use, and as something that calls for equivalent safety measures.
Much like those existing databases, Beyond Verbal's system, integrated into our healthcare systems, could provide easy, cheap and flexible predictive screening for different conditions. It could allow doctors to be more aware of their patients' health and to make more precise diagnoses, by effectively seeing right through what we say and directly into how we say it.