We raised $5 million and secured two U.S. patents. Here is what we built and why.

Chloe Duckworth

Every conversation carries two streams of information. The first is linguistic. The second is emotional—the tone that signals frustration before the caller names it, the pacing that precedes a hang-up, the flatness that indicates the conversation is no longer connecting. For as long as voice technology has existed, the industry has processed the first stream and discarded the second.
Shannon and I both studied the underlying problem at USC (she in signal processing and machine learning, I in computational neuroscience), and when we left to build Valence AI, we started where every prior attempt had failed: the data. Scraped public audio produces models that perform in controlled conditions and break at the margins. We built our training sets through self-annotated evoked-emotion surveys, crowdsourced from a sample representative of each language's speakers across accent, age, race, gender, neurotype, and geographic location.
The result is a signal processing pipeline, protected by two issued U.S. patents, that classifies emotional state from raw audio in under one second, independent of words used. Conversational analytics mean more when emotions can be tracked over time. Contact center operators in production are seeing 30 percent reductions in handle time. That number reflects something more specific than efficiency: it is what happens when a system can finally hear not just what customers say, but how they feel when they say it.
Today we are announcing $5 million in seed funding led by Differential Ventures. The release is here.
What I want to say beyond the capital is this: the gap between machine intelligence and human experience is not a design problem and it is not a feature gap. It is an infrastructure problem. The emotional signal has always existed in the audio layer of every call. The industry built stacks that discard it. Brand loyalty, customer satisfaction, and the operational metrics that follow from both—cost per call, resolution rates, churn—are all downstream of an interaction quality that existing voice AI is structurally incapable of measuring.
We built the infrastructure to change that.




