In this edition of the Contact Center Top Voice interview, I sit down with Davit Baghdasaryan, the visionary Co-Founder and CEO of Krisp, whose Voice AI innovations are setting a new standard for clarity in human communication.
Davit reveals how Krisp is tackling one of the contact center industry’s most overlooked challenges—the struggle to truly hear and be heard. From pioneering Accent Conversion AI that bridges global voices to expanding across Africa’s diverse linguistic landscape, he shares a radical vision where technology doesn’t replace conversation—it restores its quality. This dialogue goes beyond AI and automation; it’s about eliminating bias, unlocking talent, and redefining what “human connection at scale” really means in the age of intelligent CX.
Davit: The contact center industry is where brands’ promises are tested and delivered in real time. It’s where communication either works or breaks. What’s always surprised me is how little innovation actually reached that front line. For decades, the same problems persisted: agents struggling to be heard, customers repeating themselves, frustration building on both sides.
I started Krisp, alongside my co-founder Arto Minasyan, because we saw a fundamental flaw in how voice technology was treated. The industry was chasing automation or analytics, and most organizations were too afraid to touch this technology because voice is such critical infrastructure. We saw a gap in the market around the foundational issue of clarity.
We began by eliminating background noise to help people simply be heard. Once we solved that, we realized how much more could be improved in real time. Now we’re using AI to enhance not just what people hear, but how they understand each other. That’s what excites me. Voice AI isn’t about replacing human conversation; it’s about restoring its quality at scale and powering a new era of human connection.
Davit: Contact centers live and die on voice quality. If people can’t clearly understand each other, the entire experience collapses, no matter how well the script or training is designed. That problem has existed for decades, and most of the industry has tried to work around it instead of solving it.
We launched Krisp’s Accent Conversion technology to fix the live conversation itself. Not the transcript or the recording after the call; the real moment when two people are trying to understand each other. In a global contact center, accent gaps can create real friction. Agents might speak clearly, but the listener still struggles.
Our AI bridges that gap in real time. It adjusts speech patterns just enough to make them easier to understand, without changing who the speaker is or how they sound. The result is a smoother exchange where both sides stay focused on the conversation, not the comprehension. Agents sound more confident, customers stay engaged, and communication feels more natural for everyone involved.
This isn’t about replacing agents or automating empathy. It’s about giving people the tools to be clearly heard and easily understood. This results in shorter handling times, higher satisfaction scores, and lower agent turnover. But the real outcome is simpler: every call feels effortless. That’s what Krisp was built to deliver.
CCTI: What are Krisp’s core offerings for cloud-based contact centers? What does your product roadmap look like for 2025-2026?
Davit: Krisp is a real-time Voice AI platform built for modern contact centers that depend on clear communication. It enhances live conversations as they happen, improving how agents and customers hear and understand each other.
Our core offerings include AI Noise Cancellation and Voice Isolation to remove background distractions and make speech clearer, Accent Conversion to improve understanding across different accents, and Voice Translation to bridge language gaps in real time. We also provide Agent Assist, which gives agents live context, guidance, and summaries so they can stay focused on the customer and the live interaction. Coming soon, Speech Analytics will allow contact centers to track performance, score 100% of calls, and monitor compliance securely, without recording or storing any sensitive data.
Everything runs securely in real time, on-device or within enterprise infrastructure. No recordings or data leave the session. That privacy-first design is a non-negotiable for us. It’s what lets global CX teams scale Krisp without worrying about compliance or data exposure. When agents sound clear and customers can follow every word, the experience just works.
Davit: Africa is one of the most linguistically rich and diverse regions in the world. For us, that makes it one of the most exciting. Accent Conversion isn’t about standardizing how people speak; it’s about making sure they can be understood without losing their identity.
Scaling into African English accents requires more than adding data. It means training AI models that capture the unique rhythm, tone, and cadence of each dialect. We’ve trained our models using regional datasets and on-device learning to fine-tune performance for specific linguistic patterns, without sending any voice data outside the enterprise.
Our focus is on accuracy and authenticity. The AI should enhance clarity, not flatten the speaker’s character. A Nigerian agent should still sound Nigerian, just easier to understand for a customer in London or New York.
Our long-term vision is to make real-time voice clarity inclusive of every accent, not just the ones historically prioritized by technology. Africa is a crucial step toward that goal. It’s where we’ll prove that clarity and cultural identity can coexist at scale.
Davit: Accent bias has quietly shaped hiring and performance in contact centers for years, and it is one of the most invisible yet damaging forms of friction in global CX. It limits opportunity for agents and affects how customers perceive service quality. Our goal with Accent Conversion is to remove that bias at the root by making understanding effortless on both sides of the conversation.
We’ll measure success the same way our customers measure performance: through outcomes. Specifically:
The biggest operational challenge is data diversity. Accents evolve regionally, even within the same country. We’re addressing that by continuously updating our models with representative datasets and feedback from local teams, not just lab conditions.
Reducing bias isn’t a one-time launch; it’s a continuous process of listening, measuring, and improving. The goal isn’t just better technology; it’s broader opportunity for skilled people whose voices deserve to be heard clearly.
Davit: With real-time voice AI, performance, latency, and privacy aren’t competing priorities; they’re design imperatives. You can’t compromise one without breaking the customer experience.
Our approach has always been to solve this at the architectural level. Krisp runs entirely in real time, either on-device or within the enterprise’s secure cloud infrastructure. That means audio never leaves the session, and there’s no dependency on external cloud processing. It keeps latency minimal, imperceptible to the ear, while meeting the highest data privacy standards.
We’ve also optimized our models to run efficiently on low-bandwidth networks, which is critical in regions where infrastructure can vary. The result is consistent clarity and reliability, regardless of geography or hardware constraints.
Enterprises shouldn’t have to choose between privacy, performance, and experience. The right architecture delivers all three. That’s what we’ve built Krisp to do.
Davit: Africa is becoming a major CX hub because it combines linguistic diversity, cultural alignment with Western markets, and a young, motivated, tech-ready workforce. Our strategy is to enable that growth by localizing deeply rather than expanding broadly.
We’re partnering with regional BPOs, enterprises, and CX technology platforms to integrate Krisp directly into their ecosystems. That approach lets us scale fast while ensuring the technology fits local infrastructure, bandwidth realities, and linguistic needs. Our Accent Conversion expansion is a prime example, built with African English data and tested with local agents, not imported models.
Customer acquisition in Africa is also relationship-driven. We’re working closely with enterprises and BPOs that view voice quality as a competitive advantage, not just an operational cost. Krisp helps them win higher-value contracts by delivering clearer, more productive voice interactions.
Davit: Over the next 18 to 24 months, our focus is on evolving Krisp from a clarity engine into a full real-time voice intelligence layer. The goal is simple: make every live conversation not only clearer, but smarter. We already provide multilingual translation and agent assist tools, and will continue to invest in our core innovations to raise the bar across all of our offerings, including:
What makes Krisp different is our architecture. Everything runs securely in real time, either on-device or within the enterprise environment. That gives BPOs the ability to deliver AI-powered CX at a fraction of the cost of traditional analytics or automation platforms.
In Africa, our roadmap is directly shaped by the region’s strengths: multilingual empathetic talent, rapid digital adoption, and growing demand for secure, scalable AI infrastructure. Our mission is to equip local CX providers with the same advanced tools as their global counterparts, helping them move up the value chain from service providers to strategic partners.
Davit: Scale only matters if it’s reliable, private, and compliant. Our systems already handle over 80 billion voice minutes a month, and the same architecture supports our Africa expansion.
Consistent with Krisp’s privacy-by-design model, our Africa deployment adheres to regional data sovereignty frameworks like POPIA while maintaining low-latency performance, even in variable bandwidth environments.
The main challenge isn’t regulation but infrastructure. Bandwidth varies widely, so we’ve optimized our models for low-network environments while keeping latency under 100 milliseconds.
Our approach stays the same everywhere: privacy by design, performance by architecture, and adaptability by region.
Davit: Today, CX data lives in silos: voice, chat, CRM, analytics. MCP changes that by creating a living, real-time profile that agents and AI can act on instantly. When combined with technologies like Krisp’s real-time voice intelligence, it allows every conversation to start with context already in place.
That means no repetition, faster resolution, and interactions that feel genuinely human. The contact center agent becomes an informed problem-solver, not a data retriever.
In the long run, MCP will redefine how organizations compete. The brands that win won’t just know their customers; they’ll understand them in real time and act on that understanding with precision.
Thank you, Davit! We look forward to having you again at our Top Voice programs.
About Krisp
Founded in 2017, Krisp pioneered the world’s first real-time Voice Productivity software. Krisp’s Voice AI technology enhances digital voice communication through audio cleansing, noise cancellation, accent conversion, live speech-to-speech translation, and agent assist. Offering full privacy, Krisp works on-device, across all audio hardware configurations and applications that support digital voice communication. Today, Krisp is deployed on over 200 million devices, has transcribed over 40 million calls, and processes over 80 billion minutes of voice conversations every month, helping businesses harness the power of voice to unlock higher productivity and deliver better business outcomes.
Sudipto Ghosh is the Director of Global Marketing at Intent Amplify, a leading AI-powered intent data company.