What is it about?
AI can now generate realistic speech that closely imitates a real person's voice from just a few seconds of audio. While this technology is exciting, most current tools require you to send private voice recordings to cloud-based servers, which creates a risk that your personal biometric data could be misused or stolen. We created Voicy to solve this problem. It is a tool that runs entirely on your own computer, so your voice data never leaves your device. Voicy lets you either clone a voice instantly from a short recording (zero-shot cloning) or train a custom model on your own data for higher quality (fine-tuning). We also built an automated pipeline that handles the technical cleanup, such as adjusting volume and transcribing audio, so you don't have to do it manually. With Voicy, researchers, creators, and professionals can achieve high-quality results while keeping their voice data safe and under their own control.
Why is it important?
As AI-generated voices become more common, the risk of "deepfakes" and unauthorized use of people's identities is growing. Current "black-box" services often strip users of control over their own biometric identifiers. Voicy is important because it demonstrates that you do not have to choose between high-quality AI results and data privacy. It provides a transparent, secure, and easy-to-use platform that lets anyone, from independent researchers to small teams, develop their own speech models without expensive equipment or reliance on risky cloud services. By running everything offline, Voicy helps set a new, responsible standard for the future of generative audio.
Perspectives
In my view, the current "privacy paradox" in AI, where we are forced to trade security for convenience, is unsustainable. My goal in developing Voicy was to hand power back to the user. I believe that anyone working with sensitive voice data, whether for medical needs, legal work, or personal projects, should be able to use the latest AI breakthroughs without fear of their data being exploited. Voicy is my contribution to a more ethical, transparent AI community where software is built to be secure by design, not just by necessity.
Kushal Pokhrel
Australian Institute of Higher Education Pty Ltd
Read the Original
This page is a summary of: VOICY: A Privacy-Centric Modular Architecture for Zero-Shot Voice Cloning and Fine-Tuned Speech Synthesis, May 2026, ACM (Association for Computing Machinery), DOI: 10.1145/3774905.3795605.