Shravani Limited Releases VideoAvatar v2: The 420k Update
A significant leap forward in regional voice synthesis technology, featuring 420,000 training steps, F5-TTS architecture, and unprecedented micro-prosody control.
ST HELENS, UK – Shravani Limited today announces the immediate availability of VideoAvatar UK Voice Engine v2. Following the success of the initial release, this major update represents a "Perfection Phase" for the engine, pushing beyond standard text-to-speech limits to capture the subtle nuances of UK dialects—from the glottal stops of a London accent to the melodic lilt of a Welsh speaker.
Key Innovations to the Core
Most open-source models plateau early in their training. We pushed v2 to a massive 420,000 training steps, allowing the model to learn not just the sounds of speech, but the microscopic pitch shifts, breath patterns, and emotional inflections that define truly human interaction.
"With v2, we aren't just synthesizing speech; we're synthesizing the 'soul' of a region. It's not about sounding perfect; it's about sounding local. The jump to 420k steps has unlocked a level of warmth and authenticity we hadn't seen before."— Harish S. Agawane, Founder
Technical Breakdown
The v2 engine is built on the advanced F5-TTS (Flow Matching) architecture, a significant upgrade over previous diffusion-based models.
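For readers unfamiliar with flow matching, the standard conditional flow-matching objective from the literature gives the flavor of how such models are trained. This is a general sketch of the technique the F5-TTS family is based on, not a confirmed detail of this release; the symbols below (noise sample, target mel-spectrogram, conditioning) are illustrative.

\[
x_t = (1-t)\,x_0 + t\,x_1, \qquad
\mathcal{L}_{\mathrm{CFM}}(\theta) \;=\;
\mathbb{E}_{t,\,x_0,\,x_1}\,
\big\lVert\, v_\theta(x_t,\, t \mid c) \;-\; (x_1 - x_0) \,\big\rVert^2
\]

Here \(x_0 \sim \mathcal{N}(0, I)\) is noise, \(x_1\) is the target mel-spectrogram, and \(c\) is the text and reference-audio conditioning. Because the model learns a straight-line velocity field rather than reversing a long diffusion chain, inference needs far fewer steps, which is the source of the faster-inference claim above.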
420k Training Steps
Extensive training for superior emotional range and stability.
Instant Cloning
Zero-shot cloning from just 3-10 seconds of reference audio.
F5-TTS Arch
Flow Matching for faster inference and better prosody.
Open Source
Released under CC-BY-NC-SA 4.0 for community innovation.
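The 3-10 second window for reference audio is the one hard constraint a caller can check before submitting a clip. The helper below is a minimal sketch of such a pre-flight check; the function name `is_valid_reference` and the 24 kHz sample rate are illustrative assumptions, not part of the released API.

```python
# Hypothetical pre-flight check for zero-shot cloning: the engine expects
# 3-10 seconds of reference audio. Helper name and sample rate are assumed.

def is_valid_reference(num_samples: int, sample_rate: int = 24_000) -> bool:
    """Return True if the clip duration falls in the 3-10 second window."""
    duration_s = num_samples / sample_rate
    return 3.0 <= duration_s <= 10.0

# A 5-second clip at 24 kHz passes; a 1-second clip is too short.
print(is_valid_reference(5 * 24_000))  # → True
print(is_valid_reference(1 * 24_000))  # → False
```

Running this check client-side avoids a round trip to the engine for clips that would be rejected anyway.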
Validated Performance
The model has been rigorously validated against the CSTR VCTK Corpus (109 native speakers) and the Mozilla Common Voice dataset to ensure it generalizes well across diverse recording conditions and accent variations.

Source code available on GitHub

Model weights released on Hugging Face
Experience VideoAvatar v2
The weights are available now for researchers and developers. Join us in building the next generation of voice AI.