
Smallest.ai, a startup building enterprise-grade speech-generation and speech-recognition technology, has raised $8 million in seed funding. Silicon Valley venture firm Sierra Ventures led the round, with participation from India-based investors 3one4 Capital and Better Capital.
Founded last year by Sudarshan Kamath and Akshat Mandloi, both formerly of Robert Bosch GmbH, the Bengaluru-based company builds compact, high-speed AI models optimized for real-time voice interaction. Smallest.ai’s core pitch is reducing the lag and inconsistency that affect current text-to-speech (TTS) and voice recognition systems, particularly in high-volume customer service environments.
Smallest.ai’s research efforts focus on “Lightning,” its flagship TTS model, which the company claims can generate ten seconds of human-quality speech in 100 milliseconds, or one-tenth of a second. If the claim holds, Lightning would be roughly 50 times faster than several competing TTS models currently deployed in enterprise voice applications, a meaningful margin for real-time customer interaction. Lightning is engineered to generate multiple tokens concurrently, a design intended to cut processing latency while preserving audio fidelity. The model also reportedly requires less than one gigabyte of VRAM, which allows it to run on smaller, more accessible GPUs. That matters increasingly amid global supply constraints and rising costs for advanced AI hardware.
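To put the latency claim in concrete terms, the arithmetic can be sketched in Python. The input figures are the company's claims as reported above, not independent measurements. The function and variable names are illustrative, and the implied competitor time assumes the 50-times comparison refers to synthesizing the same ten-second clip, which the article does not confirm.

```python
def real_time_factor(synthesis_ms: float, audio_ms: float) -> float:
    """Return synthesis time divided by audio duration (lower means faster)."""
    if audio_ms <= 0:
        raise ValueError("audio_ms must be positive")
    return synthesis_ms / audio_ms


# Claimed figures from the article: 10 seconds of speech in 100 milliseconds.
CLAIMED_SYNTHESIS_MS = 100
CLAIMED_AUDIO_MS = 10_000

rtf = real_time_factor(CLAIMED_SYNTHESIS_MS, CLAIMED_AUDIO_MS)

# How many times faster than real-time playback the claim works out to.
speedup_vs_playback = CLAIMED_AUDIO_MS / CLAIMED_SYNTHESIS_MS

# Assumption: "roughly 50 times the speed" is measured on the same clip,
# so a competing model would need about 50x the synthesis time.
CLAIMED_SPEED_MULTIPLE = 50
implied_competitor_ms = CLAIMED_SYNTHESIS_MS * CLAIMED_SPEED_MULTIPLE

print(f"Real-time factor: {rtf}")
print(f"Speedup over playback: {speedup_vs_playback}x")
print(f"Implied competitor synthesis time: {implied_competitor_ms} ms")
```

Under these assumptions the claim corresponds to synthesis about 100 times faster than playback, and to competing models needing on the order of five seconds for the same clip.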
Smallest.ai is currently targeting the global contact center industry, where enterprises collectively spend more than $400 billion a year. Despite heavy investment in automation, human agents remain central to operations, largely because existing AI-driven voice systems have not kept pace with rising customer expectations. A recent Deloitte survey indicated that more than half of participating businesses expect call volumes to rise over the next five years, which intensifies demand for reliable, scalable voice automation. Long hold times and inconsistent service quality remain entrenched industry problems.
The company recently revealed “Electron v2,” a 4-billion-parameter voice model that can begin generating speech within 53 milliseconds, a metric known as time-to-first-token (TTFT). Electron v2 reportedly matches the expressive quality of models six times its size, which the company presents as a result of its work on model compression and performance optimization. The company’s platform currently lets enterprises train and deploy custom voice agents by uploading audio samples that capture a specific speaker’s tone and rhythm. Businesses can choose quick cloning from a 15-second audio clip or a higher-fidelity replication that requires 15 to 45 minutes of recorded audio.
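For readers unfamiliar with TTFT, the following is a minimal Python sketch of how the metric could be measured against any streaming source. It is not Smallest.ai's API: `fake_tts_stream` is a hypothetical stand-in that simply sleeps before yielding its first chunk, and the 0.053-second delay is borrowed from the reported figure purely for illustration.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_ttft(stream: Iterable[bytes]) -> Tuple[float, bytes]:
    """Return (seconds until the first chunk arrives, the first chunk)."""
    start = time.perf_counter()
    iterator = iter(stream)
    try:
        first_chunk = next(iterator)  # blocks until the source yields output
    except StopIteration:
        raise ValueError("stream produced no output") from None
    return time.perf_counter() - start, first_chunk


def fake_tts_stream(startup_delay_s: float, n_chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS client (not a real API)."""
    time.sleep(startup_delay_s)  # simulates model startup before first output
    for i in range(n_chunks):
        yield f"chunk-{i}".encode()


ttft_s, first_chunk = measure_ttft(fake_tts_stream(0.053))
print(f"TTFT: {ttft_s * 1000:.1f} ms, first chunk: {first_chunk!r}")
```

The timer starts before the first chunk is requested and stops as soon as it arrives, so the result reflects only the wait for initial output, not the time to synthesize the full utterance.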
Beyond voice replication, the system supports industry-specific configurations, offering automated agents pre-trained to handle specialized data in financial services, healthcare, and retail. These agents are designed to meet stringent regulatory and cybersecurity requirements. For clients facing the strictest data compliance mandates, the company also offers full on-premise deployment. “Voice is emerging as the next frontier in enterprise AI, and Smallest has built one of the most compelling platforms we’ve seen,” said Ashish Kakran, a partner at Sierra Ventures.