Building voice-driven AI applications using LLMs

This article looks at the potential of voice-driven AI applications built on large language models (LLMs). It identifies three basic components of a voice-driven LLM application: speech-to-text, the LLM itself, and text-to-speech. It also covers the benefits of running application logic in the cloud, the challenges of phrase detection and endpointing (deciding when a speaker has finished talking), and considerations for audio buffer management. Throughout, it emphasizes that voice-driven LLM apps need reliable, low-latency data flow.
Original article: How to talk to an LLM (with your voice)
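To make the endpointing and buffering challenge from the summary more concrete, here is a minimal sketch of one simple approach: buffer audio frames while they are loud, and close the phrase after a run of quiet frames. This is an illustration only, not the method described in the original article. The `Endpointer` class, the `rms` helper, and the threshold values are all hypothetical, and production systems typically rely on a trained voice activity detector rather than a fixed energy threshold.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List

Frame = List[int]  # one short chunk of PCM samples (assumed 16-bit integers)


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


@dataclass
class Endpointer:
    # Both defaults are illustrative assumptions, not values from the article.
    energy_threshold: float = 500.0  # RMS at or above this counts as speech
    silence_frames_to_end: int = 3   # consecutive quiet frames that end a phrase

    def segment(self, frames: Iterable[Frame]) -> List[List[Frame]]:
        """Group speech frames into phrases separated by sustained silence."""
        phrases: List[List[Frame]] = []
        buffer: List[Frame] = []  # speech frames of the phrase in progress
        quiet_run = 0             # quiet frames seen since the last speech frame

        for frame in frames:
            if rms(frame) >= self.energy_threshold:
                buffer.append(frame)
                quiet_run = 0
            elif buffer:
                # Quiet frame inside a phrase: count it, but do not store it.
                quiet_run += 1
                if quiet_run >= self.silence_frames_to_end:
                    phrases.append(buffer)  # endpoint reached, flush the buffer
                    buffer = []
                    quiet_run = 0

        if buffer:
            phrases.append(buffer)  # flush a phrase cut off by end of input
        return phrases


if __name__ == "__main__":
    loud, quiet = [1000] * 160, [0] * 160
    stream = [quiet, loud, loud, quiet, loud, quiet, quiet, quiet, loud, quiet]
    for i, phrase in enumerate(Endpointer().segment(stream)):
        print(f"phrase {i}: {len(phrase)} speech frames")
```

The trade-off sits in `silence_frames_to_end`: a small value responds quickly but may cut a speaker off mid-pause, while a large value avoids that at the cost of added latency before the phrase is handed to speech-to-text.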