The Role
As a Conversational Video Interface Engineer, you’ll be at the forefront of building and optimizing our CVI platform—real-time, multimodal AI systems that bring digital avatars to life. You’ll collaborate across engineering, research, and design teams to integrate vision, speech, and emotional intelligence into seamless, human-like conversations.
What You’ll Do
Develop & Optimize CVI Components: Build core systems for WebRTC/video conferencing, speech recognition (ASR), text-to-speech (TTS), vision processing, and replica video output.
Integrate Multimodal AI Models: Work with in-house models like Phoenix-3 (avatar rendering), Sparrow-0 (conversational pacing), and Raven-0 (visual perception) to create responsive digital twins.
Ensure Real-Time Performance: Optimize the CVI pipeline to maintain sub-600ms utterance-to-utterance latency.
Collaborate Cross-Functionally: Partner with AI researchers, product managers, and UX designers to align development with user needs.
Enhance API Infrastructure: Build and maintain scalable APIs for external developers to integrate CVI into their platforms.
What We’re Looking For
Education: Bachelor’s or Master’s in Computer Science, Electrical Engineering, or a related field.
Experience: 3+ years in software engineering focused on real-time systems, multimedia processing, or AI integration.
Programming Skills: Proficiency in Python, C++, or JavaScript, with experience using frameworks like TensorFlow or PyTorch.
Multimedia Knowledge: Familiarity with WebRTC, streaming protocols, and audio/video processing.
AI Integration: Experience deploying ML models in production—especially in vision, speech, or NLP domains.
Analytical Thinking: Strong debugging and problem-solving skills in complex systems.
Bonus Points
Experience building or working with digital avatars or human representations.
Knowledge of multilingual systems and cultural nuance in AI communication.
Cloud infrastructure experience (AWS, GCP, Azure).
Benefits (not limited to):
Join us in shaping the future of human-AI interaction. Apply today and help us build lifelike, emotionally intelligent AI.