Voice-driven interaction
in XR spaces

VOXReality is an initiative that aims to facilitate the convergence of Natural Language Processing (NLP) and Computer Vision (CV) technologies in the Extended Reality (XR) field. We will develop innovative Artificial Intelligence (AI) models that use language as the core interaction medium, supported by visual understanding, to deliver next-generation applications that comprehend users' goals, surrounding environment, and context.



Improve human-to-machine and
human-to-human XR experiences

Extend and improve the visual
grounding of language models

Widen multilingual translation
and adapt it to different contexts

Provide accessible pretrained XR
models optimized for deployment

Automate the generation of virtual
agents using multimodal information

Demonstrate clear integration paths
for the VOXReality pretrained models



Digital Agent

Personal assistants are an emerging type of digital technology that supports humans in their daily tasks, with core functionality centered on human-to-machine interaction.


Virtual Conferencing

Virtual conferences are hosted and run entirely online, typically on a virtual conferencing platform that sets up a shared virtual environment, allowing attendees to view or participate from anywhere in the world.



In theaters, VOXReality will combine language translation, audio-visual user associations, and AR VFX triggered by predetermined speech.


Latest News



Meet the VOXReality team
