Extended Reality (XR) is rapidly evolving from visually immersive environments toward intelligent, responsive systems capable of adapting to human intent. A key milestone in this evolution has been achieved through the VoxReality project, which demonstrated how Artificial Intelligence (AI) can be effectively integrated into XR to enable more natural, voice-driven, and context-aware interactions. Building on these outcomes, future XR applications will offer more personalised and immersive experiences by understanding user needs and preferences.
Looking ahead, the integration of AI into XR applications is expected to rely increasingly on advanced human-AI collaborative frameworks, enabling the emergence of XR hybrid intelligent systems in which human expertise and AI capabilities are tightly coupled within immersive environments. Immersive spaces allow AI systems to observe user actions, gestures, speech, and attention in context, while users can directly perceive and interact with AI behaviour. XR hybrid intelligent systems will facilitate not only intelligent interactions and immersive experiences, but also personalised content generation that takes into account users’ knowledge, skills, and performance. For example, a virtual training assistant could dynamically generate immersive scenarios that specifically target and bridge individual knowledge gaps. Beyond content personalisation, XR hybrid intelligent systems will leverage user feedback to continuously adapt and improve system performance in alignment with evolving user needs. A key feature of these systems is empowering users to guide, correct, and shape AI behaviour over time.
In the context of XR hybrid intelligence, Explainable AI (XAI) and human-in-the-loop technologies become critical enablers. As XR systems grow more autonomous and intelligent, users should be able to understand why an AI agent behaves in a certain way. XAI techniques can make AI reasoning transparent within XR environments, for instance by visualising decision pathways, highlighting relevant contextual cues, or providing natural-language explanations. Moreover, the integration of XAI into XR environments introduces new opportunities for sense-making and reflection. By embedding explanations directly into immersive experiences, users can not only observe system outcomes but also explore the underlying reasoning processes in a spatial and interactive manner. This can transform XR from a passive visualisation medium into an active cognitive workspace where users learn with AI, rather than merely from it. In parallel, human-in-the-loop approaches invite user feedback, enabling continuous system adaptation, performance improvement, and trust calibration. Together, such transparency and responsiveness not only build trust but also support effective human-AI collaboration.
In conclusion, the VoxReality project represents an important step toward intelligent XR environments grounded in natural interaction. The future of XR will build upon these foundations by embracing hybrid intelligence, transforming XR into a space where humans and AI systems work together seamlessly, transparently, and creatively. This convergence will redefine how we interact with digital worlds: not as users of technology, but as partners within intelligent immersive environments.

Konstantia Zarkogianni
Associate Professor of Human-Centered AI
Department of Advanced Computing Sciences
Maastricht University

