Partner Interview #6 with Synelixis Solutions S.A.

In our sixth installment of the Partner Interview series, we sit down with Stavroula Bourou, a Machine Learning Engineer at Synelixis Solutions S.A., to explore the company’s vital role in the VOXReality project. Synelixis, a leader in advanced technology solutions, has been instrumental in developing innovative virtual agents and immersive XR applications that are transforming how we experience virtual conferences. In this interview, Stavroula shares insights into their groundbreaking work and how they are driving the future of communication in the XR landscape.

Can you provide an overview of your organization's involvement in the VOXReality project and your specific role within the consortium?

Synelixis Solutions S.A. has been an integral part of the VOXReality project from its inception, serving as one of the original members of the proposal team. Our organization brings a wealth of experience to the table, participating in numerous EU-funded research projects and providing cutting-edge technology solutions.

In the VOXReality project, our roles span several domains, significantly enhancing the project’s success. One of our pivotal contributions is the development of a virtual agent for use in virtual conferences. The agent is user-friendly and non-intrusive, respecting user requests and preferences while assisting users with navigational help and timely information about the conference schedule, among other tasks. Its design ensures that interactions are helpful without being disruptive, allowing users to engage with the conference content effectively and comfortably.

Additionally, we have developed one of the three VOXReality XR Applications—the VR Conference application. This application recreates a professional conference environment in virtual reality, complete with real-time translation capabilities and a virtual assistant. It enables users to interact seamlessly in their native languages, thanks to VOXReality’s translation services, thus breaking down language barriers. Furthermore, the virtual agent provides users with essential information about the conference environment and events, enhancing their overall experience.

Finally, we have outlined deployment guidelines for the VOXReality models covering four different methods: source code, Docker, Kubernetes, and ONNX in Unity. These guidelines are designed to facilitate the integration of VOXReality models into various applications, making the technology accessible to a broader audience.
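For the Docker and Kubernetes methods above, an application would typically reach the deployed model over HTTP. The following is a minimal Python sketch of such a client, assuming a hypothetical `/translate` endpoint and JSON fields (`text`, `source`, `target`, `translation`) — the actual VOXReality service API, URL, and payload schema may differ.

```python
import json
from urllib import request

# Hypothetical endpoint for a containerized translation model;
# the real VOXReality service URL and schema may differ.
API_URL = "http://localhost:8000/translate"


def build_payload(text: str, src_lang: str, tgt_lang: str) -> bytes:
    """Encode a translation request as UTF-8 JSON bytes."""
    return json.dumps(
        {"text": text, "source": src_lang, "target": tgt_lang}
    ).encode("utf-8")


def translate(text: str, src_lang: str, tgt_lang: str) -> str:
    """POST the request to the deployed model and return the translation."""
    req = request.Request(
        API_URL,
        data=build_payload(text, src_lang, tgt_lang),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["translation"]
```

The same client code works unchanged whether the model runs from source, in a local Docker container, or behind a Kubernetes service, which is one practical benefit of exposing the models over a common HTTP interface.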

How do you envision the convergence of NLP and CV technologies influencing the Extended Reality (XR) field within the context of the VOXReality initiative?

In the context of the VOXReality initiative, the convergence of Natural Language Processing (NLP) and Computer Vision (CV) technologies is poised to revolutionize the Extended Reality (XR) field. By integrating NLP, we enhance communication within XR environments, making it more intuitive and effective. This allows users to interact with the system using natural language, significantly improving accessibility and engagement. Additionally, this technology enables users who speak different languages to communicate with one another or to attend presentations and theatrical plays in foreign languages, thus overcoming language barriers and reaching a broader audience. Similarly, incorporating CV enables the system to understand and interpret visual information from the environment, which enhances the realism and responsiveness of both virtual agents and XR applications.

Together, these technologies enable a more immersive and interactive experience in XR. For example, in the VOXReality project, NLP and CV are being utilized to create environments where users can naturally interact with both the system and other users through voice commands. This integration not only makes XR environments more user-friendly but also significantly broadens their potential applications, ranging from virtual meetings and training sessions to more complex collaborative and educational tasks. The synergy of NLP and CV within the VOXReality initiative is set to redefine user interaction paradigms in XR, making them as real and responsive as interacting in the physical world.

What specific challenges do you anticipate in developing AI models that seamlessly integrate language as a core interaction medium and visual understanding for next-generation XR applications?

One of the primary challenges in developing AI models that integrate language and visual understanding for next-generation XR applications is creating a genuinely natural interaction experience. Achieving this requires not just the integration of NLP and CV technologies but their sophisticated synchronization to operate in real time without any perceptible delay. This synchronization is crucial because even minor lags can disrupt the user experience, breaking the immersion that is central to XR environments. Additionally, these models must comprehend and process user inputs accurately across a variety of dialects; handling multilingual and dialectal variations in real time adds significant complexity to AI model development.

Another significant challenge is the high computational demand of processing these complex AI tasks in real time. These models often need to perform intensive data processing rapidly to deliver seamless and responsive interactions. Optimizing them to function efficiently across different types of hardware, from high-end VR headsets to more accessible mobile devices, is crucial. Efficient operation without compromising performance is essential not only for ensuring a fluid user experience but also for the broader adoption of these advanced XR applications. The ability to run these complex models on a wide range of hardware platforms ensures that more users can enjoy the benefits of enriched XR environments, making the technology more inclusive and widespread.

All these challenges are being addressed within the scope of the VOXReality project. Stay tuned to learn more about our advancements and breakthroughs in this exciting field.

How do you plan to ensure the adaptability and learning capabilities of the virtual agents in varied XR scenarios?

To ensure the adaptability and learning capabilities of our virtual agents in varied XR scenarios within the VOXReality project, we are implementing several key strategies. Firstly, we utilize advanced machine learning techniques to equip the virtual agents with the ability to learn from user interactions and adapt their responses over time. These techniques, including deep learning and large language models (LLMs), enable the virtual agents to analyze and interpret vast amounts of data rapidly, thereby improving their ability to make informed decisions and respond to user inputs in a contextually appropriate manner, making them more intuitive and responsive.

Moreover, we are actively creating and curating a comprehensive dataset that reflects the real-world diversity of XR environments. This dataset includes a wide array of interactions, environmental conditions, and user behaviors. By training our virtual agents with this rich dataset, we enhance their ability to understand and react appropriately to both common and rare events, further boosting their effectiveness across various XR applications.

Through these methods, we aim to develop virtual agents that are not only capable of adapting to new and evolving XR scenarios but are also equipped to continuously improve their performance through ongoing learning and interaction with users.

In the long term, how do you foresee digital agents evolving and becoming integral parts of our daily lives, considering advancements in spatial and semantic understanding through NLP, CV, and AI?

In the long term, we foresee digital agents evolving significantly, becoming integral to our daily lives as advancements in NLP, CV, and AI continue to enhance their spatial and semantic understanding. As these technologies develop, digital agents will become increasingly capable of understanding and interacting with the world in ways that are as complex as human interactions.

With improved NLP capabilities, digital agents will be able to comprehend and respond to natural language with greater accuracy and contextual awareness, making interactions feel more conversational and intuitive. This advancement also includes sophisticated translation capabilities, enabling agents to bridge language barriers seamlessly. As a result, they can serve global user bases by facilitating multilingual communication, which enhances accessibility and inclusivity. This will allow them to serve in more personalized roles, such as personal assistants that can manage schedules, respond to queries, and even provide companionship with a level of empathy and understanding that closely mirrors human interaction.

Advancements in CV will enable these agents to perceive the physical world with enhanced clarity and detail. They’ll be able to recognize objects, interpret scenes, and navigate spaces autonomously. This will be particularly transformative in sectors like healthcare, where agents could assist in monitoring and providing care, and in retail, where they could offer highly personalized shopping experiences.

Furthermore, as AI technologies continue to mature, we will see digital agents performing complex decision-making tasks, learning from their environments, and operating autonomously within predefined ethical guidelines. They will become co-workers, caregivers, educators, and even creative partners, deeply embedded in all aspects of human activity.

Ultimately, the integration of these agents into daily life will depend on their ability to operate seamlessly and discreetly, enhancing our productivity and well-being without compromising our privacy or autonomy. As we advance these technologies, we must also consider the ethical implications and ensure that digital agents are developed in a way that is beneficial, safe, and respectful of human values.

Stavroula Bourou

Machine Learning Engineer at Synelixis Solutions SA
