
Economic Insights into the Growth of Immersive Technologies Markets

The evolution from the early days of computing to the present marks a clear shift towards more intuitive and engaging interactions with technology. This transformation has been propelled in particular by the rise of augmented and virtual reality (AR/VR) technologies, together with the profound impact of the COVID-19 pandemic, which hastened the acceptance of digital realms and the concept of the metaverse.

As of 2021, the global immersive technology market reached a valuation of USD 21.66 billion, and projections indicate a substantial increase to approximately USD 134.18 billion by 2030. This signifies a robust compound annual growth rate (CAGR) of 22.46% from 2022 to 2030. Deloitte Global forecasts a remarkable 50% increase in the virtual reality market, predicting a revenue of US$ 7 billion globally in 2023, up from 2022’s US$ 4.7 billion. 

These figures present a promising outlook for a continually expanding market, resilient even in the face of the recent COVID-19 pandemic. Notably, as a survey conducted in 2020 by McKinsey & Co. highlights, the global pandemic accelerated the development and encouraged the adoption of VR and AR technology, which made it possible to complete activities previously done only in person. Companies have started to use these technologies more intensively because they allow daily working tasks to be performed remotely, while “offering endless possibilities for better learning, productivity and creativity outcomes in every way”.

The role of Europe

Looking at the European XR industry, it is anticipated to reach between €35 billion and €65 billion by 2025, with a gross added value ranging from €20 billion to €40 billion. This growth is expected to directly create employment for up to 860,000 people. The momentum in the VR/AR sector can be attributed to two key factors: the availability of advanced XR technologies, including more comfortable and affordable headsets, and the rising demand for enterprise XR solutions as businesses recognise their potential benefits.

The European Union has been a staunch supporter of digitalisation and XR technology development, evident in its funding of innovative digital research projects under the Horizon 2020 and Horizon Europe programmes. Additionally, the European XR ecosystem thrives with events, initiatives, and associations like EuroXR, uniting national associations, individual members, and companies interested in XR. The EU’s overarching goal is to enhance digital literacy, transforming Europe into a thriving and highly competitive hub for XR activities.

However, recent geopolitical events, particularly the war in Ukraine, have had a significant impact on the immersive technology market. Sanctions and the withdrawal of companies from the Russian market, including Microsoft’s suspension of HoloLens sales, have disrupted the AR landscape. The effects have rippled into the European market, with Central and Eastern European countries experiencing the most substantial consequences, given their slower and more price-sensitive AR markets.

Conclusion

The trajectory of immersive technology, fuelled by advancements in XR technologies and the transformative influence of the COVID-19 pandemic, showcases a robust and continually expanding global market. Projections underscore the sector’s resilience and potential for substantial growth, with the European Union playing a pivotal role in fostering digital innovation and XR development. 

Despite geopolitical challenges impacting the industry, the commitment to digital literacy and strategic support for XR activities position Europe as a competitive hub. As we navigate this dynamic landscape, it becomes evident that immersive technologies are not merely trends but integral components shaping the future of how we interact with and perceive the digital realm.

References

1. https://www.precedenceresearch.com/immersive-technology-market
2. Lee P., Arkenberg C., Stanton B., Cook A., "Will VR go from niche to mainstream? It all depends on compelling VR content", in Deloitte's Technology, Media and Telecommunications Predictions, 2023, p. 71.
3. https://www.mckinsey.com/capabilities/strategy-and-corporate-finance/our-insights/how-covid-19-has-pushed-companies-over-the-technology-tipping-point-and-transformed-business-forever#/
4. Globally Cool B.V., The European market potential for VR and AR services, 2021.
5. World Economic Forum, Immersive Media Technologies: The Acceleration of Augmented and Virtual Reality in the Wake of COVID-19, 2022, p. 8.
6. Ecorys, XR and its potential for Europe, 2021; Vigkos A., Bevacqua D., Turturro L., Kuehl S., The Virtual and Augmented Reality Industrial Coalition, Ecorys, 2022.
7. https://blog-idceurope.com/how-the-russia-ukraine-war-is-impacting-the-human-augmentation-market-in-europe/


Alberto Casanova

Alberto Casanova is an EU Project Manager in the R&D department of Maggioli Group, one of Italy's foremost companies providing software and digital services for Public Administrations. With more than five years of experience in this role, Alberto is engaged in proposal preparation and project management; he specializes in Business and Exploitation activities, with a specific focus on European Projects. He has successfully led numerous projects in diverse fields, including e-Health, Security, Industry 4.0, Cloud Technologies, and Immersive Technologies. Alberto is currently involved in the project coordination of the VOXReality project, where he takes the lead in overseeing exploitation activities.


Redefining Connection: How VR and Artificial Intelligence can Transform Business Networking

In the evolving landscape of technology, networking in professional and business events is not just about exchanging business cards or attending physical conferences. Thanks to pioneering developments in Virtual Reality (VR) and Artificial Intelligence (AI), networking is undergoing a transformative shift. These technologies are not just changing the way we connect but are redefining the entire landscape of professional networking. Let’s explore how VR and AI are becoming necessary tools for modern networking, creating immersive user experiences in virtual spaces.

Photo by Remy Gieling on Unsplash

The Immersive World of VR in Networking

Imagine attending a global conference from the comfort of your home. Virtual Reality makes this possible by creating immersive environments where you can interact with others in a 3D space. This interaction can replicate both the physical presence and the feel of an in-person event.

It’s more than a video call; it’s a virtual presence. You can “shake hands” with participants from another continent, engage in one-on-one conversations, and attend virtual trade shows or a keynote speech as if you were there, enhancing the sense of connection and engagement. VR in networking bridges the gap between physical distance and personal connection, offering an engaging and interactive experience that traditional video conferencing cannot match.

In the immersive VR environment, diverse virtual spaces can cover every aspect of professional interaction. To begin with, the Lobby can be characterised as a welcoming gateway, helping attendees navigate the event. The Trade Show Area serves as an innovative exhibition hall for businesses to showcase their products and engage in discussions with other participants about their offered solutions. 

The Conference Area replicates the experience of attending seminars and lectures, enhancing knowledge sharing. For formal meetings and collaborations, the Business Area provides the perfect setting, while the Social Area offers a relaxed atmosphere for casual conversations. Together, these spaces create a comprehensive and immersive VR networking experience that mimics the real-life spaces of in-person networking events.

Image by Thomas Meier from Pixabay

Breaking Language Barriers with AI Translation

The human-to-human interaction of global networking events faces a significant challenge: the language barrier. AI-driven translation services are a game-changer in this aspect, especially in virtual environments. Real-time translation tools, integrated within the VR platform, enable seamless communication between participants speaking different languages, allowing for a truly global networking experience. This not only enhances understanding but also opens doors to cross-cultural collaborations that were previously hindered by language constraints.
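At its core, a feature like this chains speech recognition with per-utterance machine translation. The snippet below is a minimal sketch of the translation step using an open-source model from the Hugging Face transformers library; the model choice, the German-to-English pair, and the function shape are illustrative assumptions rather than VOXReality's actual pipeline.

```python
# Hypothetical per-utterance translation step for a VR event platform.
# Model and language pair are arbitrary examples (German -> English).
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")

def translate_utterance(text: str) -> str:
    """Translate one recognised utterance for subtitles or speech synthesis."""
    return translator(text, max_length=256)[0]["translation_text"]

print(translate_utterance("Guten Tag, schön Sie auf der Konferenz zu treffen."))
```

In a live event, a call like this would run on each recognised utterance, with the output rendered as subtitles or re-synthesised as speech for the listener.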

The Smart AI-based Assistant

The human-to-machine interaction in VR spaces can also be enhanced by an AI-based assistant, a sophisticated tool designed to aid attendees as they navigate the virtual environment. This intelligent agent provides real-time navigation instructions, ensuring that participants can move effortlessly between different areas of the event. Moreover, it offers detailed programme information, helping attendees maximise their time by suggesting sessions that align with their interests and professional goals. In addition, the AI assistant can provide comprehensive trade show information, from booth locations to exhibitor details, allowing users to plan their visits strategically.
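One simple way to picture the mechanics: recognise the attendee's intent, then dispatch to a handler that knows the event data. The toy router below is purely hypothetical; the intent names, event fields, and replies are invented for illustration and are not the VOXReality assistant's interface.

```python
# Toy intent router for a virtual-event assistant (hypothetical interface).
def assistant_reply(intent: str, event: dict) -> str:
    handlers = {
        "navigate": lambda: f"The {event['target']} is ahead on your right, past the Lobby.",
        "programme": lambda: f"Next session: {event['next_session']} at {event['time']}.",
        "tradeshow": lambda: f"Booth {event['booth']} is in the Trade Show Area.",
    }
    # Fall back to a clarification prompt for unrecognised intents.
    return handlers.get(intent, lambda: "Sorry, could you rephrase that?")()

print(assistant_reply("programme", {"next_session": "NLP in XR", "time": "14:00"}))
```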

Embracing the Future of Business Networking with VOXReality

The future of networking and engagement in virtual events lies in the harmonious integration and synergy of VR and AI. The VOXReality project enhances virtual networking by creating an immersive VR platform that mimics the real-life spaces of events.

Moreover, in this VR platform, the advanced AI-based VOXReality models have been integrated to provide real-time language translation and assistance, ensuring a smooth and inclusive experience for all participants. This intelligent system can facilitate cross-cultural engagement by overcoming language barriers and guide users through the virtual event space with ease. The result is a dynamic and accessible environment where every interaction is optimised, pushing business networking into a new era of global connectivity.

To conclude, as VR and AI technologies continue to evolve, they promise to make virtual networking more accessible, effective, and inclusive, transcending geographical and linguistic limitations. Together, in the VOXReality project, we embrace these innovations and experience the future of professional networking.


Stavroula Bourou

Hi! My name is Stavroula Bourou and I am a Machine Learning Engineer. I received my Master of Engineering from the National Technical University of Athens (NTUA) in 2015. Additionally, I hold an MSc degree in Geoinformation Science with specialisation in Computer Vision from the Technical University of Berlin (2019). Currently, I am involved in projects funded by the European Commission at Synelixis Solutions S.A. At VOXReality, my contributions include the development of a context-aware, AI-based dialogue system as well as the VR Conference use case.


VOXReality Unleashes XR Revolution: NLP Mastery and Tech Wizardry Take Center Stage at Immersive Tech Week 2023!

Hold onto your headsets, folks!  

From November 28th until December 1st, the VOXReality rockstars unleashed a tsunami of innovation at the mind-blowing Immersive Tech Week 2023 in Rotterdam. Imagine a tech utopia where the main stakeholders of VR, AR, AI, haptics, and Web4 gather to spread the magic on the latest tech wonders. Yeah, that’s the vibe!

Our team at VOXReality weren’t just attendees. First, in a strategic session during Immersive Tech Week 2023, VOXReality took part in the F6S Innovation-led presentation titled “Connecting Founders to Horizon Europe Funding Opportunities,” spotlighting the integral role played by European funds in advancing XR innovation and fostering collaborative ventures.

These funding opportunities play a critical role in bolstering startup endeavours, providing the requisite resources for the development of cutting-edge XR solutions. The overarching goal was to enhance learning experiences, cultivate novel opportunities, and establish immersive technology as a conduit for societal connectivity. VOXReality shared the stage with other XR projects: SERMAS, XR2Learn, XR4ED and CORTEX2.

But that’s not all! Picture this: a special discussion panel titled “Maximizing Efficiency in NLP Model Training and XR Environments: Real-Time Language Processing for Seamless Interactions.” A warp-speed journey into the heart of Natural Language Processing (NLP) within the universe of extended reality (XR).

In the realm where Natural Language Processing (NLP) intersects with immersive technology, a narrative unfolds. NLP, driving change in digital interactions, has reshaped our virtual landscape. However, the resource-intensive nature of NLP models raises concerns about environmental impact, energy consumption, and scalability.

Extended Reality (XR) environments are now integrated with NLP tools, influencing virtual meetings, training simulations, and immersive experiences. The demand is clear: efficiency, optimisation, and seamless fusion of language models to minimise latency and enhance the user experience.

The roundtable assembled a group of experts to explore strategies for optimising training pipelines, addressing environmental concerns, and developing sustainable approaches. The focus extended to improving efficiency, reducing latency, and maximising the user experience.



This assembly of experts delved into challenges and solutions, navigating the delicate balance between cutting-edge advancements and environmental considerations. The session provided insights into the future of NLP in XR, guided by the expertise of the speakers.


NLP in XR environments highlights

The VOXReality team of experts explored the intersection of Natural Language Processing (NLP) and Extended Reality (XR). Nour Fendri introduced XR, detailing its applications in both consumer and industrial contexts, emphasising AR’s role in manufacturing and VR’s efficacy in training.

Stavroula delved into NLP, defining it and highlighting impactful applications such as speech recognition, language translation, conversational agents, and integration with visual processing like image captioning and Visual Question Answering (VQA). Olga underscored the importance of natural language in XR for intuitive interactions, enhancing user immersion and presence.

Afterwards, the panel discussed industry pain points, including user acceptance in manufacturing and the need for better XR interaction methods. Olga complemented this by highlighting challenges in interpersonal communication in multiuser XR environments, especially in AR and VR conferences, stressing the necessity for real-time translation.

Stavroula and Jiahuan explored challenges in integrating NLP into XR, focusing on resource-intensive models impacting system performance and causing latency. Petros explained latency thresholds and the need for real-time processing, crucial for user experience in XR environments. Jiahuan detailed the challenges of large language models, addressing their computational intensity.

Petros and Yusuf discussed the evolution of AI models, citing their exponential growth in scale over the past five years driven by increased data availability and improved hardware. Yusuf emphasised environmental concerns, revealing the significant CO2 emissions associated with large models like GPT-3 and GPT-4.

Yusuf proposed solutions to minimise environmental impact, including renewable energy for data centres, data efficiency, and efficient model design. Petros outlined optimisation techniques like pruning, quantisation, knowledge distillation, and transfer learning.
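As a concrete illustration of one technique from that list, the snippet below applies PyTorch's post-training dynamic quantisation to a transformer encoder, storing linear-layer weights as 8-bit integers to shrink the model and speed up CPU inference. The model is an arbitrary public checkpoint, not one of VOXReality's models.

```python
# Dynamic quantisation: replace nn.Linear layers with int8-weight versions.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("distilbert-base-uncased").eval()
quantised = torch.quantization.quantize_dynamic(
    model,              # model to quantise (post-training, no retraining needed)
    {torch.nn.Linear},  # layer types to replace
    dtype=torch.qint8,  # 8-bit integer weights, dequantised on the fly
)
# `quantised` is a drop-in replacement for `model` at CPU inference time.
```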

Yusuf, Petros, and Olga provided insights into the VOXReality project, discussing optimisation efforts and the use of pre-trained models in its XR integration. The team discussed VOXReality’s real-world use cases, including AR theatre and VR conferences, showcasing the enhanced communication and human-to-machine interaction facilitated by NLP.

In concluding remarks, the panel discussed the current availability of open-source tools, pre-trained models, and cloud services, and emphasised the importance of XR for industry innovation, promoting a natural way to interact with the digital world for increased efficiency and user adoption. The roundtable offered a comprehensive exploration of the challenges, solutions, and future prospects at the nexus of NLP and XR.

As we reflect on Immersive Tech Week 2023, we are truly grateful for the opportunity to connect with industry luminaries and pioneers. Interacting with thought leaders has provided invaluable insights into the dynamic landscape of immersive technologies. The exchange of ideas and networking opportunities at this event has opened up limitless possibilities, enhancing the overall experience for all participants.

Immersive Tech Week 2023 served as an outstanding platform for VOXReality to demonstrate our unwavering commitment to advancing XR and NLP. In the ever-evolving technological landscape, events like these play a vital role in fostering collaboration, learning, and idea exchange. Our team’s active contribution to the NLP discussion within XR environments reflects our dedication to pushing the boundaries of immersive technology.

The resounding success of Immersive Tech Week 2023 has left an indelible mark on both attendees and the VOXReality Team. This event not only showcased the current state of immersive technologies but also teased the exciting possibilities awaiting the industry. A heartfelt thank you to the entire VRDays team for orchestrating this monumental event and providing a space for innovation and collaboration.

As we bid farewell to the last event of the year, our excitement is already building for the adventures that Immersive Tech Week 2024 holds. Join us on this journey as we continue to push the boundaries of immersive technology and explore new frontiers in XR and NLP.

See you next year!


From Fantasy to Reality: The Enchantment of GPT-2

GPT-2, developed by OpenAI, stands as a groundbreaking achievement in the realm of artificial intelligence and natural language processing. GPT-2, short for “Generative Pre-trained Transformer 2”, excels at predicting the next word in a sequence of text, showcasing its remarkable language-modelling capabilities. GPT-2 was introduced by OpenAI in a research paper titled “Language Models are Unsupervised Multitask Learners”, published on February 14, 2019. The paper presented the architecture and capabilities of GPT-2, marking its official debut in the field of natural language processing.

https://medium.com/geekculture/auto-code-generation-using-gpt-2-4e81cb05430

What sets GPT-2 apart is its ability to generate coherent and contextually relevant text passages based on a given prompt. Trained on vast amounts of internet text, GPT-2 learns to predict and generate text by capturing intricate patterns and structures within language. This pre-training equips GPT-2 with an extensive understanding of grammar, vocabulary, and context, enabling it to generate human-like text, answer questions, complete sentences, and even engage in creative writing tasks. GPT-2’s capacity for generating high-quality, contextually appropriate text has found applications in various fields, including content creation, conversational agents, and language translation, making it a versatile tool in the domain of natural language processing.
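To see this next-word prediction in action, here is a short, self-contained example that samples a continuation from the publicly released GPT-2 weights via the Hugging Face transformers library; the prompt and sampling settings are arbitrary.

```python
# Sampling a continuation from GPT-2: the model repeatedly predicts the next
# token given the prompt and everything generated so far.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Immersive technologies are reshaping how we",
    max_new_tokens=40,   # length of the generated continuation
    do_sample=True,      # sample instead of greedy decoding
    top_p=0.9,           # nucleus sampling
)
print(result[0]["generated_text"])
```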

Content creators leverage GPT-2 to automate writing tasks, generate marketing copy, or brainstorm ideas. In conversational AI, it serves as the backbone for chatbots and virtual assistants, enabling them to engage in more natural and context-aware conversations with users. Moreover, GPT-2 has proven invaluable in translation tasks, where it can convert text from one language to another while preserving the original context and meaning.

The impact of GPT-2 extends beyond its ability to generate text. Its underlying architecture, the transformer model, has inspired subsequent developments in natural language processing and machine learning. Researchers and developers continue to explore its potential, pushing the boundaries of what AI-powered language models can achieve, making GPT-2 a cornerstone in the evolution of artificial intelligence and human-computer interaction.


https://www.kaggle.com/code/manann/generating-quotes-using-gpt-2-language-model/notebook

Empowering AI: The Fusion of GPT-2 and Vision Transformers Unleashes Multimodal Brilliance

The combination of language models like GPT-2 with vision transformers represents a powerful approach in the realm of multimodal AI, where both textual and visual information are processed together. By integrating GPT-2 with vision transformers, complex tasks involving both text and images can be tackled, leading to advancements in areas such as image captioning, visual question answering, and more. Here’s how GPT-2 can be combined with vision transformers:

  1. Multimodal Inputs: Vision transformers process images into a format understandable by transformers. These processed visual embeddings can be integrated into GPT-2 as additional input alongside text, creating a multimodal input where GPT-2 receives both textual and visual information (a minimal sketch follows this list).
  2. Text-Image Context Understanding: GPT-2 excels at understanding textual context. By incorporating visual information, it gains the ability to comprehend the context of images, allowing it to generate more informed and contextually relevant textual responses. For example, when describing an image, the model can generate detailed and coherent textual descriptions.
  3. Applications in Image Captioning: In image captioning tasks, where an AI system generates textual descriptions for images, GPT-2 can leverage the visual embeddings provided by vision transformers to create rich and descriptive captions. This ensures that the generated captions not only describe the visual content accurately but also exhibit a natural language flow.
  4. Visual Question Answering (VQA): In VQA tasks, where the AI system answers questions related to images, combining GPT-2 with vision transformers allows for a more nuanced understanding of both the question and the image. This enables the model to provide contextually appropriate answers, taking into account the visual elements present in the image.
  5. Enhanced Creativity and Understanding: By understanding both text and images, the combined model can exhibit a higher level of creativity and nuanced understanding. It can generate creative stories inspired by images or answer questions about images with more depth and insight.
  6. Training Paradigms: During training, the multimodal model can be trained on tasks that involve both textual and visual inputs. This joint training enhances the model’s ability to learn the intricate relationships between textual and visual data, improving its performance on multimodal tasks.
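The sketch below illustrates point 1 under stated assumptions: patch embeddings from a pre-trained vision transformer are projected into GPT-2's embedding space and prepended to the text embeddings as a visual "prefix". The projection layer here is untrained and the function shape is our own; prefix-style captioning systems learn this projection on paired image-text data.

```python
# Minimal sketch: conditioning GPT-2 on ViT image embeddings (prefix style).
# Uses the Hugging Face `transformers` library; the projection layer and the
# function below are illustrative assumptions, not a production captioner.
import torch
import torch.nn as nn
from transformers import ViTModel, GPT2LMHeadModel

vit = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

# Map ViT's hidden size (768) into GPT-2's embedding size (768 for "gpt2").
proj = nn.Linear(vit.config.hidden_size, gpt2.config.n_embd)

def caption_logits(pixel_values: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Next-token logits for caption text conditioned on an image."""
    with torch.no_grad():
        visual = vit(pixel_values=pixel_values).last_hidden_state  # (B, 197, 768)
    visual_prefix = proj(visual)                                   # (B, 197, n_embd)
    text_embeds = gpt2.transformer.wte(input_ids)                  # (B, T, n_embd)
    # Multimodal input: visual prefix tokens followed by text tokens.
    inputs_embeds = torch.cat([visual_prefix, text_embeds], dim=1)
    return gpt2(inputs_embeds=inputs_embeds).logits
```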

Previous versions and development - This is where it all begins

GPT-2, the second version of the Generative Pre-trained Transformer developed by OpenAI, introduced several key differences and improvements compared to its predecessor, GPT-1:

  1. Scale and Size: GPT-2 is much larger than GPT-1, both in the number of parameters and in the model's overall size. GPT-2 has 1.5 billion parameters, making it significantly larger than GPT-1, which had 117 million. This increase in scale allows GPT-2 to capture more complex patterns in the data it is trained on.
  2. Performance: Due to its increased size, GPT-2 demonstrated superior performance in various natural language processing tasks. It exhibited a better understanding of context, allowing it to generate more coherent and contextually relevant text. The larger model size contributed to improved fluency and the ability to handle a wider range of topics and prompts effectively.
  3. Few-Shot and Zero-Shot Learning: GPT-2 showcased the ability to perform few-shot and even zero-shot learning. Few-shot learning means the model can generalise and generate text given a few examples or prompts. Zero-shot learning means it can generate text for tasks it has never seen before, based only on a description of the task.
  4. Controllability: GPT-2 allowed for more fine-grained control over the generated text. OpenAI demonstrated this control by conditioning the model on specific instructions, styles, or topics, resulting in text that adheres to those constraints.
  5. Ethical and Safety Concerns: The release of GPT-2 raised significant ethical concerns regarding the potential misuse of the technology for generating deceptive or malicious content. Due to these concerns, OpenAI initially refrained from releasing the full model but later decided to make it publicly available.
  6. Research Focus: GPT-2's release sparked discussions in the research community about responsible AI development, the potential societal impact of highly advanced language models, and the ethical considerations in AI research. This led to increased awareness and ongoing research into the ethical use of such technologies.
https://www.revechat.com/blog/chatbot-quotes/

Epilogue: Embracing the Language Revolution

As we conclude this exploration of GPT-2's transformative impact on our world, it becomes evident that we stand on the precipice of a linguistic revolution. The emergence of GPT-2 not only expanded the horizons of artificial intelligence but also ushered in a new era of human-machine interaction. Its remarkable ability to generate coherent, contextually rich text has opened doors to unprecedented possibilities, from revolutionising content creation and translation services to empowering educators and journalists.

With great power, however, comes great responsibility. As we continue to integrate advanced language models like GPT-2 into our daily lives, it is crucial to navigate the ethical waters with vigilance. Striking a balance between innovation and ethical application will be the cornerstone of our journey forward. Let us embrace this linguistic revolution with wisdom and empathy, ensuring that the transformative potential of GPT-2 and its successors is harnessed for the betterment of humanity, heralding an era where the boundaries between human creativity and artificial intelligence blur, fostering a future where the art of communication knows no bounds.


Giorgos Papadopoulos

Associate Researcher at Centre for Research & Technology Hellas (CERTH)


VOXReality Newsletter #2: The first year of VOXReality 

The latest VOXReality newsletter details the recent strategic developments and research milestones as the project advances its mission to integrate AI with Extended Reality.

Highlights:

  • General Assembly in Amsterdam: Hosted by the Distributed & Interactive Systems (DIS) Group at CWI, the consortium met to synchronize R&D efforts, review internal technology demos, and plan the upcoming pilot phases for all work packages.

  • Voice-Driven Virtual Events: The project is developing new AI models to enable language-driven interactions within XR. This research aims to make virtual and hybrid events more natural, accessible, and engaging for global audiences.

  • User-Centered Design: Development is focused on a comprehensive assessment of user requirements to ensure that VR Conferences, Augmented Theatres, and Training Assistants address specific industry needs and user preferences.

  • Platform & Hardware Exploration: Technical updates include the use of Mozilla Hubs for immersive social experiences and a guide to selecting AR devices and modes to suit evolving professional environments.

Immersive Tech Week 2023

The VOXReality team will be participating in Immersive Tech Week 2023 in Rotterdam, hosting sessions on:

  • Optimizing NLP model training for XR to reduce latency and environmental impact.

  • Navigating Horizon Europe funding opportunities for XR startups and founders.


A Recap of the VOXReality General Assembly and Plenary Meeting  

The DIS (Distributed & Interactive Systems) Group had the immense pleasure of hosting the three-day General Assembly and Plenary Meeting of the VOXReality project from 9 to 11 October 2023 at Centrum Wiskunde & Informatica (CWI), Science Park, Amsterdam. As teamwork and collaboration are the driving forces behind groundbreaking projects, we were thrilled to come together with our use-case and technology partners to share insights and plan the upcoming R&D, pilots, foreseeable challenges and internal demos for our XR technologies.

The first day kicked off with a hands-on workshop tailored for Immersive Tech Week 2023. It was an exciting opportunity to plan our round table discussion session for ITW, diving into the intricate world of XR and providing valuable insights and expertise.

The second day began with a warm welcome and a comprehensive overview of the day's agenda, focusing mainly on each work package and its related risks and mitigation strategies. All consortium partners provided updates for each use case, highlighting the dynamic range of applications within VOXReality. The team also discussed ethics and rights considerations within VOXReality; navigating the ethical landscape of XR is vital to ensure responsible and sustainable use. As the sun set over Amsterdam on day 2, participants enjoyed a boat dinner and canal tour of the city, allowing for more informal discussions and networking in a picturesque setting.

The final day of the event was dedicated to the project's management and planning for the road ahead. Throughout the three days, one-on-one discussions were held, enabling participants to delve deeper into specific topics and challenges. It was a fruitful way to address individual concerns and establish further collaboration. As the event drew to a close, there was a sense of accomplishment and optimism in the air. The DIS group was delighted to host this productive plenary meeting; the VOXReality GA demonstrated the power of collaboration, innovation, and dedication within the consortium.

Stay tuned for more updates on VOXReality!  


Moonisa Ahsan

Moonisa Ahsan is a post-doc researcher in the DIS (Distributed & Interactive Systems) Group of CWI (NWO-I). In VOXReality, she contributes to understanding next-generation applications within Extended Reality (XR), working to better understand user needs and leveraging that knowledge to develop innovative solutions that enhance the user experience in all three use cases. She is a Marie Curie Alumna, and her scientific and research interests are Computer Graphics (CG), Interface Design, Cultural Heritage (CH), Human-Computer Interaction (HCI), and User Experience (UX).


Revolutionising Industrial Training with Augmented Reality: A Glimpse into the VOXReality Project

Augmented Reality (AR), with its powerful immersive capabilities, is painting a new future for industrial environments. By merging the physical and digital worlds, AR provides a groundbreaking platform for prototype design, industrial site planning, operational training, and safety promotion. It allows content to be visualised in a way that is not possible in traditional environments, and provides the ability to interact virtually with complex machinery in a risk-free environment, making it a significant asset in industrial training.

The transformative VOXReality project seamlessly integrates AR with innovative technologies such as XR Streaming and advanced Artificial Intelligence (AI) language models. This integration is designed to enhance user training experiences by providing interactive, high-performance interactions with virtual personal assistants that provide guidance and support during AR-based training.

Industrial AR Training with XR Streaming and 3D CAD Visualization

Augmented Reality training can take many forms. Hololight’s XR Engineering application, Hololight Space, addresses the common need for practical knowledge and experience in machine assembly in industrial environments. The solution goes beyond traditional visualisation, allowing users to engage with specific sub-components of machines that are critical to assembly instructions. 

Powered by Hololight Stream, the company’s proprietary XR Streaming technology, Hololight Space provides remote rendering and application streaming capabilities, ensuring powerful and high-quality AR experiences while overcoming the processing limitations of XR mobile devices. This feature integrates VOXReality’s AI models, enriching AR training experiences using Microsoft’s HoloLens 2 AR headset as the medium for visualising virtual content and communication.

Human-Centric Training: AI-Driven Virtual Personal Assistants

Users engaged in AR assembly training often benefit from the presence of an instructor to oversee the steps of the training. However, it is not always possible or operationally practical to have an instructor present. Nevertheless, users should be able to have a source of guidance during their training experience when needed. VOXReality aims to create a truly novel addition to AR assembly training by integrating AI language models developed by other VOXReality consortium partners into the Hololight Space application.

In conjunction with these AI models, users participating in the training will be provided with a unique support system in the form of a virtual personal assistant. This assistant monitors the trainee's progress and can step in to offer support and guidance when needed, interacting with the trainee throughout the training process. This creates a personalised support system that enhances learning during the AR experience, giving users live feedback and immediate support.

Strengthening Industrial Performance and Safety

The combination of AI and AR developed by the VOXReality project forms a unique industrial training solution that provides immediate virtual support and allows trainees to learn at their own pace. The use of Augmented Reality for training minimises the need for physical machines and additional personnel, fostering a learning environment that is more efficient, effective and safe.

These AI-enabled dialogue systems and AR technologies form a synergy that not only reduces the time required to complete training, but also helps provide the hands-on skills trainees need to gain a concrete understanding of the tasks to be performed, and enhances the quality of learning by ensuring immediate feedback and support.

The use of the dialogue system and AI models is critical to this step, as the virtual personal assistant provides immediate, personalised support and engagement that can reduce the time required to complete training. Having an assistant to interact with also provides peace of mind, as immediate feedback supports the learning process. This revolutionary approach promises a well-trained workforce that contributes to improved performance and safety in industrial environments.

VOXReality – Advanced AR Training Experiences

The VOXReality project is a pioneering effort to expand the potential of AR in industrial training by incorporating innovative XR Streaming and advanced AI models. The project promises a future where AR is not just a visualisation tool, but a rich, interactive and immersive learning experience. The integration of AR and AI not only changes the way we learn and interact in industrial environments, but also lays the foundation for a safer and more efficient industrial future.


Carina Pamminger

Carina Pamminger is the Head of Research at Holo-Light, an innovative company and global leader in Extended Reality (XR) technologies. Carina brings over ten years of research experience across several disciplines, ranging from the games and transportation industry to augmented and virtual reality sectors. As Head of Research at Holo-Light, Carina actively engages in research projects with various academic, industrial, and non-profit partners such as BMW, Engineering Ingegneria Informatica, and more. Her main interest areas are in investigating novel ways of leveraging innovative XR technologies to further enable and enhance the Industrial Metaverse.


Embracing Humanity in Virtual Realms: A Journey Towards Inclusivity and Accessibility

In the bustling realm of technology, where advancements seem to leapfrog one another, the emergence of extended reality (XR) technologies has marked a significant milestone. As we delve into the vast possibilities that XR brings, it’s crucial to remember the heart of this revolution: the people. In a world rapidly embracing immersive technologies, the human-centered approach stands as the guiding light, ensuring that progress aligns with inclusivity and accessibility, fostering a society where no one is left behind.

Immersive Technologies: A Human-Centric Odyssey

In the corridors of innovation, the team at VOXReality recognises the paramount importance of putting humanity at the forefront of our extended reality projects. With a profound understanding that technology should enhance lives rather than alienate, we have embarked on a human-centric odyssey, ensuring that our immersive experiences cater to the diverse needs of people.

Our team’s commitment goes beyond the realms of technology; it’s a commitment to building bridges, connecting hearts, and making the extraordinary accessible to everyone. In this human-centric odyssey, VOXReality doesn’t just create virtual worlds; we aim to craft inclusive spaces where differences are celebrated, where barriers are shattered, and where the shared human experience becomes the cornerstone of innovation. 

It’s a conscious effort to empower individuals, irrespective of their abilities or backgrounds, to not just participate in the digital revolution but to lead it, ensuring that the promise of a better, more connected future is within reach for all.

Inclusivity and Accessibility

One of the fundamental pillars of VOXReality's human-centred approach is inclusivity. XR technologies are breaking barriers, enabling individuals, regardless of physical abilities, to explore new worlds and partake in experiences previously deemed impossible. From virtual assistants for events to immersive access to theatre plays, inclusivity is not just a concept but a tangible reality within VOXReality's vision.

In the tapestry of human experiences, accessibility weaves the threads that connect us all. VOXReality takes pride in our meticulous design process, ensuring that XR applications are not only user-friendly but also accessible. Through several iterations with the consortium partners, we aim to empower potential users to navigate and engage effortlessly.

The Future Beckons: A Harmonious Coexistence

As we stand on the edge of a future where the lines between reality and virtuality blur, VOXReality exemplifies the harmonious coexistence of humanity and technology. Our commitment to a human-centred approach ensures that the digital realms we create are not just immersive but also inherently humane. In this symbiotic relationship, technology amplifies human potential, fostering empathy, understanding, and shared moments of joy.

In conclusion, as we navigate the uncharted territories of extended reality, let us remember that the true essence of progress lies in the way it uplifts the human spirit. VOXReality’s unwavering dedication to a human-centred approach serves as an example illuminating the path towards an inclusive, accessible, and harmonious digital future. 

Together, as we embrace the boundless possibilities of XR technologies, let us continue this journey, ensuring that no one is left behind, and every soul finds solace and belonging in the immersive worlds we create.


Natalia Cardona

Hi! My name is Natalia Cardona and I'm a corporate communications specialist with a Master's in Journalism and Digital Content Innovation from the Autonomous University of Barcelona. I currently work on the dissemination, communication, and marketing of technology, innovation, and science for projects funded by the European Commission at F6S.


Enhancing Open-Domain Dialogue Answer Selection through Intent-Calibrated Self-Training

Can predicted intents calibrate correct answers in open-domain dialogues?

The capability of predicted intents to refine answer selection in open-domain dialogues is a topic of significant interest.

The mission of VOXReality is to explore the development of advanced context-aware task-oriented dialogue systems. In this context, Centrum Wiskunde & Informatica (CWI) has extensively explored and provided insights into whether predicted intent labels have the potential to calibrate answer labels in open-domain dialogues.

Spearheaded by the Distributed & Interactive Systems (DIS) group, this initiative has culminated in the publication of a paper titled "Intent-Calibrated Self-Training for Answer Selection in Open-domain Dialogues" in Transactions of the Association for Computational Linguistics (TACL).

This publication serves as evidence of the significant progress made in understanding the intricate interplay between predicted intent labels and calibrated answer selection. The full paper is available here.

Challenge

Answer selection models have achieved notable success through training on extensive labelled datasets. However, the process of amassing large-scale labelled data is not only labour-intensive but also time-consuming. This challenge is further exacerbated for Open-Domain Systems (ODSs) as they grapple with deciphering users’ information needs due to the unstructured nature of open-ended goals (Huang et al., 2020).

Motivation

The concept of user intents, encompassing a taxonomy of utterances, has been introduced to provide guidance to the information-seeking process (Qu et al., 2018, 2019a; Yang et al., 2020). When a potential answer (PA) does not satisfy the intent of the original question (OQ), the subsequent intent of the user is likely to be an information request (IR). For instance, if a user queries, “Can you direct me to a website for more information?” their intent is classified as IR. Overlooking the intent label IR may result in providing an answer that fails to fulfil the user’s request.

Method

We introduce a novel approach known as Intent-Calibrated Self-Training (ICAST) to enhance answer label calibration within a self-training framework. Specifically, our proposal involves leveraging predicted intent labels to calibrate answer labels. The ICAST method encompasses the following steps:

    1. Teacher Model Training: A teacher model is trained on labelled data (D^l) to predict pseudo intent labels for unlabelled data (D^u).
    2. Intent-Calibrated Pseudo Labelling: High-quality intent labels are identified using intent confidence gain, which then guides the selection of samples. The answer labels are calibrated by integrating the selected intent labels as supplementary inputs for answer selection.
    3. Student Model Training: The student model is trained using both labelled and pseudo-labelled data (a schematic sketch of the full loop follows below).
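In pseudocode, the loop can be pictured roughly as below. The model interface (an intent predictor returning a confidence gain, and an intent-aware answer selector) and the threshold tau are schematic assumptions distilled from the description above, not the authors' released code.

```python
# Schematic ICAST loop (assumed interface, not the authors' implementation).
from dataclasses import dataclass
from typing import Callable, List, Tuple

Dialogue = str   # placeholder for a dialogue context
Intent = str     # e.g. "IR" (information request)
Answer = int     # index of the selected candidate answer

@dataclass
class Model:
    predict_intent: Callable[[Dialogue], Tuple[Intent, float]]  # intent + confidence gain
    select_answer: Callable[[Dialogue, Intent], Answer]         # intent-aware selection

def icast(train_fn: Callable[[List], Model], labelled: List,
          unlabelled: List[Dialogue], tau: float = 0.5, rounds: int = 3) -> Model:
    teacher = train_fn(labelled)                       # step 1: teacher on D^l
    for _ in range(rounds):
        pseudo = []
        for dialogue in unlabelled:                    # step 2: calibrated pseudo labels
            intent, gain = teacher.predict_intent(dialogue)
            if gain < tau:                             # keep only high-gain intents
                continue
            answer = teacher.select_answer(dialogue, intent)
            pseudo.append((dialogue, intent, answer))
        teacher = train_fn(labelled + pseudo)          # step 3: retrain student; iterate
    return teacher
```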

Figure (b) visually represents the Intent-Calibrated Self-Training (ICAST) process. The flow involves training the teacher model, intent-calibrated pseudo labelling, and student model training. In contrast to the basic teacher-student training depicted in Figure (a), ICAST enhances the quality of pseudo-labeled data, resulting in significant performance gains.

Conclusion

In this article, we introduce Intent-Calibrated Self-Training (ICAST), a framework rooted in teacher-student self-training and intent-calibrated answer selection. The approach entails training a teacher model on labelled data to predict intent labels for unlabelled data, selecting high-quality intents via intent confidence gain to enhance pseudo answer label prediction, and retraining a student model using labelled and pseudo-labelled data.

Extensive experimentation on two benchmark datasets demonstrates the superiority of ICAST over baselines even with minimal labelled data (1%, 5%, and 10%). Our future work aims to explore additional predictable dialogue contexts, such as user profiles, beyond intents.

Thanks to this research, VOXReality is poised to harness these insights to advance the frontiers of context-aware task-oriented dialogue systems. They will serve as the driving force propelling us to push the boundaries, ushering in a new era of inquiry, innovation, and seamless application.


References

  • Deng, W., Pei, J., Ren, Z., Chen, Z., & Ren, P. (2023). Intent-calibrated Self-training for Answer Selection in Open-domain Dialogues. arXiv preprint arXiv:2307.06703.
  • Minlie Huang, Xiaoyan Zhu, and Jianfeng Gao. 2020. Challenges in building intelligent open-domain dialog systems. ACM Transactions on Information Systems.
  • Chen Qu, Liu Yang, W. Bruce Croft, Johanne R. Trippas, Yongfeng Zhang, and Minghui Qiu. 2018. Analyzing and characterizing user intent in information-seeking conversations. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval.
  • Chen Qu, Liu Yang, W. Bruce Croft, Yongfeng Zhang, Johanne R. Trippas, and Minghui Qiu. 2019a. User intent prediction in information-seeking conversations. In Human Information Interaction and Retrieval.
  • Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, and Haiqing Chen. 2020. IART: Intent-aware response ranking with transformers in information-seeking conversation systems. In The Web Conference.

Jiahuan Pei

As a researcher at the CWI (NWO-I), I focus on generative dialogue systems in extended reality (XR) specifically for the VOXReality project. This project combines the fields of artificial intelligence, natural language processing, and immersive technologies to create interactive and engaging conversational experiences in virtual and augmented reality environments. We explore innovative ways to enhance human-computer interactions by enabling natural and realistic conversations with virtual entities. By leveraging the power of generative dialogue systems, we aim to develop intelligent agents capable of understanding and responding to user input in a dynamic and contextually appropriate manner.
