
Seeing into the Black-Box: Providing textual explanations when Machine Learning models fail.

Machine learning is a scientific practice that is inseparable from the notions of “error” and “approximation”. Sciences such as mathematics and physics accept error as the price of modelling how things work. Human intelligence is tied to error as well: some of our actions fail while others succeed, and there have been countless times when our reasoning, our ability to categorise, or our decisions have let us down. Machine learning models, which try to mimic and compete with human intelligence in certain tasks, likewise produce both successful and erroneous operations.

But how can a machine learning model, a deterministic system that can only empirically compute the confidence it has in a particular action, diagnose that it is making an error when processing a particular input? Even for a machine learning engineer, understanding intuitively why this is possible, without studying a particular method, is difficult.

In this article, we discuss a recent algorithm that convincingly answers this question: the Language Based Error Explainability (LBEE) method by Csurka et al. We will retrace how this method leverages the convenience of generating embeddings with OpenAI’s CLIP model, which translates text extracts and images into high-dimensional vectors that reside in a common vector space. By projecting texts or images into this shared space, we can compute the dot product between two embeddings (a well-known operation that measures the similarity of two vectors) and thereby quantify how similar the original text/image objects are, relative to the similarities of other pairs of objects.
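To make that similarity computation concrete, here is a minimal sketch in Python/NumPy of the cosine similarity that underlies the comparisons discussed throughout this article; the vectors below are random placeholders rather than real CLIP embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: the dot product of the two L2-normalised vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(np.dot(a, b))

# Placeholder 512-dimensional "embeddings"; real ones would come from CLIP.
rng = np.random.default_rng(0)
text_embedding = rng.normal(size=512)
image_embedding = rng.normal(size=512)
print(cosine_similarity(text_embedding, image_embedding))
```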

The designers of LBEE developed a method that can report a textual description of a model failure whenever the underlying model assigns an empirically low confidence score to the action it was designed for. Part of the difficulty in grasping how such a method works is our natural assumption that the textual descriptions explaining the failure must be generated from scratch for each input. Our brains usually need little effort to explain why a failure happened; we arrive at clues instantly, unless the cause drifts away from our understanding of the inner workings of the object involved. The answer is simpler than one might expect: instead of assembling the descriptions anew for each input, LBEE generates a pool of candidate sentences a priori and then reuses them, computationally reasoning about how relevant each candidate explanation is to a given model input. In the remainder of this article, we will see how.

Suppose that we have a classification model that was trained to recognise the type of a single object depicted in a small colour image. We could, for example, photograph objects against a white background with a phone camera and pass these images to the model so that it classifies the object names. The model yields a confidence score ω between 0 and 1, representing its normalised confidence in assigning the image to a particular class among all the object types it can recognise. It is commonly observed that when a model does poorly on a prediction, the resulting confidence score is quite low. But what is a good empirical threshold T that separates poor predictions from confident ones? To estimate such thresholds, one for identifying easy predictions and one for identifying hard predictions, we can take a large dataset of images (e.g., the ImageNet dataset) and pass each image to the classifier. For the images that were classified correctly, we plot the confidence scores produced by the model as a normalised histogram. We may then expect to see two large lobes in the histogram: one concentrating relatively low confidence values (less confident inferences) and one concentrating relatively high values (confident inferences), possibly with some spread of frequency mass around the two lobes rather than two sharply peaked, leptokurtic ones. We can then set an empirical threshold that separates the two lobes.
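As an illustration of this thresholding idea (not the exact procedure used in the LBEE paper), the following sketch builds a normalised histogram of synthetic confidence scores and places a cut-off at the valley between the two lobes:

```python
import numpy as np

# Hypothetical confidence scores of correctly classified validation images,
# sampled so that they form a low-confidence and a high-confidence lobe.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.beta(2, 8, 400),    # low-confidence lobe
                         rng.beta(9, 2, 600)])   # high-confidence lobe

# Normalised histogram of the scores (the two lobes described in the text).
hist, bin_edges = np.histogram(scores, bins=20, range=(0.0, 1.0), density=True)

# One simple, illustrative choice: place the cut-off at the least populated
# interior bin, i.e. the valley separating the two lobes.
valley = np.argmin(hist[3:-3]) + 3
threshold = 0.5 * (bin_edges[valley] + bin_edges[valley + 1])
print(f"empirical threshold T ~ {threshold:.2f}")
```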

Csurka and collaborators separate images into “easy” and “hard” sets based on the confidence score of the classification model and its relation to the cut-off threshold (see Figure 1). Having distinguished these two image sets, the authors compute, for each image in each set, an ordered sequence of numbers (for convenience, we will call this sequence a vector) that describes the semantic information of the image. To do this, they employ the CLIP model contributed by OpenAI, the company famous for delivering the ChatGPT chatbot, which excels at producing embeddings for images and text in a joint high-dimensional vector space. The computed embeddings can be used to measure the similarity between an image and a short text extract, or between a pair of text extracts or a pair of images.
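A minimal sketch of how such joint embeddings can be obtained from the publicly released CLIP weights, here via the Hugging Face transformers API; the image path and the candidate sentences are hypothetical:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo_of_object.jpg")              # hypothetical input image
texts = ["a photo taken at night", "a blurry photo"]   # hypothetical descriptions

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                      attention_mask=inputs["attention_mask"])

# L2-normalise so that dot products equal cosine similarities.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
similarities = txt_emb @ img_emb.T   # one similarity score per description
```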

In a later step, the authors identify groups of image embeddings that share similarities. To do this, they use a clustering algorithm that takes in the generated vectors and identifies clusters of embeddings; the number of clusters that fits a particular dataset is non-trivial to choose. In the end, we obtain two types of clusters: clusters of CLIP embeddings for “easy” images and clusters of CLIP embeddings for “hard” images. Each hard cluster centre is then picked and paired with its closest easy cluster centre, giving us two embedding vectors originating from the clustering algorithm. The two kinds of clusters, “easy” and “hard”, are visually denoted in the top-right sector of Figure 1 by green and red dotted enclosures.
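The sketch below illustrates this clustering step with scikit-learn’s k-means on placeholder embeddings; the cluster counts are arbitrary choices, not the values used by the authors:

```python
import numpy as np
from sklearn.cluster import KMeans

# easy_emb, hard_emb: arrays of shape (n_images, d) holding CLIP image embeddings.
rng = np.random.default_rng(0)
easy_emb = rng.normal(size=(500, 512))   # placeholders for real embeddings
hard_emb = rng.normal(size=(200, 512))

k_easy, k_hard = 8, 5                    # illustrative cluster counts
easy_centers = KMeans(n_clusters=k_easy, n_init=10, random_state=0).fit(easy_emb).cluster_centers_
hard_centers = KMeans(n_clusters=k_hard, n_init=10, random_state=0).fit(hard_emb).cluster_centers_

# For every hard cluster centre, find the closest easy cluster centre.
dists = np.linalg.norm(hard_centers[:, None, :] - easy_centers[None, :, :], axis=-1)
closest_easy = dists.argmin(axis=1)      # index of the matching easy centre per hard centre
```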

The LBEE algorithm then generates a set S of sentences that describe the images discussed above, and for each short sentence the CLIP embedding is computed. As mentioned earlier, such a text embedding can be compared directly to the embedding of any image by calculating the dot product (or inner product) of the two embedding vectors; in the signal processing community this quantity is known as linear correlation. The authors apply this operation directly: they compute the relevance of each textual error description via the so-called cosine similarity between a text extract embedding and an image embedding, ultimately obtaining two relevance score vectors of dimensionality k < N, where each dimension is tied to a given textual description. These two score vectors are then passed to a sentence selection algorithm (covered in the next paragraph). The selection is carried out for each hard cluster, and the union of the selected sentence sets is returned to the user for the image that was supplied as input.

The authors define four sentence selection algorithms, named SetDiff, PDiff, FPDiff and TopS. SetDiff computes the sentence sets corresponding to a hard cluster and to its closest easy cluster, removes from the hard cluster sentence set the sentences that also appear in the easy cluster sentence set, and reports the resulting set to the user. PDiff takes two similarity score vectors of dimensionality k (where k denotes the top-k relevant text descriptions), one from the hard set and one from the easy set; it computes the difference between these two vectors and retains the sentences corresponding to the top k values. TopS simply reports all the sentences that correspond to the vector of top-k similarities. Figure 3 presents examples of textual failure modes generated for a computer vision model using the TopS, SetDiff, PDiff and FPDiff methods. To evaluate the LBEE model and methodology, the authors also had to introduce an auxiliary set of metrics adapted to the specificities of the technique. To deepen your understanding of this innovative and very useful work, we recommend reading [1].
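To make the selection strategies concrete, here is a simplified sketch of SetDiff- and PDiff-style selection over a shared pool of candidate sentences; it follows the description above rather than the authors’ released code, and all names and parameter values are illustrative:

```python
import numpy as np

def similarities(center: np.ndarray, sentence_embs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one cluster centre and every candidate sentence embedding."""
    return (sentence_embs @ center) / (
        np.linalg.norm(sentence_embs, axis=1) * np.linalg.norm(center)
    )

def top_k_sentences(center, sentence_embs, sentences, k):
    """The k sentences whose embeddings are most similar to the given cluster centre."""
    order = np.argsort(-similarities(center, sentence_embs))[:k]
    return [sentences[i] for i in order]

def set_diff(hard_center, easy_center, sentence_embs, sentences, k=10):
    """SetDiff-style: keep sentences relevant to the hard cluster but not to the easy one."""
    hard_top = top_k_sentences(hard_center, sentence_embs, sentences, k)
    easy_top = set(top_k_sentences(easy_center, sentence_embs, sentences, k))
    return [s for s in hard_top if s not in easy_top]

def p_diff(hard_center, easy_center, sentence_embs, sentences, k=10):
    """PDiff-style: rank sentences by the difference of the two similarity scores."""
    diff = similarities(hard_center, sentence_embs) - similarities(easy_center, sentence_embs)
    order = np.argsort(-diff)[:k]
    return [sentences[i] for i in order]
```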

References

[1] Csurka et al., “What could go wrong? Discovering and describing failure modes in computer vision.” In Proceedings of ECCV 2024.


Sotiris Karavarsamis

Research Assistant at Visual Computing Lab (VCL)@CERTH/ITI


The rise of immersive technologies in theatre

Transforming the Audience Experience with VR and AR

Many aspects of our society have been profoundly impacted by the development of technology, especially the entertainment industry, which includes theatre. Virtual Reality (VR) and Augmented Reality (AR) are technologies that have massively changed how people think about entertainment and performance. These technologies are extremely versatile and can be used both to enhance the spectator’s experience without altering the essence of theatrical representation and to completely transform performances compared to classical theatre. The concept of augmented reality was first introduced in 1992 by Caudell and Mizell[1], and subsequently expounded upon by Ronald Azuma, who outlined its potential applications in the entertainment sector, among others[2]. However, the real breakthrough came between 2010 and 2015, with the advent of VR headsets. In 2015, Microsoft’s HoloLens introduced the ability to overlay virtual objects onto the real world, fostering new experimentation. In the same year, the platform The Void was launched, becoming popular thanks to hyper-reality experiences that combined virtual reality with interactive physical environments. Due to its popularity, the platform was able to collaborate with major companies like Disney and work on internationally renowned projects such as Star Wars: Secrets of the Empire. The COVID-19 pandemic provided a strong push for the adoption of immersive technology, forcing theatres worldwide to experiment with digital formats and virtual experiences[3].

VR and AR in the entertainment market

The immersive technology market is expanding, driven by sectors such as entertainment, education, healthcare, and business, which are increasingly adopting VR and AR technologies. In 2023, the immersive technology market was valued at $29.13 billion, and projections indicate it will reach $134 billion by 2030, with an annual growth rate of over 25%[4]. With over fifty percent of the market in 2023, the video game industry continues to be the leading sector for VR and AR in entertainment[5]. However, these technologies are increasingly being used in live events and theatre. Artificial Intelligence (AI) is being integrated into VR and AR experiences to enhance interactions and make them more accessible and natural[6]. Furthermore, as smart glasses and headsets have become more powerful and lighter, their latest developments have made adoption easier for a wider range of users. Thanks to government-sponsored research initiatives like Horizon Europe, growing investments in digital innovation, and the increasing use of XR technologies in industries like entertainment, healthcare, and education, the immersive technology market in the EU is predicted to reach $108 billion by 2030[7].

Enhancing theatre accessibility and audience engagement

Immersive technologies present plenty of possibilities to improve theatrical productions, enabling creativity in both the performance and its inclusivity. By employing virtual reality headsets, real-time subtitles and scene-specific context, it is possible to improve audience immersion and promote inclusivity among people with hearing or language impairments. This can increase the number of people who attend plays, particularly in tourist cities where accessibility is severely limited by language barriers. Moreover, the use of these technologies expands the potential audience for theatrical plays because it also overcomes geographic restrictions, enabling viewers to enjoy live performances from a distance in fully immersive virtual theatres. It will allow people who are unable to travel because of age-related problems or disabilities to attend performances not merely in two dimensions, as is currently the case when watching a theatre show on television, but as a deeply immersive experience. Finally, visual effects can be added to performances using VR and AR technologies, bridging the gap between traditional performing arts and modern production techniques.

Applications of VR/AR in Theatrical Performances

The incorporation of VR and AR into theatre has completely transformed how audiences interact with performances; these technologies have introduced new means to boost storytelling, accessibility, and interaction. The potential of these technologies in live performances has been shown by a variety of projects:

  • National Theatre’s Immersive Storytelling Studio: To increase audience engagement, the National Theatre in the UK has adopted immersive technologies. Its Immersive Storytelling Studio investigates the potential of VR and AR to produce more immersive and engaging experiences[8].
  • White Dwarf (Lefkos Nanos) by Polyplanity Productions: This experimental project creates a novel theatrical experience by fusing augmented reality with live performance through the interaction of digital materials with performers on stage[9].
  • Smart Subs by the Demokritos Institute: This project makes theatre performances more accessible to international and hearing-impaired audiences by using AR-powered smart captions that provide live subtitles[10].
  • XRAI Glass: The use of AI technology, in this case in combination with AR smart glasses, can provide real-time transcriptions and translations, enabling people with hearing impairments to follow along or comprehend plays in multiple languages[11].
  • National Theatre Smart Caption Glasses (UK): The National Theatre, in collaboration with Accenture and a team of speech and language experts led by Professor Andrew Lambourne, has developed “smart caption glasses” as part of the accessibility programme for its performances. The smart caption glasses have been in use since 2018 and were also demonstrated at the 2020 London Short Film Festival for cinematic screenings.

These applications show how VR and AR are improving visual effects while also increasing accessibility and inclusivity in theatre. Theatre companies can reach a wider audience, overcome language hurdles, and produce captivating, interactive shows that push the limits of conventional theatre by incorporating immersive technologies.

Conclusion

As technology advances, VR and AR will become increasingly used in theatrical performances, both to create a more immersive experience and to make theatre more accessible, attracting new audiences and expanding the reach of the performing arts. In an increasingly digital environment, these technologies will help ensure that live performances remain both revolutionary and relevant in the cultural context. Additionally, the creation of AI-powered VR and AR tools will make it possible to modify and customize shows according to audience preferences, resulting in more profound emotional experiences and unprecedented accessibility to theatre.

References

Azuma, Ronald T. “A survey of augmented reality.” Presence: Teleoperators & Virtual Environments 6.4 (1997): 355-385.

Iudova-Romanova, Kateryna, et al. “Virtual reality in contemporary theatre.” ACM Journal on Computing and Cultural Heritage 15.4 (2023): 1-11.

Jernigan, Daniel, et al. “Digitally augmented reality characters in live theatre performances.” International Journal of Performance Arts and Digital Media 5.1 (2009): 35-49.

Pike, Shane. “” Make it so”: Communal augmented reality and the future of theatre and performance.” Fusion Journal 15 (2019): 108-118.

Pike, Shane. “Virtually relevant: AR/VR and the theatre.” Fusion Journal 17 (2020): 120-128.

Srinivasan, Saikrishna. Envisioning VR theatre: Virtual reality as an assistive technology in theatre performance. Diss. The University of Waikato, 2024.

[1] Caudell, Thomas & Mizell, David. (1992). Augmented reality: An application of heads-up display technology to manual manufacturing processes. Proceedings of the Twenty-Fifth Hawaii International Conference on System Sciences. 2. 659 – 669 vol.2. 10.1109/HICSS.1992.183317.

[2] Azuma, Ronald T. “A survey of augmented reality.” Presence: teleoperators & virtual environments 6.4 (1997): 355-385

[3] Signiant. VR & AR: How COVID-19 Accelerated Adoption, According to Experts. 2024

[4] Verified Market Reports. Immersive Technologies Market Report. 2024

[5] Verified Market Reports. Immersive Technologies Market Report. 2024.

[6] Reuters. VR, AR headsets demand set to surge as AI lowers costs, IDC says. 2024.

[7] Mordor Intelligence. Europe Immersive Entertainment Market Report. 2024.

[8] National Theatre. Immersive Storytelling Studio. 2024.

[9] Polyplanity Productions. White Dwarf (Lefkos Nanos). 2024.

[10] Demokritos Institute. Smart Subs Project. 2024.

[11] XRAI Glass. Smart Glasses for Real-Time Subtitles. 2024.


Greta Ioli

Greta Ioli is an EU Project Manager in the R&D department of Maggioli Group, one of Italy's leading companies providing software and digital services for Public Administrations. After earning a degree in International Relations – European Affairs from the University of Bologna, she specialized in European projects. Greta is mainly involved in drafting project proposals and managing dissemination, communication, and exploitation activities.


Partner Interview #8 with F6S

The VOXReality project is driving innovation in Extended Reality (XR) by bridging this technology with real-world applications. At the heart of this initiative is F6S, a key partner ensuring the seamless execution of open calls and supporting third-party projects (TPs) from selection to implementation. In this interview, we sit down with Mateusz Kowacki from F6S to discuss their role in the consortium, the impact of mentorship, and how the project is shaping the future of AI and XR technologies.

Can you provide an overview of your organization's involvement in the VOXReality project and your specific role within the consortium?

F6S played a crucial operational role in the VOXReality project by managing the preparation and execution of the open calls. This thorough approach involved designing the application process (determining eligibility criteria, application requirements and evaluation metrics), developing and disseminating the call, and managing the selection and implementation of the TPs’ projects.

Essentially, F6S acted as the facilitator ensuring a smooth and efficient process of preparing and implementing open calls.

How do you ensure that both mentors and the projects they guide benefit from the mentorship process, and what does that look like in practice?

There are a lot of important factors that made the mentoring process within the VOXReality project a success, but one of the key elements is communication. That involves clearly outlining the roles and responsibilities of both the mentor and the project team, including setting expectations for communication frequency, meeting schedules, and deliverables. What is more, we regularly check in with both mentors and projects to assess progress, identify any challenges, and provide support, and we gather feedback on the mentorship experience to continuously improve the programme. Those are for sure the core elements of successful implementation. What we also developed in Sprint 2, based on lessons learnt from Sprint 1, is a clear calendar of upcoming activities that involve TPs and mentors. That helps us with better execution and a better understanding of our tasks.

Regular meetings, check-ups and openness to discussion have also played a crucial role. F6S helped all partners to better execute and navigate the implementation of the open call.

How does the VOXReality team ensure that the XR applications being developed are both innovative and practical for real-world use?

The VOXReality team employs a multi-faceted approach to ensure that the XR applications being developed are both innovative and practical for real-world use. By funding projects through open calls, VOXReality fosters innovation and encourages a diverse range of ideas and approaches. This collaborative approach ensures that the development of XR applications benefits from the expertise of a wider community, leading to more creative and practical solutions. So basically, the whole selection process has been designed to attract the most innovative technologies possible. We have been lucky to attract a lot of applications, so selecting five TPs has not been an easy task, as many projects offered good value in terms of innovation and real-world use. Nevertheless, we believe that the five selected entities represent the best potential for future development, and we are sure that their pursuit of innovation will end in success.

The language translation system for VOXReality prioritizes cultural sensitivity and artistic integrity by relying on these literary translations, which capture the cultural nuances and emotional subtleties of the original text. To ensure that these aspects are preserved throughout the development, we conduct thorough evaluations of the translation outputs through internal checks. This evaluation is crucial for verifying that the translations maintain the intended cultural and artistic elements, thereby respecting the integrity of the original performance.

How do you think the VOXReality Open Call and the coaching process will shape the success and growth of innovative projects in the XR and AI fields?

I believe that the idea of cascade funding is crucial for discovering the potential of small teams of creative professionals, and projects like VOXReality certainly help them take their activities to a higher level and a bigger audience. The role of a coach is to ensure the successful implementation of each TP’s project within VOXReality, but also to show the bigger picture of possibilities within the sector of publicly funded projects.

What excites you most about the Third-Party Projects joining VOXReality, and how do you believe AI and XR technologies will reshape the industries they are targeting?

The cooperation with them. It is very interesting to see how they work and how they interact: the dynamism and agility, while at the same time keeping to deadlines and meeting expectations. It is something that can inspire not only them but also bigger entities to sometimes think outside the box and leave the comfort zone. For some of these entities, VOXReality is a game changer in their entrepreneurial history, and we are very happy to be part of it. XR technologies have great potential to change and shape our everyday life, but we always need to see the real, social value in what we are doing with XR technologies. That is one of the mottos we have in VOXReality: to bring real value to society.


Mateusz Kowacki

EU Project Manager @ F6S


Hybrid XR Training: Bridging Guided Tutorials and Open-Ended Learning with AI

Leesa Joyce

Head of Research Implementation at Hololight

Training methodologies significantly impact skill acquisition and retention, especially in fields requiring psychomotor skills like machining or other technical tasks. Open-ended and close-ended training represent two distinct pedagogical frameworks, each with unique advantages and limitations.

Close-Ended Training

Traditional close-ended training is characterized by its structured and prescriptive approach. Tasks are typically designed with a single correct pathway, focusing on minimizing errors and ensuring compliance with predefined standards. This method is effective for teaching specific skills that require strict adherence to safety protocols or operational sequences, such as in high-stakes environments like aviation or surgery (Abich et al., 2021).

However, close-ended systems often limit learners’ creativity and adaptability by discouraging exploration. They can result in a rigid learning experience, where trainees may struggle to apply knowledge flexibly in unstructured real-world scenarios (Studer et al., 2024). Additionally, such systems may reduce engagement, as they often emphasize repetition and discourage deviation from the expected path.

Open-Ended Training

Open-ended training, in contrast, fosters exploration and self-directed learning. It is rooted in constructivist principles, emphasizing active engagement and the development of problem-solving skills through exploration. This approach allows multiple pathways to achieve the same goal, encouraging learners to experiment and understand the underlying principles of tasks (Land & Hannafin, 1996).

In the context of psychomotor skills, open-ended training enables learners to adapt to different tools, approaches, and constraints. For example, an open-ended VR system for machining skills, as demonstrated by Studer et al. (2024), allows trainees to achieve objectives using various methods while enforcing critical protocols where necessary. This flexibility mirrors real-world scenarios where tasks rarely follow a single blueprint, enhancing learners’ readiness for practical challenges.

Benefits and Challenges

Open-ended training excels in promoting creativity, adaptability, and deeper conceptual understanding. Studies have shown that learners trained in open-ended environments often exhibit better problem-solving abilities and higher engagement levels (Ianovici & Weissblueth, 2016). For instance, in the manufacturing industry, employees may encounter situations requiring innovative approaches to meet production goals. An open-ended framework better equips them for such challenges.

However, this method may be less suitable for beginners who require a clear framework to build foundational skills. Research suggests that novices benefit from close-ended approaches to develop initial competence before transitioning to more exploratory methods (Ianovici & Weissblueth, 2016).

Applications in Modern Training

The integration of technologies such as Virtual Reality (VR) and Augmented Reality (AR) into learning processes has amplified the potential of open-ended training. Extended Reality (XR) platforms can simulate diverse scenarios, offering real-time feedback and dynamic task adjustments to accommodate different learning styles and levels of expertise (Abich et al., 2021). In contrast, close-ended modules provide step-by-step instructions for specific tasks, ensuring accuracy and consistency.

For example, the open-ended XR training system for machining tasks by Studer et al. (2024) combines guided tutorials with open-ended practice. This hybrid approach balances structure and flexibility, addressing the limitations of both methods.

The choice between open-ended and close-ended training should align with the learners’ needs, the complexity of the skills being taught, and the desired outcomes. While close-ended training ensures compliance and foundational competence, open-ended training prepares learners for the dynamic and unpredictable nature of real-world challenges. Leveraging both approaches in a complementary manner, particularly through advanced technologies like XR, offers a comprehensive framework for effective skill development.

Hybrid Learning in Industrial Assembly Lines: VOXReality’s Transformative Approach

The VOXReality project revolutionizes hybrid learning in industrial assembly lines by integrating cutting-edge AI-driven natural language processing and speech recognition modules. This approach addresses the key challenges of open-ended training, such as a lack of familiarity with machinery, uncertainty about assembly protocols, safety concerns, and insufficient guidance. By offering real-time interaction and support, VOXReality fosters an environment where workers can learn dynamically and creatively without feeling overwhelmed. The system enables users to receive immediate feedback and contextual instructions, paving the way for more efficient and engaging open-ended training scenarios. VOXReality not only enhances workforce competence but also ensures a safer and more intuitive learning process in industrial settings.

References
  1. Abich, J., Parker, J., Murphy, J. S., & Eudy, M. (2021). A review of the evidence for training effectiveness with virtual reality technology. Virtual Reality, 25(4), 919–933.
  2. Ianovici, E., & Weissblueth, E. (2016). Effects of learning strategies, styles, and skill level on motor skills acquisition. Journal of Physical Education and Sport, 16(4), 1169.
  3. Land, S. M., & Hannafin, M. J. (1996). A conceptual framework for theories-in-action with open-ended learning environments. Educational Technology Research and Development, 44(3), 37–53.
  4. Studer, K., Lie, H., Zhao, Z., Thomson, B., & Turakhia, D. (2024). An Open-Ended System in Virtual Reality for Training Machining Skills. CHI EA ’24.

Beyond the Jargon: Coordinating XR and NLP Projects Without Losing Your Headset

Extended Reality (XR) technologies provide interactive environments that facilitate immersion by overlaying digital elements onto the real world, by allowing digital and physical objects to coexist and interact in real time, or by transporting users into a fully digital world. As XR technologies evolve, there is an increasing demand for more natural and intuitive ways to interact with these environments. This is where Natural Language Processing (NLP) comes in. NLP technologies enable machines to understand, interpret, and respond to human language, providing the foundation for voice-controlled interactions, real-time translations, and conversational agents within XR environments. Integrating NLP technologies into XR applications is a complex process and requires the collaborative effort of experts from a wide range of areas, such as artificial intelligence (AI), computer vision, human-computer interaction, user experience design and software development, as well as domain experts from the fields the XR application is targeting. Consequently, such an effort requires a comprehensive understanding of both the scientific underpinnings and the technical requirements involved, as well as targeted coordination of all stakeholders in the research and development processes.

The role of a scientific and technical coordinator is to ensure that the research aligns with the development pipeline, while also facilitating alignment between the interdisciplinary teams responsible for each component. The scientific and technical coordinator needs to ensure a smooth and efficient workflow and facilitate cross-team collaboration, know-how alignment, scientific grounding of research and development, task and achievement monitoring, and communication of results. In VOXReality, Maastricht University’s Department of Advanced Computing Sciences (DACS) serves as the scientific and technical coordinator.

Our approach to scientific and technical coordination, centered on the pillars of Team, Users, Plan, Risks, Results, and Documentation, aligns closely with best practices for guiding the research, development, and integration of NLP and XR technologies. Building a Team of interdisciplinary experts, having clear communication, and ensuring common understanding among the team members fosters innovation and timely and quality delivery of results through collaboration. A focus on Users from the beginning until the end ensures that the project is driven by real-world needs, integrating feedback loops to create intuitive and engaging experiences. A detailed Plan, with well-defined milestones and achievement goals, provides structure and adaptability, ensuring the project stays on track despite challenges. Proactively addressing Risks through contingency planning and continuous performance testing mitigates potential disruptions. Tracking and analyzing Results against benchmarks ensures the project meets its objectives while delivering measurable value. Finally, robust Documentation preserves knowledge, captures lessons learned, informs the stakeholders, and paves the way for future innovation.

The VOXReality project consists of three phases, each requiring a dedicated approach to scientific and technical coordination. Phase 1 is the “Design” phase, where the focus is on designing the research activities as well as the challenges, defined by the use cases, that they seek to address. This phase mainly concerns requirements, which need to be collected from different stakeholders, analysed in terms of feasibility, scope and relevance to the project goals, and mapped to functionalities in order to derive the technical and systemic requirements for each data-driven NLP model and the envisaged XR application. Phase 2 is the “Development” phase, where the core development of the NLP models and the XR applications takes place. In this phase, the models need to be verified and validated on a functional, technical and workflow basis, and the XR applications that integrate the validated NLP models need to be tested both by the technical teams and by end-users. Finally, Phase 3 is the “Maturation” phase, where the VOXReality NLP models and XR applications are refined and improved based on the feedback received from the evaluation of Phase 2. A thorough technical assessment of the models and applications needs to be conducted before their final validation with end-users in the demonstration pilot studies. Moreover, five projects outside of the project consortium will have access to the NLP models to develop new XR applications.

Currently, we have completed the first two phases: requirement extraction is complete, the first versions of the NLP models have been released, the first versions of the XR applications have been developed, and software tests and user pilot studies have been completed successfully. In Phase 1, from the scientific and technical coordination point of view, weekly meetings were organized. These meetings allowed each team involved in the project to familiarize themselves with the various disciplines represented and the terminology used throughout the project, to consolidate viewpoints regarding the design, implementation, and execution of each use case, and to formally define and document the use cases. In Phase 2, bi-weekly technical meetings and monthly achievement meetings were conducted. These meetings allowed us to monitor progress on the technical tasks, such as developments, internal tests and experiments, as well as the achievements reached and the methodologies followed to evaluate them.

This structured approach and the coordinated effort of all team members in the VOXReality project resulted in the release of 13 NLP models, 4 datasets, a code repository that includes 6 modular components, and 3 use case applications, as well as the publication of 4 deliverables in the form of technical reports and 6 scientific articles in journals and conferences.


Yusuf Can Semerci

Assistant Professor in Computer Vision and Human Computer Interaction at Maastricht University


Augmented Reality Theatre: Expanding Accessibility and Cultural Heritage Through XR Technologies

As new XR technologies emerge, the potential for significantly increasing the accessibility of theatrical performances grows, creating new opportunities for inclusion and ensuring that wider audiences can fully experience these performances. Our project exemplifies the benefit of using new technologies to preserve cultural heritage and improve accessibility through a pilot on Augmented Reality Theatre, realized by the joint efforts of Gruppo Maggioli, a leading IT company in the Italian market, Adaptit, a telecommunications innovator, and the Athens Epidaurus Festival, one of Greece’s leading cultural organisations and organiser of the summer festival of the same name. Our project is part of the European Union-funded Research and Innovation Action VOXReality, investigating voice-driven interaction in XR spaces.

Pilot 1 at Alkmini Theatre in Athens, Greece, with Lefteris Polychronis
Captions in theatres

Captions are an essential feature for increasing accessibility and inclusivity in theatres. They provide real-time text descriptions of spoken dialogue or important auditory cues. Primarily designed to assist individuals with hearing impairments (e.g. deaf or hard of hearing), they can provide comprehension support for any member of the audience. Providing captions allows any individual to follow the narrative, understand nuanced dialogues, and appreciate the performance’s full context.

Translations of the captions are designed to allow non-native speakers or those who do not understand the performance’s original language to also fully engage with the content. This feature is particularly beneficial in culturally diverse communities or international venues where audiences may come from various linguistic backgrounds. Caption translations open up educational and cultural exchange by broadening the reach of performances to global audiences.

Delivery formats: Open vs. Closed captions

Typical caption delivery in theatres falls under two categories: open captions and closed captions. Open captions are displayed on screens placed around the stage or projected onto the stage itself. They are visible to everyone in the audience and cannot be turned off. Since they are essentially a part of the theatrical stage, they can be designed to artistically blend with the stage’s scenic elements. Open captions fall short when it comes to translations, since only a limited number of languages can be displayed simultaneously. Furthermore, their readability is not uniform across all audience seats, since distance, angle and obstacles affect visibility.
Closed captions are typically displayed on devices, such as captioning devices or smartphone apps, that can be activated by audience members themselves. They provide flexibility to be turned on or off depending on the individual’s need and allow for customizable settings, such as font size and color adjustments, catering to individual preferences. They are also ideal for caption translations, since each user can select their preferred language.

With regards to accessibility and inclusivity, closed captions are a preferable option due to the extensive customizations, which can improve readability and comprehension. On the downside, they require a more elaborate technical framework to synchronize the delivery of the captions to the audience’s devices, and they raise considerations about the usability of the device or application on the audience’s side.
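As a rough illustration of what such a synchronization framework might involve, the sketch below broadcasts caption cues to connected audience devices over WebSockets (using the third-party Python websockets package); the message format and the cue shown are purely hypothetical:

```python
import asyncio
import json
import websockets  # third-party package: pip install websockets

connected = set()   # audience devices currently connected

async def handler(websocket):
    """Register an audience device and keep its connection open."""
    connected.add(websocket)
    try:
        await websocket.wait_closed()
    finally:
        connected.discard(websocket)

def push_caption(text: str, language: str) -> None:
    """Broadcast one caption cue to every connected device."""
    websockets.broadcast(connected, json.dumps({"lang": language, "text": text}))

async def main():
    async with websockets.serve(handler, "0.0.0.0", 8765):
        push_caption("Example caption cue", "en")  # hypothetical cue from the caption operator
        await asyncio.Future()                     # keep serving until interrupted

# asyncio.run(main())
```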

AR closed captions

Closed captions are usually delivered using smartphone screens, but they can also be delivered using AR glasses. AR glasses can display the captions directly on the lenses with minimal obstructions to the user’s visual field. This allows the user to focus on a single visual frame of reference, instead of looking back and forth between smartphone screen and stage. This makes for an improved user experience without fear of missing out and can also benefit comprehension because of reduced mental workload. The AR delivery mode multiplies the potential benefits in terms of accessibility, but also the usability concerns and the theatre’s technological capacity.

Contextual Commentary

Aiming to foster a deeper connection with the performance, another feature introduces contextual commentary. Delivered through AR glasses, this commentary may include background information such as character insights, cultural and historical context, or artistic influences. This approach enhances artistic expression by giving theater directors the ability to curate and control the information shared with the audience, but it may also serve as a powerful tool for the preservation and sharing of the cultural context of theatrical plays. This is especially needed for the preservation of the ancient Greek culture and its dissemination to a wider audience. The interactive and immersive mode of delivery allows for a dynamic presentation, overcoming cultural and language barriers and making performances more inclusive.

VoxReality contribution

To assess those benefits in practice, our project aims to deliver an excerpt of an Ancient Greek tragedy to an international audience amplified with augmented reality translated captions and audiovisual effects. The play selected for the project is Hippolytus by Euripides, translated by Kostas Topouzis and adapted by the Athens Epidaurus Festival’s Artistic Director, Katerina Evangelatos. This is an ambitious pilot whose results can help determine AEF’s future course for performances of international appeal – a challenging task taking AEF’s heavy cultural weight into account.

The first user evaluation was completed in May 2024 with a closed performance. Despite being at an early technical and aesthetic level, the initial evaluation was decisively positive with users stating that they would be interested in attending this kind of theatre in real conditions, and that they saw practical benefit and artistic merit in the provided features. Negative feedback was focused on the technical performance of the system and the learning curve of the AR application.

Screen recording from the AR glasses during performance, showing AR captions and visual effects.
Future Perspectives: Enhancing Theatre Through Innovation

An important potential aspect of future theatre will be the audience’s ability to individualize what is otherwise a collective experience—whether for practical reasons, as in VoxReality, or for various artistic purposes. This technology-supported sensitivity can allow for broader participation, and thus broader representation, in the future of the performing arts. Our relationship with the future, though, can be shaped by revisiting our relationship with the past: through this new lens we can lift linguistic barriers to exchange cultural works between communities worldwide, and we can revisit our own cultural heritage with a renewed understanding, both of which can shape our contemporary cultural identity. This is an exciting era of fast-moving changes that leave us with the challenge of comprehending our own potential – a challenge that will be determined by our ability to disseminate knowledge and promote collaboration.

The first public performances will be delivered in May 2025 in Athens, Greece, during the Festival’s 70th anniversary year, and will be open for attendance through a user recruitment process. Theatre lovers who are not fluent in Greek are wholeheartedly welcome to attend.


Olga Chatzifoti

Extended Reality applications developer working with Gruppo Maggioli for the design and development of the Augmented Reality use case of the VOXReality HORIZON research project. She is also a researcher in the Department of Informatics and Telecommunications of the University of Athens. Under the mentorship of Dr. Maria Roussou, she is studying the cognitive and affective dimensions of voice-based interactions in immersive environments, with a focus on interactive digital narratives. She has an extensive, multidisciplinary educational background, spanning from architecture to informatics, and has performed research work on serious game design and immersive environments in Europe, USA and the UK.

&

Elena Oikonomou

Project manager for the Athens Epidaurus Festival, representing the organization in the HORIZON Europe VOXReality research project to advance innovative, accessible applications for cultural engagement. With a decade of experience in European initiatives, she specializes in circular economy, accessibility, innovation and skills development. She contributes a background that integrates insights from social sciences and environmental research, supporting AEF’s commitment to outreach and inclusivity. AEF, a prominent cultural institution in Greece, has hosted the renowned annual Athens Epidaurus Festival for over 70 years.


What is to my left?

As is usual when comparing the capability of humans against machines in intelligence tasks, we notice that people can effortlessly categorise basic spatial relationships between objects, which is useful when reasoning, planning, or engaging in a conversation to reach a goal. The objects may be far from the observer, and the objects themselves may be far apart from each other. In any possible setting, we may want to know how two objects relate to each other in terms of a fixed set of spatial relationships that are commonly used in daily life.

Computationally, we may need to infer these relationships given only a colour photograph of some objects and rectangular boxes covering the objects of interest. For example, given that input, we may want to state that “one object is below another object” — or, in another case, that “object A is to the left of object B, and object A is behind object B”. We immediately see that a pair of objects can admit more than one spatial relationship at the same time.

Open-sourcing AI software

In the domain of Artificial Intelligence (AI), an algorithm to infer the spatial relationships between objects (usually considered in pairs) would be useful if it were implemented and shared with developers around the world as a library routine that any AI developer would want to have in place. Over roughly the last fifteen years, sharing code publicly as open source has become a global trend. Code implementing very important algorithms or intelligent workflows is shared with any developer, provided that they acknowledge the terms of a license agreement such as the GPL, LGPL or the MIT license. The need for developers to reinvent the wheel for basic tasks then becomes smaller and smaller as code is continuously contributed publicly, offering robust implementations of algorithms for different intelligent tasks across a range of programming languages. If a developer still cannot find what they need in an open-source library dedicated to a specific problem domain (for example, computer vision), they can dig into the available code and extend it themselves to fit their technical requirements. If their contribution is important and useful for other programmers who need the same features, they can submit their code changes to the maintainers of the library (or any other open-source software) for review. Hopefully, their contribution will be included in a future release of the software.

Capturing failures

Software engineers developing robotics applications, for example, would want a set of such routines when building simulation workflows for robots interacting with objects. On one side of the coin, these routines should be reliable enough to be reused in software applications that feature sufficiently correct error handling and some ability to leverage the failure evidence generated by the model. These are useful in order to achieve correctness and better error control in the underlying application.

What AI programs “see”

Although humans can reason effortlessly and very accurately about the basic spatial relationships between pairs of objects, this task is not as easy for computers to solve. While humans can, for instance, see two objects and state that “the blue car is next to the lorry”, computers are only given a rectangular table of numbers that define the red, green and blue colour intensities of its cells; these cells correspond to the pixels of the underlying colour image. The goal, then, is to use a program that takes in this rectangular array of pixel intensities together with bounding boxes covering two objects of interest, and decides how the two objects inside the two bounding boxes relate spatially. The program acts like an “artificial brain” targeted at solving only this task and nothing else, one that can make sense of the table of numbers it receives. It helps to recall a basic fact taught in introductory programming courses: programs implement algorithms, and programs implementing algorithms receive input data and produce output data; here, the sequence of steps the program executes finally yields a decision about the given input.
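In code, the interface of such a program could look like the following sketch; the type aliases and the function name are illustrative, the relation list is the one from the benchmark described later in this article, and the body is left as a placeholder for a trained model:

```python
import numpy as np

# What the "artificial brain" actually receives: an H×W×3 array of colour
# intensities and two axis-aligned bounding boxes (x_min, y_min, x_max, y_max).
Image = np.ndarray                       # shape (H, W, 3), dtype uint8
Box = tuple[int, int, int, int]

RELATIONS = ["above", "behind", "in", "in front of", "next to",
             "on", "to the left of", "to the right of", "under"]

def predict_relations(image: Image, subject_box: Box, object_box: Box) -> list[str]:
    """Placeholder: a trained classifier mapping raw pixels + boxes to relation labels."""
    raise NotImplementedError("a trained model such as RelatiViT would go here")
```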

Usefulness of spatial relation prediction

The steps implemented in a program that identifies the spatial relationships between two objects in an image follow an algorithm designed to take the input describing the two objects and produce as output the actual spatial relation from a fixed set of relations. At any given moment, the program may be correct about the output it produces, or it may give a wrong answer; having such a program be correct 100% of the time seems impossible at the moment, though future advances may change this. The output may be reported to the person operating the computer, or it can be passed on to another program that considers the predicted relation and then makes other decisions. For example, we may want a program that executes a particular operation provided that one or more conditions hold. In a monitoring application that receives data from a camera, one case of conditional execution that considers the spatial relationship between two detected entities (objects) could be: “if a (detected) person is on the (detected) staircase, then turn the light on”.
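The staircase rule then reduces to a simple conditional over the classifier’s predicted relations; the function below and its arguments are hypothetical glue code, independent of any particular model:

```python
def on_new_camera_frame(relations: list[str], turn_light_on) -> None:
    """Hypothetical monitoring logic built on top of a spatial-relation classifier.

    `relations` is whatever the classifier predicted for the (person, staircase)
    pair, e.g. ["on"] or ["next to", "in front of"].
    """
    if "on" in relations:   # "a (detected) person is on the (detected) staircase"
        turn_light_on()

# Example usage with a dummy action:
on_new_camera_frame(["on"], lambda: print("light on"))
```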

Notice that by being able to reliably check that “a (detected) person is on the (detected) staircase”, developers can completely bypass the need to write complicated geometric or algebraic rules defining what it means for one object to be on another. Hand-crafting such rules is a potential failure point in the development process: the rules may be wrong for some inputs while working fine for others. As a downside, a machine learning algorithm for this task will also make errors, and the application logic of our program would want to “know” why. Fortunately, the advances discussed so far enable us to write applications that reason about the spatial relationships of objects in colour images.

RelatiViT: a state-of-the-art model
Figure 1. Examples of colour images containing several objects. Only two objects are marked, with a red and a blue bounding box, and for each case the classification of a particular spatial relation is shown. Image adapted from [1].

It is important to note that computer algorithms are still not excellent at deciding the spatial relationships between (pairs of) objects. In a recent paper presented by Wen and collaborators [1] at ICLR 2024 in Vienna, the authors devised modern spatial relationship classification algorithms based on deep convolutional neural networks or on Transformer [2] deep neural networks. They singled out one of their models, called RelatiViT, as superior, identified through comparative experiments on two benchmarks. This computer vision algorithm can decide how two objects relate spatially when it is given a colour photograph of the objects with their surrounding background, along with rectangular bounding boxes covering the two objects.

Wen et al. used two benchmark datasets with examples of objects and bounding boxes covering them (see Figure 1); a portion of the data was reserved to train their spatial relationship classifier, and another portion was used to see how well the algorithm generalises to yet unseen data. This is standard practice when building a machine learning model, since we want to evaluate empirically how good the model is; evaluating a model on examples that were used to train it is undesirable (although, outside of model testing, we may do so if we wish). The first benchmark provides pairs of objects in 30 spatial relationships, and the second benchmark provides 9. Interestingly, the 9 spatial object relationships that the latter benchmark considers are: “above”, “behind”, “in”, “in front of”, “next to”, “on”, “to the left of”, “to the right of” and “under”.

Quantitative score of success

For the two benchmarks, the authors report that the average ratio of correct spatial relationship classifications, computed per spatial relationship, is a little higher than 80%. This essentially means that, on the controlled benchmark, RelatiViT responds correctly on average to about 8 out of 10 inputs for any given spatial relationship, when all of the available test cases in the benchmark are tried.
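The figure quoted above is an average over per-relationship accuracies; as an illustration (the benchmarks’ exact evaluation protocol may differ), such a metric can be computed as follows:

```python
import numpy as np

def mean_per_relation_accuracy(y_true: list[str], y_pred: list[str],
                               relations: list[str]) -> float:
    """Average of the per-relationship accuracies over all relations present in y_true."""
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    accs = []
    for rel in relations:
        mask = y_true == rel                 # test cases whose ground truth is this relation
        if mask.any():
            accs.append((y_pred[mask] == rel).mean())
    return float(np.mean(accs))

# Toy example:
print(mean_per_relation_accuracy(["on", "on", "under"], ["on", "under", "under"],
                                 ["on", "under"]))
```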

Adoption of advances circa ‘17

Over the last seven years, a basic and thriving technique in general-purpose deep learning has been a machine learning model called the Transformer. This model was proposed in 2017 by Vaswani and collaborators [2], and it had been cited more than 139,000 times at the time this article was written. The Transformer is an advance in deep learning that researchers have been studying, reusing and redesigning in formulations of different sorts for machine learning problems. An essential condition for accepting a new machine learning model as a successor (or winning) model for a particular problem is that models employing Transformer-like formulations prove empirically better than models built on previous regimes of basic models or algorithms (such as, for instance, Generative Adversarial Networks or other past developments). Superiority is always measured in terms of one or more quantitative metrics, although this practice has received constructive criticism from researchers in recent years. At this point, there is a subtle point worth knowing: accepting that, for instance, Transformers are successful successors to previous theory does not devalue that previous theory. It does, however, imply that a better solution may exist by reusing the recent advance, according to a range of quantitative metrics (which are still not the end of the story when we compare a sequence of models in terms of their merits).

Basic input/output in a Transformer

The basic operation performed by a Transformer model is to receive as input a list of vectors and output a list of corresponding vectors, after first identifying and capturing the true associations between the vectors in the input list. This model is used in a very large set of basic AI problems; some important applications or application areas are image segmentation, classification problems of all sorts, speech separation, and problems in the remote sensing of Earth observation data. Researchers have been committing time and effort to reformulating virtually all known basic machine learning problems (like classification, clustering, etc.) around the idea of the Transformer model by Vaswani et al. [2].
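This list-of-vectors-in, list-of-vectors-out behaviour can be seen directly with PyTorch’s built-in Transformer encoder; the dimensions below are arbitrary:

```python
import torch
import torch.nn as nn

d_model = 256                        # dimensionality of each input vector
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)

tokens = torch.randn(1, 10, d_model)  # a list (sequence) of 10 input vectors
outputs = encoder(tokens)             # 10 output vectors, one per input vector
print(outputs.shape)                  # torch.Size([1, 10, 256])
```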

Structure of the RelatiViT
Figure 2. Depiction of the four object-pair spatial relationship classification models from the recent study of Wen and collaborators [1]. Image adapted from reference [1].

Wen et al. [1] considered four models (see Figure 2) that take as input a colour image and two bounding boxes covering two objects of interest to the user. In this article we focus only on the rightmost model, the one called RelatiViT. RelatiViT is a state-of-the-art model that encodes not only information about the two objects but also information describing the context of the image; people certainly employ such cues in their own decisions. The context is the clipped portion of the image enclosed by the union of the bounding boxes covering the two objects, the subject and the object; for an example of what context looks like, see Figure 2 (a). Clearly, the information (or even the raw data) surrounding the two objects is very important for deducing how they are arranged spatially in a colour image.
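
As a small illustration of the context region described above, the following sketch computes the box enclosing the union of two bounding boxes; it mirrors the textual description rather than the exact cropping procedure of [1].

```python
def context_box(subj_box, obj_box):
    """Smallest axis-aligned box enclosing both input boxes, as (x1, y1, x2, y2)."""
    x1 = min(subj_box[0], obj_box[0])
    y1 = min(subj_box[1], obj_box[1])
    x2 = max(subj_box[2], obj_box[2])
    y2 = max(subj_box[3], obj_box[3])
    return (x1, y1, x2, y2)

# The context crop spans both objects plus the background lying between them.
print(context_box((10, 20, 80, 90), (100, 30, 160, 95)))  # (10, 20, 160, 95)
```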

The RelatiViT model processes data in five basic steps: (a) it initially considers small image patches residing in the background of the image and small patches residing within the regions of the two objects, thereby creating three lists of patch embeddings; (b) these three lists of embeddings are passed through a ViT encoder [3]; (c) the ViT encoder recalculates (or “rewrites”) the vectors in each list so that they become better related to one another, producing three updated lists of embeddings of the same size; (d) since each of the two objects should be described by a single vector, RelatiViT aggregates the complementary information in each set of embeddings with a pooling operation, computing one global representation from the partial representation vectors; and finally, (e) the pooled representations of the two objects and the representation of the object context are passed to a multilayer perceptron (MLP) that decides the spatial relationship characterising the two objects. The MLP therefore learns to map object-pair features to spatial relationship classes when RelatiViT is trained on example triplets (a subject, an object, and the ground-truth spatial relationship relating them), provided in small batches of data. To train RelatiViT, we may need at least one modern GPU mounted on a regular modern personal computer with sufficient RAM. Software stacks such as PyTorch and TensorFlow, developed over the last decade, allow machine learning and computer vision developers to prototype deep neural networks and train them on data.
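
The following is a deliberately simplified PyTorch sketch of steps (a) to (e); the dimensions, patch handling and pooling choices are invented for illustration and do not reproduce the implementation of [1].

```python
import torch
import torch.nn as nn

class ToyRelationNet(nn.Module):
    """Simplified sketch of a RelatiViT-style pipeline; all details are invented."""

    def __init__(self, dim=64, num_relations=9):
        super().__init__()
        # (a) patch embeddings: flattened 16x16 RGB patches projected to `dim`;
        # a real system would use the patch embedding of a pretrained ViT.
        self.patch_embed = nn.Linear(16 * 16 * 3, dim)
        # (b)-(c) a Transformer encoder relating all patch vectors to each other.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # (e) an MLP head mapping pooled features to spatial relationship classes.
        self.head = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(),
                                  nn.Linear(dim, num_relations))

    def forward(self, subj_patches, obj_patches, ctx_patches):
        # Each input has shape (batch, num_patches, 16 * 16 * 3).
        n = subj_patches.shape[1]
        tokens = torch.cat([self.patch_embed(subj_patches),
                            self.patch_embed(obj_patches),
                            self.patch_embed(ctx_patches)], dim=1)
        tokens = self.encoder(tokens)              # (b)-(c) contextualise patches
        subj = tokens[:, :n].mean(dim=1)           # (d) pool subject patches
        obj = tokens[:, n:2 * n].mean(dim=1)       # (d) pool object patches
        ctx = tokens[:, 2 * n:].mean(dim=1)        # pool context patches
        return self.head(torch.cat([subj, obj, ctx], dim=-1))  # (e) classify

model = ToyRelationNet()
fake = torch.randn(2, 16, 16 * 16 * 3)             # random stand-in for patches
logits = model(fake, fake, fake)
print(logits.shape)                                # torch.Size([2, 9])
```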

Generating explanations

Before we conclude this article, here is an important question that will always reappear when developing machine learning models: can we reuse models like RelatiViT in critical applications where errors could be harmful or intolerable? We should first recognise that developments like RelatiViT aim only to create good classifiers for recognising spatial relationships between objects. The models are crafted using the designer’s understanding of how such a model should be built, and no further features, such as validation of the classifications, are sought. One could quickly take this for a flaw of the method, but it is not: each piece of research has to define a scope, and only contributions within that scope are committed to the work.

How to prove (when that is possible at all) that a predicted spatial relationship is in fact the true relationship connecting two objects falls within the subject of explainability in deep learning. Explainability models uncover and report evidence about a particular decision, and they are relevant to almost all basic machine learning problems (including, for instance, classification and clustering). For example, if “object A is to the left of object B”, this may be the case because the mass of object B is situated to the right of the mass of object A. Another explanation could be that the centres of mass of the two objects are ordered in this way along the horizontal axis, which we can verify simply by comparing the horizontal projections of the two centres of mass; doing so lets us compute the relative spatial relationship between the two objects. We start to realise, then, that many explanations can describe the same event. Some explanations are equivalent but stated differently; others are complementary and are useful to report to the user of a spatial relationship classifier. An explanatory model in a deep learning system such as the one discussed here should provide the user with evidence that is as comprehensive as possible, and with as many pieces of evidence as the explanatory algorithm can produce by design.

There is another point that is critical by nature: can we simply trust explanations and treat them as correct without reasoning further about their correctness? The answer is no, unless the algorithm can provably produce explanations that are verified before they are reported to the user. This becomes possible when we restrict ourselves to a particular application out of the very large pool of possible machine learning problems and data.
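
To make the centre-of-mass argument above concrete, here is a small sketch that uses bounding-box centres as a proxy for centres of mass and reports a verifiable textual justification; it is one possible rule-based explanation for illustration, not part of the RelatiViT method or of LBEE.

```python
def box_centre(box):
    """Centre of an (x1, y1, x2, y2) bounding box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def explain_left_of(subj_box, obj_box):
    """Check whether the subject centre lies left of the object centre and say why."""
    sx, _ = box_centre(subj_box)
    ox, _ = box_centre(obj_box)
    if sx < ox:
        return f"subject centre x={sx:.1f} < object centre x={ox:.1f}: subject is to the left"
    return f"subject centre x={sx:.1f} >= object centre x={ox:.1f}: subject is not to the left"

# Example with invented boxes: the justification can be verified from the geometry.
print(explain_left_of((10, 20, 80, 90), (100, 30, 160, 95)))
```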

References
[1] Wen, Chuan, Dinesh Jayaraman, and Yang Gao. “Can Transformers Capture Spatial Relations between Objects?” arXiv preprint arXiv:2403.00729 (2024).

[2] Vaswani, A., et al. “Attention is all you need.” In Advances in Neural Information Processing Systems (2017).

[3] Dosovitskiy, A., et al. “An image is worth 16×16 words: Transformers for image recognition at scale.” arXiv preprint arXiv:2010.11929 (2020).

Sotiris Karavarsamis

Research Assistant at Visual Computing Lab (VCL)@CERTH/ITI


&
Petros Drakoulis

Research Associate, Project Manager & Software Developer at Visual Computing Lab (VCL)@CERTH/ITI


VOXReality Awards €1M to Boost XR Innovation Across Five European Initiatives

The VOXReality team is proud to announce the results of its Open Call, which supports pioneering institutions in their quest to innovate within the field of extended reality (XR). Each beneficiary is awarded 200K EUR in funding as part of a 1-year programme aimed at extending application domains for immersive XR technologies. This initiative is designed to integrate cutting-edge AI models into real-world applications, enhancing human-to-human and human-to-machine interactions across various sectors, including education, heritage, manufacturing, and career development.

Empowering Innovation: Spotlight on Selected Projects and Beneficiaries

Following a comprehensive evaluation process, the VOXReality team has selected five dynamic projects as beneficiaries, showcasing innovative approaches to XR applications:

MindPort GmbH & LEFX GmbH (Germany) – AIXTRA Programme

AIXTRA focuses on overcoming language barriers in international digital education through a VR authoring tool with automated real-time translation. By introducing AI-based virtual training partners, it aims to create a more inclusive learning environment, facilitating effective remote training sessions for multilingual participants.

Animorph Co-operative (UK) – CrossSense Programme

CrossSense smart glasses empower people living with Dementia and Mild Cognitive Impairment to live independently by supporting their ability to recall information. This project seeks to enhance user interaction in XR environments, laying the groundwork for a fully commercialised version through user testing and an open-sourced association engine.

XR Ireland (Ireland) & Āraiši ezerpils Archaeological Park (Latvia) – VAARHeT Programme

The Voice-Activated Augmented Reality Heritage Tours (VAARHeT) project aims to enhance visitor experiences at the Āraiši ezerpils Archaeological Park. By leveraging VOXReality’s AI models, the initiative will offer personalised museum tours and facilitate real-time multilingual translation of live tour guides, enriching educational opportunities for diverse visitors.

KONNECTA SYSTEMS P.C. (Greece) & IKNOWHOW S.A. (Greece) – WELD-E Programme

WELD-E addresses critical challenges in the welding industry by integrating voice and vision-based AI systems within an XR environment. This initiative aims to provide remote support for robotic welding operations, improving communication through speech recognition and automated translation to create a more effective training experience.

DASKALOS APPS (France) & CVCOSMOS (UK) – XR-CareerAssist Programme

XR-CareerAssist seeks to innovate career development by offering personalised, immersive experiences tailored to individual users. This project integrates VOXReality’s AI models to provide real-time feedback, potential career trajectories, and educational pathways, ensuring accessibility for a diverse range of users.

VOXReality’s overarching goal is to conduct research and develop new AI models that will drive future XR interactive experiences while delivering these innovations to the wider European market. The newly developed models focus on addressing key challenges in communication and engagement in various contexts, including unidirectional settings (such as theatre performances) and bidirectional environments (like conferences). Furthermore, the programme emphasizes the development of next-generation personal assistants to facilitate more natural human-machine interactions.

As VOXReality continues to advance XR and AI innovation, the successful implementation of these projects will pave the way for more immersive, interactive, and user-friendly applications. By fostering collaboration and knowledge sharing among the selected institutions, the VOXReality team is committed to enhancing the landscape of extended reality technologies in Europe.

Ana Rita Alves

Ana Rita Alves is currently working as a Communication Manager at F6S, where she specializes in managing communication and dissemination strategies for EU-funded projects. She holds an Integrated Master's Degree in Community and Organizational Psychology from the University of Minho, which has provided her with strong skills in communication, project management, and stakeholder engagement. Her professional background includes experience in proposal writing, event management, and digital content creation.


Boosting Industrial Training through VOXReality: AR’s Edge Over VR with Hololight’s training application

Leesa Joyce

Head of Research Implementation at Hololight

In today’s rapidly evolving industrial landscape, the methods used for training and skill development are undergoing significant transformations. Companies increasingly seek innovative solutions to make their training more efficient, engaging, and adaptable in prototyping and manufacturing. Two prominent technologies leading this change are Augmented Reality (AR) and Virtual Reality (VR) [1]. Both offer immersive experiences, but they serve different purposes, especially when applied to training. In the VOXReality project, where AI-assisted voice interaction in XR spaces plays a key role in building better user interfaces, HOLO’s extended reality application Hololight SPACE, together with its assembly training tool, brings a new set of advantages to the user.

The Hololight SPACE, integrated into the VOXReality project, offers an augmented reality industrial assembly training system that allows workers to visualize and manipulate 3D computer-aided design (CAD) models. Through AR glasses, such as HoloLens 2, trainees can assemble components with the guidance of real-time feedback from a virtual training assistant. Augmented Reality in this context enables users to interact with both virtual models and the physical environment simultaneously, which holds several key benefits over Virtual Reality.

Overlaying 3D Objects on Real-World Environments

One of the main advantages of AR over VR is the ability to overlay virtual 3D objects onto real-world environments. In industrial assembly training, this feature is crucial when physical objects must align with virtual components. For example, AR allows a user to see a virtual engine model and align it directly on top of a real physical framework, enhancing spatial understanding. This real-time interaction between digital and physical elements ensures a seamless integration, allowing trainees to bridge the gap between virtual simulations and real-world applications.

In contrast, VR immerses users in a fully simulated environment where all objects, tools, and machinery are virtual. While this can be useful for certain training applications, it falls short when trainees need to practice in real-world contexts or with actual physical tools.

Dynamic Adaptation Based on Real-World Measurements

One of the key benefits of using AR in industrial assembly training is the ability to adapt virtual models based on real-world measurements. For example, Hololight SPACE allows for precise alignment of CAD models with real tools or machinery, ensuring accuracy in assembly tasks. The AR environment can scale or adjust virtual objects based on physical constraints, giving trainees a practical experience that is directly transferable to their real-world roles.

In VR, the environment is entirely digital, which means trainees may struggle to apply their knowledge when transitioning to real-life tasks. Without the ability to manipulate real objects, VR training can create a disconnect between theory and practice.

Enhanced Situational Awareness and Safety

In AR-based training, users remain aware of their surroundings, which is particularly important in industrial settings [2]. Hololight SPACE enables trainees to interact with both virtual and physical objects, all while remaining aware of their immediate environment, coworkers, and potential hazards. This situational awareness promotes a safer training environment, as trainees can avoid accidents or conflicts that might arise when entirely isolated from their surroundings, as is common in VR.

This added level of awareness is not possible in VR, where users are immersed in a completely digital world, which can lead to disorientation or accidents when trying to translate virtual skills into real-world tasks.

Reduced Cybersickness and Mental Load

Cybersickness is a common issue in VR training [3]. The disconnect between the user’s physical body and the virtual world can result in motion sickness and fatigue, especially during long training sessions. In contrast, AR presents virtual objects within the real world, eliminating the sensory mismatch that often leads to VR-induced cybersickness. By anchoring virtual elements to the trainee’s physical environment, AR reduces the mental and physical load, making it a more comfortable and sustainable training tool for industrial tasks.

Collaborative and Interactive Training

Another critical advantage of AR is the ability for users to see and interact with other people in the room [4, 5]. In an industrial training setting, this means that instructors or fellow trainees, whether physically or virtually present, can observe and provide real-time feedback while the trainee continues to engage with the virtual objects. This collaborative aspect of AR creates a more interactive learning environment, where knowledge is shared seamlessly between physical and digital spaces.

In contrast, VR isolates the user, making collaborative training more difficult unless all participants are also immersed in the same virtual environment.

References
  1. Oubibi, M., Wijaya, T.T., Zhou, Y., et al. (2023). Unlocking the Potential: A Comprehensive Evaluation of AR and VR in Education (LINK) 
  2. Akhmetov, T. (2023). Industrial Safety Using Augmented Reality and Artificial Intelligence (LINK) 
  3. Kim, Juno & Luu, Wilson & Palmisano, Stephen. (2020). Multisensory integration and the experience of scene instability, presence and cybersickness in virtual environments (LINK) 
  4. Syed, T. A., et al. (2022). In-depth Review of Augmented Reality: Tracking Technologies, Development Tools, AR Displays, Collaborative AR, and Security Concerns (LINK) 
  5. Timmerman, M. R. (2018). Enabling Collaboration and Visualization with Augmented Reality Technology (LINK) 

A Recap of the 5th VOXReality General Assembly

The VOXReality consortium gathered at Maastricht University’s Department of Advanced Computer Science for an impactful two-day General Assembly on October 30–31, 2024. This event brought together project partners and technical teams to align on the future of VOXReality’s pioneering AR and VR initiatives.

Day one kicked off with a warm welcome from MAG, leading into sessions on project planning that set a solid foundation for the days ahead. In-depth sessions followed on the AR Training Use Case and VR Conference Use Case, where HOLO and VRDAYS showcased recent pilot results, achievements, and provided hands-on demos of their immersive applications. An update on model deployment led by SYN highlighted technical progress, while F6S presented communication strategies to expand VOXReality’s public impact.

Day two focused on collaborative growth, starting with an Exploitation Workshop that explored paths for maximizing project impact. Next, our recently joined third-party contributors presented their Open Call projects, sparking engaging discussions and potential collaborations. The event closed with the AR Theatre Use Case, led by AF and MAG, which captivated attendees with pilot results and a demo showcasing AR’s potential in live theatre.

The VOXReality General Assembly showcased the power of innovation and collaboration, with each session reinforcing the project’s vision of immersive technology’s future. 

Stay tuned as VOXReality pushes forward on this exciting path! 🚀

Ana Rita Alves

Ana Rita Alves is an International Project Manager and current Communication Manager at F6S. With a background in European project management and a Master’s in Psychology from the University of Minho, she excels in collaborative international teams and driving impactful dissemination strategies.
