To celebrate International Women’s Day, SERMAS took part in a special webinar, hosted on 19 March, featuring female voices in Extended Reality (XR). The panel discussion's main objective was to empower the next generation of female innovators, while highlighting the continued underrepresentation of women across Science, Technology, Engineering, and Mathematics (STEM) fields.

Seven women from different EU-funded projects came together to share their journeys: Megha Quamara from SERMAS (that’s us!), Leesa Joyce and Moonisa Ahsan from VOX Reality, Georgia Papaioannou from HECOF, Maria Madarieta from MASTER XR, Marievi Xezonaki from CORTEX2, and Grace Dinan from TRANSMIXR. Regina van Tongeren, from Women in Immersive Tech, was the incredible moderator guiding the discussion, in which the participants shared their insights, the challenges they have navigated, and their inspiring experiences within this fast-growing sector.

"Representation matters. We need diverse voices in boardrooms and beyond." – Regina van Tongeren 

"You don’t have to fit the mould—you have to break it. Challenge yourself." – Georgia Papaioannou 
 
"Opportunities in XR are endless. Focus on what you’re good at—it will be your superpower." – Moonisa Ahsan 

"Diversity and inclusion should be a priority from the very beginning of every project." – Grace Dinan 

"Trust yourself and the pace you’re progressing—don’t compare yourself to anyone else." – Marievi Xezonaki 
 
"Role models are very important, women role models, for example teachers at school, to let them know that it is possible. We should raise awareness; we are mentors: 'Look, I'm doing this, it's possible.' This way we can break the barriers and inspire." – Maria Madarieta Elordi
 
"We talked about hard skills, and we can be as talented as the other genders, but on the other side we can bring a whole other world of perspective, a perspective different from the men who are already there." – Leesa Joyce
 
"I think programmes and XR initiatives could have dedicated spaces for women entrepreneurs, offering training, technical guidance, help with applying for funding, and tech-related contributions. Additionally, introducing small changes and flexibility to help on the family side would do a lot to support careers." – Megha Quamara

These are just some of the key takeaways from the webinar, which engaged over 70 participants in a fantastic Q&A session. It's encouraging to see so many people actively working to change the narrative.

Let's amplify the power of women in XR and pave the way for a more equitable and innovative future. 

We were in Saint-Malo, France, for the 3rd Annual Workshop on Multi-modal Affective and Social Behaviour Analysis and Synthesis in Extended Reality (MASSXR) on 8 March 2025. Held as part of the 32nd IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR), the half-day workshop at Le Grand Large brought together a community of around 35-40 researchers and practitioners.

Representing fields spanning cybersecurity, human-computer interaction, computer graphics/animation, multi-modal machine learning, AI, data privacy, and socio-technical studies, attendees convened to tackle a crucial and increasingly relevant topic: the state of security in Extended Reality (XR). The discussions centred on identifying future directions and exploring opportunities to boost system robustness and, critically, enhance user trust in XR technologies.

The workshop showcased the latest advancements in the field with the presentation of four insightful research papers. These presentations highlighted the innovative work being done to address the unique challenges and opportunities presented by XR.

The workshop featured two keynote speakers. Our SERMAS project coordinator, Lorenzo Sabattini, from the Università degli Studi di Modena e Reggio Emilia, Italy, presented a talk on "Robots Teaming with Robots and Humans: Social Acceptance and Coordinated Control", delving into the complexities of human-robot collaboration, the factors influencing the social acceptance of robots, and strategies for achieving effective decentralized control in multi-robot systems. Friederike Eyssel from Bielefeld University, Germany, offered insights into "Trusting Social Robots". Her talk focused on the critical aspects of building trust in human-robot interaction, emphasising the vital roles of privacy and transparency, supported by compelling psychological and experimental research. These concepts have significant implications for XR environments, where robots can be seamlessly integrated to enhance user interactions and immersive experiences.

The keynote presentations paved the way for a panel discussion that engaged the speakers, Marc Latoschik from Würzburg University, the workshop organisers, and the audience. This interactive session served as a platform to examine the current landscape of the field, identify significant challenges that need to be addressed, and explore the exciting opportunities emerging within XR. The debate focused on several important questions, including how advances in AI, robotics, and emotion understanding are changing our trust in digital interactions, and where the balance lies between technology being useful and becoming intrusive. Finally, the group debated whether XR could be a strong tool for cybersecurity, particularly in areas like spotting threats, improving logins, and making security training more effective.

Beyond the formal presentations and discussions, MASSXR 2025 fostered a strong sense of community through a dedicated coffee break. This provided an excellent opportunity for attendees to network not only with fellow workshop participants but also to forge connections with researchers, developers, and practitioners attending other events within the main IEEE VR 2025 conference that day. These informal interactions are often where new collaborations and ideas are born.

MASSXR 2025 was organised by our project partners from King's College London in support of SERMAS and Innovate UK, and in partnership with Centrum Wiskunde & Informatica (CWI), TU Delft, Utrecht University, Purdue University, University of Massachusetts Boston, and Bryn Mawr College.

We're always on the hunt for exciting ways to support our startups and founders, so when the opportunity arose to be part of Europe’s foremost avant-garde XR event, Stereopsia, we couldn’t resist. 

From 9-11 December, under the umbrella of the AI on Demand platform, SERMAS and other XR projects (VOXReality, XR2Learn, XR4ED, Cortex2, Transmixr, MotivateXR, and Amplify) came together at a vibrant booth in the European Village.

From SERMAS, we invited all of our open call winners, and the 3DforXR, PRINIA, LANGSWITCH, and XR-Shield teams joined us for the three days of the event. Buzzing with ideas and demonstrations, our companies had the chance to showcase their work in a pitching competition at the heart of the event. This was one of the highlights of our teams' participation, since they could pitch their solutions and innovations to investors, potential partners, and industry leaders.

The event was alive with opportunities to connect, collaborate, and network, and for us it was a pleasure to meet our teams in person, learn more about them and their solutions, and hear their feedback on our project and next steps.

We can’t wait for the chance to cross paths again in the future and bring another exciting opportunity like this to our teams! 

Our second open call has wrapped up, and while we await the results, we wanted to share some insights. SERMAS OC2 DEMONSTRATE aimed to push the boundaries by funding innovative AR/VR/XR sub-projects. This round invited fresh ideas from new industries and domains beyond those explored in the SERMAS pilots.


We encouraged applicants to take the cutting-edge XR technologies from the SERMAS Toolkit and reimagine them within their chosen fields, aligning with the vision of their proposed projects. The focus was on applying these innovations to one or more key areas: training, services, and guidance.

[Charts: OC2 DEMONSTRATE application statistics by country, sector, domain, and number of applications and entries]

Here’s a look at the results: over 100 applications were initiated in our pipeline (a big thank you to all the innovators who applied!), and a total of 81 entities submitted applications. Greece took the lead with 13 applications, followed closely by Italy with 10, and Portugal rounded out the top three with 7.

In terms of domains, healthcare came out on top, representing 31% of submissions. Manufacturing, Education, and Cultural Heritage were tied, each with 19%. As for the sectors, training and services were neck and neck, with 39% and 36% of applications, respectively.

We’re in the final stages of selecting our winners—stay tuned!

Can you briefly explain what your project is all about? What’s unique about it?

The PRINIA project addresses the need to protect individual privacy and personal data in XR facial recognition. To achieve this, PRINIA is developing an autonomous module that implements facial recognition in XR while ensuring the protection of individual privacy and compliance with relevant EU legislation and regulations. It adopts a validated research framework and a three-step innovation process: i) privacy rules and compliance management, ii) facial recognition supporting privacy enhancement, and iii) integration for privacy-preserving XR facial recognition. The resulting autonomous module supports various scenarios within XR environments, including individual identification, group categorization, and headset user identification, ensuring that users can be identified and categorized without compromising privacy or other sensitive data when using immersive technologies.

What’s the biggest milestone your startup(s) have achieved with your project so far, and what has surprised you most on this journey?

The biggest milestone achieved by the PRINIA project so far is completing the conceptual design of the PRINIA module, including the management of privacy rules/compliance and the implementation of XR facial recognition with privacy-enhancement techniques. This milestone signifies a step forward in designing a system that meets the PRINIA and SERMAS projects' goals by demonstrating a path toward a privacy-preserving facial recognition module for XR environments. What surprised us most on this journey is that although the EU has described several steps for privacy preservation in detail, and there is growing interest in privacy-preserving techniques, challenges like scalability and accuracy still often arise. In PRINIA, we are now focusing on specifying such challenges and refining differential privacy techniques to overcome these limitations.
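As an illustration of the kind of technique mentioned above, the minimal sketch below adds calibrated Laplace noise to a face embedding before matching, in the spirit of differential privacy. This is a generic example only: the embedding size, epsilon, and sensitivity values are hypothetical, and PRINIA's actual module may work differently.

```python
# Sketch: Laplace noise on a face embedding, in the spirit of differential
# privacy. Epsilon, sensitivity, and the 128-dim embedding are illustrative;
# this is not PRINIA's actual implementation.
import numpy as np

def privatize_embedding(embedding: np.ndarray, epsilon: float = 1.0,
                        sensitivity: float = 2.0) -> np.ndarray:
    """Add Laplace noise with scale sensitivity/epsilon to each dimension."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon,
                              size=embedding.shape)
    return embedding + noise

# Example: a hypothetical 128-dimensional embedding from some face model.
embedding = np.random.randn(128)
noisy = privatize_embedding(embedding, epsilon=5.0)

# Matching would then run on the noisy embedding, trading a small accuracy
# loss for a quantifiable privacy guarantee.
similarity = np.dot(noisy, embedding) / (
    np.linalg.norm(noisy) * np.linalg.norm(embedding))
```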

How did you measure success?

To measure the success of PRINIA, we use SMART objectives and KPIs, divided into three main categories: promotion and dissemination, technical excellence, and user experience. Regarding promotion and dissemination, we presented PRINIA in the "Shaping The Future" workshop at the esteemed CHI conference, aiming to co-develop principles for policy recommendations for responsible innovation in virtual worlds. We also publish and share newsletters periodically. Regarding technical excellence, we have tested the identification accuracy for HMD user identification while applying privacy enhancements, achieving a score of over 90%. Regarding user experience, we plan to perform evaluation studies in the coming months, aiming to measure user acceptance, workload, and convenience in the PRINIA scenarios.

What are your goals over the next three and six months?

Over the following months, the goal is to complete the MVPs and integrate them with the SERMAS ecosystem. In particular, during the next three months, we will focus on identifying and verifying individuals in an XR environment. For example, imagine trainees who undergo a facial recognition process for authentication during virtual journalism training; once authenticated, they are granted access to the training environment. In the following three months, we will focus on categorizing individuals into groups based on common characteristics, such as emotional states. For example, PRINIA will be able to categorize customers based on their facial attributes (e.g., satisfied, anxious, or neutral customers), so that a personalized customer experience can be provided at the customer reception kiosk.

How has SERMAS helped you during the past few months?

SERMAS has helped us in multiple ways during the past few months. We collaborate well with our mentors (King's College London), who support and guide us in making appropriate decisions and adjustments to better implement the PRINIA project and achieve the SERMAS objectives. We have regular meetings to share insights and keep track of progress. Moreover, we receive feedback and insights from sprint review meetings with the rest of the stakeholders and partners of the SERMAS consortium. Finally, SERMAS supports us technically and financially in implementing the solutions described in the PRINIA project.

The PRINIA project was presented by George E. Raptis, Designer and Researcher at Human Opsis, at the Shaping the Future Workshop.

Companies: Human Opsis and Algolysis.

Can you briefly explain what your project is all about? What’s unique about it?

3Dify is a user-friendly, open-source web application that empowers end-users to effortlessly convert 2D images into realistic 3D avatars, requiring no technical skills in 3D modeling. 3Dify makes the best of AI, aiming to minimize the manual craft of extracting face features and converting them into a MakeHuman representation. However, users are not replaced by AI: they can still manually refine facial features, select hairstyles, fine-tune colours and body proportions, and even add emotional expressions, making 3Dify a highly versatile tool for avatar authoring. This active role of end-users sets a new paradigm, allowing individuals to create semi-realistic avatars that leverage Unity for high-fidelity rendering, real-time shadow generation, advanced customization, dynamic animations, and emotional feedback.
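To illustrate the kind of AI-assisted feature extraction described above, here is a minimal sketch using the off-the-shelf MediaPipe Face Mesh to pull 3D facial landmarks from a single portrait. The input file name is hypothetical, and 3Dify's actual pipeline, and its mapping onto MakeHuman parameters, may differ.

```python
# Sketch: extract 468 facial landmarks from one portrait with MediaPipe.
# Generic illustration; not 3Dify's actual feature-extraction code.
import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh

def extract_landmarks(image_path: str):
    """Return the 468 normalized (x, y, z) landmarks of the first face."""
    image = cv2.imread(image_path)
    with mp_face_mesh.FaceMesh(static_image_mode=True, max_num_faces=1) as fm:
        results = fm.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if not results.multi_face_landmarks:
        return None
    # Downstream code could map distances between landmark pairs (eye
    # spacing, jaw width, ...) onto avatar morph parameters.
    return [(p.x, p.y, p.z) for p in results.multi_face_landmarks[0].landmark]

landmarks = extract_landmarks("portrait.jpg")  # hypothetical input image
```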

What’s the biggest milestone your startup(s) have achieved with your project so far, and what has surprised you most on this journey?

The main goal of 3dify is to provide an open-source tool that allows non-experts to create a fully animated and ready-to-use 3D semi-realistic avatar.

How did you measure success?

The avatar generation combines custom algorithms and AI-based processes that accurately extract dozens of facial features from a single image to create a semi-realistic avatar in less than one minute. KPIs: number of mapped face features > 25; response time between submission of the input and the downloaded output < 20 sec.

What are your goals over the next three and six months?

For the second half of the project, we plan to enhance the web application by improving avatar customization and adding a preview feature for animations that display different emotional states.

How has SERMAS helped you during the past few months?

SERMAS has been instrumental in advancing our project. The development of 3Dify has helped us take a step forward in our research on Artificial Intelligence and avatar generation, which in turn will improve our academic output in the field. The SERMAS team's feedback and insights have been essential in improving the functionality and usability of our application.


Company: Centro Regionale Information Communication Technology

Can you briefly explain what your project is all about? What’s unique about it?

The LANGSWITCH project aims to develop socially acceptable ASR (Automatic Speech Recognition) systems suitable for use in environments and agents such as those of the SERMAS project. These ASR systems must work in real time, perform well in noisy environments, distinguish between different speakers, respect the privacy of the user, and work in a variety of languages with very different characteristics (large, less-resourced...).

What’s the biggest milestone your startup(s) have achieved with your project so far, and what has surprised you most on this journey?

So far we have worked with English, Spanish and Basque. For English, we have outperformed a high-quality, state-of-the-art system such as Whisper in very demanding noise conditions. For Spanish and Basque, we have improved on Whisper's results in general and also in noisy conditions.

How did you measure success?

To measure the performance of our system, we use the standard WER (Word Error Rate) indicator, which is the percentage of words that are not correctly transcribed. We measure it under different noise conditions: no noise at all, and different SNRs (Signal-to-Noise Ratios) ranging from 10 dB (quite high ambient noise) to 0 dB (noise at the same volume as the speech). For English, we aimed to obtain a WER of around 5% in noisy conditions. We obtained 5.13%, 5.65% and 7.21% in the 10, 8 and 5 dB conditions respectively by fine-tuning the Whisper large model, improving its results by one to two points. The results obtained with the Whisper small model and our system developed by fine-tuning it are worse, but the improvement over the base Whisper small model is greater, from two to six points. For the other languages, the results we obtain are similar, but the improvements over the base Whisper models are greater, because the base models do not perform as well for these languages.
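For readers who want to reproduce this kind of measurement, the sketch below mixes noise into clean speech at a target SNR and scores a transcription hypothesis with WER using the open-source jiwer library. It is a generic illustration under those assumptions; LANGSWITCH's actual evaluation pipeline is not shown here.

```python
# Sketch: mix noise into clean speech at a target SNR, and compute WER
# between a reference transcript and an ASR hypothesis.
import numpy as np
import jiwer  # pip install jiwer

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so that the speech-to-noise power ratio equals `snr_db`."""
    noise = np.resize(noise, speech.shape)  # repeat/trim noise to match length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"
# Two substitutions over nine reference words -> WER of about 22%.
print(f"WER: {jiwer.wer(reference, hypothesis):.2%}")
```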

What are your goals over the next three and six months?

Over the next six months, we plan to develop ASR systems that perform similarly in noisy environments in two more languages, French and Italian, as well as a speaker identification system that respects user privacy.

How has SERMAS helped you during the past few months?

The mentoring provided by SERMAS has been very helpful, as they have provided us with detailed use cases of ASR systems in real-world scenarios, thus pointing out the direction and specifics of our developments.

Company: ORAI NLP Teknologiak (Elhuyar Fundazioa)

Can you briefly explain what your project is all about? What’s unique about it?

3DforXR is about generating 3D assets, i.e. photorealistic 3D models, in a fast, easy and straightforward way, processing them, and then using them inside eXtended Reality applications and experiences. By combining 3D Computer Vision and Artificial Intelligence, 3DforXR will support three different modalities: 1) multi-view 3D reconstruction, where one can upload multiple overlapping images of an object and automatically obtain an exact 3D copy of it; 2) single-image 3D prediction, where a single front-facing image of an object is enough to generate a 3D model that resembles reality as closely as possible; and 3) 3D from text, where one provides just a textual description of the 3D model they need for an XR application and the 3DforXR tool generates a corresponding 3D asset. Although several software solutions for 3D reconstruction exist in the market, 3DforXR's unique point is that it offers a single entry point, a web application at the end of the project, that combines these different approaches, letting one generate 3D models optimized and ready to be used in XR applications from different inputs, images or just text.

What’s the biggest milestone your startup(s) have achieved with your project so far, and what has surprised you most on this journey?

We are currently very close to completing the second release of the 3DforXR project, which corresponds to the deployment of the enhanced versions of the two modalities for 3D asset generation from multiple or single images. We are excited by the progress achieved so far, and were positively surprised by the improvement of the results in the demanding single-image 3D prediction module. A library of processing tools to modify both the geometry and the appearance of the derived 3D models is also ready to be shared with SERMAS partners, and we can't wait for their valuable feedback on the second release of our tools.

How did you measure success?

To measure the success of the two developed modalities, we performed quantitative and qualitative evaluations of their results. To evaluate the accuracy of the generated 3D models, two KPIs were estimated. For the multi-view 3D reconstruction approach, we measured depth estimation accuracy, which was higher than 85%. To estimate this, we ran our software on synthetic data, i.e. images synthesized through computer graphics from ground-truth 3D models, and compared the estimated depth maps corresponding to each image against the ground-truth ones. For the single-image prediction module a similar approach was followed, where instead of depth maps, we compared the predicted 3D models against a publicly available dataset. An F-score of 80% was achieved for the successful examples.
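As an illustration of the F-score metric mentioned above, the sketch below computes precision and recall between a predicted and a ground-truth point cloud at a distance threshold. The threshold value is hypothetical, and 3DforXR's exact evaluation protocol may differ.

```python
# Sketch: F-score between predicted and ground-truth point clouds.
# Precision = fraction of predicted points within tau of the ground truth;
# recall = fraction of ground-truth points within tau of the prediction.
import numpy as np
from scipy.spatial import cKDTree

def fscore(pred: np.ndarray, gt: np.ndarray, tau: float = 0.01) -> float:
    """F-score at threshold tau for two (N, 3) point arrays."""
    d_pred = cKDTree(gt).query(pred)[0]   # each predicted point -> nearest gt
    d_gt = cKDTree(pred).query(gt)[0]     # each gt point -> nearest prediction
    precision = np.mean(d_pred < tau)
    recall = np.mean(d_gt < tau)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```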

What are your goals over the next three and six months?

Our main goal for the next three months is the development of the third modality of 3DforXR, the generation of 3D assets from textual descriptions. Looking further ahead, in the next six months we look forward to integrating all the 3DforXR technology into the SERMAS Toolkit, offering it to end users via a single web application with an intuitive user interface, and proceeding with actions for the communication, dissemination and exploitation of the project outcomes. Preparing a publication to share our technological progress with the scientific XR and Computer Vision community is also among our goals for the last trimester of the project.

How has SERMAS helped you during the past few months?

Besides the obvious help of providing valuable funding to develop 3DforXR, mentoring meetings helped us verify that we are on the right track, while the feedback from the evaluation of the first Release allowed our team to focus on what is considered more important by the SERMAS consortium in order to maximize the impact of our solution.

Company: Up2metric

Can you briefly explain what your project is all about? What’s unique about it?

ALIVE aims to develop Empathic Virtual Assistants (EVAs) for customer support use cases. EVAs are capable of recognising, interpreting and responding to human emotions by applying state-of-the-art Deep Learning (DL) algorithms for vision-, text- and voice-based emotion recognition. The outputs of the models are aggregated into a single state that is fed in real time into the dialogue state of a Large Language Model (LLM) for empathic text generation, and into the state machine of the Avatar, which adapts its state accordingly and modifies each answer and interaction (e.g., facial expressions) with maximum personalization to the user's needs.
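To make this pipeline concrete, here is a minimal late-fusion sketch: per-modality emotion scores are averaged into a single state, which is then injected into an LLM prompt. The emotion labels, scores, and function names are illustrative assumptions, not ALIVE's actual API.

```python
# Sketch: late fusion of per-modality emotion scores into one state,
# then injection into an LLM prompt. Labels and values are illustrative.
EMOTIONS = ["happy", "sad", "angry", "anxious", "neutral"]

def fuse(modality_scores: list[dict[str, float]]) -> str:
    """Average the per-modality probability distributions (late fusion)."""
    avg = {e: sum(m.get(e, 0.0) for m in modality_scores) / len(modality_scores)
           for e in EMOTIONS}
    return max(avg, key=avg.get)

vision = {"happy": 0.1, "anxious": 0.7, "neutral": 0.2}
voice  = {"anxious": 0.5, "neutral": 0.5}
text   = {"anxious": 0.6, "sad": 0.2, "neutral": 0.2}

emotion = fuse([vision, voice, text])  # -> "anxious"
prompt = (f"The user currently appears {emotion}. "
          "Respond to their request with an empathetic tone.")
```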

What’s the biggest milestone your startup(s) have achieved with your project so far, and what has surprised you most on this journey?

Our biggest milestone is the development of a pipeline that takes audio, text and images as input, recognizes the user's emotion, and provides it as input to an LLM. Our project has received a lot of interest from the research community and industry, even from stakeholders with diverse backgrounds (e.g., neuroscientists and philosophers).

How did you measure success?

First, we developed five basic emotional states of the user that the system can recognize. Second, we use at least three modalities as input to perceive the user's presence and interaction, and integrate at least two of them as aggregators of the final emotional state.

What are your goals over the next three and six months?

Provide the generated empathetic text to an Avatar, which will accompany it with empathetically relevant facial expressions; prepare a small-scale pilot for validation purposes; and disseminate and exploit the project's outcomes.

How has SERMAS helped you during the past few months?

Apart from funding, we receive fruitful feedback and guidance from our mentor (Viktor Schmuck) every month.

Companies: Thingenious and IGODI

Author: Viktor Schmuck, Research Associate in Robotics at King's College London

On the 21st of May 2024, the European Council formally adopted the EU Artificial Intelligence (AI) Act. But what does it cover, and how does it impact solutions, such as the Socially-acceptable Extended Reality Models And Systems (SERMAS) project, which deals with biometric identification and emotion recognition? 

According to the official website of the EU Artificial Intelligence Act, the purpose of the regulation is to promote the adoption of human-centric AI solutions that people can trust, ensuring that such products are developed with the health, safety, and fundamental rights of people in mind. As such, the EU AI Act outlines, among other things, a set of rules, prohibitions, and requirements for AI systems and their operators.

When analysing something like the result of the SERMAS project, the SERMAS Toolkit, from the EU AI Act's point of view, we not only need to verify whether the designed solution is compliant with the regulations laid down, but also assess whether the outcomes of the project, such as exploitable results (e.g., a software product, or solutions deployed "in the wild"), will be compliant with them.

Parts of the Toolkit are being developed in research institutions, such as SUPSI, TUDa, and KCL, which makes it fall under the exclusion criterion of scientific research and development according to Article 2(6). In addition, since the AI systems and models are not yet placed on the market or put into service, the AI Act regulations are not yet applicable to the Toolkit according to Article 2(8) either. As a general rule of thumb, if a solution is either being developed solely as a scientific research activity or is not yet placed on the market, it does not need to go through the administrative process and risk assessment outlined by the Act.

That being said, if a solution involving AI is planned to be put on the market or provided as a service for those who wish to deploy it, even if it is open source, it is better to design its components with the regulations in mind and prepare the documentation required for the legal release of the software. This article outlines some of these aspects using the example of the SERMAS Toolkit. Still, it is important to emphasize that AI systems need to be individually evaluated (and, e.g., registered) against the regulations of the Act.

First, we should look at the terminology relevant to the SERMAS Toolkit. These terms are all outlined in Article 3 of the regulations. To begin with, the SERMAS Toolkit is considered an "AI system", since it is a "machine-based system that is designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment, and that, for ... implicit objectives, infers, from the input it receives" (Article 3(1)). Part of the SERMAS Toolkit is a virtual agent whose facial expressions and behaviour are adjusted based on how it perceives users, and which is capable of user identification. However, the system is not fully autonomous, as it also has predefined knowledge and behaviour that is not governed by AI. Therefore, it is considered an AI system with varying levels of autonomy. Speaking of user identification, the Toolkit also deals with biometric identification and emotion recognition.

While Article 6 outlines that systems performing biometric identification and emotion recognition count as high-risk AI systems, the SERMAS Toolkit is only considered one due to its emotion recognition, since Annex III states that "biometric identification systems ... shall not include AI systems intended to be used for biometric verification the sole purpose of which is to confirm that a specific natural person is the person he or she claims to be" (Annex III(1a)), which is the Toolkit's only use case of biometric identification. However, Article 6(2) and Annex III also specify that an AI system is considered high risk if it is "intended to be used for emotion recognition" (Annex III(1c)) to any extent. While, according to the EU AI Act, an AI system featuring emotion recognition would be prohibited, the SERMAS Toolkit, like any other system, can be deployed if a review body finds that it does not negatively affect its users (e.g., cause harm or discriminate against them). Highlighting the factors relevant to the SERMAS Toolkit, each AI system is evaluated based on the "intended purpose of the AI system", the extent of its use, the amount of data it processes, and the extent to which it acts autonomously (Article 7(2a-d)). Moreover, the review body evaluates the potential for harm or adverse impact on people using the system, and the possibility of a person overriding a "decision or recommendation that may lead to potential harm" (Article 7(2g)). Finally, "the magnitude and likelihood of benefit of the deployment of the AI system for individuals, groups, or society at large, including possible improvements in product safety" (Article 7(2j)) is also taken into account.

But how does the SERMAS Toolkit compare against the above criteria? To begin with, it processes only the minimum amount of data absolutely necessary, and it is undergoing multiple reviews during its conception to ensure the secure management of any identifiable personal data. Moreover, it only uses user identification to enhance a building's security in an access-control scenario. And finally, it uses emotion recognition solely with the aim of providing an enhanced user experience by adapting the behaviour of a virtual agent. This tailored experience does not change the output of the agent, only its body language and facial expressions, which can only take shape as neutral-to-positive nonverbal features. As such, the agent does not and cannot discriminate against people.

So, while the Toolkit is not a prohibited solution after all, it is still considered a high-risk system, which means that upon deployment, or when provided to deployers, a number of regulations need to be complied with, as outlined by the Act. For instance, a risk-management system needs to be implemented and periodically revised, with a focus on the system's biometric identification and emotion recognition sub-systems (Article 9(1-2)). Since these sub-systems are model-based classification or identification methods, the regulations outlined in Article 10(1-6) for the training, validation, and testing of the underlying models and datasets should be followed (e.g., datasets used for training the models should be ethically collected, with as few errors as possible, and without bias). Moreover, thorough technical documentation is expected to be published and kept up to date according to Article 11(1), and the system should come with appropriate logging of interactions (Article 12(3)) and oversight tools (Article 14(1-2)), especially during identification processes. Lastly, when the Toolkit is put on the market, which is part of the exploitable results of the project, it needs to undergo assessment and is required to be registered in a database with its accompanying documentation (Article 49).

In conclusion, the EU AI Act means that AI systems dealing with personal data, especially when it comes to emotion recognition or identification that affects how an autonomous system operates, need to comply with the laid-out regulations. Solutions put into service or on the market may be classified as high-risk and would be prohibited by default. However, a thorough assessment procedure, compliance with additional security requirements, logging, and documentation may mean that an AI system can be cleared for deployment after all. As for non-professional or scientific-research systems, such as the still-under-development SERMAS Toolkit, they are encouraged to voluntarily comply with the requirements outlined by the Act (Recital 109), and in the long run they will benefit from being developed with the requirements in mind.

The EU AI Act can be explored in detail here, and to get a quick assessment of whether, and to what extent, an AI system that will be put on the market or into service is affected by it, the same website also provides a compliance checker tool.
