Interview
‘In the End, it’s Always about Closing Knowledge Gaps’
18 December 2024
Photo: Mahdi Jampour
Robots, COVID-19, agriculture, Georgian palimpsests – AI scientist Mahdi Jampour has already dealt with a colourful range of topics in his career. In this interview, he talks about his background and his current research at the CSMC.
Mahdi Jampour, in 2007, most people had never heard of artificial intelligence, let alone understood anything about it. You, however, already did an MSc on this subject back then. How did you get into AI so early on?
I laid the foundation for this in 2003, when I started studying computer science at university. Back then, artificial intelligence was still relatively young, with less than half a century of development behind it. I had some early successes in the field, developing specialised programmes that automated routine tasks, which I found deeply satisfying. Later, in 2007, I decided to pursue a Master’s degree in artificial intelligence.
After completing your MSc, you left Iran and continued your academic career in Austria. Why?
It was an opportunity I eagerly pursued. Throughout my studies, I focused on expanding my knowledge, and I was excited about being awarded a PhD scholarship. This allowed me to pursue my PhD at one of the top universities in the world in this area, TU Graz. I chose this institution because of my supervisor, Professor Horst Bischof, whose reputation in the academic world was a key factor in my decision. I learned a lot from him and his computer vision team.
What topic did you deal with in your doctoral thesis?
My focus was on computer vision. This branch of AI is very broad and practical. A lot of the talk about AI these days is centred around applications of computer vision, natural language processing, and machine learning. Of course, other fields of AI have also grown, but these three sectors have seen a particularly rapid development and an increasing impact on human life.
More specifically, I chose the challenging topic of facial expression recognition in non-frontal face images. For a machine, understanding human facial expressions is an intricate task because faces, and the ways they express emotional states, are very diverse. The practical challenge I wanted to address was this: given a non-frontal image of a human face, can we propose intelligent methods to correctly recognise the person’s emotions? When part of the face is not visible, the information from that area is unavailable and recognition becomes much more difficult. In my thesis, I introduced methods to provide intelligent agents with this capability. More importantly, some of those ideas have also proved useful in other applications.
Can you give some examples?
In my view, research should help people to live better lives, so I used my knowledge in various contexts. For example, after my PhD, I invented a robot that performs shelf-reading for libraries. The library for which we developed this tool had 120,000 books on its shelves. You can imagine how tedious it is to find one again if it has been shelved incorrectly. With many users, this happens all the time. In the past, librarians spent several days a year doing nothing other than checking the shelves and correcting such errors. With our robot, this is done automatically – and not once a year, but every day. It can also help users: just give it the title or number of a book and it will guide you to it.
During the COVID-19 pandemic, my colleagues and I proposed an intelligent vision-based method that analysed CT scans to accurately recognise COVID-19 infections. And in yet another project, we invented a machine utilising computer-vision methods to process pistachios, grading and sorting them based on quality. This innovation addresses the increasing demand for intelligent machines in agricultural practices.
Now the obvious question: Which path leads from pistachios to manuscripts?
What matters is the thinking that is at the heart of all of these projects. In the end, it’s always about closing knowledge gaps and utilising artificial intelligence effectively to address specific needs. For instance, in recent collaborations with my colleagues Jost Gippert and Hussein Mohammed, we were dealing with palimpsests – manuscripts whose original writing was erased so that new text could be written over it. These earlier layers of writing are immensely valuable for scholars. Drawing on past experience of recovering facial information from non-frontal images, we can reconstruct the undertexts, with significant results.
How exactly do you achieve this?
The approach involves generating synthetic multispectral images to create a robust training dataset without requiring manual labelling. This synthetic dataset is then used to fine-tune a generative inpainting model, enabling it to reconstruct the obscured undertext more effectively. By leveraging generative artificial intelligence, the pipeline deciphers complex visual data patterns to improve the legibility of undertexts and address the limitations of conventional MSI techniques. Recently, I developed a more advanced handwriting-inpainting method based on Latent Diffusion Models (LDMs), which has produced even better results.
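The self-supervised idea behind this can be sketched in a few lines of NumPy: generate images whose undertext is fully known, occlude them with synthetic overtext, and learn to undo the occlusion. Everything below is a toy illustration, not the CSMC pipeline; the stroke generator and the mean-fill stand-in for the inpainting model are assumptions for demonstration only.

```python
import numpy as np

# Toy sketch: build synthetic training pairs in which a known "undertext"
# is partly hidden by an "overtext", so an inpainting model could be
# trained without manual labelling. Illustrative only, not the CSMC code.

rng = np.random.default_rng(0)

def synth_pair(size=64, n_strokes=20):
    """Return (occluded, mask, clean): an undertext image, the same image
    occluded by overtext strokes, and the boolean overtext mask."""
    clean = np.ones((size, size))                 # blank parchment
    for _ in range(n_strokes):                    # faint horizontal undertext
        r = rng.integers(0, size)
        c0, c1 = sorted(rng.integers(0, size, 2))
        clean[r, c0:c1 + 1] = 0.6
    mask = np.zeros((size, size), dtype=bool)     # darker vertical overtext
    for _ in range(n_strokes):
        c = rng.integers(0, size)
        r0, r1 = sorted(rng.integers(0, size, 2))
        mask[r0:r1 + 1, c] = True
    occluded = clean.copy()
    occluded[mask] = 0.1                          # overtext ink
    return occluded, mask, clean

occluded, mask, clean = synth_pair()

# A real pipeline would fine-tune a generative inpainting model on many
# such (occluded, mask) -> clean pairs; as a trivial stand-in, fill the
# masked pixels with the mean of the unmasked ones.
baseline = occluded.copy()
baseline[mask] = occluded[~mask].mean()
```

Because the clean undertext is known for every synthetic page, the model's reconstructions can be scored directly against ground truth, which is exactly what real palimpsests lack.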
What do you appreciate about working with historical written artefacts, especially considering that your expertise in computer science has already led you into so many different areas of application?
There are many researchers at the CSMC who excel in their respective fields. I am inspired by the thought of what we can accomplish if we harness these diverse potentials in a collaborative and productive way. This is no easy task; it requires significant openness and curiosity to understand how we can support one another across our varied professional backgrounds. However, the effort is worthwhile. I firmly believe that integrating AI into research on written artefacts can bring substantial benefits to our research community.