“As a user, I would like to choose my search engine according to my preferences, just like my newspaper”
Interview with Prof. Dr. Michael Granitzer. The professor at the University of Passau is the scientific project leader of the European OpenWebSearch.eu project and co-moderator of the OSF tech specialist group.
Interview: Susanne Vieser
In research institutes, in commercial and scientific data centers, in companies and in many groups, IT specialists are working on an open index for the Internet. This distributed teamwork is part of how the Open Search Foundation (OSF) operates. It serves data sovereignty and makes transparent how information from the Internet is captured and presented. Along the way, technologies and modules are created that could be reused in other services or software.
“‘Open’ here stands on the one hand for the comprehensible and transparent creation of such an index, and on the other hand it means that the index itself can be used as needed in various ways,” explains Michael Granitzer. Granitzer, who holds a doctorate in computer science, researches how information is made accessible on the Internet and in the media, how usage data is evaluated, and how data is processed for smart artificial intelligence (AI) systems.
At OSF, Granitzer coordinates the tech working group with Stefan Voigt. As a researcher at the University of Passau, he also leads the European project OpenWebSearch.eu, which promotes the open Web index for more data sovereignty in Europe.
With open source software, the source code is open and can be improved by anyone. Why do search engine providers keep their code and algorithms a secret?
Prof. Dr. Michael Granitzer: It is difficult for me to assess the business practices of search engine providers, so I can only guess. However, I would rule out technical reasons; the focus is certainly on protection against competition. How search engines rank their results is one of the most important factors for user acceptance and is therefore a feature that is hidden from competitors. Publishing the associated source code is rather secondary. There is also not one single source code; many different parts make up the overall system. Core technologies, such as Google’s high-performance database system Bigtable, have already been replicated in open source projects and are thus freely available. Another reason for secrecy is, of course, that it makes it easier to adapt business processes and systems. If the parameterization of the algorithms were transparent, decisions or changes would have to be justified, and flexibility would be lost; I see this similarly in the introduction of business processes and practices.
How did you get involved with the Open Search Foundation and why are you involved here?
Granitzer: Through my colleague Professor Dr. Christian Gütl from the Technical University in Graz. Among other things, he conducts research in the area of information retrieval and invited me to help organize the first Open Search Symposium in 2019. The World Wide Web has fascinated me since my studies. I consider it an enrichment for our society, but unfortunately – driven by more and more advertising – it has changed from a free information platform to a relatively monopolized business platform in recent years. Open search systems, however, should lead back to the freer information platform on which users find more possibilities for self-determination. As a user, I would like to be able to choose my search engine according to my preferences, just like I choose my newspaper, and not be limited to choosing between two very similar newspapers.
What technical ingredients does an open search need?
Granitzer: The core is certainly an open Web index, i.e., the data structure that quickly returns a sequence of pages that are as relevant as possible in response to search queries. “Open” here stands on the one hand for the comprehensible and transparent creation of such an index – also in terms of control by content owners. On the other hand, it means that – within the legal framework – the index itself can be used as needed in various ways. Companies could then, for example, combine parts of the index with their internal search, while private users could take the part of the index that is relevant to them and use it to search on their own devices. These are just two examples. I believe that openness and transparency offer many more, and more interesting, applications, especially in the field of artificial intelligence. An open index is the prerequisite for the emergence of a new ecosystem for such search and discovery applications.
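To make the idea of an index as “the data structure that quickly returns a sequence of pages” more concrete, here is a minimal sketch of an inverted index in Python. It is only an illustration and not the design of the open Web index discussed in the interview: the example pages, the function names, and the simple term-frequency ranking are all assumptions chosen for brevity.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(pages):
    """Build an inverted index: term -> {url: how often the term occurs on that page}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][url] = count
    return index


def search(index, query):
    """Return URLs ordered by summed term frequency of the query terms (toy ranking)."""
    scores = Counter()
    for term in tokenize(query):
        for url, count in index.get(term, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL for stable output.
    return [url for url, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical pages, used only for illustration.
pages = {
    "https://example.org/a": "open web index for open search",
    "https://example.org/b": "newspaper search",
    "https://example.org/c": "beer garden",
}
index = build_index(pages)
print(search(index, "open search"))
```

The point of the structure is that a query only touches the entries for its own terms instead of scanning every page. A real Web index would add far more, for example better tokenization, quality signals, and sharding across machines.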
You are coordinating the work on the open search and are experimenting with distributed crawling, i.e., the automated indexing of information and websites on different computer systems: How is the open search to be created?
Granitzer: We want to decentralize the classic pipeline of a search engine – from crawling to the actual search – and distribute it across various research institutions, computing centers and other infrastructure facilities. From our point of view, this not only creates the open index, but the results of intermediate steps can also be useful. For example, processed HTML documents can be used in smart language models based on AI, or individual processing modules, such as extracting specific information from indexed web pages, could give rise to their own services. Beyond that, it’s about control and governance options for content owners, as well as the development of innovative algorithms, such as for determining the information quality of a page. So we have work to do on all fronts.
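One common way to spread crawling across several sites is to assign each URL to a node based on a stable hash of its host name, so that every page of one website is handled by the same node. The sketch below illustrates that general technique only. The interview does not describe how the project actually partitions its crawl, and the node names, URLs, and function names here are hypothetical.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def assign_node(url, nodes):
    """Pick a node for a URL from a stable hash of its host name."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer and map it onto the node list.
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def partition(urls, nodes):
    """Group URLs into per-node work lists."""
    shards = defaultdict(list)
    for url in urls:
        shards[assign_node(url, nodes)].append(url)
    return dict(shards)


# Hypothetical crawl nodes and seed URLs, for illustration only.
nodes = ["node-a", "node-b", "node-c"]
urls = [
    "https://example.org/a",
    "https://example.org/b?x=1",
    "https://example.com/",
    "https://example.net/news",
]
for node, assigned in sorted(partition(urls, nodes).items()):
    print(node, assigned)
```

Keeping a host on a single node makes it easier to respect per-site politeness limits. The simple modulo mapping would reshuffle most hosts whenever a node joins or leaves, which is why larger systems often use consistent hashing instead.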
How do you manage the OSF’s tech working group for this, and how do its members coordinate?
Granitzer: The tech working group consists of about 30 members – freelance developers and IT specialists as well as researchers from all over Europe – and sees itself as a community for the realization of an open index. Some of the developers are already working together in research projects such as the EU’s OpenWebSearch.eu, while others are organizing themselves freely in smaller groups. Currently we are organizing ourselves bottom-up and less through classical software engineering processes. This will change when more concrete parts are available. Then, similar to other open source projects, coordination will be done through leadership or governance rules for infrastructure entities. These rules will be developed within OpenWebSearch.eu.
Suppose you meet Larry Page and Sergey Brin in a beer garden – what would you discuss with the Google founders and why?
Granitzer: Why they dropped the motto “Don’t be evil”.
What vision are you pursuing in terms of search – what should Internet search look like in five years, and in ten?
Granitzer: Open, transparent and with a lot of freedom of choice for users. The goal is to make it easy for researchers and innovation drivers to use the Web as a source of information – that’s all open search really is.