c’t: “Basis for a thousand search engines – The EU wants to build a public web index by 2025”.
“Does it always have to be Google or Bing? With the OpenWebSearch project, the EU wants to protect its sovereignty on the Internet. The goal is a freely accessible web directory that feeds diverse search engines and language models and should trigger a boom in new web services.”
Arne Grävemeyer reports in the 9/2023 issue of c’t about the openwebsearch.eu project, in which the Open Search Foundation is significantly involved. The article takes a detailed look at the project, its background and future development. Michael Granitzer (University of Passau, OSF and project lead of openwebsearch.eu), Stefan Voigt (Open Search Foundation, DLR), Christian Gütl (Graz University of Technology) and Phil Höfer (SuMa e.V./MetaGer) have their say.
“But what could you do with a large web index if it were freely available to the public? One could build alternative search engines or specialised search services according to selected topics. Users would have free choice and could better protect their private user profiles. Linguists could use the data pool of a large web index to follow how our language is developing, and sociologists could observe how we interact with each other on social media. Web services could use it to look for clues to incipient pandemics or other catastrophic events and thus build an early warning system.”
“We are not a European Google,” says Michael Granitzer, Chair of Data Science at the University of Passau, who is coordinating the OpenWebSearch project. He says the project is not about building a large search engine, but much more fundamentally about establishing an infrastructure that search engines and other services can later work with. Google’s size is certainly out of reach at the beginning. “It will be more like Wikipedia, which started with a small core compared to large publishers and then grew continuously.”
“Even at the start of the project, and thus before the hype around ChatGPT, the partners regarded the Open Web Index, with its focus on European content and languages, as a data pool for specialised language models. New search engines could also immediately use these models as an interface for search queries. ‘Users are usually not looking for links, but for answers to their questions or even suggested solutions,’ says Gütl. That speaks for the use of chatbots, he says.”
“In terms of Europe’s digital sovereignty, the Open Web Index can certainly be seen as a critical infrastructure. The project partners hope that it will create transparent structures on the web. The envisaged European web index promises more plurality and hopefully benefits above all those who simply provide the best and most reliable information on their websites.”
Online version of the article (paywall) at heise.de
Links to Open Web Search, compiled by c’t: ct.de/y6sw