“Does it always have to be Google or Bing? With the OpenWebSearch project, the EU wants to protect its sovereignty on the Internet. The goal is a freely accessible web directory that feeds diverse search engines and language models and should trigger a boom in new web services.”
Arne Grävemeyer reports in the 9/2023 issue of c’t about the openwebsearch.eu project, in which the Open Search Foundation is significantly involved. The article takes a detailed look at the project, its background and future development. Michael Granitzer (University of Passau, OSF and project lead of openwebsearch.eu), Stefan Voigt (Open Search Foundation, DLR), Christian Gütl (Graz University of Technology) and Phil Höfer (SuMa e.V./MetaGer) have their say.
“But what could you do with a large web index if it were freely available to the public? One could build alternative search engines or specialised search services according to selected topics. Users would have free choice and could better protect their private user profiles. Linguists could use the data pool of a large web index to follow how our language is developing, and sociologists could observe how we interact with each other in the social media. Web services could use it to look for clues to incipient pandemics or other catastrophic events and thus build an early warning system.”
“We are not a European Google,” says Michael Granitzer, Chair of Data Science at the University of Passau, who is coordinating the OpenWebSearch project. He says the project is not about building a large search engine, but much more fundamentally about establishing an infrastructure that search engines and other services can later work with. Google’s size is certainly out of reach at the beginning. “It will be more like Wikipedia, which started with a small core compared to large publishers and then grew continuously.”
“Even at the start of the project, and thus before the hype around ChatGPT, the partners considered the Open Web Index, with its focus on European content and languages, as a data pool for specialised language models. New search engines could also immediately use these models as an interface for search queries. ‘Users are usually not looking for links, but for answers to their questions or even suggested solutions,’ says Gütl. That speaks for the use of chatbots, he says.”
“In terms of Europe’s digital sovereignty, the Open Web Index can certainly be seen as a critical infrastructure. The project partners hope that it will create transparent structures on the web. The envisaged European web index promises more plurality and hopefully benefits above all those who simply provide the best and most reliable information on their websites.”
Online version of the article (paywall) at heise.de
Links to Open Web Search, compiled by c’t: ct.de/y6sw
“Google dominates internet search, now an EU project is trying to build an alternative with ‘European values’. Can it succeed?” – The SZ reported on our EU project OpenWebSearch.eu in the business section. Mirjam Hauck spoke with Michael Granitzer about it. He researches and teaches at the University of Passau and heads the 8.5 million euro project.
“We can’t compete with Google,” Granitzer says, dampening expectations. It is difficult to displace the top dogs. And the budget of 8.5 million euros is “a drop in the ocean”. By way of comparison: Microsoft has invested ten billion dollars in the AI company OpenAI and its bot ChatGPT alone. That is about 1,200 times the budget of the EU project.
The goal of Open Web Search is to eventually cover 50 to 60 percent of the websites that Google also has in its index. That would be about 500 to 600 billion web pages. As Michael Granitzer explains: “More than 50 percent is a critical mass. If it works with that, you can also cover 100 percent with more computing resources.”
Granitzer does not believe that they are too late with the OWI. “It’s not about building a competitor to Google or Microsoft, but first about making web data more easily accessible,” says the professor. This data could also be used to train AI models. In addition, “we simply have to make progress on this topic in Europe”.
For Granitzer, his project is also about whether a different advertising market and thus different business models are possible than those dominated by Google. If users had several search engines to choose from, there would be no such problems, says Granitzer. “Oligopolies or monopolies have never been drivers of innovation.” An example: “Currently, we are limited to seeing a list of ten links, of which we look at three, and there is a lot of advertising, which is increasing. I’m already asking myself the question: is this really web search?”
The EU project OpenWebSearch.EU, of which the Open Search Foundation is a partner, is launching its first third-party call. Research teams can now submit proposals on legal, economic and technical aspects of Open Search. They have the opportunity to participate in the development of a free and open search index and to become part of the Open Search Community.
The OpenWebSearch.EU project consortium is eager to onboard new third-party project teams and to integrate them into its future activities for sustainable research and development. Candidates for third-party funded projects should address topics closely related to the project. They should aim to extend and enrich the existing R&D activities and to suggest new ones that complement the project objectives.
The first call consists of two tracks:
Conceptual contributions on legal or economic aspects of Open Search
Building an Open Web Index (OWI) involves not only technical challenges but also legal and societal ones, especially in light of recent EU legislation such as the Digital Services Act and the Digital Markets Act. In addition, new business models and significant changes in the search engine market bring challenges of their own. The consortium is seeking two kinds of studies:
Legal Studies to analyse and understand legal constraints and requirements for building and operating an OWI, which includes, but is not limited to:
- (i) compilation and analysis of the laws and norms that are relevant to building and maintaining an OWI,
- (ii) legal assessment of technical and non-technical prevention mechanisms,
- (iii) legal assessment of the implications of the right to de-referencing for an OWI, or
- (iv) analysis of existing open source and open data licenses with regard to their suitability for use in an OWI.
Economic Studies for setting up and maintaining an OWI as a public European infrastructure. This includes, but is not limited to, studies that analyse and estimate the costs of setting up, operating and maintaining a distributed open web index infrastructure across Europe, and studies that analyse and estimate the market potential and economic impact of such an infrastructure.
Technical approaches to legally compliant data acquisition considering societal constraints
Web crawling is the predominant method by which web search engines gather content for their index. However, webmasters and content owners have only limited control over the crawling process, mostly via proprietary services. OpenWebSearch.eu is looking for concepts and approaches that open up these proprietary components and give webmasters and content owners more control over the crawling process and the usage of their content. Envisioned solutions should be technical in nature, including new metadata schemata and ontologies, algorithms and services for collecting website metadata, services and tools that let webmasters and content owners define legal constraints for crawling, as well as open datasets and machine learning models for analysing and filtering web pages during the crawling process.
Dates and Modalities
- Opening date: 1st March 2023
- Closing date: 28th April 2023, 17:00
- Notification date: 30th June 2023
- Start of projects: 1st August 2023
Successful applications can request funding between 25,000 and 120,000 EUR in this first call for a funding period of up to 12 months.
In particular, the call is targeting smaller companies (e.g. SMEs, start-ups), individual innovators, individual researchers or research teams (e.g. doctoral or post-doctoral researchers) from renowned universities. Eligible applicants are individuals residing in EU Member States or Horizon Europe Associated Countries, or organisations registered in EU Member States or Horizon Europe Associated Countries.
Find more info and the proposal package for download on the project website: openwebsearch.eu/call1
Open Search Foundation e.V.
The Open Search Foundation e.V. is a European movement of people and organisations that work together to create the foundation for independent, free and self-determined access to information on the Internet. In cooperation with research institutions, computer centres and other partners, we’re committed to searching the web in a way that benefits everyone. The promotion of research in the field of search engines, plus education and cooperation, form the pillars of our work.