The SOURCE research project (launched on April 1, 2026) makes a key contribution to the German federal government’s IT security research as part of the BMFTR research program “Digital.Sicher.Souverän.” (“Digital.Secure.Sovereign.”).

SOURCE stands for “Skalierbare, offene und umfassende Erkennung von Desinformationskampagnen im Web” (“Scalable, Open, and Comprehensive Detection of Disinformation Campaigns on the Web”). The project is supported by the BMFTR as part of the funding line “Vertrauen in Demokratie und Staat: Digitale Desinformation erkennen und abwehren” (“Trust in Democracy and the State: Detecting and Countering Digital Disinformation”).

Together with our project partners, the University of Passau, the University of Kassel, the Leibniz Supercomputing Centre (LRZ), and Alliance4Europe (A4E), we were successful in the competitive tendering process for this collaborative project within the BMFTR research program “Digital.Sicher.Souverän.”

The focus of SOURCE is the development of a freely accessible database of potential disinformation artifacts. This involves utilizing web data from the Open Web Index and additional social media crawls, which are then analyzed and interpreted in terms of content and origin using AI tools. The necessary AI, cloud, and storage capacities are provided by a federated digital infrastructure.

The active Open Web Search community, as well as technical and editorial expert communities, will be brought on board early for needs assessments. These communities are expected to contribute concrete use cases, which will be used to test and further enrich the database content and to make it useful in practice.

The data generated by the project will be made available to both the scientific community and users in the fields of journalism, business, and civil society, thereby creating a solid foundation for identifying and combating disinformation on the web.

The German Federal Ministry of Research, Technology and Space (BMFTR) is funding the project with €2,500,000 over a period of three years.

In a recent interview with Reset Digital for Good, Prof. Dr. Michael Granitzer, Chair of Data Science at the University of Passau and project manager of our recently completed OpenWebSearch.EU project, provided insights into the significance of the Open Web Index, which was developed in the project.

“Our mission is to break up the silo of a single search engine. (…) We’re doing this by crawling the web, collecting web pages, and preparing them to be consumed by search engines. Preparing them involves cleaning advertisements and navigation links, then extracting the main content. This index can be used by individuals or organisations to build their own search engines,” states Michael.

While scaling up to compete with monopolies such as Google would require enormous resources, Michael believes that a community-oriented project could complement traditional and AI search as a public good. He envisions a future in which small AI models enable users to search for and combine information from various sources, independently of large technology companies.

In the future, societies will be shaped by humans and models working closely together

“In an ideal scenario, the AI model is running on my machine and controlling my data. It’s a tool that helps me, conducts searches on my behalf and understands what I want to do. I’m talking about small language models rather than large language models. A model that helps me search, aggregate and synthesise information based on search endpoints that I choose,” he states.

You can read the full interview here: https://en.reset.org/fighting-the-search-monopoly-with-an-open-source-index-an-interview-with-michael-granitzer-from-openwebsearch/

Our OpenWebSearch.EU project was recently featured in a German arte.tv report about European alternatives to overseas BigTech web services.

The video highlights our commitment to strengthening European digital sovereignty on the World Wide Web.
The report provides insights from Prof. Dr. Ir. Djoerd Hiemstra, Professor of Federated Search and Head of the Information Retrieval research group at Radboud University, one of the OpenWebSearch.EU consortium partners. Djoerd introduced the Open Web Index in its current state and the role it could play in creating powerful European search solutions.
Skip to minute 4:16 to hear Djoerd’s insights:

Alternatively, watch the video directly on Arte.tv: https://www.arte.tv/de/videos/121620-127-A/wo-bleibt-das-europaeische-google-oder-facebook/

Just in time for the wrap-up of the 42-month EU project OpenWebSearch.EU, we present exciting use cases based on the Open Web Index, which was developed in the project.


As a reminder, the OpenWebSearch.EU project was implemented by 14 partner organizations from the research and non-profit sectors and aimed to create the first European Open Web Index as the centerpiece for sovereign structured access to the internet.

The OWI (Open Web Index) has been up and running since June 2025 and has crawled an impressive 1.3 petabytes of data to date. In the course of the project, a total of 15 third-party partner projects were integrated through various open calls. The goal: to conduct legal, technical, and commercial analyses and feasibility studies related to the Open Web Index in order to lay a solid foundation for the expansion of a European web data infrastructure.

Seven of the third-party partner projects (projects from Open Call 2) dealt with specific technical application examples based on the OWI. The projects demonstrate the range of possibilities when web index data is openly accessible. We have summarized the promising results briefly:

VERITAS project: Fact-checking the war in Ukraine with a RAG chatbot

The company DEXAI (Czechia) developed a retrieval-augmented generation (RAG) chatbot and a Chrome browser extension for real-time fact-checking. Statements about the war in Ukraine were examined as test examples. The system filtered 30 days of OWI crawl data, extracted news content, indexed it using embeddings, and used an established LLM to generate source-supported, evidence-based responses to user queries. Users can highlight any text on a web page and receive an instant evaluation based on verified news sources. The project shows that open web data enables domain-specific fact-checking tools that would otherwise rely on proprietary search APIs.
The full VERITAS story can be read here: https://openwebsearch.eu/results-veritas/
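
To make the retrieval step of such a pipeline more concrete, here is a minimal sketch in Python. It is not DEXAI's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the documents and URLs are invented placeholders, and the names `retrieve` and `build_prompt` are hypothetical. The sketch only shows how evidence might be ranked by similarity to a claim and then packed into a prompt for an LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return ranked[:k]


def build_prompt(claim, evidence):
    """Assemble a source-citing prompt; the LLM call itself is left out."""
    sources = "\n".join(
        f"[{i + 1}] {d['url']}: {d['text']}" for i, d in enumerate(evidence)
    )
    return (
        f"Claim: {claim}\nEvidence:\n{sources}\n"
        "Assess the claim using only the evidence above and cite the sources."
    )


# Invented placeholder documents, not real crawl data.
DOCS = [
    {"url": "https://example.org/a", "text": "The grain export agreement was extended in March."},
    {"url": "https://example.org/b", "text": "Local football club wins regional cup final."},
    {"url": "https://example.org/c", "text": "Officials confirm the grain export corridor remains open."},
]

claim = "grain export agreement extended"
evidence = retrieve(claim, DOCS, k=2)
prompt = build_prompt(claim, evidence)
```

In a real system the count vectors would be replaced by dense embeddings stored in a vector index, and `prompt` would be passed to a language model of choice.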

 

AKASE: The world’s arguments as a knowledge graph

The University of Groningen (Netherlands) constructed an argumentation knowledge graph based on over 105 million web index documents. The system automatically identifies argumentative content (claims and premises) on websites, recognizes rhetorical fallacies, evaluates argumentation logic, and documents support, attack, and paraphrasing relationships between arguments. The applications include a search engine that reorders results according to argument quality and ArgsBase, a multi-agent deliberation platform that won the JTS Early Career Researcher Prize.
The full AKASE story is available here: https://openwebsearch.eu/akase-results/

 

CIFFIL Service: Sharing search statistics between Dutch municipalities

Spinque (Netherlands) integrated the Common Index File Format (CIFF) into its search platform to enable Dutch municipalities to easily exchange index statistics. Small municipal document collections – some with fewer than 10,000 documents – often suffer from poor search quality because the data sets are simply too small to provide accurate term frequency estimates (statistics on the frequency and relevance of certain terms within data collections). As a result, search result rankings cannot be designed effectively. By adopting statistics from larger municipalities via CIFF, smaller municipalities can significantly improve their ranking effectiveness.
Read the full CIFFIL story: https://openwebsearch.eu/ciffil-results/
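
The effect of borrowed statistics can be illustrated with a small Python sketch. It makes several assumptions: plain dictionaries stand in for statistics that would in practice be read from CIFF files, the counts are invented, local and shared counts are simply added together, and a BM25-style smoothed inverse document frequency (IDF) is used as the term weight. None of this is confirmed to be how Spinque's platform combines the statistics.

```python
import math


def idf(term, stats):
    """BM25-style smoothed inverse document frequency."""
    df = stats["doc_freq"].get(term, 0)
    n = stats["num_docs"]
    return math.log(1 + (n - df + 0.5) / (df + 0.5))


def merge_stats(local, shared):
    """Combine local and borrowed collection statistics by summing counts."""
    merged = dict(shared["doc_freq"])
    for term, df in local["doc_freq"].items():
        merged[term] = merged.get(term, 0) + df
    return {"num_docs": local["num_docs"] + shared["num_docs"], "doc_freq": merged}


# Invented numbers: a tiny collection in which two terms each occur once.
small = {"num_docs": 5, "doc_freq": {"municipality": 1, "asbestos": 1}}
# Invented numbers: a large collection that reveals how common each term is.
large = {"num_docs": 100_000, "doc_freq": {"municipality": 60_000, "asbestos": 150}}

merged = merge_stats(small, large)

# Locally, both terms look equally rare and receive the same weight.
local_weights = (idf("municipality", small), idf("asbestos", small))
# With borrowed statistics, the rarer term gets a much higher weight.
merged_weights = (idf("municipality", merged), idf("asbestos", merged))
```

With only five documents, the ranker cannot tell a generic word from a specific one. After merging, "asbestos" is weighted far above "municipality", which is the kind of distinction a ranking function needs.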

 

DTCommerce: Supporting retailers in their transition to digital

ZenLab (Slovenia) developed open-source tools to facilitate the transition to e-commerce for brick-and-mortar retailers. Starting from an Excel export from the company’s ERP or accounting tool, the system looks up the listed products on supplier websites and collects titles, descriptions, images, and MSRPs. It then optimizes existing descriptions using AI and imports everything automatically into a WooCommerce online store via a WordPress plugin.
You can read the full DTCommerce story here: https://openwebsearch.eu/ditcommerce-results/

 

OMMS: Open Maps as an alternative to established Maps Apps

The E Foundation (France) used OpenWebSearch.EU’s crawling tools to harvest structured business data – opening hours, contact information, FAQ – from websites linked to OpenStreetMap Points of Interest (POI). This data is made available via an open-source POI server for mobile map applications. Starting in the Seattle metropolitan area and then expanding globally, the project team found that about 12% of POI-linked websites contain analyzable structured data. In addition, the project identified two promising future directions for OpenWebSearch.EU: the publication of POI relevance rankings (based on PageRank or similar metrics) to improve result sorting in open data geocoders, and the use of backlink data as an open alternative to proprietary rating databases.
The full OMMS story can be read here: https://openwebsearch.eu/results-omms/

 

FUN: Rethinking web crawling

The University of Pisa and the University of Glasgow (Italy/UK) proposed a paradigm shift in web crawling. Traditional crawlers use link-based heuristics such as PageRank to decide which pages to consider. FUN argues that in the age of AI, crawlers should instead use language models to assess the semantic quality of pages. The team developed four neural crawling strategies and tested them on 87 million pages from ClueWeb22-B. For natural language queries, the best strategy consistently outperformed PageRank in both crawling effectiveness and downstream retrieval quality, while remaining competitive for keyword queries.
The full FUN story is available here: https://openwebsearch.eu/fun-results/
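
The idea of quality-driven crawling can be sketched as a best-first traversal with a priority queue. The following Python example is only an illustration under stated assumptions: the `score` function is a crude stand-in for a language model's quality estimate, the tiny in-memory `WEB` replaces real fetching, and newly discovered links inherit the quality score of the page that links to them. It does not reproduce any of the four strategies developed in the project.

```python
import heapq
import re


def score(text):
    """Crude stand-in for a neural quality estimator: vocabulary richness in [0, 1]."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return min(1.0, len(words) / 10)


def crawl(seeds, fetch, score_fn, budget):
    """Best-first crawl: always fetch the link with the highest priority next."""
    frontier = [(-1.0, url) for url in seeds]  # negated priority gives a max-heap
    heapq.heapify(frontier)
    seen = set(seeds)
    order = []
    while frontier and len(order) < budget:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        order.append(url)
        quality = score_fn(text)
        for link in links:
            if link not in seen:
                seen.add(link)
                # Assumption: a link is as promising as the page it was found on.
                heapq.heappush(frontier, (-quality, link))
    return order


# Invented toy web: url -> (page text, outgoing links).
WEB = {
    "seed": ("Index page with links.", ["good", "spam"]),
    "good": ("A detailed explanation of how vaccines are tested in clinical trials.", ["deep"]),
    "spam": ("buy buy buy cheap cheap", ["spam2"]),
    "deep": ("Further detailed analysis of trial results and methods.", []),
    "spam2": ("click click click", []),
}

visited = crawl(["seed"], WEB.__getitem__, score, budget=4)
```

In this toy graph, the page linked from the high-quality article ("deep") is fetched before the low-quality page ("spam"), whereas a breadth-first crawler would fetch both children of the seed first. Ties between equal priorities are broken alphabetically by URL, which is an artifact of the sketch rather than a design choice.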

 

TILDE: Trustworthy health search with fairness-conscious ranking

Know Center Research GmbH (Austria) built a health-related search system on the OWI that addresses potential biases and varying degrees of trustworthiness beyond pure search result relevance. The system extracted medical content from around 200,000 health-related websites, standardized it against the clinical UMLS ontology, and implemented a hybrid retrieval engine combining entity-based and semantic search. Its unique feature is a three-stage fairness pipeline: it enriches each search result with trustworthiness and neutrality attributes, sorts results to maximize fairness while maintaining credibility and diversity of viewpoints, and checks its own system outputs for stereotypes. The visual web interface allows users to explore medical evidence via visual knowledge graphs and faceted search.
The full TILDE story can be read here: https://openwebsearch.eu/tilde-results/

 

What happens next?

An Open Web Index enables applications that proprietary search cannot offer.
The research contributions have a direct impact on how the infrastructure itself should evolve.
With the completion of all open calls, the OpenWebSearch.EU project has built a community that extends far beyond the core consortium. The code, data, models, and tools from these projects are predominantly open source and freely available. The infrastructure contributions will continue beyond the formal end of the project.

Stefan Voigt, board member of the OpenSearchFoundation and spokesperson for the OpenWebSearch.eu project, emphasizes the importance of a democratic, independent, and public web search in Europe in an interview with the German Bertelsmann Stiftung’s Change Magazine.

Europe’s digital dependence on the U.S. is particularly evident in everyday life: whether it’s Amazon, WhatsApp, or Apple Pay, the use of U.S. services highlights Europe’s digital vulnerability. This realization has given rise to initiatives such as EuroStack and the OpenWebSearch.eu project, which aim to achieve digital independence for Europe.

“We have a mammoth task ahead of us. But I am convinced that we will be able to create alternatives in Europe,” says Stefan Voigt.

 

You can find the full German article here.

Alternatively, you can download the German Change Magazine directly here.

 

The online magazine European Perspectives recently interviewed our OSF board member, Dr. Stefan Voigt, about the OpenWebSearch.eu initiative.

According to the magazine, based on the latest data, nearly nine out of ten internet search queries from Europe are processed by U.S. technology companies such as Google and Microsoft. This heavy reliance on overseas infrastructure is also due to the lack of European alternatives.

To address this, the OpenWebSearch.eu initiative, funded by the Horizon Europe project of the same name, has developed an Open Web Index. This web index is a crucial step toward reclaiming sovereign access to online information. Dr. Stefan Voigt, board member of the OpenSearchFoundation and OWS spokesperson, highlights the project’s urgency and significance for Europe’s digital future.

“The web is meant to be a public thing. We want to relate it back to the public,” he states, emphasizing the need for Europe to secure its digital autonomy.

You can read the full article here: https://euperspectives.eu/2025/11/is-open-search-the-path-to-european-digital-sovereignity/

 

 

“EU research power against Google’s dominance”: German news broadcaster ZDF reports on OpenWebSearch.eu and the vision of open web search

“A European association is challenging Google: a public web index should finally ensure diversity on the search market. A Bavarian association plays a key role in this endeavour.” – In a recent report, ZDF introduces the Open Search Foundation and the EU-funded OpenWebSearch.eu project, which was set up to build an independent European search infrastructure. “Search engines decide what content is visible and how user, data and payment flows move,” states Dr. Stefan Voigt, CEO of the Open Search Foundation. “It is unacceptable for just one company to dominate this key infrastructure of the digital world.”

The article describes the work of the Open Search Foundation and the aforementioned Horizon Europe project, which aims to build a free, community-driven search index that enables new, diverse search engine models, e.g. for science, journalism or regional content. The index could also serve as a data pool for AI models. The project is supported by 14 European partners from research and society, including the Leibniz Supercomputing Centre in Munich and CERN in Geneva. The EU is funding the project with 8.5 million euros.

 
