5th Symposium on Open Web Search – #ossym23 Recap of Day 2

For the fifth time, researchers, tech experts, political representatives and industry executives came together to discuss open web search and the foundations of a human-centric, transparent and open internet with solid grounds in Europe and beyond.

The Open Search Symposium 2023 was hosted at CERN in Geneva from October 4th to October 6th, welcoming around 100 participants both online and in person.

Day 2 at a glance

Keynote speech by Věra Jourová, Vice-President of the European Commission for Values and Transparency

Day two of #ossym23 continued with more inspiring talks and discussions on Open Web Search, transparency and collaboration. The day was kicked off by a powerful keynote speech by Věra Jourová, Vice-President of the European Commission for Values and Transparency, in which she addressed important questions such as whether the internet respects our rights and what role policy making will play.

She pointed out that “our view of the digital transformation is one that puts people at the center. We need an open internet with solid protections for users and a level playing field for businesses”.

She also mentioned the importance of regulation, which should go hand in hand with research, tech development and market innovation. If Europe wants to be at the forefront of Industry 4.0, it needs to keep investing in talent, boosting innovation and fostering the business environment.

The challenge, according to Věra Jourová, is how to scale up and be attractive to the market: not by competing with established corporations like Google and Microsoft, but by building a viable ecosystem for new businesses from the ground up. Furthermore, she said that “it is necessary to build a cooperative eco-system ensuring cultural diversity”.

Vice-President Jourová closed her speech by acknowledging the efforts of the Open Search Foundation, OpenWebSearch.eu and their partners to lead this important conversation.

Research Tracks: Human Centric Search and User Experience

The day continued with exciting insights into “Understanding and Mitigating Cognitive Bias during Web Search”. Simon Hitzginger presented this new research from a group of researchers (Simon Hitzginger, Christian Gütl, Alexander Nussbaumer, Chiara Ruß-Baumann) at Technische Universität Graz. The premise is that even if search engines operated completely neutrally, the problem of human bias would remain when it comes to evaluating results. In their study, the researchers first surveyed participants for their opinions on a polarizing topic. Afterwards, participants were asked to search the topic in a controlled search environment that offered the same number of articles for each side, namely 20 in favor of and 20 against the claim, as well as 9 neutral ones. The outcome: participants focused on, and spent more time with, articles confirming their pre-existing opinions.

The study shows that human behavior in itself comes with biased beliefs and decision making. Moreover, search engines that do not provide balanced results from relevant sources might reinforce existing biases, whether or not this actually benefits the person. On the flip side, the study shows that confirmation bias can be predicted by analyzing search behavior, so such tools can be used to inform and educate users about their often hidden biases. However, more data and research are needed.

Adjacent to the topic of search behavior, Steffen Leich-Nienhaus from Mercedes-Benz Group AG introduced an approach to Interfacing generic and specialized search engines on the User side. Being responsible for delivering information to corporate workers, Steffen and his team work on semantic extensions for queries and web content. He presented interesting case studies and practical analytics.

Manuel Noia, CEO of Linknovate Science, presented the “Linknovate startup radar – datalife use case”, an innovative research platform that was developed collaboratively with his colleague Carlos Rodriguez from the University of Santiago de Compostela as well as Javier Parapar from the Information Retrieval Lab at the University of A Coruña.

The startup radar works as a start-up scouting system and incorporates a tool for technical due diligence. Companies can detect mergers and acquisitions, funding events and product launches, and the tool also surfaces weak signals of knowledge and innovation.

Linknovate uses four approaches: keywords, extractive summary (key sentences), Linknovate company descriptions, and abstractive summary (full info in order to create an explanation). Users can tag references, and reports are generated automatically.

The final research track shed light on the philosophical and sociological elements in search. Manuel Theophil from Rheinland-Pfälzische Technische Universität shared insights in his talk “Reaching beyond ethics: Perspectives of Human rights education on an open search index”.

To discuss “Ethics in Search”, one has to understand the “currency” of the given ethical criteria. What do concepts like “open” or “freedom” mean in the context of search engines? How has the relationship between “a sense of reality” and “scientific facts” evolved in our day and age?

How does this relationship affect political opinion-forming? In the past, reasoning and deliberation were part of the political opinion-making process. Today, according to sociologists, most of what we find are aggregated private opinions. There is hardly a process of deliberation. We rely more and more on surveying fixed opinions.

In his presentation, Manuel examined the journey from Myth to Logos, from Philosophy to Technicism and from Tech Solutionism to Human Rights Education, access to which is a fundamental right according to the UN Declaration on Human Rights Education and Training, adopted in 2011.

Interactive Workshops and Industry Panel Session

After a lunch break on campus, the Workshop Sessions of the day focussed on “Dystopia vs Utopia – a hands-on workshop on ethical aspects of web search”, “Cross Border Legal Considerations” and “Energy Efficiency in Open Web Search”.

The afternoon of day two featured a panel session on “Business Applications and Economic Value of open Search”. Hosted by Isabell Claus from thinkers.ai, this industry-focussed discussion welcomed Jacqueline Erhart from ASFINAG Maut Services GmbH and Prof. Uwe Seebacher, Author and Editor at Springer Nature Group.

Isabell Claus appealed to the community to embrace business opportunities, as business value could attract financing, which would enhance the potential for science. 

Jacqueline Erhart stated that “the amount of data that needs to be processed in order to run the business at ASFINAG is rather intense and does require efficient tools including AI. Sorting through information more efficiently is a core requirement to drive innovation, meet ESG goals and secure cyber security”.

Uwe Seebacher pointed out that “information is available, but the validity of it remains obscure”. The time it takes to research and evaluate the quality of the information seems huge, leaving little room for the actual interpretation of the data sets. The key to implementing AI-based applications is a thorough trust-building process. In order to separate relevant from irrelevant information, we must trust the evaluation criteria and methodology. “Now we must make sure to take the businesses with us to ensure they learn to use the tools”, he stated.

Both panelists agreed that we are currently 10 years behind and need to be able to access open data sets as soon as possible in order to train AI.

Keynote Talk on Biases in Search

Moving back from business to research, Ricardo Baeza-Yates, Director at the Institute for Experiential AI at Northeastern University, USA, delivered an in-depth keynote presentation on Bias in Search and Recommender Systems.

To begin with, he explained a variety of biases we are tied to as human beings, including Statistical Biases (i.e. significant systematic deviation from a prior distribution), Cultural Biases (interpretations and judgements acquired through life), Cognitive Biases (systematic patterns of deviation from norm or rationality in judgement) and many more. Most web systems, according to Baeza-Yates, are optimized by using user feedback (clicks and engagements).

However, our search choices are already biased, as we can only respond to the limited results we are shown in the first place. The biases thus reinforce themselves, creating filter bubbles and echo chambers. Sometimes these systems compete with themselves. Improvement in one system might lead to degradation of another that uses a different approach.

Since all human data input is biased, the algorithm enhances various biases and spits out more consolidated biases, said Baeza-Yates. The solution would be to de-bias the input, tune the algorithms and de-bias the output as well. Another relevant point he made was that more developed countries usually have more information, which influences outcomes.

Baeza-Yates urged that we use data to break through filter bubbles rather than using data from others to predict behavior.

One of the biggest biases online is defined by interactions on the web: how things are presented affects how users interact. In addition, there is a second-order bias at work. Duplicating content by linking back to original content leads to better ranking of certain pages, meaning that our feedback loop influences the algorithm.

The key takeaway from this insightful lecture-like talk was that it takes multidisciplinary teams to reduce the interplay of biases in web search (which beautifully reinforces Manuel Theophil’s call for deliberation).

Knowledge Graphs and Authentic AI

Next up was Branimir Rakić, co-founder and CTO at Trace Labs, who came to talk about “Building Authentic AI: The Synergy of Decentralized Knowledge Graph and Community Incentives”.

According to Rakić, trusted data will become the cornerstone of human security. He introduced the Knowledge Revolution as the third big global revolution, following the invention of the printing press, which provided a solution for scarcity by allowing for replication, and the invention of the internet, which encouraged connectivity as an antidote to fragmentation. The Trust Revolution now requires decentralized AI. Soon there will be more content produced by machines than by people. He predicted that “Knowledge will be a new asset class”. The value of data is at the core of Big Tech companies already. Creating and owning knowledge via unique NFTs might be the next investment run. Trusted AI systems can build on knowledge assets, provided that the source is a trusted institution. The idea is to use blockchain for decentralized ownership but to place the knowledge (which is stored in a graph) above the blockchain. Search engines could then connect on top of that layer.

Industry Track with Alternative Search Engines Mojeek, fragFinn.de and Marginalia Search

Following this innovative talk, the industry track featuring Alternative Search Engines was moderated by Christine Plote, Member of the Executive Board at Open Search Foundation e.V.

Colin Hayhurst talked about his company Mojeek, a UK-based, non-tracking search engine running its own independent index. Anke Meinders introduced fragFinn.de, a curated non-profit safe search engine for kids in Germany. Viktor Löfgren talked about Marginalia Search, a small, specialized search engine based in Sweden that enables users to find small, high-quality websites, that is, sites that hardly show up in commercial search engines because they are crowded out by larger commercial websites and SEM.

After this deep dive into the current web search landscape, make sure to stick around for the recap of Day 3 of #ossym23.

Check back to #ossym23 DAY 1 here.