Four Years of Fake News

A new paper briefly reviewing the scientific literature on fake news published up to 2020 is online on First Monday. You can read it here: Four Years of Fake News. A Quantitative Analysis of the Scientific Literature

Abstract

Introduction: Since 2016, “fake news” has been the main buzzword for online misinformation and disinformation. This term has been widely used and discussed by scholars, leading to hundreds of publications in a few years. This report provides a quantitative analysis of the scientific literature on the topic published up to 2020.

Methods: Documents mentioning the keyword “fake news” have been searched in Scopus, a large multidisciplinary scientific database. Frequency analysis of metadata and automated lexical analysis of titles and abstracts have been employed to answer the research questions.

Results: 2,368 scientific documents mentioned “fake news” in the title or abstract, published by 5,060 authors and 1,225 sources. Until 2016 the number of documents mentioning the term was less than 10 per year, suddenly rising from 2017 (203 documents), and steadily increasing in the following years (477 in 2018, 694 in 2019, and 951 in 2020). Among the most prolific countries are the USA and European countries such as the UK, but also many non-Western countries such as India and China. Computer Science and Social Sciences are the disciplinary fields with the largest number of documents published. Three main thematic areas emerged: computational methodologies for fake news detection, the social and individual dimension of fake news, and fake news in the public and political sphere. There are 10 documents with more than 200 citations, and two papers with a record number of citations (Allcott & Gentzkow, 2017; Lazer et al., 2018).

Conclusions: Research on “fake news” keeps on the rise, with a marked upward trend following the 2016 USA Presidential election. Despite having been the subject of debate and also criticism, the term is still widely used. A strong methodological interest in fake news detection through machine learning algorithms emerged, which – it can be argued – can be profitably balanced by a social science approach able to unpack the phenomenon also from a qualitative and theoretical point of view. Although dominated by the USA and other Western countries, the research landscape includes different countries of the world, thus enabling a wider and more nuanced knowledge of the problem. A constantly growing field of study like the one concerning fake news requires scholars to have a general overview of the scientific productions on the topic, and systematic literature reviews can be of help. The variety of perspectives and topics addressed by scholars also means that future analyses will need to focus on more specific topics.

An online handbook to learn R and Time Series Analysis

I have recently started to teach a course in data analysis with R at the University of Vienna, and I am creating a free online book where I explain fundamental R functions and data analysis operations, with a specific focus on time series analysis.

I’ll update the online book as the course goes on, but some chapters are already online. You can read the book at this link: Time Series Analysis With R

Digital Animal Advocacy: A Study on Facebook Communication Styles of Italian Animal Rights Organizations and their Followers’ Reactions

A new paper (open access!) is out: Digital Animal Advocacy: A Study on Facebook Communication Styles of Italian Animal Rights Organizations and their Followers’ Reactions.

The paper analyzes the Facebook communication of the galaxy of Italian animal advocates by using a text-mining approach, and reflects on the role of social media in promoting specific political approaches to animal rights.

The social media ecology does not change or construct the different positions of different sectors of animal advocacy, but contributes to amplify their distances, favoring the visibility or, using the term adopted by Dijck and Poell (2013), the ‘popularity’ of some groups over others. It is no coincidence that mainstream AAOs, which have greater financial and professional resources at their disposal, also have Facebook pages with the highest number of followers (Tab. 3), nor that they are clearly fully aware of the possibilities of exploiting the algorithms that preside over the distribution of the most popular content (…) in order to hack the social media attention economy (…) by calling on the concerted efforts of well-organized armies of web-activists.

From this perspective, Italian animal advocacy reflects a lack of democracy in digital platforms and is a further proof of the adage ‘the rich get richer and the poor get poorer’ (Merton, 1968). At least for the moment, the horizontal and democratic nature of Internet-based communication that is hoped for (in the case of anarchist AAOs) or explicitly claimed as already existent and widespread (in the case of anti-political AAOs) is absent (…)

Correspondence analysis of the posts published by four categories of animal rights organizations. The words represented in the map are a subset of the analyzed words with the highest contribution to the axes and representation quality.

QAnon in Italy: a growing phenomenon?

Pandemics are scary. There is nothing worse than an invisible enemy to spread anxiety. Information is partial and contradictory. Rumors and misinformation flourish. People have to grapple daily with uncertainty and fear. Such a context is a breeding ground for conspiracy theories and their harmful consequences.

Among the interesting phenomena observed so far is the attempt to incorporate the pandemic into already established conspiracy and fringe narratives, such as the anti-vax and QAnon ones. The first is rather popular in Italy, where a law on mandatory child vaccinations (2017) sparked a heated debate and an information crisis and fueled the spread of misinformation on social media. The QAnon conspiracy theory, by contrast, is not so common in Italy. However, the situation could change.

Analyzing 12 months of data from Google Trends, Twitter and Facebook, I observed an unequivocal rise in interest in QAnon theories in Italy during the last period of the Covid-19 quarantine (March/April 2020).

Italian Google Search interest, Facebook posts (CrowdTangle data) and Tweets matching the keywords WWG1WGA and Qanon. Data are scaled on a range of 0 to 100. The blue and red vertical lines represent, respectively, the structural break dates in the Google Trends and the two social media time series.
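To make the "scaled on a range of 0 to 100" step concrete, here is a minimal sketch. It assumes plain min-max rescaling, which is only one of the ways such a scaling could have been done; the function name and the weekly counts are made up for illustration.

```python
def rescale_0_100(values):
    """Min-max rescale a numeric series to the range 0..100 (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread; map it to zeros by convention.
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


# Hypothetical weekly counts, not the data used in the post.
weekly_tweets = [3, 5, 4, 12, 40, 75]
weekly_posts = [10, 10, 15, 30, 90, 210]

scaled_tweets = rescale_0_100(weekly_tweets)
scaled_posts = rescale_0_100(weekly_posts)
```

After rescaling, series with very different volumes can be drawn on the same vertical axis, which is what makes the three sources comparable in one chart.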

At the moment the number of tweets and posts is small. Nevertheless, if the peak keeps growing…

Data are divided into the pre-pandemic period (before 2019-12-31), the pandemic period (after 2019-12-31 but before 2020-03-15) and the peak of interest period (after 2020-03-15)
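The three periods can be assigned with a small helper like the one below. This is only a sketch: the caption does not say which period the two boundary days themselves belong to, so the choice made here (each boundary day opens the later period) is an assumption, as are the names used.

```python
from datetime import date

# Boundary dates taken from the caption above.
PANDEMIC_START = date(2019, 12, 31)
PEAK_START = date(2020, 3, 15)


def period_of(day):
    """Label a date with one of the three periods.

    Assumption: each boundary day is counted in the later period.
    """
    if day < PANDEMIC_START:
        return "pre-pandemic"
    if day < PEAK_START:
        return "pandemic"
    return "peak of interest"


labels = [period_of(d) for d in (date(2019, 6, 1), date(2020, 2, 1), date(2020, 4, 1))]
```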

The following table includes the most active Facebook entities (pages, public groups, or verified profiles) and Twitter accounts in the collected data sets.

Facebook pages/groups and Twitter accounts that have published the highest number of posts matching the keywords QAnon or WWG1WGA (not necessarily “QAnon believers”)

The following Google Sheets include some examples of the Facebook posts and tweets with the highest engagement (data are divided into the pre-pandemic period, before 2019-12-31; the pandemic period, after 2019-12-31 and before 2020-03-15; and the peak of interest period, after 2020-03-15).

Health and political misinformation in Italy. The case of mandatory vaccine law

Political and health misinformation are huge problems today. Both can have nefarious consequences on individuals, citizens, institutions and society. As I and co-authors have recently observed in our paper “Blurred Shots: Investigating the Information Crisis Around Vaccination in Italy”, politicization can increase the circulation of misinformation on vaccines “both directly and by opening the door to pseudoscientific and conspiratorial content (…) published by problematic news sources”. It is therefore interesting to further analyze the link between politicization and misinformation.

Italy is an excellent case for studying the intertwining of politics and health misinformation, since it was at the center of both a political and a health information crisis, during the election (in March 2018) and the debate on the law on mandatory child vaccinations (around July 2017).

I analyzed about 500,000 tweets published between 2016-02-01 and 2019-01-31 (three years), 18 months before and after July 31, 2017, when the Italian law on mandatory vaccinations came into force (you can read the working paper here).

The peak of discussions during the political debate was very clear and a structural break analysis confirmed that the usual flow of conversation dramatically changed around that time.

Annotated time series of the Italian tweets on vaccines. The red box indicates the boundaries of the structural break, exactly during the debate on the mandatory vaccinations law.
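For readers unfamiliar with structural break analysis, the idea can be illustrated with a deliberately simplified sketch: find the single split point that best separates a series into two segments with different means. This is not the procedure used in the study (which is not detailed here and identified a break period with two boundaries); the function name and the daily counts are hypothetical.

```python
def single_mean_break(series, min_segment=2):
    """Return the index k that best splits series into series[:k] and series[k:].

    'Best' means the smallest total squared deviation of each segment
    from its own mean. Returns None if the series is too short.
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_k, best_cost = None, float("inf")
    for k in range(min_segment, len(series) - min_segment + 1):
        cost = sse(series[:k]) + sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Made-up daily tweet counts with an obvious jump in level.
daily_tweets = [20, 22, 19, 21, 23, 20, 95, 110, 102, 98, 105, 99]
break_index = single_mean_break(daily_tweets)
```

Dedicated tools test whether such a break is statistically significant and can detect several breaks at once; the sketch only shows the least-squares intuition behind them.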

I categorized over 1,000 information sources shared by Twitter’s users and used a combination of network analysis, correspondence analysis and clustering to identify the groups of information source categories frequently shared together.

The resulting map was unequivocal: the Twitter information environment related to vaccines was polarized and characterized by information homophily. The information sources openly promoting anti-vax perspectives were clustered with blacklisted domains (included in the blacklists of the main Italian debunkers), alternative therapy sites, and conspiracy blogs. At the opposite pole of the Correspondence Analysis axis (a dimension that accounts for over 70% of the total variance) we find scientific information sources, those of health organizations, and official sources. Political information sources (such as websites of political parties and advocacy organizations) lie in between these two poles.

Clusters of information sources

I measured the spread of health misinformation by using the tweets that shared sources included in the cluster of problematic information sources and found that the time series of tweets sharing misinformation was characterized by a structural break during the political debate, just like the whole series of tweets. It seemed, therefore, that the politicization of the topic fostered not only the conversations on the topic in general, but also the spread of health misinformation.

Since the spread of misinformation can be clearly associated with general attention, and more specifically with media attention to the topic (misinformation sources can try to “ride the wave” of this interest), I used a Vector Autoregressive Model to further test the hypothesis. I found that politicization, operationalized through a proxy variable indicating the structural break, was associated with an increase in the quantity of misinformation even when holding constant the quantity of news media coverage of the previous days. I measured coverage through Media Cloud, but found roughly the same results using the time series of tweets sharing information sources categorized as news media.

Moreover, it seems that while misinformation, on average, grew considerably during the political debate (to about six times the level of the period before the debate) and remained higher afterwards (about twice the pre-debate level), the number of tweets sharing scientific and health information sources increased much less during the central period of the debate (only about twofold) and even decreased after that time.

This study is only a small step toward understanding the intertwining of health and political misinformation. There is still much work to be done: other case studies and more sophisticated measures are needed. In the meantime, you can read the full paper of the study briefly described above here.


CooRnet: an R package for detecting coordinated behavior on social media

Today we have released CooRnet, an R package developed for detecting coordinated link sharing behavior on Facebook and Instagram.

Given a list of URLs, the package queries the CrowdTangle API link endpoint and retrieves the Facebook shares performed by public pages, groups and verified accounts, identifying the networks involved in coordinated activity.

The basic functions of CooRnet are augmented with other useful functions that create, for instance, the graph of the coordinated networks (to do additional network analysis) or the dataset of the most shared URLs.

CooRnet implements the methods we applied and detailed in our research on coordinated link sharing behavior on Facebook, as described in the report Understanding Coordinated and Inauthentic Link Sharing Behavior on Facebook in the Run-up to 2018 General Election and 2019 European Election in Italy – where we found, for instance, that URLs shared in a coordinated way gained more engagement than those shared in a non-coordinated way – and in the paper It takes a village to manipulate the media: coordinated link sharing behavior during 2018 and 2019 Italian elections.

Engagement by coordinated and non-coordinated activity on Facebook. Understanding Coordinated and Inauthentic Link Sharing Behavior on Facebook in the Run-up to 2018 General Election and 2019 European Election in Italy

In these works we found that networks involved in coordinated link sharing behavior are consistently associated with the spread of misinformation on Facebook. In the two figures below you can see the proportion of blacklisted domains shared by coordinated and non-coordinated entities, and the proportion of problematic entities (signaled by Avaaz) included in the coordinated and non-coordinated entities, as emerged in our studies on the Italian elections.

A more detailed description of CooRnet can be found on the dedicated website.

Proportion of problematic domains shared by coordinated and non-coordinated entities. The panel on the right displays the risk ratio (RR) values, all statistically significant. (It takes a village to manipulate the media)
Proportion of problematic entities included in the coordinated and non-coordinated entities. The panel on the right displays the risk ratio (RR) values, all statistically significant. (It takes a village to manipulate the media)
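The risk ratio shown in these figures compares two proportions. A minimal sketch of the computation follows, with a 95% confidence interval obtained through the standard log-scale approximation; the counts are invented for illustration, and this is not necessarily how the values in the paper were computed.

```python
import math


def risk_ratio(group_hits, group_total, reference_hits, reference_total, z=1.96):
    """Risk ratio of two proportions with an approximate confidence interval.

    The interval uses the usual standard error of log(RR); z=1.96
    corresponds to a 95% interval. All counts must be positive.
    """
    rr = (group_hits / group_total) / (reference_hits / reference_total)
    se = math.sqrt(
        1 / group_hits - 1 / group_total
        + 1 / reference_hits - 1 / reference_total
    )
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical example: 30 of 100 domains shared by coordinated entities
# are blacklisted, against 10 of 100 for non-coordinated entities.
rr, lower, upper = risk_ratio(30, 100, 10, 100)
```

An interval that lies entirely above 1 indicates that the proportion is higher in the first group than in the reference group at the chosen confidence level.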

Understanding Coordinated and Inauthentic Link Sharing Behavior on Facebook

I am pleased to announce the publication of “Understanding Coordinated and Inauthentic Link Sharing Behavior on Facebook in the Run-up to 2018 General Election and 2019 European Election in Italy”. This report is the first outcome of our research project funded by the Social Science Research Council in partnership with Facebook and Social Science One.

In the report we analyzed “coordinated inauthentic behavior”, a concept only briefly defined in Facebook public statements which we found useful to frame our research in the light of existing scientific literature.

Using the CrowdTangle API (a tool for accessing Facebook and other social media data) together with two datasets of Italian political news stories published in the run-up to the 2018 Italian general election and 2019 European election, we developed a method that led to the identification of several networks of pages/groups/verified public profiles (“entities”) that shared the same political news stories on Facebook within a very short period of time: 10 networks in 2018, composed of 28 entities, and 50 in 2019, composed of 143 entities. We called this behavior “coordinated link sharing”. You can find the R script to implement the method here.
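The core idea, entities repeatedly sharing the same link within a very short time of each other, can be sketched in a few lines. The code below is a simplified illustration and not the R script mentioned above: the function names, the 30-second window and the minimum of two co-shared URLs are all hypothetical choices made for the example.

```python
from collections import defaultdict
from itertools import combinations


def coordinated_pairs(shares, window_seconds=30, min_urls=2):
    """Find pairs of entities that repeatedly share the same URL close in time.

    shares: iterable of (entity, url, timestamp_in_seconds) tuples.
    Returns {frozenset({a, b}): set_of_urls} for pairs that co-shared
    at least min_urls distinct URLs within window_seconds of each other.
    """
    by_url = defaultdict(list)
    for entity, url, ts in shares:
        by_url[url].append((entity, ts))

    pair_urls = defaultdict(set)
    for url, items in by_url.items():
        for (e1, t1), (e2, t2) in combinations(items, 2):
            if e1 != e2 and abs(t1 - t2) <= window_seconds:
                pair_urls[frozenset((e1, e2))].add(url)

    return {pair: urls for pair, urls in pair_urls.items() if len(urls) >= min_urls}


def networks(pairs):
    """Group coordinated pairs into networks (connected components)."""
    adjacency = defaultdict(set)
    for pair in pairs:
        a, b = tuple(pair)
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Made-up shares: page_a and page_b post the same two links seconds apart.
shares = [
    ("page_a", "u1", 0), ("page_b", "u1", 10), ("page_c", "u1", 4000),
    ("page_a", "u2", 100), ("page_b", "u2", 115),
    ("page_c", "u2", 9000), ("page_d", "u3", 50),
]
pairs = coordinated_pairs(shares)
found = networks(pairs)
```

Requiring repeated co-shares, rather than a single one, is what separates a pattern from a coincidence; in practice the time window and the repetition threshold would need to be estimated from the data rather than fixed by hand.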

One of the Italian coordinated networks

By analyzing the social media profiles of such coordinated entities, we observed that while some of them were clearly political, others presented themselves as entertainment venues, despite sharing political content too. Since the political news stories shared by these non-political entities can reach a broad audience which is largely unguarded against attempts to influence, we describe their behavior as “inauthentic” (look at the following tweet to get an idea of what we mean by “inauthentic”).

Our analyses showed that the news shared by the coordinated networks received higher Facebook engagement than the other news included in our dataset. Further analyses are needed to understand the impact of coordinated activities on engagement and public opinion.

The engagement of the news items shared by coordinated (orange) and non-coordinated (light blue) entities

We also found that much of the news boosted anti-immigration and far-right propaganda (primarily League-friendly propaganda) and that several of the news outlets shared by these networks, as well as some of the Facebook pages involved in coordinated behavior, were already blacklisted by fact-checkers.

We estimated the partisan leaning of the online news media outlets shared by the coordinated networks using the Multi-Party Media Partisanship Attention Score (MP-MPAS), a method we described in a recently published paper.

From Robots to Social Robots. Trends, Representation and Facebook Engagement of Robot-Related News Stories

In recent years technological beings have been entering our individual and social lives in ever increasing numbers. From virtual personal assistants like Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home to robots working with us and for us, artificial creatures are leaving both the fictional world they inhabited for centuries and the industrial and aeronautical fields in which they used to be applied, to increasingly share our living space.

To accommodate these artificial creatures in social life, a space in the symbolic order of society needs to be created. Since media communication is a significant means for creating such a cultural space and there is a lack of research in the field, we decided to conduct an empirical analysis of the Italian online media coverage of robot-related news stories and their Facebook engagement.

Robot-related news items published between 2014 and 2018 and their Facebook engagement

The study has analyzed how, and how much, robot-related topics have been covered and represented in the headlines of Italian online newspapers throughout recent years, relying on text mining techniques for unsupervised text classification developed in R. The news stories were collected through Media Cloud and their Facebook engagement retrieved through the Facebook Graph API, using an approach directly inspired by the Mapping Italian News project.

Topics in robot-related news stories

Results support the idea that online media have been increasingly covering robot-related news stories, and that the online public has been increasingly affected by this. The text analysis revealed that the most relevant topic in online news media concerns the work skills of robots, which arouse partly astonishment and partly concern about job losses.

Specificity table. The most characteristic lemmas in each topic (English translation).
The 10 most engaging Italian robot-related news stories on Facebook (English translation)

The news with the highest engagement concerns the experiment of two robots that started “talking to each other” in an unknown language (Facebook engagement: 67,819). Then, there is news on issues such as the future, human-robot relationship, and work-related controversies and policies.

Although fears that robots might steal human jobs and become autonomous and uncontrollable seem to persist, news representing robots as a threat is less common than expected. This might support the idea that threatening representations of robots (Mori, 1970) are not so widespread or engaging.

This was not a specific area of inquiry of the current study and further research is needed to assess the attitude toward robots, and how and why it has changed through the years. However, some observations are possible. For example, there might be a lack of awareness regarding risks associated with the use of robots, such as war robots, due to scarce media coverage of the topic. However, a stronger explanation has to do with socialization practices promoting human-robot coexistence: a lot of news revolves around the use of robots in teaching activities, the entertainment industry, festivals, exhibits and in the personal and familial sphere. These activities promote a gradual, positive integration of robots in everyday life. Considering that many people still have limited direct experience with robots, media play a central role in promoting a positive representation of robots. Finally, a significant role is played, and will be played in the future, by marketing activities aimed at promoting positive attitudes toward consumer robotics products.

From a general perspective, the results have shown that online news stories on robots have increased over time, doubling in the five years considered in the study. Facebook engagement follows the same path, thus supporting the idea of an increasing interest in robots among the Italian online public and suggesting that people no longer perceive robots as a topic far from their lives. In turn, familiarity with robots is reinforced by their increasing presence in online news stories.

The complete paper can be accessed here: From Robots to Social Robots. Trends, Representation and Facebook Engagement of Robot-Related News Stories Published by Italian Online News Media.

First AoIR Flashpoint Symposium

On June 24, 2019 the University of Urbino hosted the first AoIR Flashpoint Symposium. I am really happy to have contributed to the success of the event with the other members of the organizing committee.

The Flashpoint Symposium is a new format of academic meeting that, as the president of the Association of Internet Researchers Axel Bruns said, aims at responding “more rapidly to the key issues of the day than conventional conferences, journals, and books are able to do”.

The title of the Flashpoint Symposium was “Below the Radar: Private Groups, Locked Platforms and Ephemeral Contents”. The focus of the event was on the problems researchers face in accessing social media data and on the issues of studying social media content in an environment marked by an increasing amount of ephemeral user-generated content.

The AoIR Flashpoint Symposium was kicked off with the keynote speech of the digital anthropologist Crystal Abidin, and closed by Rebekah Tromble.

Crystal Abidin presented a lot of engaging research materials and an interesting perspective on how danah boyd’s concept of networked publics could be revisited in the light of the recent transformations of the Internet.

The closing keynote speech was delivered by Rebekah Tromble, who addressed the issue of research ethics in a scenario where social media data are increasingly difficult for researchers to access, urging scholars to think critically about the social importance of research questions and about the ethics of data treatment and conservation.

The AoIR Flashpoint Symposium was streamed live and the video recording is available on the YouTube channel of the University of Urbino.

Axel Bruns wrote a live blog during the conference that can be read on his website. The website with the program of the conference can be accessed at the following link.

Diverging Patterns of Interaction Around News on Social Media: Insularity and Partisanship During the 2018 Italian Election Campaign

“Diverging patterns of interaction around news on social media: insularity and partisanship during the 2018 Italian election campaign” has just been published in Information, Communication & Society.

I co-authored this paper with Fabio Giglietto, Augusto Valeriani and Giada Marino, devoting most of my attention to the methodological and statistical analyses sections.

The study – an outcome of the Mapping Italian News project – sheds light on the Italian online news media ecosystem and the digital behaviour of partisan communities, using methods we described in a recently published paper that combine Twitter and Facebook data.

We found that:

  1. On Twitter, sources mainly shared by supporters of populist parties (the Five Star Movement and the League) are characterized by higher levels of insularity compared to those shared by supporters of other parties.
  2. On Facebook, news items published by highly insular sources receive a higher number of shares per comment.
  3. News stories presenting a positive framing of the ‘cyber party’ Five Star Movement received a higher number of shares per comment compared to items presenting the Movement in a negative light, while the opposite is true for stories on all other political parties (see the figure below).
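The "shares per comment" measure used in these findings is a simple ratio. A minimal sketch follows, with invented figures; how stories with zero comments were handled in the paper is not stated here, so returning None for them is an assumption.

```python
def shares_per_comment(shares, comments):
    """Ratio of shares to comments; None when there are no comments (assumption)."""
    if comments == 0:
        return None
    return shares / comments


# Hypothetical engagement figures for three news stories.
stories = [
    {"id": "story_1", "shares": 1200, "comments": 300},
    {"id": "story_2", "shares": 150, "comments": 600},
    {"id": "story_3", "shares": 40, "comments": 0},
]
ratios = {s["id"]: shares_per_comment(s["shares"], s["comments"]) for s in stories}
```

A high ratio means a story is passed along more than it is discussed, while a low ratio means it attracts more debate than redistribution.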

You can read the full paper here.