Abstract
Hyperpartisan news consists of articles with strong biases that support specific political parties. The spread of such news increases polarization among readers, which threatens social unity and democratic stability. Automated tools can help identify hyperpartisan news in the daily flood of articles, offering a way to tackle these problems. With recent advances in machine learning and deep learning, there are now more methods available to address this issue. This literature review collects and organizes the different methods used in previous studies on hyperpartisan news detection. Using the PRISMA methodology, we reviewed and systematized approaches and datasets from 81 articles published from January 2015 to 2024. Our analysis includes several steps: differentiating hyperpartisan news detection from similar tasks, identifying text sources, labeling methods, and evaluating models. We found some key gaps: there is no clear definition of hyperpartisanship in Computer Science, and most datasets are in English, highlighting the need for more datasets in minority languages. Moreover, deep learning models tend to outperform traditional machine learning, but Large Language Models’ (LLMs) capacities in this domain have received limited study. This paper is the first to systematically review hyperpartisan news detection, laying a solid groundwork for future research.
Citation: Maggini MJ, Bassi D, Piot P, Dias G, Otero PG (2025) A systematic review of automated hyperpartisan news detection. PLoS ONE 20(2): e0316989. https://doi.org/10.1371/journal.pone.0316989
Editor: Pablo Henríquez, Universidad Diego Portales, CHILE
Received: August 12, 2024; Accepted: December 19, 2024; Published: February 21, 2025
Copyright: © 2025 Maggini et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All the files regarding the collection and screening process are publicly available at the following GitHub repository: https://github.com/MichJoM/Hyperpartisan_News_Detection_Systematic_Review/tree/main. No additional access arrangements are needed, since the data are already open.
Funding: This work is supported by the EU HORIZON 2021 European Union’s Horizon Europe research and innovation programme (https://cordis.europa.eu/project/id/101073351/es) under the Marie Skłodowska-Curie Grant No. 101073351. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Executive Agency (REA). Neither the European Union nor the granting authority can be held responsible for them. The authors have no relevant financial or non-financial interests to disclose. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The foundation of democratic governments rests on the voting process conducted by citizens [1]. Political parties, in their quest for votes, heavily rely on news media to disseminate their messages during campaigns. While transparent information and active political participation are crucial for a healthy democracy, political entities increasingly employ hyperpartisan communication strategies. These tactics aim to discredit opposing factions and distort reality, potentially impacting how governments represent their constituents. Although hyperpartisan campaign methods may increase voter participation [2] and strengthen the connection between voting decisions and specific ideologies, they can have significant negative consequences. As [3] demonstrates, this communication style can highlight divisive tensions within society, complicating governance and potentially alienating citizens when opposing sides gain power. Consequently, hyperpartisanism poses a threat to the proper functioning of democracy [4] by polarizing and dividing the social fabric, reducing trust in governmental entities and mainstream news [5], and exacerbating tensions between governments and their oppositions [6].
The rise of alternative media outlets further amplifies these threats to democracy [7], as they often share polarizing content [8]. In the online sphere, hyperpartisanship proliferates through various channels, such as social networks and publishers’ websites. The dissemination of hyperpartisan news, characterized by highly polarized political and ideological content, capitalizes on the virality facilitated by platform algorithms [9]. While the term gained prominence during the 2016 U.S. election [10], there is no evidence suggesting that this specific event triggered a systemic hyper-polarization [11].
The digital realm has become a significant arena for political influence [12], affecting the entire infosphere [13]. The close relationship between hyperpartisanship and online interactions has led to increased attention on these manipulative forms of communication [14]. On the policy front, the EU Commission’s 2018 expert report [15] addressed related topics such as disinformation, defamation, hate speech, and incitement to violence. More recently, the European Parliament adopted the Digital Services Act (DSA) [16] in 2022, aiming to provide "a secure, predictable and trustworthy online environment" (Article 1. 1). In line with [9] and [17], we categorize hyperpartisan news under the broader umbrella of misinformation, closely related to fake news detection. Hyperpartisan news detection as a classification task is specifically related to the news domain and can focus on linguistic, semantic, and meta-data features. The objective is for an algorithm to predict a text’s political affiliation or determine if the content is hyperpartisan. The rising academic interest in hyperpartisan detection is testified by the high participation of 42 teams at task 4 of SemEval-2019 [18].
For this systematic review, we only considered automated text-based strategies applied to news articles. Manual detection of hyperpartisan news has been proposed, mainly focusing on discourse analysis [19–21]. Despite its effectiveness, this approach does not scale to the daily volume of news. Hence, automated methods such as deep learning, social network analysis, or cross-methodologies like [22] are more effective. These approaches rely on different features, so hyperpartisan news detection may be tackled using content-, source-, and user-based data [23].
The article is organized as follows: the Related Works section covers the relevant surveys on similar topics, highlighting their main features and comparing their limitations with regard to our study; the Methodology section discusses the methodology adopted for this systematic review, including research questions, search strategy, selection criteria, and selection procedure; the section Hyperpartisan news detection: description of the phenomenon focuses on the definition of hyperpartisanship, highlighting its multi-task and cross-disciplinary nature. Afterward, we present the textual frames where hyperpartisan traits are traceable and the spectrum of methodologies used in different computational sub-fields. Then, we cover the diverse strategies and scales used to label hyperpartisanship. Section Approaches for automatic hyperpartisan news detection contains a global categorization and discussion of the best-performing models in the papers screened and selected, distinguishing between model typology, results, features, and approaches employed. Section Datasets is a descriptive overview of the datasets used in this domain: we collected the cited datasets and their features. Finally, section Conclusions and future works concludes the article by presenting the main findings of our literature review.
The main contributions of this study are:
- Comparing the different definitions of hyperpartisan news detection;
- Collecting and discussing the diverse approaches and algorithms used in the selected literature, specifically for the news domain;
- Reporting evaluation metrics, features and embeddings considered in the studies;
- Presenting the main findings, the engineering innovations and research designs;
- Collecting and analyzing 38 datasets used in the literature, focusing on English and less-represented languages;
- Delineating prevailing research gaps and challenges in the hyperpartisan news detection task.
Related works
The current state of the literature lacks a systematic review specialized in automatic hyperpartisan news detection. While there are various relevant survey papers, they predominantly focus on fake news and bias detection tasks. For instance, [24] examined fake news detection while considering the relation between the factuality and political bias of news sources, without presenting any dataset or discussing methodologies. [25] started from a theoretical introduction of the fake news phenomenon and then covered the technical methodologies from different perspectives, from content to style analysis. [26] compared manual and automated approaches to identify media bias, distinguishing several forms of bias occurring at distinct steps of news production. Similarly, [27] investigated the application of deep learning algorithms in fake news detection, building upon a taxonomy proposed by [17], in which hyperpartisan news detection overlaps with fake news detection. [28] covers the broad field of disinformation by designing a taxonomy, without considering either automated approaches or the datasets used in the literature. Similarly, [29] analyzes the general phenomenon of media bias detection by describing its diverse manifestations (e.g., spin bias, ideology bias, coverage bias), distinguishing the techniques to detect them, and reporting 17 datasets. Except for this last work, none of these surveys pays particular attention to hyperpartisan news detection.
Methodology
In this section, we present and describe the methodology adopted to conduct this systematic review, following [30]’s guidelines. The planning and execution phases of this study are detailed in the following subsections, while the results are discussed in the sections Hyperpartisan news detection: description of the phenomenon, Approaches for automatic hyperpartisan news detection, and Datasets.
Research Questions
The Research Questions (RQ) that motivated the need for this systematic review are the following:
- RQ1 Does a categorization for hyperpartisan news detection methods exist?
- RQ2 Is hyperpartisan news detection a stand-alone or overlapping task?
- RQ3 What are the proposed solutions using textual data?
- RQ4 Does the task keep up with the new Natural Language Processing technologies like autoregressive models?
- RQ5 What are the results of the models developed?
- RQ6 What are the datasets used for this task? How are they structured? Have they been updated to cover the latest political global and regional trends?
- RQ7 How can the current state of research on hyperpartisan detection be characterized in diverse languages and countries?
Search strategy
We adopted the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [31], consisting of a checklist (http://www.prisma-statement.org/documents/PRISMA_2020_checklist.pdf) and a flow diagram (Fig 1) to illustrate the steps taken in a simplified and clear way. To retrieve papers, several different academic online databases were used to overcome their respective limitations [32] in terms of topic coverage and papers available: ACM Digital Library, Google Scholar, Scopus, ProQuest, and IEEE Xplore. Our query archetype was: ((hyperpartisan OR “political bias” OR “hyper-partisan” OR partisanship OR hyperpartisanship OR “political polarization”) AND (news OR bias OR articles) AND (detection OR classification)). The first set of words contains the different homographs. We also searched in all subject fields to capture as many semantically similar papers as possible, including potentially miscategorized ones. We selected the 2015-2024 timeframe to capture trends from before the term was coined, across a period in which studies on this topic grew and increasingly powerful models were employed.
The Flow Diagram illustrates the steps taken during document collection and evaluation. We screened more than 1553 papers and finally selected a subset of 81.
To obtain papers pertinent to our research questions, the queries reported in Table 1 were the refined result of a structured process based on the steps introduced by [29]. Queries within each database were structured to match titles, abstracts, and keywords. We extended this pipeline by introducing the following step: “Network visualization and exploration”. This step involved using the research software ResearchRabbit (https://researchrabbitapp.com) to exhaustively capture possibly omitted papers through the citation link structure.
- Keywords domain extrapolation: Initial reviews on similar topics helped us identify the keywords used in this domain. We noticed a lack of scientific agreement on how to write “hyperpartisan”. To cover all these morphologically diverse forms (hyper-partisan, hyperpartisan, hyperpartisanship), we included them in our queries, treating them as synonyms;
- Iterative searches: This process allowed us to select the most appropriate terms by comparing the results retrieved using different keyword combinations. We examined how closely titles and abstracts related to the queries;
- Verifying against established literature: To ensure the efficiency of our search terms, we compared the results to a list of papers in the domain of hyperpartisan detection;
- Network visualization and exploration: To further validate our verification, we used ResearchRabbit, a tool to visualize the citation links between papers in the same collection. It suggested similar papers written by the same or different authors, highlighting stored publications in the user’s folder. This tool helped us in gauging the coherence of our results.
Selection criteria
Before describing the screening process, we illustrate the criteria employed for the paper selection.
- Inclusion criteria
- Papers primarily focused on automated hyperpartisan news detection;
- Papers that used the related task (e.g. fake news detection) as a synonym of hyperpartisan news detection;
- Publications from 2015 to 2024;
- Exclusion criteria
- Exclusion of sources that either address the hyperpartisan news detection problem from a theoretical perspective, namely theory papers, or manual detection;
- Studies discussing only related topics, such as fake news detection, stance detection, or political bias;
- Findings that do not use news domain datasets as the main source for hyperpartisan news detection, i.e. social network analysis, comments analysis, and tweets detection-based approaches;
- Literature reviews, books, theses, and posters.
Screening and selection process
The following procedures were used for study selection and analysis. The study selection, quality assessment of the included studies, and thematic analysis were performed by one author (PP). However, the procedures and findings were discussed by all authors, and potential disagreements were resolved by consensus.
To manage the screening and selection processes, we utilized Rayyan (https://www.rayyan.ai/) for its AI-powered capabilities, which allowed the two reviewers to conduct a blinded selection process, preventing any mutual influence. Specific eligibility criteria were established to ensure the reliability of the study. These criteria were applied independently by each reviewer to maintain objectivity and consistency. The criteria included: relevance to the predefined inclusion criteria, evaluation of models using both accuracy and F1 score, and comprehensive reporting of the dataset used. Only papers that met the criteria and were accepted by both reviewers were selected. In cases of disagreement, a third reviewer was consulted to assess the paper’s eligibility. The initial dataset consisted of 723 papers from ACM Digital Library (https://dl.acm.org/), 571 from Google Scholar (https://scholar.google.com/), 1 from ScienceDirect (https://www.sciencedirect.com/), 97 from Scopus (https://www.scopus.com/home.uri), 159 from ProQuest (https://www.proquest.com/index), and 1 from IEEE Xplore (https://ieeexplore.ieee.org/Xplore/home.jsp). Notably, Google Scholar initially retrieved 1800 results, but we observed that beyond the first 500 or so it no longer produced relevant documents, which led us to manually collect only the first 571 papers.
We conducted the entire selection process using Rayyan, as described in Fig 1. Rayyan automatically detected 118 duplicates, which we removed after manual checks. For the remaining 1441 studies, the initial step was screening titles and abstracts. Following thorough evaluations, 67 papers were retained from a curated pool of 110, eliminating 43 papers that did not meet specific focus or dataset criteria.
Additionally, to examine the cohesion and coherence of our references, we used ResearchRabbit to visualize the citation network, identifying two prominent clusters centered on [33] and [18]. [18], a key work from the SemEval initiative, set a foundational benchmark for detecting hyperpartisanship in news articles, which informed our criteria for selecting relevant studies. This shared task saw the participation of 42 teams, who explored several approaches that later research expanded upon. Moreover, the two datasets described therein are important benchmarks for hyperpartisan news detection. Similarly, [33] compared linguistic and topical methodologies to discern between hyperpartisan and neutral news; it was one of the first works in the literature and established the importance of linguistic features in this task. 14 additional papers were included through this procedure by exploring similar works and citations. We also compared the several definitions of hyperpartisan news to stress the importance of having a specific and clear task that does not overlap with related ones. Our work offers an extensive and comprehensive investigation of state-of-the-art techniques, considering mixed approaches as well as machine and deep learning applications. To ensure our systematic review is both homogeneous and robust in terms of comparability, we focused on the most commonly used performance metrics in NLP: accuracy and F1 score. By collecting and analyzing these standard metrics, we aim to maintain consistency across the studies and enhance the reliability of our comparative analysis. We also retrieved and analyzed 38 datasets, reporting the evaluation metrics, embeddings, and features used by researchers. Finally, we present some descriptive results regarding the trend of publications over time (Fig 2) and the main publishers in the selected sample (Fig 3).
Transparency and replicability
Emphasis was placed on transparency and replicability, to adhere to rigorous academic standards and to PLOS ONE’s policy on Data Availability. Thus, a GitHub repository stores the queries employed and described in this paper, as well as the results of the screening process described above. This enables fellow researchers to replicate the methodology and verify the findings. The repository is accessible at https://github.com/MichJoM/Hyperpartisan_News_Detection_Systematic_Review/tree/main. In addition to the previous information, it contains an explanation of how missing data were handled.
Hyperpartisan news detection: Description of the phenomenon
In this section, we begin by examining the definitions of hyperpartisan news detection found in the reviewed literature. We then delve into the various biases that relate to and constitute our investigated phenomenon. Additionally, we examine the diverse hyperpartisan sources and provide an overview of the application domains. Finally, we discuss the different strategies for labeling an entity as hyperpartisan.
The problematics of the definition
Definition of hyperpartisanship.
The term Hyperpartisanship (https://claremontreviewofbooks.com/hyperpartisanship/) is not attested in any dictionary. A widely accepted definition considers hyperpartisan news as exhibiting an extreme bias toward a particular political ideology or party [18]. This type of news reporting often presents information in a highly sensationalized and one-sided manner, prioritizing ideological loyalty over objective reporting and critical analysis. This behavior denotes an extreme political allegiance to a party, leading to intense disagreement with the opposing faction [18].
Vagueness of the definition and overlap with similar tasks.
The minimalist definition of hyperpartisanship is widely adopted by computer scientists, who tend to simplify models of social phenomena when applying automated detection [26]. Hyperpartisanship coexists within the broader category of junk news and shares characteristics with tasks such as political, ideological, and fake news detection [34]. Due to the vagueness of the definition, hyperpartisan headlines are often difficult to cluster within the misinformation set, and there is a lack of consensus on what precisely constitutes hyperpartisanship [35]. The perception of news as hyperpartisan can depend on the reader’s epistemic bubble [36]. Additionally, left and right extremisms do not show significant stylistic differences, making hyperpartisanship a subject-shifting concept [33]. While humans can assess the degree of hyperpartisanship in a given text thanks to their cultural and linguistic awareness, machines lack this capability.
Hyperpartisan news detection often overlaps or is confused with other disinformation tasks, such as fake news detection [19,37–40,94], and stance detection [41]. Specifically, hyperpartisanship might be conveyed through elements of fake news, aimed at propagating a specific agenda and manipulating readers to adopt a particular position on a given topic [40].
Traits of hyperpartisan news.
From a linguistic perspective, hyperpartisan articles exhibit a high count of adjectives and adverbs [42,43], extensive use of pronouns, and words of disgust [44]. These articles tend to feature longer paragraphs written in a sensationalist style, full of emotional language and rare terms [45]. Right-wing media, in particular, often employ hyperpartisan headlines, corroborating earlier findings [46,47]. Hyperpartisan news articles display hyper-polarized linguistic traits in their titles as well. In this, hyperpartisanship stands in opposition to balanced news, which is intended to report facts with a neutral tone and informative intention.
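The surface cues above (adjective and adverb density, pronoun use, sensationalist punctuation) can be approximated with simple counts. The minimal Python sketch below illustrates the idea; the pronoun list and the "-ly" suffix heuristic are our own simplifying assumptions for demonstration, not features taken from the cited studies.

```python
import re

# Illustrative word list and suffix heuristic -- simplifying assumptions,
# not the feature sets used in the studies cited above.
PRONOUNS = {"i", "we", "you", "he", "she", "they", "us", "them", "our", "their"}

def stylometric_features(text: str) -> dict:
    """Count simple surface cues associated with a hyperpartisan style."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = max(len(tokens), 1)
    return {
        "pronoun_ratio": sum(t in PRONOUNS for t in tokens) / n,
        "adverb_ratio": sum(t.endswith("ly") for t in tokens) / n,  # crude adverb proxy
        "exclamations": text.count("!"),
        "avg_token_len": sum(map(len, tokens)) / n,
    }

feats = stylometric_features("They utterly betrayed us! We will never forget!")
```

Feature vectors of this kind are what the handcrafted-feature classifiers discussed later in this review typically consume.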
Analogue biases.
Hyperpartisan news detection is a task in which certain textual features indicated above suggest that the writer is expressing an extremist, one-sided opinion. Moreover, the various degrees to which different typologies of bias occur contribute to making a text hyperpartisan. Several taxonomies have been proposed for junk news, such as [28,34]. We will use the bias categories collected by Oxford (https://catalogofbias.org/biases/spin-bias/) and [29] to discuss the founding biases of hyperpartisan articles.
Spin bias, or rhetoric bias [29], strictly concerns the linguistic structure of the article and its persuasive intent. It is the deliberate or inadvertent misrepresentation of outcomes, leading to unjustified indications of positive or negative results and potentially to misleading conclusions. Written language is the product of the conscious application of strategic discursive and persuasive patterns to engage readers. Words contribute to giving a particular meaning to the entire text, especially when they leverage an emotional lexicon with superlatives.
Ad hominem bias is a rhetorical strategy in which one moves away from the topic of the controversy by contesting not the statement of the interlocutor but the interlocutors themselves and their personal characteristics or traits [48]. This rhetorical strategy was frequently used in sophistry and is still widely used today in political discussions and journalistic controversies.
Presence bias or opinion statement involves the inclusion of subjective opinions within news articles, influencing readers’ perceptions. It occurs when factual reporting is mingled with subjective viewpoints or opinions [49]. In other words, it reflects the degree of agreement and statement sharing of an entity, i.e. users or publishers [50].
Ideological bias occurs when news reporting or content is influenced by a particular ideological stance or viewpoint, impacting the presentation and selection of news topics. Ideological detection differs from political bias detection because some ideologies can be shared even by opposing parties. Ideologies often contrast with each other, and it is precisely this contrast that makes them classifiable [51].
Framing bias involves presenting information to shape or influence people’s perceptions of an issue or event by emphasizing certain aspects while downplaying others [52,120]. In this case, the use of linguistic and rhetorical figures helps the author present the selected information partially. Framing therefore expresses a publisher’s leaning toward an ideology. Frames are tools that emphasize specific information while potentially favoring one aspect over another, with or without being slanted [53]. Framing bias manifests in the moral content and style used [21].
Coverage bias is not present in Table 2, since it is not a textual bias. It refers to the disproportionate attention to, or neglect of, topics or events in news reporting, leading to an imbalance in coverage across different subjects [54].
Political bias can easily be confused with ideological bias. Since a party is a combination of an ideology and a political leaning, this bias relates to the inclination of news media, information sources, or people to favor one political party’s agenda [55].
In this context, it is essential to avoid conflating the reification of the social phenomenon involving linguistic indicators with the entirety of the specified biases. Namely, not all categories of biases mentioned can be classified as hyperpartisan when they manifest. The linguistic element of exaggeration per se does not automatically denote hyperpartisanship; rather, it necessitates contextual positioning, such as aligning with a particular party or ideology. Simply adopting a stance is insufficient for categorization as hyperpartisan; it is the degree of exaggeration in that stance that holds significance. We propose some examples to illustrate this in Table 2.
Proposal for a definition.
By reviewing the various definitions collected in Table 3, several key observations emerge:
- the concept of hyperpartisanship intersects with, and shares features of, the typologies of media bias discussed in section Analogue biases. In the following list, the indexes correspond to the intrinsic characteristics of the hyperpartisan definitions reported in the “Characteristic” column of Table 3:
- spin bias;
- ad hominem bias;
- opinion statement bias;
- ideology bias;
- framing bias;
- coverage bias;
- political bias;
- it is commonly acknowledged that hyperpartisan news exhibits one-sided political bias, incorporating specific statements aligning with the ideology of a particular political party and/or agenda;
- the lack of a commonly shared definition across the various studies means that the characteristics used for detection are variable and mutually exclusive, undermining the integrity and scientific rigor of research in this field;
- while approaching this classification task, some studies, such as [56], lack methodological grounding because they do not introduce a definition of the phenomenon.
In light of these considerations, hyperpartisan detection must necessarily consider different variables simultaneously: positioning, the presence of a bias, and its degree of exaggeration. Does the current state of the art in detection methodologies do this? As mentioned earlier, no detection method that simultaneously considers the different types of biases and these three variables has been proposed. Various research works tend to focus individually on specific subsets of linguistic and content-based features, as outlined in the following sections.
Considering these elements, we propose the following definition to aid future research in addressing hyperpartisan news in Computer Science: Hyperpartisan news detection is the process of identifying news articles that exhibit extreme one-sidedness, characterized by a pronounced use of bias. The prefix "hyper-" highlights the exaggerated application of at least one specific type of bias—such as spin, ad hominem attacks, opinionated statements, ideological slants, framing, selective coverage, political leaning, or slant bias—to promote a particular ideological perspective. This strong ideological alignment is conveyed through amplified linguistic elements that reinforce one of these bias types within the text.
Where can hyperpartisanship be detected? Perspectives on the sources
In this section, we give a general overview of the main source typologies considered for detecting hyperpartisan news articles.
In light of the prevalence of hyperpartisan news dissemination online, the methodologies surveyed are applied specifically to online news outlets. Initially, when considering the domain of publishers, a linguistic approach can be applied to news analysis to detect hyperpartisanship. This approach involves studying textual information within articles using style-based or topic-based models [33,46,62,128]. Detection methods consider specific sections, such as the title [46,47], sentences [63], quotes in the body [42], or encompass several of these [46,58,64,65,94]. Alternatively, researchers have investigated the spread of hyperpartisanship starting from the entities involved in the writing and publishing process, such as a journalist’s [66] or a media outlet’s [18] leaning. Considering that publishers are entities often interconnected through economic and political bonds [67], they form a polarized network, which can be analyzed using metadata like external links [68–70,138]. While determining bias based on the source is feasible [66,71], an article from a biased media outlet may not always be hyperpartisan [49,72]. This issue was underscored by [72], which highlighted the inadequacy of the information source in determining an article’s hyperpartisanship. Their method generates a system capable of indicating bias scores in news and suggesting similar topics from different sources, encouraging readership of diverse perspectives and helping readers avoid extremely biased news.
Working with textual data enables the extraction of sentiment features [73]. For instance, [74] observed that sentiment analysis, applied to titles and sentences using TextBlob (https://textblob.readthedocs.io/en/dev/), improved evaluation metrics. Additionally, [75] noted that hyperpartisan articles tend to convey more aggressive and negative sentiments compared to other articles. Using VADER (https://github.com/cjhutto/vaderSentiment), [76] conducted experiments to analyze the contribution of sentiment features in indicating the author’s bias. Meanwhile, [77] approached hyperpartisan news detection by considering sentiment as a means to capture the polarity of articles.
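Lexicon-based tools such as TextBlob and VADER score polarity by aggregating valence values of individual words. The sketch below illustrates only the general idea with a toy, hand-made lexicon (an assumption for demonstration; real lexicons are far larger and human-rated, with additional rules for negation, punctuation, and capitalization):

```python
# Toy valence lexicon -- purely illustrative, not taken from TextBlob or VADER.
LEXICON = {"great": 1.0, "win": 0.5, "disaster": -1.0, "corrupt": -1.0, "attack": -0.5}

def polarity(text: str) -> float:
    """Average valence of known words; 0.0 when no lexicon word is present."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0
```

A score near -1.0 would flag the aggressive, negative sentiment that [75] associates with hyperpartisan articles.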
Moreover, [78] employed both textual and image features to detect hyperpartisanship. Their study revealed that automated methods outperformed humans and that incorporating additional information such as images and titles enhanced the accuracy of the model.
How is hyperpartisanship labeled?
Understanding the measurement of hyperpartisanship involves considering the diverse scales utilized. In the Social Sciences, a range of indexes and scales is employed for this purpose, leveraging features distinct from those used in automatic detection methodologies. For instance, polarization is calculated with the CSES Polarization Index (PI), a tool used to assess the distribution of political parties across the Left/Right ideological spectrum. Such metrics gauge ideological positioning and account for party sizes or vote shares, offering a comprehensive view of ideological stance and political influence [3]. Differently, automatic hyperpartisan detection relies on linguistic features. Some studies employ binary classification methods, utilizing labels such as hyperpartisan/mainstream (i.e., non-hyperpartisan) [18] or Left/Right [66,118]. However, such distinctions often overlook nuanced differences within diverse political leanings [33]. Few studies have extended their scope to include a more fine-grained polarization range [79]. For example, [80] approached hyperpartisan detection as a multi-class classification problem, employing both 7- and 5-point scales to define affiliations: 1-2.5 – far-left, 2.5-3.5 – center-left, 3.5-4.5 – center, 4.5-5.5 – center-right, 5.5-7 – far-right. Similarly, [73] used a comparable scale, and [81] sought to manage granularity by distinguishing between right, center, and left affiliations.
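As an illustration, the 7-point scale of [80] can be mapped to affiliation labels with a simple binning function. How exact boundary values (e.g., 2.5) are resolved is our own assumption, since the original convention is not specified here:

```python
def affiliation(score: float) -> str:
    """Map a 1-7 leaning score to the five bands reported in [80].
    Boundary handling (strict '<' at band edges) is an illustrative assumption."""
    if score < 2.5:
        return "far-left"
    if score < 3.5:
        return "center-left"
    if score < 4.5:
        return "center"
    if score <= 5.5:
        return "center-right"
    return "far-right"
```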
Approaches for automatic hyperpartisan news detection
The detection of hyperpartisan content encompasses a range of methodologies, varying from traditional non-deep learning approaches to cutting-edge deep learning techniques, as well as mixed learning algorithms. Non-deep learning methodologies often rely on traditional machine learning algorithms, leveraging handcrafted features and rule-based systems to identify linguistic patterns, stylistic markers, and network structures within textual and metadata sources. These approaches commonly include stylometric analysis and topic modeling methods to discern biased content. In contrast, deep learning methodologies harness the power of neural networks to automatically extract intricate features from raw data, enabling the identification of complex patterns and relationships in unstructured text or network data. These techniques, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers, excel in learning representations directly from the data.
Models discussion
In the following subsections, to account for the risk of bias, we grouped and discussed the mentioned studies by model architecture. We differentiate between Non-deep learning, Deep learning, and other methodologies adopted in the papers selected for the systematic review. The Deep Learning section covers both non-Transformer deep learning models and the Transformer family. In Tables 4, 5, 6 and 7, we categorize the best model for each paper, reporting its performance. When researchers compared multiple models on the same task using different datasets, as in the case of [118], we report only the best model's performance.
Non-deep learning methods.
In Table 4, we categorized papers using traditional machine learning approaches. The methodologies involved algorithms such as Support Vector Machines (SVM), Random Forest, and Logistic Regression, as well as Linear Regression, Naive Bayes, Linear SVC, KNN, XGBoost, and MaxEnt.
There are effective strategies built around the SVM model. For instance, [70] combined this algorithm with sentiment analysis via the National Research Council Canada Emotion Lexicon (NRC Emotion Lexicon) to analyze the emotional content of articles. Moreover, they extended the linguistic approach by applying Linguistic Inquiry and Word Count (LIWC), and also considered the articles' structure and metadata as features. [94] adopted n-grams, i.e., bi- and tri-grams, and dependency sub-trees that impacted performance. On the other hand, [83] experimented with several embeddings (Doc2Vec [95], GloVe [96], ELMo [97]) and found that "adding simple lexical and sentiment features hurts the performance". [43] studied the linguistic divergences between fake and hyperpartisan news employing an SVM. In this case, it emerged that hyperpartisan articles exhibit more sentences and a higher adjective count than unbiased news. When comparing extremely polarized articles against fake news, they noted that the former make heavy use of question/exclamation marks and adjectives. These sentence-level features delineate distinct linguistic patterns. [77] confirmed the strong potential of Logistic Regression, ranking second at SemEval-2019. They built representations with the Universal Sentence Encoder (USE) [98] and combined semantic and handcrafted features, paying attention to the degree of adjectives and subjectivity and distinguishing between two levels of polarity: sentence and article level. [37] used the Reuter Dataset for training and testing, combining ELMo embeddings with a logistic regression classifier as already done by [72] and [77], confirming the effectiveness of this method. [77] discovered that the most relevant features concern bias lexicon and polarity. [93] placed third at SemEval-2019 and found that article length was a distinctive trait of biased articles.
By working at the phrase level, they created a set of phrases to discern the different types of articles, taking care to remove n-grams containing publisher-style biases. [46] focused on news titles with a topic-based approach. They also built a dataset considering two distinct typologies of news titles, augmenting the granularity of the detection. The first category pertains to descriptions of confrontations or conflicts between opposing parties, suggesting a deeply polarized political climate. The second involves opinions expressing a biased, inflammatory, and aggressive stance against a policy, a political party, or a politician. [73] hypothesized an interdependence between factuality and political ideology bias and therefore introduced a multi-task learning setup with Copula Ordinal Regression (COR) [99]. They used the entire news outlet and considered distinct scales for measuring factuality (3-point scale) and political bias (7-point scale). Using Maximum Entropy modeling (MaxEnt), [40] bypassed linguistic features to build a model that generalizes as much as possible: they devised a document classification system that combines clustering features with simple local features, showcasing the effectiveness of distributional features from large in-domain unlabeled data. [85] approached the task using n-gram embeddings with article and title polarity, implementing an XGBoost model with all of these scalar features, but it performed poorly. They derived their stylometric-analysis methodology from [33], which utilized n-grams, readability scores and Part-of-Speech (PoS) tags followed by binary classification. Thanks to unmasking information, [33] simultaneously compared documents with opposite political leanings; in doing so, they investigated style variations depending on political orientation and compared them with topic-based bag-of-words models.
This methodology highlighted the limited usefulness of integrating corpus characteristics when performing a granular distinction among left, right and mainstream styles. Indeed, both political extremes show stylistic similarities that can produce confounding effects in the model. Hence, for style-based hyperpartisan detection, the categories should be limited to mainstream and hyperpartisan, without considering the specific leaning.
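The shallow stylistic cues that [43] found discriminative (sentence counts, question/exclamation marks) can be sketched as simple surface features. This is an illustrative reconstruction, not [43]'s actual feature extractor; adjective counts would additionally require a PoS tagger and are omitted.

```python
import re

def surface_features(article: str) -> dict:
    """Shallow stylistic cues of the kind [43] found discriminative.
    Adjective counts would need a PoS tagger and are omitted here."""
    sentences = [s for s in re.split(r"[.!?]+", article) if s.strip()]
    return {
        "n_sentences": len(sentences),
        "n_exclaim": article.count("!"),
        "n_question": article.count("?"),
        "exclaim_per_sentence": article.count("!") / max(len(sentences), 1),
    }

feats = surface_features("Unbelievable! Who voted for this? A total disgrace.")
print(feats["n_sentences"], feats["n_exclaim"], feats["n_question"])  # 3 1 1
```

Such scalar features are then typically concatenated with other representations and fed to a classifier such as an SVM.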
Furthermore, for a complete understanding of the approaches used in the literature, we summarized them in Table 5. Although ELMo, BERT, and Word2Vec embeddings were used as features of non-deep learning algorithms, Table 5 describes only the features used with the best models proposed in the non-deep learning approaches of Table 4. We distinguish between features (Morpho-syntactic, Lexicon, Semantic, Sentiment and Metadata) and approaches (style-based and topic-based).
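The bi-/tri-gram phrase features used by several non-deep learning approaches above ([93,94]), including the removal of n-grams carrying publisher-style bias, can be sketched as follows. The publisher-marker stop set is an illustrative assumption, not taken from the cited studies.

```python
from collections import Counter
from itertools import islice

PUBLISHER_TOKENS = {"breitbart", "huffpost"}  # illustrative publisher markers

def ngrams(tokens, n):
    """Yield consecutive n-grams from a token list."""
    return zip(*(islice(tokens, i, None) for i in range(n)))

def phrase_counts(text: str) -> Counter:
    """Count bi- and tri-grams, dropping those that leak publisher style."""
    tokens = text.lower().split()
    counts = Counter()
    for n in (2, 3):
        for gram in ngrams(tokens, n):
            if not PUBLISHER_TOKENS & set(gram):
                counts[" ".join(gram)] += 1
    return counts

counts = phrase_counts("the radical left the radical left agenda")
print(counts["the radical"])  # 2
```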
Deep learning methods.
In the following paragraphs, we analyze the deep learning methods adopted by diverse authors to solve the hyperpartisan detection task. In Table 6, we categorized papers using deep learning approaches. Lastly, at the end of the section, a comprehensive Table 7 collects the results, rounded to two decimals, reported by all the authors studied in our systematic review.
Deep learning: Non transformer-based architectures.
[42] employed a fusion of CNN and LSTM, utilizing quantitative linguistic features extracted through GloVe. In this way, they highlighted the crucial role of incorporating linguistic features alongside word-vector representations. Additionally, they built a meta-classifier to filter noisy data for application to the by-publisher dataset. [72] won SemEval-2019 Task 4 by combining rich morphological and contextual representations, averaging the three ELMo vectors per word. Their model was used in further studies by [105] for two pseudo-labeling frameworks: overlap-checking and meta-learning. Overlap-checking consists of adding data to help the model train, while meta-learning allows the model to be continually trained on a clean dataset and a pseudo dataset. This work inspired [107], who used a Hierarchical Attention Network (HAN) combined with ELMo embeddings. The HAN balances the information in its current state, deciding whether to update it and how much past information contributes to the new state. In this case, the information stems from the sentence level, confirming that richer article representations yield better performances. By encapsulating the articles' structure and connectors, and by paying attention to stylistic markers, handcrafted stylistic features and emotion lexicons, they reached the 2020 state of the art on the SemEval-2019 Task 4 dataset. [109] improved the standard HAN model by introducing Knowledge Encoding (KE) components. The HAN segment grasps word and sentence relationships within a news article, employing a structured hierarchy across three levels: word, sentence, and title. Meanwhile, the KE component integrates common and political knowledge associated with real-world entities into the prediction of the article's political stance. Since the model is not language-based, it could work with diverse languages beyond English.
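The core operation of a HAN at each level is attention pooling: lower-level vectors (words or sentences) are combined into a higher-level vector through learned weights. A toy, pure-Python sketch of the pooling step follows; in a trained HAN the scores come from a learned context vector, whereas here they are fixed illustrative values.

```python
import math

def softmax(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(sentence_vecs, scores):
    """Weighted sum of sentence vectors; weights come from attention scores.
    In a real HAN the scores are produced by a learned context vector."""
    weights = softmax(scores)
    dim = len(sentence_vecs[0])
    return [sum(w * v[d] for w, v in zip(weights, sentence_vecs))
            for d in range(dim)]

# The higher-scored first sentence dominates the document vector.
doc_vec = attention_pool([[1.0, 0.0], [0.0, 1.0]], scores=[2.0, 0.0])
print(doc_vec)
```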
[112] developed a pre-training framework encoding knowledge about entity mentions (masked tokens as frame indicators) and modeling propagation between users with a social information graph. They noted that models pre-trained on general sources and tasks have limited ability to focus on biased text segments. [113] introduced a voting system of LSTMs to build a controlled dataset to train another LSTM, demonstrating the importance of having a balanced and clean dataset to run experiments. Lastly, [120] built a Hierarchical-LSTM applied to subframes (n-grams) to tackle framing bias. In this paper, they introduced a pioneering framework for pretraining text models with signals derived from the abundant social and linguistic context available, encompassing elements such as entity mentions, news dissemination, and frame indicators.
Deep learning: Transformer-based architectures.
Regarding Transformer architectures, we observed massive utilization of BERTbase and BERTlarge. BERTbase is a smaller pre-trained BERT model, with fewer layers and parameters than BERTlarge, and comes in cased and uncased variants, depending on whether case information is preserved. [121] wanted to remove the bias introduced when modeling the medium. They observed that combining bias mitigation with triplet loss, Twitter bios and media-level representations increased model efficacy. [118] proposed a multi-task BERT-based model with contrastive learning to tackle framing bias in news articles. Using BERT with combinations of syntactic bigram counts and psycholinguistic features, [122] investigated the inference of political information and hyperpartisanship at the author and text level starting from linguistic data. [123] showed that fine-tuning the model yields better results. [124] introduced a semi-supervised framework trained using federated learning, in which algorithms are trained independently across diverse datasets. Furthermore, textual data are tagged to extract answers to wh-questions and temporal lexicon information. In the quest for precise detection and data denoising, the same author replicated this approach with variations in [125], employing an attention-based strategy to learn text representations, aiming to identify target expressions accurately while extracting pertinent contextual information. They generated a BERT attention embedding query utilizing lexicon expansion, content segmentation and temporal event analysis. Ultimately, this approach enhances the understanding of consecutive news articles within a temporal framework. [126] experimented with BERTbase and BERTlarge, feeding them embeddings of different lengths. They were interested in analyzing the parts of the articles, looking for a consistent level of hyperpartisanship, which they showed to exist.
By comparing BERT and ELMo models, [60] confirmed that input and embedding dimensions contributed positively to performance. [127] performed domain adaptation, showing its efficacy. [116] operated in a low-resource scenario with prompt-based learning, employing masked political phrase prediction and a frozen pre-trained language model, with RoBERTa (a robustly optimized BERT approach) as the backbone for their own model, MP-tuning. [117] focused on political ideology and stance detection, comparing triplets of documents on the same story to detect dissimilarities among them. They trained RoBERTa through continual learning. Meanwhile, [128] improved their model's performance with cross-domain contrastive learning; notably, they used GPT-2 to augment hyperpartisan textual data. Lastly, [129] addressed the detection of Persian hyperpartisan tweets by prompting GPT-3.5, a multilingual conversational generative LLM released in 2022, and open-weights models like Llama 2 [130]. [129] compared the capabilities of Large Language Models (LLMs) and BERT-based models like RoBERTa and ParsBERT to detect English and Persian tweets, providing instructions with different levels of specificity to the models. Despite the large size and extensive training of LLMs, fine-tuning ParsBERT and RoBERTa proved more efficient and practical for certain tasks.
Other methods.
Within the vast landscape of computational frameworks, certain algorithms defy classification within the traditional realms of deep learning or non-deep learning. This section explores these unique frameworks—sophisticated combinations of diverse models, labeling techniques and graph approaches—that operate beyond conventional categorizations.
[49] applied a framework for presentation bias, studying hyperpartisanship with a graph-based method. This three-step framework is structured as follows: collecting clusters of related articles on the same topic; applying Aspect-Based Sentiment Analysis (ABSA) with BERTbase to rate and classify fine-grained opinions in pairs of sentences; and computing the variation in bias between news sources within similar categories by contrasting the scores of matching pairs of articles. This comparison is done for every combination of news sources within these categories, and the differences in bias are averaged across all article groups, leading to a bias matrix. [69] proposed a Multi-View Document Attention Model (MVDAM) capable of simultaneously modeling title, structure and metadata (such as links) in order to estimate the political ideology of a news article. This framework, based on a Bayesian approach, utilizes different models to create the three-view representation: a convolutional neural network for the title, Node2Vec for the network and a HAN for the content. [131] worked mostly on manual features such as meta-topics, namely polarizing topics, using an end-to-end tool, the Gavagai Explorer, which performed poorly.
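The averaging step of [49]'s framework described above can be sketched as follows: for each pair of sources, differences in aspect-sentiment scores over matched article clusters are averaged into an entry of the bias matrix. The outlet names and ratings are made-up illustrative values, not data from [49].

```python
def bias_matrix(scores):
    """scores[source] = list of ABSA ratings for matched article clusters.
    Entry (i, j) is the mean score difference between sources i and j,
    mirroring the averaging step of [49]'s framework."""
    sources = sorted(scores)
    matrix = {}
    for a in sources:
        for b in sources:
            diffs = [x - y for x, y in zip(scores[a], scores[b])]
            matrix[(a, b)] = sum(diffs) / len(diffs)
    return matrix

# Illustrative ratings over three shared story clusters.
m = bias_matrix({"outlet_A": [0.8, 0.6, 0.7], "outlet_B": [0.2, 0.4, 0.3]})
print(m[("outlet_A", "outlet_B")])  # about 0.4
```

The resulting matrix is antisymmetric: swapping the two sources flips the sign of the averaged difference.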
[33] performed political orientation prediction and hyperpartisan classification using an unmasking technique with binary classifiers. For the first task, they found that left-wing news tends to be easily misclassified. This study noticed that individual political orientation is difficult to predict and that a style-based approach outperforms a content-based one. Moreover, they discovered subtle stylistic differences between hyperpartisan news of different political leanings. Using masking and transformer-based models, [62] showed that topic-based approaches lead to better results than style-based ones. In contrast, [132] comparatively examined BERT-based models and masking-based models, clarifying the strengths and constraints of varied approaches to bias detection and offering insights for future research; these models' contribution lies in their capacity to improve the precision, clarity, and interpretability of bias detection within political and social discussions. Furthermore, [133] investigated the use of large language models for automated stance detection in a lower-resource language, focusing on immigration. The authors annotated pro- and anti-immigration examples to compare performance across models, finding that GPT-3.5 matches supervised models' accuracy and thus offers a simpler alternative for hyperpartisan detection in media monitoring. Lastly, for the sake of exhaustiveness, we briefly cover other methods that do not focus on news textual features. For this reason, the following papers are not included in our final selection; however, they help the reader understand the variety of approaches to tackling hyperpartisanship. [134] maps linguistic divergence across the U.S. political spectrum using 1.5M social media posts (20M words) from 10k Twitter users.
By analyzing followers of 72 news accounts, it identifies variations in topics, sentiment, and lexical semantics. Methods combine data mining, lexicostatistics, machine learning, large language models, and human annotation. [135] analyzes language differences on Twitter among 5,373 Democratic and 5,386 Republican followers to explore psychological traits tied to political leanings. Using naturalistic data, it confirms hypotheses: liberals’ language shows uniqueness, swearing, anxiety, and emotions, while conservatives’ language reflects group identity, achievement, and religion, supporting prior research. To conclude, [136] introduced FAULTANA (FAULT-line Alignment Network Analysis), a computational method to identify societal fault lines and polarization drivers in online interactions. Using data from Birdwatch (Twitter) and DerStandard forums, it reveals two polarized groups aligned with political identities. FAULTANA tracks polarization over time, highlighting divisive issues and their impact. We present the best performances retrieved in the selected papers in Table 7.
Datasets
In the previous section, we provided an overview of methodologies employed for hyperpartisan detection. Effective models depend on high-quality data to function optimally. However, constructing a high-quality, well-balanced dataset can be both time-consuming and resource-intensive. This challenge is compounded by shifts in data policies across social networks since the Cambridge Analytica scandal, leading to potential difficulties or cost changes in obtaining data. Additionally, a trend has emerged among news sources of restricting access to data after its use in training models like GPT (https://www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/). Consequently, news sources have implemented paywalls and crawler restrictions (https://ilmanifesto.it/termini-e-condizioni), making it exceedingly challenging to gather suitable information for this and similar tasks.
Datasets presentation
To support upcoming studies on identifying hyperpartisan news and related tasks, we have created an extensive table: Table 8, which outlines key attributes of datasets relevant to hyperpartisan news detection. This table includes datasets referenced in different papers. Some are not primarily used for hyperpartisan detection but could be. It is important to note that when subsets or extended versions of earlier datasets exist, we consider them separate entities denoted by *. Additionally, datasets marked with ** signify merged collections. The column labeled Data indicates the number of articles gathered by the researchers.
To provide comprehensive insights into the table, we give a brief account of the provenance of the datasets marked with the symbols * and **. The Framing Triplet Dataset is a combination of the SemEval-2019 Task 4 dataset and [120]'s data. Furthermore, [120] expands the SemEval-2019 Task 4 dataset by incorporating articles collected from polarized sources and labeled through mediabiasfactcheck.com. Regarding BIGNEWS, collected by [117], it has two subsets: BIGNEWSBLN, a downsampled corpus maintaining an equal distribution of ideologies, and BIGNEWSALIGN, which clusters news stories from opposing sources on the same topic. In their research, [49] utilized a subset of All-the-news (https://www.kaggle.com/datasets/snapcrack/all-the-news). Furthermore, [33] worked with a subset of articles crawled from the URLs contained in the BuzzFeed-Webis Fake News Corpus collected by [140]. By cleaning [33]'s dataset, [62] obtained a new dataset. The same researchers created StereoImmigrants, a collection of Spanish news about immigrants, in [132].
Labeling and retrieval processes relied upon platforms like AllSides (https://allsides.com/), Media Bias/Fact Check (https://mediabiasfactcheck.com/) and PolitiFact (https://www.politifact.com/), both as ground truth for establishing the bias of an article and as sources from which to collect data. Indeed, on these platforms, experts assign news outlets a political orientation. AllSides.com is a media company that specializes in providing balanced news coverage by collecting and comparing news stories from various sources with different political leanings. The platform categorizes news articles based on their political bias—left, center, or right—and scores them according to their level of partisanship.
Since [121] noted that training models with big datasets reduces performance due to their noise, researchers started to prefer quality over size. Indeed, a deeper analysis of the SemEval-2019 dataset by [46] revealed several issues with this widely used ground-truth dataset: class imbalance, task-label misalignment, and distribution shift.
As we can see from Fig 4, there is an imbalanced distribution towards English data, leaving the context of minority languages understudied. Datasets are available at the respective links: [81] https://gitlab.com/checkthat_lab/clef2023-checkthat-lab/-/tree/main/task3?ref_type=heads, [109] https://github.com/yy-ko/khan-www23, [118] https://github.com/MSU-NLP-CSS/CLoSE_framing, [80] https://github.com/axenov/politik-news, [132] https://github.com/jjsjunquera/StereoImmigrants, [120] https://github.com/ShamikRoy/Subframe-Prediction, [141] https://urlis.net/zon9n8wr, [58] https://drive.google.com/drive/folders/1IyaKYeDkl7ubuabTI65G0nSBfxQNdeTr, [142] https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ULHLCB, [18] https://zenodo.org/records/1489920, [143] www.ccs.neu.edu/home/luwang/data.html, [33] https://github.com/BuzzFeedNews/2016-10-facebook-fact-check, Reuter http://about.reuters.com/researchandstandards/corpus/, [144] https://github.com/RWalecki/copula_ordinal_regression, [145] https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZCXSKG, BuzzFeed https://github.com/BuzzFeedNews/2016-10-facebook-fact-check/blob/master/data/facebook-fact-check.csv.
Potential limitations
The studies collected hold significant value, yet several inherent limitations in the datasets could influence the comprehensiveness and applicability of future findings. Firstly, the absence of a distinct dataset designed to differentiate between hyperpartisan and partisan news poses a fundamental challenge, potentially impacting classification accuracy. Secondly, as Fig 4 shows, the predominant focus of the datasets is on English news articles. This raises concerns about minority languages and their respective democratic contexts, possibly skewing the representation and applicability of Anglo-American papers' conclusions to different socio-cultural environments. This discrepancy might lead to situations where certain democracies lack the necessary tools and datasets in their native language, hindering their ability to develop analytical tools as effective as those of over-represented democracies. Additionally, the phenomenon of hyperpartisanship varies significantly between countries due to the variety of party systems and different cultural backgrounds [6]. Consequently, models trained on linguistically non-representative data may be compromised in their ability to detect hyperpartisanship in under-represented democracies, thereby impacting their success rates. Furthermore, issues pertaining to dataset maintenance, such as broken URLs, may impede replicability and accessibility for future research endeavors [33], and temporal lexicon constraints might hinder capturing shifts in textual patterns, tones, and context, affecting the accuracy of temporal analysis [46]. We highlight that cross-lingual comparison of hyperpartisan traits has never been studied from a computational approach. Thus, it is not possible to determine whether the online environment flattens cultural-linguistic traits pertinent to hyperpartisanship independently of the country and its political system.
Another consideration regards the limited availability of data over time: paywalls and copyright restrictions pose a significant barrier, potentially restricting the depth and breadth of future analysis within certain timeframes. Lastly, despite their popularity and the good results researchers have achieved with them elsewhere, autoregressive models have, as far as we know, not been used for this task.
Conclusions and future works
In synthesizing insights from 81 studies, our systematic review illuminates the value of existing research in understanding hyperpartisan news. We summarized all the papers included in the systematic review in Table 9. With the support of this table, we are going to reply to the initial research questions.
RQ1: Does a categorization for hyperpartisan news detection methods exist? Currently, there is no widely adopted comprehensive categorization system in the literature. The field still lacks standardized mathematical models for quantifying textual exaggerations that define hyperpartisan content. One key contribution of this systematic review is that it represents the first attempt to systematize news-based approaches while also enhancing the traditional PRISMA methodology by integrating ResearchRabbit during the "Identification of studies via other methods" phase. ResearchRabbit facilitated a systematic, data-driven expansion of our literature pool by visualizing clusters based on citation linkages. This clustering approach provided a structured method for identifying and selecting relevant studies by uncovering both direct citation relationships and keyword-based topic similarities. As a result, the tool contributed to a more comprehensive and cohesive expansion of the selected literature base. Furthermore, we proposed a specific definition of the studied phenomenon that can be applied in Computer Social Science and Computer Science.
RQ2: Is hyperpartisan news detection a stand-alone or overlapping task? The complexity of hyperpartisan news detection hints at an overlapping task encompassing various forms of media bias, suggesting a shift towards multi-label detection for nuanced representations. Research shows that models with fine-grained label sets outperform binary classifications, yet the majority of studies use simplified, binary categories.
RQ3: What are the proposed solutions using textual data? Research commonly applies text-based methods, such as Natural Language Processing (NLP) techniques, to detect hyperpartisan content by identifying linguistic patterns of exaggeration and emotional tone. In terms of labels, fine-grained labels show improved model accuracy in detecting diverse biases, but in this case the annotation required is costly.
RQ4: Does the task keep up with new NLP technologies like autoregressive models? To date, the adoption of advanced autoregressive models in hyperpartisan news detection is limited, revealing a critical gap. This gap underscores a need to explore these models, which could improve detection accuracy with state-of-the-art language understanding.
RQ5: What are the results of the developed models? Since the release of BERT, this model architecture—and particularly its variants, such as RoBERTa—has achieved state-of-the-art performance in a wide range of classification tasks.
RQ6: What datasets are used for this task? How are they structured? Have they been updated to cover the latest political global and regional trends? Datasets predominantly comprise English-language news articles, which risks skewing results when applying models to non-English contexts. Limited representation of minority languages restricts model generalization and hampers analysis of unique democratic and socio-political dynamics. In addition, dataset maintenance issues (e.g., broken URLs) hinder replicability, and paywalls or copyright constraints restrict access to time-sensitive data, impacting longitudinal research.
RQ7: How can the current state of research on hyperpartisan detection be characterized in diverse languages and countries? The absence of linguistically diverse datasets is a significant limitation, especially in minority and underrepresented cultures. This restricts the field’s capacity to develop effective hyperpartisan detection models for varied linguistic environments. Current datasets’ Anglo-American focus may limit models’ efficacy when applied to global democracies with different political and cultural contexts, exacerbating bias and misinformation issues in these areas. Moreover, the lack of cross-lingual studies leaves the impact of online environments on cultural-linguistic variations in hyperpartisan traits unexplored.
In conclusion, while existing research provides insights into hyperpartisan news, limitations in dataset diversity, language inclusion, and methodology highlight the need for more robust, globally representative resources. Future research could benefit from exploring autoregressive models and expanding cross-lingual analysis for a broader understanding of hyperpartisanship in diverse political systems and cultural contexts.
References
- 1. Falkenbach M, Bekker M, Greer SL. Do parties make a difference? A review of partisan effects on health and the welfare state. Eur J Public Health 2020;30(4):673–82. pmid:31334750
- 2. Ellger F. The mobilizing effect of party system polarization. Evidence from Europe. Comparat Politic Stud 2023;57(8):1310–38.
- 3. Dalton RJ. Modeling ideological polarization in democratic party systems. Elect Stud. 2021;72:102346.
- 4. Lorenz-Spreen P, Oswald L, Lewandowsky S, Hertwig R. A systematic review of worldwide causal and correlational evidence on digital media and democracy. Nat Hum Behav 2023;7(1):74–101. pmid:36344657
- 5. Guess AM, Barberá P, Munzert S, Yang J. The consequences of online partisan media. Proc Natl Acad Sci U S A 2021;118(14):e2013464118. pmid:33782116
- 6. Dalton RJ. Party identification and nonpartisanship. Int Encyclop Soc Behav Sci. 2015;6–6.
- 7. McCoy J, Somer M. Toward a theory of pernicious polarization and how it harms democracies: comparative evidence and possible remedies. Ann Am Acad Politic Soc Sci 2018;681(1):234–71.
- 8. Holt K, Ustad Figenschou T, Frischlich L. Key dimensions of alternative news media. Digit Journalism 2019;7(7):860–9.
- 9. Tucker J, Guess A, Barbera P, Vaccari C, Siegel A, Sanovich S, et al. Social media, political polarization, and political disinformation: a review of the scientific literature. SSRN Electron J. 2018.
- 10. Anthonio T. Robust document representations for hyperpartisan and fake news detection. 2019.
- 11. Bartels LM. Partisanship in the trump era. J Politics. 2018;80:1483–94.
- 12. Hawdon J, Ranganathan S, Leman S, Bookhultz S, Mitra T. Social media use, political polarization, and social capital: is social media tearing the U.S. apart? In: Meiselwitz G, editor. Social computing and social media. Design, ethics, user behavior, and social network analysis. Springer; 2020. p. 243–60. https://doi.org/10.1007/978-3-030-49570-1_17
- 13. Bawden D, Robinson L. Curating the infosphere: Luciano Floridi’s philosophy of information as the foundation for library and information science. J Documentation 2018;74(1):2–17.
- 14. Nannini L, Bonel E, Bassi D, Maggini MJ. Beyond phase-in: assessing impacts on disinformation of the EU Digital Services Act. AI Ethics. 2024.
- 15. European Commission. A multi-dimensional approach to disinformation – report of the independent High Level Group on fake news and online disinformation. Publications Office; 2018.
- 16. European Parliament Council. Proposal for a regulation of the European parliament and of the council on a single market for digital services (digital services act) and amending directive 2000/31/EC. 2020.
- 17. Bondielli A, Marcelloni F. A survey on fake news and rumour detection techniques. Inf Sci. 2019;497:38–55.
- 18. Kiesel J, Mestre M, Shukla R, Vincent E, Adineh P, Corney D, et al. SemEval-2019 task 4: hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation. 2019.
- 19. Sousa-Silva R. Fighting the fake: a forensic linguistic analysis to fake news detection. Int J Semiot Law 2022;35(6):2409–33. pmid:35505837
- 20. Dykstra A. Critical reading of online news commentary headlines: stylistic and pragmatic aspects. Topics Linguist 2019;20(2):90–105.
- 21. Xu WW, Sang Y, Kim C. What drives hyper-partisan news sharing: exploring the role of source, style, and content. Digital Journalism 2020;8(4):486–505.
- 22. Pescetelli N, Barkoczi D, Cebrian M. Bots influence opinion dynamics without direct human-bot interaction: the mediating role of recommender systems. Appl Netw Sci 2022;7(1):46.
- 23. Pitoura E, Tsaparas P, Flouris G, Fundulaki I, Papadakos P, Abiteboul S, et al. On measuring bias in online information. arXiv preprint 2017
- 24. Nakov P, Sencar HT, An J, Kwak H. A survey on predicting the factuality and the bias of news media. arXiv preprint 2021
- 25. Kondamudi MR, Sahoo SR, Chouhan L, Yadav N. A comprehensive survey of fake news in social networks: attributes, features, and detection approaches. J King Saud Univ – Comput Inf Sci 2023;35(6):101571.
- 26. Hamborg F, Donnay K, Gipp B. Automated identification of media bias in news articles: an interdisciplinary literature review. Int J Digit Libr 2018;20(4):391–415.
- 27. Medeiros FDC, Braga RB. Fake news detection in social media: a systematic review. In: XVI Brazilian Symposium on Information Systems; 2020. p. 1–8. https://doi.org/10.1145/3411564.3411648
- 28. Kapantai E, Christopoulou A, Berberidis C, Peristeras V. A systematic literature review on disinformation: toward a unified taxonomical framework. New Media Soc 2020;23(5):1301–26.
- 29. Rodrigo-Ginés F-J, Carrillo-de-Albornoz J, Plaza L. A systematic review on media bias detection: what is media bias, how it is expressed, and how to detect it. Expert Syst Appl. 2024;237:121641.
- 30. Kitchenham B, Charters S. Guidelines for performing systematic literature reviews in software engineering. Technical Report. 2007.
- 31. Moher D, Altman DG, Liberati A, Tetzlaff J. PRISMA statement. Epidemiology. 2011;22(1):128; author reply 128. https://doi.org/10.1097/EDE.0b013e3181fe7825 pmid:21150360
- 32. Gusenbauer M, Haddaway NR. Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of Google Scholar, PubMed, and 26 other resources. Res Synth Methods 2020;11(2):181–217. pmid:31614060
- 33. Potthast M, Kiesel J, Reinartz K, Bevendorff J, Stein B. A stylometric inquiry into hyperpartisan and fake news. In: Gurevych I, Miyao Y, editors. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); 2018. p. 231–40.
- 34. Zannettou S, Sirivianos M, Blackburn J, Kourtellis N. The web of false information. J Data Inf Quality 2019;11(3):1–37.
- 35. Altay S, Berriche M, Heuer H, Farkas J, Rathje S. A survey of expert views on misinformation: definitions, determinants, solutions, and future of the field. Harvard Kennedy School Misinf Rev 2023;4:1–34.
- 36. Ross Arguedas A, Robertson C, Fletcher R, Nielsen R. Echo chambers, filter bubbles, and polarisation: a literature review. Technical report; 2022.
- 37. Garg S, Sharma DK. Role of ELMo embedding in detecting fake news on social media. In: 2022 11th International Conference on System Modeling & Advancement in Research Trends (SMART); 2022. p. 57.
- 38. Ross RM, Rand DG, Pennycook G. Beyond “fake news”: analytic thinking and the detection of false and hyperpartisan news headlines. Judgm Decis Mak 2021;16(2):484–504.
- 39. Mourão RR, Robertson CT. Fake news as discursive integration: an analysis of sites that publish false, misleading, hyperpartisan and sensational information. Journalism Stud 2019;20(14):2077–95.
- 40. Agerri R. Doris Martin at SemEval-2019 Task 4: hyperpartisan news detection with generic semi-supervised features. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 944.
- 41. Bourgonje P, Moreno Schneider J, Rehm G. From clickbait to fake news detection: an approach based on detecting the stance of headlines to articles. In: Proceedings of the 2017 EMNLP Workshop: Natural Language Processing Meets Journalism; 2017. p. 84.
- 42. Pérez-Almendros C, Espinosa-Anke L, Schockaert S. Cardiff University at SemEval-2019 Task 4: linguistic features for hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 929.
- 43. Dumitru VC, Rebedea T. Fake and hyper-partisan news identification. In: Moldoveanu A, Dix AJ, editors. 16th International Conference on Human-Computer Interaction, RoCHI 2019; 2019 Oct 17–18; Bucharest, Romania. Matrix Rom; 2019, p. 60–7.
- 44. Knauth J. Orwellian-times at SemEval-2019 Task 4: a stylistic and content-based classifier. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 976.
- 45. Sengupta S, Pedersen T. Duluth at SemEval-2019 Task 4: the Pioquinto Manterola hyperpartisan news detector. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 949.
- 46. Lyu H, Pan J, Wang Z, Luo J. Computational assessment of hyperpartisanship in news titles. ICWSM. 2024;18:999–1012.
- 47. Amason E, Palanker J, Shen MC, Medero J. Harvey Mudd College at SemEval-2019 Task 4: the D.X. Beaumont hyperpartisan news detector. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics; 2019. p. 967–70. https://doi.org/10.18653/v1/s19-2166
- 48. Walton D. Ad hominem arguments. University of Alabama Press; 1998.
- 49. Tran M. How biased are American media outlets? A framework for presentation bias regression. In: 2020 IEEE International Conference on Big Data (Big Data); 2020. p. 4359.
- 50. Anand B, Di Tella R, Galetovic A. Information or opinion? Media bias as product differentiation. Econ Manag Strategy 2007;16(3):635–82.
- 51. Sharma A, Kaur N, Sen A, Seth A. Ideology detection in the Indian mass media. In: 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM); 2020. p. 627.
- 52. Baumer E, Elovic E, Qin Y, Polletta F, Gay G. Testing and comparing computational approaches for identifying the language of framing in political news. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; 2015. p. 1472.
- 53. Kong H-K, Liu Z, Karahalios K. Frames and slants in titles of visualizations on controversial topics. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems; 2018. p. 1.
- 54. Leeson PT, Coyne CJ. Manipulating the media. SSRN. 2011.
- 55. Honeycutt N, Jussim L. Political bias in the social sciences: a critical, theoretical, and empirical review. In: Ideological and political bias in psychology: nature, scope, and solutions. 2023. p. 97–146. https://doi.org/10.1007/978-3-031-29148-7_5
- 56. Patankar A, Bose J, Khanna H. A bias aware news recommendation system. In: 2019 IEEE 13th International Conference on Semantic Computing (ICSC); 2019. p. 232–8. https://doi.org/10.1109/icosc.2019.8665610
- 57. Barnidge M, Peacock C. A third wave of selective exposure research? The challenges posed by hyperpartisan news on social media. MaC 2019;7(3):4–7.
- 58. Gangula RRR, Duggenpudi SR, Mamidi R. Detecting political bias in news articles using headline attention. In: Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP; 2019. p. 77.
- 59. Pierri F, Artoni A, Ceri S. HoaxItaly: a collection of Italian disinformation and fact-checking stories shared on Twitter in 2019. arXiv preprint 2020
- 60. Huang GKW, Lee JC. Hyperpartisan news and articles detection using BERT and ELMo. In: 2019 International Conference on Computer and Drone Applications (IConDA); 2019. p. 29.
- 61. Dumitru VC, Rebedea T. Topic-based models with fact checking for fake news identification. In: RoCHI – International Conference on Human-Computer Interaction; 2021. p. 182.
- 62. Sanchez-Junquera J, Rosso P, Montes M, Ponzetto S. Masking and transformer-based models for hyperpartisanship detection in news. 2021. p. 1244–51. https://doi.org/10.26615/978-954-452-072-4_140
- 63. Jeong Lim S, Jatowt A, Yoshikawa M. Understanding characteristics of biased sentences in news articles. In: CIKM Workshops; 2018.
- 64. Naredla NR, Adedoyin FF. Detection of hyperpartisan news articles using natural language processing technique. Int J Inf Manag Data Insights 2022;2(1):100064.
- 65. Papadopoulou O, Kordopatis-Zilos G, Zampoglou M, Papadopoulos S, Kompatsiaris Y. Brenda Starr at SemEval-2019 Task 4: hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 924–8. https://doi.org/10.18653/v1/s19-2157
- 66. Alzhrani K. Political ideology detection of news articles using deep neural networks. Intell Automat Soft Comput 2022;33(1):483–500.
- 67. Herman ES, Chomsky N. Manufacturing consent: the political economy of the mass media. 1994.
- 68. Hrckova A, Moro R, Srba I, Bielikova M. Quantitative and qualitative analysis of linking patterns of mainstream and partisan online news media in Central Europe. OIR 2021;46(5):954–73.
- 69. Kulkarni V, Ye J, Skiena S, Wang WY. Multi-view models for political ideology detection of news articles. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
- 70. Alabdulkarim A, Alhindi T. Spider-Jerusalem at SemEval-2019 Task 4: hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 985.
- 71. Alzhrani K. Ideology detection of personalized political news coverage. In: Proceedings of the 2020 4th International Conference on Compute and Data Analysis; 2020. p. 10.
- 72. Jiang Y, Petrak J, Song X, Bontcheva K, Maynard D. Team Bertha von Suttner at SemEval-2019 Task 4: hyperpartisan news detection using ELMo sentence representation convolutional network. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 840–4. https://doi.org/10.18653/v1/s19-2146
- 73. Baly R, Karadzhov G, Saleh A, Glass J, Nakov P. Multi-task ordinal regression for jointly predicting the trustworthiness and the leading political ideology of news media. In: Burstein J, Doran C, Solorio T, editors; 2019. p. 2109.
- 74. Chen C, Park C, Dwyer J, Medero J. Harvey Mudd College at SemEval-2019 Task 4: the Carl Kolchak hyperpartisan news detector. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 957.
- 75. Palić N, Vladika J, Čubelić D, Lovrenčić I, Buljan M, Šnajder J. TakeLab at SemEval-2019 Task 4: hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 995.
- 76. Anthonio T, Kloppenburg L. Team Kermit-the-frog at SemEval-2019 Task 4: bias detection through sentiment analysis and simple linguistic features. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1016.
- 77. Srivastava V, Gupta A, Prakash D, Sahoo SK, R.R R, Kim YH. Vernon-fenwick at SemEval-2019 Task 4: hyperpartisan news detection using lexical and semantic features. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics; 2019. p. 1078–82. https://doi.org/10.18653/v1/s19-2189
- 78. Spezzano F, Shrestha A, Fails JA, Stone BW. That’s fake news! Reliability of news when provided title, image, source bias & full article. Proc ACM Hum-Comput Interact. 2021;5(CSCW1):1–19. https://doi.org/10.1145/3449183
- 79. Sridharan ASN. An automated news bias classifier using Caenorhabditis elegans inspired recursive feedback network architecture. arXiv preprint 2022
- 80. Aksenov D, Bourgonje P, Zaczynska K, Ostendorff M, Moreno-Schneider J, Rehm G. Fine-grained classification of political bias in German news: a data set and initial experiments. In: Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021); 2021. p. 121.
- 81. Azizov D, Nakov P, Liang S. Frank at CheckThat! 2023: detecting the political bias of news articles and news media. In: Conference and Labs of the Evaluation Forum; 2023.
- 82. Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intell Syst Their Appl 1998;13(4):18–28.
- 83. Yeh C-L, Loni B, Schuth A. Tom Jumbo-Grumbo at SemEval-2019 Task 4: hyperpartisan news detection with GloVe vectors and SVM. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1067.
- 84. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016. p. 785–94. https://doi.org/10.1145/2939672.2939785
- 85. Gupta V, Kaur Jolly BL, Kaur R, Chakraborty T. Clark Kent at SemEval-2019 Task 4: stylometric insights into hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 934.
- 86. Merow C, Smith MJ, Silander JA Jr. A practical guide to MaxEnt for modeling species’ distributions: what it does, and why inputs and settings matter. Ecography 2013;36(10):1058–69.
- 87. Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
- 88. Chakravartula N, Indurthi V, Syed B. Fermi at SemEval-2019 Task 4: the Sarah-Jane-Smith hyperpartisan news detector. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics; 2019.
- 89. Cruz A, Rocha G, Sousa-Silva R, Lopes Cardoso H. Team Fernando-Pessa at SemEval-2019 Task 4: back to basics in hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 999.
- 90. Stevanoski B, Gievska S. Team Ned Leeds at SemEval-2019 Task 4: exploring language indicators of hyperpartisan reporting. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1026.
- 91. Saleh A, Baly R, Barrón-Cedeño A, Da San Martino G, Mohtarami M, Nakov P, et al. Team QCRI-MIT at SemEval-2019 Task 4: propaganda analysis meets hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1041.
- 92. Bestgen Y. Tintin at SemEval-2019 Task 4: detecting hyperpartisan news article with only simple tokens. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics; 2019. p. 1062–6. https://doi.org/10.18653/v1/s19-2186
- 93. Hanawa K, Sasaki S, Ouchi H, Suzuki J, Inui K. The Sally Smedley hyperpartisan news detector at SemEval-2019 Task 4. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1057.
- 94. Nguyen D-V, Dang T, Nguyen N. NLP@UIT at SemEval-2019 Task 4: the paparazzo hyperpartisan news detector. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 971.
- 95. Le QV, Mikolov T. Distributed representations of sentences and documents. arXiv preprint 2014
- 96. Pennington J, Socher R, Manning C. GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014. p. 1532–43.
- 97. Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, et al. Deep contextualized word representations. arXiv preprint 2018
- 98. Cer D, Yang Y, Kong S, Hua N, Limtiaco N, John RS, et al. Universal Sentence Encoder. arXiv preprint 2018
- 99.
Walecki R, Rudovic O, Pavlovic V, Pantic M. Copula ordinal regression for joint estimation of facial action unit intensity. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. p. 4902–10. https://doi.org/10.1109/cvpr.2016.530
- 100. Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. 1986.
- 101. Dorogush AV, Gulin A, Gusev G, Kazeev N, Prokhorenkova LO, Vorobev A. Fighting biases with dynamic boosting. arXiv preprint 2017
- 102. Huang GKW, Lee JC. Hyperpartisan news classification with ELMo and bias feature. J Inf Sci Eng. 2021;37.
- 103. Färber M, Qurdina A, Ahmedi L. Team Peter Brinkmann at SemEval-2019 Task 4: detecting biased news articles using convolutional neural networks. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1032.
- 104. Zehe A, Hettinger L, Ernst S, Hauptmann C, Hotho A. Team Xenophilius Lovegood at SemEval-2019 Task 4: hyperpartisanship classification using convolutional neural networks. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1047.
- 105. Ruan Q, Mac Namee B, Dong R. Bias bubbles: using semi-supervised learning to measure how many biased news articles are around us. In: AICS. 2021. p. 153–64.
- 106. Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E. Hierarchical attention networks for document classification. In: Knight K, Nenkova A, Rambow O, editors. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. San Diego, California: Association for Computational Linguistics; 2016. p. 1480–9. https://doi.org/10.18653/v1/n16-1174
- 107. Cruz AF, Rocha G, Cardoso HL. On document representations for detection of biased news articles. In: Proceedings of the 35th Annual ACM Symposium on Applied Computing; 2020. p. 892.
- 108. Moreno JG, Pitarch Y, Pinel-Sauvagnat K, Hubert G. Rouletabille at SemEval-2019 Task 4: neural network baseline for identification of hyperpartisan publishers. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 981.
- 109. Ko Y, Ryu S, Han S, Jeon Y, Kim J, Park S, et al. KHAN: knowledge-aware hierarchical attention networks for accurate political stance prediction. In: Proceedings of the ACM Web Conference 2023 (WWW 2023). Austin, TX: ACM; 2023. p. 1572–83. https://doi.org/10.1145/3543507.3583300
- 110. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80. pmid:9377276
- 111. Isbister T, Johansson F. Dick-Preston and Morbo at SemEval-2019 Task 4: transfer learning for hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 939.
- 112. Li C, Goldwasser D. Using social and linguistic information to adapt pretrained representations for political perspective identification. In: Zong C, Xia F, Li W, Navigli R, editors; 2021. p. 4569.
- 113. Cramerus R, Scheffler T. Team Kit Kittredge at SemEval-2019 Task 4: LSTM voting system. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1021.
- 114. Zhang C, Rajendran A, Abdul-Mageed M. UBC-NLP at SemEval-2019 Task 4: hyperpartisan news detection with attention-based Bi-LSTMs. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1072.
- 115. Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, et al. RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint 2019
- 116. Kim K-M, Lee M, Won H-S, Kim M-J, Kim Y, Lee S. Multi-stage prompt tuning for political perspective detection in low-resource settings. Appl Sci 2023;13(10):6252.
- 117. Liu Y, Zhang XF, Wegsman D, Beauchamp N, Wang L. POLITICS: pretraining with same-story article comparison for ideology prediction and stance detection. In: Carpuat M, de Marneffe MC, Meza Ruiz IV, editors; 2022. p. 1354.
- 118. Kim MY, Johnson KM. CLoSE: contrastive learning of subframe embeddings for political bias classification of news media. In: Calzolari N, Huang CR, Kim H, Pustejovsky J, Wanner L, Choi KS, et al., editors. Proceedings of the 29th International Conference on Computational Linguistics, International Committee on Computational Linguistics. 2022. p. 2780–2793.
- 119. Devlin J, Chang M, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint 2018
- 120. Roy S, Goldwasser D. Weakly supervised learning of nuanced frames for analyzing polarization in news media. In: Webber B, Cohn T, He Y, Liu Y, editors; 2020. p. 7698.
- 121. Baly R, Da San Martino G, Glass J, Nakov P. We can detect your bias: predicting the political ideology of news articles. In: Webber B, Cohn T, He Y, Liu Y, editors; 2020. p. 4982.
- 122. Da Silva SC, Paraboni I. Politically-oriented information inference from text. JUCS 2023;29(6):569–94.
- 123. Shaprin D, Da San Martino G, Barrón-Cedeño A, Nakov P. Team Jack Ryder at SemEval-2019 Task 4: using BERT representations for detecting hyperpartisan news. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1012.
- 124. Ahmed U, Lin JC, Srivastava G. Temporal positional lexicon expansion for federated learning based on hyperpatism detection. Exp Syst 2022;40(5):e13183.
- 125. Ahmed U, Lin JC-W, Srivastava G. Semisupervised federated learning for temporal news hyperpatism detection. IEEE Trans Comput Soc Syst 2023;10(4):1758–69.
- 126. Drissi M, Sandoval Segura P, Ojha V, Medero J. Harvey Mudd College at SemEval-2019 Task 4: the Clint Buchanan hyperpartisan news detector. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 962.
- 127. Mutlu O, Can OA, Dayanik E. Team Howard Beale at SemEval-2019 Task 4: hyperpartisan news detection with BERT. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics; 2019. p. 1007–11. https://doi.org/10.18653/v1/s19-2175
- 128. Smădu RA, Echim SV, Cercel DC, Marin I, Pop F. From fake to hyperpartisan news detection using domain adaptation. arXiv preprint 2023
- 129. Omidi Shayegan S, Nejadgholi I, Pelrine K, Yu H, Levy S, Yang Z, et al. An evaluation of language models for hyperpartisan ideology detection in Persian Twitter. In: Ojha AK, Ahmadi S, Cinkova S, Fransen T, Liu CH, McCrae JP, editors; 2024. p. 51. https://aclanthology.org/2024.eurali-1.8
- 130. Touvron H, Martin L, Stone K, Albert P, Almahairi A, Babaei Y, et al. Llama 2: open foundation and fine-tuned chat models. arXiv preprint 2023
- 131. Afsarmanesh N, Karlgren J, Sumbler P, Viereckel N. Team Harry Friberg at SemEval-2019 Task 4: identifying hyperpartisan news through editorially defined metatopics. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1004.
- 132. Sanchez-Junquera J. On the detection of political and social bias. 2021.
- 133. Mets M, Karjus A, Ibrus I, Schich M. Automated stance detection in complex topics and small languages: the challenging case of immigration in polarizing news media. PLoS One 2024;19(4):e0302380. pmid:38669237
- 134. Karjus A, Cuskley C. Evolving linguistic divergence on polarizing social media. Humanit Soc Sci Commun 2024;11(1):422.
- 135. Sylwester K, Purver M. Twitter language use reflects psychological differences between democrats and republicans. PLoS One 2015;10(9):e0137422. pmid:26375581
- 136. Fraxanet E, Pellert M, Schweighofer S, Gómez V, Garcia D. Unpacking polarization: antagonism and alignment in signed networks of online interaction. PNAS Nexus. 2024;3(12):pgae276. https://doi.org/10.1093/pnasnexus/pgae276 pmid:39703230
- 137. Lee N, Liu Z, Fung P. Team yeon-zi at SemEval-2019 Task 4: hyperpartisan news detection by de-noising weakly-labeled data. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1052.
- 138. Joo Y, Hwang I. Steve Martin at SemEval-2019 Task 4: ensemble learning model for detecting hyperpartisan news. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics; 2019. p. 990–4. https://doi.org/10.18653/v1/s19-2171
- 139. Ning Z, Lin Y, Zhong R. Team Peter-Parker at SemEval-2019 Task 4: BERT-based method in hyperpartisan news detection. In: Proceedings of the 13th International Workshop on Semantic Evaluation; 2019. p. 1037.
- 140. Silverman C, Strapagiel L, Shaban H, Hall E, Singer-Vine J. Hyperpartisan Facebook pages are publishing false and misleading information at an alarming rate. BuzzFeed News. 2016.
- 141. Gebhard L, Hamborg F. The POLUSA dataset: 0.9m political news articles balanced by time and outlet popularity. 2020.
- 142. Norregaard J, Horne BD, Adali S. NELA-GT-2018: a large multi-labelled news dataset for the study of misinformation in news articles. arXiv preprint 2019
- 143. Fan L, White M, Sharma E, Su R, Choubey PK, Huang R, et al. In plain sight: media bias through the lens of factual reporting. In: Inui K, Jiang J, Ng V, Wan X, editors; 2019. p. 6343.
- 144. Baly R, Karadzhov G, Alexandrov D, Glass J, Nakov P. Predicting factuality of reporting and bias of news media sources. In: Riloff E, Chiang D, Hockenmaier J, Tsujii J, editors; 2018. p. 3528.
- 145. Horne BD, Dron W, Khedr S, Adali S. Sampling the news producers: a large news and feature data set for the study of the complex media landscape. arXiv preprint 2018
- 146. Szwoch J, Staszkow M, Rzepka R, Araki K. Creation of Polish online news corpus for political polarization studies. In: Afli H, Alam M, Bouamor H, Casagran CB, Boland C, Ghannay S, editors. Proceedings of the LREC 2022 workshop on Natural Language Processing for Political Sciences. European Language Resources Association; 2022. p. 86–90.
- 147. Lim S, Jatowt A, Yoshikawa M. Creating a dataset for fine-grained bias detection in news articles. 2020.
- 148. Li C, Goldwasser D. Encoding social information with graph convolutional networks for political perspective detection in news media. In: Korhonen A, Traum D, Marquez L, editors. In: Korhonen A, Traum D, Marquez L, editors; 2019. p. 2594–2594.