
Investigating the role of AI explanations in lay individuals’ comprehension of radiology reports: A metacognition lens

  • Yegin Genc ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Writing – original draft, Writing – review & editing

    ygenc@pace.edu

    Affiliation Seidenberg School of Computer Science and Information Systems, Pace University, New York, New York, United States of America

  • Mehmet Eren Ahsen,

    Roles Conceptualization, Data curation, Formal analysis, Writing – review & editing

    Affiliations Department of Business Administration, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America, University of Illinois at Urbana-Champaign, Carle Illinois School of Medicine, Urbana, Illinois, United States of America

  • Zhan Zhang

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Writing – review & editing

    Affiliation Seidenberg School of Computer Science and Information Systems, Pace University, New York, New York, United States of America

Abstract

While there has been extensive research on techniques for explainable artificial intelligence (XAI) to enhance AI recommendations, the metacognitive processes involved in interacting with AI explanations remain underexplored. This study examines how AI explanations impact human decision-making by leveraging cognitive mechanisms that evaluate the accuracy of AI recommendations. We conducted a large-scale experiment (N = 4,302) on Amazon Mechanical Turk (AMT), in which participants classified radiology reports as normal or abnormal. Participants were randomly assigned to three groups: a) no AI input (control group), b) AI prediction only, and c) AI prediction with explanation. Our results indicate that AI explanations enhanced task performance, and that explanations are more effective when AI prediction confidence is high or users’ self-confidence is low. We conclude by discussing the implications of our findings.

1. Introduction

As many AI systems operate as “black boxes,” users face challenges in determining whether they can trust AI recommendations and use them to make decisions [1]. When complex AI systems supplement decisions that play an essential role in human lives, such as those in healthcare, understanding how these complex AI models generate their recommendations (i.e., model predictions) becomes critical to AI system users [2]. Therefore, there has been a push from the community to increase the focus on explainable AI (XAI) when humans and AI agents work together [3–5]. The rise of the XAI domain aims to promote trust in and acceptance of complex AI models that often yield opaque predictions [3,6,7]. Prior work has demonstrated that providing explanations of the inner workings of AI systems can augment decision-making by giving system users more information from the model than a simple binary prediction [8–10].

While the effect of explanations in mitigating the limitations of the imperfect information provided by AI agents is significant, recent studies have revealed a higher-order relationship between humans and AI agents when they work together [11,12]. For example, recent studies have shown that human agents engage in metacognitive processes when interacting with AI systems [11,12]. Metacognition involves the awareness and understanding of one’s own cognitive processes [13]. These metacognitive processes manifest in two separate yet interconnected facets: system-monitoring cognition (judgment calls by human agents based on their own preferences while assessing the validity of AI predictions) [14] and self-monitoring cognition (self-aware thought processes about one’s own decision-making strategies and reasoning when making decisions based on the AI system’s outputs) [11,12,15]. XAI has predominantly been conceptualized within the scope of system-monitoring cognition because explanations primarily aim to elucidate the accuracy of AI recommendations [16]. We argue that XAI should be considered within a broader metacognitive context that encompasses self-monitoring cognition, for three primary reasons outlined below.

First, decision-makers have been found to follow different metacognitive patterns depending on whether AI advice confirms or disconfirms their initial framing (i.e., their initial intuition about the decision). Specifically, AI advice perceived as accurate (i.e., confirming the initial framing) is less likely to give rise to metacognitive conflicts that trigger belief conformity and system-justification processes [17]. Therefore, decision-makers will likely engage with explanations differently when they perceive the AI recommendation as accurate vs. inaccurate.

Second, research on augmented decisions with AI found that the two metacognitive processes (system and self-monitoring) “can condition each other, and their dynamic interplay influences decision outcomes” [11]. For example, when decision-makers engage in active consideration, they compare their decision process and machine reasoning using the data elements that support AI advice. As a result, they may refine their perception of the accuracy of their original intuition and the perceived system accuracy. In this view, explanations that aim to support AI advice may influence the refinement of the confidence in the initial framing and the perceived system accuracy.

Finally, metacognitive monitoring constitutes a sensing activity that allows decision-makers to regulate the degree of deliberate, systematic reasoning (as opposed to quick, heuristic reasoning) and the amount of information sought [13,18]. Balancing systematic and heuristic reasoning influences how individuals cope with the cognitive challenges posed by potentially incorrect advice [19,20]. Explanations that reduce the cognitive challenges of detecting potentially inaccurate advice are likely to influence this balancing activity.

Driven by the research gap in investigating self-monitoring cognition in human-AI collaboration, we conducted an online experiment with 4,302 participants using the Mechanical Turk platform [21]. In this experiment, the participants were asked to identify whether the anatomical structure described in a radiology report excerpt represents a normal or an abnormal diagnosis. During the task, the participants in experimental groups were also provided with an AI-based prediction of the anatomical structure. Participants were randomly assigned to conditions where explanations for the AI-based predictions were added or not. Our results validate the impact of explanations on the metacognitive assessment of the human agents’ abilities, namely self-monitoring processes. Our results also show that human agents are more likely to correct an incorrect initial intuition (initial framing) when AI recommendations are explained, regardless of the accuracy of the AI recommendation. This suggests that even an incorrect AI recommendation, when explained, can help human agents recover from incorrect initial framing. Further, we find that the overall positive effects of explanations are more significant when AI prediction confidence is high and human self-confidence is low.

Our findings suggest that the applicability of explanations within AI systems extends beyond existing assumptions, notably trust and user acceptance. AI explanations can facilitate users’ introspection and awareness of their cognitive biases. Furthermore, our findings hold significance in the realm of healthcare, highlighting the importance of AI explanations and self-monitoring frameworks in mitigating human errors. By diminishing the prevalence of such errors, the quality of care can be significantly enhanced, potentially resulting in more precise diagnoses and more efficacious treatments.

The following sections provide the theoretical background for explainable AI and metacognition and describe the theoretical framework (Section 2). We then proceed to test the model experimentally (Section 3). Finally, we report major findings (Section 4) and conclude with a discussion of the implications of our results for theory and practice (Section 5).

2. Theoretical background and hypotheses

2.1. XAI: Explainable AI

As AI systems become increasingly complex and embedded in decision-making environments, the need to make their logic transparent and trustworthy has grown substantially [22–24]. This increased demand for transparency led to both technical studies that design and develop explainable components for AI (see Hassija et al. [7] and Arrieta et al. [25] for extensive reviews of current methods) and theoretical studies that aim to ground the technical work with theories of explainable AI [16,26–28]. Others have focused on reviewing developments in XAI research by creating taxonomies based on the techniques used to generate explanations and the scope of explanations [7,29–31]. Similar studies also reviewed explainable AI, particularly in the medical domain [32–35].

Recent studies emphasize that explanations are not only technical tools but also socio-cognitive mechanisms that influence trust, learning, and user engagement in human-AI systems [36,37]. Research also highlights the importance of designing for interpretability, particularly in dynamic digital contexts where algorithmic decisions are intertwined with user-generated content, such as ranking schemes [38]. These developments suggest a growing awareness that effective explainability depends on more than post-hoc transparency—it involves aligning system design with cognitive and organizational expectations [39,40].

From a technical perspective, explainability can be achieved by designing additional tools that enhance prediction interpretation and justification [41]. These tools commonly provide post-hoc explanations that clarify the inner functioning of complex models. These post-hoc explanations can be local explanations that focus on the less complicated solution subspaces that are relevant for the whole model [42,43]; explanations by example that extract data examples appropriate to the generated results [44–46]; explanations by simplification that are derived from a newly-trained model optimized for reducing predictive model complexity with minimal compromise from prediction accuracies [47,48]; and feature relevance explanations that aim to clarify the inner functioning of a model by computing a relevance score for its managed variables [49–51].
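As a concrete illustration of combining the local and feature-relevance families, a local surrogate explanation can be sketched as follows. This is a minimal, hypothetical example (the black-box scoring function, kernel width, and sample counts are all assumptions for illustration), not the implementation of any method cited above:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical black-box model: a logistic scoring function of two features.
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))

rng = np.random.default_rng(0)
x0 = np.array([0.5, 0.5])  # the instance whose prediction we want to explain

# Perturb the instance locally and query the black box for each perturbation.
Z = x0 + rng.normal(scale=0.1, size=(500, 2))
y = black_box(Z)

# Weight perturbed samples by proximity to x0 (simple RBF kernel).
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.05)

# Fit an interpretable linear surrogate on the weighted local neighborhood;
# its coefficients serve as local feature-relevance scores.
surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=w)
print(surrogate.coef_)  # feature 1 pushes the score up, feature 2 pulls it down
```

The surrogate’s positive coefficient for the first feature and negative coefficient for the second mirror the signs in the hypothetical black box, which is the sense in which a local explanation captures a less complicated solution subspace around a single prediction.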

A second stream of technical research in explainable AI focuses on designing inherently more interpretable models [25]. Intelligent systems can be more understandable with the aid of knowledge bases during the design process [52]. Current approaches aim to align AI model features with the features of a knowledge base [53] that are simultaneously constructed during the model building [54]. Another approach to building an inherently interpretable method is to jointly train a less complex but interpretable model with a more complex and accurate model in an ensemble-learning fashion [55,56]. The more transparent of the two models are then used to generate explanations.

While research on AI explanations has been growing, theoretical efforts in explainable AI often overlook the rich cognitive context in which humans interact with explanations in general [16,57]. Studies of human-AI interactions reveal the complex nature of cognitive processes, particularly metacognitive processes, when AI recommendations are involved in decision-making [11,12]. Understanding how these cognitive processes unfold when explanations are added to human-AI interactions can significantly enhance research in explainable AI. For example, recent studies highlight the importance of designing AI systems that can account for both ethical and cognitive concerns of end users, particularly in high-stakes environments [58], and emphasize how explanations can shape trust maintenance and restoration strategies, and long-term performance outcomes in AI-supported decisions [59,60].

2.2. Human metacognition in the research of human-computer interaction

Scholars have increasingly examined the higher-level cognitive processes that guide human judgment in interactions with technology [61,62], as these processes have been reported to be integral to the complex interplay between computing systems and information processing [63]. These higher-level cognitive processes, specifically metacognitions, monitor the progress of our decision-making activities, as well as the effort and time spent on these activities [13]. For example, metacognitions enable decision-makers to dynamically balance between quick, heuristic reasoning and deliberate, systematic reasoning [19,20]. In the context of human-computer interaction, these processes may improve our use of technology, e.g., improve team effectiveness in software development [64], or lead to systematic biases when using technology, e.g., confirmation bias in online reviews [65].

Recently, metacognitions have gained prominence in theorizing about the cognitive challenges associated with augmented decisions that involve AI advice [11,66,67]. With augmented decision-making involving the judgment calls that human agents make based on their preferences while assessing the validity of AI predictions (system-monitoring metacognitions) [11], the decision performance of human agents is also influenced by their assessment of their own abilities (self-monitoring metacognitions) [15]. Furthermore, current perspectives on the integration of humans and machines conceptualize these hybrids as sociotechnical systems, wherein both machines and human agents engage in collaborative learning within metahuman systems [12]. This perspective highlights the advanced learning capabilities linked to shifting goals and assumptions about the nature of learning itself, particularly when agents with differing cognitive architectures, namely machines and humans, are more closely integrated [12]. Seidel et al. [68] mention a feedback loop in this learning process where human agents learn about the “mental models” – i.e., decision models – embedded in the machine agents.

In addition, metacognitive error-monitoring processes are also interrelated with decision-makers’ self-confidence [69] and the credibility [70] or persuasiveness [71] of the recommendations. For example, studies show that both overconfidence and low self-confidence can adversely affect metacognitive processes [72,73]. Similarly, the persuasive communications literature suggests that any message, such as an AI explanation, can influence human cognition by affecting the content or the validity of thoughts [74]. The interaction between human metacognition and AI feedback is further complicated by the presentation and context of recommendations, as demonstrated in experiments involving confidence interfaces and trust calibration on digital platforms [e.g., 75].

These dynamics that surface when human and machine agents interact suggest a complex phenomenon with potentially multiple underlying mechanisms of human judgment and explanations about the mental models of machine agents. In this context, understanding how explanations that aim to support system-monitoring processes can also influence self-monitoring processes can provide further insights into the uses of explanations in human-AI interaction.

2.3. Theoretical framework and hypotheses development

As described earlier, more recent views on advice-taking suggest both heuristic and systematic reasoning are involved [13], and together they affect the cognitive processes that go beyond evaluating the machine advice (system-monitoring) and include evaluating decision-makers’ own reasoning (self-monitoring) [11,12,15]. This new view focuses on the interplay between the intuitive assessment of the decision task (i.e., the initial framing) and the additional data provided in the form of machine advice as theorized in the Naturalistic Decision-Making (NDM) framework by Klein [76]. Following suit, we posit that just like the machine predictions, their explanations can also be involved in metacognitive processes that go beyond system-monitoring cognition because these explanations serve as additional data points about the task. Therefore, the interplay between the intuitive assessment of the task and the explanation of machine predictions, just like the prediction itself, is also relevant to the outcome of augmented decision-making. To study the said effects of explanations, we follow the NDM framework and focus on the metacognitive effects of explanations for augmented decision-making.

NDM suggests “[w]hen there are cues that a[n] [initial] judgment could be wrong, [the decision-maker replaces] intuition by careful reasoning” [76]. Therefore, we consider how explanations might serve as “additional cues” that suggest the intuitive judgment might be wrong. Particularly, we first consider the role explanations may play when machine predictions are in conflict with the initial framing. As Jussupow et al. [11] suggested, when machine predictions disconfirm the initial prediction, decision-makers need to overcome the “conflict between their beliefs in their own competence and their beliefs in the AI capabilities.” And when explanations are present, they will likely influence the decision-makers’ beliefs about the AI capabilities. Second, explanations can also challenge the initial framing if they reveal issues with the machine predictions that confirm the initial framing. Particularly, when explanations reveal issues with the machine algorithms that are hard to catch otherwise [23], decision-makers are likely to consider that the agreeing initial framing might also be wrong. This, as the NDM suggests, is likely to trigger careful reasoning regarding the task [76].

In our study, we explore the role of explanations in the context of the accuracy of the machine predictions, because the explanations will either support the machine predictions or reveal irregularities in the machine reasoning, depending on whether the machine provides a correct prediction or not. While the technical efficiency or fidelity of the explanation mechanism itself (i.e., how accurately the explanation reflects the internal reasoning of the model) is also important, we assume that the explanation mechanisms used in the study provide sufficiently interpretable and reasonable outputs to be meaningful to participants. We consider system-level comparisons between explanation methods (e.g., SHAP, LIME, rule-based) beyond the scope of this study, which focuses instead on the behavioral and metacognitive effects of explanations as perceived by human users. Subsequently, to consider conflicts with the initial framing, we also account for the accuracy of the initial framing.

Finally, considering the persuasive communications literature, the persuasiveness of AI predictions and their explanations is likely to regulate their impact on decision-makers. Drawing from this literature, we examine two variables – self-confidence (i.e., quality of thoughts) and prediction confidence (i.e., source credibility) [77] – to study the influence of the persuasiveness of explanations. Against this backdrop, we develop our formal hypotheses within our framework, as described below.

Self-monitoring hypotheses.

When machine predictions trigger cognitive conflicts by contradicting the initial framing (i.e., making recommendations that contradict the decision-maker’s intuition), we posit that explanations can serve as additional support to validate the machine predictions [25]. The extent to which explanations support the machine’s contradictory predictions will influence decision-makers’ tendency to reconsider their initial framing and accept the machine predictions [11]. Simply put, conflicting predictions with explanations that make a stronger case for the predictions can be more influential than predictions without explanations. As the self-monitoring processes compare the incorrect initial framing to the conflicting yet correct AI recommendation, decision-makers are more likely to switch from their initial framing because the contradicting AI recommendation is strengthened by the explanation (i.e., the strong evidence effect). However, this strengthening effect is less likely to appear when the AI recommendation is incorrect, because explanations of incorrect AI recommendations are likely to reveal the weaknesses of the predictive model. Therefore, explanations are less likely to support the recommendations when the conflict is between a correct initial framing and an incorrect AI recommendation. Consequently, we expect this strong evidence effect of explanations to be relevant when the cognitive conflict is between an incorrect initial framing and a correct AI recommendation.

H1A: When initial framing is incorrect, and AI recommendations are correct, explanations help final decisions move towards the accurate AI recommendations and thus away from the initial framing.

Explanations might also help users realize that their initial framing is incorrect when they raise doubt about the validity of machine predictions that agree with the initial framing. Explanations that expose weaknesses in the machine reasoning undermine the validity of the prediction outcome, and a decision-maker whose initial framing agrees with the machine predictions may reconsider the validity of their own reasoning after recognizing that weakness. A similar phenomenon has been noted in the “weak argument” literature: when an argument is supported by weak evidence, it is less likely to gain support than when no evidence is presented at all [78]. Fernbach et al. [78] argue this “weak evidence effect” arises in part because alternative causes are weighted more heavily when humans reason diagnostically rather than predictively [79]. Accordingly, as the self-monitoring processes compare the incorrect initial framing to the confirming (yet inaccurate) AI recommendation, decision-makers are less likely to stay with their initial framing because the confirming AI recommendation is weakened by the explanation. However, this weakening effect is less likely to appear when the initial framing agrees with a correct AI recommendation, because explanations are likely to strengthen the position of AI recommendations when the predictions are accurate. Therefore, when the agreement is between a correct initial framing and a correct AI recommendation, the explanations are less likely to weaken the recommendation. Consequently, we expect this weak evidence effect to be relevant when both the initial framing and the AI recommendation are incorrect.

H1B: When both initial framing and AI recommendations are incorrect, explanations help final decisions move away from the incorrect AI recommendations and thus away from the incorrect initial framing.

As discussed above, the possible effects of explanations may differ based on the accuracy of the initial framing by the decision-maker and the machine prediction. Explanations serve as “weak evidence” against confirming inaccurate AI as they reveal the inaccuracies of the machine predictions and “strong evidence” against disconfirming accurate AI as they support the accurate machine predictions. Taken together, we see that the positive effect of explanations is more prevalent when the initial framing is incorrect.

H1: When controlled for recommendation accuracy, the positive impact of explanations will be more prevalent when the initial framing is incorrect.

Table 1 below summarizes our framework in terms of the impact of explanations in the context of the accuracy of the initial framing and the machine predictions.

Table 1. Conceptualizing the effect of explanations based on the initial framing and the machine prediction accuracies.

https://doi.org/10.1371/journal.pone.0321342.t001

Persuasiveness hypotheses.

AI explanations primarily aim to persuade users that the predictions derived from machine-learning-based algorithms are justifiable [80]. The literature on persuasive communications suggests that any message, such as an AI explanation, can influence human cognition by affecting the content or the validity of thoughts [74], and that the quality and source credibility of these messages will determine how persuasive they are [77]. The source of explanations is typically the predictive model they aim to explain, since explanations are generated by isolating the contributions of individual features to the predictions [25]. We can therefore reason that the persuasiveness of an AI explanation will be related to the credibility of the predictions it explains, and we can expect the explanations of predictions with high prediction confidence to be more persuasive.

H2: The positive impact of explanations is more substantial when the prediction confidence of the machine is high.

Further, based on the well-established relationship between attitude confidence and associated behavior [81], we expect confidence in one’s thoughts to influence how one utilizes external information (i.e., AI explanations) in decision-making. If decision-makers have lower self-confidence in their intuition (i.e., initial framing), they are more likely to seek further information or additional cues, i.e., explanations [82], and vice versa [83,84]. Therefore, we expect to observe the positive effects of explanations in cases where humans are less confident about their judgment, because they are more likely to utilize explanations.

H3: The positive impact of explanations is more substantial when self-confidence in human judgment is low.

In summary, our study aims to investigate the impact of AI explanations and confidence on human decisions through metacognitive processes. Fig 1 summarizes our conceptual framework.

Fig 1. The theoretical model.

A1 and A2 are adopted from Jussupow et al. (2021). This study investigates the impact of explanations (H1) and confidence (H2, H3) on the final decision accuracy in the context of previously identified metacognitive processes.

https://doi.org/10.1371/journal.pone.0321342.g001

3. Methods

3.1. Experimental design

Following calls to use online experimentation to study behavioral aspects of technology in the information systems field [85], we conduct an online experiment to test our hypotheses. Participant recruitment and data collection took place between 04/15/2019 and 05/24/2019. Participants were presented with an online consent form and provided consent by clicking a button that read “I certify that I read and understand the informed consent”; their consent is recorded in the dataset. The experiment involves identifying whether the anatomical structure described in a radiology report excerpt is abnormal. The report excerpts are from the Audiological and Genetic Database (https://audgendb.github.io). For example, we ask participants to label the normality of the following sentence: “No evidence of erosion in the incus and malleus.” To provide context, we also provide the sentences preceding and following it in the original report.

Our task aims to examine the effects of AI predictions on human decisions and how explanations can modulate such effects. To that end, we create treatment groups based on the presence of an explanation for the predictions (Explained vs. Not Explained). We also include a control group in which participants do not see an AI prediction during the task. As a result, participants are randomly assigned to one of three conditions: 1) AI predictions with explanations; 2) AI predictions without any explanations; 3) no AI prediction (control group). An example of a task under each condition is shown in Fig 2.

thumbnail
Fig 2. An example of the decision task used in the online experiment in three conditions.

https://doi.org/10.1371/journal.pone.0321342.g002

3.2. Building an AI recommendation model

Data.

The dataset prepared by Cocos et al. [86] contains 276 sentences that were labeled by subject matter experts and 727 sentences whose labels were crowdsourced. We use the expert labels as the “gold standard” to evaluate the accuracy of the AI predictions and participants’ decision outcomes. We use the 727 crowd-annotated sentences to train the two AI models that provide the predictions for the experiments, using Least Squares Support Vector Machine classifiers with linear kernels.

Recommendation models.

To build AI-based recommendations, we train Support Vector Machine (SVM)-based predictive models. SVMs perform better than intrinsically transparent models and, like state-of-the-art deep learning algorithms, require explanations that are created post hoc (see Arrieta et al. [25] for a detailed classification of models based on their transparency). SVMs keep a fair balance between performance and interpretability and have been applied in various fields [87–89]. We avoid complex models such as deep neural networks due to the limited training data size [90]. In particular, we created two models: a “high-accuracy model” trained on the full set of 727 samples, and a “low-accuracy model” trained on a randomly selected 20% subset of the same data. The performance of the two models was evaluated on the experimental dataset containing 276 expert-labeled sentences, with classification accuracy calculated as the proportion of correct predictions, given by the formula:

Accuracy = (number of correct predictions) / (total number of predictions)

Based on this measure, the “low-accuracy model” achieved an accuracy of 0.702, while the “high-accuracy model” achieved a substantially higher accuracy of 0.891, confirming the intended disparity in predictive performance. To assess the statistical significance of this difference, we conducted DeLong’s test for two correlated ROC curves, which yielded a p-value indicating a highly significant difference between the two models.
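The two-model setup described above can be sketched as follows. Because the report sentences are not reproduced here, synthetic stand-in data are used, and the sample counts, subset size, and parameter choices are illustrative assumptions rather than the study’s actual pipeline:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Stand-in for 727 crowd-annotated training rows + 276 expert-labeled test rows.
X, y = make_classification(n_samples=1003, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=727, test_size=276, random_state=0)

# "High-accuracy model": trained on the full training set.
high = LinearSVC(max_iter=10000).fit(X_train, y_train)

# "Low-accuracy model": trained on a random 20% subset of the same data.
rng = np.random.default_rng(0)
subset = rng.choice(len(X_train), size=len(X_train) // 5, replace=False)
low = LinearSVC(max_iter=10000).fit(X_train[subset], y_train[subset])

# Accuracy = (number of correct predictions) / (total number of predictions).
acc_high = accuracy_score(y_test, high.predict(X_test))
acc_low = accuracy_score(y_test, low.predict(X_test))
print(acc_high, acc_low)
```

Training the second model on a fifth of the data is what induces the intended accuracy gap between the two recommendation sources.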

Following prior work (e.g., [28,42,69]), we generated local explanations for each individual prediction made by the models, using the corresponding input features as the basis for these explanations. These local explanations aim to provide interpretable insights into how specific features influenced a given prediction, thereby offering a more transparent view of model behavior at the instance level. By focusing on the feature contributions for each prediction, we were able to assess the model’s decision-making process in a nuanced and context-specific manner. The next section provides a detailed overview of the methodology used to construct these explanations, including the tools, algorithms, and interpretability techniques employed.

Explanations.

Our annotation tasks are sentences, so we need to construct features from them to build a machine-learning model. We extract unigrams and bigrams from the annotation tasks, a featurization approach frequently used in the natural language processing literature [91]. We use the Python Scikit-learn package to create the SVM [92]. Training the SVM model yields a ranking of words and bigrams (features) in order of their significance, and we use this ranked list to construct explanations. As a result, tasks in the explained condition include an additional sentence detailing which part of the to-be-annotated sentence is most significantly associated with the AI prediction. For example, for the sentence “The external canal is clear, and the tympanic membrane is normal,” the AI model predicts (correctly) that the observation should be annotated as “normal.” In the explained condition, we include the following text: “The suggestion is based on the following word(s): ‘normal’ and ‘clear,’ which AI associates with a normal diagnosis.” In this case, the words normal and clear are the most significant features in the AI model’s annotation prediction. The difference between the explained and not-explained conditions can be seen in Fig 2.

3.3. Participants and data collection process

We conduct our experiments on AMT, an online crowdsourcing platform that facilitates worker recruitment, human intelligence task (HIT) completion, and payment. To partially account for individual differences in cognitive and technical aptitude, we applied strict participant quality filters and included self-reported measures of technical literacy and AI literacy as control variables in our analysis, helping to mitigate variance related to familiarity with technology and decision-support systems. In particular, we invite English-speaking MTurk workers who reside in the United States, have completed at least 1,000 previous HITs, and have at least a 95% approval rate. We collect annotations in batches, which are deployed in sequence. Each batch contains tasks designed for one condition. We populate the HITs for each batch by randomly selecting a sentence from the current collection of unlabeled sentences. Table 2 summarizes the demographic characteristics of all participants. We embed two quality assurance mechanisms in each annotation task to minimize the impact of crowd workers’ varying expertise. First, we require three unique workers to label each sentence. Second, each worker can complete only one task, avoiding carryover effects.

Participant recruitment and data collection took place between 04/15/2019 and 05/24/2019. At the beginning, participants fill out an online consent form, acknowledging that they read the form and understood the demands of participating in the study. Upon giving consent, participants complete a demographics questionnaire on a survey-like interface, where we also provide instructions and examples to familiarize them with the annotation task. Following the instructions, we ask participants to classify the highlighted sentences as describing a normal or an abnormal observation of the specified component (Fig 2). At the end of the study, participants complete a survey indicating their domain expertise, trust, and knowledge of AI technology (as described in Section 3.4.5).

3.4. Measurements and variables

Accuracy measures.

Our accuracy evaluations for both humans and the AI are based on expert responses that serve as the gold standard. For each task, we collected responses from at least three participants and determined task-level accuracy by comparing the aggregated response (i.e., determined by majority vote) to the gold-standard expert response. We define initial framing accuracy as the majority-vote accuracy in the control condition, where participants completed the task without access to an AI prediction. In contrast, final decision accuracy refers to the majority-vote accuracy for the same task when participants were shown an AI prediction.
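The majority-vote aggregation above can be sketched in a few lines; the variable names and example responses below are illustrative, not the study's data.

```python
# Minimal sketch of task-level accuracy by majority vote, assuming
# three responses per task. Data below are hypothetical.
from collections import Counter

def majority_vote(responses):
    """Return the most common label among the responses."""
    return Counter(responses).most_common(1)[0][0]

def task_accuracy(tasks, gold):
    """Fraction of tasks whose majority-vote label matches the
    expert gold-standard label."""
    correct = sum(majority_vote(r) == g for r, g in zip(tasks, gold))
    return correct / len(tasks)

# Example: two tasks, three annotators each.
tasks = [["normal", "normal", "abnormal"],
         ["abnormal", "normal", "normal"]]
gold = ["normal", "abnormal"]
print(task_accuracy(tasks, gold))  # 0.5
```

With three annotators and a binary label, a majority always exists, so no tie-breaking rule is needed.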

Explanation effect.

The presence of an explanation is one of the treatments in our experimental setup. In “explained” conditions, we provide an explanatory sentence that suggests what part of the to-be-annotated sentence is associated with the AI prediction. Our explained condition measure is 1 when an explanation is present and 0 otherwise.

Machine confidence.

Machine confidence is defined as the AI model’s confidence level, computed as a function of the posterior probabilities of individual predictions and the model’s overall accuracy to capture both instance-level certainty and overall reliability.
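The text does not report the exact functional form combining the two quantities; the sketch below shows one plausible construction under that caveat: Platt-scaled posteriors from a linear SVM, multiplied by a hypothetical held-out accuracy. Both the toy data and the product combination are assumptions for illustration.

```python
# Illustrative sketch: instance-level posterior via Platt scaling,
# combined with overall model accuracy. The product below is an
# assumed combination, not the paper's reported formula.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

# Toy one-dimensional data; class 1 sits at larger feature values.
X = np.array([[0.0], [0.1], [0.05], [0.9], [1.0], [1.1]])
y = np.array([0, 0, 0, 1, 1, 1])

# A Platt-scaled SVM yields posterior probabilities per prediction.
clf = CalibratedClassifierCV(LinearSVC(), cv=2).fit(X, y)
posterior = clf.predict_proba([[1.05]])[0].max()

overall_accuracy = 0.9  # hypothetical held-out accuracy
confidence = posterior * overall_accuracy  # illustrative combination
```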

Self-confidence.

We ask participants to self-report confidence in their answers on a Likert scale from 1 (not at all confident) to 5 (very confident) (mean = 4.00, sd = 0.97). We measure self-confidence for each task based on participants’ confidence in their answers when AI predictions are not available: for each task, we average the self-confidence scores of the three participants in the control condition (since participants did not see any AI prediction in this condition).

Control variables.

To control for model quality in terms of overall accuracy, we build several SVM models by varying the size of the training set. For each SVM model, we use five-fold cross-validation to fine-tune its parameters. During the experiments, participants were randomly assigned to tasks that showed AI predictions either from the “high-accuracy model,” trained on all the training data (727 samples), or from the “low-accuracy model,” trained on 20% of the training data. Each task was completed by different participants with both high-accuracy and low-accuracy model predictions.
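This manipulation can be sketched as follows. The synthetic dataset and the parameter grid are assumptions; the paper specifies only the training-set sizes (727 samples vs. a 20% subsample) and five-fold cross-validation.

```python
# Sketch of building the "high-accuracy" and "low-accuracy" models by
# varying training-set size, tuning each SVM with five-fold CV.
# Dataset and grid below are illustrative stand-ins.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in for the 727 vectorized training sentences.
X, y = make_classification(n_samples=727, random_state=0)

def tuned_svm(X_train, y_train):
    """Fit an SVM whose regularization parameter is tuned with
    five-fold cross-validation."""
    grid = GridSearchCV(LinearSVC(), {"C": [0.1, 1, 10]}, cv=5)
    return grid.fit(X_train, y_train).best_estimator_

# "High-accuracy model": all training data.
high_model = tuned_svm(X, y)

# "Low-accuracy model": a 20% subsample of the training data.
X_small, _, y_small, _ = train_test_split(
    X, y, train_size=0.2, stratify=y, random_state=0
)
low_model = tuned_svm(X_small, y_small)
```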

To control for factors other than model accuracy that can influence participants’ judgment accuracy, we collect additional self-reported measures during the experiment.

Domain expertise has been shown to influence how individuals utilize explanations when algorithmic decisions are transparent. For example, Dhaliwal et al. [52] argued that expertise influences how explanations provided by knowledge-based systems are utilized. Specifically, they suggest that novice users can make greater use of explanations. Moreover, the overall benefits of these explanations for expert and novice users can vary depending on the provisioning strategy: novice users can benefit more from explanations preceding predictions (feedforward), while experts can benefit more from explanations following predictions (feedback). To that end, we control for the participants’ domain expertise. We ask participants to self-report healthcare literacy on a Likert scale from 1 (not at all) to 5 (very well) (mean = 3.95, sd = 0.85). Second, we consider the participants’ trust in AI predictions. Trust in technological artifacts influences their use [93]. From a decision perspective, trust is also an essential factor in delegating a task to “intelligent agents,” and the interfaces through which human agents interact with these intelligent agents can and should increase trust [94,95]. Moreover, trust also serves as a proxy for participants’ a priori assumptions about the correctness or reliability of AI predictions—a form of baseline framing that may influence how individuals interpret and integrate AI recommendations.

Additionally, Gregor et al. [96] suggested that transparent agents that provide explanations can improve performance because users feel more comfortable when an intelligent agent can explain its actions. Overall, these perspectives suggest that trust in machine-generated advice is a complex concept [97,98] that affects the interaction between human and AI agents. To that end, we control for the participants’ trust in AI. We ask participants to self-report their trust in Artificial Intelligence on a Likert scale from 1 (not at all) to 5 (very well) (mean = 3.22, sd = 0.97).

Finally, we consider the participants’ knowledge of AI technology. It has been reported that one way to improve the proper utilization of algorithmic aids is to increase algorithmic literacy among human judges. That is, if human judges are more knowledgeable about interpreting statistical outputs such as decision accuracy and appreciate the utility of decision aids under uncertainty, their interaction with these tools will be positively influenced [99,100]. To that end, we control for the participants’ knowledge about Artificial Intelligence. We ask participants to self-report their knowledge about Artificial Intelligence before the task on a Likert scale from 1 (not at all) to 5 (very well) (mean = 3.25, sd = 0.95). The means and correlations of all variables appear in Table 3.

Table 3. Means and correlations for main hypotheses dependent variables.

https://doi.org/10.1371/journal.pone.0321342.t003

4. Results

4.1. Effects of explanations

To examine the effects of explanations, we perform a multivariate logistic regression analysis. With final decision accuracy—defined as a binary variable equal to 1 if the majority-vote response matches the expert-provided gold standard and 0 otherwise—as the dependent variable, we introduce independent variables in a stepwise fashion to account for potential instability arising from multicollinearity. The results are shown in Table 4. Model 1 includes the control variables: Technical Literacy, Domain Expertise, AI Literacy, Trust in AI, and Model (high- vs. low-accuracy model). In Model 2, we add the correct AI prediction, which has a significant positive relationship with final decision accuracy, suggesting the final decision is more likely to be accurate when decision-makers are presented with a correct rather than an incorrect AI prediction. Model 3 adds the explanation effect. The presence of an explanation is strongly and positively associated with final decision accuracy, indicating that decisions are more likely to align with the gold standard when an explanation is provided.

Table 4. Results for regression analysis for final decision accuracy with AI recommendations.

https://doi.org/10.1371/journal.pone.0321342.t004

4.2. Self-monitoring hypotheses (H1, H1A and H1B)

To test the differential effect of initial framing (H1) on the relationship between explanations and final decision accuracy, we conduct a sample-split analysis. We split our data depending on whether the initial framing for each task is accurate, measuring initial framing accuracy based on the majority-vote decision for the task when no AI prediction is prompted. The results of this split are shown in Model 3 and Model 4 of Table 5. They show that the impact of explanations is greater and significant when the initial framing is incorrect, supporting H1.

Table 5. Results for subsample analyses for final judgment accuracy that agree with AI recommendations.

https://doi.org/10.1371/journal.pone.0321342.t005

The effects of explanations on incorrect initial framing (H1A and H1B).

To test the influence of explanations on incorrect initial framing with respect to AI accuracy, we compare the final decision accuracy with and without explanations, specifically for tasks where the initial framing was incorrect based on answers given without AI recommendations (i.e., control group answers). First, we examine how explanations affect participants’ incorrect initial framing when AI predictions are correct. To do this, we use two-sample proportion tests to compare the accuracy rates (i.e., the proportion of correct responses) between conditions with and without explanations. Among the cases where the initial framing is incorrect and AI predictions are correct, the accuracy of the final judgment is significantly higher when AI predictions include explanations (p_explanation = 88.9%) than when they do not (p_no_explanation = 76.4%) (H_A: p_explanation > p_no_explanation, p < 0.05). This significant difference suggests that when decision-makers with incorrect initial framing are prompted with correct AI predictions, the use of explanations leads to more accurate decisions, supporting H1A. The effect of explanations under this condition is also shown in the left column of Fig 3.

Fig 3. The final decision accuracy rates when initial framing is incorrect.

The graph is categorized based on whether the AI recommendations made available to the subjects are correct (left column) or incorrect (right column). Error bars indicate 1 standard error.

https://doi.org/10.1371/journal.pone.0321342.g003

Second, we look at the effect of explanations on incorrect initial framing when AI predictions are also incorrect. With a two-sample proportion test, we compared the accuracy rates (i.e., the proportion of correct responses) between conditions with and without explanations. Among the cases where both the initial framing and AI predictions are incorrect, the final decision accuracy is significantly higher with explanations (p_explanation = 50%) than without explanations (p_no_explanation = 25%) (H_A: p_explanation > p_no_explanation, p < 0.05). This result suggests that when decision-makers with incorrect initial framing are prompted with incorrect AI predictions, the use of explanations still leads to more accurate decisions, supporting H1B. The effect of explanations under this condition is also shown in the right column of Fig 3.
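A two-sample proportion test of this kind can be run as follows. The counts below are illustrative, chosen only to match the reported 50% vs. 25% rates; they are not the study's actual cell sizes.

```python
# One-sided two-sample proportion test, as used in the analyses
# above. Counts are hypothetical, matching 50% vs. 25%.
from statsmodels.stats.proportion import proportions_ztest

correct = [40, 20]   # correct final judgments with / without explanations
trials = [80, 80]    # tasks in each condition

# alternative="larger": tests whether the first proportion exceeds
# the second (p_explanation > p_no_explanation).
stat, p_value = proportions_ztest(correct, trials, alternative="larger")
print(p_value < 0.05)  # True for these counts
```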

To ensure that explanations are significantly effective when initial framing is incorrect, we conducted a subsample analysis by splitting the data based on whether the initial framing for each question was correct. The results, shown in Model 4 and Model 5 of Table 5, suggest that the effect of explanations is significant (and positive) when the initial framing is incorrect.

4.3. Persuasiveness hypotheses (H2 and H3)

To study the effects of persuasiveness, we conduct two subsample analyses examining how different levels of machine confidence (H2) and self-confidence (H3) change the explanation effects. We continue to show the results of our split-sample analyses in Table 5. We first split our data depending on whether the prediction probability is above the median prediction probability in the dataset. The results of this split, shown in Model 6 and Model 7 of Table 5, show that the explanation effect is greater and significant when machine confidence is higher, supporting H2. Similarly, we split the data based on whether the self-confidence score is above the median confidence score; the results, shown in Model 8 and Model 9 of Table 5, show that the effect of explanations is greater and significant with lower human self-confidence scores, supporting H3. Finally, a summary of our hypotheses, test methods, results, and their practical significance is shown in Table 6.

4.4. Robustness tests

We conduct a series of additional analyses to examine the robustness of our results. First, our sample split analyses for machine confidence and self-confidence are based on the median values for confidence scores. To ensure that the results reported are robust, we replicate our analysis by comparing different sample splits. We conduct the same analysis comparing 1) the first quartile with the fourth and 2) the first quartile with the rest, for both machine confidence and self-confidence scores. The results are similar to those of our original analysis.

Second, variable coefficients remain stable when added in a stepwise fashion, suggesting multicollinearity is not an issue. We further test for possible multicollinearity by computing the variance inflation factors (VIFs) for the explanatory variables. The VIF values for variables in the complete model, including the interaction term, are shown in Table 3. All VIF values in our model are less than the generally accepted threshold of 10 [101,102], and even lower than more conservative suggested thresholds (the lowest VIF threshold we found, proposed by Pan et al. [103], is 4).

5. Discussion and conclusion

5.1. Theoretical contributions

Our study aims to expand the theories of XAI, which have drawn attention in the current literature [3,4,104]. Specifically, we consider the concerns raised by Miller [16] on how the theory efforts in XAI are missing the rich cognitive context when humans interact with explanations in general. This rich cognitive context is currently being recognized in the studies of human-AI interactions (e.g., [11,105]). Our work contributes to the XAI literature by studying the interactions of humans, AI, and their explanations through a theoretical lens that emphasizes this cognitive context.

Our findings suggest that explanations of AI decisions improve human-AI collaboration by triggering metacognitive self-monitoring processes that help overcome humans’ cognitive limitations. This view builds on Jussupow et al.’s [11] decision augmentation with AI, where they studied the role of human actors in compensating for technical errors. In our work, we integrate explanations into this human-AI hybrid. We show that explanations can mitigate human judgment errors through self-monitoring processes not only when AI recommendations are correct (H1A) but also when they are incorrect (H1B). Contrary to Jussupow et al.’s [11] assumptions, our findings show that, at least in some cases, even when both AI and human agents are incorrect, explanations may trigger self-monitoring processes and lead to correct decisions. For instance, when AI recommendations are incorrect, participants often arrive at the correct final judgment if those recommendations are accompanied by explanations that appear unrelated to the diagnosis (e.g., referencing terms like “internal auditory,” “canal,” “left,” or “cochlear nerve” as indicators of an abnormal diagnosis). This occurs even when the initial framings of these tasks were incorrect.

Our findings also show that the influence of explanations on the final decision accuracy depends on factors that impact how persuasive these explanations are perceived. In particular, we confirm that machine prediction confidence acts as an indicator of how persuasive the explanations are (H2) and decision makers’ self-confidence impacts the effects of explanations (H3).

Our findings also align with Fügener et al. [15], suggesting that metacognitive processes can positively influence human-AI collaboration by addressing human agents’ cognitive limitations. Their findings show that supporting humans’ awareness of these cognitive processes (increasing metacognitive knowledge) can improve collaboration. Our findings expand this by showing that supporting human agents’ use of cognitive processes with the help of additional information, such as explanations, can also enhance collaboration.

From a broader perspective, our findings expand the current understanding of the role machine outcomes play in human agents’ cognitive loop [68]. Lyytinen et al. [12] posit emergent human-machine hybrids as sociotechnical systems in which machines and humans learn jointly. However, the dynamics that surface when combining the different cognitive architectures of humans and machines are still unclear [12]. Our findings provide evidence that the understandability of machine predictions, fostered by explanations, can influence this dynamic.

5.2. Design and practical implications

From a practical perspective, our results suggest that designers and developers of explainable AI systems should extend their focus beyond the effects of explanations on users’ rational decision-making processes. Specifically, their impact on the effects of AI predictions has several important implications. First, explanation performance metrics may need to be adjusted for metacognition-related effects. Our findings suggest that explanations can go beyond securing user trust or helping users make sense of machine predictions; they allow users to reflect on their own decisions. For example, for tasks where both human predictions (without AI input) and AI predictions were inaccurate, human judgment shifted to the correct prediction when participants saw the explanations for the incorrect AI predictions.

Second, our results suggest that when human decisions are involved, explanations can improve the final decision accuracy even when the models are trained on small datasets. This finding is particularly significant for AI implementations where the training data is relatively scarce (e.g., in healthcare). In such cases, explanations can help overcome the obstacles to building highly accurate models, and decision-makers can still use algorithmic decisions effectively.

Third, given their critical and sensitive nature, decisions are not likely to be fully automated soon in areas like healthcare, and machine predictions should be used cautiously. Our results in this study confirm the existence of the biasing effect in that the overall performance of human-AI collaboration strongly depends on the accuracy of the algorithms when AI predictions are presented.

Fourth, another important consideration is the role of participants’ prior medical knowledge in shaping their comprehension of both radiology report content and AI-generated explanations. Although we included a self-reported measure of healthcare literacy to approximate domain expertise, this measure provides only a general sense of participants’ familiarity with medical concepts. Individuals with more extensive medical knowledge may interpret technical terminology or explanation cues differently, potentially moderating the impact of AI support on decision outcomes. Conversely, those with less familiarity may benefit more from explanatory cues that simplify or contextualize medical language. Future research should explore how varying levels of domain-specific knowledge influence human-AI collaboration, particularly in high-stakes domains like healthcare, and consider incorporating more granular measures of expertise to better tailor AI explanations to user needs.

Fifth, although our study focused on radiology report excerpts related to anatomical assessments, the mechanisms uncovered—particularly how AI explanations influence self-monitoring and final decision accuracy—may generalize to other clinical contexts. In domains such as pathology, dermatology, or even mental health assessments, clinicians and patients routinely interpret textual or semi-structured diagnostic data. In these settings, explanatory cues that highlight salient features or align with human reasoning may similarly support cognitive calibration and improve judgment. Future studies should explore the transferability of these effects across modalities and specialties to assess how explanation design can be adapted to domain-specific workflows and reasoning styles.

Sixth, our study also invites comparison to research on second opinions provided by human professionals. Similar to human collaborators, AI systems offering a second opinion can influence decisions through both the content of the recommendation and the presence of a rationale or explanation. However, unlike human experts, AI systems often lack social cues, experiential context, and perceived accountability—all of which can shape how advice is received and evaluated. Prior research in team cognition (e.g., [82,106]) has shown that explanations from human peers can enhance advice acceptance by promoting understanding and credibility. In clinical settings, second opinions are valued not only for diagnostic correction but also for building confidence and reducing decisional conflict [107]. AI explanations, while more mechanical, may similarly scaffold user reasoning by offering interpretable anchors. Our results suggest that explanations from AI can trigger self-monitoring and override intuitive errors, much like a persuasive human second opinion might. However, the path to trust and reliance likely differs due to the perceived agency, intent, and contextual awareness that human collaborators bring. Future research should explore whether users internalize and weigh second opinions from AI systems differently than those from human experts, particularly in high-stakes domains such as healthcare.

Finally, systems could be designed to elicit users’ initial judgment and confidence before presenting AI recommendations—encouraging self-reflection and enabling tailored presentation of advice. This aligns with our finding that explanations are more effective when initial self-confidence is low. Additionally, tools could incorporate a “devil’s advocate” feature that presents counterfactual explanations challenging the user’s initial framing—even when the AI prediction aligns with the user’s intuition. Doing so could activate metacognitive conflict and help users re-evaluate their reasoning, similar to how human collaborators sometimes challenge each other’s assumptions. Furthermore, dynamically adapting explanation strength or framing based on user confidence and prediction confidence could support more calibrated reliance. We encourage designers to view explanations not just as static justifications, but as interactive mechanisms for cognitive support—capable of modulating trust, effort, and decision quality.

5.3. Limitations and future research

This study has some limitations that create opportunities for further research. First, we study these effects only in the context of annotating medical records. The unique nature of this textual data allowed us to develop simple insight mechanisms for the decision-maker. Future research should examine whether these effects are observable in more generalizable tasks. Second, our explanations were created by isolating the features with the highest contributions [108], which is a simple, though not necessarily optimal, way of creating explanations. Explanations created by other methods, such as model approximation (e.g., [43]), can also be investigated for their contributions to human judgment accuracy. Furthermore, while our study used feature-based explanations generated from an SVM trained on radiology report sentences, it is important to recognize that these explanations are synthetic in nature. That is, they were created specifically for this experiment using controlled, transparent mechanisms (e.g., ranked unigrams and bigrams). In contrast, real-world AI systems such as ChatGPT or GlassBox models often produce explanations that are generated through more complex, probabilistic, and potentially less transparent methods. These real-world systems may involve dialogue-based, multi-modal, or dynamically updated explanations shaped by conversational context. As such, the effects of explanations observed in our study may not fully generalize to real-world deployments where explanations are noisier, more ambiguous, or inconsistently interpreted. Future research should explore whether the self-monitoring benefits observed here persist under more realistic, end-to-end AI explanation interfaces. Third, while explanations significantly improve overall accuracy, improvements beyond machine-learning prediction accuracy can be explored. For example, in our case, the final judgment is ultimately performed by human judges.
If we can identify the characteristics of decision opportunities where humans are most likely to succeed and the situations where machine predictions are most likely to fail, then we can design intelligent decision workflows that delegate decisions to the agents most likely to succeed. Fourth, we approximate the initial framing measures based on consensus in the control group. Initial framing could be measured more directly in a within-subject experiment by asking participants to complete the task first without AI recommendations and then again with them. However, with this approach, the final task outcome could be influenced by carryover [109] or priming [110] effects from the first step. Therefore, we used the responses from the control condition, where AI recommendations were not presented. While we believe this is a good proxy for what the initial framings would be for the same task when AI recommendations are present, future studies can focus on measuring initial framing directly instead of using a proxy. Moreover, as we discussed earlier, explanations have the potential to improve final judgment accuracy; merely providing practical explanations can alleviate the need for more complex systems or bigger training datasets. At the same time, implementing explanations requires resources, so it might be beneficial to investigate the trade-off between the cost of providing explanations and the cost of improving model accuracy. We would also like to note that we measured domain expertise using a self-reported healthcare literacy item, which served as a proxy for familiarity with medical terminology and context. However, we acknowledge that this is a coarse measure that does not account for broader forms of domain-general expertise.
Similarly, while we included a self-reported measure of participants’ knowledge of AI technology to approximate their familiarity with algorithmic systems, we acknowledge that this is not an ideal proxy for algorithmic literacy. Algorithmic literacy entails an understanding of how AI systems generate predictions, their underlying assumptions, and appropriate interpretations of their outputs. Our measure likely reflects general familiarity or comfort with AI concepts rather than this deeper cognitive skill set. Future studies should consider incorporating more rigorous assessments of participants’ medical familiarity and algorithmic literacy to better capture how these factors shape human-AI collaboration. Additionally, this study used lay participants recruited from Amazon Mechanical Turk rather than individuals with formal medical training. While this choice may constrain ecological validity for clinical decision-making settings, it reflects a deliberate focus on understanding how general users (e.g., lay patients) interpret AI explanations in the absence of domain expertise. As AI-based tools become increasingly available to patients and the broader public—for example, in direct-to-consumer health platforms or patient-facing EHR systems—understanding how lay individuals rely on AI and engage in self-monitoring is crucial. Finally, our study primarily focuses on participants’ initial framing and their deliberate consideration of AI recommendations when presented. While we control for participants’ trust in our work, there is a growing emphasis on the importance of calibrating trust (i.e., avoiding over- or under-reliance [111]) to enhance AI-assisted decision-making [112]. We expect further studies to consider the nuanced differences between trust and reliance [113,114] in studying the effects of explanations on the metacognitive processes of AI-assisted decision-making.

References

  1. Johnson H, Johnson P. Explanation facilities and interactive systems. Proceedings of the 1st International Conference on Intelligent User Interfaces - IUI ’93. Orlando, Florida, United States: ACM Press; 1993. pp. 159–66.
  2. Goodman B, Flaxman S. European Union Regulations on Algorithmic Decision Making and a “Right to Explanation”. AI Magaz. 2017;38(3):50–7.
  3. Rai A. Explainable AI: from black box to glass box. J Acad Mark Sci. 2019;48(1):137–41.
  4. Rai A, Constantinides P, Sarker S. Editor’s comments: Next-generation digital platforms: Toward human–AI hybrids. Manag Inf Syst Q. 2019;43:III–IX.
  5. Weller A. Transparency: motivations and challenges. Explainable AI: interpreting, explaining and visualizing deep learning. Springer; 2019. pp. 23–40.
  6. Senoner J, Schallmoser S, Kratzwald B, Feuerriegel S, Netland T. Explainable AI improves task performance in human-AI collaboration. Sci Rep. 2024;14(1):31150. pmid:39730794
  7. Hassija V, Chamola V, Mahapatra A, Singal A, Goel D, Huang K, et al. Interpreting Black-Box Models: a review on explainable artificial intelligence. Cogn Comput. 2023;16(1):45–74.
  8. Gao R, Saar-Tsechansky M, De-Arteaga M, Han L, Lee MK, Lease M. Human-AI collaboration with bandit feedback. International Joint Conference on Artificial Intelligence (IJCAI-2021). 2021.
  9. Simon HA. Theories of bounded rationality. Decis Organ. 1972;1:161–76.
  10. Tjoa E, Guan C. A survey on explainable artificial intelligence (XAI): Toward medical XAI. IEEE Trans Neural Netw Learn Syst. 2020.
  11. Jussupow E, Spohrer K, Heinzl A, Gawlitza J. Augmenting medical diagnosis decisions? An investigation into physicians’ decision-making process with artificial intelligence. Inf Syst Res. 2021.
  12. Lyytinen K, Nickerson JV, King JL. Metahuman systems = humans + machines that learn. J Inf Technol. 2020.
  13. Ackerman R, Thompson VA. Meta-reasoning: monitoring and control of thinking and reasoning. Trends Cogn Sci. 2017;21(8):607–17. pmid:28625355
  14. Agrawal A, Gans JS, Goldfarb A. Exploring the impact of artificial intelligence: Prediction versus judgment. Inf Econ Policy. 2019.
  15. Fügener A, Grahl J, Gupta A, Ketter W. Cognitive challenges in human–artificial intelligence collaboration: investigating the path toward productive delegation. Inf Syst Res. 2021.
  16. Miller T. Explanation in artificial intelligence: Insights from the social sciences. Artif Intell. 2019;267:1–38.
  17. Heart T, Zucker A, Parmet Y, Pliskin JS, Pliskin N. Investigating physicians’ compliance with drug prescription notifications. J Assoc Inf Syst. 2011;12:3.
  18. Ackerman R. The diminishing criterion model for metacognitive regulation of time investment. J Exp Psychol Gen. 2014;143(3):1349–68. pmid:24364687
  19. Fiedler K, Hütter M, Schott M, Kutzner F. Metacognitive myopia and the overutilization of misleading advice. J Behav Decis Mak. 2019;32:317–33.
  20. Hütter M, Fiedler K. Advice taking under uncertainty: the impact of genuine advice versus arbitrary anchors on judgment. J Exp Soc Psychol. 2019;85:103829.
  21. Buhrmester M, Kwang T, Gosling SD. Amazon’s Mechanical Turk: A new source of inexpensive, yet high-quality data? Perspect Psychol Sci. 2011;6(1):3–5. pmid:26162106
  22. Berente H, Pike B. Arguing the value of virtual worlds: patterns of discursive sensemaking of an innovative technology. MIS Q. 2011;35(3):685.
  23. Gunning D, Stefik M, Choi J, Miller T, Stumpf S, Yang G-Z. XAI-Explainable artificial intelligence. Sci Robot. 2019;4(37):eaay7120. pmid:33137719
  24. Schulte-Derne D, Gnewuch U. Translating AI ethics principles into practice to support robotic process automation implementation. MIS Q Exec. 2024;23(2).
  25. 25. Barredo Arrieta A, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf Fusion. 2020;58:82–115.
  26. 26. Mohseni S, Zarei N, Ragan ED. A multidisciplinary survey and framework for design and evaluation of explainable AI systems. ACM Trans Interact Intell Syst. 2021;11(3–4):1–45.
  27. 27. Wang D, Yang Q, Abdul A, Lim BY. Designing Theory-Driven User-Centric Explainable AI. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Glasgow Scotland UK: ACM; 2019. pp. 1–15.
  28. 28. Emmert‐Streib F, Yli‐Harja O, Dehmer M. Explainable artificial intelligence and machine learning: a reality rooted perspective. Wiley Interdiscip Rev Data Min Knowl Discov. 2020;10(6).
  29. 29. Abusitta A, Li MQ, Fung BCM. Survey on explainable AI: techniques, challenges and open issues. Expert Syst Appl. 2024;255:124710.
  30. 30. Dwivedi R, Dave D, Naik H, Singhal S, Omer R, Patel P, et al. Explainable AI (XAI): core ideas, techniques, and solutions. ACM Comput Surv. 2023;55(9):1–33.
  31. 31. Wallkötter S, Tulli S, Castellano G, Paiva A, Chetouani M. Explainable embodied agents through social cues: A review. ACM Trans Hum-Robot Interact. 2021.
  32. 32. Sindiramutty SR, Tee WJ, Balakrishnan S, Kaur S, Thangaveloo R, Jazri H, et al. Explainable AI in healthcare application. Advances in Explainable AI Applications for Smart Cities. IGI Global Scientific Publishing; 2024. pp. 123–76. https://www.igi-global.com/chapter/explainable-ai-in-healthcare-application/336874
  33. 33. Jung J, Lee H, Jung H, Kim H. Essential properties and explanation effectiveness of explainable artificial intelligence in healthcare: a systematic review. Heliyon. 2023;9(5):e16110. pmid:37234618
  34. 34. Manresa-Yee C, Roig-Maimó MF, Ramis S, Mas-Sansó R. Advances in XAI: Explanation Interfaces in Healthcare. In: Lim C-P, Chen Y-W, Vaidya A, Mahorkar C, Jain LC, editors. Handbook of Artificial Intelligence in Healthcare. Cham: Springer International Publishing; 2022. pp. 357–69.
  35. 35. Loh HW, Ooi CP, Seoni S, Barua PD, Molinari F, Acharya UR. Application of explainable artificial intelligence for healthcare: a systematic review of the last decade (2011–2022). Comput Methods Programs Biomed. 2022;226:107161.
  36. 36. Ferrario A, Loi M. How Explainability Contributes to Trust in AI. 2022 ACM Conference on Fairness Accountability and Transparency. Seoul Republic of Korea: ACM; 2022. pp. 1457–1466.
  37. 37. Li Y, Wu B, Huang Y, Luan S. Developing trustworthy artificial intelligence: insights from research on interpersonal, human-automation, and human-AI trust. Front Psychol. 2024;15:1382693. pmid:38694439
  38. 38. Yuan J, Bhattacharjee K, Islam AZ, Dasgupta A. TRIVEA: Transparent Ranking Interpretation using Visual Explanation of black-box Algorithmic rankers. Vis Comput. 2023;40(5):3615–31.
  39. 39. Witzel M, Gonnet GH, Snider T. Explainable AI policy: It is time to challenge post hoc explanations. CIGI Papers; 2024. Available from: https://www.econstor.eu/handle/10419/301936
  40. 40. Tarafdar M, Rets I, Zuloaga L, Mondragon N. How HireVue created “glass box” transparency for its AI application. MIS Q Exec. 2025;24:47–65.
  41. 41. Belle V, Papantonis I. Principles and practice of explainable machine learning. Front Big Data. 2021;39:688969. pmid:34278297
  42. 42. Dabkowski P, Gal Y. Real time image saliency for black box classifiers. Adv Neural Inf Process Syst. 2017.
  43. 43. Ribeiro MT, Singh S, Guestrin C. “Why should i trust you?” Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 2016. pp. 1135–44.
  44. 44. Ghorbani A, Wexler J, Zou J, Kim B. Towards automatic concept-based explanations. ArXiv Preprint. 2019.
  45. 45. Goyal Y, Feder A, Shalit U, Kim B. Explaining classifiers with causal concept effect (cace). ArXiv Preprint. 2019.
  46. 46. Kim B, Wattenberg M, Gilmer J, Cai C, Wexler J, Viegas F. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (tcav). International conference on machine learning. PMLR; 2018. pp. 2668–77.
  47. 47. Zhang Q, Yang Y, Ma H, Wu YN. Interpreting cnns via decision trees. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. pp. 6261–70.
  48. 48. Zhang D, Zhou H, Zhang H, Bao X, Huo D, Chen R. Building interpretable interaction trees for deep nlp models. In: Proceedings of the AAAI Conference on Artificial Intelligence, 2021. pp. 14328–37.
  49. 49. Charte D, Charte F, del Jesus MJ, Herrera F. An analysis on the use of autoencoders for representation learning: Fundamentals, learning task case studies, explainability and challenges. Neurocomputing. 2020;404:93–107.
  50. 50. Shrikumar A, Greenside P, Kundaje A. Learning important features through propagating activation differences. International Conference on Machine Learning. PMLR; 2017. pp. 3145–53.
  51. 51. Shrikumar A, Prakash E, Kundaje A. GkmExplain: fast and accurate interpretation of nonlinear gapped k-mer SVMs. Bioinformatics. 2019;35(14):i173–82. pmid:31510661
  52. 52. Dhaliwal JS, Benbasat I. The use and effects of knowledge-based system explanations: theoretical foundations and a framework for empirical evaluation. Inf Syst Res. 1996;7(3):342–62.
  53. 53. Chari S, Seneviratne O, Gruen DM, Foreman MA, Das AK, McGuinness DL. Explanation ontology: A model of explanations for user-centered AI. International Semantic Web Conference. Springer; 2020. pp. 228–243.
  54. 54. Donadello I, Dragoni M. SeXAI: A Semantic Explainable Artificial Intelligence Framework. International Conference of the Italian Association for Artificial Intelligence. Springer; 2020. pp. 51–66.
  55. 55. Kenny EM, Keane MT. Twin-systems to explain artificial neural networks using case-based reasoning: Comparative tests of feature-weighting methods in ANN-CBR twins for XAI. Twenty-Eighth International Joint Conferences on Artifical Intelligence (IJCAI), Macao. 2019. pp. 2708–15.
  56. 56. Shen W, Wei Z, Huang S, Zhang B, Fan J, Zhao P. Interpretable Compositional Convolutional Neural Networks. ArXiv Preprint. 2021.
  57. 57. Endsley MR. Supporting human-AI teams:transparency, explainability, and situation awareness. Comput Hum Behav. 2023;140:107574.
  58. 58. Mikalef P, Conboy K, Lundström JE, Popovič A. Thinking responsibly about responsible AI and ‘the dark side’ of AI. Eur J Inf Syst. 2022;31(3):257–68.
  59. 59. Papagni G, de Pagter J, Zafari S, Filzmoser M, Koeszegi ST. Artificial agents’ explainability to support trust: considerations on timing and context. AI Soc. 2022;38(2):947–60.
  60. 60. Kahr P, Rooks G, Snijders C, Willemsen MC. Good Performance Isn’t Enough to Trust AI: Lessons from Logistics Experts on their Long-Term Collaboration with an AI Planning System. Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Yokohama Japan: ACM; 2025. pp. 1–16.
  61. 61. Nazar M, Alam MM, Yafi E, Su’ud MM. A systematic review of human–computer interaction and explainable artificial intelligence in healthcare with artificial intelligence techniques. IEEE Access. 2021;9:153316–48.
  62. 62. Gleasure R, Conboy K, Jiang Q. Technocognitive structuration: modeling the role of cognitive structures in technology adaptation. J Assoc Inf Syst. 2025;26:394–426.
  63. 63. Dimoka A, Pavlou PA, Davis FD. Research commentary—NeuroIS: The potential of cognitive neuroscience for information systems research. Inf Syst Res. 2011;22:687–702.
  64. 64. Yang HD, Kang HR, Mason RM. An exploratory study on meta skills in software development teams: antecedent cooperation skills and personality for shared mental models. Eur J Inf Syst. 2008;17:47–61.
  65. 65. Yin D, Mitra S, Zhang H. Research note—When do consumers value positive vs. negative reviews? An empirical investigation of confirmation bias in online word of mouth. Inf Syst Res. 2016;27:131–44.
  66. 66. Bauer K, von Zahn M, Hinz O. Expl (AI) ned: The impact of explainable artificial intelligence on users’ information processing. Inf Syst Res. 2023.
  67. 67. Tankelevitch L, Kewenig V, Simkute A, Scott AE, Sarkar A, Sellen A, et al. The Metacognitive Demands and Opportunities of Generative AI. Proceedings of the CHI Conference on Human Factors in Computing Systems. Honolulu HI USA: ACM; 2024. pp. 1–24.
  68. 68. Seidel S, Berente N, Lindberg A, Lyytinen K, Nickerson JV. Autonomous tools and design: a triple-loop approach to human-machine learning. Commun ACM. 2018;62:50–7.
  69. 69. Yeung N, Summerfield C. Metacognition in human decision-making: confidence and error monitoring. Philos Trans R Soc Lond B Biol Sci. 2012;367(1594):1310–21. pmid:22492749
  70. 70. Tormala ZL, Petty RE. Source credibility and attitude certainty: a metacognitive analysis of resistance to Persuasion. J Consum Psychol. 2004;14(4):427–42.
  71. 71. Petty RE, Briñol P. Persuasion: from single to multiple to metacognitive processes. Perspect Psychol Sci. 2008;3(2):137–47. pmid:26158880
  72. 72. Erickson S, Heit E. Metacognition and confidence: comparing math to other academic subjects. Front Psychol. 2015;6:742.
  73. 73. Kleitman S, Stankov L. Self-confidence and metacognitive processes. Learn Individ Differ. 2007;17(2):161–73.
  74. 74. Petty RE, Cacioppo JT. Communication and persuasion: Central and peripheral routes to attitude change. Springer Science & Business Media; 2012.
  75. 75. French AM, Storey VC, Wallace L. The impact of cognitive biases on the believability of fake news. Eur J Inf Syst. 2023;34(1):72–93.
  76. 76. Klein G. Naturalistic decision making. Hum Factors. 2008;50(3):456–60. pmid:18689053
  77. 77. Chaiken S, Maheswaran D. Heuristic processing can bias systematic processing: effects of source credibility, argument ambiguity, and task importance on attitude judgment. J Pers Soc Psychol. 1994;66(3):460–73. pmid:8169760
  78. 78. Fernbach PM, Darlow A, Sloman SA. When good evidence goes bad: the weak evidence effect in judgment and decision-making. Cognition. 2011;119(3):459–67. pmid:21345428
  79. 79. Fernbach PM, Darlow A, Sloman SA. Asymmetries in predictive and diagnostic reasoning. J Exp Psychol Gen. 2011;140(2):168–85. pmid:21219081
  80. 80. Dragoni M, Donadello I, Eccher C. Explainable AI meets persuasiveness: translating reasoning results into behavioral change advice. Artif Intell Med. 2020;105:101840. pmid:32505427
  81. 81. Gross SR, Holtz R, Miller N. Attitude certainty. Attitude Strength Antecedents Consequences. 1995;4:215–45.
  82. 82. Bonaccio S, Dalal RS. Advice taking and decision-making: An integrative literature review, and implications for the organizational sciences. Organ Behav Hum Decis Process. 2006;101:127–51.
  83. 83. Hausmann D, Läge D. Sequential evidence accumulation in decision making: The individual desired level of confidence can explain the extent of information acquisition. Judgm Decis Mak. 2008;3:229–43.
  84. 84. Wang X, Du X. Why does advice discounting occur? The combined roles of confidence and trust. Front Psychol. 2018;9:2381. pmid:30555394
  85. 85. Fink L. Why and how online experiments can benefit information systems research. J Assoc Inf Syst. 2022;23:1333–46.
  86. 86. Cocos A, Qian T, Callison-Burch C, Masino AJ. Crowd control: Effectively utilizing unscreened crowd workers for biomedical data annotation. J Biomed Inform. 2017;69:86–92. pmid:28389234
  87. 87. Johansson U, Sönströd C, Norinder U, Boström H. Trade-off between accuracy and interpretability for predictive in silico modeling. Future Med Chem. 2011;3(6):647–63. pmid:21554073
  88. 88. Kamwa I, Samantaray SR, Joos G. On the accuracy versus transparency trade-off of data-mining models for fast-response PMU-based catastrophe predictors. IEEE Trans Smart Grid. 2012;3(1):152–61.
  89. 89. Martin-Barragan B, Lillo R, Romo J. Interpretable support vector machines for functional data. Eur J Oper Res. 2014;232:146–55.
  90. 90. Liu P, Choo KKR, Wang L, Huang F. SVM or deep learning? A comparative study on remote sensing image classification. Soft Comput. 2017;21:7053–65.
  91. 91. Bakliwal A, Arora P, Patil A, Varma V. Towards enhanced opinion classification using NLP techniques. Proceedings of the Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2011). 2011. pp. 101–7.
  92. 92. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O. Scikit-learn: Machine learning in Python. J Mach Learn Res. 2011;12:2825–30.
  93. 93. Komiak SYX, Benbasat I. The Effects of Personalization and Familiarity on Trust and Adoption of Recommendation Agents. MIS Q. 2006;30: 941–960.
  94. 94. Leichtmann B, Humer C, Hinterreiter A, Streit M, Mara M. Effects of explainable artificial intelligence on trust and human behavior in a high-risk decision task. Comput Hum Behav. 2023;139:107539.
  95. 95. Maes P. Agents that reduce work and information overload. Readings in human–computer interaction. Elsevier; 1995. pp. 811–21.
  96. 96. Gregor S, Benbasat I. Explanations from intelligent systems: theoretical foundations and implications for practice. MIS Q. 1999;23(4):497.
  97. 97. Choung H, David P, Ross A. Trust in AI and its role in the acceptance of AI technologies. Int J Hum–Comput Interact. 2022;39(9):1727–39.
  98. 98. Lerch FJ, Prietula MJ, Kulik CT. The Turing effect: The nature of trust in expert systems advice. Expertise in context: Human and machine. 1997. pp. 417–48.
  99. 99. Goodyear K, Parasuraman R, Chernyak S, de Visser E, Madhavan P, Deshpande G, et al. An fMRI and effective connectivity study investigating miss errors during advice utilization from human and machine agents. Soc Neurosci. 2017;12(5):570–81. pmid:27409387
  100. 100. Highhouse S. Stubborn reliance on intuition and subjectivity in employee selection. Ind Organ Psychol. 2008;1:333–42.
  101. 101. Hair JF, Black WC, Babin BJ, Anderson RE, Tatham RL. Multivariate data analysis. Upper Saddle River, NJ: Prentice Hall; 1998.
  102. 102. Kennedy P. A guide to econometrics. MIT Press; 2003.
  103. 103. Pan Y, Jackson RT. Ethnic difference in the relationship between acute inflammation and serum ferritin in US adult males. Epidemiol Infect. 2008;136(3):421–31. pmid:17376255
  104. 104. Berente N, Gu B, Recker J, Santhanam R. Managing artificial intelligence. MIS Q. 2021;45(3):1433–50.
  105. 105. Adomavicius G, Bockstedt JC, Curley SP, Zhang J. Reducing recommender systems biases: An investigation of rating display designs. pp. 54.
  106. 106. Yaniv I I, Kleinberger E. Advice taking in decision making: egocentric discounting and reputation formation. Organ Behav Hum Decis Process. 2000;83(2):260–81. pmid:11056071
  107. 107. Epstein RM, Alper BS, Quill TE. Communicating evidence for participatory decision making. JAMA. 2004;291(19):2359–66. pmid:15150208
  108. 108. Biran O, Cotton C. Explanation and justification in machine learning: a survey. IJCAI-17 workshop on explainable AI (XAI). 2017.
  109. 109. Wan EW, Agrawal N. Carryover effects of self-control on decision making: a construal-level perspective. J Consum Res. 2011;38:199–214.
  110. 110. Mussweiler T, Strack F. Hypothesis-consistent testing and semantic priming in the anchoring paradigm: a selective accessibility model. J Exp Soc Psychol. 1999;35(2):136–64.
  111. 111. Buçinca Z, Malaya MB, Gajos KZ. To trust or to think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making. Proc ACM Hum-Comput Interact. 2021;5(CSCW1):1–21.
  112. 112. Schemmer M, Kuehl N, Benz C, Bartos A, Satzger G. Appropriate Reliance on AI Advice: Conceptualization and the Effect of Explanations. Proceedings of the 28th International Conference on Intelligent User Interfaces. 2023. pp. 410–22.
  113. 113. Cao S, Huang C-M. Understanding user reliance on AI in assisted decision-making. Proc ACM Hum-Comput Interact. 2022;6(CSCW2):1–23.
  114. 114. Eckhardt S, Kühl N, Dolata M, Schwabe G. A survey of AI reliance. arXiv. 2024.