Perceptual judgments are resistant to the advisor’s perceived level of trustworthiness: A deep fake approach

Abstract

As we navigate our environment, we frequently make spontaneous judgments about others' characteristics. Trustworthiness is a particularly important trait, often judged instantly and used to guide decisions, especially in uncertain situations. Although the impact of trustworthiness on social behaviour is well-documented, its influence on more fundamental cognitive processes, such as perceptual decision-making, remains unclear. The present study aims to fill this gap. In the first experiment (N = 100), we validated a new trustworthiness manipulation by applying deep fake technology to create animated versions of perceptually trustworthy, untrustworthy, and neutral static computer-generated faces. In the second experiment (N = 199), the deep fake procedure was applied to a new set of trustworthy and untrustworthy faces that served as advisors during a perceptual decision-making task. Here, participants had to indicate the direction of dots that were moving coherently either to the left or to the right (i.e., a random dot motion task). Contrary to our predictions, participants did not align more with the advice of trustworthy advisors than with that of untrustworthy advisors. While participants made faster decisions and reported greater confidence when aligning with the advice, these effects were not influenced by the advisor's perceived trustworthiness. We integrate our findings within theoretical frameworks of advice taking, domain specificity of facial trustworthiness, and task requirements.

Introduction

When making decisions, social beings like humans often rely on others [1], particularly in uncertain situations. In such cases, understanding the intentions of interaction partners becomes critical for determining whether to engage with or disregard the information they provide [2,3]. Social cues, such as perceived facial trustworthiness, play a key role in inferring these intentions [4–7]. Trustworthiness forms the basis of meaningful relationships [8,9] and of our cooperative behaviour [10]. It is a trait that we evaluate spontaneously [4] and rapidly [11,12], that tends to remain stable over time [13], and that is consistent across age groups, from toddlers to the elderly [14–16]. Crucially, these trustworthiness evaluations influence our behaviour. For example, participants invested more money [17], followed advice more often and faster [18], and made more risky choices when their interaction partner was perceived as trustworthy [19]. Similarly, individuals with an untrustworthy appearance were perceived as more likely to have committed a crime [20], were convicted more quickly [21], and were remembered better [22]. Such social biases have been demonstrated primarily in social (e.g., criminal judgments), motivational (e.g., cooperation), economic (e.g., investments), and value-based decision-making contexts. While these findings highlight trustworthiness as a critical factor in decision-making, its influence on lower-level sensory processes, such as perceptual decision-making, remains unexplored. To address this gap, we investigated the impact of perceptual trustworthiness on decision-making during a random dot motion task.

Perceptual decision-making plays a fundamental role in helping individuals navigate and interact with the world around them [23]. For example, we rely on it to judge whether a car is far away or nearby when crossing a road, or to decide whether food on our plate is fresh or mouldy. In laboratory settings, perceptual decision-making is often assessed using visual discrimination tasks, such as the random dot motion task, which participants frequently perform individually [24]. However, in everyday life, (perceptual) decision-making often occurs in interaction with others [25]. On the one hand, integrating information from multiple sources has been shown to enhance task performance [26,27], facilitate efficient problem solving [28], improve motor performance [29], and increase the identification of misinformation [30], making it a key aspect of group behaviour [1]. This applies to perceptual decisions as well: for example, studies have shown that accuracy in perceptual decision tasks depends on the accuracy of task partners [31]. Moreover, exchanging task-related confidence ratings can improve accuracy on perceptual tasks [32], and interaction is crucial for achieving proficient performance when task partners differ in perceptual accuracy [33]. On the other hand, evidence from the advice-taking literature indicates that decision-makers often overestimate their own judgments while underweighting the input of others [34–36]. When given the choice to base decisions on group information or to act individually, participants frequently ignored group input, even when doing so led to more errors [36] and when combining information from different sources was statistically advantageous [37].

Thus, advice from others can play a crucial role in decision-making. However, not all advice is used equally. For example, people tend to prioritise advice from experts over that from novices [38]. More broadly, advice taking and information integration depend on the characteristics of the interaction partner [34,39] and on the level of uncertainty. Social influence theories [40] and empirical studies [41,42] propose that we are more susceptible to social influences when we are uncertain and lack confidence in our abilities or decisions [43–45]. For example, during perceptual decision tasks, ambiguous sensory information increases reliance on collaborators' reputations compared to tasks with unambiguous information [46]. Participants were also more likely to seek additional information from advisors with unshared information, perceiving it as a sign of greater knowledge [47]. Other studies have found that participants were less likely to follow others' advice when confident in their own decisions [48], and that individuals with greater self-assurance were less inclined to rely on social cues [49].

Overall, individuals tend to integrate additional information when making decisions in uncertain situations. Social cues, such as perceived trustworthiness [7], are then used to evaluate the quality of the information [50] and to make more informed decisions. From an evolutionary perspective, fast and often spontaneous trustworthiness impressions [12,13] help us decide whom to approach and whom to avoid [4,6,7]. These impressions, based on perceptual information, could then provide an intuitively accessible source of information that acts as a heuristic for navigating our environment [51]. For example, a recent study found that people are more likely to follow the advice of perceived trustworthy advisors, and do so more quickly, when making value-based decisions, a type of subjective decision-making where choices depend on personal opinions or the perceived value of the options [18]. This trustworthiness advice-following bias is in line with findings using non-perceptual trustworthiness manipulations, such as the validity of the advisor (i.e., trustworthy: highly rewarding advice; untrustworthy: highly unrewarding advice) [52,53]. Trustworthiness is thus an important predictor of advice utilisation [18,39], particularly in subjective decision-making contexts lacking an objectively correct answer.

To investigate the impact of perceptual trustworthiness on advice taking in a perceptual decision-making context, we first selected a series of static computer-generated faces previously rated as trustworthy, neutral, or untrustworthy [4]. Using a novel deep fake technique [54], we transformed these static images into interactive, dynamic social partners. This approach reflects the dynamic nature of real-life interactions and enhances the ecological validity of our paradigm by incorporating dynamic faces capable of providing verbal advice. In Experiment 1 (N = 100), we validated our deep fake procedure and assessed whether the animated faces still elicited differences in perceived trustworthiness. Next, we applied the same deep fake procedure to create a new set of trustworthy and untrustworthy game partners. These animated faces advised participants during a random dot motion task. In this task, participants were presented with a cloud of coherently moving dots (i.e., moving in the same direction) and incoherently moving dots (i.e., moving in different directions) and were required to indicate the direction of the coherent movement by pressing left or right. The task had two levels of difficulty: hard trials with a coherency of 5–10% (i.e., 55–60% of the dots moving in one direction) and easy trials with a coherency of 25% (i.e., 75% of the dots moving in one direction). For the easy trials, the advice was always correct; for the hard trials, the advice was sometimes correct (78%) and sometimes incorrect (22%). We measured how often participants' decisions aligned with the game partner's advice (i.e., the advice alignment rate) and how quickly they decided (i.e., the choice decision time). Additionally, we included a confidence rating after each trial to measure the level of (un)certainty [55–57]. After each trial, participants indicated how confident they were about their decision on a continuous scale (0–100).

Consistent with previous studies on value-based decision-making, we hypothesised that on hard trials with correct advice, participants would respond faster and their decisions would align more with the advice of trustworthy advisors than with that of untrustworthy advisors [18,53]. We also expected participants to be more confident in their decisions when the advisor was perceived as trustworthy compared to untrustworthy. In addition to the main effects of trustworthiness, we predicted interaction effects between trustworthiness and advice alignment for the confidence ratings and the choice decision time (i.e., how fast participants decide). The predicted interaction for the confidence ratings was based on the findings that the advisor's trustworthiness influences advice discounting and utilisation [53] and that integrating information from others, such as in an advice context, results in a confidence boost [58]. We therefore predicted that participants would be most confident about their decision when the advisor was trustworthy and their decision aligned with the advice, and least confident when the advisor was trustworthy but their decision did not align with the advice. The same logic applies to the choice decision time. Previous research showed that participants decide faster when they follow advice and when they receive advice from a perceived trustworthy advisor [18]. Therefore, the largest difference between aligned and non-aligned decisions was expected for perceived trustworthy advisors compared to perceived untrustworthy advisors.

Method experiment 1

The validation study aimed to investigate whether deep fake manipulations [54] of static computer-generated faces that were previously categorised as trustworthy, untrustworthy, or neutral [4] would maintain their trustworthiness levels when animated.

Participants

Data from 110 participants (Mage = 32, SDage = 9; 87 female, 22 male, 1 not reported) were collected on Prolific [59]. Only participants with English as their native language, an approval rate of at least 70% based on at least 10 submissions on Prolific (www.prolific.com), and no prior participation in our previous studies were allowed. Participants who responded too quickly (N = 4; more than 1.5 × IQR below the 25th percentile), used only one response option (N = 1), or made more than 50% mistakes on our attention check (N = 4) were excluded. One of these participants met multiple exclusion criteria. As a result, the total sample size was 100 (Mage = 32, SDage = 10; 79 female, 20 male, 1 not reported).
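The fast-responder rule above (exclusion of completion times more than 1.5 × IQR below the 25th percentile) can be sketched as follows. The authors performed their preprocessing in R; this is an illustrative Python equivalent with hypothetical completion times.

```python
import numpy as np

def too_fast_cutoff(completion_times):
    """Lower fence used to flag overly fast responders:
    1.5 * IQR below the 25th percentile."""
    q25, q75 = np.percentile(completion_times, [25, 75])
    return q25 - 1.5 * (q75 - q25)

# Hypothetical completion times in minutes
times = np.array([3.0, 12.0, 13.5, 14.3, 15.0, 16.2, 18.0, 20.5])
cutoff = too_fast_cutoff(times)
excluded = times[times < cutoff]  # here only the 3-minute completion is flagged
```

Note that this is a one-sided version of the Tukey fence rule: only unusually fast completions are excluded, since slow completions were handled by Prolific's timeout.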

In addition to the recruited participants, 52 participants returned their submission before completing the experiment, and 5 timed out. The median completion time was 14.30 minutes, and participants were paid at a rate of £5 per hour.

The experiment and applied procedures were conducted according to the guidelines of the ethical committee of Ghent University (approval number: 2020/167).

Apparatus and materials

To create animated versions of static images, deep fake models require two inputs: a source video to provide the animation and a static image to be animated [54]. For the source videos, 40 male individuals (Mage = 41, SDage = 14) with English as their native language were recruited through Prolific. Each participant was asked to introduce themselves in a video (e.g., 'Hi, my name is Mark'). The names were generated using an English name generator (Reedsy). We randomly selected 120 computer-generated faces from the Oosterhof and Todorov database [4], created with FaceGen (www.facegen.com), representing three categories: trustworthy (+3 SD), neutral (0 SD), and untrustworthy (−3 SD), with 40 faces per category. Since deep fake models can only animate static images without incorporating the audio from the source video, the sound was added to the animations after generating the deep fakes. To minimise the impact of the voices on our trust ratings, each voice was used for each level of trustworthiness. Finally, the videos were centred in a 256 × 256 pixel square with a black background for perceptual similarity purposes. Examples of the deep fakes can be found on OSF.

The experiment was conducted using Google Chrome in full-screen mode and programmed in JavaScript with jsPsych [v6.0.5, 60]. Participants were presented with the deep fake videos in a random order and asked to rate the trustworthiness of each virtual character on a scale from 1 (very untrustworthy) to 9 (very trustworthy) using their computer mouse.

Procedure

Participants provided their consent before taking part by pressing the consent button with their computer mouse. In each trial, they viewed a deep fake video lasting between 1 and 3 seconds, in which the avatar introduced themselves with the phrase ‘Hi, my name is X’. After viewing the video, participants selected the ‘continue’ button and were then asked to rate the virtual character’s trustworthiness on a Likert scale ranging from 1 to 9. The order of the videos was randomized. Additionally, participants were required to listen carefully to the names of the virtual faces. On 10 random trials, they were asked to type in the name of the avatar they had just been introduced to. This was used as an attention check and was always presented after participants had rated the avatar.

Design

We had a within-subject design with one factor consisting of three levels (trustworthiness: trustworthy, neutral, and untrustworthy).

Preprocessing and analyses

Preprocessing was done in R [v4.3.1, 61], and analyses were conducted in JASP [v0.18.3, 62]. All trials that deviated by more than 2 standard deviations from the participant's mean decision time were considered outliers (4%). To centre the data around zero, we subtracted the variable mean from each data point. Additionally, we standardised the data variability by dividing the data points by the standard deviation. Both centring and scaling were performed using the scale function (base package, R v4.3.1).
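The 2 SD outlier rule and the centring-and-scaling step (R's scale function) can be sketched in Python; the ratings below are hypothetical and serve only to illustrate the computation.

```python
import numpy as np

def remove_outliers(x, n_sd=2.0):
    """Drop observations more than n_sd standard deviations from the mean."""
    m, s = x.mean(), x.std(ddof=1)
    return x[np.abs(x - m) <= n_sd * s]

def scale(x):
    """Centre on zero and divide by the SD, like R's scale()."""
    return (x - x.mean()) / x.std(ddof=1)

# Hypothetical ratings from one participant
ratings = np.array([4, 5, 6, 5, 4, 6, 5, 5, 5, 20], dtype=float)
trimmed = remove_outliers(ratings)  # drops the extreme rating of 20
scaled = scale(trimmed)             # mean 0, SD 1
```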

The data were analysed using a within-subject repeated measures ANOVA (Type III) [63,64] with trustworthiness as a factor. Post hoc t-tests were conducted to determine the direction of significant effects, and the corresponding p-values were corrected using Holm's correction [65]. Lastly, we ran one-sample t-tests to test whether each level of trustworthiness differed from zero, which served as the reference level due to our centring and scaling approach. Additionally, equivalent Bayesian analyses were performed using the default settings of JASP [v0.18.3, 62] and interpreted according to the JASP guidelines [66].
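Holm's step-down correction, used here for the post hoc t-tests, can be illustrated with a short Python sketch (the p-values below are hypothetical; the authors used JASP, and R's p.adjust implements the same procedure):

```python
import numpy as np

def holm(pvals):
    """Holm's step-down correction: sort p-values ascending, multiply the
    i-th smallest by (m - i), and enforce monotonic non-decreasing output."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    adjusted = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(np.argsort(p)):
        candidate = min(1.0, (m - rank) * p[idx])
        running_max = max(running_max, candidate)  # keep adjusted p monotone
        adjusted[idx] = running_max
    return adjusted

# Three hypothetical pairwise post hoc p-values
print(holm([0.001, 0.04, 0.03]))  # [0.003, 0.06, 0.06]
```

Holm is uniformly more powerful than Bonferroni while still controlling the family-wise error rate, which is why it is a common default for small families of post hoc tests.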

Results

The assumption of sphericity was violated, as confirmed by Mauchly's test of sphericity, W = 0.59, χ²(2) = 51.09, p < .0001. Therefore, the reported results were corrected using the Greenhouse-Geisser correction. Our analyses showed that participants rated the trustworthy faces as the most trustworthy (M = 0.16, SD = 0.47), the untrustworthy faces as the least trustworthy (M = −0.20, SD = 0.52), and the neutral faces in between (M = 0.05, SD = 0.48), F(1.14, 140.80) = 93.00, MSE = 0.050, p < .0001, ηp² = 0.48 (see Fig 1). The post hoc analyses revealed that the trustworthy faces received significantly higher trustworthiness ratings than the neutral and untrustworthy faces (t(99) = 3.93, p < .00001, d = 0.22, BF10 = 31246; t(99) = 13.28, p < .00001, d = 0.74, BF10 > 1000000, respectively). Additionally, the neutral faces were rated as more trustworthy than the untrustworthy faces, t(99) = 9.34, p < .00001, d = 0.52, BF10 > 1000000.

thumbnail
Fig 1. Results of the trustworthiness validation study.

On the x-axis you can find the different levels of trustworthiness (trustworthy, neutral, and untrustworthy), and on the y-axis the scaled and centred trustworthiness ratings. The black dots represent the mean of each level of trustworthiness, and the black bars represent the error bars.

https://doi.org/10.1371/journal.pone.0319039.g001

We conducted a one-sample t-test to compare each level of trustworthiness to the centred level (0). The trustworthy faces received significantly higher ratings than zero (t(99) =  3.39, p = .001, d =  0.33, BF10 =  19), while the untrustworthy faces received significantly lower ratings than zero (t(99) =  − 3.93, p < .001, d =  − 0.39, BF10 =  120). However, the neutral faces did not significantly differ from zero (t(99) =  1.06, p = .29, d =  0.11, BF01 =  5.26).

Discussion experiment 1

In our validation study, we demonstrated that after applying the deep fake procedure to transform the static computer-generated images into a video format, animated trustworthy faces were rated as trustworthy, neutral faces as neutral, and untrustworthy faces as untrustworthy. This was confirmed by both frequentist and Bayesian analyses. Additionally, the centred and scaled data for the neutral faces were not significantly different from zero; by contrast, the trustworthy (larger than zero) and untrustworthy (smaller than zero) faces were significantly different from the baseline (i.e., zero). Overall, the experiment demonstrated the success of our deep fake procedure.

Method experiment 2

Given the conclusive results of our validation study, the same procedure was applied when creating the animated faces used in our main experiment without rerating them. In this preregistered experiment (preregistration), we investigated whether perceived trustworthiness, as manipulated with our deep fake animations, modulates perceptual decision making.

Participants

We recruited 244 participants (Mage = 30.8, SDage = 5.75; 124 female, 119 male, and 2 who did not report their gender) on Prolific [59]. Participants were required to have at least 10 submissions on Prolific (www.prolific.com), to be native English speakers between the ages of 18 and 40, and to not have participated in any of our previous studies. Participants whose accuracy was at or below 66% on the easy coherency trials (N = 23), those with an accuracy at or below 75% on our attention check (N = 31), and those who responded faster or slower than 1.5 × IQR from the 25th or 75th percentile (N = 2) were excluded from the experiment. Of the excluded participants, 10 violated multiple exclusion criteria. The final sample size was 199 (Mage = 30.70, SDage = 5.77; 98 female and 100 male; one participant did not report their gender). We did not deviate from our preregistration.

The sample size was determined through a power analysis using G*Power [67] before the experiment. Since it is currently not possible to calculate the sample size of (general) linear mixed models with G*Power, a classic repeated measures ANOVA design was assumed with a small effect size (f = .10), a power of .80, and an alpha level of .05. Although we recognise that this sample size determination approach is suboptimal for (general) linear mixed models compared to simulations [68], we reasoned that the applied method yields a more conservative estimate of the sample size; after all, linear mixed models account for more unexplained variance as they take both fixed and random effects into account.

In addition to the recruited participants, 73 participants started the experiment but either did not complete it on time or returned their submission. The median completion time was 21.15 minutes, and participants were compensated at a rate of £6 per hour, in accordance with Prolific’s guidelines.

The experiment and applied procedures were conducted according to the guidelines of the ethical committee of Ghent University (approval number: 2024-084W).

Apparatus and materials

For the trustworthiness stimuli, we applied the same deep fake procedure used in our validation study to eight randomly selected faces (four trustworthy, +3 SD, and four untrustworthy, −3 SD) from [4]. This database consists of 100 face identities with three distinct levels of trustworthiness (i.e., trustworthy, neutral, and untrustworthy) for each face identity (i.e., 300 faces in total). As we wanted to ensure that every face had a different face identity, we selected four new faces with a different face identity for each level of trustworthiness (i.e., trustworthy, untrustworthy). The source video was of the first author of this paper saying 'left' or 'right'. We combined this with the voices of four UK-based male native English speakers, which were recorded in a previous experiment [69]. The audio was edited onto the deep fake videos afterwards. The deep fakes were sized 480 × 480 pixels and presented on a black background, serving as the advisors during our experiment.

The random dot motion task was programmed in JavaScript using the jsPsych library [60] and could only be run in Google Chrome. The 'ROK' plugin [70] was used to generate the random dot motion trials. The stimulus consisted of 100 dots presented in an ellipse that moved either to the left or to the right, with a speed setting in ROK of 40 (i.e., percentage of ellipse width/second). The size of the ellipse, the dots, and the moving distance were scaled according to the participants' screen size. This rescaling was also applied to the deep fakes and static images. The task had two levels of difficulty, namely hard trials with a coherency of 5–10% and easy trials with a coherency of 25%. For the easy trials, the advice was always correct; for the hard trials, the advice was sometimes correct (78%) and sometimes incorrect (22%). This resulted in three trial types (i.e., easy, hard-correct, and hard-incorrect). The distinction between hard and easy trials was based on an earlier study, which found stronger social bias effects in hard compared to easy trials [46]. In addition, the differences in advice validity were introduced to mimic the accuracy of the participants, in order to increase the ecological validity of the advice context. Participants were required to indicate the direction of the coherent dots by pressing 'c' for left and 'n' for right. For the confidence ratings, participants indicated their level of confidence by moving a cursor on a continuous scale from 0 (not very confident) to 100 (very confident). The cursor always started in the middle, and participants had to move their mouse before continuing.
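The coherence manipulation can be illustrated with a minimal Python sketch: a fixed fraction of dots is assigned the signal direction while the rest move in random directions. This is only a schematic of the logic; the actual trials were generated by the jsPsych 'ROK' plugin, whose internal implementation may differ.

```python
import numpy as np

def dot_directions(n_dots, coherence, signal_dir, rng):
    """Assign a motion direction (radians) to each dot: a `coherence`
    fraction moves in the signal direction, the rest move randomly."""
    n_coherent = int(round(n_dots * coherence))
    dirs = rng.uniform(0, 2 * np.pi, size=n_dots)  # noise dots, random headings
    dirs[:n_coherent] = signal_dir                 # coherent (signal) dots
    return dirs

rng = np.random.default_rng(0)
# Easy trial: 25% coherence, leftward motion (pi radians)
dirs = dot_directions(100, 0.25, np.pi, rng)
```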

The experiment was conducted in full-screen mode with a black background.

Procedure

Participants agreed with the informed consent form before starting the experiment by pressing the consent button with their computer mouse. An overview of the task was presented, and the participants received the task instructions for each phase. This modified random dot motion paradigm consisted of three phases: an introduction, a random dot motion, and a confidence rating phase (see Fig 2).

thumbnail
Fig 2. One trial of the experimental procedure.

In the figure, an example trial of our experimental procedure is presented. Participants were introduced to the advisor with whom they would play in the current trial. On some trials, a red cross was displayed, and participants were required to press the space bar as quickly as possible (within 1500 ms). This served as an attention task. Prior to the random dot motion task, a fixation cross was displayed for 1000 ms. During the random dot motion task, participants had 3000 ms to indicate the direction of the coherent dots. Simultaneously, the advisor indicated the coherent direction. Participants then indicated their confidence level by moving their mouse along a continuous scale. Following each trial, there was a 2000 ms intertrial interval.

https://doi.org/10.1371/journal.pone.0319039.g002

On each trial, the participants were first introduced (for 1500 ms) to a trustworthy or untrustworthy advisor (i.e., a static image of the deep fake advisor) who would give advice in the second phase, the random dot motion task. On 16% of the trials (counterbalanced over trustworthiness and block levels), a red cross was shown just above the nose of the advisor after 250 or 500 ms, and participants had to press the space bar as fast as possible (max 1500 ms). This task served both to focus participants' attention on the faces and as an attention check. Next, a fixation cross was presented for 1000 ms, after which the second phase started. During the second phase (i.e., the random dot motion task), participants had to indicate the direction of the moving dots; after 300 ms, the advisor, whose face was presented above the random dot task, indicated the direction in which they thought the dots were moving (i.e., by saying 'left' or 'right'). On 50% of the trials the advice was left, and on the remaining half it was right. Participants had 3000 ms to indicate the direction by pressing the 'c' (i.e., left) or 'n' (i.e., right) key. Following a fixation cross of 1000 ms, the last phase started, in which participants had to indicate how confident they were about their decision by moving their mouse on a continuous scale from 0 (not very confident) to 100 (very confident). They had 10000 ms to decide. After an intertrial interval of 2000 ms, a new trial started. In total, this procedure was repeated 96 times, divided over two blocks of 48 trials each. Prior to the experimental phase, participants did a practice run of six trials with a deep fake advisor with a neutral expression. The practice run contained four hard trials with correct advice and two easy trials (i.e., 50% left, 50% right).

Design

Our main measurement of interest was the advice alignment rate, that is, whether the advice and the decision of the participant were the same. Additionally, we measured the choice decision time and the confidence ratings. The experiment used a within-subject design.

Analyses

All trials in which participants responded faster than 100 ms, or that deviated more than 2 standard deviations from the participant's mean, were considered outliers and removed (5%). Additionally, when participants did not respond in time during the confidence ratings (>10000 ms), that trial was considered an outlier and removed from the confidence analyses (.005%).

As written in our preregistration, for our main analyses we only included hard trials with correct advice.

Main analyses.

For the advice alignment rate, we constructed a generalized linear mixed model assuming a binomial distribution with a logit link (i.e., a mixed-effects logistic regression) using the 'lme4' package [v1.1-34, 71,72]. The fixed effect structure consisted of trustworthiness (i.e., trustworthy and untrustworthy), and the random effect structure was determined with the backward selection method [73] (see supplementary materials S1 Table). We calculated the p-values with a Type III Wald chi-square test, and report the odds ratio [74] and its asymptotic 95% confidence interval as the effect size.
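The reported effect size, an odds ratio with an asymptotic 95% Wald confidence interval, is obtained by exponentiating the model's logit coefficient and its interval bounds. A minimal Python sketch, using a hypothetical coefficient and standard error chosen for illustration:

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and asymptotic 95% Wald CI from a logit-scale
    coefficient and its standard error."""
    return (math.exp(beta),           # point estimate
            math.exp(beta - z * se),  # lower bound
            math.exp(beta + z * se))  # upper bound

# Hypothetical coefficient and SE for the trustworthiness effect
or_, lo, hi = odds_ratio_ci(beta=-0.062, se=0.049)
```

An odds ratio below 1 here would mean lower odds of aligning with the trustworthy advisor; a CI spanning 1 means the effect is not statistically reliable.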

For the choice decision time and the confidence ratings, we constructed linear mixed models with 'lmerTest' [v3.1-3, 75], including trustworthiness (i.e., trustworthy, untrustworthy) and advice alignment (i.e., aligned, not aligned) as fixed effects. The random effect structures were determined with the backward selection method [73] (see supplementary materials S1 Table). P-values were calculated with the Satterthwaite (Type III) method in R [v4.3.1, 61]. The corresponding effect sizes and 95% confidence intervals were calculated with the 'effectsize' package [v0.8.6, 76]. To determine the direction of the effects, we ran post hoc t-tests and corrected for multiple comparisons with the false discovery rate. We checked the normality of the residuals by inspecting their distribution and quantile-quantile plots [77].

In addition to the preregistered analyses, equivalence tests were conducted to detect the absence of effects [78,79]. This was done by calculating the overlap between a region of practical equivalence and the 95% confidence interval for each effect. Equivalence tests help to determine whether the observed effect falls within a region of practical equivalence. In the context of equivalence testing, the null hypothesis posits that the observed effect falls outside the region of practical equivalence, whereas the alternative hypothesis asserts that the effect falls within this region. If the null hypothesis of the equivalence test is rejected, the observed effect falls entirely within the region, implying that the difference is not practically significant. Conversely, if the null hypothesis cannot be rejected, the observed effect may lie outside the region of practical equivalence, indicating that the difference may have practical significance [78,79]. We used the bayestestR package [v0.13.1, 80] to determine the region of practical equivalence and ran the equivalence tests according to the 'TOSTER' recommendations [78,79]. In addition to the p-values related to the hypothesis testing, we report the second-generation p-value (sgpv) for each effect in our main analyses; this statistic represents the proportion of the 95% confidence interval that overlaps with the region of practical equivalence [81]. A value close to zero means there was almost no overlap between the region of practical equivalence and the 95% confidence interval, whereas a value close to 1 means an almost complete overlap.

Results

Advice alignment rate

Our analyses revealed that the participants' decisions did not align more with the trustworthy advisor (78.3%, 95% CI [76.3, 80.1]) than with the untrustworthy advisor (79.2%, 95% CI [77.3, 81.0]); χ²(1) = 1.44, p = .23 (see Fig 3A; odds ratio: 0.94, 95% CI [0.86, 1.04]). See supplementary materials S2 Table for the model summary. Our equivalence test revealed that the main effect of trustworthiness fell within our region of practical equivalence, 95% CI [−0.08, 0.02], p < .001, sgpv > .999 (region of practical equivalence = [−0.18, 0.18]), thereby confirming that the difference in trustworthiness has no practical implications.

thumbnail
Fig 3. Graphical depiction results main experiment.

The marginal estimated means for the advice alignment rate (A), choice decision time (B), and confidence ratings (C) (i.e., y-axis) for the second experiment. On the x-axis you can find trustworthiness (i.e., trustworthy, untrustworthy). The graphs for choice decision times and confidence ratings are separated according to whether participants followed (i.e., red) or did not follow (blue) the advisor. The black bars (graph A) and the red and blue bars (graphs B, C) represent the error bars.

https://doi.org/10.1371/journal.pone.0319039.g003

Choice decision time

There was no significant difference in choice decision time between trustworthy (1457 ms, 95% CI [1405, 1510]) and untrustworthy advisors (1458 ms, 95% CI [1406, 1510]), F(1, 155.37) =  0.003, p = .956, η²p =  0, 95% CI [0, 0.00]. Our equivalence test revealed that the main effect of trustworthiness fell within our region of practical equivalence, 95% CI [−10.14, 9.58], p < .001, sgpv > .999 (region of practical equivalence =  [−51.20, 51.20]), confirming that the difference in trustworthiness has no practical implications. However, participants were generally faster when their decision aligned with the advice (1328 ms, 95% CI [1283, 1373]) than when it did not (1588 ms, 95% CI [1526, 1650]), F(1, 182.28) =  215.08, p < .0001, η²p =  0.56, 95% CI [0.46, 0.63]. This was confirmed by our test of equivalence: the main effect of advice alignment [112.52, 147.24] did not fall within the region of practical equivalence ([−51.20, 51.20]), p > .999, sgpv < .001. Lastly, there was no significant interaction (see Fig 3B) between trustworthiness and advice alignment, F(1, 155.02) =  0, p = .991, η²p =  0, 95% CI [0, 0] (see supplementary materials S3 Table for descriptives, and S4 Table for the model summary). Our equivalence test revealed that the interaction between trustworthiness and advice alignment fell within the region of practical equivalence, 95% CI [−9.78, 9.98], p < .001, sgpv > .999, confirming that the interaction effect has no practical implications.

Confidence ratings

Our analyses revealed no significant difference in confidence ratings after deciding when the advisor was trustworthy (62.6, 95% CI [60.8, 64.4]) compared to untrustworthy (62.9, 95% CI [61.1, 64.7]), F(1, 10178) =  0.55, p = .458, η²p =  0, 95% CI [0, 0]. This was confirmed by our equivalence test (region of practical equivalence [−2.04, 2.04]), which showed that the main effect of trustworthiness, [−0.50, 0.23], fell within our region of practical equivalence, p < .001, sgpv > .999. Participants were, however, more confident when their decision aligned with the advice (70, 95% CI [68.4, 71.6]) than when it did not (55.6, 95% CI [53.4, 57.8]), F(1, 191) =  300.51, p < .0001, η²p =  0.68, 95% CI [0.60, 0.74]. Our equivalence tests showed that the main effect of advice alignment ([−8.02, −6.39]) did not fall within the region of practical equivalence, p > .999, sgpv < .001. However, there was no significant interaction (see Fig 3C) between advice alignment and trustworthiness, F(1, 10159) =  3.16, p = .075, η²p =  0, 95% CI [0, 0] (see supplementary materials S5 Table for descriptives, and S6 Table for model summary). Our equivalence tests showed that the interaction effect ([−0.69, 0.03]) fell within our region of equivalence, p < .001, sgpv > .999. As a result, any interaction between advice alignment and trustworthiness on confidence ratings has no practical implications.

Exploratory analyses

For exploratory purposes, we conducted three additional analyses. First, we repeated the analyses above for advice alignment rate, choice decision time, and confidence ratings, but with difficulty (i.e., easy, hard correct, hard incorrect) as a fixed effect. These analyses were preregistered as exploratory. The random effect structure was determined using the backward selection criterion [73] (see supplementary materials S7 Table for our model comparisons). For further details, we refer to the analyses section of Experiment 2. Please note that we were only interested in the influence of difficulty and will therefore only report the main and interaction effects involving difficulty.

Second, we investigated whether participants integrated the advice they received into their decisions. This (non-preregistered) analysis was done to rule out the possibility that participants simply ignored the advice and relied solely on their own performance. Specifically, our study included both hard trials with correct advice and hard trials with incorrect advice. Given that both trial types had the same level of coherence (i.e., 5–10%), the advice alignment rate should differ between them if participants were influenced by the advice they received. To test this, we used a linear mixed model and compared the advice alignment rate across the difficulty levels (i.e., hard correct, hard incorrect, easy). Please note that, for the sake of completeness, we also included the easy trials in the analyses. The fixed effect structure consisted of difficulty (i.e., hard correct, hard incorrect, easy), and the random effect structure of (difficulty|subject). Post hoc analyses were based on the z-ratio and were corrected for multiple comparisons with the false discovery rate. Please note that for the hard trials with incorrect advice, we inverted the advice alignment rate, as aligning with the advice on such trials would lead to the wrong answer. Thus, in the hard trials with incorrect advice, the correct answer was the opposite of the received advice.
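The inversion step can be illustrated with a minimal sketch. The trial records below are hypothetical, for illustration only; the point is simply that on hard trials with incorrect advice, responding *against* the advice is scored as accurate.

```python
# Hypothetical trial records: 'aligned' codes whether the participant's
# response matched the advice (1) or not (0).
trials = [
    {"difficulty": "easy",           "aligned": 1},
    {"difficulty": "hard_correct",   "aligned": 1},
    {"difficulty": "hard_incorrect", "aligned": 1},  # followed bad advice
    {"difficulty": "hard_incorrect", "aligned": 0},  # rejected bad advice
]

def accuracy_score(trial):
    """Invert the alignment score on hard trials with incorrect advice,
    where going against the advice is the correct response."""
    if trial["difficulty"] == "hard_incorrect":
        return 1 - trial["aligned"]
    return trial["aligned"]

print([accuracy_score(t) for t in trials])  # [1, 1, 0, 1]
```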

Third, we explored the association between participants’ confidence and advice alignment rate. Again, this analysis was not preregistered. Previous work suggests that confidence is an important predictor of advice following [53]. Therefore, we investigated the impact of confidence on the previous trial on the probability of aligning with the advice on the current trial. We conducted two analyses: one including only the hard correct trials, and one including all trials. The continuous fixed effect was the scaled confidence rating on the previous trial, and the random effect structure was (confidence|subject) for both analyses.
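Constructing the lagged predictor can be sketched as follows, using hypothetical ratings for a single participant (the actual analysis used mixed models over all participants):

```python
# Sketch of the lagged predictor: the z-scored confidence rating on trial
# t-1 predicts advice alignment on trial t; the first trial of a
# participant has no predecessor and is dropped.
from statistics import mean, stdev

confidence = [60, 80, 40, 70, 90]   # confidence ratings, trials 1-5
aligned    = [1, 1, 0, 1, 0]        # advice alignment, trials 1-5

m, s = mean(confidence), stdev(confidence)
prev_conf_scaled = [(c - m) / s for c in confidence[:-1]]  # predictor
alignment_next = aligned[1:]                               # outcome

pairs = list(zip(prev_conf_scaled, alignment_next))
print(len(pairs))  # 4
```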

Exploratory analysis 1: The effect of difficulty

Advice alignment rate.

When difficulty was included in the model, our analyses showed a significant main effect of difficulty, χ²(2) =  554.69, p < .0001. Post hoc analyses revealed that the advice alignment rate was higher for easy trials (98.4, 95% CI [98.0, 99.2]) compared to hard correct advice trials (78.8, 95% CI [77.0, 80.2]), z =  14.11, p < .0001, odds ratio =  21.04, 95% CI [12.55, 35.30], and hard incorrect advice trials (38.8, 95% CI [35.6, 42.2]), z =  19.20, p < .0001, odds ratio =  123.27, 95% CI [67.63, 224.70]. Likewise, the advice alignment rate was significantly higher for hard trials with correct advice compared to hard trials with incorrect advice, z =  22.09, p < .0001, odds ratio =  5.86, 95% CI [4.84, 7.09]. There was no significant interaction effect between trustworthiness and difficulty (see Fig 4A), χ²(2) =  2.38, p = .304 (see supplementary materials S8 Table for descriptives, and S9 Table for the model summary).

Fig 4. Graphical depiction of the results of the main experiment, including difficulty.

The marginal estimated means of the advice alignment rate (A), choice decision times (B), and confidence ratings (C) including trial difficulty for the second experiment. The x-axis shows trustworthiness (i.e., trustworthy, untrustworthy). The graphs for choice decision times and confidence ratings are separated according to advice alignment (aligned: red, not aligned: blue). The black bars (graph A) and the red and blue bars (graphs B, C) represent error bars.

https://doi.org/10.1371/journal.pone.0319039.g004

Choice decision time.

When difficulty was included, our analyses revealed a significant main effect of difficulty, F(2, 309.1) =  42.78, p < .0001, η²p =  0.22, 95% CI [0.14, 0.29]. Participants were faster on easy trials (1289 ms, 95% CI [1283, 1341]) compared to hard correct advice trials (1459 ms, 95% CI [1408, 1509]), t(911) =  −9.24, p < .0001, β =  −169.28, 95% CI [−213.2, −125.3], and hard incorrect advice trials (1454 ms, 95% CI [1401, 1506]), t(516) =  −8.30, p < .0001, β =  −164.57, 95% CI [−212.2, −116.9]. However, there was no significant difference between hard correct advice and hard incorrect advice trials, t(182) =  0.49, p = .624, β =  4.71, 95% CI [−18.5, 27.9] (see Fig 4B). Likewise, our analysis revealed a significant interaction effect between difficulty and advice alignment, F(2, 7841.8) =  242.39, p < .0001, η²p =  0.06, 95% CI [0.05, 0.07]. Post hoc analyses revealed that in the easy and hard correct advice conditions, participants were faster when their decision aligned with the advice than when it did not, t(2703) =  8.15, p < .0001, β =  259, 95% CI [183, 334.5], and t(233) =  18.43, p < .0001, β =  259, 95% CI [225, 292.6], respectively. However, in the hard incorrect advice trials, participants were faster when their decision did not align with the advice than when it did, t(622) =  −5.88, p < .0001, β =  −105, 95% CI [−148, −62.2] (see supplementary materials S10 Table for descriptives, and S11 Table for the model summary). Moreover, there was no significant interaction effect between trustworthiness and difficulty, F(2, 16287.3) =  0.32, p = .726, η²p =  0, 95% CI [0, 0], nor between trustworthiness, difficulty, and advice alignment, F(2, 15931.0) =  0.85, p = .426, η²p =  0, 95% CI [0, 0].

Confidence ratings.

Our exploratory analyses revealed that participants reported the highest confidence levels on easy trials (71.1, 95% CI [69.1, 73.1]), compared to hard correct advice (63.3, 95% CI [61.7, 64.9]) and hard incorrect advice trials (63.4, 95% CI [61.7, 65.2]), F(2, 358.5) =  41.24, p < .0001, η²p =  0.19, 95% CI [0.12, 0.26]. Post hoc analyses revealed that confidence ratings were significantly higher on easy compared to hard correct advice trials, t(661) =  9.01, p < .0001, β =  7.81, 95% CI [5.73, 9.89], and hard incorrect advice trials, t(445) =  7.72, p < .0001, β =  7.66, 95% CI [5.27, 10.05]. There was no significant difference between hard correct advice and hard incorrect advice trials, t =  −0.35, p = .727, β =  −0.15, 95% CI [−1.14, 0.85]. In a similar vein, there was a significant interaction between advice alignment and difficulty (see Fig 4C), F(2, 14186.5) =  438.80, p < .0001, η²p =  0.06, 95% CI [0.04, 0.07]. On easy and hard correct advice trials, participants were significantly more confident when they aligned with the advice (79.6, 95% CI [78.0, 81.3]; 70.0, 95% CI [68.4, 71.7]) than when they did not (62.5, 95% CI [59.6, 65.5]; 56.5, 95% CI [54.8, 58.2]), t(17292) =  13.49, p < .0001, β =  −17.08, 95% CI [−20.11, −14.05], and t(17485) =  36.79, p < .0001, β =  −13.49, 95% CI [−14.37, −12.61]. However, on hard incorrect advice trials, participants indicated higher confidence when their decision did not align with the advice (66.8, 95% CI [65.0, 68.6]) than when it did (60.0, 95% CI [58.1, 61.9]), t(9525) =  −11.28, p < .0001, β =  6.82, 95% CI [5.37, 8.26] (see supplementary materials S12 Table for descriptives, and S13 Table for the model summary). There was no significant interaction between trustworthiness and difficulty, F(2, 17484.7) =  2.23, p = .108, η²p =  0, 95% CI [0, 0.00], nor between advice alignment, task difficulty, and trustworthiness, F(2, 17569.5) =  2.47, p = .085, η²p =  0, 95% CI [0, 0.00].

Overall, trial difficulty influenced participants’ performance. Participants were more likely to align with the advisor, made faster decisions, and reported higher confidence on easy trials compared to hard trials (both correct and incorrect). Although the advice alignment rate was lowest on trials with incorrect advice, there was no significant difference in confidence ratings or choice decision time between hard correct and hard incorrect trials. Furthermore, on trials with correct advice (i.e., easy, hard correct), confidence ratings were higher and decisions faster when participants aligned with the advice; the opposite pattern was found on hard trials with incorrect advice. Participants thus generally integrated the advice into their decisions, especially on trials with correct advice, but rejected incorrect advice (see also Exploratory analysis 2). However, none of these effects were modulated by the trustworthiness of the advisor.

Exploratory analysis 2: Comparing estimated accuracy

The estimated accuracy was highest on easy trials (98.7%, 95% CI [98.0, 99.2]), intermediate on hard correct advice trials (78.8%, 95% CI [77.0, 80.5]), and lowest on hard incorrect advice trials (61.2%, 95% CI [57.8, 64.4]). Our analyses revealed a significant main effect of difficulty, χ²(2) =  330.1, p < .0001. Post hoc analyses revealed a significant difference in accuracy between easy and hard correct trials (z =  14.08, p < .0001, odds ratio =  21, 95% CI [12.65, 34.85]) and between easy and hard incorrect trials (z =  17.65, p < .0001, odds ratio =  49.50, 95% CI [29.48, 83.11]). Most importantly, the difference between hard correct and hard incorrect trials was also significant, z =  8.89, p < .0001. The probability of responding accurately was higher for hard trials with correct advice (79%, 95% CI [77, 81]) compared to hard trials with incorrect advice (61%, 95% CI [58, 64]), odds ratio =  2.36, 95% CI [1.87, 2.97].

The observed difference between hard correct and hard incorrect trials suggests that participants were affected by the advice provided. When inspecting individual advice alignment rates, only 15 out of 199 participants aligned with the advice on more than 70% of the trials, suggesting that the task was sufficiently challenging for most participants and that the observed variation in advice alignment was not due to a ceiling effect.

Exploratory analysis 3: Confidence and advice alignment rate

Our analyses suggested that when participants were more confident on the previous trial, they were less likely to follow the advice on the current trial. However, this effect was not significant for the hard correct trials, χ²(1) =  2.40, p = .122, β = −0.05, 95% CI [−0.11, 0.01], nor when including all trials, χ²(1) =  0.54, p = .464, β = −0.02, 95% CI [−0.06, 0.03]. These non-significant effects could be attributed to the limited number of trials and the randomisation of trial difficulty.

General discussion

In this study, we examined the effect of perceived trustworthiness on perceptual decision-making. We successfully created animated versions of computer-generated faces that were previously evaluated as trustworthy or untrustworthy [4]. We applied deep fake models to these static images, and these were combined with audio to produce trustworthy and untrustworthy faces that could give verbal advice. We validated this novel procedure in Experiment 1, and our results showed that trustworthy faces were rated significantly more trustworthy than neutral or untrustworthy faces. Additionally, neutral faces did not significantly differ from the midpoint of our ratings, while trustworthy faces were rated significantly above it, and untrustworthy faces were rated significantly below it. Thus, applying deep fake models to computer-generated faces maintains the trustworthiness properties of the faces [4,7,12,82]. We believe that this novel approach offers researchers the opportunity to create interactive, animated, and more socially engaging versions of established facial databases, potentially enabling a more ecologically valid methodology in the study of social cognition and first impressions.

In Experiment 2, we applied this methodology to create a new set of advisors, which served as ‘trustworthy’ and ‘untrustworthy’ interaction partners in a decision-making task. Our exploratory control analyses suggested that participants integrated the advice into their decision-making process. However, contrary to our predictions, there was no reliable influence of perceived trustworthiness on the advice alignment rate, confidence ratings, or the speed with which participants made their decisions. The absence of an effect was further supported by our equivalence tests for all main effects of trustworthiness (i.e., on advice alignment rate, choice decision time, and confidence ratings) and for the interactions between trustworthiness and advice alignment (i.e., on choice decision time and confidence ratings).

Participants reported higher confidence and made decisions more quickly when their decisions aligned with the advice than when they did not, regardless of the trustworthiness of the deep fake advisors. For the confidence ratings, this is in line with studies indicating a confidence boost when using advice or information from multiple sources [58]. Similarly, our findings for choice decision times align with previous research demonstrating that participants made faster decisions when their choices matched the advice than when they did not [18]. Alternatively, this pattern can be explained by task accuracy: given that we only examined hard trials with correct advice, it is also possible that participants were more confident simply because their decision was correct.

The lack of an effect of trustworthiness is surprising, given the substantial evidence that perceived trustworthiness influences economic choices [17,83], criminal judgements [21], risk-taking behaviour [19], consumer decisions on online platforms [84], and loan decisions by creditors [85]. Previous research demonstrated that perceptual trustworthiness also has an impact on advice taking in a value-based decision-making context [18]. We did not find such effects here. However, in the studies conducted by [18], participants engaged in a two-choice decision-making task where they selected one of two doors to find the one hiding a reward. Such tasks are considered subjective, as there are multiple options and no objectively correct answer. In the current study, the task is objective, and there is only one correct answer (i.e., the coherent movement is either to the left or to the right). In their meta-analysis, [34] found that individuals are more receptive to advice in subjective tasks with multiple answers [52,53,83] compared to objective tasks, potentially explaining the difference between the present study and previous work. In a similar vein, participants could rely on their own skills to complete the task. It is therefore possible that participants simply did not pay attention to the social characteristics of the advisors, or did not even look at the advisors at all, at least when deciding on the direction of the moving dots. For example, in earlier work showing social biases towards perceived trustworthy faces, the only cues the participants could rely on were the faces and the feedback following their decision [18]. Moreover, the advice in that work was not auditory; instead, the advisors moved towards the option they deemed best. There, an initial bias to follow perceived trustworthy advisors was present, but this effect disappeared throughout the experiment.

An alternative interpretation of the null effect is that perceived trustworthiness is simply not a predictor of advice quality within a perceptual decision-making context. Research has shown that the influence of facial features, such as trustworthiness, is domain specific (for a review, see [7]). For instance, when it comes to elections, competence is a significant predictor [86,87], whereas in a military setting, facial dominance is an important factor for career advancement [88]. In the context of perceptual decision-making, earlier work demonstrated that the task reputation of an interaction partner (i.e., very good or very bad at the task), as indicated by visual cues (1 star =  low reputation, 3 stars =  high reputation), influences advice utilisation [46]. Crucially, task reputation was also reflected in the actual accuracy of the advisor (i.e., high reputation – more accurate, low reputation – less accurate). Our paradigm was conceptually similar to that of [46], but our findings differ. In our experiment, the advice accuracy was identical for trustworthy and untrustworthy advisors, meaning that perceptual trustworthiness was not indicative of advice accuracy. It is possible that participants recognised this lack of connection. Consequently, our results provide further evidence that social biases are more likely to emerge only when there is a direct association between the social trait and the quality of the advice, specifically the accuracy of the advisors. This appears to hold true, at least, in an objective task context [34].

Lastly, it is possible that our advice alignment measure was not sensitive enough to detect potentially small effects of perceived trustworthiness. While we were able to measure when participants’ decisions aligned with the advice, how fast participants decided, and how confident they were after deciding, it was not possible to directly measure to what extent participants changed their opinion based on the advice [34,89,90]. One paradigm allowing such measures is the judge-advisor system [34,36,39,90]. Future studies could integrate deep fake manipulations with advice-taking paradigms such as the judge-advisor system to further explore social biases within an advice-taking context. Moreover, one could make the task more cooperative, for example by providing a reward for team effort [91,92] or, more generally, a reward for a correct answer [58]. This may encourage participants to engage more with the advisor and attend more closely to their social characteristics.

Overall, our findings demonstrate that, unlike value-based decision-making [18], perceptual decision-making does not seem to be modulated by the perceived trustworthiness of advisors. This highlights that social biases, such as those based on perceived trustworthiness, are potentially domain-specific [7] and that their influence depends on the task requirements [34].

Conclusion

Advice is an essential and interactive aspect of our daily decision-making. However, the impact of an advisor’s social characteristics on advice taking, especially in objective task contexts, is not well understood. In this study, we successfully developed a new procedure to create dynamic trustworthy and untrustworthy faces. In a subsequent step, we examined the impact of the perceptual trustworthiness of these faces on perceptual decision-making. Although we observed an effect of advice alignment on confidence ratings and choice decision times, we did not find any modulation by the trustworthiness of the advisor. We hypothesise that this may be due to the absence of a correlation between the trustworthiness characteristics and the quality of the advice, the domain specificity of perceptual trustworthiness, or the task requirements.

Supporting information

S2 Table. Model summary for the advice alignment rate (logit).

https://doi.org/10.1371/journal.pone.0319039.s002

(DOCX)

S3 Table. Descriptives estimated marginal means choice decision time.

https://doi.org/10.1371/journal.pone.0319039.s003

(DOCX)

S4 Table. Model summary for the choice decision time.

https://doi.org/10.1371/journal.pone.0319039.s004

(DOCX)

S5 Table. Descriptives estimated marginal means confidence ratings.

https://doi.org/10.1371/journal.pone.0319039.s005

(DOCX)

S6 Table. Model summary for the confidence ratings.

https://doi.org/10.1371/journal.pone.0319039.s006

(DOCX)

S7 Table. Models experiment 2 including difficulty.

https://doi.org/10.1371/journal.pone.0319039.s007

(DOCX)

S8 Table. Descriptives estimated marginal means advice alignment rate including difficulty.

https://doi.org/10.1371/journal.pone.0319039.s008

(DOCX)

S9 Table. Model summary for advice alignment including difficulty (logit).

https://doi.org/10.1371/journal.pone.0319039.s009

(DOCX)

S10 Table. Descriptives estimated marginal means choice decision time including difficulty.

https://doi.org/10.1371/journal.pone.0319039.s010

(DOCX)

S11 Table. Model summary for the choice decision time including difficulty.

https://doi.org/10.1371/journal.pone.0319039.s011

(DOCX)

S12 Table. Descriptives estimated marginal means confidence ratings including difficulty.

https://doi.org/10.1371/journal.pone.0319039.s012

(DOCX)

S13 Table. Model summary for confidence ratings including difficulty.

https://doi.org/10.1371/journal.pone.0319039.s013

(DOCX)

Acknowledgments

We used DeepL Write and ChatGPT-4o for spelling and language checks. We thank Dr. Zhang Chen for proofreading the manuscript.

References

  1. 1. Kameda T, Toyokawa W, Tindale RS. Information aggregation and collective intelligence beyond the wisdom of crowds. Nat Rev Psychol. 2022;1(6):345–57.
  2. 2. Ache F, Rader C, Hütter M. Advisors want their advice to be used – but not too much: An interpersonal perspective on advice taking. J Exp Soc Psychol. 2020;89:103979.
  3. 3. Belkin LY, Kong DT. Implications of advice rejection in repeated exchanges: Advisor responses and advisee gratitude expression as a buffer. J Exp Soc Psychol. 2018;78:181–94.
  4. 4. Oosterhof NN, Todorov A. The functional basis of face evaluation. Proc Natl Acad Sci U S A. 2008;105(32):11087–92. pmid:18685089
  5. 5. Pandeirada JNS, Fernandes NL, Madeira M, Marinho PI, Vasconcelos M. Can I trust this person? Evaluations of trustworthiness from faces and relevant individual variables. Front Psychol. 2022;13:857511. pmid:35619794
  6. 6. Todorov A. Evaluating faces on trustworthiness: An extension of systems for recognition of emotions signaling approach/avoidance behaviors. Ann N Y Acad Sci. 2008;1124:208–24. pmid:18400932
  7. 7. Todorov A, Olivola CY, Dotsch R, Mende-Siedlecki P. Social attributions from faces: Determinants, consequences, accuracy, and functional significance. Annu Rev Psychol. 2015;66:519–45. pmid:25196277
  8. 8. Krueger F, McCabe K, Moll J, Kriegeskorte N, Zahn R, Strenziok M, et al. Neural correlates of trust. Proc Natl Acad Sci U S A. 2007;104(50):20084–9. pmid:18056800
  9. 9. Krueger F, Meyer-Lindenberg A. Toward a model of interpersonal trust drawn from neuroscience, psychology, and economics. Trends Neurosci. 2019;42(2):92–101. pmid:30482606
  10. 10. Krumhuber E, Manstead ASR, Cosker D, Marshall D, Rosin PL, Kappas A. Facial dynamics as indicators of trustworthiness and cooperative behavior. Emotion. 2007;7(4):730–5. pmid:18039040
  11. 11. De Neys W, Hopfensitz A, Bonnefon J-F. Split-second trustworthiness detection from faces in an economic game. Exp Psychol. 2017;64(4):231–9. pmid:28922996
  12. 12. Todorov A, Pakrashi M, Oosterhof NN. Evaluating faces on trustworthiness after minimal time exposure. Soc Cogn. 2009;27(6):813–33.
  13. 13. Willis J, Todorov A. First impressions: Making up your mind after a 100-ms exposure to a face. Psychol Sci. 2006;17(7):592–8. pmid:16866745
  14. 14. Cogsdill EJ, Todorov AT, Spelke ES, Banaji MR. Inferring character from faces: A developmental study. Psychol Sci. 2014;25(5):1132–9. pmid:24570261
  15. 15. Horta M, Shoenfelt A, Lighthall NR, Perez E, Frazier I, Heemskerk A, et al. Age-group differences in trust-related decision-making and learning. Sci Rep. 2024;14(1):68. pmid:38167997
  16. 16. Sakuta Y, Kanazawa S, Yamaguchi MK. Infants prefer a trustworthy person: An early sign of social cognition in infants. PLoS One. 2018;13(9):e0203541. pmid:30188941
  17. 17. van ’t Wout M, Sanfey AG. Friend or foe: The effect of implicit trustworthiness judgments in social decision-making. Cognition. 2008;108(3):796–803. pmid:18721917
  18. 18. Van der Biest M, Verschooren S, Verbruggen F, Brass M. Don’t judge a book by its cover: The effect of perceived facial trustworthiness on advice following in the context of value-based decision-making. J Exp Soc Psychol. 2025;118:104719.
  19. 19. Qi Y, Luo Y, Feng Y, Chen Z, Li Q, Du F, et al. Trustworthy faces make people more risk-tolerant: The effect of facial trustworthiness on risk decision-making under gain and loss conditions. Psych J. 2022;11(1):43–50. pmid:34747121
  20. 20. Flowe HD. Do characteristics of faces that convey trustworthiness and dominance underlie perceptions of criminality? PLoS One. 2012;7(6):e37253. pmid:22675479
  21. 21. Porter S, Ten Brinke L, Gustaw C. Dangerous decisions: The impact of first impressions of trustworthiness on the evaluation of legal evidence and defendant culpability. Psychol Crime Law. 2010;16(6):477–91.
  22. 22. Rule NO, Slepian ML, Ambady N. A memory advantage for untrustworthy faces. Cognition. 2012;125(2):207–18. pmid:22874071
  23. 23. Hauser CK, Salinas E. Perceptual decision making. In: Jaeger D, Jung R, editors. Encyclopedia of Computational Neuroscience. New York, NY: Springer; 2013. p. 1–21. https://doi.org/10.1007/978-1-4614-7320-6_317-1
  24. 24. van Maanen L, Grasman RPPP, Forstmann BU, Keuken MC, Brown SD, Wagenmakers E-J. Similarity and number of alternatives in the random-dot motion paradigm. Atten Percept Psychophys. 2012;74(4):739–53. pmid:22287207
  25. 25. Baumgart KG, Byvshev P, Sliby A-N, Strube A, König P, Wahn B. Neurophysiological correlates of collective perceptual decision-making. Eur J Neurosci. 2020;51(7):1676–96. pmid:31418946
  26. 26. Bahrami B, Didino D, Frith C, Butterworth B, Rees G. Collective enumeration. J Exp Psychol Hum Percept Perform. 2013;39(2):338–47. pmid:22889187
  27. 27. Hamada D, Nakayama M, Saiki J. Wisdom of crowds and collective decision-making in a survival situation with complex information integration. Cogn Res Princ Implic. 2020;5(1):48. pmid:33057843
  28. 28. Laughlin PR, Hatch EC, Silver JS, Boh L. Groups perform better than the best individuals on letters-to-numbers problems: Effects of group size. J Pers Soc Psychol. 2006;90(4):644–51. pmid:16649860
  29. 29. Wahn B, Karlinsky A, Schmitz L, König P. Let’s move it together: A review of group benefits in joint object control. Front Psychol. 2018;9:918. pmid:29930528
  30. 30. Martel C, Allen J, Pennycook G, Rand DG. Crowds can effectively identify misinformation at scale. Perspect Psychol Sci. 2024;19(2):477–88. pmid:37594056
  31. 31. Bahrami B, Olsen K, Latham PE, Roepstorff A, Rees G, Frith CD. Optimally interacting minds. Science. 2010;329(5995):1081–5. pmid:20798320
  32. 32. Pescetelli N, Rees G, Bahrami B. The perceptual and social components of metacognition. J Exp Psychol Gen. 2016;145(8):949–65. pmid:27454040
  33. 33. Bang D, Fusaroli R, Tylén K, Olsen K, Latham PE, Lau JYF, et al. Does interaction matter? Testing whether a confidence heuristic can replace interaction in collective decision-making. Conscious Cogn. 2014;26(100):13–23. pmid:24650632
  34. 34. Bailey PE, Leon T, Ebner NC, Moustafa AA, Weidemann G. A meta-analysis of the weight of advice in decision-making. Curr Psychol. 2023;42(28):24516–41. pmid:39711945
  35. 35. Yaniv I, Choshen-Hillel S. When guessing what another person would say is better than giving your own opinion: Using perspective-taking to improve advice-taking. J Exp Soc Psychol. 2012;48(5):1022–8.
  36. 36. Yonah M, Kessler Y. “They don’t know better than I do”: People prefer seeing for themselves over using the wisdom of crowds in perceptual decision making. J Cogn. 2021;4(1):30. pmid:34222788
  37. 37. Herzog SM, Hertwig R. The wisdom of many in one mind: Improving individual judgments with dialectical bootstrapping. Psychol Sci. 2009;20(2):231–7. pmid:19170937
  38. 38. Meshi D, Biele G, Korn CW, Heekeren HR. How expert advice influences decision making. PLoS One. 2012;7(11):e49748. pmid:23185425
  39. 39. Bonaccio S, Dalal RS. Advice taking and decision-making: An integrative literature review, and implications for the organizational sciences. Organ Behav Hum Decis Process. 2006;101(2):127–51.
  40. 40. Cialdini RB. Influence: The psychology of persuasion. Rev. ed., [Nachdr.]. New York, NY: Collins; 2007.
  41. Pescetelli N, Hauperich A-K, Yeung N. Confidence, advice seeking and changes of mind in decision making. Cognition. 2021;215:104810. pmid:34147712
  42. Wang X, Du X. Why does advice discounting occur? The combined roles of confidence and trust. Front Psychol. 2018;9:2381. pmid:30555394
  43. Ciranka SK, van den Bos W. A Bayesian model of social influence under risk and uncertainty. PsyArXiv; 2020. https://doi.org/10.31234/osf.io/mujek
  44. Melamed D, Savage SV, Munn C. Uncertainty and social influence. Socius: Sociol Res Dyn World. 2019;5.
  45. Pfeffer J, Salancik GR, Leblebici H. The effect of uncertainty on the use of social influence in organizational decision making. Adm Sci Q. 1976;21(2):227.
  46. Qi S, Footer O, Camerer CF, Mobbs D. A collaborator’s reputation can bias decisions and anxiety under uncertainty. J Neurosci. 2018;38(9):2262–9. pmid:29378862
  47. Van Swol LM, Ludutsky CL. Tell me something I don’t know: Decision makers’ preference for advisors with unshared information. Commun Res. 2007;34(3):297–312.
  48. Van Swol LM, Sniezek JA. Factors affecting the acceptance of expert advice. Br J Soc Psychol. 2005;44(Pt 3):443–61. pmid:16238848
  49. Morgan TJH, Rendell LE, Ehn M, Hoppitt W, Laland KN. The evolutionary basis of human social learning. Proc Biol Sci. 2012;279(1729):653–62. pmid:21795267
  50. Lynn FB, Podolny JM, Tao L. A sociological (de)construction of the relationship between status and quality. Am J Sociol. 2009;115(3):755–804.
  51. Jaeger B, Evans A, Stel M, van Beest I. Explaining the persistent influence of facial cues in social decision-making. 2018.
  52. Clements MF, Brübach L, Glazov J, Gu S, Kashif R, Catmur C, et al. Measuring trust with the wayfinding task: Implementing a novel task in immersive virtual reality and desktop setups across remote and in-person test environments. PLoS One. 2023;18(11):e0294420. pmid:38015928
  53. Van der Biest M, Cracco E, Wisniewski D, Brass M, González-García C. Investigating the effect of trustworthiness on instruction-based reflexivity. Acta Psychol (Amst). 2020;207:103085. pmid:32416515
  54. Siarohin A, Lathuilière S, Tulyakov S, Ricci E, Sebe N. First order motion model for image animation. Adv Neural Inf Process Syst. 2019. Curran Associates, Inc. Available from: https://papers.nips.cc/paper/2019/hash/31c0b36aef265d9221af80872ceb62f9-Abstract.html
  55. Skewes J, Frith C, Overgaard M. Awareness and confidence in perceptual decision-making. Brain Multiphys. 2021;2:100030.
  56. Vafaei Shooshtari S, Esmaily Sadrabadi J, Azizi Z, Ebrahimpour R. Confidence representation of perceptual decision by EEG and eye data in a random dot motion task. Neuroscience. 2019;406:510–27. pmid:30904664
  57. Zizlsperger L, Sauvigny T, Händel B, Haarmeier T. Cortical representations of confidence in a visual perceptual decision. Nat Commun. 2014;5:3940. pmid:24899466
  58. Lorenz J, Rauhut H, Schweitzer F, Helbing D. How social influence can undermine the wisdom of crowd effect. Proc Natl Acad Sci U S A. 2011;108(22):9020–5. pmid:21576485
  59. Prolific [Internet]. 2024. Available from: https://www.prolific.com
  60. de Leeuw JR. jsPsych: A JavaScript library for creating behavioral experiments in a Web browser. Behav Res Methods. 2015;47(1):1–12. pmid:24683129
  61. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2023. Available from: https://www.R-project.org/
  62. JASP Team. JASP (Version 0.18.3) [Computer software]. 2024.
  63. de Winter J, Dodou D. Five-point Likert items: t test versus Mann–Whitney–Wilcoxon. Pract Assess Res Eval. 2010;15.
  64. Fagerland MW, Sandvik L, Mowinckel P. Parametric methods outperformed non-parametric methods in comparisons of discrete numerical variables. BMC Med Res Methodol. 2011;11:44. pmid:21489231
  65. Haynes W. Holm’s method. In: Dubitzky W, Wolkenhauer O, Cho K-H, Yokota H, editors. Encyclopedia of systems biology. New York, NY: Springer; 2013. p. 902. https://doi.org/10.1007/978-1-4419-9863-7_1214
  66. van Doorn J, van den Bergh D, Böhm U, Dablander F, Derks K, Draws T, et al. The JASP guidelines for conducting and reporting a Bayesian analysis. Psychon Bull Rev. 2021;28(3):813–26. pmid:33037582
  67. Faul F, Erdfelder E, Lang A-G, Buchner A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods. 2007;39(2):175–91. pmid:17695343
  68. Brysbaert M, Stevens M. Power analysis and effect size in mixed effects models: A tutorial. J Cogn. 2018;1(1):9. pmid:31517183
  69. Van der Biest M, Pedinoff R, Verbruggen F, Brass M, Kuhlen AK. Instructing somebody else to act: Motor co-representations in the instructor. R Soc Open Sci. 2024;11(1):230839. pmid:38204793
  70. Strittmatter Y, Spitzer MWH, Kiesel A. A random-object-kinematogram plugin for web-based research: Implementing oriented objects enables varying coherence levels and stimulus congruency levels. Behav Res Methods. 2023;55(2):883–98. pmid:35503167
  71. Baayen RH, Davidson DJ, Bates DM. Mixed-effects modeling with crossed random effects for subjects and items. J Mem Lang. 2008;59(4):390–412.
  72. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67(1).
  73. Matuschek H, Kliegl R, Vasishth S, Baayen H, Bates D. Balancing Type I error and power in linear mixed models. J Mem Lang. 2017;94:305–15.
  74. Kim H-Y. Statistical notes for clinical researchers: Chi-squared test and Fisher’s exact test. Restor Dent Endod. 2017;42(2):152–5. pmid:28503482
  75. Kuznetsova A, Brockhoff PB, Christensen RHB. lmerTest package: Tests in linear mixed effects models. J Stat Softw. 2017;82(13).
  76. Ben-Shachar M, Lüdecke D, Makowski D. effectsize: Estimation of effect size indices and standardized parameters. JOSS. 2020;5(56):2815.
  77. Ghasemi A, Zahediasl S. Normality tests for statistical analysis: A guide for non-statisticians. Int J Endocrinol Metab. 2012;10(2):486–9. pmid:23843808
  78. Lakens D. Equivalence tests: A practical primer for t tests, correlations, and meta-analyses. Soc Psychol Personal Sci. 2017;8(4):355–62. pmid:28736600
  79. Lakens D, Scheel AM, Isager PM. Equivalence testing for psychological research: A tutorial. Adv Methods Pract Psychol Sci. 2018;1(2):259–69.
  80. Makowski D, Ben-Shachar M, Lüdecke D. bayestestR: Describing effects and their uncertainty, existence and significance within the Bayesian framework. JOSS. 2019;4(40):1541.
  81. Blume JD, Greevy RA, Welty VF, Smith JR, Dupont WD. An introduction to second-generation p-values. Am Stat. 2019;73(sup1):157–67.
  82. Todorov A, Oosterhof N. Modeling social perception of faces [social sciences]. IEEE Signal Process Mag. 2011;28(2):117–22.
  83. Chang LJ, Doll BB, van ’t Wout M, Frank MJ, Sanfey AG. Seeing is believing: Trustworthiness as a dynamic belief. Cogn Psychol. 2010;61(2):87–105. pmid:20553763
  84. Ert E, Fleischer A, Magen N. Trust and reputation in the sharing economy: The role of personal photos in Airbnb. Tour Manag. 2016;55:62–73.
  85. Duarte J, Siegel S, Young L. Trust and credit: The role of appearance in peer-to-peer lending. Rev Financ Stud. 2012;25(8):2455–84.
  86. Olivola CY, Todorov A. Elected in 100 milliseconds: Appearance-based trait inferences and voting. J Nonverbal Behav. 2010;34(2):83–110.
  87. Sussman AB, Petkova K, Todorov A. Competence ratings in US predict presidential election outcomes in Bulgaria. J Exp Soc Psychol. 2013;49(4):771–5.
  88. Mueller U, Mazur A. Facial dominance of West Point cadets as a predictor of later military rank. Soc Forces. 1996;74(3):823.
  89. Hütter M, Ache F. Seeking advice: A sampling approach to advice taking. Judgm Decis Mak. 2016;11(4):401–15.
  90. Sniezek JA, Buckley T. Cueing and cognitive conflict in judge-advisor decision making. Organ Behav Hum Decis Process. 1995;62(2):159–74.
  91. Engemann DA, Bzdok D, Eickhoff SB, Vogeley K, Schilbach L. Games people play—toward an enactive view of cooperation in social neuroscience. Front Hum Neurosci. 2012;6:148. pmid:22675293
  92. Tomasello M. Why we cooperate. Cambridge, MA: MIT Press; 2009. p. xviii, 206.