An active inference model reveals a failure to adapt interoceptive precision estimates across depression, anxiety, eating, and substance use disorders

Recent neurocomputational theories have hypothesized that abnormalities in prior expectations and/or the precision-weighting of afferent interoceptive signals (i.e., the degree to which afferent bodily signals contribute to interoceptive perceptual inference), may facilitate the transdiagnostic emergence of psychopathology. Specifically, it has been suggested that, in certain psychiatric disorders, interoceptive processing mechanisms either over-weight prior expectations or under-weight signals from the viscera (or both), leading to a failure to accurately update beliefs about the body (potentially resulting in visceral dysregulation among other maladaptive phenomena). However, this has not been directly tested empirically. To evaluate the potential roles of prior expectations and interoceptive precision in this context, we fitted behavior in a transdiagnostic patient population on an interoceptive awareness (heartbeat tapping) task to a Bayesian computational model based on active inference (i.e., in which interoceptive inference can be cast as prediction-error minimization). Modeling revealed that, during an interoceptive perturbation condition (inspiratory breath-holding during heartbeat tapping), healthy individuals (N=52) assigned greater precision to ascending cardiac signals than individuals with symptoms of anxiety (N=15), depression (N=69), co-morbid depression/anxiety (N=153), substance use disorders (N=131), and eating disorders (N=14) - who failed to increase their precision estimates from resting levels. In contrast, no differences were found in prior expectations. These results provide the first empirical computational modeling evidence of a selective dysfunction in adaptive interoceptive processing in psychiatric conditions, and lay the groundwork for future studies examining how reduced interoceptive precision influences body regulation and interoceptively-guided decision-making.

medRxiv preprint doi: https://doi.org/10.1101/2020.06.03.20121343; this version posted June 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Introduction

[...] mental disorders may be an inability of the brain to update its model of the body in the face of interoceptive prediction errors (i.e., mismatches between expected and received afferent interoceptive signals from the body). In predictive coding and active inference models, this kind of aberrant belief updating is thought to come about through a dysfunctional "precision weighting" mechanism, which governs the relative influence of prior expectations and afferent bodily signals in determining perception (and informing visceral regulation). Simply put, it is suggested that, across multiple mental health conditions, the brain may treat afferent bodily signals (and associated prediction errors) as though they are not reliable indicators of bodily states during interoceptive inference, leading perception to be insufficiently constrained by true visceral states and primarily determined by (in many cases maladaptive) prior expectations. Misestimating the state of the body could in turn promote a number of transdiagnostic symptoms. For example, interoceptive feelings are intimately tied to emotions (57-59); poor body perception (e.g., high uncertainty about internal physiological conditions and a resulting inability to efficiently regulate them) could thus maintain unpleasant emotional states. Chronic underestimates of available metabolic resources may contribute to apathy and anhedonia (56), and overestimates of the evidence uncomfortable bodily sensations provide for physical threat (e.g., a heart attack) may contribute to anxiety and panic (60,61).

It is important to emphasize, however, that computational models often include several additional parameters. Aside from the precision-weighting of sensory signals, individuals can also differ in (for example) prior expectations about what they will perceive (assumed in any Bayesian model of perception; e.g., predictive coding (62)), and in how quickly they update those prior expectations over repeated observations (i.e., "learning rate"). Formal computational models are often necessary to distinguish which parameters show differences between individuals and best explain differences in perception. Thus, there are multiple computational mechanisms that could account for individual differences in interoception in clinical populations. One major goal in computational psychiatry is to "computationally phenotype" patients by identifying which sets of parameter values best account for their neural and behavioral responses (including self-reported perceptual experience) and to use this information to guide treatment (20,22,25).

In the present study, we apply a novel computational phenotyping approach, using a Bayesian (active inference) model of perception, to identify the computational parameters that best explain behavior on a cardiac perception (heartbeat tapping) task performed by a transdiagnostic clinical sample of individuals with psychiatric disorders as well as a healthy comparison (HC) sample. Given the notably poor cardiac perception of most human beings at rest (6), we chose to assess individual differences in cardiac interoceptive precision in the context of normal physiological baseline states as well as during a non-invasive interoceptive perturbation (a breath-hold) condition that was expected to improve cardiac perception above floor values in a greater number of individuals (i.e., we expected that cardiac perception would be generally poor during resting conditions, and that the breath-hold condition would result in improved performance on average).
Our primary aims were to 1) demonstrate the sensitivity of our novel computational approach in measuring the precision weighting of interoceptive signals and prior expectations across a transdiagnostic sample of individuals with depression, anxiety, substance use disorders, and/or eating disorders, 2) test the hypothesis (as previously proposed; e.g., (10,38,56)) that these patient groups would show lower interoceptive precision weightings than HCs, more precise prior expectations than HCs, or both, and 3) explore whether prior expectations and/or interoceptive precision is abnormal in general or selectively within resting or interoceptive [...]

Heartbeat perception task

As part of the T1000 project, participants completed a large number of assessments, self-report measures, and behavioral tasks (detailed in (63)). Here we focus on data from a cardiac perception task on which we have previously reported (i.e., on a subset of the participants reported here, with analyses unrelated to computational modeling (70, 71)), wherein participants were asked to behaviorally indicate the times at which they felt their heartbeat. The use of the heartbeat tapping measure as an index of perception was based on a previously developed heartbeat tapping task (40; for a more recent example, see (72)). The task was repeated under multiple conditions designed to assess the influence of cognitive strategy and physiological perturbation on performance. In the initial task condition, participants were simply instructed to close their eyes and press down on a key when they felt their heartbeat, to try to mirror their heartbeat as closely as possible, and, even if they weren't sure, to take their best guess (the "guessing" condition). Participants completed this (and each other) task condition over a period of 60 seconds.
In the second task condition, all instructions were identical except that participants were told to press the key only when they actually felt their heartbeat, and not to press the key if they did not feel it (the "no-guessing" condition). In other words, unlike the first time they completed the task, they were specifically instructed not to guess if they didn't feel anything. Finally, in the perturbation condition, participants were again instructed not to guess but were also asked to first empty their lungs of all air and then inhale as deeply as possible and hold it for as long as they could tolerate (up to the length of the one-minute trial) while reporting their perceived heartbeat sensations. This third condition (the "breath-hold" condition) was used in an attempt to increase the strength of the afferent cardiac signal by increasing physiological arousal. We expected that cardiac perception would be poor in the guessing condition, that tapping would be more conservative in the no-guessing condition, and that the breath-hold condition would result in improved performance on average. As a control condition, we also included an identical task in which participants were instructed to tap every time they heard a 1000 Hz auditory tone presented for 100 ms (78 tones, randomly jittered by +/-10% and presented in a pattern following a sine curve with a frequency of 13 cycles/minute, mimicking the range of respiratory sinus arrhythmia during a normal breathing range of 13 breaths per minute). This was completed between the first (guessing) and second (no-guessing) heartbeat tapping conditions.

Directly after completing each task condition, individuals were asked the following using a visual analogue scale:
"How accurate was your performance?"
"How difficult was the previous task?"
"How intensely did you feel your heartbeat?"
Each scale had anchors of "not at all" and "extremely" on the two ends. Numerical scores could range from 0 to 100.

Computational model

To model behavior on the heartbeat tapping task, we first divided each task time series into intervals corresponding to the periods of time directly before and after each heartbeat. Potentially perceivable heartbeats were based on the timing of the peak of the electrocardiogram (EKG) R-wave (signaling electrical depolarization of the atrioventricular neurons of the heart) + 200 milliseconds (ms). This 200 ms interval was considered a reasonable estimate of participants' pulse transit time (PTT) according to previous estimates for the ear PTT (73). We also measured the average PTT of each participant, defined as the distance between the peak of the EKG R-wave and the onset of the peak of the PPG waveform (signaling mechanical transmission of the systolic pressure wave to the earlobe). The length of each heartbeat interval (i.e., the "before-beat interval" and "after-beat interval") depended on the heart rate. For example, if two heartbeats were 1 second apart, the "after-beat interval" would include the first 500 ms after the initial beat and the "before-beat interval" would correspond to the second 500 ms. The after-beat intervals were considered the time periods in which the systole (heart muscle contraction) signal was present and in which a tap should be chosen if it was felt. The before-beat intervals were treated as the time periods where the diastole (heart muscle relaxation) signal was present and in which tapping should not occur (i.e., assuming taps are chosen in response to detecting a systole; e.g., as supported by (74)).
This allowed us to formulate each interval as a "trial" in which either a tap or no tap could be chosen and in which either a systole or a diastole signal was present (see Figure 1).
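The interval construction described above can be illustrated with a minimal Python sketch. It is not the authors' scoring code (which was written in MATLAB); the function name `build_trials`, the dictionary layout, and the rule that an interval counts as "tapped" if at least one tap falls inside it are assumptions made for illustration. Times are in seconds, and the fixed 200 ms offset from the text is used as the default.

```python
def build_trials(r_peaks, taps, ptt=0.2):
    """Turn R-peak and tap times (in seconds) into systole/diastole trials.

    ptt is the assumed pulse transit time added to each R-peak (200 ms in the text).
    """
    beats = [r + ptt for r in r_peaks]  # times of potentially perceivable heartbeats
    trials = []
    for start, end in zip(beats, beats[1:]):
        mid = start + (end - start) / 2.0  # split each inter-beat span in half
        # First half = after-beat (systole) interval, second half = before-beat (diastole).
        for obs, lo, hi in (("systole", start, mid), ("diastole", mid, end)):
            tapped = any(lo <= t < hi for t in taps)  # assumption: one or more taps counts
            trials.append({"obs": obs, "start": lo, "end": hi, "tap": tapped})
    return trials


example = build_trials(r_peaks=[0.0, 1.0, 2.0], taps=[0.3, 1.9])
for trial in example:
    print(trial["obs"], trial["tap"])
```

With beats one second apart, each after-beat and before-beat interval is 500 ms long, matching the worked example in the text.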

Figure 1. [...] on hidden states (s), where this relationship is specified by the A matrix, and those states depend on previous states (as specified by the B matrix, or the initial states specified by the D vector). This model represents a simplified version of a commonly used active inference formulation of partially observable Markov decision processes, but which does not explicitly model action selection (for more details regarding the structure and mathematics describing these models, see (75-77)). In our model, the observations were systole/diastole, and the hidden states included beliefs about the presence or absence of a heartbeat. For simplicity, the probability of choosing to tap is here assumed to correspond to the posterior distribution over states ($\bar{s}$), that is, the relative confidence in the presence vs. absence of a heartbeat: P(HB) and P(nHB), respectively. The model parameters we estimated corresponded to: 1) interoceptive precision (IP), the precision of the mapping from systole/diastole to beliefs about heartbeat/no heartbeat in the A matrix, which can be associated with the weight assigned to sensory prediction errors; and 2) prior expectations for the presence of a heartbeat (pHB). Because minimal precision corresponds to an IP value of .5, and both higher and lower values indicate that taps will more reliably track systoles (albeit in an anticipatory or reactive manner), our ultimate measure of precision subtracted 0.5 from raw IP values and then took their absolute value. The raw IP values were then used to assess for group differences in the tendency to tap before vs. after each systole. We also compared this model to an analogous model that included learning (see main text). On each trial, beliefs about the probability of a heartbeat (corresponding to the probability of choosing to tap) relied on Bayesian inference as implemented in the "heartbeat perception" equations shown at the bottom of the figure. Note that, by convention in active inference, the dot product (•) applied to matrices here indicates transposed matrix multiplication. Observed diastoles and systoles were taken from ECG traces, after dividing the time series into periods before and after each systole (exemplified in the right portion of the figure; see text for details).

To model behavior, we used a Bayesian model of perception (see Figure 1) derived from a Markov decision process (MDP) formulation of active inference that has been used in previous work; for more details about the structure and mathematics of this class of models, see (76,78,79). Unlike the full MDP model, however, which includes both a perception model and an action model, here we only explicitly included a generative model of perception. This model specified the random variables (see Table 2), and their dynamics over time, associated with each time period ("trial") in the task in which a systole either did or did not occur. Observations (o) in the model formally included systole, diastole, and a "start" observation. Perceptual states (s) included either feeling one's heartbeat or not, as well as a "start" state. Here, a trial formally included two timesteps: 1) a "start" time point, followed by 2) the possibility of either a systole or diastole. [...] or did not occur. The precision of this matrix was controlled by an "interoceptive precision" (IP) parameter. A value of 0.5 would indicate minimal precision (i.e., leading the individual to simply "guess" whether he/she experienced a heartbeat), whereas a value approaching 0 or 1 indicates high precision (i.e., leading the individual to more consistently respond before, or consistently respond after, a heartbeat has occurred). The B matrix encodes the probability that one state will transition into another, $p(s_{\tau+1} \mid s_\tau)$, here corresponding to the probability of transitioning from the "start" state to a "heartbeat" state vs. a "no heartbeat" state. This probability was controlled by a parameter pHB, where values above .5 indicate prior expectations favoring feeling a heartbeat (e.g., expecting a faster heart rate), and values less than .5 indicate stronger expectations not to feel a heartbeat (e.g., expecting a slower heart rate). Both IP and pHB were estimated for each participant based on their tapping behavior, as described below.

Table 2

| Variable | General definition | Specification in this model |
| A matrix | A matrix encoding beliefs about the relationship between hidden states and observable outcomes (i.e., the probability that specific outcomes will be observed given specific hidden states). | Encodes beliefs about the relationship between felt heartbeats and diastole vs. systole. The precision of the relationship between heartbeats and diastole/systole was controlled by a parameter IP, which specified how much evidence systole provides for a heartbeat and how much evidence diastole provides for the absence [of a heartbeat]. |
| B matrix, $p(s_{\tau+1} \mid s_\tau)$ | A matrix encoding beliefs about how hidden states will evolve over time (transition probabilities). | Encodes the prior expectation that either a heartbeat or no heartbeat would occur on each trial, as controlled by a parameter pHB. |
| D vector, $p(s_1)$ | A vector encoding beliefs about (a probability distribution over) initial hidden states. | Ensures the individual always begins in an initial starting state. |
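To make the roles of IP and pHB concrete, the following Python sketch computes the posterior belief in a heartbeat on a single trial by direct application of Bayes' rule. It is an illustration under stated assumptions, not the authors' implementation: it assumes a symmetric two-by-two likelihood in which systole has probability IP under "heartbeat" and diastole has probability IP under "no heartbeat", ignores the "start" state, and uses hypothetical function names.

```python
def posterior_heartbeat(observation, ip, p_hb):
    """Posterior probability of the 'heartbeat' state after one observation.

    ip   : raw interoceptive precision (assumed likelihood of systole given a heartbeat)
    p_hb : prior expectation of a heartbeat (pHB in the text)
    """
    if observation == "systole":
        like_hb, like_nhb = ip, 1.0 - ip
    elif observation == "diastole":
        like_hb, like_nhb = 1.0 - ip, ip
    else:
        raise ValueError("observation must be 'systole' or 'diastole'")
    joint_hb = like_hb * p_hb             # likelihood x prior for 'heartbeat'
    joint_nhb = like_nhb * (1.0 - p_hb)   # likelihood x prior for 'no heartbeat'
    return joint_hb / (joint_hb + joint_nhb)


def precision_from_raw_ip(ip):
    """Final precision measure described in the text: |raw IP - 0.5|."""
    return abs(ip - 0.5)


print(posterior_heartbeat("systole", 0.8, 0.5))
print(posterior_heartbeat("diastole", 0.8, 0.5))
print(precision_from_raw_ip(0.2))
```

In this sketch, a raw IP of 0.5 leaves the posterior equal to the prior, so tapping is driven entirely by pHB, while raw IP values of 0.2 and 0.8 yield the same final precision of about 0.3.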

In this task, it is unclear whether or not learning over time (e.g., from paying attention to one's heartbeat) contributes to task performance. To assess this using Bayesian model comparison, we compared evidence for the "perception only" model described above with evidence for a model [...]

Here ⊗ indicates the cross-product, and b0 is a scalar on the prior value for concentration parameters, where its value prior to learning encodes (inverse) sensitivity to information, such that higher values will reduce the rate at which prior expectations are updated over time with new observations. In the learning model, b0 was also estimated for each individual to capture the possibility of different learning rates for updating prior expectations (this could also be thought of as differences in a kind of interoceptive "belief rigidity").

Thus, the final parameters estimated for each participant included the IP, pHB, and b0 parameters. Our approach to parameter estimation used Bayesian inference at two levels (80). First, each participant's responses were modeled using the Bayesian model of perception described above. We then used a widely used Bayesian optimization algorithm (called Variational Bayes) to estimate each participant's parameter values that maximized the likelihood of their responses (under the assumption that a higher/lower probability assigned to feeling a heartbeat corresponded to a higher/lower probability of choosing to tap), as described in (25). We optimized these parameters for each model using this likelihood and variational Laplace [...] London, UK, http://www.fil.ion.ucl.ac.uk/spm). This estimation approach has the advantage of preventing overfitting, due to the greater cost it assigns to moving parameters farther from their prior values. Estimating parameters required setting prior means and prior variances for each parameter. The prior variance was set to a high precision value of 1/2 for each parameter (i.e., deterring overfitting), and the prior means were set as follows: IP = .5, pHB = .5, and b0 = 1. Our decision for selecting these priors was motivated in part by initial simulations confirming that [...] prior of 1 is equivalent to this parameter having no effect on the model. After fitting parameters for each model, we then performed Bayesian model comparison (based on (82, 83)) to determine the best model. We then used classical inference to test for the effects of group differences in parameter estimates for the best model, using a standard summary statistic approach.
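The idea that priors penalize parameter values far from their prior means can be sketched in Python as follows. This is not variational Laplace and not the SPM routine used in the study; it is a coarse grid search over a penalized log-likelihood, swapped in purely for illustration. The symmetric likelihood, the Gaussian prior placed in logit space (prior means of .5 map to 0, variance 1/2), and all function names are assumptions.

```python
import math


def p_tap(obs, ip, p_hb):
    """Assumed probability of tapping: posterior belief in a heartbeat."""
    like_hb = ip if obs == "systole" else 1.0 - ip
    joint_hb = like_hb * p_hb
    joint_nhb = (1.0 - like_hb) * (1.0 - p_hb)
    return joint_hb / (joint_hb + joint_nhb)


def logit(p):
    return math.log(p / (1.0 - p))


def log_joint(ip, p_hb, trials, prior_var=0.5):
    """Log-likelihood of the observed taps plus a Gaussian log-prior (up to a constant)."""
    total = 0.0
    for obs, tapped in trials:
        p = p_tap(obs, ip, p_hb)
        total += math.log(p if tapped else 1.0 - p)
    # Cost grows as parameters move away from their prior means (0 in logit space).
    total -= (logit(ip) ** 2 + logit(p_hb) ** 2) / (2.0 * prior_var)
    return total


def fit_grid(trials, steps=49):
    """Grid search over IP and pHB in (0, 1); a stand-in for a proper optimizer."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.02, 0.04, ..., 0.98
    return max(((ip, p) for ip in grid for p in grid),
               key=lambda q: log_joint(q[0], q[1], trials))


# Hypothetical participant who taps after every systole and never during diastole.
trials = [("systole", True), ("diastole", False)] * 30
ip_hat, p_hb_hat = fit_grid(trials)
print(ip_hat, p_hb_hat)
```

Even for this perfectly consistent hypothetical participant, the penalty term keeps the estimated IP below the edge of the grid, which is the sense in which such priors deter overfitting.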

Before using these parameters in further analyses, however, [...] the "raw" IP parameter values were then instead used to assess individual differences in the tendency to tap in an anticipatory or reactive (AvR) fashion: [...]

Higher AvR values (> 0.5) thus indicated a stronger tendency to reactively tap in response to a heartbeat as opposed to tapping in an anticipatory fashion (< 0.5).

Electrocardiography was used to assess the objective timing of participants' heartbeats throughout the task. A BIOPAC MP150 was used to collect a three-lead EKG signal and the pulse oximeter signal, using a pulse plethysmography (PPG) device attached to the ear lobe. Response times were collected using a task implemented in PsychoPy, with data collection synchronized via a parallel port interface. EKG and response data were scored using in-house developed MATLAB code. As described above in relation to modelling, each participant's pulse transit time (PTT) was estimated as the median delay between the R wave and the corresponding inflection in the PPG signal.
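The median-delay estimate of PTT can be sketched in a few lines of Python. The authors' scoring code was written in MATLAB and is not reproduced here; the function name and the pairing rule (each R wave is matched to the first PPG inflection that follows it) are assumptions for illustration.

```python
from statistics import median


def median_ptt(r_peaks, ppg_inflections):
    """Median delay (seconds) from each R wave to the next PPG inflection."""
    delays = []
    for r in r_peaks:
        later = [p for p in ppg_inflections if p > r]
        if later:  # skip R waves with no subsequent inflection in the recording
            delays.append(min(later) - r)
    return median(delays)


ptt = median_ptt(r_peaks=[0.0, 1.0, 2.0], ppg_inflections=[0.21, 1.19, 2.20])
print(round(ptt, 3))
```

Using the median rather than the mean keeps the estimate robust to occasional mis-detected peaks.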

Prior to performing our analyses, several participants were removed due to quality control checks: 19 individuals were removed due to "cheating" (i.e., video revealed they were taking their pulse while performing the task); 3 individuals didn't complete the task; and 13 individuals had poor EKG across all trials that didn't allow reliable identification of heartbeat timing. An additional 31 individuals were removed due to being outliers when performing the tone task [...]

Correlational analyses were first conducted to examine the relationships between parameters across conditions. For purposes of parameter validation, we ran further correlational analyses to examine the relationships between each parameter and task-specific measures, including the self-report ratings of difficulty, confidence, and heartbeat intensity collected after each trial.

Analyses of variance (ANOVAs) and covariance (ANCOVAs) were conducted (one analysis per parameter) to identify possible group differences (i.e., between HCs and the five patient groups) in each parameter and in how they differed across trials (i.e., guessing, no-guessing, breath-hold), while accounting for individual differences in age, sex, BMI, median PTT, number of heartbeats (and its interaction with condition), IP for the tone condition (to rule out potential effects of reaction time), and medication status. Because the first half of the T1000 sample was intentionally designed as an exploratory sample, we report relationships at p < .05 as a potential basis for a priori hypothesis verification in the second (confirmatory) half of the T1000 dataset in subsequent work. Although our analyses are exploratory, we note that a Bonferroni-corrected threshold for multiple comparisons with three parameters is p < .017 (α = 0.05).

Participant characteristics

Complete information on sample size, demographics, and symptom screening measures is provided in Table 1. Separate ANOVAs showed significant differences between groups in age (F(5,428) = 2.98, p = .01) and body mass index (BMI; F(5,413) = 4.04, p = .001). A chi-squared analysis also showed significant differences in the proportion of males to females between groups (chi-squared = 19.43, df = 5, p = .002). Therefore, in our analyses of model parameters, we also confirm our results after controlling for these other factors.

As expected, across all participants self-reported heartbeat intensity, confidence in task performance, and task difficulty differed significantly between the three heartbeat tapping [...]

Table 3. Summary statistics for all task-related variables

When comparing models (based on (82, 83)), the "perception only" model was better than the [...]

For consistency/comparability, we use the "perception only" model parameters to compare conditions in our analyses below, as this model best explained heartbeat tapping behavior overall. The accuracy of this model, defined as the percentage of choices to tap/not tap that matched the highest probability action in the model (e.g., a tap occurring when the highest probability percept in the model was a heartbeat), was 74% across all conditions; by condition, model accuracy was: guessing condition = 67% (SD = 12%); tone condition = 67% (SD = 12%); no-guessing condition = 85% (SD = 14%); breath-hold condition = 84% (SD = 13%).
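The accuracy definition given here can be written as a short Python sketch. The function name is hypothetical, and treating a posterior of exactly 0.5 as a predicted "no tap" is an assumption, since the text does not say how ties were handled.

```python
def model_accuracy(p_heartbeat, tapped):
    """Percentage of trials where the observed choice matches the model's most probable percept.

    p_heartbeat : model posterior probability of a heartbeat on each trial
    tapped      : whether the participant tapped on each trial
    """
    hits = sum((p > 0.5) == bool(t) for p, t in zip(p_heartbeat, tapped))
    return 100.0 * hits / len(tapped)


acc = model_accuracy([0.8, 0.3, 0.6, 0.2], [True, False, False, False])
print(acc)
```

In this toy example, the model's most probable percept matches the observed choice on three of four trials.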

Relationship between parameters

As shown in Figure 2, correlations between IP across task conditions were generally low. Correlations between pHB estimates across conditions were moderate, most notably between the no-guessing and breath-hold conditions (which also included the no-guessing instruction). The tendency to tap in an anticipatory vs. reactive manner (AvR) showed no relationships across conditions. Correlations between IP and pHB (or pTone) within each condition were also low (0 < r < .27), as were correlations between these parameters and AvR (-.28 < r < .01; see supplementary materials).

[...] the no-guessing and breath-hold conditions. Additionally, pHB was lower in those self-reporting greater difficulty in the no-guessing condition, and higher in those self-reporting higher confidence and higher heartbeat intensity in both the no-guessing and breath-hold conditions. Across heartbeat tapping conditions, model parameters were also weakly (IP) to strongly (pHB) related to the traditional counting accuracy measure (39).

Parameter values for each group and condition are listed in Table 4. Across conditions, all parameters were normally distributed (skew < 2).

An ANOVA revealed that sensory precision (IP or auditory precision in the tone condition) was [...] .008, p = .001, and p < .001, respectively). We subsequently confirmed that the group by condition interaction remained significant (F(10,1114) = 2.7, p = .003) when also including age, sex, BMI, median PTT, precision within the tone condition, number of heartbeats (and its [...]

No difference was observed between conditions in the tendency to anticipate vs. react (AvR) to the tone/heartbeat using an ANOVA. There was also no effect of group, or condition by group interactions, for AvR in a subsequent interoception-focused model excluding the tone condition. [...]

To examine the ability of traditional measures to capture similar group differences, we ran analogous analyses using the traditional heartbeat counting task formula for interoceptive accuracy (39). An ANOVA revealed a main effect of trial condition (F(2,1273) = 139.00, p < .001) and group (F(5,1273) = 3.57, p = .03) on counting accuracy, but no group by condition interaction. Tukey post-hoc comparisons revealed that counting accuracy was greater in the guessing condition than in the no-guessing and breath-hold conditions (both p < .001), and that accuracy was higher in SUDs than in DEP and DEP/ANX (p = .04 and .004, respectively). These effects remained significant after accounting for the other variables we controlled for above.

To assess potential group differences in the effect of task condition on self-reported experience and physiology, we also carried out analogous analyses assessing confidence, intensity, and [...]

[...] condition across all participants. Prior expectations for heartbeats were lower in the no-guessing and breath-hold conditions. Sensory precision (i.e., interoceptive precision for the heartbeat or auditory precision for the tone condition) was much greater in the tone condition. This was expected given the unambiguous nature of this signal relative to the heartbeat signal. There were no significant differences in sensory precision between the heartbeat conditions. The Anticipate vs. React values revealed no mean differences between conditions. Top: For more complete data characterization, we also show raincloud plots depicting the same results in terms of individual datapoints, boxplots (median and upper/lower quartiles), and distributions. These illustrate that, for the Anticipate vs. React parameter, nearly equally sized clusters of participants appeared to adopt more anticipatory (<.5) vs. reactive (>.5) strategies in the tone condition, and that prior expectations remained unbiased (.5; with little variance) in the tone condition relative to the heartbeat tapping conditions.

Figure 5. Bottom: Mean and standard error for interoceptive precision estimates by condition and clinical group. Interoceptive precision (IP) was significantly greater in healthy comparisons than all other groups in the breath-hold condition (*p < 0.05; with the exception of the eating disorders group, likely due to the small sample size for this group), and healthy comparisons showed a significant increase in IP from the guessing to breath-hold conditions that was absent in the other groups. Top: For more complete data characterization, we also show raincloud plots depicting the same results in terms of individual datapoints, boxplots (median and upper/lower quartiles), and distributions.

Given the heterogeneity in our clinical sample, we ran subsequent exploratory correlational analyses with continuous scores on the clinical measures gathered, excluding HCs, to assess whether model parameters might provide additional information about symptom severity. We note a few weak relationships at uncorrected levels in supplementary materials, but none survive correction for multiple comparisons.

The T1000 dataset also includes self-report measures commonly used in interoception research, including the Multidimensional Assessment of Interoceptive Awareness (MAIA; (84)), the Toronto Alexithymia Scale (TAS-20; (85)), and the Anxiety Sensitivity Index (ASI; (86)). For the interested reader, we also show exploratory correlation matrices between model parameters and these common measures within supplementary materials. While a couple of relationships were significant at uncorrected levels, the strength of these relationships was low.

This investigation aimed to examine whether a novel computational (active inference) model of perception could provide a principled approach to empirically characterizing interoceptive processing dysfunctions that have previously been proposed. Specifically, we used behavior during a heartbeat perception task in conjunction with this model to estimate quantitative differences in the prior expectations and sensory precision estimates that individuals implicitly apply to afferent interoceptive (cardiac) signals in both healthy individuals and a transdiagnostic sample of individuals with depression, anxiety, substance use, and/or eating disorder symptoms. We observed several relationships in the expected directions between model parameters and other task-related variables that supported the construct validity of our model parameters. Of greatest interest, we found evidence that an interoceptive (breath-hold) perturbation increased the precision estimates assigned to cardiac signals in healthy individuals, but that this perturbation had no effect on interoceptive precision in individuals with depression, anxiety, substance use, and/or eating disorders. In contrast, no significant differences in prior expectations were observed, suggesting sensory precision, and not prior expectations, best accounts for interoceptive differences in the clinical groups. Model comparison also suggested that individuals did not update prior expectations over time during the task, which is perhaps unsurprising given the low cardiac awareness commonly seen in previous studies (6). That is, individuals may not have had sufficient signal to learn from (note that, in contrast, model comparison supported the presence of learning in our auditory control condition with a clear signal). We expand on these points below.

Parameter validity/sensitivity

IP showed positive relationships with self-reported heartbeat intensity ratings (no-guessing and breath-hold conditions), as would be expected in the context of more precise cardiac signals. In the no-guessing and breath-hold conditions, pHB was lower than in the guessing condition. In other words, participants appear to have successfully adjusted their prior expectations to comply with the no-guessing instructions. Further, consistent with the role of prior expectations in perception, pHB was also lower in those reporting greater difficulty (no-guessing condition) and higher in those reporting greater confidence and higher heartbeat intensity (no-guessing and breath-hold conditions). Each of these results supports the notion that our parameters tracked theoretically meaningful individual differences in perception/behavior and that our approach can disentangle the effects of sensory precision from those of prior expectations and anticipatory vs. reactive strategies.

Further, while model parameters had some shared variance with traditional accuracy measures (e.g., higher counting accuracy was weakly associated with higher interoceptive precision), they mainly captured unique variance that was not tracked by standard measures. Moreover, no group differences analogous to those seen with model parameters were found using the traditional interoceptive accuracy measure (i.e., across all participants, counting accuracy showed an opposite pattern, being highest in the guessing condition). This highlights the unique ability of this computational method to uncover differences in perceptual decision making across different psychiatric subtypes.

It is also worth noting the strong positive correlation we observed between pHB and counting accuracy in the no-guessing and breath-hold conditions, suggesting that heartbeat counting accuracy primarily reflects prior expectations (as previously proposed in (38)). In the present task this is explained by the fact that higher pHB values led to a greater number of taps and the fact that the average number of taps in these conditions was low (i.e., because of the no-guessing instruction). Thus, those who tapped more often approached the actual number of heartbeats and therefore had higher counting accuracy scores. This highlights one specific way in which, in the context of restrictions on guessing, counting accuracy may be most closely associated with prior expectations.
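The arithmetic behind this point can be sketched with a short example. It assumes one commonly used counting accuracy formula, 1 - |actual - reported| / mean(actual, reported); the exact formula used in (39) may differ, and the beat and tap counts below are hypothetical.

```python
def counting_accuracy(actual_beats, reported_beats):
    """Assumed formula: 1 - |actual - reported| / mean(actual, reported)."""
    total = actual_beats + reported_beats
    if total == 0:
        raise ValueError("accuracy is undefined when both counts are zero")
    return 1 - abs(actual_beats - reported_beats) / (total / 2)


# Hypothetical trial with 70 true heartbeats and increasingly liberal tapping.
actual = 70
scores = [counting_accuracy(actual, taps) for taps in (10, 20, 40, 70)]
for taps, score in zip((10, 20, 40, 70), scores):
    print(f"{taps:>2} taps -> accuracy {score:.3f}")
```

As long as the number of taps stays below the true number of heartbeats, every additional tap raises the score, whether or not that tap coincides with a heartbeat. This is why a higher prior expectation (more taps) can produce higher counting accuracy under no-guessing instructions.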

Our primary results were that: 1) IP was higher in the healthy comparisons than in the clinical groups (in general for those with depression/anxiety, and for all clinical groups but eating disorders in the breath-hold condition; note that IP values in eating disorders were numerically comparable to the other clinical groups, but did not reach significance due to small sample size); and 2) there was a group by condition interaction, demonstrating that the interoceptive perturbation (breath-hold) increased IP (relative to resting conditions) in the healthy participants, whereas this perturbation had no effect in any of the clinical groups. These group differences were not accounted for by any other demographic (e.g., age, sex) or physiological variables (e.g., pulse transit time, changes in heart rate). The hypothesized finding that IP was reduced across the psychiatric groups included in this study may be of clinical interest. First, multiple theories of emotion, and associated empirical findings (e.g., (42, 90-94)), suggest that interoceptive awareness may be an important transdiagnostic factor in promoting emotional awareness. As low emotional awareness has been linked to multiple psychiatric and systemic medical conditions (reviewed in (95, 96)), reduced IP could contribute to low emotional awareness and its maladaptive consequences irrespective of diagnostic category. Second, visceral regulation might be expected to be less effective in the absence of precise feedback signals from the body, which could relate to visceral dysregulation in psychiatric conditions (e.g., see (56) [...]

These results build on previous bodies of work suggesting associations between psychiatric disorders and interoceptive processing deficits (38, 56, 97-100). For example, previous cardiac perception studies have shown that depressed patients exhibit reduced accuracy on a heartbeat counting task (2-4), and that performance is negatively correlated with depressive symptoms (5) as well as associated with both lower positivity and poorer decision-making (2); although, the limitations of heartbeat counting tasks should be kept in mind when interpreting such findings (47, 49, 101, 102). While the literature on interoceptive dysfunction is mixed for anxiety disorders broadly, it is well-established in panic disorder (reviewed in (6)). A couple of recent studies have also reported evidence of differences in interoceptive processing in both SUDs (e.g., blunted brain responses (7)) and eating disorders (e.g., stronger expectation effects on perception during low arousal (8)). The computational framework within which our findings were observed is also in line with several recent proposals about the role of interoceptive inference in guiding (predictive) autonomic control and the potential breakdown of this mechanism within different psychiatric conditions (10-15, 36, 38); however, in contrast to previous emphasis on altered prior expectations in these proposals, our results more selectively support the existence of deficits in adjusting precision estimates for afferent interoceptive signals, and do not suggest the presence of altered priors.
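To make the distinction between prior expectations and sensory precision concrete, the following is a minimal two-state Bayesian sketch. It is an illustrative simplification, not the authors' fitted model: it assumes that pHB acts as the prior probability of a heartbeat at a given timestep and that IP, ranging from 0.5 to 1, is the probability that a binary sensation matches the true cardiac state. The function name and the example values are hypothetical.

```python
def posterior_heartbeat(p_hb, ip, felt_beat):
    """Posterior probability of a heartbeat given a binary sensation.

    p_hb: prior probability of a heartbeat (prior expectation).
    ip: probability that the sensation matches the true state (precision).
    felt_beat: True if a beat was sensed, False otherwise.
    """
    like_hb = ip if felt_beat else 1 - ip  # P(sensation | heartbeat)
    like_no = 1 - ip if felt_beat else ip  # P(sensation | no heartbeat)
    evidence = like_hb * p_hb + like_no * (1 - p_hb)
    return like_hb * p_hb / evidence


prior = 0.2  # hypothetical low prior, as under no-guessing instructions
for ip in (0.5, 0.6, 0.9):
    post = posterior_heartbeat(prior, ip, felt_beat=True)
    print(f"IP = {ip:.1f} -> posterior = {post:.3f}")
```

With IP at 0.5 the sensation carries no information and the posterior equals the prior; as IP rises, the same sensation moves beliefs further from the prior. On this reading, a failure to raise IP during the breath-hold would leave perception dominated by prior expectations even when the afferent signal becomes stronger.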

That said, the neurobiological mechanisms promoting reduced IP in mental disorders during interoceptive perturbation remain unclear. As IP increased during the interoceptive perturbation in HCs, but not in the clinical groups, it could be that altered brain processes in individuals with certain psychiatric disorders fail to update IP estimates during states of acute bodily arousal. The neural process theory associated with active inference suggests an inability to adjust IP estimates would correspond to reductions in synaptic plasticity in response to changes in patterns of interoceptive prediction-errors (10, 24, 76, 78, 103), most plausibly within neural networks supporting interoception and visceromotor control (e.g., insula and anterior cingulate (15, 104)). This could be tested using our model/task in conjunction with neuroimaging. Alternatively, IP estimates could be accurate, and afferent interoceptive signals may in fact be conveyed with less fidelity (i.e., greater noise) to the brain in the context of psychiatric conditions. For example, such differences could be due to altered signaling in interoceptive sensory axons, which could in principle be affected by many factors (genetic/epigenetic influences, early adversity and related socio-environmental factors, and/or effects of disease-related chronic stress, among others). Future research will need to investigate these different possibilities.

This study has several major strengths. First is the novel application of a computational (active inference) model to behavior on an interoceptive awareness task, which allowed for model comparison (e.g., allowing us to rule out learning effects) and parameter estimates to disentangle distinct computational mechanisms (e.g., the role of prior expectations vs. interoceptive precision). A second strength is the application of this model to interoceptive processing in individuals with psychiatric disorders, which, to our knowledge, has never been reported. While accounting for other influences, this approach allowed us to test a theoretical prediction that [...] cortices. Active inference models may be especially useful in this regard, as they allow for simulation of predicted neuronal responses (78).

[...] While this was possible, in simple perceptual tasks like ours that lack reward feedback, any variability in the precision of action selection would be correlated with sensory precision; therefore, we chose a simpler model that treated behavior as a direct readout of confidence in perception (and controlled for sensory precision in the tone condition to account for individual differences in motor stochasticity). However, the novelty of our approach entails that it should be replicated in future studies.

Another limitation is that, while we did compare models with vs. without learning, we did not compare our perceptual model to other existing approaches. The heartbeat tapping task was not well-suited for the several alternatives we considered. For example, it did not have a sufficient number of trials to be suited for traditional signal detection approaches, which would be most comparable to our model-based precision measure (105). Computational models based on reinforcement learning were also inappropriate as the task did not include planning or learning from reward, and instead dealt mainly with uncertainty in perception. Consequently, we employed an active inference model that only includes perceptual inference. This involved discretizing timepoints where responses were considered co-occurrent (or not) with diastole vs. systole; but other such discretizations could have been chosen. That said, given the relationships we observed between parameters and other task measures, our choices appear to have led to estimates that track meaningful individual differences in task behavior. The task measurement conditions also had certain limitations. For example, many individuals had low IP values, reflecting the low cardiac awareness commonly seen at rest in previous studies (6), which may have limited the variability necessary to assess relationships with other variables. Finally, low power, due to relatively small sample sizes in the eating disorders and anxiety disorders groups, may have prevented us from detecting some effects in these groups.

In summary, this study 1) demonstrated the sensitivity of individual difference measures (parameter estimates) derived from a novel active inference-based computational model, with a [...]

Group by condition interactions in other task-relevant variables

To assess potential group differences in the effect of task condition on self-reported experience and physiology, we also carried out analogous analyses assessing confidence, intensity, and difficulty, as well as heart rate.

Difficulty.
A main effect of both condition (F(2,1273) = 18.62, p < .001) and group (F(5,1273) = 5.94, p < .001) was found on self-reported difficulty, but no group by condition interaction; post-hoc Tukey comparisons indicated greater difficulty in the no-guessing condition than in the guessing or breath-hold conditions (ps < .001), and that SUDs reported less difficulty than DEP, [...]