
Predictability beyond accuracy: A correlation-based evaluation of survey forecasts of the Chilean exchange rate

  • Pablo Pincheira,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Escuela de Negocios, Universidad Adolfo Ibáñez, Santiago, Chile

  • Lorenzo Reus,

    Roles Formal analysis, Funding acquisition, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    lorenzo.reus@uai.cl

    Affiliation Escuela de Negocios, Universidad Adolfo Ibáñez, Santiago, Chile

  • Andrea Bentancor,

    Roles Conceptualization, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Facultad de Economía y Negocios, Universidad de Talca, Talca, Chile

  • Martin Flores

    Roles Data curation, Resources, Software, Validation, Visualization, Writing – original draft

    Affiliation Escuela de Negocios, Universidad Adolfo Ibáñez, Santiago, Chile

Abstract

Floating exchange rates are widely considered difficult—if not impossible—to predict. While traditional evaluations focus on out-of-sample accuracy measures such as Mean Squared Prediction Error (MSPE), recent literature argues that predictability is better understood as a form of dependence. Following this view, we assess the ability of Chile’s Survey of Professional Forecasters (SPF) to predict the Chilean peso (CLP) across multiple horizons. We find that SPF forecasts maintain stable and statistically significant predictive correlations with CLP returns, indicating meaningful predictability. However, forecast accuracy varies over time, mainly due to a persistent positive bias in the survey. We propose an adjustment aimed at removing this and other inefficiencies, which greatly improves accuracy, particularly at medium and long horizons. Finally, and contrary to common wisdom, we find that the most difficult benchmark to beat in the Chilean case is the random walk with drift—not the driftless random walk.

1. Introduction

In this paper, we evaluate the ability of the Survey of Professional Forecasters (SPF) in Chile to predict the Chilean exchange rate (CLP) at various horizons. As demonstrated by [1], the Driftless Random Walk (DRW) has proven to be a very difficult benchmark to outperform in out-of-sample evaluations within the exchange rate literature. Since then, a vast body of research has sought to explain why exchange rates exhibit near-random walk behavior [2] or why the DRW is so challenging [3]. Similarly, several studies have used new methods, tests and models in attempts to outperform the random walk. For example, [4] studies new predictors for several commodity currencies, focusing on whether the prices of the commodities that dominate each country’s export basket can help explain and forecast their bilateral exchange rates. Using primarily daily data, the authors first examine whether these commodity prices deliver a strong out-of-sample fit for the corresponding exchange rates, an exercise in the spirit of [1], though not a true forecasting evaluation because it relies on contemporaneous commodity prices. At the daily frequency, this out-of-sample fit is notably strong. Their analysis centers on the key export commodity associated with each exchange rate, namely oil for the Canadian dollar (CAD/USD), the Norwegian krone (NOK/USD), and the Australian dollar (AUD/USD); copper for the Chilean peso (CLP/USD); and gold for the South African rand (ZAR/USD). They then turn to traditional predictability tests, using lagged commodity prices in standard predictive regressions. In this setting, the evidence of forecasting power is considerably weaker: improvements relative to the DRW emerge only sporadically, in particular subsamples or at selected horizons. Forecast evaluation relies on Diebold–Mariano tests and [5] statistics, ensuring valid comparisons against nested benchmarks.
The contribution of [4] is framed as an extension of the commodity-currency hypothesis developed in [6], where the authors documented significant predictive power from commodity-currency exchange rates toward commodity prices. The direction of analysis is effectively reversed by [4], asking whether commodity prices themselves can forecast exchange rates, and whether such predictive relationships survive rigorous out-of-sample testing. Overall, the study concludes that while contemporaneous commodity prices track high-frequency exchange-rate movements remarkably well, the lagged commodity prices offer only limited and fragile evidence of true out-of-sample predictability.

Another interesting article (See Ref. [7]) broadens the set of exchange rate models evaluated against the random walk, by introducing four new specifications: a real interest rate differential model incorporating shadow rates, a Taylor rule–based model, a sticky-price monetary model augmented with risk proxies, and an interest rate model that embeds yield-curve factors. Despite this expanded and more contemporary suite of fundamentals, the authors find that these newer models do not systematically outperform the older ones. Overall, while a handful of their results are noteworthy, they conclude that the long-standing question of exchange rate predictability remains largely unresolved.

More recently, [8] explores the usefulness of a new benchmark and a set of predictive variables for forecasting exchange rates, with particular attention to global risk measures that have gained prominence in recent years. Focusing on medium- and long-horizon forecasts, they assess whether these variables can systematically outperform the random walk model. Their analysis suggests that the apparent forecasting power of their proposed benchmark (the level of the exchange rate) and, by extension, of the global risk variables, is weak once sampling uncertainty is properly accounted for. In fact, their simulation-based evidence indicates that neither the new benchmark nor the global risk predictors provide strong evidence against the random walk model.

Although relatively recent studies, such as [9], have reported some improvements over the DRW, our reading of the literature suggests that this benchmark remains exceptionally difficult to beat in out-of-sample evaluations. Interestingly, [10] not only classifies the DRW as a strong benchmark but also points out that it is the most difficult to outperform. This observation is relevant to our work because, as we will see, in our case the Random Walk with drift (RW) tends to outperform the DRW at long horizons.

One strand of literature, not included in the review in [10], focuses on the evaluation of surveys of professional forecasters. See, for instance, [11–13] and, more recently, [14]. One interesting feature of this literature is that it covers floating exchange rates of both developed and emerging countries. In terms of predictive ability, results are mixed, with some surveys outperforming the DRW in the case of the Chilean Peso (e.g., [13]) and others being outperformed by the same benchmark (e.g., [14]).

The vast majority of the literature evaluating exchange rate predictability focuses on measures of forecast accuracy, like the popular Mean Squared Prediction Error (MSPE) and Mean Directional Accuracy (MDA). Yet, [15] makes a distinction between predictability and accuracy. They argue that predictability and forecast accuracy are two different yet related concepts. On the one hand, predictability is a notion of dependence between future and past events. A variable is predictable as long as its future is interconnected with some other variables in the present and in the past. This interpretation of predictability is also consistent with the views of [16,17]. On the other hand, forecast accuracy is just a measure of precision. The studies in [15,18] show that a zero forecast, which is totally independent of the target variable, may clearly display a lower MSPE than a forecast that is positively correlated with the target variable yet displays some degree of inefficiency in the sense of [19]. This conflicting situation is labeled by the authors the MSPE Paradox. Put differently, a forecast with a positive correlation with the target variable needs an additional requirement to be precise: it needs to be efficient as well, or at most mildly inefficient.

Building on these new concepts, this paper revisits the predictive performance of the SPF in the context of the Chilean Peso. The results in [13] previously demonstrated that the SPF outperformed the DRW across several horizons. However, their analysis did not address the presence of the MSPE Paradox, a key contribution of our study. Beyond this, it is essential to reassess the SPF’s predictive capabilities, given the significant global and domestic events of the past six years. These include the COVID-19 pandemic, the Russia-Ukraine war, and Chile’s 2019 social unrest, which triggered two pivotal referendums and led to a marked depreciation of the Chilean Peso.

Fig 1 illustrates the Peso’s trajectory against the US Dollar throughout the SPF’s existence. While an upward trend emerged around 2012–2013, the Peso crossed the 800 Pesos per Dollar threshold shortly after the social unrest in October 2019. From January 2001 to September 2019, the Chilean currency averaged 588 Pesos per Dollar, compared to a much higher 824 Pesos per Dollar between October 2019 and April 2024. This substantial depreciation might have influenced the SPF’s predictive behavior, which we examine in detail in sections 3.2 and 3.3.

Fig 1. Monthly Chilean exchange rate relative to US dollar. Jan 2001–April 2024.

https://doi.org/10.1371/journal.pone.0344095.g001

The potential discovery of strong predictors for the Chilean Peso is a topic of interest for both scholars and practitioners interested in exchange rate dynamics. Yet, as the Chilean Peso is also a commodity currency, its potential predictability could also offer valuable insights regarding the ability to predict commodity prices, making it a topic of interest for a global audience. This relevance is emphasized by a growing body of research demonstrating the Peso’s ability to Granger-cause certain commodity prices, as shown in the seminal work of [6] and further supported by [20–22]. While our focus on the Chilean exchange rate may initially seem regionally specific, these findings suggest broader implications that resonate well beyond Chile.

As mentioned previously, [13] showed that the SPF consistently outperformed the DRW at multiple horizons when forecasting the Chilean Peso. However, closer scrutiny reveals that the traditional DRW, widely used as a benchmark, may not provide a fair comparison in this case due to differences in information. The DRW relies on end-of-month data to predict the exchange rate h periods ahead, whereas the SPF is typically released around the 10th day of each month. This means that the SPF benefits from approximately 10 additional days of market information, creating an inherent advantage over the DRW. As such, the superior performance of the SPF relative to the DRW, as reported by [13], may be partially attributable to this informational discrepancy rather than to the predictive strength of the SPF itself.

To address this issue, we propose a new benchmark: the “Driftless Random Walk Plus” (DRW+). Unlike the traditional DRW, the DRW+ uses exchange rate data from the day immediately preceding the SPF’s release, ensuring a more level playing field. This refinement, however, restricts the analysis to the period beginning in April 2012, when the Central Bank of Chile (CBCH) began disclosing the precise dates of SPF collection and release. Consequently, our study focuses on the April 2012–April 2024 period and further subdivides this interval into two segments: April 2012–May 2018 and June 2018–April 2024. While the first subsample aligns with the original analysis of [13], the second extends the evaluation into a period characterized by significant events, including Chile’s social unrest in 2019, the COVID-19 pandemic, and the Russia-Ukraine conflict, all of which may have had profound impacts on the Chilean Peso.

Our main findings indicate that: (1) in contrast with [10], the Random Walk with drift (RW) consistently outperforms the Driftless Random Walk (DRW); (2) the enhanced “RW plus” (RW+) provides further improvements over the traditional RW; (3) the Survey of Professional Forecasters (SPF) exhibits mixed performance in terms of accuracy, achieving strong results in the first half of the sample period but being outperformed by our naïve benchmarks in the second half; and (4) the SPF maintains a positive and statistically significant correlation with Chilean Peso returns across short, medium, and long horizons throughout the entire sample. These findings suggest the presence of the MSPE Paradox and confirm that the SPF does predict the Chilean Peso, though with time-varying accuracy. This variability is primarily driven by a persistent and positive bias in the SPF forecasts. To address this issue, we propose an Adjusted forecast, which significantly outperforms the original SPF, particularly at medium and long horizons.

The structure of this paper is as follows: Section 2 outlines the dataset used for the analysis. In Section 3.1, we compare the predictive performance of four different benchmarks: DRW, DRW+, RW, and RW+. Sections 3.2–3.4 focus on the evaluation of the SPF’s predictive accuracy and on its efficiency. Finally, we conclude in Section 4.

2. Materials and methods

We utilize monthly data spanning from April 2012 to April 2024. This period is the only timeframe during which the Central Bank of Chile (CBCH) publicly reports the release dates of the Survey of Professional Forecasters (SPF).

Our primary data source is the monthly SPF published by the CBCH. The survey is typically conducted during the first two weeks of each month, with results released to the public the day after collection. Throughout the sample period, the release date fluctuates between the 9th and 13th of each month. The survey targets economists, consultants, and executives from the financial sector. The CBCH reports the median values and the 10th and 90th percentiles for each forecast. Exchange rate forecasts are provided for three different horizons: 2 months (SPF2), 11 months (SPF11), and 23 months (SPF23) ahead. For a more detailed methodology of the survey, refer to [23]. Data were accessed for research purposes on June 10, 2024. The authors did not have access to information that could identify individual participants during or after data collection.

For exchange rate data, we extract the daily closing price of the Chilean Peso (CLP) from Bloomberg. These data are converted to monthly frequency by sampling the last day of each month. We also sample the closing price from the day before the survey is released, which is simply denoted as CLP+.

Fig 2 illustrates the Chilean Peso alongside the three SPF forecasts (SPF2, SPF11, and SPF23) for the entire sample period. This figure reveals two key observations: first, the Chilean Peso has followed a clear upward trend throughout the period; second, although all three SPF forecasts track the CLP closely, they tend to underestimate the exchange rate in the second half of the sample.

Fig 2. Chilean exchange rate relative to US dollar and SPF forecasts. April 2012 – April 2024.

https://doi.org/10.1371/journal.pone.0344095.g002

To assess predictive performance, we define three evaluation windows: (1) the full sample period from April 2012 to April 2024, (2) the first subsample from April 2012 to May 2018, and (3) the second subsample from June 2018 to April 2024.

3. Results

3.1. Which is the toughest benchmark?

In this subsection we explore a fairly simple question: which is the right model to use as a benchmark? According to the review in [10], the toughest benchmark to beat in the exchange rate literature is the DRW. Yet, as mentioned in the introduction, when compared to a survey that has an informational advantage of about 10 days, we would rather expect the survey to outperform the DRW.

We now need to introduce some notation. Let S_t denote the nominal exchange rate at time t, defined as the number of Chilean pesos required to purchase one U.S. dollar. Specifically, S_t represents the closing price of the Chilean peso on the final trading day of month t. For example, for January 2024, S_t corresponds to the closing price recorded on Wednesday, January 31.

We will use lower-case letters to denote the natural logarithm of a given variable, so s_t = ln(S_t). Let us define the h-period return of the Chilean peso as r_{t+h} = s_{t+h} − s_t. We are interested in forecasts of this variable at various horizons denoted by h. We consider h = 1, 2, 3, 6, 9, 11, 12, 18, 24 months. (We focus on the horizons listed above for three reasons. First, this set enables a direct comparison with [13], who consider the same horizons in their analysis. Second, the Central Bank of Chile requests forecasts 2, 11, and 23 months ahead in its survey, which we interpret as reference points for short-, medium-, and long-term expectations. We therefore evaluate nearby horizons: 1–3 months, 9–12 months, and 18–24 months. Third, the Bank’s monetary policy horizon, approximately 18–24 months, makes these longer horizons particularly relevant for policy considerations.)
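For concreteness, the h-period log return can be computed as in the following minimal sketch (the prices, array name, and function name are ours, for illustration only):

```python
import numpy as np

# Hypothetical end-of-month CLP/USD closing prices, for illustration only.
S = np.array([700.0, 712.0, 695.0, 770.0])

def h_period_return(S, t, h):
    """h-period log return: r_{t+h} = ln(S_{t+h}) - ln(S_t)."""
    return np.log(S[t + h]) - np.log(S[t])

# Three-month return from the first observation: ln(770/700)
r3 = h_period_return(S, 0, 3)
```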

A RW model for s_t is simply given by

s_{t+1} = s_t + μ + ε_{t+1},     (1)

where ε_t is a white noise process and μ is a constant representing the systematic drift in the evolution of the exchange rate (a white noise is a stationary time-series process satisfying the following three conditions: E[ε_t] = 0, E[ε_t²] = σ² < ∞, and finally E[ε_t ε_s] = 0 for all t, s such that t ≠ s). In a random walk with drift, μ captures the average expected change in s_t that is not due to random shocks. This parameter governs the long-run direction of the series: a positive drift implies a persistent tendency for the exchange rate to depreciate over time, while a negative drift implies the opposite. Allowing the possibility of a non-zero drift matters because it enables the model to capture persistent trends, such as inflation differentials, risk premia or structural forces, that would otherwise be absorbed by the error term. When μ is exactly zero, expression (1) defines a DRW model for s_t. Using model (1), the optimal linear forecast for s_{t+h} is given by s_t + hμ. Again, when μ = 0 this forecast becomes s_t for all forecasting horizons h. We will use the following notation to clearly identify these different forecasts:

f_t^{RW}(h) = s_t + hμ,    f_t^{DRW}(h) = s_t.
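A minimal numerical sketch of the random walk with drift and its two forecast rules may help fix ideas (the drift and volatility values are hypothetical and all names are ours, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a log exchange rate following s_{t+1} = s_t + mu + eps_{t+1}.
T, mu, sigma = 240, 0.003, 0.03            # hypothetical monthly drift and volatility
eps = rng.normal(0.0, sigma, size=T)       # white-noise shocks
s = np.cumsum(mu + eps)                    # random walk with drift

def rw_forecast(s_t, h, mu_hat):
    """Optimal linear h-step forecast under the RW with drift: s_t + h * mu_hat."""
    return s_t + h * mu_hat

def drw_forecast(s_t, h):
    """Forecast under the driftless RW: the current level s_t, at any horizon h."""
    return s_t
```

Under a zero drift the two rules coincide; with a positive drift the RW forecast rises linearly in the horizon h.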

Coming back to the previously mentioned informational advantage of the SPF relative to a RW, it seems natural to consider the following alternative target variable for the Chilean peso:

r_{τ,t+h} = s_{t+h} − s_τ.

We use the subscript “τ” to explicitly remark that the respondents of the survey provide their forecasts approximately on the 10th day of month “t+1”. In other words, we have the following inequality: t < τ < t + 1. For this new target variable we propose two natural extensions of the RW and DRW benchmarks defined previously. These extensions are denoted by RW+ and DRW+ and provide the following forecasts for s_{t+h}:

f_τ^{RW+}(h) = s_τ + μ_τ(h),    f_τ^{DRW+}(h) = s_τ.

Here μ_τ(h) is just a constant depending on the reference point “τ” and the forecast horizon h. These new benchmarks use the closing price of the Chilean peso from the day before the survey is released as the initial exchange rate. The logic for changing the reference point is that an investor who reads the survey will be interested in the return of the Chilean peso from the reference point “τ” onward, as the information between t and “τ” is already known at time “τ”.

Notice that the four benchmark forecasts f_t^{RW}(h), f_t^{DRW}(h), f_τ^{RW+}(h) and f_τ^{DRW+}(h) can all be used to generate predictions for s_{t+h}. They basically are

f_t^{RW}(h) = s_t + hμ,    f_t^{DRW}(h) = s_t,    f_τ^{RW+}(h) = s_τ + μ_τ(h),    f_τ^{DRW+}(h) = s_τ.

In this sense they can be considered as forecasts for the same target variable. From that point of view it is worth exploring and comparing their accuracy. To make this point clearer, notice that when forecasting either r_{t+h} or s_{t+h}, forecast errors are the same. For returns, forecast errors are defined as

e_{t+h}(h) = r_{t+h} − hμ   (RW),    e_{t+h}(h) = r_{t+h}   (DRW).

For s_{t+h}, forecast errors are defined as

e_{t+h}(h) = s_{t+h} − (s_t + hμ)   (RW),    e_{t+h}(h) = s_{t+h} − s_t   (DRW).

When forecasting from τ we observe the same situation. For returns, forecast errors are defined as

e_{τ,t+h}(h) = r_{τ,t+h} − μ_τ(h)   (RW+),    e_{τ,t+h}(h) = r_{τ,t+h}   (DRW+).

Yet, for s_{t+h}, forecast errors are defined as

e_{τ,t+h}(h) = s_{t+h} − (s_τ + μ_τ(h))   (RW+),    e_{τ,t+h}(h) = s_{t+h} − s_τ   (DRW+).

We have shown that when forecasting either returns or log-levels we obtain the same forecast errors. Notice, however, that when changing the reference point from t to τ, these errors naturally change. As τ is closer to t + 1, we expect forecast errors indexed by τ to display a lower MSPE. Furthermore, finding significant differences between forecasts using different reference points (t vs. τ) suggests that the good performance of the SPF relative to the DRW reported in [13] might be importantly affected by the informational advantage of the survey.

Probably the most common metric used to gauge forecast accuracy is the Mean Squared Prediction Error (MSPE), defined as the expected value of the squared forecast errors. For instance, when forecasting h steps ahead with the DRW model, the corresponding MSPE is

MSPE_{DRW}(h) = E[(s_{t+h} − s_t)²] = E[r_{t+h}²].

Given a sample of P(h) available h-step-ahead forecast errors, the MSPE is typically estimated as

(1/P(h)) Σ_t e_{t+h}(h)².

It is also common to report the square root of the MSPE as an alternative accuracy measure. This is denoted RMSPE. To simplify notation, we will use the term MSPE to refer indistinctly to both the population quantity and its sample counterpart.
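In code, the sample MSPE, its square root, and the RMSPE ratios reported in the tables can be computed as follows (a minimal sketch; the function names are ours):

```python
import numpy as np

def mspe(errors):
    """Sample MSPE: (1/P) * sum of squared forecast errors."""
    e = np.asarray(errors, dtype=float)
    return float(np.mean(e ** 2))

def rmspe(errors):
    """Root MSPE, the accuracy measure reported in the tables."""
    return float(np.sqrt(mspe(errors)))

def rmspe_ratio(errors_a, errors_b):
    """RMSPE(a) / RMSPE(b); a value below 1 favors forecast a."""
    return rmspe(errors_a) / rmspe(errors_b)
```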

Table 1 reports the Root MSPE for the DRW and DRW+ in our full sample and in the two subsample periods of interest: April 2012-May 2018 and June 2018-April 2024. Table 1 also reports RMSPE ratios. A number lower than 1 favors the DRW+ benchmark. The table clearly shows that the DRW+ is more accurate than the traditional DRW at every single horizon, with only one exception that occurs when forecasting 24 months ahead in Panel 2. In this particular case both benchmarks display almost identical accuracy, with a tiny edge in favor of the DRW. Aside from this case, in all three Panels the DRW+ shows higher accuracy relative to the DRW, especially at short horizons of 1–6 months. At longer horizons, the informational advantage of these approximately 10 extra days of market information decreases considerably, but at shorter horizons the superior predictive accuracy of the DRW+ over the traditional DRW is substantial, with RMSPE ratios as low as 0.82.

Table 2 is akin to Table 1, but compares the RW+ with the RW. These two benchmarks require estimation of the drift parameters μ and μ_τ(h), respectively. We estimate them with Ordinary Least Squares (OLS) in rolling windows of 74 observations, which basically corresponds to our first subsample of interest. Consequently, when using either the RW+ or the RW we only show results for the second subsample, from June 2018 until April 2024.
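The rolling-window drift estimation can be sketched as follows. Because the only regressor is a constant, OLS on the one-period changes reduces to a window mean; the 74-observation window matches the text, while the function name is ours:

```python
import numpy as np

def rolling_drift(s, window=74):
    """Rolling-window drift estimates for the RW with drift.

    OLS of the one-period change of the log rate on a constant reduces to
    the sample mean of the changes, so each estimate is simply the window
    average of diff(s).
    """
    ds = np.diff(np.asarray(s, dtype=float))
    return np.array([ds[i - window:i].mean() for i in range(window, ds.size + 1)])
```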

Results in Table 2 confirm the findings from Table 1, highlighting that the informational advantage of the RW+ model translates into greater forecast accuracy, particularly at shorter horizons. According to the review in [10], the DRW is the toughest benchmark to beat in the exchange rate literature. Nevertheless, Figs 1 and 2 show a clear upward trend in the Chilean Peso during our sample period. This is an indication that a drift-based forecast such as the RW or the RW+ might be competitive as well.

Tables 3 and 4 next show RMSPE ratios comparing both benchmarks in the traditional version (RW vs DRW in Table 3) and in the “plus” version (RW+ vs DRW+ in Table 4). Table 3 clearly shows that the RW is more accurate than the traditional DRW at every single horizon. RMSPE ratios have a U shape as a function of the forecast horizon, reaching a minimum of 0.88 when forecasting 18 months ahead. This is an indication that the explicit inclusion of a linear trend is helpful when forecasting at medium horizons. Table 4 provides a similar picture: with the only exception of h = 1, the RW+ outperforms the DRW+ for every h greater than or equal to 2, reaching a maximum difference when h = 18.

Table 3. RMSPE Comparison: RW vs DRW. June 2018 – April 2024.

https://doi.org/10.1371/journal.pone.0344095.t003

Table 4. RMSPE Comparison: RW+ vs DRW+. June 2018 – April 2024.

https://doi.org/10.1371/journal.pone.0344095.t004

We finish this section with Table 5, which compares the RMSPE of our least competitive benchmark (the DRW) with our most competitive benchmark (the RW+). Differences between these two competitors are sizable and statistically significant at short and medium horizons. This can be seen in the last row of Table 5, where we show traditional t-statistics of the Diebold-Mariano-West test [24,25] (henceforth, the DMW test). This test evaluates the null of equal MSPE between these two benchmarks. As we can see, the null is rejected in favor of the RW+.
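For reference, the DMW t-statistic on the squared-error loss differential can be sketched as below, with a Newey-West (Bartlett-kernel) HAC estimate of the long-run variance. This is an illustration under our own naming conventions, not the authors' exact implementation:

```python
import numpy as np

def dmw_tstat(e_bench, e_model, lags=0):
    """DMW t-statistic on the loss differential d_t = e_bench^2 - e_model^2.

    Positive values favor the model over the benchmark. The long-run
    variance uses Bartlett weights on the first `lags` autocovariances.
    """
    d = np.asarray(e_bench, float) ** 2 - np.asarray(e_model, float) ** 2
    P = d.size
    dbar = d.mean()
    u = d - dbar
    lrv = u @ u / P                              # variance term
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1)                 # Bartlett kernel weight
        lrv += 2.0 * w * (u[j:] @ u[:-j]) / P    # weighted autocovariance
    return dbar / np.sqrt(lrv / P)
```

Swapping the two error series flips the sign of the statistic, so the one-sided test direction depends only on which forecast is treated as the benchmark.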

Table 5. RMSPE Comparison: RW+ vs DRW. June 2018 – April 2024.

https://doi.org/10.1371/journal.pone.0344095.t005

The fact that the RW performs better than the DRW in our setting may appear to contradict the conventional view—summarized in [10]—that the driftless random walk is the most difficult benchmark to beat. A key explanation is purely statistical: in many of the currency pairs analyzed in earlier studies, such as that by [8], the estimated drift is economically small or statistically indistinguishable from zero, making the DRW an appropriate benchmark. In contrast, in samples where the exchange rate exhibits a statistically meaningful drift—as in our case—the zero-drift restriction imposed by the DRW becomes inconsistent with the data. In such settings, the RW or RW+ naturally tends to perform better in forecast comparisons.

In the next section we compare the predictive performance of the SPF relative to our toughest benchmarks: the DRW+ when evaluating the whole sample and the RW+ when focusing on our more recent subsample of interest.

3.2. SPF forecast evaluation

In this subsection, we evaluate the ability of the SPF to predict CLP returns at several forecast horizons h, with h = 1, 2, 3, 6, 9, 11, 12, 18 and 24 months ahead. Our evaluation employs three types of analyses: traditional MSPE comparisons, MDA comparisons, and a novel yet straightforward approach proposed by [28], which aims to identify the forecast most strongly correlated with the target variable.

We evaluate differences in MSPE relative to the two most competitive benchmarks identified in Section 3.1: the DRW+ and the RW+. When evaluating MDA we use a “pure luck” benchmark, which simply assumes that the future direction of change can be correctly predicted with a 50% probability of success. Finally, we use a correlation analysis to determine, among other aspects, whether the correlation of the SPF with the target variable is both positive and statistically significant, and to examine whether the MSPE Paradox arises in a meaningful way.

3.2.1. MSPE forecast evaluation.

In this subsection we focus on the accuracy of the SPF when predicting Chilean Peso returns defined as

r_{τ,t+h} = s_{t+h} − s_τ.

We consider the following forecast coming from the SPF:

f_τ^{SPF}(h) = spf_τ(h) − s_τ,     (2)

where spf_τ(h) = ln(SPF_τ(h)) and SPF_τ(h) is the forecast of the nominal exchange rate made at time τ coming from the survey. The corresponding forecast errors are given by

e_{τ,t+h}(h) = r_{τ,t+h} − f_τ^{SPF}(h).

Let us recall that we treat SPF2, SPF11, and SPF23 as distinct forecasts for r_{τ,t+h}. So, in what follows we will evaluate all three of them at each horizon h.
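The mapping from the surveyed exchange-rate level to a return-space forecast and its error can be sketched as follows (the numeric values are hypothetical and the function names are ours):

```python
import numpy as np

def spf_return_forecast(spf_level, S_ref):
    """Return-space SPF forecast: log of the surveyed level minus the log
    of the spot rate on the day before the survey release."""
    return np.log(spf_level) - np.log(S_ref)

def spf_forecast_error(realized_return, spf_level, S_ref):
    """Forecast error: realized h-period return minus the SPF return forecast."""
    return realized_return - spf_return_forecast(spf_level, S_ref)

# Example: survey expects 880 CLP/USD when the spot is 800 CLP/USD.
f = spf_return_forecast(880.0, 800.0)
```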

Prediction accuracy is measured in terms of MSPE. So the main question of interest is how the MSPE coming from the SPF fares relative to the MSPE of our two preferred naïve benchmarks: the DRW+ and the RW+. To evaluate forecast accuracy under MSPE, we focus on the difference

ΔMSPE(h) = MSPE_B(h) − MSPE_SPF(h),

where MSPE_B(h) represents the MSPE of a naïve benchmark (either the DRW+ or the RW+). We consider the following hypotheses:

H0: ΔMSPE(h) ≤ 0   vs.   H1: ΔMSPE(h) > 0.

Rejection of the null hypothesis implies that SPF forecasts outperform the corresponding benchmark at a statistically significant level. For inference, we apply a one-sided DMW test using HAC standard errors according to [26,27]. Tables 6–8 report results relative to the DRW+, whereas Table 9 shows the results relative to the RW+. While Table 6 focuses on our entire sample period, Tables 7 and 8 focus on the first and second subsamples of interest, respectively. Entries in the tables show Root Mean Squared Prediction Error (RMSPE) ratios between the SPF and our benchmarks. Ratios below 1 favor survey-based forecasts. The tables also display the t-statistic and p-value of the DMW test.

Table 6. Forecast accuracy of survey-based forecasts relative to the DRW+ at several forecasting horizons h (measured in months). April 2012 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t006

Table 7. Forecast accuracy of survey-based forecasts relative to the DRW+ at several forecasting horizons h (measured in months). April 2012 – May 2018 window.

https://doi.org/10.1371/journal.pone.0344095.t007

Table 8. Forecast accuracy of survey-based forecasts relative to the DRW+ at several forecasting horizons h (measured in months). June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t008

Table 9. Forecast accuracy of survey-based forecasts relative to the RW+ at several forecasting horizons h (measured in months). June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t009

An analysis of the full sample results in Table 6 reveals that the SPF does not outperform the DRW+ in terms of accuracy. Although SPF2 and SPF11 exhibit reductions in RMSPE for medium-term horizons, these improvements are fairly small and not statistically significant according to the DMW test.

Tables 7 and 8 break down the results from Table 6 across the two subsample periods of interest. The analysis reveals that the relative predictive performance of the SPF fluctuates over time: Table 7 shows that the SPF outperforms the DRW+ at several horizons before May 2018, while Table 8 presents the opposite scenario, where the SPF is consistently outperformed by the DRW+ from June 2018 onward. The differences between Tables 7 and 8 are striking. For example, in Table 7, most of the RMSPE ratios are below 1 (23 out of 27), with 13 of them statistically significant at the 10% level. In stark contrast, all RMSPE ratios in Table 8 exceed 1, consistently favoring the DRW+.

Finally, the results in Table 9 align with those in Table 8. Recall that Table 9 compares the SPF to the RW+, which, for most forecasting horizons, serves as a more competitive benchmark than the DRW+. As expected, most RMSPE ratios in Table 9 are higher than those in Table 8, with all exceeding 1, indicating that the SPF is consistently outperformed by the RW+ during the final subsample period. In summary, the results from Table 7, covering the period prior to May 2018, are largely consistent with those reported in [13]. However, the findings from June 2018 onward, presented in Tables 8 and 9, reveal a significant shift in the SPF’s predictive accuracy relative to the standard benchmarks. This change can primarily be attributed to the SPF’s consistent underestimation of the substantial depreciation of the Chilean Peso in the latter half of the sample period, a topic we will explore in more detail later.

3.2.2. Mean directional accuracy.

Mean Directional Accuracy (MDA) is an alternative metric for evaluating the accuracy of a series of forecasts. Essentially, it assesses how often the direction of change—whether the exchange rate increases or decreases—is correctly predicted for a given forecast horizon. MDA is particularly relevant in our context because many economic and financial decisions depend primarily on anticipating the sign of exchange-rate movements—whether the currency will appreciate or depreciate—rather than the exact magnitude of those movements. For example, hedging strategies, carry-trade positions, and policy decisions often hinge on correctly forecasting the direction of returns. Importantly, MDA provides information that is distinct from MSPE and correlation. The MSPE is driven by the magnitude of forecast errors and may be dominated by large but infrequent deviations, while correlation measures only linear comovement between forecasts and realizations. By contrast, the MDA isolates the ability of a model to correctly anticipate the direction of change. It is therefore possible for two forecasts to have similar MSPEs or correlations but exhibit very different directional performance.

A crystal-clear case illustrating the need for MDA is the zero forecast. It is widely used as a benchmark in exchange-rate forecast evaluations because it delivers a low MSPE. However, its correlation with the target variable is not defined, since the zero forecast has zero variance, and moreover, it is absolutely independent of the variable being forecast. Most importantly, the zero forecast has no ability whatsoever to anticipate the direction of future exchange-rate movements. This makes it an ideal example showing that magnitude-based metrics alone cannot capture all dimensions of forecast usefulness.

To evaluate MDA, we construct the following “Hit Rate” statistic:

$HR_{t+h} = \mathbf{1}\{\operatorname{sign}(F_{t,h}) = \operatorname{sign}(R_{t+h})\}$   (3)

where $F_{t,h}$ is the h-step-ahead survey forecast of the return $R_{t+h}$ and $\mathbf{1}\{\cdot\}$ denotes the indicator function. Then our hypotheses are

$H_0: E(HR_{t+h}) = 0.5$ versus $H_1: E(HR_{t+h}) > 0.5$

In this case, we test whether the SPF outperforms a pure luck benchmark, implicitly defined as a forecast that predicts the direction of future movements in the Chilean Peso with a 50% probability of success. To evaluate this, we apply a one-sided Diebold-Mariano-West (DMW) test, using HAC standard errors as in [26,27].
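The hit-rate test described above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: the function name, the Bartlett-kernel lag truncation, and the use of signed returns are our own assumptions.

```python
import numpy as np

def mda_test(forecast_returns, realized_returns, max_lag=4):
    """Hit rate against a 50% 'pure luck' benchmark, with a one-sided
    DMW-type t-statistic built on a Newey-West (HAC) long-run variance."""
    hits = (np.sign(forecast_returns) == np.sign(realized_returns)).astype(float)
    d = hits - 0.5                      # deviation from pure luck
    T = len(d)
    dbar = d.mean()
    u = d - dbar
    lrv = u @ u / T                     # Bartlett-kernel long-run variance
    for k in range(1, max_lag + 1):
        w = 1.0 - k / (max_lag + 1.0)
        lrv += 2.0 * w * (u[k:] @ u[:-k]) / T
    t_stat = dbar / np.sqrt(lrv / T)
    return hits.mean(), t_stat          # reject H0 when t_stat is large
```

A hit rate significantly above 0.5 (a large positive t-statistic against standard normal critical values) would indicate directional predictability.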

Table 10 presents the results for the full sample period, while Tables 11 and 12 provide a breakdown for the two subsample periods of interest. Table 11 covers April 2012 to May 2018, and Table 12 spans June 2018 to April 2024. Each table reports the actual MDA for each forecast horizon, with statistically significant results highlighted by stars. An analysis of the full sample results in Table 10 reveals that the SPF outperforms the pure luck benchmark in only three cases, and even then, solely at the 10% significance level. The average hit rate across all horizons in Table 10 is a modest 52.4%. Notably, SPF11 achieves the highest MDA at nearly every horizon except h = 9. The peak performance in Table 10 is an MDA of 59.3%, achieved by SPF11 when forecasting 11 months ahead. Nevertheless, this outcome is not statistically distinguishable from the pure luck benchmark, as indicated by the DMW test. Overall, the findings in Table 10 are disappointing, mirroring the lackluster MSPE results reported for the full sample in Table 2.

Table 10. Mean Directional Accuracy of the SPF at several horizons. April 2012 – April 2024.

https://doi.org/10.1371/journal.pone.0344095.t010

Table 11. Mean directional accuracy of the SPF at several horizons. April 2012 – May 2018.

https://doi.org/10.1371/journal.pone.0344095.t011

Table 12. Mean directional accuracy of the SPF at several horizons. June 2018 – April 2024.

https://doi.org/10.1371/journal.pone.0344095.t012

Tables 11 and 12 provide a detailed breakdown of the results from Table 10 across the two subsample periods. They reveal that the MDA of the SPF exhibits significant variation over time. For instance, Table 11 shows strong performance of the survey during our first subsample period. In particular, the SPF outperforms the pure luck benchmark at multiple horizons. Specifically, all entries in Table 11 exceed 50%, with 14 of them showing that the superiority over the benchmark is statistically significant. The average hit rate across Table 11 is an impressive 61.1%. In sharp contrast, Table 12 displays much weaker performance, with only two entries above 50% and none achieving statistical significance. The average hit rate in Table 12 is a disappointing 44.5%, showing the stark difference in SPF behavior between the two periods.

Results based on MDA are largely consistent with those obtained from MSPE. In both cases, the SPF’s predictive performance varies over time: it outperforms our preferred naïve benchmark in the first subsample, but performs notably worse in the second subsample. In the next subsection, we will explore whether this pattern holds when evaluating predictive ability through correlations.

3.2.3. Forecast evaluation based on correlations with the target variable.

Thus far we have evaluated the SPF’s ability to predict Chilean Peso returns by comparing its accuracy against conventional benchmarks. This approach, frequently employed in the exchange rate literature, has been used extensively in studies such as [1,5,10,13], among others. However, [15] argue that this intuitive and commonly applied methodology may not fully capture the true essence of predictability. As suggested by various authors, predictability fundamentally reflects a form of dependence between the target variable and past events or variables (as noted by [16], p. 657: “The extent of a series predictability depends on how much information the past conveys regarding future values of this series.”). This dependence can translate into greater predictive accuracy if forecasts are efficient in the [19] sense or exhibit only mild inefficiencies. As shown in [18], there are cases where a target variable exhibits a strong positive correlation with a forecast, even when the forecast is outperformed, based on accuracy metrics, by naïve and independent benchmarks. This distinction is at the core of their argument that predictability and forecast accuracy are separate concepts. To evaluate the former, they propose analyzing the correlation between the forecast and the target variable. Evidence of predictability arises when this correlation is positive or exceeds the correlation of a benchmark forecast with the target variable.

Tables 13–15 present the correlations between the three survey-based forecasts and Chilean Peso returns, defined as usual by the expression:

$R_{t+h} = s_{t+h} - s_t$

Table 13. Correlation between CLP+ and SPF at several horizons. April 2012 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t013

Table 15. Correlation between CLP+ and SPF at several horizons. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t015

where $s_t$ represents the log-exchange rate observed immediately before the survey’s release and $s_{t+h}$ is the log-exchange rate h periods ahead. Table 13 summarizes results for the full sample period, whereas Table 14 focuses on the first subsample of interest and Table 15 examines the last subsample. For inference, we consider the following simple regression:

Table 14. Correlation between CLP+ and SPF at several horizons. April 2012 – May 2018 window.

https://doi.org/10.1371/journal.pone.0344095.t014

$R_{t+h} = \alpha + \beta F_{t,h} + \varepsilon_{t+h}$   (4)

In this framework, $\beta = 0$ if and only if the forecast and the target variable are uncorrelated.

We test the following hypotheses:

$H_0: \beta = 0$ versus $H_1: \beta \neq 0$

$H_0$ posits that the predictor and target variables are uncorrelated, implying linear independence. To test this, we use a usual t-statistic constructed with HAC standard errors according to [26,27]. Tables 13–15 present not only the correlation coefficients between the three survey-based forecasts and Chilean Peso returns, but also the corresponding t-statistics and p-values for testing the null hypothesis $H_0: \beta = 0$.

The results across the three samples are striking. Correlations are mostly positive, relatively high, and statistically significant at several horizons in all three tables. At a 90% confidence level, Table 13 shows that the null hypothesis is rejected in 16 out of 27 cases, while both Tables 14 and 15 show 19 rejections out of 27 cases. The average correlation across all entries in Table 13 is 0.198, with the comparable figures for Tables 14 and 15 being 0.216 and 0.284, respectively. Unlike the findings from MSPE and MDA analyses, correlation results show a relatively stable behavior across both subsamples of interest. Additionally, there is evidence that the SPF performs best during the last subsample period, even though it was clearly outperformed by naïve benchmarks in terms of MSPE and MDA during the same period.

Table 16 presents results from a slightly different exercise, where we report the following correlation differential:

$\Delta_h = \operatorname{Corr}(R_{t+h}, SPF_{t,h}) - \operatorname{Corr}(R_{t+h}, RW^{+}_{t,h})$   (5)

In words, we are comparing the correlation of the SPF with the target variable to the correlation of the RW+ forecast with the same target variable. This comparison helps us assess whether survey-based forecasts exhibit a stronger correlation with the target variable than the RW+ forecast does. For inference, we use the asymptotically normal correlation-based test proposed by [28] to test the null hypothesis of a zero correlation differential. This test evaluates the correlations of two competing forecasts Z and X with a target variable Y, all of which are assumed to be stationary variables.

Consider the following assumptions i), ii), iii) and iv):

  1. i) The vector collecting Y, X, and Z is strictly stationary, with mixing and moment conditions strong enough for a central limit theorem to apply.
  2. ii) Y, X, and Z have strictly positive variances.
  3. iii) X and Z are treated as primitives (i.e., with no parameter uncertainty).
  4. iv) Corr(Y,X) and Corr(Y,Z) are both strictly lower than 1.

Under these assumptions, [28] show that a t-statistic based on the difference between the two sample correlations is asymptotically normal under the null hypothesis. The statistic is built from the sample correlations of Z and X with Y, the sample variance of the target variable, the number of forecasts T, and the relevant sample standard deviations, covariances, and means; we refer the reader to [28] for the explicit formula.

We present results of this test for our second subsample of interest only. This is because the drift of the RW+ is estimated as a rolling average of the most recent observations available in the sample. Specifically, we use the first subsample to obtain the initial estimate of the drift, and then update this estimate in rolling windows of the same size (74 observations). Table 16 reports not only the correlation differential in (5) but also the corresponding t-statistics and p-values coming from the correlation-based test.
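The rolling-drift construction of the RW+ benchmark can be sketched as follows. This is our reading of the text (a drift equal to the rolling mean of one-period log returns over the most recent 74 observations); the function name and the return convention are assumptions.

```python
import numpy as np

def rw_drift_forecast(log_rates, window=74, horizon=1):
    """RW+ forecast of the h-period log return: h times the rolling
    mean of one-period log returns over the latest `window` months."""
    r = np.diff(log_rates)                  # one-period log returns
    forecasts = []
    for t in range(window, len(r) + 1):
        drift = r[t - window:t].mean()      # rolling drift estimate
        forecasts.append(horizon * drift)   # RW+ predicts h * drift
    return np.array(forecasts)
```

On a persistently depreciating series the rolling drift is positive, which is exactly why the RW+ becomes a harder benchmark than the driftless random walk in this sample.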

Table 16. Correlation Differences: RW+ vs SPF at several horizons. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t016

The overall pattern in Table 16 closely mirrors the results in Table 15. It shows that all correlation differentials are positive, indicating a clear advantage for the SPF over the RW+. Additionally, 18 entries in Table 16 are statistically significant in favor of the SPF. These findings further support and reinforce the conclusions drawn from Table 15.

In summary, we find that the relative forecast accuracy of the survey has declined in the second subsample. However, its correlation has improved at several horizons. Moreover, a close inspection of Table 8 vs Table 15 and of Table 9 vs Table 16 reveals that, in many cases, our naïve benchmarks outperform the SPF in terms of MSPE, while the SPF outperforms these naïve benchmarks in terms of correlations. This scenario is referred to as the MSPE Paradox by [18]. The paradox occurs when the forecast with the highest correlation with the target variable also exhibits the worst MSPE. As shown by [18], a key condition for the MSPE Paradox to arise is the inefficiency of some of the forecasts involved in the comparison. Consequently, our paradoxical results suggest that our survey-based forecasts might be inefficient, a topic we explore in the following section.

3.3. Efficiency analysis

The MSPE Paradox by [18] emerges if at least one of the forecasts involved in the comparison is inefficient, as defined by [19]. This implies that at least one forecast is either biased or exhibits a nonzero correlation with its own forecast error. The evidence of the MSPE Paradox presented in the previous subsection, coupled with the clear bias observed in the survey, as shown in Fig 2, leads us to further investigate the potential inefficiencies of the survey. We start by analyzing bias and then the correlation of survey-based forecasts with their own forecast errors.

3.3.1. Bias in survey-based forecasts.

A forecast is deemed biased if the expected value of its forecast error deviates from zero. To test for this, we consider the simple regression specified in expression (6), regressing the forecast error on a constant. As is standard practice, we apply HAC standard errors following [26,27].

$e_{t+h} \equiv R_{t+h} - F_{t,h} = \mu + u_{t+h}$   (6)

We test the following hypotheses

$H_0: \mu = 0$ versus $H_1: \mu \neq 0$

Rejection of the null implies that the forecast is biased. Results are shown in Table 17 for the full sample, and in Tables 18 and 19 for the two subsamples of interest.

Table 17. Bias in survey-based forecasts. April 2012 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t017

Table 18. Bias in survey-based forecasts. April 2012 – May 2018 window.

https://doi.org/10.1371/journal.pone.0344095.t018

Table 19. Bias in survey-based forecasts. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t019

Table 17 shows a positive and statistically significant bias in the SPF across all forecasting horizons. The first entry in Table 17 indicates a bias of 0.5493 for SPF2 when h = 1, meaning that SPF2 underestimates the Chilean Peso by approximately 0.55% when forecasting one month ahead. The corresponding entry for SPF23, when forecasting 24 months ahead, is 11.7079, meaning that SPF23 underestimates the Chilean Peso by approximately 12% when forecasting two years ahead. In contrast, when analyzing the sample from April 2012 to May 2018 in Table 18, the results differ considerably. A positive and statistically significant bias is observed only when forecasting 18 and 24 months ahead. At shorter horizons, the bias is either small or negative and, more importantly, not statistically significant. Table 19 completes the picture by presenting results for the subsample from June 2018 to April 2024. Similar to Table 17, Table 19 shows a positive and statistically significant bias across all entries. This bias is relatively small for SPF2 when forecasting one month ahead (approximately 1.2%) but becomes substantial at longer horizons, well above 10% when h = 24.

3.3.2. Correlation of survey-based forecasts with forecast errors.

In this section, we examine another type of forecast inefficiency: a non-zero correlation between survey-based forecasts and their own forecast errors. We test this using the following regression:

$e_{t+h} = \alpha + \beta F_{t,h} + u_{t+h}$   (7)

As usual, we use HAC standard errors according to [26,27]. Our hypotheses are as follows:

$H_0: \beta = 0$ versus $H_1: \beta \neq 0$

The null hypothesis posits that our survey-based forecasts are linearly independent of their own forecast errors. Although we have built three tables covering our results for the full sample and both subsamples of interest, for the sake of brevity Table 20 shows results only for the second subsample, from June 2018 to April 2024. The rest of our results are available upon request. Table 20 shows statistically significant and negative estimates of $\beta$ at short horizons, specifically for h = 1, 2 and 3. There are also some positive and statistically significant estimates of $\beta$ at h = 9. For the remaining forecasting horizons, we cannot reject the null of zero correlation between survey-based forecasts and their own forecast errors. Results for the full sample and the first subsample of interest are similar, with statistically significant results occurring only for a handful of forecasting horizons.

Table 20. Correlation of survey-based forecasts with forecast errors. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t020

In summary, we have found that survey-based forecasts are inefficient, particularly in consistently underpredicting the Chilean Peso in our final subsample of interest. In the next subsection, we will explore if these inefficiencies can be used to generate new and more accurate forecasts.

3.4. Forecast optimization

In this subsection, we present an adjustment to the SPF forecasts aimed at improving their accuracy. We begin by estimating the inefficiencies in the original SPF forecasts and then subtract these from the original values to derive the final Adjusted SPF forecasts.

We start by estimating regression (7) for our first subsample of interest. Using the initial estimates of $\alpha$ and $\beta$, we compute the following adjusted forecast $F^{A}_{t,h}$:

$F^{A}_{t,h} = F_{t,h} + \hat{\alpha} + \hat{\beta} F_{t,h}$   (8)

where $\hat{\alpha} + \hat{\beta} F_{t,h}$ is the forecast error predicted by (7). If either $\alpha$ or $\beta$ is statistically and economically significant, the adjusted forecast is expected to show greater accuracy relative to the original forecast. Using the initial estimates of $\alpha$ and $\beta$, we generate a new forecast for the first relevant observation in our second subsample of interest. For example, when forecasting one month ahead, the first observation corresponds to June 2018. To generate an adjusted forecast for the second relevant observation, we re-estimate expression (7) using the most recent 74 observations, updating the estimates of $\alpha$ and $\beta$. With these new estimates, we construct the adjusted forecast for the second observation, corresponding to July 2018 when forecasting one month ahead. This process is repeated iteratively using rolling windows of 74 observations to generate adjusted forecasts for all subsequent observations. By employing this strategy, we simulate a real-time out-of-sample experiment. If the rolling estimates from expression (7) effectively capture stable inefficiencies, the adjusted forecasts should demonstrate higher accuracy.

Table 21 confirms this result, presenting RMSPE ratios between the Adjusted SPF and the raw SPF, where figures lower than 1 favor the former. Almost all the ratios in Table 21 are well below one, with only a couple of exceptions when forecasting 24 months ahead. The average ratio across all horizons is 0.874, favoring the Adjusted SPF. Improvements in forecast accuracy are all statistically significant at the short horizons of 1, 2, and 3 months. At longer horizons, only one entry shows statistically significant results: SPF23 when h = 18. The lowest ratio in Table 21, an impressive 0.53 for SPF23 at h = 1, highlights the substantial effectiveness of the adjustment. Despite this significant improvement, Table 22 brings us back to reality by presenting RMSPE ratios between the Adjusted SPF and our toughest benchmark, the RW+. All figures in Table 22 are above 1, favoring the benchmark. While the Adjusted SPF shows notable improvements, it still falls short of outperforming the RW+ in terms of MSPE.

Table 21. Forecast accuracy of the Adjusted SPF relative to the SPF without adjustment. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t021

Table 22. Forecast accuracy of the Adjusted SPF relative to the RW+. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t022

Table 23 presents more encouraging results, showing the MDA of the Adjusted SPF across all horizons. In contrast to the less favorable results for the raw SPF in Table 12, we observe consistent improvements for horizons longer than 2 months.

Table 23. Mean directional accuracy of the adjusted SPF. June 2018– April 2024.

https://doi.org/10.1371/journal.pone.0344095.t023

Furthermore, values greater than 50% are achieved when forecasting 6 months ahead or longer. For horizons ranging from 9 to 18 months, the results are statistically and economically superior to the pure luck benchmark. The adjustment proves effective in forecasting the direction of change of the Chilean Peso, particularly at medium to long-term horizons. The highest average hit rate is attained with SPF2 when forecasting 18 months ahead, with a notable figure of 78.9%.

We also explored an alternative strategy to address potential inefficiencies in the survey, by assessing whether forecast errors at a given horizon can be partially explained by information from forecasts at other horizons as well. In particular, we examined whether the forecast errors of SPF2, SPF11, and SPF23 can be systematically related to the information contained in the remaining forecasts. To this end, we employed the following regression framework:

$e_{t+h} = \alpha + \sum_{i=1}^{K} \beta_i F_{t,h_i} + u_{t+h}$   (9)

where $e_{t+h}$ represents a generic h-step-ahead forecast error that can be generated either by SPF2, SPF11 or SPF23, and the regressors $F_{t,h_i}$ collect the information contained in the remaining forecasts. This expression allows for a more general adjustment of the raw forecasts coming from the survey. Equation (9) is estimated using the LASSO method, which enables variable selection and reduces estimation noise in settings with multicollinearity and limited sample sizes, relative to standard OLS.
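Estimating equation (9) by LASSO can be sketched with scikit-learn. The penalty level below is purely illustrative, and the function name is ours; the authors' tuning may differ.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_error_adjustment(errors, other_forecasts, alpha=0.1):
    """Explain h-step forecast errors with the survey's forecasts at
    other horizons; LASSO performs variable selection automatically."""
    X = np.asarray(other_forecasts, float)      # T x K matrix of forecasts
    y = np.asarray(errors, float)
    model = Lasso(alpha=alpha).fit(X, y)
    # intercept, selected coefficients, and fitted inefficiencies
    return model.intercept_, model.coef_, model.predict(X)
```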

To our general disappointment, this strategy does not yield significant improvements over the simpler adjustment approach based on equations (7) and (8). Table 24 illustrates this by reporting RMSPE ratios between the LASSO-adjusted SPF and our most challenging benchmark, the RW+. As in Table 22, all ratios in Table 24 are above 1, indicating superior performance by the benchmark. In fact, the RMSPE ratios in Table 24 are slightly higher than those in Table 22, suggesting a marginal advantage for the simpler OLS-based adjustment.

Table 24. Forecast accuracy of the LASSO-Adjusted-SPF relative to the RW+. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t024

We also evaluated directional accuracy (MDA) using the LASSO-based adjustment, obtaining slightly less favorable results than those presented in Table 23. For example, the average hit rate in Table 23 is 61.5%, whereas the corresponding figure in the table based on the LASSO adjustment is only 56.6%. For brevity, this additional table is not included here but is available upon request. As with the RMSPE results, the more complex LASSO-based adjustment does not appear to offer a compelling improvement over the simpler OLS-based alternative.

In conclusion, both the OLS- and the LASSO-adjusted SPF exhibit considerable improvements in forecast accuracy compared to the original, unadjusted SPF. While these improvements are not enough to outperform our preferred benchmark in terms of MSPE, they are more than enough to outperform the pure luck benchmark in terms of MDA at medium horizons.

We also applied the Elastic Net to equation (9). This method combines the LASSO penalty, which performs variable selection, with the Ridge penalty, which stabilizes coefficient estimates under multicollinearity. We assign an equal weight of 0.5 to both penalties and use a cross-validation procedure (customized to time series) to choose the optimal regularization parameter. The results are qualitatively similar to those obtained with the LASSO method. Although the adjusted forecast clearly outperforms the SPF forecast (Table 25), it fails to reduce the forecast errors below those of the RW+ (Table 26).
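The Elastic Net variant, with equal weight on the two penalties and time-series-aware cross-validation, might look as follows; the splitter and the number of folds are our assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import TimeSeriesSplit

def elastic_net_adjustment(errors, other_forecasts):
    """Elastic Net version of eq. (9): l1_ratio=0.5 gives equal weight
    to the LASSO and Ridge penalties; TimeSeriesSplit preserves the
    chronological order of the data during cross-validation."""
    X = np.asarray(other_forecasts, float)
    y = np.asarray(errors, float)
    cv = TimeSeriesSplit(n_splits=5)            # ordered, non-shuffled folds
    model = ElasticNetCV(l1_ratio=0.5, cv=cv).fit(X, y)
    return model.alpha_, model.coef_, model.predict(X)
```

Using an ordered splitter matters here: shuffled folds would let future observations leak into the training sets and overstate out-of-sample performance.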

Table 25. Forecast accuracy of the Elastic Net-Adjusted-SPF relative to the SPF without adjustment. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t025

Table 26. Forecast accuracy of the Elastic Net-Adjusted-SPF relative to the RW+. June 2018 – April 2024 window.

https://doi.org/10.1371/journal.pone.0344095.t026

4. Discussion

Our results prompt several interesting discussions, one of which involves the selection of benchmarks for evaluating exchange rate predictability. While [10] identifies the driftless random walk (DRW) as the toughest benchmark to beat, our findings demonstrate that the random walk with drift (RW+) outperforms both the DRW and the DRW+ when forecasting the Chilean peso over long horizons. This outcome can be attributed to the significant depreciation of the CLP during the sample period. Thus, similar results may arise for other currencies that exhibit sustained trending behavior, independently of their specific institutional or macroeconomic characteristics. This observation follows from a statistical mechanism implied by equation (1): when an exchange rate displays a persistent trend—whether an extended depreciation or appreciation—the estimated drift is non-zero. In such cases, imposing a zero-drift restriction introduces systematic forecast errors, whereas allowing for a non-zero drift (as in the RW or RW+) captures the deterministic component of the trend and therefore tends to improve forecast performance.

The second discussion relates to the accuracy of the SPF in predicting the CLP. Similar to the findings reported by [13], the SPF’s MSPE—now measured from the day of the respondent’s response—is statistically lower than that of the benchmark prior to May 2018 at several forecasting horizons. However, the survey’s performance deteriorates significantly after this date, both in terms of MSPE and Mean Directional Accuracy. This decline may be attributed to the SPF’s persistent downward bias observed after May 2018, suggesting that experts struggled to anticipate the magnitude of the depreciation that followed in the years thereafter.

Third, the relatively stable, positive and significant correlation between the CLP and the SPF across the entire sample—spanning multiple horizons—clearly suggests predictability. However, when coupled with the SPF’s underperformance in terms of accuracy, this serves as a clear illustration of the MSPE Paradox discussed by [18]. This paradox suggests that a strong dependence between the target variable and its predictor does not necessarily lead to a reduction in MSPE when compared to an independent benchmark, such as the DRW. Our findings show that the MSPE Paradox is not merely a theoretical artifact but a tangible phenomenon affecting Chilean peso forecasts produced by the SPF. This opens an intriguing avenue for future research: investigating whether this paradox also manifests in surveys for other exchange rates or in broader forecasting contexts.

Finally, we have applied a simple yet effective method for adjusting the inefficiencies of the SPF, which improves its accuracy in terms of both MSPE and MDA. However, it does not consistently outperform the RW+ in terms of MSPE. Thus, a key challenge for future research is to explore ways of transforming forecast inefficiencies into higher accuracy, with the goal of outperforming traditional benchmarks.

As mentioned in the introduction, accurate forecasts of the Chilean peso are not only valuable for practitioners and households interested in the future value of the CLP per se; they also have broader implications. As demonstrated by [6,20,29], commodity currencies like the CLP have predictive power for metal and fuel prices. Therefore, our Adjusted SPF can also be of use to experts in those markets.

An important avenue for future research would be to assess the robustness of our inference results in settings where comparisons were conducted without explicitly adjusting for the pre-selection of the best-performing methods. In particular, this issue is most relevant for the evaluation of the optimal benchmark identified in the first part of the paper, where the benchmark-selection step precedes the formal tests of predictive ability. Incorporating procedures that account for such pre-selection would allow us to determine whether the benchmark’s apparent superiority remains statistically valid once this additional source of uncertainty is taken into consideration.

References

  1. Meese RA, Rogoff K. Empirical exchange rate models of the seventies. J International Econ. 1983;14(1–2):3–24.
  2. Engel C, West KD. Exchange rates and fundamentals. J Political Econ. 2005;113(3):485–517.
  3. Kilian L, Taylor MP. Why is it so difficult to beat the random walk forecast of exchange rates? J International Economics. 2003;60(1):85–107.
  4. Ferraro D, Rogoff K, Rossi B. Can oil prices forecast exchange rates? An empirical analysis of the relationship between commodity prices and exchange rates. J Int Money Finance. 2015;54:116–41.
  5. Clark TE, West KD. Using out-of-sample mean squared prediction errors to test the martingale difference hypothesis. J Econometrics. 2006;135(1–2):155–86.
  6. Chen YC, Rogoff KS, Rossi B. Can exchange rates forecast commodity prices? Quarterly J Economics. 2010;125(3):1145–94.
  7. Cheung Y-W, Chinn MD, Pascual AG, Zhang Y. Exchange rate prediction redux: New models, new data, new currencies. J Int Money Finance. 2019;95:332–62.
  8. Engel C, Wu SPY. Forecasting the U.S. Dollar in the 21st century. J Int Econ. 2023;141:103715.
  9. Ren Y, Wang Q, Zhang X. Short-term exchange rate predictability. Finance Res Lett. 2019;28:148–52.
  10. Rossi B. Exchange rate predictability. J Economic Literature. 2013;51(4):1063–119.
  11. Capistrán C, López-Moctezuma G. Las expectativas macroeconómicas de los especialistas: una evaluación de pronósticos de corto plazo en México. El Trimestre Económico. 2010;77(2):275–312.
  12. Ince O, Molodtsova T. Rationality and forecasting accuracy of exchange rate expectations: Evidence from survey-based forecasts. J Int Financial Markets, Institutions and Money. 2017;47:131–51.
  13. Pincheira-Brown P, Neumann F. Can we beat the Random Walk? The case of survey-based exchange rate forecasts in Chile. Finance Res Lett. 2020;37:101380.
  14. Kiss T, Kladívko K, Silfverberg O, Österholm P. Market participants or the random walk – who forecasts better? Evidence from micro-level survey data. Finance Res Lett. 2023;54:103752.
  15. Pincheira P, Hardy N. More predictable than ever, with the worst MSPE ever. J Economic Forecasting. 2024;0(4):5–30.
  16. Diebold FX, Kilian L. Measuring predictability: theory and macroeconomic applications. J Applied Econometrics. 2001;16(6):657–69.
  17. Clements M, Hendry D. Forecasting economic time series. Cambridge University Press; 1998.
  18. Brown PP, Hardy N. The mean squared prediction error paradox. J Forecasting. 2024;43(6):2298–321.
  19. Mincer JA, Zarnowitz V. The evaluation of economic forecasts. In: Economic Forecasts and Expectations: Analysis of Forecasting Behavior and Performance. NBER; 1969. p. 3–46.
  20. Pincheira Brown P, Hardy N. Forecasting base metal prices with the Chilean exchange rate. Resources Policy. 2019;62:256–81.
  21. Pincheira P, Hardy N. Forecasting aluminum prices with commodity currencies. Resources Policy. 2021;73:102066.
  22. Brown PP, Hardy N. Forecasting base metal prices with exchange rate expectations. J Forecasting. 2023;42(8):2341–62.
  23. Pedersen M. Una nota introductoria a la encuesta de expectativas económicas. Estudios Económicos Estadísticos. 2010;82.
  24. Diebold FX, Mariano RS. Comparing predictive accuracy. J Busin Economic Statistics. 1995;13(3):253–63.
  25. West KD. Asymptotic inference about predictive ability. Econometrica. 1996;64(5):1067.
  26. Newey WK, West KD. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica. 1987;55(3):703.
  27. Newey WK, West KD. Automatic lag selection in covariance matrix estimation. Rev Economic Stud. 1994;61(4):631–53.
  28. Brown PP, Hardy N. Correlation-based tests of predictability. J Forecasting. 2024;43(6):1835–58.
  29. Pincheira-Brown P, Bentancor A, Hardy N, Jarsun N. Forecasting fuel prices with the Chilean exchange rate: Going beyond the commodity currency hypothesis. Energy Econ. 2022;106:105802.