Markovian city-scale modelling and mitigation of micro-particles from tires

The recent uptake in popularity in vehicles with zero tailpipe emissions is a welcome development in the fight against traffic induced airborne pollutants. As vehicle fleets become electrified, and tailpipe emissions become less prevalent, non-tailpipe emissions (from tires and brake disks) will become the dominant source of traffic related emissions, and will in all likelihood become a major concern for human health. This trend is likely to be exacerbated by the heavier weight of electric vehicles, their increased power, and their increased torque capabilities, when compared with traditional vehicles. While the problem of emissions from tire wear is well-known, issues around the process of tire abrasion, its impact on the environment, and modelling and mitigation measures, remain relatively unexplored. Work on this topic has proceeded in several discrete directions including: on-vehicle collection methods; vehicle tire-wear abatement algorithms and controlling the ride characteristics of a vehicle, all with a view to abating tire emissions. Additional approaches include access control mechanisms to manage aggregate tire emissions in a geofenced area with other notable work focussing on understanding the particle size distribution of tire generated PM, the degree to which particles become airborne, and the health impacts of tire emissions. While such efforts are already underway, the problem of developing models to predict the aggregate picture of a network of vehicles at the scale of a city, has yet to be considered. Our objective in this paper is to present one such model, built using ideas from Markov chains. Applications of our modelling approach are given toward the end of this note, both to illustrate the utility of the proposed method, and to illustrate its application as part of a method to collect tire dust particles.


Introduction
The recent uptake in popularity of vehicles with zero tailpipe emissions is a welcome development in the fight against traffic induced airborne pollutants, see historical review. The deployment of such vehicles is consistent with the prevailing contemporary narrative, which is heavily focussed on mechanisms to abate mobility related greenhouse gases and tailpipe pollutants; see [1] for a snapshot of some recent work across several disciplines on this topic. However, as vehicle fleets become electrified, non-tailpipe emissions (from tires and brake disks) are likely to become a major concern for human health, and this is likely to be exacerbated by the transition to electric vehicles due to their heavier weight and increased torque capabilities [2,3]. The issue of emissions from tire wear is in itself a very old topic. Somewhat remarkably, issues around the process of tire abrasion, its impact on the environment and human health, and modelling and mitigation measures, remain relatively unexplored and poorly understood. In addition, the general public seems unaware that these emissions are significant and almost certainly harmful to human health. That the topic is relatively unexplored in automotive engineering, and unknown to the general public, is very surprising given the rate at which tire mass abrades from moving vehicles and contributes to particulate matter (PM). PM is a generic term for a type of pollutant that consists of a complex and varied mix of small particles. There is a growing and rich literature documenting the link between PM and its effects on human health [4-9]. A recent review of the impact of tire and road wear particles can be found in [10].
Roughly speaking, smaller PM particles tend to be more directly harmful to humans than larger ones, as they can travel deeper into the respiratory system [6,11-13] (though larger toxic particles can also cause harm if they enter our food systems). Some of the known health effects related to PM include oxidative stress, inflammation and early atherosclerosis. Other studies have shown that smaller particles may enter the bloodstream and thus translocate to the liver, the kidneys or the brain (see [14] and references within). According to the World Health Organisation, the maximum level of PM2.5 deemed safe is on average 25 μg/m³ on a daily basis, whereas the annual maximum permitted level is on average 10 μg/m³. For PM10, the maximum permitted levels are on average 50 μg/m³ and 20 μg/m³ on a daily and annual basis, respectively. In general, based on these numbers, it is acknowledged that non-exhaust emissions (including brake and tire wear, road surface wear and resuspension of road dust) resulting from road traffic account for a significant component of traffic related PM emissions [15]. To put these numbers in the context of a specific city, it was recently estimated that approximately 186 kg of tire mass is lost to abrasion in Dublin each day [2].
Recently, the issue of tire generated PM emissions has become a topic of interest for several groups worldwide [16-18]. Roughly speaking, work on this topic has proceeded in several directions, focussing on on-vehicle collection methods, on vehicle tire-wear abatement algorithms, or on estimating properties of tire debris. For example, several of the authors of [2], the Tyre Collective, have constructed a prototype on-wheel system for the collection of tire debris [19]. Other authors [20,21] have explored controlling the ride characteristics of a vehicle with a view to abating tire emissions. A further approach in [2] explores access control mechanisms to manage aggregate tire emissions in a geofenced area. Other notable work on the topic focusses on the particle size distribution of tire generated PM, or on the degree to which it becomes airborne [8]. While currently available emission factors for tire wear in the literature give estimates of vehicle emissions of between 0.005 and 100 g/km, no reliable method to calculate tire related PM or tire wear as a function of, for example, the vehicle operation appears to be available [15,22]. The issue of which particulates become airborne is in fact the subject of some debate in the community. We emphasise that we are not concerned with such classifications. While research on emissions has focussed on airborne pollutants, the reality is that both airborne and non-airborne particles are problematic for humans. Particles that become airborne have the potential to contribute to poor air quality in cities with all the ensuing health consequences; those that fall to the ground have the potential to enter water systems and contribute to the general problem of environmental microplastic pollution. Thus both manifestations of the tire pollution problem need to be addressed.
Our objective in this paper is to develop city-scale models of tire pollution, for both airborne and non-airborne PM, that can be used to inform policy makers in the fight to mitigate the effect of tire abrasion. We have already mentioned that the issue of tire wear is an old and relatively unexplored topic in automotive engineering, and is subject to large sources of uncertainty. For example, tire induced PM depends not only on the chemical composition of tires, but also on traffic densities, speeds, driving styles, and road surfaces. Indeed, the ultimate impact on humans depends on the effect of large aggregations of vehicles, each driven by drivers with differing styles, and with different tires. Given this uncertainty, there is clearly a need for robust and efficient methods that indicate the locations where large accumulations of tire mass are likely to be found. We would also like these models to capture the complex relationship between speed limits, traffic signalling, and densities, so that these parameters can be explored from the perspective of tire emissions. To do this we shall build on our previous work on Markovian models of traffic networks [23]. An important point to note in this context is that even though tire emission factors are not well known (perhaps even unknowable), the qualitative aspects of the tire abrasion process are understood (the qualitative effects of speed, acceleration, weight, road surface). This is important in the context of Markovian network emission models which, even though uncertain, do tell us where build-ups are likely to occur and how important each road segment is as a source of road debris. We shall show how such models enable a number of important applications in the fight against tire dust; in particular, how they can be used to inform tire dust collection strategies and to inform vulnerable road users such as pedestrians and cyclists.

Markovian models of traffic systems
The use of Markov chains for traffic congestion analysis was first proposed in [23]. Since then the idea has been developed and applied to other traffic related issues in a series of papers [24,25] and by other authors [26,27]. For convenience, we now briefly recall some of the background on such models; a more thorough explanation can be found in the previous references. Traffic flows can be described through a Markov chain, which is a stochastic process characterized by the equation
$$p(x_{k+1} = S_{i_{k+1}} \mid x_k = S_{i_k}, \ldots, x_0 = S_{i_0}) = p(x_{k+1} = S_{i_{k+1}} \mid x_k = S_{i_k}), \qquad (1)$$
where $p(E \mid F)$ denotes the conditional probability that event $E$ occurs given that event $F$ occurs. Eq (1) states that the probability that the random variable $x$ is in state $S_{i_{k+1}}$ at time step $k+1$ depends only on the state of $x$ at time step $k$ and not on preceding values. Usually the Markov chain with $n$ states is described by the $n \times n$ transition probability matrix $P$, whose entry $P_{ij}$ denotes the probability of passing from state $S_i$ to state $S_j$ in exactly one step. Clearly the matrix $P$ is a matrix whose rows sum to one (a row-stochastic non-negative matrix). Markov chains are particularly useful for traffic systems due to their close association with graphs (in the context of traffic road networks). Recall that a graph is represented by a set of nodes that are connected through edges. The graph associated with the matrix $P$ is a directed graph, whose nodes are represented by the states $S_i$, $i = 1, \ldots, n$, and there is a directed edge leading from $S_i$ to $S_j$ if and only if $P_{ij} \neq 0$. The strong connection between graphs and Markov chains manifests itself in many ways. For example, the notions of chain irreducibility and strongly connected graphs are enunciations of the same concept. More precisely, a graph is strongly connected if and only if for each pair of nodes there is a sequence of directed edges leading from the first node to the second one. Thus, $P$ is irreducible if and only if its directed graph is strongly connected.
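As an illustration of the equivalence between irreducibility and strong connectivity, the following minimal sketch checks both properties numerically. The function names and the two small example matrices are hypothetical and chosen only for illustration; the check relies on the fact that a directed graph is strongly connected exactly when every node can be reached from an arbitrary node both in the graph and in the graph with all edges reversed.

```python
import numpy as np

def is_row_stochastic(P, tol=1e-9):
    """True if P is non-negative and every row sums to one."""
    P = np.asarray(P, dtype=float)
    return bool(np.all(P >= -tol) and np.allclose(P.sum(axis=1), 1.0, atol=tol))

def _reachable(adj, start):
    """Set of nodes reachable from `start` by depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        i = stack.pop()
        for j in np.flatnonzero(adj[i]):
            j = int(j)
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

def is_irreducible(P):
    """P is irreducible iff its directed graph is strongly connected.

    There is an edge S_i -> S_j iff P_ij != 0. Strong connectivity is
    tested by reachability from node 0 in the graph and in its reverse.
    """
    adj = np.asarray(P, dtype=float) > 0
    n = adj.shape[0]
    return len(_reachable(adj, 0)) == n and len(_reachable(adj.T, 0)) == n

# Hypothetical examples: a three-road ring and a chain with an absorbing state.
P_ring = np.array([[0.5, 0.5, 0.0],
                   [0.0, 0.5, 0.5],
                   [0.5, 0.0, 0.5]])
P_absorbing = np.array([[1.0, 0.0],
                        [0.5, 0.5]])
```

For a road network, a failed check would typically point to a segment (or group of segments) that vehicles can enter but never leave in the collected data.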
The usefulness of Markov chains for road networks extends well beyond their close relation to graphs. In particular, many easily computable properties of the chain (from the transition matrix) also have strong physical interpretations. For example, for irreducible transition matrices, it is known that the spectral radius of $P$ is 1. This fact is used in applications to detect communities in chains associated with transportation networks. Moreover, the left-hand Perron eigenvector $\pi$ of the matrix $P$, that is $\pi^T P = \pi^T$ with $\pi_i > 0$ and $\|\pi\|_1 = 1$, yields a closed form expression for the stationary distribution of a random walker over the graph associated with the Markov chain. As such it has a strong connection to likely congestion locations in transportation networks. We shall exploit the Perron eigenvector in the present paper for the purpose of determining likely locations of high tire emissions. Finally, two other quantities that are useful for studying graphs and which can be easily computed are the Kemeny constant and the Mean First Passage Time. The mean first passage time (MFPT) $m_{ij}$ from the state $S_i$ to the state $S_j$ denotes the expected number of steps to arrive at destination $S_j$ when the origin is $S_i$, where the expectation is averaged over all possible paths following a random walk from $S_i$ to $S_j$. If we assume that $m_{ii} = 0$, then the Kemeny constant is defined as
$$K = \sum_{j=1}^{n} m_{ij}\,\pi_j.$$
Remarkably, the right-hand side is independent of the choice of the origin state $S_i$ [28]. An interpretation of this result is that the expected time to get from an initial state $S_i$ to a destination state $S_j$ (selected randomly according to the stationary distribution $\pi$) does not depend on the starting point $S_i$ [29]. Therefore, the Kemeny constant is an intrinsic measure of a Markov chain. If $\lambda_1 = 1, \lambda_2, \ldots, \lambda_n$ denote the eigenvalues of $P$, it can equivalently be computed as
$$K = \sum_{j=2}^{n} \frac{1}{1 - \lambda_j}. \qquad (2)$$
Eq (2) emphasizes the fact that $K$ is only related to the particular matrix $P$, and that it becomes very large if one or more of the other eigenvalues of $P$, different from $\lambda_1$, are close to 1.
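The quantities recalled above can be computed with a few lines of linear algebra. The sketch below is illustrative only: the function names and the three-state example matrix are hypothetical, and the mean first passage times are obtained through the standard fundamental matrix $Z = (I - P + \mathbf{1}\pi^T)^{-1}$ with $m_{ij} = (z_{jj} - z_{ij})/\pi_j$, which is one of several equivalent ways of computing them. The Kemeny constant is then evaluated both from the MFPTs and from the eigenvalues of $P$.

```python
import numpy as np

def stationary_distribution(P):
    """Left Perron eigenvector of P, normalised so that its entries sum to 1."""
    P = np.asarray(P, dtype=float)
    vals, vecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(vals - 1.0))   # eigenvalue closest to 1
    pi = np.real(vecs[:, k])
    return pi / pi.sum()                # also fixes the sign of the eigenvector

def mean_first_passage_times(P):
    """Matrix M with M[i, j] = m_ij and the convention m_ii = 0."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    pi = stationary_distribution(P)
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    return (np.diag(Z)[None, :] - Z) / pi[None, :]

def kemeny_from_mfpt(P, origin=0):
    """K = sum_j m_ij * pi_j, evaluated for a chosen origin state i."""
    pi = stationary_distribution(P)
    M = mean_first_passage_times(P)
    return float(M[origin] @ pi)

def kemeny_from_eigenvalues(P):
    """K = sum over the eigenvalues other than lambda_1 = 1 of 1 / (1 - lambda_j)."""
    vals = np.linalg.eigvals(np.asarray(P, dtype=float))
    k = np.argmin(np.abs(vals - 1.0))
    others = np.delete(vals, k)
    return float(np.real(np.sum(1.0 / (1.0 - others))))

# Hypothetical three-road example (irreducible, row-stochastic).
P_example = np.array([[0.50, 0.50, 0.00],
                      [0.25, 0.50, 0.25],
                      [0.00, 0.50, 0.50]])
```

Evaluating `kemeny_from_mfpt` for different values of `origin` gives a direct numerical check that the constant does not depend on the starting state.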
The use of Markov chains to model road network dynamics has been described in detail in [23] and in many subsequent papers by other authors [26,27]. The resulting networks are fully characterized by a transition matrix $P$, which has the following form:
$$P = \begin{bmatrix} P_{S_1 \to S_1} & P_{S_1 \to S_2} & \cdots & P_{S_1 \to S_n} \\ P_{S_2 \to S_1} & P_{S_2 \to S_2} & \cdots & P_{S_2 \to S_n} \\ \vdots & \vdots & \ddots & \vdots \\ P_{S_n \to S_1} & P_{S_n \to S_2} & \cdots & P_{S_n \to S_n} \end{bmatrix}. \qquad (3)$$
The matrix $P$ is a square matrix whose size is given by the number of road segments. The off-diagonal elements $P_{S_i \to S_j}$ are related to the probability that one passes directly from the road segment $S_i$ to the road segment $S_j$. Importantly, the transition matrix can be very easily computed after gathering the average travel times and junction turning probabilities. In our models the diagonal terms are determined by the travel times. If travel times are computed for all roads, and they are normalized so that the smallest travel time is 1, then the probability value associated to each self-loop is
$$P_{ii} = \frac{tt_i - 1}{tt_i}, \quad i = 1, \ldots, n, \qquad (4)$$
where $tt_i$ is the average travel time (estimated from collected data) for the $i$-th road. The off-diagonal elements of the transition matrix $P$ can be obtained as
$$P_{ij} = (1 - P_{ii})\, tp_{ij}, \quad i \neq j, \qquad (5)$$
where $tp_{ij}$ is the turning probability (estimated from collected data) of going from road $i$ to road $j$ [23]. In the next section we shall explain how this basic transition matrix (3) can be modified to model the evolution of tire emissions in an urban landscape. Comment: The interested reader may ask what the advantage of a Markovian model of traffic is, as compared with using a traffic simulator such as SUMO, which we shall use extensively in the remainder of this paper for validation purposes. The principal advantages of a theoretical approach are threefold. First, in terms of utility, once identified, the Markovian model gives access to predictions in a very efficient manner, especially when compared with Monte-Carlo based approaches built on vehicle simulators.
Second, following from the previous point, the parameters of mathematical models can be efficiently adjusted to explore traffic management strategies, without the need for ensembles of complex simulations. Finally, by developing a Markovian (transition matrix) approach to traffic modelling, one may avail of a well developed suite of analytics that have been developed to analyse Markovian systems over the past century. This can then be used both to study and analyse the properties of transportation networks, as well as providing a basis for the design of network-level traffic policies. Indeed, this has been explored in a series of papers on traffic modelling since the publication of [23]; see [25] for examples of work in this direction.
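The construction of the travel-time transition matrix described in this section can be sketched as follows. The sketch assumes that the self-loop probability of road $i$ is $(tt_i - 1)/tt_i$, so that the expected number of steps spent on the road equals its normalised travel time, and that the off-diagonal entries are the turning probabilities scaled by $1 - P_{ii}$. The function name and the example data are hypothetical.

```python
import numpy as np

def build_transition_matrix(travel_times, turning_probs):
    """Row-stochastic matrix P from average travel times and turning probabilities.

    travel_times  : length-n sequence of average travel times tt_i (any unit)
    turning_probs : n x n matrix tp with zero diagonal and rows summing to 1
    """
    tt = np.asarray(travel_times, dtype=float)
    tp = np.asarray(turning_probs, dtype=float)
    if np.any(np.diag(tp) != 0) or not np.allclose(tp.sum(axis=1), 1.0):
        raise ValueError("turning_probs needs a zero diagonal and rows summing to 1")
    tt = tt / tt.min()                    # smallest travel time becomes 1
    self_loop = (tt - 1.0) / tt           # assumed form of the diagonal entries
    P = (1.0 - self_loop)[:, None] * tp   # assumed form of the off-diagonal entries
    P[np.diag_indices_from(P)] = self_loop
    return P

# Hypothetical data for three road segments (travel times in seconds).
tt_example = [10.0, 20.0, 40.0]
tp_example = np.array([[0.0, 0.5, 0.5],
                       [1.0, 0.0, 0.0],
                       [0.3, 0.7, 0.0]])
P_travel = build_transition_matrix(tt_example, tp_example)
```

With this choice the fastest road has no self-loop, while slower roads retain the random walker for proportionally more steps.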

Extension of Markovian model to tire emissions
Our starting point in developing a tire emissions model is the assumption that the Perron eigenvector of a traffic congestion matrix also provides some relevant information about tire emissions. This is a reasonable assumption because the entries of the Perron eigenvector report the average long-run fraction of time that a vehicle spends on each road. However, there is no precise relationship between emissions and density information, as tire emissions depend not only on the amount of time spent along a road, but also on other quantities, such as average driving style and average speeds. To capture such effects, as a first approximation in developing city-scale models of tire pollution, we shall now describe how the number of tire particles can be estimated depending on the vehicle's speed, and how this information can be embedded in the Markov chain transition matrix.
As we have mentioned, tire emission factors in the literature are characterised by huge uncertainty, varying between 0.005 and 100 g/km [14]. In any case, such a simple characterisation of tire based PM is not suitable for building a Markovian model of tire emissions; to build such a model, a tire based PM estimation model that depends on a vehicle's operation mode (for example, speed, acceleration, driving style) is required. To this end, we shall use measurements and results from [30] that show a dependency between the number of ultra-fine tire particles PN produced by a vehicle and the vehicle's operation. As particles, irrespective of size, tend to be harmful to human health [31], we shall in the sequel focus on the number of particles to evaluate the impact in the city network, and consequently adopt and develop the approach from [30] to estimate the number of particles.
Comment: To further justify our approach it is worth noting the approach adopted here has also been recognised by the latest EU legislative regulations. These place a higher emphasis on the number of particles rather than particle mass or size distribution [32].
The measurements from [30] show a linear dependency of PN on the vehicle's speed v, as well as an approximately quadratic dependency of PN on the forces on the wheels F. The combination of both curves leads to the estimate of PN given in Eq (6). It is not a trivial matter to gather information about values of F, and it is therefore more convenient to express F as a function of v (as aggregate estimates of v are simpler to obtain). To do so, we make the simplifying assumption that all roads in the city network are flat (without incline or elevation) and, in addition, for the sake of simplicity, accelerations are neglected.
Comment: The previous assumption introduces some approximations in our estimates (for example, accelerations would cause a higher number of tire particles (6)). However, we make two observations.
(i) First, it is important to note that if city-wide accelerations can be measured and aggregated, then this approach can be corrected to yield a more realistic and sophisticated model for tire particle estimation.
(ii) In many of our applications, we are interested in locations of elevated tire dust. While the simplified modelling approach will certainly affect the estimate of the absolute amount of PM gathered in a specific location, the relative ranking of locations (to guide, for example, collection) will be less affected by the modelling assumption.
Thus, in our case, the force can be approximated as a function of velocity as
$$F = m\,g\,c_r + \tfrac{1}{2}\,\rho\,c_d\,A\,v^2, \qquad (7)$$
where the first term describes the rolling resistance of the vehicle, while the second term takes into account the air drag resistance. In Eq (7), $m$ is the mass of the vehicle, $g$ is the gravitational acceleration, $c_r$ is the rolling resistance coefficient, $c_d$ is the drag resistance coefficient, $\rho$ is the density of air, and $A$ is the approximate frontal area of the vehicle. Numerical values for an average vehicle are given in Table 1. To estimate the number of tire particles per driven km, (6) and (7) are combined to give Eq (8). Note that the true process of generating tire particles is also affected by other factors that we are not considering here, such as road surface, type of tire, vehicle's weight [33] etc.
Comment: As a final comment, we further remark that Eq (8) gives the number of particles under rather approximated conditions and may underestimate the actual number of PN, as acceleration and braking events are neglected. However, as the Markovian models reveal densities, we expect these approximations to be reasonably accurate up to a scaling factor.
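To make the flat-road, constant-speed force approximation concrete, the sketch below evaluates the sum of a rolling resistance term $m g c_r$ and an air drag term $\tfrac{1}{2}\rho c_d A v^2$ for a few speeds. The function name is hypothetical, and the default parameter values are illustrative placeholders for a mid-sized passenger car; they are not the values of Table 1.

```python
def road_load_force(v, m=1500.0, g=9.81, c_r=0.01, c_d=0.3, rho=1.2, A=2.2):
    """Force in newtons at constant speed v (m/s) on a flat road.

    All default values are illustrative assumptions, not the paper's Table 1:
    m   vehicle mass (kg)            g   gravitational acceleration (m/s^2)
    c_r rolling resistance coeff.    c_d drag coefficient
    rho air density (kg/m^3)         A   approximate frontal area (m^2)
    """
    rolling = m * g * c_r                  # rolling resistance, independent of v
    drag = 0.5 * rho * c_d * A * v ** 2    # aerodynamic drag, quadratic in v
    return rolling + drag

# Forces at the six speed limits considered in the simulations (km/h -> m/s).
speed_limits_kmh = [20, 35, 50, 65, 80, 95]
forces = [road_load_force(kmh / 3.6) for kmh in speed_limits_kmh]
```

Because the drag term grows quadratically, the force, and with it any particle estimate that depends on the force, rises increasingly quickly at the higher speed limits.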
The estimate of the number of tire particles (8) is now used to convert the unit of time of the original transition matrix into a unit of tire particles, so that a step in the Markov chain now corresponds to a unit of tire emission. For this purpose, we rescale the diagonal entries of the transition matrix P using the estimate (8), where v i is the average velocity on the road segment i and l i is the length of the corresponding road segment. Then, the off-diagonal elements are re-normalized as stated in Eq (5) to keep the transition matrix row-stochastic. Comment: The effect of this diagonal scaling is the following. Large values on the diagonal of the original matrix P of Eq (3) corresponded to long times required to travel along a given road segment, whereas large values on the diagonal of the new transition matrix for the tire emissions model correspond to road segments with high tire emissions. More properties of the diagonal scaling can be found in [24]. Table 2 summarizes the interpretation of typical quantities of interest in Markov chains for the original transition matrix of travel times, and the new transition matrix related to tire emissions.
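The rescaling step can be sketched as follows, under an assumption that should be stated clearly: by analogy with the travel-time construction, each road segment is assigned an emission quantity $e_i$ (for example, the estimated number of particles per km at speed $v_i$ multiplied by the length $l_i$), the $e_i$ are normalised so that the smallest is 1, and the new self-loop probability is taken to be $(e_i - 1)/e_i$. The exact expression used for the diagonal entries may differ; the function name and example values are hypothetical.

```python
import numpy as np

def rescale_to_emissions(P, emissions):
    """Replace the diagonal of a travel-time matrix by emission-based self-loops.

    P         : row-stochastic travel-time transition matrix (diagonal < 1)
    emissions : per-segment emission quantity e_i > 0 (assumed input)
    """
    P = np.asarray(P, dtype=float)
    e = np.asarray(emissions, dtype=float)
    e = e / e.min()                              # smallest emission becomes 1
    d_old = np.diag(P).copy()
    off = P - np.diag(d_old)
    tp = off / (1.0 - d_old)[:, None]            # implied turning probabilities
    d_new = (e - 1.0) / e                        # assumed emission self-loops
    return np.diag(d_new) + (1.0 - d_new)[:, None] * tp

# Hypothetical travel-time matrix and per-segment emission quantities.
P_time = np.array([[0.000, 0.500, 0.500],
                   [0.500, 0.500, 0.000],
                   [0.075, 0.175, 0.750]])
e_example = [2.0, 1.0, 4.0]
P_tire = rescale_to_emissions(P_time, e_example)
```

The turning behaviour at junctions is unchanged by this operation; only the weight that the chain places on remaining in each segment is altered, so the Perron eigenvector of the rescaled matrix ranks segments by emissions rather than by time spent.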

Applications of tire emissions model
While the utility of Markovian traffic models has been documented in several publications, to the best of our knowledge their utility in the context of tire emissions has not yet been investigated. The objective of this section is to present some basic applications of the tire emissions model to illustrate its utility.

PLOS ONE

Application I-Design of low emissions zones
To illustrate potential applications of our approach, we now consider the design of a low emissions zone for a city. To provide some background context and link this to our previous work, we first consider the same urban network that was investigated in references [23,24]. This simple network is depicted in Fig 2. As can be seen from Figs 3 and 4, the vehicle densities obtained from the mobility simulator SUMO [34] (blue dashed line) are compared with those of the basic Markovian traffic model (black solid line), for every considered speed limit. A very close correspondence between the simulator output and the Markov chain can be observed. In the same figures, we also report the distribution of different pollutants as estimated using the Markov chain of emissions [24].
Comment: While the stationary distributions of different pollutants and travel times can be easily estimated with the Markovian approach in a few milliseconds, it is more complicated to retrieve the same information by using the simulator. Indeed, in the latter case an ensemble of simulations is required. It can be seen from Fig 3 that for low speed limits the density of pollutants is proportional to the density of vehicles (where there are more vehicles, there is more pollution); however, when higher speed limits are considered, the proportionality is lost, and different pollutants exhibit different properties with different speed limits. This last comment is further illustrated in Fig 5, which shows the optimal value of the Kemeny constant as a function of speed limits. Recall that in the context of the tire emissions model, the Kemeny constant K is a measure of the average number of emissions associated with trips in the network, and thus it is a single quantity of the Markov chain which can be interpreted as an indication of the pollution in the entire network.
In particular, a lower value of the Kemeny constant corresponds to a lower value of average emissions and a better overall network. As we have already observed, this may however be a tricky problem, since the optimal speed limits for tire particles may actually increase emissions from other pollutants. In order to calculate the optimal speed limit for the network, simulations for different maximum speeds, which are 20, 35, 50, 65, 80, 95 km/h, have been conducted using SUMO. After simulating the network for these six different speed limits, and building the resulting Markov chain, six different Kemeny constants can be calculated for each type of pollutant as well as for the travel time. It can be clearly observed that the optimal speed limit varies again for different types of pollutants. While low speed limits seem to be good for reducing CO and tire particles, high speed limits would be better to reduce NOx and of course travel time. Fig 5 depicts the non-obvious result that the "environmentally optimal" speed limit actually depends on the specific pollutant that one is interested in minimizing. In particular, 40 km/h appears to be the best speed limit if one aims at minimizing CO emissions, 60 km/h is the best choice for minimizing CO2 and tire emissions (which is the specific objective of this manuscript), 100 km/h is the best choice for minimizing NOx and Benzene, while, obviously, the maximum considered speed limit (i.e., 120 km/h) is the best option to minimize travel times. Thus, the selection of the "best" speed limit is not trivial, and policy makers should be informed about the optimal speed limits for different pollutants in order to make a decision.

Optimal speed limits in more realistic road networks.
To conclude this section we now confirm the above findings on a more realistic road network. To this end, rather than utilising the simple network previously illustrated, we now consider the artificial, but nevertheless realistic, network shown in Fig 6, where we assume that vehicles are allowed to travel in both directions on each road. We simulate the traffic flows utilising the previously mentioned simulator SUMO (Simulation of Urban MObility). SUMO has been developed at the Institute of Transportation Systems at the German Aerospace Center and is an open source traffic simulation package that has been frequently used for large traffic networks. Once pre-defined start and destination roads are chosen, SUMO can automatically assign shortest routes (e.g., minimum time routes) to the vehicles. After the simulation, statistics such as average travel times, average speeds and junction turning probabilities are available from SUMO for the whole network, and can be used to form the transition matrix P of travel times. Then, the average speed model can be used to form the transition matrix of tire emissions as explained in Section 3.

As before, in order to calculate the optimal speed limit for the network, simulations for different maximum speeds, which are 20, 35, 50, 65, 80, 95 km/h, have been conducted using SUMO. After simulating the network for these six different speed limits, and building the resulting Markov chain, six different Kemeny constants can be calculated for each type of pollutant as well as for the travel time. Fig 7 shows the Kemeny constants, normalized to fit the same graph. It can be clearly observed that the optimal speed limit varies again for different types of pollutants, confirming the results that had been provided for the simpler network. While low speed limits seem to be good for reducing CO and tire particles, high speed limits would be better to reduce NOx and of course travel time.

Application II-Advisory systems for protection of cyclists
One potential application of our Markov chain is related to the tire emissions footprint associated with specific routes. Active travel (cycling, walking) is experiencing a resurgence across the developed world as citizens abandon public transportation in response to health related concerns associated with COVID-19 [35,36]. Pedestrians and cyclists are extremely vulnerable road users and their exposure to traffic emissions regularly far exceeds that of car occupants. Given this context, our goal now is to use the Markovian model to find the minimum tire emission route for cyclists in the network, in order to reduce their exposure to emissions and consequently the harmful effects on their health. Here, we use the classic Dijkstra algorithm [37] to determine the best route but, differently from traditional applications, we do not wish to minimize distance or time, but rather the exposure to tire emissions. Thus, we associate each road segment with its corresponding entry in the Perron eigenvector (which, we recall, represents the normalized long-run fraction of tire emissions released along each road segment), and we use the Dijkstra algorithm to find the best path. In addition to computing the minimum tire emission route, one may also ask whether these best paths are sensitive to changing speed limits; namely, in other words, to know whether changing the speed limits in the network also
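A minimal sketch of the minimum-exposure routing step is given below. It assumes that the road network is available as a successor list between road segments and that the exposure of a route is the sum of the Perron eigenvector entries of the segments it traverses, including the first one. The function name, the four-segment network and the exposure values are hypothetical and serve only to illustrate the use of the Dijkstra algorithm with emission-based costs.

```python
import heapq

def min_exposure_route(successors, exposure, source, target):
    """Dijkstra search over road segments with per-segment exposure costs.

    successors : dict mapping a segment to the segments reachable from it
    exposure   : dict mapping a segment to its (non-negative) Perron entry
    Returns (route as a list of segments, total exposure), or (None, inf)
    when the target cannot be reached.
    """
    best = {source: exposure[source]}
    previous = {}
    heap = [(exposure[source], source)]
    settled = set()
    while heap:
        cost, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u == target:
            break
        for v in successors.get(u, ()):
            new_cost = cost + exposure[v]    # cost of also traversing segment v
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost
                previous[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return None, float("inf")
    route = [target]
    while route[-1] != source:
        route.append(previous[route[-1]])
    return route[::-1], best[target]

# Hypothetical network: two alternative routes from segment A to segment D.
successors_example = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
exposure_example = {"A": 0.1, "B": 0.5, "C": 0.2, "D": 0.2}
route, total = min_exposure_route(successors_example, exposure_example, "A", "D")
```

Since the Perron entries are non-negative, the standard correctness conditions of the Dijkstra algorithm hold; re-running the search with the eigenvector obtained under a different speed limit shows whether the preferred route changes.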

Application III-Tire-dust collection
We now present a somewhat unconventional application of the Markovian approach; namely, using the Markovian approach to inform the collection of tire dust by road sweepers [38]. Street sweeping is an effective practice to reduce the amount of road dust, and there is a recent interest in the literature to evaluate the effectiveness of the process [39] and to improve the efficiency of sweeping machines [40]. Here, we take a different view, and we are rather interested in the path followed by road sweepers. Indeed, we have already mentioned that tire particles are harmful to humans irrespective of whether they become airborne, or become part of ground debris. Ground debris is very harmful to humans due to the various pathways for tire particles to enter the human food system; in particular, through city drainage systems. This latter aspect is an important consideration to motivate the collection of tire particles prior to heavy rainfall events, or other severe weather events. In such circumstances, it is important to collect as much tire debris as quickly as possible, and this is in severe contrast to how road sweeping currently takes place.
Typically, road sweepers follow pre-defined paths, trying to cover most of the city, but without prioritising the parts that would maximize the collection of tire particles. In this particular context, our Markovian model has much to offer. Our basic intuition is as follows. Since the Perron eigenvector provides the long-run fraction of pollutants along each road segment, important information can be extracted from the Markov chain to inform the collection of tire particles in an optimal manner. However, as we have mentioned, our model is approximate and subject to much uncertainty. Thus, we propose to use our estimated chain to seed a learning based algorithm, using Reinforcement Learning (RL), that will learn the routes that are most likely to have large quantities of tire particles, and we now indicate how this can be achieved making use of the Markovian modelling approach.
Recall that reinforcement learning [41] is a machine learning strategy where agents (such as road sweepers) can explore an unknown environment, and learn optimal policies (such as the most likely route to find large quantities of tire particles). Reinforcement learning in conjunction with our Markovian models is appealing for this problem for two reasons.
(i) Our Markovian model could, in principle, be used as a basis for routing algorithms. However, as mentioned, the model is very uncertain, as it neglects several factors that affect tire particle generation. Thus, using a learning strategy to tune the elements of the transition matrix to provide a basis for routing makes a great deal of sense for such applications.
(ii) In addition, as we have already mentioned, the Perron eigenvector of the tire emission chain can be used to find the minimum tire emission routes for cyclists and other vulnerable road users. However, the road sweeping problem is much more challenging, since one wishes to find the most polluted routes. This problem is an example of a longest path problem, and such problems are known to be NP-hard. While it is true that longest path problems can sometimes be converted into shortest path problems by negating the edge weights in a graph [42], many shortest path algorithms are able to solve the problem only if the underlying graph does not contain any cycles. This is not common for road network graphs, thus further motivating our interest in reinforcement learning algorithms.
To orchestrate a setting for reinforcement learning that is well-posed, we must first ensure that negative cycles in the graph associated with the network transition matrix are avoided. Here, a negative cycle in a graph is a cycle whose weights sum to a negative value. To avoid such negative cycles, we simply add travel time/distance constraints to the longest path problem. In addition to making our solution well-posed, such constraints are very sensible for road sweeping applications, since electric road sweepers have limited battery capacity. To this end, we combine the tire emission graph with a distance graph into one directed weighted graph G. Recall that an entry of the Perron eigenvector represents the normalized long-run fraction of tire emissions released along the road segment assigned to that entry. The tire emission graph is then obtained by assigning negated entries of the Perron eigenvector to the corresponding edges in the graph. The distance graph is derived from a road network where the edge weight represents the length of the road segments included in the corresponding state [43]. This turns our problem into a type of multi-objective optimisation problem. We use a convex combination of the two objectives (travel distance and tire emissions), characterised by a quantity α, which is the weight of the distance component of the cost. The corresponding weight function is described in Function 1, which returns the weight of a given edge in graph G. This weight represents the cost of traversing state s from any preceding state.
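As an illustration of how the entries of the Perron eigenvector used above might be obtained, the following minimal sketch applies power iteration to a small transition matrix. It assumes a row-stochastic, irreducible and aperiodic matrix P, so that the Perron eigenvector is the stationary distribution π with πP = π; the 3-state matrix and the function name `perron_vector` are hypothetical.

```python
def perron_vector(P, tol=1e-12, max_iter=100_000):
    """Stationary distribution of a row-stochastic matrix P (list of rows).

    Assumes the chain is irreducible and aperiodic, so that power
    iteration converges to the unique normalized Perron eigenvector.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the chain: nxt = pi P.
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]  # keep the entries summing to one
        if sum(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical 3-state chain; each state stands for a road segment.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
pi = perron_vector(P)

# Emission graph: each state carries the negated Perron entry as its weight.
emission_weight = [-p for p in pi]
```

For this toy chain the middle state carries half of the long-run emissions, so it receives the most negative emission weight.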
Notation for the Weight Function: In Function 1 we have: α is the distance weight, a real number that satisfies 0 ≤ α ≤ 1; s is a state in the state space S; l_s is the length of road links included in state s (here, the road merging mechanism introduced in [43] is utilised); c_s is the amount of tire particles emitted along the road segments included in state s within some given time interval τ; L_tot is the sum of the road lengths for a given road network; C_tot is the total amount of tire particles emitted through the entire road network within time interval τ. C_tot can be estimated using some historical data, and we simply assume that C_tot is known.
To further elaborate on our proposed algorithm, we denote by α_min the minimum value of α such that the graph G does not have any negative cycles for weights computed as W(α, s, l_s, ĉ_s), where ĉ_s is the estimated amount of tire emissions along the road segments clustered in state s within the time interval τ. The value of ĉ_s is obtained by multiplying C_tot by the entry of the Perron eigenvector corresponding to state s. The value of α_min is determined empirically to two decimal digits of precision. Note that even though the weight w_s defined in Function 1 is measured in the units of l_s (i.e., distance), the values of w_s can be negative. This particular design of w_s results in a lower value of α_min compared to what one would obtain in the case of normalized unitless weights.
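The empirical search for the smallest negative-cycle-free α can be sketched as follows. Function 1 is not reproduced in the text, so the weight used here is an assumed form (distance cost minus an emission credit rescaled to distance units) that is merely consistent with the description above; it should not be read as the paper's Function 1. The helper names, the three-state toy network and the value of C_tot are likewise hypothetical. Negative cycles are detected with a Bellman-Ford pass, and α is scanned in steps of 0.01. The scan is valid because the assumed weight is non-decreasing in α, so the first cycle-free value is the minimum.

```python
def edge_weight(alpha, l_s, c_s, L_tot, C_tot):
    # ASSUMED form, not the paper's Function 1: the emission share
    # c_s / C_tot is rescaled by L_tot so that the weight is a distance.
    return alpha * l_s - (1.0 - alpha) * (c_s / C_tot) * L_tot


def has_negative_cycle(states, arcs, w):
    """Bellman-Ford check; entering state v over any arc (u, v) costs w[v]."""
    dist = {s: 0.0 for s in states}  # equivalent to a virtual source
    for _ in range(len(states) - 1):
        changed = False
        for u, v in arcs:
            if dist[u] + w[v] < dist[v] - 1e-12:
                dist[v] = dist[u] + w[v]
                changed = True
        if not changed:
            return False
    # Any arc that can still be relaxed lies on or behind a negative cycle.
    return any(dist[u] + w[v] < dist[v] - 1e-12 for u, v in arcs)


def find_alpha_min(states, arcs, length, c_hat, C_tot):
    """Smallest alpha on a 0.01 grid for which no negative cycle exists."""
    L_tot = sum(length.values())
    for k in range(101):
        alpha = k / 100
        w = {s: edge_weight(alpha, length[s], c_hat[s], L_tot, C_tot)
             for s in states}
        if not has_negative_cycle(states, arcs, w):
            return alpha
    return None


# Hypothetical ring of three states, each 100 m long.
states = ["a", "b", "c"]
arcs = [("a", "b"), ("b", "c"), ("c", "a")]
length = {"a": 100.0, "b": 100.0, "c": 100.0}
perron = {"a": 0.25, "b": 0.5, "c": 0.25}   # assumed Perron entries
C_tot = 1000.0                               # assumed total emissions
c_hat = {s: C_tot * perron[s] for s in states}

print(find_alpha_min(states, arcs, length, c_hat, C_tot))  # 0.5
```

In this toy ring the only cycle has total weight 300(2α − 1), which is non-negative exactly when α ≥ 0.5, in agreement with the value returned by the scan.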
Once a combined graph (without any negative cycles) has been constructed, we then use a shortest path algorithm to compute default solutions. To deal with values of α such that α < α_min, approximation techniques are required to find the maximum tire emissions routes subject to the distance constraints. To solve this problem, our preferred approximation tool is reinforcement learning. The default solutions from the Markov chain are used as the initial estimate for the reinforcement learning algorithm. Note that even though the number of tire particles is generally larger along longer routes, assigning very long routes to the road sweepers would dramatically increase their travel time and may not even be feasible due to battery constraints.
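One way in which a default route might seed a learning agent is sketched below using tabular Q-learning. This is an illustration under stated assumptions rather than the implementation used in our experiments: the reward is simply the negated weight of the state being entered (it is not the reward function of this paper), the seeding is done by giving the arcs of the default route a small initial Q-value, and the toy network, its weights and all parameter values are hypothetical.

```python
import random


def q_learning_route(succ, weight, origin, dest, seed_route,
                     episodes=300, lr=0.5, gamma=1.0, eps=0.2,
                     seed_bonus=1.0, max_steps=50, rng_seed=0):
    """Tabular Q-learning over states; an action is the next state to enter.

    Every state other than `dest` must have at least one successor.
    """
    rng = random.Random(rng_seed)
    Q = {(s, a): 0.0 for s in succ for a in succ[s]}
    # Seeding (assumption): bias early episodes towards the default route.
    for s, a in zip(seed_route, seed_route[1:]):
        Q[(s, a)] = seed_bonus

    for _ in range(episodes):
        s = origin
        for _ in range(max_steps):       # step cap guards against cycles
            if s == dest:
                break
            if rng.random() < eps:       # explore
                a = rng.choice(succ[s])
            else:                        # exploit
                a = max(succ[s], key=lambda x: Q[(s, x)])
            reward = -weight[a]          # hypothetical reward: negated cost
            future = 0.0 if a == dest else max(Q[(a, b)] for b in succ[a])
            Q[(s, a)] += lr * (reward + gamma * future - Q[(s, a)])
            s = a

    # Greedy rollout of the learned policy.
    route = [origin]
    while route[-1] != dest and len(route) <= max_steps:
        s = route[-1]
        route.append(max(succ[s], key=lambda x: Q[(s, x)]))
    return route, Q


# Hypothetical network: negative weights mark heavily polluted states.
succ = {"O": ["A", "B"], "A": ["B", "D"], "B": ["D"], "D": []}
weight = {"A": -5.0, "B": -3.0, "D": 1.0}
default_route = ["O", "B", "D"]          # e.g. a shortest-path default

route, Q = q_learning_route(succ, weight, "O", "D", default_route)
print(route)
```

In this toy setting the agent starts from the default route O, B, D and, through exploration, moves to the longer route through A and B, whose total weight is lower.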
We employ reinforcement learning to amend the initial estimate whenever the constraint α < α_min is satisfied (note that the case α = 1 corresponds to shortest path routing). We use the Q-learning algorithm proposed in [44]. The initial parameters [41] for the underlying Q-learning algorithm are obtained from the default solutions of the Markovian model. Actions of the agent (i.e., the road sweeper) represent road directions, for instance, turn left/right. The goal of the agent is to find a route which maximises the total expected reward [41]. The reward function for this application is outlined in Function 2: it returns a reward at time step t.
A realistic road network, based on an existing area in Barcelona, Spain, is used in all our experiments and is depicted in Fig 9. To illustrate our algorithm we describe several experiments, designed using the SUMO traffic simulator and randomly generated traffic conditions. In all our experiments, a single Q-learning agent (i.e., a road sweeper) is used, starting from the same origin O (see Fig 9) in each episode (day). The agent has a fixed destination D, to which it is asked to find the optimal route, which (i) should be the most polluted, and (ii) should satisfy the distance constraint given by the parameter α. A road sweeper is released every time a new episode starts, i.e., once per day. Regarding the design parameters in the reward function, the values of β_1 and β_2 were tuned to be 3 and 8, respectively. The values of α_min were empirically determined for each specific experiment.

Experiment 1: Tire dust collection under high traffic densities.
In this experiment, we first consider a scenario in which traffic density is high. To achieve such a condition, we release a new vehicle every simulation step. High traffic density conditions naturally result in a larger amount of tire dust on the streets. In these settings, the estimated value of α_min is 0.78.
Fig 9 shows the shortest path in red, the initial solution in dashed green, and the optimal solution in solid blue, obtained using the proposed Q-learning algorithm for high traffic densities. Fig 10 compares the properties of these three routes, with α = 0.5 used for the Q-learning solution. The brown dot-dashed line corresponds to the shortest path route, which was calculated using Dijkstra's shortest path algorithm on the graph with weights computed using Function 1 for α = 1. The blue dashed line corresponds to the route obtained from the initial estimate, i.e., the default solution. This route was also calculated using Dijkstra's shortest path algorithm, on a graph with weights computed using the same weight function for α = α_min = 0.78. As can be seen in Fig 10 (black solid line), the road sweeper uses the initial solution at the beginning of the learning process. The agent, however, can explore the environment by taking random actions, and eventually improves the chosen sweeping path. As can be observed, the agent rapidly finds routes that are much longer than the shortest path and default routes, and that are also more polluted with tire particles. Even though routes with higher tire emissions have been explored by the agent, it does not prefer them due to the distance constraint, and the Q-learning routing system eventually converges to the solution which is optimal for the selected value of α.

Experiment 2: Tire dust collection under low traffic densities.
To simulate traffic conditions with a lower density, a new vehicle is now released every second simulation step. In this case, the obtained value of α_min is 0.79. Fig 11 depicts the shortest path, the default solution, and the performance of Q-learning for α = 0.4. For low densities, it is reasonable to reduce the value of α in order to give more priority to pollution over distance. Otherwise, the Q-learning routing system would offer no benefit, as it may converge to the default solution. Note that the amount of pollution collected by the road sweeper in this case is indeed lower than that in the case of high density traffic (see the middle subplots in Figs 10 and 11).

PLOS ONE
From Experiment 1 and Experiment 2, we can conclude that the system is indeed able to find the optimal solution (the route most polluted with tire particles that does not break the travel distance/time constraints) in both high and low density traffic conditions. Thus, the RL strategy is an attractive alternative to tools that can be fragile, especially when dealing with longest path problems and large-scale scenarios.

Conclusion
The problem of tire dust collection is likely to become one of the most pressing issues in automotive research and in wider society. While the problem of micro-plastic pollution is already becoming an issue of concern, the problem of tire induced pollution has, remarkably, yet to manifest itself in the consciousness of the public-at-large, possibly due to the sheer weight of the zero-tailpipe narrative that currently prevails in public discourse. Our objective in this paper is thus twofold. First, we wish to make researchers across a wide spectrum of disciplines aware of this problem, in all its guises. Second, we wish to suggest mitigation measures that can be used to combat this problem. While previous studies have focussed on on-vehicle mitigation measures, and network level access control mechanisms, our approach here is somewhat different. Our approach is to develop modelling strategies that can be deployed a posteriori. Specifically, we wish to predict, using a combination of measurements and analytics, the likely areas where tire dust will aggregate, with a view to using this information to inform collection strategies. In this paper we have introduced one such model of tire dust distribution in cities. A number of application use-cases are suggested that use the main features of this model. Future work will explore refinements of this initial model and its experimental validation.