Interplay between Telecommunications and Face-to-Face Interactions: A Study Using Mobile Phone Data

In this study we analyze one year of anonymized telecommunications data for over one million customers from a large European cellphone operator, and we investigate the relationship between people's calls and their physical location. We discover that more than 90% of users who have called each other have also shared the same space (cell tower), even if they live far apart. Moreover, we find that close to 70% of users who call each other frequently (at least once per month on average) have shared the same space at the same time, an instance that we call co-location. Co-locations appear indicative of coordination calls, which occur just before face-to-face meetings. Their number is highly predictable from the number of calls between two users and the distance between their home locations, suggesting a new way to quantify the interplay between telecommunications and face-to-face interactions.


Introduction
The interplay between telecommunications, travel and face-to-face meetings is an unresolved puzzle. In some cases it has been suggested that telecommunications may be a substitute for physical interaction [1], an idea that gained traction during the nineties with the rapid expansion of the Internet [2,3]. In other cases conflicting hypotheses have been made, including those of a complementary [4,5], neutral [6] or reinforcing [7] effect. Recently, social networks have been identified as possible predictors of travel behavior, as well as of the decision to telecommute [8,9]. Social interaction has thus been integrated in activity-travel models [10], in addition to the existing categories of travel such as commuting, leisure and business. Furthermore, researchers such as Urry and others [11-13] have argued that flows and meetings of people produce small worlds, which require connections and meeting places, a phenomenon also known as the new mobilities paradigm.
This study aims to provide a new perspective on the relationship between telecommunicating people and their physical locations through an assessment of anonymized Call Detail Records (CDRs). CDRs show great promise for academic research: they have recently been used to explore human communications [14,15], the geography of social networks [16,17], urban dynamics [18], and human mobility patterns [19-22]. In this paper we use them for the first time to study the relationship between the telecommunications patterns of any two people and their physical locations.

Results
We use a large anonymized dataset of billing records for over one million mobile phone users, which was gathered in Portugal over a twelve-month period between 2006 and 2007 (see Methods). We look at all communications between pairs of users, together with their locations at call time. As we are interested in comparing people's locations, we discard users for whom we do not have enough samples. We use two subsets: D1, which contains all reciprocal communications between the top 100,000 callers; and D2, which contains 10,000 pairs from D1, sampled at different home distances so as to preserve the home-distance distribution found in D1 (see Text S2). In what follows, we use D2 in cases where computational complexity limits the use of a larger set.
We discover that at least 93% of users in D1 who reciprocally call each other have shared the same cell tower area at least once in one year. The percentage decreases slightly as the distance between their homes increases, but the value is still above 90% for users living 100 km apart (see Figure 1). It appears that almost all remote communications are associated with physically sharing space. It should also be noted that we are underestimating the percentage, because our data is based only on locations at call time: users might have shared space without this being recorded in our data. Results are consistent with what was recently found by analyzing spatio-temporal coincidences in a database of geo-tagged pictures to infer social ties [23].
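The shared-space test described above can be sketched as a set intersection over the towers at which each user was observed at call time. This is a minimal illustration, not the authors' implementation: the record layout `(caller, callee, caller_tower, callee_tower)` and the function names are assumptions made for the example.

```python
from collections import defaultdict

def towers_by_user(records):
    """Map each user ID to the set of tower IDs observed at call time."""
    seen = defaultdict(set)
    for caller, callee, caller_tower, callee_tower in records:
        seen[caller].add(caller_tower)
        seen[callee].add(callee_tower)
    return seen

def shared_space_fraction(records):
    """Fraction of reciprocally calling pairs that used at least one common tower.

    Assumption: a pair is 'reciprocal' when each user has called the other
    at least once; time is ignored, as in the shared-space analysis.
    """
    seen = towers_by_user(records)
    directed = {(caller, callee) for caller, callee, _, _ in records}
    pairs = {tuple(sorted(p)) for p in directed if (p[1], p[0]) in directed}
    if not pairs:
        return 0.0
    shared = sum(1 for a, b in pairs if seen[a] & seen[b])
    return shared / len(pairs)
```

Because only locations at call time are available, a computation of this kind yields a lower bound on the true share of pairs that have been in the same area.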
If we also consider the temporal component, we can look at how often and where users are sharing the same space at the same time.
We restrict our attention to the case in which two users call each other while using the same cell tower. This scenario is based on the hypothesis that they are calling each other to coordinate a meeting in a nearby area, also called a "coordination knot" [24]. Of course, two people living or working close by could also call each other very often without physically meeting. We therefore exclude users living or working in the same cell tower area, estimated as described in Text S2. We define a co-location event between two users (who live and work in distinct locations) as a call between the users while they are connected to the same cellphone tower. Each co-location is characterized by a specific time and place. Based on this definition, we characterize the spatio-temporal features of co-location events, to see whether they represent a reasonable subset of actual face-to-face meetings between users.
Starting with the larger subset (D1), we analyze the relationship between calling activity and users' locations. Among the pairs of communicating users, 400,000 cases have two users calling each other while in the same cell tower area, 350,000 of which have distinct home and work locations. Interestingly, 38.33% of the communicating users co-locate at least once during the period examined. When stronger relationships are considered (users who call each other on average at least once per month) the percentage increases to 69.41%.
Call duration appears to increase with the distance between the users' homes (see black line in Figure 2). Calls that occur between co-located people (red line) have a shorter average duration, suggesting that people who co-locate call each other briefly to coordinate the exact meeting place and time.
We also find that the number of calls between two users increases just before and after their co-location (Figure 3). The probability is rather constant in the interval, with two peaks around 0 and 1 (consecutive co-location events). The presence of these peaks suggests that the considered events (co-locations) represent a reasonable proxy for face-to-face meetings. In particular, a peak of calls just before the co-location event suggests that the two people are talking on the phone to arrange a meeting, in line with what is hypothesized in [16,24]. The peak right after the co-location event might be explained by a follow-up call after the meeting.
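The definition of a co-location event can be sketched as a filter over call records. The `CDR` class, the `home` and `work` dictionaries (user ID to tower ID) and the function name are hypothetical; the home and work estimation itself is described in Text S2 and is not reproduced here. The sketch assumes that a pair is excluded when both users share the same home tower or the same work tower.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CDR:
    """One call record; field names are illustrative, not the operator's schema."""
    timestamp: int
    caller: str
    callee: str
    duration: int
    caller_tower: str
    callee_tower: str

def co_location_events(records, home, work):
    """Return (timestamp, tower, caller, callee) for calls made on a shared tower.

    Pairs whose estimated home towers or work towers coincide are skipped,
    since they could call from the same tower without meeting.
    """
    events = []
    for r in records:
        a, b = r.caller, r.callee
        same_home = home.get(a) is not None and home.get(a) == home.get(b)
        same_work = work.get(a) is not None and work.get(a) == work.get(b)
        if same_home or same_work:
            continue
        if r.caller_tower == r.callee_tower:
            events.append((r.timestamp, r.caller_tower, a, b))
    return events
```

Each returned tuple carries the time and place that characterize the event, which is what the later spatio-temporal analysis relies on.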
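One possible way to place calls on the 0 to 1 axis between consecutive co-location events is sketched below. This is an assumption about how such a normalization could be done, not a description of how Figure 3 was produced; the function name and the use of plain numeric timestamps are illustrative.

```python
from bisect import bisect_right

def normalized_call_positions(call_times, colocation_times):
    """Position of each call between the surrounding co-location events.

    0 corresponds to the previous co-location and 1 to the next one.
    Calls before the first or at/after the last co-location are dropped,
    because they have no enclosing pair of consecutive events.
    """
    events = sorted(colocation_times)
    positions = []
    for t in call_times:
        i = bisect_right(events, t)
        if i == 0 or i == len(events):
            continue
        start, end = events[i - 1], events[i]
        positions.append((t - start) / (end - start))
    return positions
```

A histogram of the returned positions over all pairs would then show whether calls cluster near 0 and 1, that is, just after and just before a co-location.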
We analyze the features of co-location places and compare them with geographical and communication differences between users. We define $d_1(l)$ and $d_2(l)$ as the distances traveled by the two users at every co-location event $l = 1, \ldots, m$, and compute three measures of comparison:
1. The median ratio between the shortest and longest distance at co-location time: $r_d = \mathrm{median}_l \, \frac{\min\{d_1(l), d_2(l)\}}{\max\{d_1(l), d_2(l)\}}$.
2. The fraction of times user 1 travels less than its peer, $r_{t1}$, that is, the share of co-location events $l$ for which $d_1(l) < d_2(l)$; $r_{t2}$ is the corresponding fraction for user 2.
3. The fraction of times one of the users travels less than the peer: $r_t = \min\{r_{t1}, r_{t2}\}$.
The first measure $r_d$ allows a comparison to be made between the lengths of the two users' trips. On the D2 subset, we find on average $r_d = 0.3$, i.e. one user travels about one third as far as the other. Given this asymmetry in the length of trips, we ask whether the shorter trips are always taken by the same user, or whether the two users share the short trips. The third measure $r_t$ allows an evaluation of the asymmetry at the pair level, showing an average of 0.06. This suggests that in 94% of the selected pairs there is one user who consistently travels less than the peer. The second measure $r_{t1}$ is a directed measure and is computed to see whether geographical and communication differences allow the user who travels less to be predicted. Text S3 reports how these measures vary with the distance between homes, population density, normalized tie strength and call direction. In particular we find that as the distance between users' homes increases, co-locations occur in a place that is closer to one of the users. Moreover, the more the normalized tie strength differs between users, the more the co-locations occur in places close to one of them.
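The three comparison measures can be computed for a single pair as sketched below. The function name is hypothetical, and the handling of ties (equal distances count for neither user) and of zero distances (rejected) are assumptions, since the text does not specify them.

```python
from statistics import median

def trip_asymmetry(d1, d2):
    """Return (r_d, r_t1, r_t2, r_t) for one pair of users.

    d1[l] and d2[l] are the distances traveled by user 1 and user 2
    to co-location event l; both lists must have the same length m > 0.
    """
    if len(d1) != len(d2) or not d1:
        raise ValueError("need two non-empty lists of equal length")
    if any(max(x, y) <= 0 for x, y in zip(d1, d2)):
        raise ValueError("the longer distance must be positive at every event")
    m = len(d1)
    # Median over events of shortest / longest trip.
    r_d = median(min(x, y) / max(x, y) for x, y in zip(d1, d2))
    # Fraction of events in which each user travels strictly less than the peer.
    r_t1 = sum(x < y for x, y in zip(d1, d2)) / m
    r_t2 = sum(y < x for x, y in zip(d1, d2)) / m
    # Pair-level asymmetry: close to 0 when the same user always travels less.
    r_t = min(r_t1, r_t2)
    return r_d, r_t1, r_t2, r_t
```

For example, `trip_asymmetry([1, 2, 3], [4, 4, 1])` describes a pair in which user 1 makes the shorter trip in two of three events, so the pair-level measure is one third.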
Our definition of distance $d$ is based on the Euclidean distance between home and co-location places. Two limitations arise from this choice: 1) the Euclidean distance does not take into account the real path taken by a person; 2) the person might not travel directly from home, so the origin of the trip to the co-location place could be different. However, as we are interested in the relative distances traveled by the two peers, we can assume that both limitations affect the two measures in a similar manner, thus limiting the potential bias.
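As a small illustration of this distance definition, the sketch below computes the two trip lengths for one co-location from tower coordinates. It assumes that tower positions are available in a planar projection with coordinates in kilometres; the `tower_xy` dictionary and the function name are hypothetical.

```python
import math

def trip_distances(home_1, home_2, place, tower_xy):
    """Euclidean distances from each user's home tower to the co-location tower.

    tower_xy maps a tower ID to planar (x, y) coordinates in km (an assumption);
    the straight-line distance ignores the real route and the real trip origin.
    """
    d1 = math.dist(tower_xy[home_1], tower_xy[place])
    d2 = math.dist(tower_xy[home_2], tower_xy[place])
    return d1, d2
```

Since both distances are measured in the same way, the ratio between them is less sensitive to these simplifications than either distance taken alone.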
We evaluate the relationship between the distance between home locations and the number of co-locations between users. Figure 4(a) shows the average number of co-locations, which decreases with distance. The result is consistent with what was found in [12,25-27] using data from surveys. If we compare this decrease with that of the number of phone calls and of total call time (see Figure 4(a)), we find different decays with distance. Total call time is the least affected by distance (slope -0.04), followed by the number of calls (slope -0.07). In contrast, the number of co-locations is strongly affected by distance (slope -0.14). Even if we consider a broader definition of co-location, in which two users are considered co-located if they happen to make a phone call (not necessarily to each other) from the same cell tower area within one hour, we still find a similar decreasing trend, as shown in Figure 4(b), computed for the D2 subset. The results are consistent with those from the analysis of fixed phone data combined with interviews, which show the effect of distance on call duration and frequency of meetings [28,29].
The number of calls has a strong influence on the number of co-locations, suggesting that the more people call each other, the more they co-locate (see Figure 5). As there appears to be a clear relationship between call patterns, distance and co-locations, we tried to build a predictor of the number of co-locations, starting from a measure of interaction (number of calls) and the geographical distance between users' homes, obtaining $r^2 = 0.61$ with the model (Figure 6): $\#\mathrm{colocations} = 0.92 \, \frac{\#\mathrm{calls}^{0.60}}{\mathrm{distance}^{0.08}}$. This result suggests that geography and telecommunication interactions account for 61% of the variation in the number of co-locations (see also Text S4). This is consistent over the one-year time frame under analysis, as reported in Text S5. The exponent 0.60 for #calls reflects the positive association between the number of calls and the number of co-locations. This result suggests that telecommunications might play a complementary role in facilitating face-to-face interactions, supporting the observations found in other studies [4,5].
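A power-law predictor of this kind can be estimated by ordinary least squares on logarithms. The sketch below is one common way to do so and is not necessarily the procedure used in Text S4; the function names are hypothetical, and the approach assumes strictly positive calls, distances and co-location counts (pairs with zero co-locations cannot be log-transformed).

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_power_law(calls, distance, colocations):
    """Fit colocations ~ k * calls**a * distance**b; return (k, a, b, r2).

    The fit is linear in log space: log(co) = log(k) + a*log(calls) + b*log(dist).
    r2 is the coefficient of determination computed on the log values.
    """
    X = [(1.0, math.log(c), math.log(d)) for c, d in zip(calls, distance)]
    y = [math.log(v) for v in colocations]
    # Normal equations (X^T X) beta = X^T y for the three coefficients.
    A = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    rhs = [sum(row[i] * t for row, t in zip(X, y)) for i in range(3)]
    log_k, a, b = solve_linear(A, rhs)
    fitted = [log_k + a * row[1] + b * row[2] for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return math.exp(log_k), a, b, 1.0 - ss_res / ss_tot
```

In this parameterization a negative fitted `b` corresponds to co-locations becoming rarer as the distance between homes grows, and `r2` plays the role of the share of variation accounted for by the two predictors.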

Discussion
In this study we analyze one year of telecommunications data from a large European cellphone operator to investigate the relationship between people's calls and their physical location.
We discover that more than 90% of users who called each other have also shared the same space (cell tower), even if they live far apart. Moreover, we find that 69% of users who call each other frequently (at least once per month on average) have shared the same space at the same time, an instance that we call co-location. Co-locations appear highly indicative of coordination calls occurring just before face-to-face meetings. We are able to account for 61% of the variation in the number of co-locations from the number of calls and the distance between users' homes. In particular, as the distance between homes increases, the expected number of co-locations decreases.
We also characterize the co-location places in terms of distance from the home locations. As the distance between the users' homes increases, co-locations occur in a place that is closer to one of the users. In more than 90% of the cases, co-locations take place in an area that is closer to the same user of the pair (there is low reciprocity in the travel distance covered). Telecommunication strength helps predict which person of the pair travels less.
We believe that the above results suggest new ways to use CDRs to investigate the old conundrum of the interplay between telecommunications, travel and face-to-face meetings, with applications in the social sciences, urban planning and transportation studies.

Dataset
We use a large anonymized dataset of billing records for over one million mobile phone users, which was gathered in Portugal over a twelve-month period between 2006 and 2007. To safeguard personal privacy, individual phone numbers were anonymized by the operator before leaving storage facilities, and they were identified with a security ID (hash code). Each entry in the dataset is a CDR, which consists of the following information: timestamp, caller's ID, callee's ID, call duration, caller's cell tower ID, and callee's cell tower ID. This metadata on each call allows us to study both the mobile social interaction and the physical location of the users within the dataset. Notably, the dataset does not contain information regarding text messages (SMS) or data usage (internet). More details about the dataset can be found in Text S1.

Supporting Information
Text S1 Dataset. (PDF)
Text S2 Home and work location determination. (PDF)
Text S3 Co-location places, geography and communication strength. (PDF)
Text S4 Statistical analysis. (PDF)
Text S5 Relationship between co-locations and calls over time. (PDF)