The Strength of Friendship Ties in Proximity Sensor Data

Understanding how people interact and socialize is important in many contexts from disease control to urban planning. Datasets that capture this specific aspect of human life have increased in size and availability over the last few years. We have yet to understand, however, to what extent such electronic datasets may serve as a valid proxy for real life social interactions. For an observational dataset, gathered using mobile phones, we analyze the problem of identifying transient and non-important links, as well as how to highlight important social interactions. Applying the Bluetooth signal strength parameter to distinguish between observations, we demonstrate that weak links, compared to strong links, have a lower probability of being observed at later times, while such links—on average—also have lower link-weights and probability of sharing an online friendship. Further, the role of link-strength is investigated in relation to social network properties.


Introduction
Measuring social ties in real-world data is not a simple procedure. In classical social science the standard approach is to use self-reported data. This method, however, is only practical for relatively small groups and suffers from cognitive biases, errors of perception, and ambiguities [1]. It has further been shown that the ability to capture behavioral patterns via self-reported data is quite limited [2]. A different approach for uncovering social behavior is to use digital records from emails and cell phone communication [3][4][5][6][7][8][9][10]. Although such analyses have improved our understanding of social ties, they have left many important questions unanswered. Are electronic traces a valid proxy for real social connections? Eagle et al. [11] began to answer this question by adding a spatial component to their data, using the short-range (∼10 m) Bluetooth sensor embedded in study participants' smartphones to measure physical proximity. Their results show that proximity data closely reflects social interactions in many cases. But since it is easy to think of examples where reciprocal Bluetooth detection does not correspond to social interaction (e.g. transient co-location in a dining hall), the question remains: which observations correspond to actual social interactions and which are just noise? Multiple alternatives to Bluetooth have been proposed for sensor-driven measurement of social interactions, each with particular strengths and weaknesses [12][13][14][15][16][17][18][19][20][21]. For example, Radio Frequency Identification (RFID) badges have short interaction ranges (1-4 m) and measure only face-to-face interactions, thus solving many of the resolution problems posed by Bluetooth [20,21]. However, this approach requires participants to wear custom radio tags on their chests at all times, unlike Bluetooth, which is ubiquitous across many types of modern technology.
Our investigation examines the role of Bluetooth signal strength, using a dataset obtained from apps running on the cell phones of 135 students at a large academic institution. Each phone records and sends to the researchers data about call and text logs, nearby Bluetooth devices, nearby Wi-Fi hotspots, cell towers, GPS location, and battery usage [22]. In addition, we combine the data collected via the phones with online data, such as social graphs from Facebook, for a majority of the participants. The study gathers data continuously, but in this paper we focus on Bluetooth proximity data gathered over 119 days during the academic year 2012-2013. Specifically, we use the received signal strength parameter to identify real face-to-face interactions and remove non-important links. Applying the method to our data, we compare the findings to a null model and demonstrate how removing links with low signal strength influences network structure. Moreover, we use estimated link-weights and an online dataset to validate the friendship-quality of removed links.

Dataset
We distributed phones among students from four study lines (majors), where each major was chosen based on the fraction of students interested in participating in the project. This selection method yielded a coverage of more than 93% of students per study line, enabling us to capture a dense sample of the social interactions between subjects. Such high coverage of a social group has not been achieved in earlier studies [11].
The data collector app installed on each phone follows a predefined scanning timetable, which specifies the activation and duration of each probe. Proximity data is obtained using the Bluetooth probe. Every 300 seconds each phone performs a Bluetooth scan that lasts 30 seconds. During the scan it registers all discoverable devices within its vicinity (5-10 m) along with the associated received signal strength indicator (RSSI) [23]. Recorded proximity data is of the form (i, j, t, s), denoting that person i has observed person j at time t with signal strength s. Only links between experiment participants are considered. Data collection, anonymization, and storage were approved by the Danish Data Protection Agency, and comply with both local and EU regulations. Written informed consent was obtained via electronic means, where all invited participants digitally signed the form with their university credentials. Along with the mobile phone study we also collected Facebook graphs for participants. Since this was voluntary, not all users donated their data; however, we obtained a user participation of ∼88% (119 users and 1018 Facebook friendships).

Identifying links
Independent of starting conditions, the scanning framework on one phone will drift out of sync with the framework on other phones after a certain amount of time, i.e. the phones will inevitably scan in a desynchronized manner. This desynchronization can mainly be attributed to internal drift in the time-protocol of each phone, depletion of the battery, and users manually turning phones off. To account for irregular scans, we divide time into windows (bins) of fixed width and aggregate the Bluetooth observations within each time-window into a weighted adjacency matrix. The complete adjacency matrix is then given by $W = W(\Delta t_1), W(\Delta t_2), \ldots, W(\Delta t_n)$, where each link is weighted by its signal strength and where $\Delta t_i$ indicates window number $i$. These matrices are generally non-symmetric, i.e. person A might observe B with signal strength $s$ while person B observes A with strength $s'$, or not at all. The scanning frequency of the app sets a natural lower limit of 5 minutes on the network resolution. If we are interested in the social dynamics at a different temporal resolution we can aggregate the adjacency matrices and retain entries according to some heuristic (e.g. the entry with the strongest signal). Depending on the level of description (monthly, weekly, daily, hourly, or every 5 minutes) the researcher must think carefully about the definition of a network connection. Frameworks for finding the best temporal resolution, so-called natural timescales, have been investigated for specific problems by Clauset and Eagle [24] and Sulo et al. [25]. In this paper, however, we are interested in the identification and removal of non-important proximity links, so aggregating multiple time-windows is not a concern here. Henceforth we work solely with 5-minute time-bins.
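The binning step described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the function name `bin_observations`, the use of seconds for t, and the choice to keep the strongest reading per directed pair within a bin are all assumptions made for the example.

```python
from collections import defaultdict

BIN_WIDTH = 300  # seconds; assumed to match the 5-minute scan interval


def bin_observations(observations, bin_width=BIN_WIDTH):
    """Group (i, j, t, s) records into fixed-width time-bins.

    Returns {bin_index: {(i, j): strongest RSSI seen for i -> j in that bin}}.
    Entries stay directed, so (i, j) and (j, i) may differ or be missing,
    mirroring the non-symmetric adjacency matrices in the text.
    """
    bins = defaultdict(dict)
    for i, j, t, s in observations:
        b = int(t // bin_width)
        previous = bins[b].get((i, j))
        # Assumed heuristic: keep the strongest signal per directed pair.
        if previous is None or s > previous:
            bins[b][(i, j)] = s
    return dict(bins)


# Hypothetical records: (observer, observed, time in seconds, RSSI in dBm).
obs = [
    ("A", "B", 10, -72),
    ("B", "A", 40, -85),
    ("A", "C", 290, -90),
    ("A", "B", 310, -60),
]
binned = bin_observations(obs)
```

Each value in `binned` plays the role of one sparse matrix W(∆t_i); coarser resolutions could be obtained by passing a larger `bin_width`.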
The Bluetooth probe logs all discoverable devices within a sphere with a radius of 5-10 meters; walls and floor divisions reduce the radius, but the reduction in signal depends on the construction materials [26]. Blindly taking proximity observations as a ground truth for face-to-face interactions will introduce both false negative and false positive links in the social network. False negative links are typically induced by hardware errors beyond our control, thus we focus on identifying false positive links. We therefore propose to identify non-important or noisy links via the signal strength parameter. The parameter can be thought of as a proxy for the relative distance between devices; since most people carry their phones on them, it in principle also indicates the separation distance between individuals.
Previous work has applied Bluetooth signals to estimate the position of individuals [27][28][29][30], but studies by Hay [31] and Hossein et al. [32] have revealed signal strength to be an unsuitable candidate for accurately estimating location. However, the complexity of the problem can be greatly reduced by focusing on the relative distance between individuals rather than position. In theory, the transmitted power between two antennae is inversely proportional to the square of the distance between them [33]. Reality is more complicated, due to noise and reflection caused by obstacles. We use the ideal result as a reference while we perform empirical measurements to determine how signal strength depends on distance. Two devices are placed on the ground in a simulated classroom setting, where we are able to control the relative distance between them. The resulting measurements are plotted in Fig. 1A. As is evident, there is a large variance in the measured signal strength values for each fixed distance. However, as both phones exhibit the same variance we can exclude faulty hardware; further, environmental noise such as interference from other devices or solar radiation can also be dismissed, since there appear to be no daily patterns in the data. We do observe multiple bands, or so-called modes, onto which measurements collapse. Ladd et al. [34] noted a similar behavior for the received signal strength of WiFi connections; both are phenomena caused by non-Gaussian distributed noise.
The empirical measurements form a foundation for understanding signal variance as a function of distance, but they were performed in a controlled environment. In reality, there are a multitude of ways to carry a smartphone: some carry it in a pocket, others in a bag. Liu and Striegel [35] have investigated how these various scenarios influence the received signal strength; their results indicate only minor variations, hence we conclude that the general behavior is similar to the measurements shown in the figure. Further, social interactions are not limited to office environments, so we have reproduced the experiment in outdoor and basement-like settings; the results are similar.
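As a reference for the ideal inverse-square behavior mentioned above, the relation can be expressed on the decibel scale: if received power falls off as 1/d², the signal drops by 20·log10(d/d0) dB relative to a reference distance d0. The sketch below assumes a hypothetical reference reading of −50 dBm at 1 m; this value is illustrative only and is not taken from the measurements in Fig. 1.

```python
import math


def ideal_rssi(distance_m, rssi_at_ref=-50.0, ref_distance_m=1.0):
    """Ideal (free-space, inverse-square) signal strength in dBm.

    rssi_at_ref is a hypothetical calibration value, not a measured one.
    Power ~ 1/d^2 implies a drop of 20*log10(d/d0) dB relative to d0.
    """
    if distance_m <= 0 or ref_distance_m <= 0:
        raise ValueError("distances must be positive")
    return rssi_at_ref - 20.0 * math.log10(distance_m / ref_distance_m)


# Under the ideal law, every doubling of distance costs about 6 dB.
drop_per_doubling = ideal_rssi(2.0) - ideal_rssi(1.0)
```

The measured distributions deviate from this curve because of noise and reflections, which is why the ideal result serves only as a reference.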
Bi-directional scanning yields at most two observations per dyad per 5-minute time-bin; we can either average over the measurements (Fig. 1B) or take the maximal value (Fig. 1C). Fig. 2 shows the distributions of signal strength for each respective distance. For raw data (Fig. 2A), we observe a localized zero-distance distribution, while the 1, 2, and 3-m distributions overlap considerably. Averaging over values per time-bin smooths out and compresses the distributions, but the bulk of the distributions still overlap (Fig. 2B). Taking only the maximal signal value into account separates the distributions more effectively (Fig. 2C). The reasoning is that phones are physically at different locations and we expect the distance to be maximally reflected in the distributions. Thus, by thresholding observations on signal strength, we can filter out proximity links that are likely to be further away than a certain distance. By doing so we are able to emphasize face-to-face links, while minimizing noise and filtering away non-important links.
From the behavioral data we count the number of appearances per dyad and assign the values as weights for each link. Link weights follow a heavy-tailed distribution, with a majority of pairs only observed a few times (low weights), a social behavior that has previously been observed by Onnela et al. [6]. Based on their weight we divide links into two categories: weak and strong. A link is defined as 'weak' if it has been observed (on average) less than once per day during the data collection period; the remaining links are characterized as 'strong'. An effective threshold should maximize the number of removed weak links, while minimizing the loss of strong links. Fig. 3 depicts the number of weak and strong links as a function of threshold value. We observe that, as we increase the threshold, the number of weak links decreases linearly, while the number of strong links remains roughly constant and then drops off suddenly. Taking into account both the maximum-value distance distributions (Fig. 2C) and link weights (Fig. 3), we choose the value (−80 dBm) that optimizes the ratio between strong and weak links. In a large majority of cases, this corresponds to interactions that occur within a radius of 0-2 meters, a distance which Hall [36] notes as a typical social distance for interactions among close acquaintances.
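The per-dyad maximum, the −80 dBm cut, and the weak/strong split described above can be sketched as follows. All function names are hypothetical, and the cutoff of 120 observations follows the Figure 3 caption; treat this as an illustration under those assumptions rather than the study's code.

```python
THRESHOLD_DBM = -80   # threshold chosen in the text
MIN_STRONG_COUNT = 120  # per the Figure 3 caption: fewer observations => weak


def dyad_max(directed):
    """Collapse directed {(i, j): rssi} into undirected {frozenset: max rssi}."""
    undirected = {}
    for (i, j), s in directed.items():
        key = frozenset((i, j))
        if key not in undirected or s > undirected[key]:
            undirected[key] = s
    return undirected


def split_by_threshold(undirected, threshold=THRESHOLD_DBM):
    """Return (kept, removed): links at or above vs. below the threshold."""
    kept = {d: s for d, s in undirected.items() if s >= threshold}
    removed = {d: s for d, s in undirected.items() if s < threshold}
    return kept, removed


def classify(observation_count, min_strong=MIN_STRONG_COUNT):
    """Label a dyad by its total number of observations in the raw data."""
    return "weak" if observation_count < min_strong else "strong"


# Hypothetical single time-bin with one reciprocal and one one-way observation.
one_bin = {("A", "B"): -72, ("B", "A"): -85, ("A", "C"): -90}
kept, removed = split_by_threshold(dyad_max(one_bin))
```

Here the A-B dyad survives because its stronger direction (−72 dBm) exceeds the threshold, while A-C is removed.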

Removing links
This section outlines various strategies for removing links from the network. Fig. 4A shows an illustration of the raw proximity data for a single time-bin; a link is drawn if either i → j or j → i is observed. The thickness of a link represents the strength of the received signal. For the thresholded network (Fig. 4B) we remove links according to the strength of the signal (where we assume that the weaker the signal, the greater the relative distance between two persons). To estimate the effect of the threshold we compare it to a null model, where we remove the same number of links, but where the links are chosen at random, as illustrated in Fig. 4C. To minimize any noise the random removal might cause, we repeat the procedure n = 100 times, each time choosing a new set of random links, with statistics averaged over the 100 repetitions. To check whether thresholding actually emphasizes face-to-face links, we additionally compare it to a control network, where we remove links with signal strengths above or equal to the threshold (Fig. 4D); this procedure is also repeated multiple times. When more links fall below the threshold than above it, the control network removes fewer links than the other networks.
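A minimal sketch of the null-model and control removals described above, assuming links are stored as tuples of node labels. The helper names, the fixed seed, and the toy network are illustrative assumptions; only the n = 100 repetitions and the "above or equal to the threshold" rule come from the text.

```python
import random


def null_removal(links, n_remove, rng):
    """Remove n_remove links chosen uniformly at random; return (kept, removed)."""
    removed = set(rng.sample(sorted(links), n_remove))
    return set(links) - removed, removed


def control_removal(signal_by_link, threshold=-80):
    """Remove links whose signal is at or above the threshold (the strong ones)."""
    removed = {e for e, s in signal_by_link.items() if s >= threshold}
    return set(signal_by_link) - removed, removed


def average_over_null(links, n_remove, statistic, n_rep=100, seed=0):
    """Average a network statistic over n_rep independent random removals."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        kept, _removed = null_removal(links, n_remove, rng)
        total += statistic(kept)
    return total / n_rep


# Hypothetical toy network for one time-bin.
links = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}
kept, removed = null_removal(links, 2, random.Random(1))
mean_links_left = average_over_null(links, 2, statistic=len)
```

Any statistic that takes a set of links (number of nodes, clustering, and so on) can be passed as `statistic`.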

Network properties
Now that we have determined a threshold for filtering out noisy links, let us study the effects on the network properties. Thresholding weak links does not significantly influence the number of nodes present (N) in the network (Fig. 5A), while the number of links (M) is substantially reduced (Fig. 5B). On average we remove 2.38 nodes and 32.18 links per time-bin. Social networks differ topologically from other kinds of networks by having a larger than expected number of triangles [37]; thus clustering is a key component in determining the effects of thresholding. Fig. 6 gives a strong hint that we are, in fact, keeping real social interactions: random removal disentangles the network and dramatically decreases the clustering coefficient, while thresholding conserves most of the average clustering. The average clustering ratio ($\langle c_T \rangle / \langle c_N \rangle$) reveals that the clustering in the thresholded network is on average 2.38 times larger than in the null model network. These findings suggest that a selection process based on signal strength differs greatly from a random one.
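To make the clustering comparison concrete, the sketch below computes an average local clustering coefficient over active nodes only. The exact definition used in the study is the one in [38]; the local (triangle-fraction) form used here, and the convention that nodes with fewer than two neighbors contribute zero, are assumptions for illustration.

```python
from itertools import combinations


def average_clustering(links):
    """Average local clustering over nodes that appear in at least one link."""
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # assumed convention: degree-1 nodes contribute 0
        # Count pairs of neighbors that are themselves linked (closed triangles).
        closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * closed / (k * (k - 1))
    return total / len(adj)


# Hypothetical example: a triangle A-B-C with a pendant node D attached to C.
toy = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}
c_full = average_clustering(toy)
c_without_pendant = average_clustering(toy - {("C", "D")})
```

Removing the pendant link C-D leaves a pure triangle, so the average rises, whereas removing a triangle edge would lower it; this is the kind of contrast the thresholded versus null comparison relies on.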

Link evaluation
Sorting links by signal strength and disregarding weak ones greatly reduces the number of links, but do we remove the correct links, i.e. do we get rid of noisy, non-important links? The fact that clustering remains high in spite of removing a large fraction of links is a good sign, but we want to investigate this question more directly. To do so, we divide the problem into two time-scales: a short one, where we consider the probability that a removed link might reappear a few time-steps later, and a long one, where we evaluate the quality of a removed link according to certain network properties. The motivation for both time-scales is simple. Let us first consider the short time-scale. We assume that human interactions take place on a time-scale that is mostly longer than the 5-minute time-bins we analyze here. Thus, if a noisy link is removed, the probability that it will reappear in one of the immediately following time-steps should be low, since no interaction is assumed to take place. We do, however, expect the probability of reappearance in subsequent time-steps to be significantly greater than zero, since even weak links imply physical proximity. Similarly, if we (accidentally) remove a real link, the probability that it will appear again should be high, since the social activity is expected to continue.
Let us formalize this notion. Consider a link e that is removed at time t; the probability that the link will appear in the next time-step is p(t + 1|e, t). Generalizing this, we can write the probability that any removed link will appear in all the following n time-steps as:
$$\frac{\text{no. links removed at } t \text{ present at } t+1 \cap \ldots \cap t+n}{\text{no. links removed at } t} \qquad (1)$$
Fig. 7A illustrates that thresholded links are observed less frequently in subsequent time-steps than both null and control links. Note that to make the strongest possible case, we compare data from each thresholded time-bin with the raw data from the next bin (where the raw data contains many weak links).
In spite of this, we observe a clear advantage of distinguishing between links with weak and strong signal strengths. If we look at values for t + 1, the first subsequent time-step, the probability of re-occurrence in the thresholded network is about 12% lower than for the null model, and as we look to later time-steps, the gap widens.
Regarding the clustering analysis (Fig. 6), only active nodes contribute to the average; the rest are disregarded. Average clustering is calculated according to the definition in [38]. Since social activity in groups larger than two individuals results in network triangles, the fact that clustering is not significantly reduced by thresholding (compared to the null model) provides evidence that we are preserving social structure in spite of link removal.
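The reappearance probability used in the short time-scale evaluation (the fraction of links removed at t that are present in all of the next n time-steps, averaged over time-bins) can be sketched as below. The data layout (dictionaries keyed by integer bin index) and the rule for skipping bins near the end of the record are assumptions made for this example.

```python
def reappearance_probability(removed_by_bin, raw_by_bin, n):
    """Average over bins t of
    |{e removed at t : e in raw bins t+1, ..., t+n}| / |removed at t|.

    Removed links are compared against the raw (unthresholded) later bins.
    Bins with no removed links, or too close to the end of the data to have
    n following bins, are skipped (assumed handling of the boundary).
    """
    last_bin = max(raw_by_bin)
    probabilities = []
    for t, removed in removed_by_bin.items():
        if not removed or t + n > last_bin:
            continue
        survivors = set(removed)
        for k in range(1, n + 1):
            survivors &= raw_by_bin.get(t + k, set())
        probabilities.append(len(survivors) / len(removed))
    if not probabilities:
        return float("nan")
    return sum(probabilities) / len(probabilities)


# Hypothetical data: links are tuples, bins are consecutive integers.
ab, ac = ("A", "B"), ("A", "C")
raw = {0: {ab, ac}, 1: {ab}, 2: {ab, ac}, 3: set()}
removed_links = {0: {ac}, 1: {ab}}
p1 = reappearance_probability(removed_links, raw, n=1)
p2 = reappearance_probability(removed_links, raw, n=2)
```

In this toy case the link removed at bin 0 does not reappear at bin 1, while the link removed at bin 1 does reappear at bin 2, so the one-step average is one half; neither survives two consecutive steps.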
A different set of social dynamics unfolds on longer time-scales. Here we determine the impact of removing links in two ways: first using total link weights, and second using online friendship status. Friends typically meet regularly. We capture this behavior by using the total number of observations of a dyad as the total weight of the friendship (again, counted in the raw network). Thus, for the first aspect of the long time-scale evaluation we use the total weight of removed links to evaluate the links we remove. Since multiple links are removed per time-bin, we calculate an average weight,
$$\frac{w_t}{w_{t,\mathrm{background}}} = \frac{\text{Avg. weight of removed links at } t}{\text{Avg. weight of all links at } t},$$
where $w_j$, the quantity being averaged, is the total weight of pair $j$. If no links are removed at bin $t$ then $w_t$ is disregarded. The second method to evaluate the link-selection processes is to compare the set of removed links to the structure of an online social network, i.e. to check whether a removed proximity link has an equivalent online version. The weight of removed Facebook friendships is estimated as:
$$\frac{\text{no. links removed at } t \text{ that are FB links}}{\text{no. links removed at } t}$$
Bins with no observed Facebook friendships are disregarded. Fig. 7B indicates differences in the selection processes, both in terms of proximity and online networks. Distinguishing between strong and weak proximity links has a clear positive effect: thresholded links have on average lower edge weights and remove fewer Facebook friendships compared to both the null-model links and the background weights. In contrast, strong proximity links have higher weight, indicating a higher observation frequency, and contain a substantially larger fraction of the online friendships.
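The two long time-scale measures for a single time-bin can be sketched as follows. Function names and the toy weights are hypothetical; returning `None` for bins without removed links is one assumed way of "disregarding" them before averaging over time.

```python
def weight_ratio(removed, present, total_weight):
    """Avg. total weight of removed links divided by avg. weight of all links
    present in the bin. Returns None when nothing was removed."""
    if not removed:
        return None
    avg_removed = sum(total_weight[e] for e in removed) / len(removed)
    avg_all = sum(total_weight[e] for e in present) / len(present)
    return avg_removed / avg_all


def facebook_fraction(removed, fb_links):
    """Fraction of removed links that are also Facebook friendships."""
    if not removed:
        return None
    return len(set(removed) & set(fb_links)) / len(removed)


# Hypothetical totals: number of raw observations per dyad over the study.
ab, ac, bc = ("A", "B"), ("A", "C"), ("B", "C")
total_weight = {ab: 300, ac: 10, bc: 50}
present = {ab, ac, bc}
fb = {ab}

ratio = weight_ratio({ac}, present, total_weight)
fb_share = facebook_fraction({ab, ac}, fb)
```

A ratio below one means the removed links are, on average, lighter than the background, which is the pattern reported for the thresholded selection.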

Discussion
The availability of electronic datasets is increasing, so the question of how well we can use these electronic traces to infer actual social interactions is important for effectively understanding processes such as relational dynamics and contagion. Sorting links based on their signal strength allows us to distinguish between strong and weak ties, and we have argued here that thresholding the network boosts the social signal while eliminating some noise. The proposed framework is not perfect; in certain settings we remove real social connections while noisy links are retained. The results indicate that the framework is better at identifying strong links than removing them, a trend which the link-reappearance probability, link-weight, and online friendship analyses support. Compared to the baseline of treating all proximity observations as real social interactions, we achieve better results. But determining whether a close proximity link is an actual friendship is much more difficult. Multiple scenarios exist where people are in close contact but are not friends; one obvious example is queuing. Each human interaction has a specific social context, so an understanding of the underlying social fabric is required to fully discern when a close proximity link is an actual face-to-face meeting. This brings us back to the question of how to determine a real friendship from digital observations (cf. [1]). Face-to-face meetings may not be the best indicator of friendship; call logs, text logs, and geographical positions are all factors which, coupled with Bluetooth, could give us better insight into social dynamics and interactions.

Figure 2. Distributions of signal strength for the respective distances. A: Raw data. Measurements from both phones are statistically indistinguishable and are collapsed into single distributions, i.e. there is no difference between A → B and B → A. B: Average of signal strength per time-bin. C: Maximal value of signal strength per time-bin.

Figure 3. Number of links per type as a function of threshold value. Links are classified as weak if they are observed less than 120 times in the data, i.e. links that on average are observed less than once per day; otherwise they are classified as strong. Grouping students into study lines reveals that links within each study line have an almost uniform distribution of weights, while links between study lines are distributed according to a heavy-tailed distribution. A threshold of −80 dBm (gray area) removes 1159 weak and 387 strong links and classifies 97.6% of inter-study line links as weak and 86.7% of intra-study line links as strong.

Figure 4. Networks. A: Raw network; shows all observed links for a specific time-bin. Thickness of a link symbolizes the maximum of the received signal strengths. B: Thresholded network; we remove links with received signal strengths below a certain threshold, where dotted lines indicate the removed links. C: Null model; with respect to the previous network we remove the same number of links, but the links are chosen at random. D: Control network; links with signal strength above or equal to the threshold are removed.

Figure 5. Network statistics. A: Number of nodes as a function of time. Only active nodes are counted, i.e. persons that have observed another person or been observed themselves. Dynamics are shown for a two-week period of the 2013 spring semester. Data markers are omitted to avoid visual clutter. Due to the nature of the study the network statistics exhibit both daily and weekly patterns. On average thresholding removes 3.06 nodes during weekends and holidays, and 2.38 during regular weekdays. B: Number of links as a function of time. On average, 10.60 links are removed during weekends/holidays, and 32.21 are removed during weekdays.

Figure 6. Average clustering. Only active nodes, i.e. nodes that are part of at least one dyad, contribute to the average.

Figure 7. Link evaluation. A: Probability of link reappearance. For each selection process we remove a specific set of links: Thresholded removes links with weak signal strength, Null removes randomly chosen links, and Control removes strong links. The probability for links to reappear within all the next n time-steps is calculated using Equation 1 and averaging over all time-bins. Boundary conditions are not applied and the reappearance probability for the last n = 5 bins is not taken into account. B: Average weights. For each time-bin we calculate $w_t/w_{t,\mathrm{background}}$, where the background weight includes links present in bin t. Brackets indicate a temporal average across all time-bins, and the red line denotes the average background weight.