The Strength of Friendship Ties in Proximity Sensor Data

doi:10.1371/journal.pone.0100915

Table 1.

Data overview.

Figure 1.

Bluetooth signal strength (RSSI) as a function of distance.

A: Scans between two phones. Measurements are per distance performed every five minutes over the course of 7 days. Mean value and standard deviation per distance are respectively and . B: Average of the values in respective time-bins. Summary statistics are: and . C: Maximal value per time-bin. The mean value and standard deviation per distance are: , , and The measurements cover hypothetical situations where individuals are far from each other and on either side of a wall.

Figure 2.

Distributions of signal strength for the respective distances.

A: Raw data. Measurements from both phones are statistically indistinguishable and are collapsed into single distributions, i.e. it makes no difference which of the two phones observes the other. B: Average of signal strength per time-bin. C: Maximal value of signal strength per time-bin.
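
The per-time-bin average and maximum used in panels B and C can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `aggregate_per_bin`, the 300-second bin width, and the sample readings are all assumptions made for the example.

```python
from collections import defaultdict

def aggregate_per_bin(scans, bin_seconds=300):
    """Group (timestamp_seconds, rssi_dbm) scans into fixed-width time-bins.

    Returns {bin_index: (mean_rssi, max_rssi)}.
    The bin width is a hypothetical default, not a value taken from the paper.
    """
    bins = defaultdict(list)
    for timestamp, rssi in scans:
        bins[int(timestamp // bin_seconds)].append(rssi)
    return {b: (sum(v) / len(v), max(v)) for b, v in bins.items()}

# Made-up readings: three scans fall in bin 0, two in bin 1.
scans = [(0, -80), (60, -70), (299, -90), (300, -60), (450, -65)]
result = aggregate_per_bin(scans)
```

Taking the maximum per bin keeps the strongest reading in each window, while the mean smooths over all readings in it.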

Figure 3.

Number of links per type as a function of threshold value.

Links are classified as weak if they are observed fewer than times in the data, i.e. links that on average are observed less than once per day; all other links are classified as strong. Grouping students into study lines reveals that links within each study line have an almost uniform distribution of weights, while links across study lines follow a heavy-tailed distribution. A threshold of (gray area) removes 1159 weak and 387 strong links and classifies of inter-study line links as weak and of intra-study line links as strong.
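
The weak/strong classification by observation count can be sketched as below. The cutoff is passed in as the hypothetical parameter `min_observations` because its value is not given in this caption; the function name and the sample data are likewise assumptions.

```python
from collections import Counter

def classify_links(observations, min_observations):
    """Label each undirected link as 'weak' or 'strong' by how often it is seen.

    observations: iterable of (person_a, person_b) pairs, one per time-bin in
    which the link was observed. Direction is ignored.
    min_observations: hypothetical cutoff; links seen fewer times are weak.
    """
    counts = Counter(frozenset(pair) for pair in observations)
    return {
        link: ("weak" if n < min_observations else "strong")
        for link, n in counts.items()
    }

# Made-up observations: a-b is seen four times (in either direction), a-c once.
observations = [("a", "b")] * 3 + [("b", "a"), ("a", "c")]
labels = classify_links(observations, min_observations=3)
```

Using `frozenset` as the key treats an observation of b by a and of a by b as the same link.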

Figure 4.

Networks.

A: Raw network, showing all observed links for a specific time-bin. Thickness of a link symbolizes the maximum of the received signal strengths. B: Thresholded network. Links with received signal strength below a certain threshold are removed; dotted lines indicate the removed links. C: Null model. The same number of links is removed as in the thresholded network, but the links are chosen at random. D: Control network. A similar number of links is removed, chosen among those with signal strength above or equal to the threshold.
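
The three link-removal schemes in panels B to D can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the threshold of -80, and the sample link strengths are all hypothetical.

```python
import random

def threshold_network(links, threshold):
    """Split links into (kept, removed) by maximum signal strength."""
    kept = {l: s for l, s in links.items() if s >= threshold}
    removed = {l: s for l, s in links.items() if s < threshold}
    return kept, removed

def null_network(links, n_remove, rng):
    """Remove n_remove links chosen uniformly at random."""
    removed = set(rng.sample(sorted(links), n_remove))
    return {l: s for l, s in links.items() if l not in removed}

def control_network(links, threshold, n_remove, rng):
    """Remove up to n_remove links chosen among those at or above threshold."""
    strong = sorted(l for l, s in links.items() if s >= threshold)
    removed = set(rng.sample(strong, min(n_remove, len(strong))))
    return {l: s for l, s in links.items() if l not in removed}

# Made-up network for one time-bin: link -> maximum received signal strength.
links = {("a", "b"): -60, ("a", "c"): -85, ("b", "c"): -90,
         ("c", "d"): -70, ("b", "d"): -75}
rng = random.Random(0)  # seeded so the random choices are repeatable
kept, removed = threshold_network(links, threshold=-80)
null = null_network(links, len(removed), rng)
control = control_network(links, -80, len(removed), rng)
```

All three variants end up with the same number of links here, so differences in downstream statistics reflect which links were removed rather than how many.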

Figure 5.

Network statistics.

Properties are highly dynamic but on average we observe nodes and links per time-bin. A: Number of nodes as a function of time. Only active nodes are counted, i.e. people that have observed another person or been observed themselves. Dynamics are shown for two weeks during the 2013 spring semester, clearly depicting both daily and weekly patterns. Data markers are omitted to avoid visual clutter. On average thresholding removes nodes during weekends and holidays, and during regular weekdays. B: Number of links as a function of time. links are on average removed during weekends/holidays, and are removed during weekdays.

Figure 6.

Average clustering.

Only active nodes, i.e. nodes that are part of at least one dyad, contribute to the average; the rest are disregarded. Average clustering is calculated according to the definition in [48]. Since social activity in groups larger than two individuals results in network triangles, the fact that clustering is not significantly reduced by thresholding (compared to the null model) provides evidence that we are preserving social structure in spite of link removal.
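
Averaging clustering over active nodes only can be sketched as below. The definition in [48] is not reproduced in this section, so the sketch assumes the common local clustering coefficient (the fraction of a node's neighbor pairs that are themselves linked) and assigns 0 to nodes with a single neighbor; both choices, and the function name, are assumptions.

```python
from itertools import combinations

def average_clustering(links):
    """Mean local clustering coefficient over nodes that appear in any link."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    coeffs = []
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # assumed convention: no neighbor pairs, so 0
            continue
        # Count neighbor pairs that are themselves connected (closed triangles).
        closed = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        coeffs.append(2 * closed / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0

# Made-up network: a triangle a-b-c with one extra node d attached to c.
links = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
avg = average_clustering(links)
```

Nodes with no links never enter `neighbors`, so they are excluded from the average, matching the "active nodes only" rule in the caption.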

Figure 7.

Link evaluation.

A: Probability of link reappearance. For each selection process we remove a specific set of links. In the thresholded network, we remove links with weak signal strength. For the null network, we remove links at random. Lastly, in the control network case we remove strong links. The probability for links to reappear within all the next time-steps is calculated using Eq. 1 and averaging over all time-bins. Boundary conditions are not applied and the reappearance probability for the last bins is not taken into account. B: Quality measure for proximity data. C: Quality measure for the online data. For each time-bin we calculate as defined in Eq. 2 and 3. Brackets indicate a temporal average across all time-bins, and values are shown for all three network types.
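
One plausible reading of the reappearance probability in panel A is sketched below. Eq. 1 is not reproduced in this section, so this is an assumption and not the authors' formula: for each time-bin, it takes the fraction of removed links that show up again in any of the following `n_steps` bins, skips the final bins that lack a full look-ahead window, and averages the result. The function and parameter names are hypothetical.

```python
def reappearance_probability(removed_per_bin, links_per_bin, n_steps):
    """Average fraction of removed links seen again within n_steps bins.

    removed_per_bin[t]: set of links removed in bin t.
    links_per_bin[t]: set of all links observed in bin t.
    Bins without a full look-ahead window are skipped (no boundary conditions).
    """
    fractions = []
    for t in range(len(links_per_bin) - n_steps):
        removed = removed_per_bin[t]
        if not removed:
            continue  # nothing was removed in this bin
        future = set().union(*links_per_bin[t + 1 : t + 1 + n_steps])
        fractions.append(len(removed & future) / len(removed))
    return sum(fractions) / len(fractions) if fractions else 0.0

# Made-up data for four time-bins.
ab, ac, bc = ("a", "b"), ("a", "c"), ("b", "c")
links_per_bin = [{ab, ac}, {ab}, {ac}, {bc}]
removed_per_bin = [{ac}, {ab}, set(), set()]
p1 = reappearance_probability(removed_per_bin, links_per_bin, n_steps=1)
p2 = reappearance_probability(removed_per_bin, links_per_bin, n_steps=2)
```

In this toy data no removed link returns in the very next bin, while one of the two removed links returns within two bins.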
