Fig 1.
Flowcharts of the CANDIDATE procedure: (a) Add, (b) Encode, (c) Lookup, and (d) Hash.
Fig 2.
Examples of how different hash functions code the name “Christian”.
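As a rough illustration of what such a coding can look like, the sketch below maps a name to a numeric ID with djb2 and CRC-32. The UTF-8 encoding, the truncation of djb2 to 32 bits, the modulo reduction into the coding space, and the helper name `to_code` are assumptions made for this example and are not taken from the figure, so the printed values need not match it.

```python
import zlib


def djb2(text: str) -> int:
    """Classic djb2 string hash (h = h * 33 + byte), kept to 32 bits."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def crc32(text: str) -> int:
    """CRC-32 of the UTF-8 bytes, as computed by zlib."""
    return zlib.crc32(text.encode("utf-8"))


def to_code(hash_value: int, space: int) -> int:
    """Assumed reduction: fold the hash into a coding space of `space` slots."""
    return hash_value % space


name = "Christian"
for label, fn in (("djb2", djb2), ("CRC-32", crc32)):
    # A coding space of 10,000 slots corresponds to four-digit IDs.
    print(label, f"{to_code(fn(name), 10_000):04d}")
```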
Table 1.
Collision probability with djb2, CRC-32 and double hashes (half djb2/half CRC-32) for 100 randomly selected names with different coding space sizes.
Based on simulation with 10,000 iterations.
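A simulation of this kind could be sketched as follows. This is not the procedure behind the table: random lowercase strings stand in for the names, "collision probability" is assumed to mean the share of trials in which at least two of the 100 names share a slot, hashes are reduced by modulo, and the double-hash variant is left out because its construction is not given here. The function names and the default number of trials are likewise hypothetical.

```python
import random
import string
import zlib


def djb2(text):
    """Classic djb2 string hash, kept to 32 bits."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def crc32(text):
    """CRC-32 of the UTF-8 bytes, as computed by zlib."""
    return zlib.crc32(text.encode("utf-8"))


def random_names(rng, count, length=8):
    """Stand-in for a name sample: `count` distinct random lowercase strings."""
    names = set()
    while len(names) < count:
        names.add("".join(rng.choices(string.ascii_lowercase, k=length)))
    return sorted(names)


def collision_probability(hash_fn, space, n_names=100, trials=1000, seed=0):
    """Share of trials in which at least two names map to the same slot."""
    rng = random.Random(seed)
    trials_with_collision = 0
    for _ in range(trials):
        codes = {hash_fn(name) % space for name in random_names(rng, n_names)}
        if len(codes) < n_names:  # fewer distinct codes than names
            trials_with_collision += 1
    return trials_with_collision / trials


for space in (1_000, 10_000, 100_000):
    for label, fn in (("djb2", djb2), ("CRC-32", crc32)):
        # 200 trials keep the demo fast; the table reports 10,000 iterations.
        print(space, label, collision_probability(fn, space, trials=200))
```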
Fig 3.
Example of encoding and collision handling: (a) Encoding the first participant with hash function 0. (b) Encoding the second participant with hash function 0. (c) Seven further participants encoded without any collisions. (d) The eighth participant gives a collision. (e) A free slot is found by instead using hash function 1 (CRC-32). The hash function used (hash function 1) is attached to the collision entry together with a check code obtained using hash function 11 (salted). (f) A collision also occurs for participant nine. (g) A free slot is obtained using hash function 1, and the hash function used and the validation code are attached to the colliding item. (h) The tenth participant is encoded without collision.
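One possible reading of this collision handling is sketched below. It is an interpretation under stated assumptions, not the CANDIDATE implementation: hash function 0 is assumed to be djb2, Adler-32 is a hypothetical filler for a third function, the check code is a salted CRC-32 with a made-up salt, hashes are reduced by modulo, and the names `CodingTable`, `add`, and `lookup` are invented for the example. Names are assumed to be distinct.

```python
import zlib


def djb2(text):
    """Classic djb2 string hash, kept to 32 bits."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


# Assumed ordering: index 0 is tried first, the others only after a collision.
HASH_FUNCTIONS = [
    djb2,
    lambda text: zlib.crc32(text.encode("utf-8")),
    lambda text: zlib.adler32(text.encode("utf-8")),  # hypothetical filler
]


def check_code(name):
    """Hypothetical salted check code used to recognise a redirected name."""
    return zlib.crc32(("example-salt:" + name).encode("utf-8"))


class CodingTable:
    def __init__(self, space):
        self.space = space
        self.used = set()    # occupied ID slots
        self.redirects = {}  # home slot -> [(hash index, check code), ...]

    def add(self, name):
        """Return the ID assigned to `name`, or None if every slot tried is taken."""
        home = HASH_FUNCTIONS[0](name) % self.space
        if home not in self.used:
            self.used.add(home)
            return home
        for index in range(1, len(HASH_FUNCTIONS)):
            slot = HASH_FUNCTIONS[index](name) % self.space
            if slot not in self.used:
                self.used.add(slot)
                # Attach the hash function used and the check code to the
                # colliding entry so the name can be found again later.
                self.redirects.setdefault(home, []).append((index, check_code(name)))
                return slot
        return None

    def lookup(self, name):
        """Return the ID previously assigned to `name`, or None if its home slot is unused."""
        home = HASH_FUNCTIONS[0](name) % self.space
        for index, code in self.redirects.get(home, []):
            if code == check_code(name):
                return HASH_FUNCTIONS[index](name) % self.space
        return home if home in self.used else None


table = CodingTable(space=100)
for participant in ["Anna", "Ben", "Christian", "Dora", "Erik"]:
    print(participant, table.add(participant), table.lookup(participant))
```

In this sketch no names are stored, only occupied slots and the redirect entries. The check code is what lets a lookup distinguish a redirected participant from the participant who occupies the home slot.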
Table 2.
Distribution of hash functions used by CANDIDATE for encoding 100 participants with coding spaces of 1,000, 10,000, and 100,000.
Fig 4.
A screenshot of the CANDIDATE tool implementation.
Fig 5.
Encoding success rates for small samples (N ≤ 100).
Fig 6.
Encoding success rates for larger samples (100 ≤ N ≤ 1000).
Note that the y-axis starts at 99.6% to show the small variations.
Fig 7.
Collision rates for small samples (N ≤ 100).
Fig 8.
Collision rates for larger samples (100 ≤ N ≤ 1000).
Fig 9.
Log-log plot of mean anonymity with coding spaces of 100 (two-digit IDs), 1,000 (three-digit IDs), 10,000 (four-digit IDs), and 100,000 (five-digit IDs) slots with a phonebook of 103,472 names.
Error bars indicate the minimum and maximum anonymity.
Fig 10.
Percentage of unused ID slots with small sample sizes.
This indicates the portion of phonebook entries that can be discarded as non-participants during an attack.
Fig 11.
Percentage of unused ID slots with large sample sizes.
This indicates the portion of phonebook entries that can be discarded as non-participants during an attack.