Behavioral structure of users in cryptocurrency market

doi:10.1371/journal.pone.0242600

Table 1.

An example of the bitcoin dataset obtained using the BitIodine software [21].

Table 2.

An example of the ethereum dataset obtained from [23].

Table 3.

Periods when bitcoin and ethereum systems show significant and non-significant price changes.

Table 4.

Network properties calculated for bitcoin (BTC) and ethereum (ETH) networks for each period under analysis.

Fig 1.

Proportion of “noisy” users in transaction networks.

In every ETH period, more than 50% of users have a degree less than 2, a small transaction value, and appear only once. In bitcoin the vast majority of users are “noisy”, but surprisingly, in the periods of the Crypto Winter and after the Crypto Bubble the proportion of “noisy” users is only 36-38%. “Noisy” users are removed from further analysis.
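
The filtering rule in this caption can be sketched as follows. This is a minimal illustration rather than the authors' code: the `UserStats` record, its field names, and the `SMALL_VALUE_THRESHOLD` cut-off are assumptions, because the caption does not say what counts as a “small” transaction value.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    user_id: str
    degree: int           # number of distinct counterparties in the period
    total_value: float    # total value transacted in the period
    n_appearances: int    # number of transactions the user appears in


# Hypothetical cut-off for a "small" value; the source gives no number.
SMALL_VALUE_THRESHOLD = 0.01


def is_noisy(user, small_value=SMALL_VALUE_THRESHOLD):
    """A user is "noisy" only if all three conditions from the caption hold."""
    return (
        user.degree < 2
        and user.total_value < small_value
        and user.n_appearances == 1
    )


# Toy data, invented for illustration.
users = [
    UserStats("a", degree=1, total_value=0.001, n_appearances=1),  # noisy
    UserStats("b", degree=5, total_value=12.0, n_appearances=9),   # kept
    UserStats("c", degree=1, total_value=3.0, n_appearances=1),    # kept: value not small
    UserStats("d", degree=1, total_value=0.005, n_appearances=1),  # noisy
]

kept = [u for u in users if not is_noisy(u)]
noisy_share = 1 - len(kept) / len(users)
```

In this toy sample `noisy_share` is the proportion that a figure like Fig 1 would report for one period.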

Fig 2.

Correlations between all features in ETH. Total degree and total value are redundant features, as they are always highly correlated with the others.

In later periods, value in and value out also show very high correlation.
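
One way to flag redundant features of this kind is to compute pairwise Pearson correlations and list the pairs above a cut-off. The sketch below uses invented toy values, and the 0.9 threshold and feature names are assumptions; the caption does not state how "highly correlated" was quantified.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Toy per-user features, invented for illustration.
features = {
    "degree_in": [1, 4, 2, 8, 3, 6],
    "degree_out": [2, 1, 1, 2, 2, 1],
}
# Total degree is the sum of in- and out-degree, so it carries no new information.
features["degree_total"] = [
    i + o for i, o in zip(features["degree_in"], features["degree_out"])
]

THRESHOLD = 0.9  # hypothetical cut-off for "highly correlated"

redundant_pairs = [
    (f, g)
    for f, g in combinations(features, 2)
    if abs(pearson(features[f], features[g])) > THRESHOLD
]
```

Here only the pair `("degree_in", "degree_total")` exceeds the threshold, so one of the two could be dropped before clustering.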

Fig 3.

Correlations between all features in BTC. In, out and total degrees are always highly correlated, as are in, out and total value.

Fig 4.

The algorithmic steps to classify users into behavioral groups.

Because of the complexity of the feature sets, we found that applying an unsupervised method alone is insufficient. Our approach is to first cluster the feature set A_c, on which k-means clustering performs well, and so obtain the set of labels V_c. We then use this clustered output as training data for a more advanced supervised algorithm, SVM. The final step is to employ the trained SVM model as a classifier for the remaining feature sets A₁, A₂, …, A_m to obtain their corresponding labels V₁, V₂, …, V_m.
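
The three steps in this caption can be sketched with scikit-learn. This is an illustration under stated assumptions, not the authors' implementation: the synthetic two-feature data, the choice of 4 clusters, and the RBF kernel are all assumptions made here, and the caption does not name a library or kernel.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical group centres in a 2-feature space (illustrative only).
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0], [6.0, 6.0]])


def sample_feature_set(n_per_group, spread):
    """Draw synthetic users scattered around the hypothetical centres."""
    return np.vstack(
        [c + spread * rng.standard_normal((n_per_group, 2)) for c in centers]
    )


# A_c: the feature set on which k-means is assumed to perform well.
A_c = sample_feature_set(50, spread=0.5)
# A_1, A_2: remaining feature sets, noisier and harder to cluster directly.
A_rest = [sample_feature_set(30, spread=1.0) for _ in range(2)]

# Step 1: unsupervised labels V_c from k-means on A_c.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(A_c)
V_c = kmeans.labels_

# Step 2: train an SVM on (A_c, V_c). The RBF kernel is an assumption.
svm = SVC(kernel="rbf", gamma="scale").fit(A_c, V_c)

# Step 3: classify the remaining feature sets to obtain V_1, V_2.
V_rest = [svm.predict(A_i) for A_i in A_rest]
```

Reusing one trained classifier keeps the group labels consistent across feature sets, which independent k-means runs would not guarantee because cluster indices are arbitrary.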

Fig 5.

Number of clusters in ETH data found using the elbow method. For all periods, 4 appears to be the optimal number.
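
The elbow method plots the within-cluster sum of squares (inertia) against the number of clusters k and looks for the k after which further clusters bring little improvement. The sketch below is a small pure-Python version on synthetic data with four well-separated groups. The data and the deterministic farthest-point initialisation are assumptions chosen for reproducibility, not details taken from the paper.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, n_iter=50):
    """Run Lloyd's k-means and return the final within-cluster sum of squares."""
    # Farthest-point initialisation: deterministic, spreads the centres out.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Update step: each centre moves to the mean of its group.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Synthetic data: four groups of 40 points each (illustrative only).
rng = random.Random(0)
true_centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in true_centers
    for _ in range(40)
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
for k in range(2, 7):
    print(k, round(inertias[k] / inertias[k - 1], 3))  # ratio near 1 means little gain
```

On this toy data the inertia falls sharply up to k = 4 and then flattens, which is the "elbow". When the groups overlap, the curve bends gradually and no single k stands out.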

Fig 6.

Number of clusters in BTC data found using the elbow method. For local events, the optimal number is 4.

However, for global events there is no “sharp” elbow.

Table 5.

Distinct properties of the majority of nodes in each cluster.

Fig 7.

Silhouette coefficient calculated for filtered ethereum data set.

For local periods all data points have positive scores, while a small number of data points for global events are not well matched to their cluster (they have negative scores).
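
For a point i, the silhouette coefficient is s(i) = (b − a) / max(a, b), where a is the mean distance from i to the other points in its own cluster and b is the smallest mean distance from i to the points of any other cluster. A negative score means the point sits closer, on average, to another cluster than to its own. The sketch below computes this on a toy example with one deliberately mislabelled point. The data are invented for illustration.

```python
from math import dist


def silhouette_scores(points, labels):
    """Per-point silhouette coefficients; requires at least two clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        # Mean distance to the other members of the same cluster
        # (dist(p, p) == 0, so summing over all members is safe).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


# Two groups on a line; the point (9, 0) is deliberately given the wrong label.
points = [(0, 0), (1, 0), (2, 0), (9, 0), (10, 0), (11, 0), (12, 0)]
labels = [0, 0, 0, 0, 1, 1, 1]

scores = silhouette_scores(points, labels)
mean_score = sum(scores) / len(scores)
```

Only the mislabelled point receives a negative score here, while the mean over all points stays positive. A positive average can therefore coexist with some poorly matched points.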

Fig 8.

Silhouette coefficient calculated for filtered bitcoin data set.

In all periods some data points have negative scores, and the average silhouette coefficient for all periods is 0.4.

Table 6.

The percentage of misclassified points for the various feature sets used.

Fig 9.

The percentage of users in different groups in bitcoin and ethereum during different periods of the systems’ evolution.
