Figure 1.
Number of words in YouTube comments.
The histogram (bin size = 9 words) shows the distribution of the number of words per comment for all YouTube comments analyzed in this paper. Words were defined as strings of characters flanked by whitespace after the preprocessing described in Results.
Figure 2.
Representing sentences as vectors.
The left panel shows the representation of two sentences, “Dogs run and dogs run.” and “Cats run.”, as vectors. The “Dogs run” vector is twice the length of the “Cats run” vector because each term in the former is repeated. Equation 1 illustrates how the cosine of the angle between the two vectors quantifies the similarity between the two sentences.
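The comparison in this legend can be reproduced with a minimal sketch. It assumes that sentences are turned into term-count vectors and that “and” is dropped as a stopword, which is what makes the first vector exactly twice as long as the second. The helper names `term_vector` and `cosine_similarity` are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def term_vector(sentence, stopwords=frozenset({"and"})):
    """Return term counts for a sentence (assumed preprocessing:
    lowercase, strip punctuation, drop stopwords)."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    return Counter(t for t in tokens if t and t not in stopwords)


def norm(vec):
    """Euclidean length of a term-count vector."""
    return math.sqrt(sum(v * v for v in vec.values()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    if norm(a) == 0 or norm(b) == 0:
        return 0.0  # convention for empty vectors
    return dot / (norm(a) * norm(b))


dogs = term_vector("Dogs run and dogs run.")  # counts: dogs=2, run=2
cats = term_vector("Cats run.")               # counts: cats=1, run=1
similarity = cosine_similarity(dogs, cats)
print(norm(dogs) / norm(cats), similarity)
```

Under these assumptions the two sentences share only the term “run”, so the cosine works out to 2 / (√8 · √2) = 0.5, up to floating-point rounding.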
Figure 3.
Progressive refinements, or hyponyms, spread out from the root concept of drugs. Counting each path as 1 and starting from the root node, Drugs, one may calculate the path similarity of any two concepts (see text).
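As a rough illustration of the path-counting idea, the sketch below computes a path similarity on a small tree. It assumes the WordNet-style definition, 1 / (1 + number of links on the shortest path between two concepts). The paper's exact formula is given in its text and may differ. The `parent` mapping is a hypothetical toy taxonomy and is not the tree shown in the figure.

```python
def ancestors(node, parent):
    """Return the chain from a node up to the root, node first."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def path_similarity(a, b, parent):
    """Assumed definition: 1 / (1 + links on the shortest path from a to b)."""
    chain_a = ancestors(a, parent)
    steps_from_b = {n: i for i, n in enumerate(ancestors(b, parent))}
    for steps_from_a, n in enumerate(chain_a):
        if n in steps_from_b:  # first shared ancestor is the closest one
            return 1.0 / (1 + steps_from_a + steps_from_b[n])
    raise ValueError("concepts do not share a root")


# Hypothetical toy taxonomy: child -> parent, rooted at "drugs".
parent = {
    "opioids": "drugs",
    "dissociatives": "drugs",
    "dextromethorphan": "dissociatives",
    "ketamine": "dissociatives",
}

print(path_similarity("dextromethorphan", "ketamine", parent))  # two links apart
print(path_similarity("dextromethorphan", "opioids", parent))   # three links apart
```

With this definition, identical concepts score 1 and the score falls as the concepts move further apart in the tree.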
Figure 4.
Most frequent words in YouTube comments.
Probability density function of words from all YouTube comments analyzed in this paper. The frequency of occurrence was calculated after removing stopwords.
Figure 5.
Range of DXM dosages discussed on YouTube.
The histogram (bin size = 200 mg) shows the distribution of dosages mentioned in the YouTube comments. All doses were converted to milligrams.
Table 1.
Signs and symptoms associated with dextromethorphan ingestions.
Figure 6.
Semantic similarity of YouTube comments to established signs and symptoms of dextromethorphan use.
The black bars refer to “Robo-tripping” videos from YouTube, the white bars to the most popular videos (see Methods). All values for path similarity are calculated relative to the signs and symptoms mentioned in [25].
Table 2.
Words specific to each plateau.
Figure 7.
Extraction of YouTube comments for analysis.
Table 3.
Categories for stratifying words based on semantic relation to drug use.
Figure 8.
Most frequent words in YouTube comments stratified by plateau.
Figure 9.
Distribution of tf-idf scores for YouTube comments.
Each panel shows the distribution of tf-idf scores for YouTube comments stratified by the plateaus defined in [25]. The dotted line marks the threshold above which a word's tf-idf score is taken to indicate a specific plateau.
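The thresholding described in this legend can be sketched as follows. The sketch assumes the common tf-idf form (relative term frequency multiplied by log(N / document frequency)) and treats the pooled comments for each plateau as one document. The paper's exact weighting and threshold are not stated here, so the tokens, the `THRESHOLD` value, and the function name `tf_idf` are all invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a label to a token list; returns label -> {word: score}.
    Assumed variant: (count / doc length) * log(n_docs / doc frequency)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each word once per document
    scores = {}
    for label, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[label] = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        } if total else {}
    return scores


# Invented toy tokens, one pooled "document" per plateau.
plateau_docs = {
    "plateau_1": ["music", "buzz", "trip"],
    "plateau_2": ["visuals", "trip", "music"],
    "plateau_3": ["dissociation", "trip"],
}
THRESHOLD = 0.2  # hypothetical cutoff, standing in for the dotted line

scores = tf_idf(plateau_docs)
specific = {
    label: sorted(w for w, s in word_scores.items() if s > THRESHOLD)
    for label, word_scores in scores.items()
}
print(specific)
```

A word that occurs in every plateau, such as “trip” in the toy data, scores zero because log(N / N) = 0, so only words concentrated in one plateau clear the threshold.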