Caveat emptor, computational social science: Large-scale missing data in a widely-published Reddit corpus

Table 2

Regression of the amount of missing content per subreddit on the total amount of known content per subreddit and the month in which the subreddit was created.

We expect these two variables to have meaningful explanatory power for where missing content occurs. This appears to be the case for missing comments but not for missing submissions, as evidenced by the relative R² values.
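As an illustration of the comparison described above, the following is a minimal sketch of fitting two ordinary least squares regressions on the same pair of predictors and comparing their R² values. It is not the authors' code: the helper `ols_r2`, the variable names, and all data are hypothetical, and the data are synthetic values generated only so that the example runs. The synthetic outcome for comments is constructed to depend on the predictors, while the one for submissions is pure noise, to mimic the qualitative pattern reported in the caption.

```python
import numpy as np


def ols_r2(X, y):
    """Fit y ~ intercept + X by ordinary least squares; return (coefficients, R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model includes an intercept.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: one row per subreddit.
rng = np.random.default_rng(0)
n = 500
known_content = rng.lognormal(mean=5.0, sigma=1.0, size=n)
creation_month = rng.integers(0, 120, size=n).astype(float)
X = np.column_stack([known_content, creation_month])

# Assumed for illustration: missing comments depend on both predictors,
# missing submissions are unrelated noise.
missing_comments = 0.02 * known_content + 0.1 * creation_month + rng.normal(0, 1, n)
missing_submissions = rng.normal(0, 1, n)

_, r2_comments = ols_r2(X, missing_comments)
_, r2_submissions = ols_r2(X, missing_submissions)
print(f"R^2 (comments):    {r2_comments:.3f}")
print(f"R^2 (submissions): {r2_submissions:.3f}")
```

A high R² for one outcome and a near-zero R² for the other indicates that the two predictors explain where one kind of content is missing but not the other, which is the kind of contrast the table reports.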
