I came across this article after a colleague forwarded it to me. While I find the subject matter quite interesting, I noticed several anomalies in the reported statistics and have one general comment about the design:
1. For Studies 1, 2, and 3, the Ns reported in the abstract do not match those reported in the body of the paper.
2. For Study 2 the authors report the following: "In Study 2, those primed with science responded more severely to the moral transgression (i.e., condemned the act as more wrong; M = 95.95, SD = 4.37) relative to those in the control condition (M = 81.57, SD = 5.09), F(1, 31) = 4.58, p = .040." Two things need to be stated about this:
(2a) The two experimentally assigned groups are approximately 3 standard deviations apart (Cohen's d = 3.03 using the pooled SD, roughly corresponding to a Pearson's r of .83). That strikes me as highly improbable. For comparison, Cohen suggested that a d=0.80 should be interpreted as "large."
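(2b) The F statistic (and its associated p-value) for this study is not consistent with the reported means, standard deviations, and sample size, whether one uses the N stated in the abstract or the N stated in the method section. A difference of about 3 standard deviations between groups of this size would produce an F far larger than 4.58.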
3. In Study 4 the authors write, "As predicted, those in the science condition allocated less money to themselves (M = 2.71, SD = 1.43) than those in the control condition (M = 2.84, SD = 1.11), t(41) = 2.06, p = .046." The reported t statistic and p-value do not match the reported means, standard deviations, and sample size. Assuming n=21 in one condition and n=22 in the other, I calculated t(41)=0.33, p=.74 from the reported means and SDs. In other words, these summary statistics do not permit rejecting the null hypothesis.
4. More generally, it strikes me that these sample sizes (even the larger of the ones reported) would afford relatively low power to detect a population effect of plausible magnitude. The probability of rejecting the null in all 4 of 4 studies would be lower still. For example, if the population effect is d=0.5 (a "medium" effect according to Cohen), a single study comparing 2 groups of n=25 each would have approximately 40% power to reject the null (alpha = .05, two-tailed). Assuming the studies are independent, the probability of rejecting the null 4 times in a row, with 40% power in each study, would be 0.40^4, or about 2.6%.
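
For anyone who wants to recheck the figures in points 2 through 4, here is a minimal Python sketch that uses only the standard library. The function names are my own. The 17/16 split for Study 2 is an assumption: it is one way to divide the N = 33 implied by the reported F(1, 31), not a figure taken from the paper. The power function uses a normal approximation rather than the exact noncentral t distribution, so treat its output as a rough check.

```python
import math
from statistics import NormalDist


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def t_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Student's t (equal variances assumed) from means, SDs, and group sizes."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se


def d_to_r(d):
    """Convert Cohen's d to a correlation r, assuming equal group sizes."""
    return d / math.sqrt(d**2 + 4)


def approx_power(d, n_per_group, alpha=0.05):
    """Two-tailed power for two equal groups (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n_per_group / 2)  # expected value of the test statistic
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Point 2a: Cohen's d from the reported means and SDs (equal-n pooled SD).
d2 = (95.95 - 81.57) / math.sqrt((4.37**2 + 5.09**2) / 2)
r2 = d_to_r(d2)

# Point 2b: F implied by the same summary statistics (F = t^2 for two groups).
# ASSUMPTION: N = 33 from F(1, 31), split 17 / 16 between conditions.
f2 = t_from_summary(95.95, 4.37, 17, 81.57, 5.09, 16) ** 2

# Point 3: t implied by the Study 4 summary statistics, with n = 21 and n = 22.
t4 = t_from_summary(2.71, 1.43, 21, 2.84, 1.11, 22)

# Point 4: power of one study (d = 0.5, n = 25 per group) and of four in a row.
power = approx_power(0.5, 25)
all_four = power**4

print(f"Study 2: d = {d2:.2f}, r = {r2:.2f}, implied F = {f2:.1f}")
print(f"Study 4: implied |t| = {abs(t4):.2f}")
print(f"Power = {power:.2f}, four significant results in a row = {all_four:.3f}")
```

The normal approximation puts single-study power slightly above 40%, and an exact noncentral t calculation comes out a little lower, so the 40% used in point 4 is a reasonable round figure either way.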