Thank you very much to everyone who has commented on the paper and offered feedback. Two attempts at replication have been made: one by John Turri (n=196 after comprehension checks), the other by Joshua Knobe (n=136 after comprehension checks). Here is an overview of their findings. The replication data will be made available upon email request.
First, John’s study replicates the primary finding of our paper: participants attribute knowledge in the fake-barn vignette. Josh’s study does this as well.
Second, John’s study does not replicate the correlation between age and knowledge attribution. John’s population is demographically different from the population tested in our study, but with 19 individuals 49 years of age or older, there is some representation of older individuals. Josh’s study does not replicate the correlation either. Josh’s population had 20 individuals who were over the age of 40.
Third, John’s findings suggest a trend of age and gender interaction on knowledge attribution. For men, there is a trend that higher age predicts lower knowledge scores, while there is no trend for women. Neither trend is significant. However, Josh’s findings do not suggest a trend of this sort.
The original study and the replication studies were distributed using different means, and also different population demographics. One or both of these differences may explain why the age finding failed to replicate. The age range of the original study may differ from that of the online studies, which may affect the correlation between age and knowledge scores. More plausibly, the older population surveyed in the original study (in person on paper) may be unlike the population surveyed in a replication (online via the m-turk service). If they are different, more research must be done to assess which population, if either, is more representative of older people more generally.
I take these results to suggest that, if there is a relationship between age and knowledge attribution in fake-barn cases, it may be more complex than the original study was able to measure; knowledge attribution may be affected by a variety of interacting demographic factors, age included. This is in line with the cautious conclusion of the original paper. I would like to see more studies on the topic of intuitions and age, which perhaps will further elucidate their relation, if any.
Once again, I would like to thank John and Josh for running the replications, and everyone who has commented on the blog posts.
As an outsider to this debate, I am just so impressed by everyone's collegiality and impartiality in approaching this replication project. We should all thank David, Josh, and John for setting a model. (In addition to, of course, the earlier model set by Jonathan Weinberg et al. and the replicators in the cross-cultural intuition debate.)
Now, David notes that "The original study and the replication studies were distributed using different means, and also different population demographics. One or both of these differences may explain why the age finding failed to replicate." And these seem like plausible factors.
Psychologist Dan Simons suggests that it is helpful for experimental papers to specify the generalizability target ( http://blog.dansimons.com/2013/06/direct-replication-and-conceptual.html ), e.g.
"Limits in scope and generalizability
The results from this study should generalize to experienced bridge players at other duplicate bridge clubs, as well as to other domains in which players regularly compete against the same group of players in games of skill, provided that the outcome of any individual match or session is not determined entirely by skill."
I'm just wondering whether, in this case, the original authors would have a suggestion for the conditions under which the age effect might be replicated. For example, is it for all contested knowledge cases, all fake-barn cases, or this particular fake-barn case? I think understanding the intended generalizability target a little better can really help clarify the factors that replication attempts need to pay particular attention to.
(In the paper, sometimes the claim is stated quite cautiously, but other times it is stated quite generally, e.g. "Third, perhaps the most interesting result of this study is that there is a negative relationship between knowledge attribution and age.")
Posted by: Shen-yi Liao | 06/24/2014 at 05:40 AM
For those who are interested in ascription practices in these situations more broadly, several conceptual replications of the main result of David’s paper regarding knowledge ascription in the fake barn case are offered in “Knowledge and Luck” forthcoming in Psychonomic Bulletin and Review. The penultimate draft of that paper can be found here: http://philpapers.org/rec/TURKAL
I called them conceptual replications because the materials that were used involved an array of different cover stories that nonetheless still retained the same structural features as fake barn cases. For instance, they featured different agents who perceptually detect truth-makers in the presence of salient but ultimately failed threats to the agent’s ability to detect them. The result was that patterns of ascription again closely resembled responses to paradigmatic knowledge.
At this point, it seems beyond doubt that there is a very high rate of knowledge attribution in fake barn and structurally similar cases, and that the conventional philosophical wisdom about this is wrong. I’d also note that age effects were not detected in these studies, although they were also conducted using mturk, and David’s points in the OP again seem to apply.
Posted by: Wesley Buckwalter | 06/24/2014 at 12:55 PM
First of all, hats off to David and his colleagues. As lots of people have already noted, this post really does a lot to advance our study of these questions and, just generally, to establish the right tone for thinking about replication studies.
Shen-yi makes a very good point in his comment above. Basically, he suggests that researchers should do more to be clear about the conditions under which they expect their effects to arise. I think this point is a very helpful one, but I wanted to propose what I hope will be a friendly amendment.
Specifically, I'm not sure that researchers need to say anything directly about the conditions under which they expect their effects to arise. Instead, it seems like researchers can just make hypotheses about the underlying *psychological processes* that give rise to their effects. These hypotheses will then generate predictions about when the effect should show up -- though only in conjunction with a whole lot of auxiliary hypotheses.
To give just one example, in their excellent paper on intuitions about reference, Machery, Mallon, Nichols & Stich suggest that the reason why Western participants give different responses from East Asian participants is that causation is more salient for Western participants. This claim does not directly tell us when the effect is supposed to show up, but it does give us a more indirect way of figuring it out. (Suppose someone asks, 'Will people from this other culture give the same answer that Western participants do?' The hypothesis yields a prediction, which is something like this: 'They should give the same response if they have the same way of thinking about causation.')
Needless to say, I am not trying to suggest that we have some a priori way of knowing that hypotheses about underlying psychological processes are the way to go. Rather, my reason for thinking all of this comes from facts about how research has been proceeding thus far. Looking at what has happened over the past ten years or so, it seems like there has been a lot of productive research that started with hypotheses about psychological processes and then triggered numerous follow-up studies that moved things forward in helpful new directions.
Posted by: Joshua Knobe | 06/24/2014 at 06:53 PM
"More plausibly, the older population surveyed in the original study (in person on paper) may be unlike the population surveyed in a replication (online via the m-turk service). If they are different, more research must be done to assess which population, if either, is more representative of older people more generally."
I seriously doubt this claim. One clear possibility is that the original study was underpowered for the correlation claim you make in the paper, so that the presence of outliers spuriously drove your effect.
You seem to think you can ignore the possibility of a type 1 error...but you should not ignore this possibility, especially considering the study is insufficiently powered.
Posted by: Zach | 06/30/2014 at 06:40 PM
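The worry about power raised in the comment above can be made concrete with a rough calculation. The sketch below approximates the power of a two-sided test for a nonzero Pearson correlation using the Fisher z transformation. It is an illustration only, not the analysis run in the original paper or in either replication: the effect size rho = -0.2 is an assumed value chosen for the example, the function name `correlation_power` is hypothetical, and the only figures taken from the post are the two replication sample sizes (196 and 136).

```python
import math
from statistics import NormalDist


def correlation_power(rho, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0.

    Uses the Fisher z approximation: atanh(r) is roughly normal with
    mean atanh(rho) and standard deviation 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the assumed true rho.
    shift = math.atanh(rho) * math.sqrt(n - 3)
    # Probability of landing in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Assumed effect size for illustration only; not a value from the paper.
ASSUMED_RHO = -0.2

for sample_size in (196, 136):  # replication sample sizes reported in the post
    power = correlation_power(ASSUMED_RHO, sample_size)
    print(f"n={sample_size}: approximate power {power:.2f}")
```

Under this approximation, power falls as the sample shrinks or as the assumed true correlation gets closer to zero, which is why a small original sample leaves more room both for a spurious finding and for a failed replication of a real but weak effect.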
Thanks, again, everyone, for the comments on the replication so far. Here are my thoughts on some of the comments.
Shen-yi, perhaps the findings would be replicated if the study were performed in person in public places, as was the case in the original study. This would greatly reduce the worry that the population surveyed in the original is different from the population surveyed using M-Turk. I don’t take our findings to generalize to all instances of disputed knowledge, but rather to fake-barn vignettes similar to the version we used. The only way to find out which features are relevant is to run more studies, I think.
Josh, in the paper, we suggest a number of reasons why there might be a relationship between age and knowledge attribution, in which age is a relevant factor (age and its relation to conservative thinking, age and its relation to the experience of merely apparent instances of knowledge, etc.). We also gesture at reasons in which age is a spurious factor (people within certain age ranges had different life experiences, age being correlated with some other factor affecting knowledge attribution). I think one could draw on these speculations when forming future studies if one thinks that each or any of them is responsible for the effect we found.
Zach, I agree with you that a type I error very well might explain our original findings, given a failure to replicate within a different population from the original. If you are picking up on the ‘more plausibly’ language in the sentences you copied, I meant that language to note that it is comparatively more plausible that the populations are unlike (and one might be more representative than the other) than that the original and replication studies have different age ranges. You are right to point out that it may be that neither of these is correct; I did not intend my post to imply that the only explanation for a failure to replicate is a difference between populations.
Posted by: David Colaco | 07/01/2014 at 08:58 PM
Thanks, David. I definitely agree that more studies can clarify which features are ultimately relevant.
However, I was puzzled by your responses to Josh and me. If the speculation of the underlying psychological mechanisms has to do with the relationship between age and knowledge attribution generally, why would you expect the effect to only be replicated in fake-barn cases, and not other cases of contested knowledge attribution? (The contested part is there to rule out the lack of an age effect due to a ceiling or floor effect.)
Posted by: Shen-yi Liao | 07/02/2014 at 04:48 AM
Shen-yi, I thought it prudent to not make claims that our findings will extrapolate to other instances of contested knowledge attribution without additional data to support them. The speculations we make in the paper might not be relevant to all cases of contested knowledge. Given that ‘cases of contested knowledge’ is a broad category, I am not currently in a position to state what additional features might be present in other cases, and whether or not these features might confound any process that might result in a correlation between age and knowledge attribution.
Posted by: David Colaco | 07/15/2014 at 11:51 PM