Lott v. Levitt III

A commenter writes: “In the context of refereed economics journals, ‘replicate’ has one meaning only: The use of an author’s data and model to ensure that falsification of findings is not an issue.” Is this so? Here are some more data points, emphasis added in each case:


Edward Kane, “Why Journal Editors Should Encourage the Replication of Applied Econometric Research,” 23:1 Q.J. Econ.:

“Replication includes but is not limited to slavish duplication of a predecessor study’s specific procedures.”

The Foote and Goetz paper criticizing Levitt/Donohue’s abortion-crime thesis:

“The first column replicates the odd-numbered columns of Table VII (DL 2001), using an updated data set from Donohue’s internet site.”

In other words, same analysis, different data.

Justin McCrary, in a paper trying to replicate a 1997 AER paper by Levitt:

The weighting procedure used in producing [Columns (2) and (3)] is incorrect, and gave crime categories with higher variance more weight. Column (3) instruments police growth rates with election year indicators and the covariates described above. Weights for column (3) are based on the 2SLS standard deviations in column (1). Columns (4) and (5) replicate Levitt’s estimates using correct weights.

In other words, same data, different analysis (albeit the analysis apparently originally intended but not performed).
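
For what it's worth, here is a minimal sketch in Python of the two usages side by side. Everything in it is hypothetical: the toy data, the specification, and the weighting rule are placeholders of my own, not anything drawn from Lott, Levitt, Donohue, Foote and Goetz, or McCrary. The only point is to contrast re-running an identical specification on a different data set with re-estimating the same data under changed weights.

```python
# A toy illustration only: none of this code or data comes from the papers
# discussed above. It simply contrasts the two senses of "replicate."
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

def make_toy_data(n):
    """Fabricate a simple data set: outcome y driven by one regressor x plus noise."""
    x = rng.normal(size=n)
    return pd.DataFrame({"x": x, "y": 2.0 * x + rng.normal(size=n)})

original_data = make_toy_data(200)   # stand-in for the originally published sample
updated_data = make_toy_data(300)    # stand-in for an updated or extended sample

spec = "y ~ x"  # one fixed model specification

# Sense 1 -- "same analysis, different data":
# the identical specification is re-estimated on the newer data set.
fit_original = smf.ols(spec, data=original_data).fit()
fit_updated = smf.ols(spec, data=updated_data).fit()

# Sense 2 -- "same data, different analysis":
# the original data are kept, but an estimation detail (here the weights,
# chosen arbitrarily for illustration) is changed or corrected.
illustrative_weights = 1.0 / (1.0 + original_data["x"].abs())
fit_reweighted = smf.wls(spec, data=original_data, weights=illustrative_weights).fit()

print(fit_original.params["x"], fit_updated.params["x"], fit_reweighted.params["x"])
```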

And, perhaps most importantly, Steven Levitt himself. If Levitt uses “replicate” consistently in the ¶ 12 sense, perhaps he can’t hide behind the ambiguity argument. But in Donohue & Levitt’s “The Impact of Race on Policing, Arrest Patterns, and Crime,” we see:

“We perform this calculation using parallel estimates to Table 5, but based only on the set of 45 states for which we have arrest data. When we replicate Table 5, but with only these 45 states, the coefficients are 10-15 percent larger.”

Same analysis, different data. (Or should we interpret the “but” to mean that this isn’t a real replication?) In Donohue and Levitt’s Reply to Foote and Goetz, we see the following discussion:

The data set we provide to researchers who want to replicate our findings reflects the improvements we made to our approach after the original paper was published, e.g. it includes abortion measures both by state of occurrence and state of residence, and also extends the years covered beyond the original sample. We find it puzzling that Foote and Goetz chose to use the longer data series (which slightly reduces the point estimates) when “replicating” our original Table 7, but did not elect to use or even discuss the better abortion measure (which substantially increases the estimates), in spite of citing Donohue and Levitt (2004) which argues strongly for the improved measure.

Again, discussing the use of the same analysis and different data to replicate—unless the scare quotes indicate otherwise. Is there a clearer example? I haven’t found it yet. The other two uses of “replicate” in the response to Foote & Goetz are within the ¶ 12 definition; in Donohue & Levitt’s Reply to Joyce, the four instances of “replicate” are consistent with Lott’s definition. (The use of “replicate” in this Levitt paper is not about replicating results.) I haven’t spent money to review Levitt’s other papers.

Something I haven’t seen noted on other blogs is that there’s an observer effect. While Levitt’s attorneys are no doubt reading the blogospheric interpretations of the Lott complaint with interest (as Bill Henderson suggested), Lott’s attorneys are surely also aware of the critique of ¶ 12. It wouldn’t surprise me if the complaint were amended; under the plaintiff-friendly notice-pleading system in American courts, a complaint that merely states a claim for relief withstands a motion to dismiss.

Many publisher-author contracts require the author to indemnify the publisher against libel claims, so this lawsuit could get awfully expensive for Levitt if he has a standard contract and the publisher stands on ceremony. And the unfortunate message is clear: an academic willing to sue can impose substantial costs that deter others from criticizing him, and the legal system does little to protect the integrity of academic debate.

8am update: Whether you think Lott or Levitt is the good guy in their spat, if Lott wins, this is a tactic that leftist academics can use to silence criticism from the right just as easily as Lott is using it to go after Levitt. I spent much of last night writing a piece criticizing the misleading arguments and methodology of a trial-lawyer advocacy group, and, unlike a right-winger, such a group will face no peer pressure against resolving an academic debate in the courtroom.

5 Comments

  • Lott v. Levitt IV

    David Glenn, in the Chronicle of Higher Education, has the definitive MSM reporting on the affair. (Permanent link here after Apr. 24.) He finds a mixture of scholars who agree and disagree with Lott on…

  • The precise definition of “replicate” in question is only understood as such by professional economists, yes? Lay readers — who are responsible for Freakonomics’s astronomical sales — understand, or perhaps misunderstand, “replicate” to mean “reaching the same conclusion”. So will damages awarded be based only upon how many professional economists read Levitt’s claim, or how many total readers?

  • roy:

    I’m not a professional economist, but I am a rocket scientist (i.e. I have a PhD in Aerospace Engineering). I was pretty taken aback by the replication claim as I read it in the more technical and narrow sense. Do I count as a layman? I certainly know little about economics and econometrics, but I do know a thing or two about publishing in academic journals (in engineering and physics).

    It seems to me that many scientists and engineers may share my (and Lott’s) reading. Might not other types of non-economists as well?

  • As you’re a rocket scientist, Bill, how do you understand “replicate” in the physical sciences? Do you have to follow identical experimental protocols, same sample size, same test statistics, and so on? Or do you just do the same experiment with what is, in effect, a new data set? I suggest the latter.

    I read lots of articles about people who had failed to replicate the “cold fusion” results, and in every case Pons and Fleischmann blamed deviations from the original protocols. But no one suggested that they could sue on this basis.

  • I’m not an experimentalist. I do computational work (computational fluid dynamics). As such, I take “replicate” to mean that the author should provide enough detail about the methods, algorithms, numerical tolerances, etc., and the problem setup (boundary and initial conditions, etc.) that another researcher could reproduce the author’s work by doing a separate implementation of the aforementioned methods, algorithms, etc., and then running the same numerical experiments. I also expect the original author to provide some evidence (convergence analysis, comparison to physical experiment, or both) that the method described in a published article is likely to have been a good simulation of nature.

    The engineering community has not yet embraced the idea that computer codes and input datasets must be placed on file with the publishing journal or a standardized third-party repository. I understand that this is de rigueur in the social sciences these days. I doubt that it will ever become necessary in the engineering community, because the claims made from the results of engineering numerical or physical experiments are not about society in general and are therefore not nearly as controversial. When the results are controversial, more scrutiny is required (with assistance and openness from the original authors).

    As to Pons and Fleischmann, it’s my understanding that either they didn’t provide sufficient information to make replication easy, or that in the face of conflicting data they failed to give a publicly scrutinized demonstration of their own original work. (Using a press release rather than a peer-reviewed journal article to announce their work didn’t help either.) Being able to repeat your own work is an absolute requirement for controversial conclusions in the physical sciences.

    It seems to me that when it comes to the sort of statistics and statistical modeling that is so prevalent in the social sciences, it is incumbent upon researchers to make their exact computer codes, data, and assumptions available for their colleagues to verify that the original authors did not make programming or (data) coding errors that significantly change the results upon which the societal conclusions are drawn. This, in my mind, constitutes replication. Furthermore, other researchers should attempt de novo examination of the problem in question with their own data, models, and computer codes to determine whether the work done is definitive or is simply one way of looking at the data (and by looking I mean in the social-scientific sense).

    In the physical sciences it is often possible to simply look at some result and conclude that it must be in error, since it is completely outside the realm of ordinary scientific possibility. Whenever statistics and statistical reasoning are involved (in the physical and social sciences alike), you can throw your intuition out the window (see the early history of quantum mechanics). At that point, only careful reexamination of the reported data, repetition of the experiments undertaken, and completely new experiments can verify that a result is certain (or even believable).

    Finally, the inanimate portions of nature don’t lie (can’t lie), so only error in the design of apparatus, the conduct of the experiment, or interpretation by the experimenter can lead to error in research into natural phenomena. Survey respondents lie all the time, and so do the law enforcement departments that collect crime data (i.e., there is frequent reclassification of certain crimes to skew the statistics and make a department look good). This means that in the social sciences, even if a researcher’s process, assumptions, and models are perfect, the result can be wrong. This is considerably less likely in the physical sciences.

    OK, that’s enough babble from me, I’m going to stop now.