1.26.2009

Fixing peer review, part 2

I saw this over at DrugMonkey. Cell, a biology GlamourMag, claims that it is redefining its publication standards and policies in response to the digital publishing era:
One issue in particular that we at Cell will be focusing on in 2009 is redefining what constitutes a "publishable unit" in the age of electronic journals and how we can best present the information content of a scientific article online. [...] The scientific article of the future will no longer be tied to the constraints of a printing press and will take advantage of all the opportunities afforded by the web to introduce a hierarchical rather than linear structure, increased graphical representations, and embedded multimedia. Inherent in our thinking about the scientific article of the future is the need to address the current unchecked growth in the amount of supplemental and supporting material and to identify constructive, well-defined guidelines for what is reasonably and appropriately included in a unit of scientific advance.

I agree with everything DrugMonkey said -- what should be in the paper should be in the physical paper. However, I am heartened by one aspect of expanding supplementary online material: I would like to return to an earlier practice of reporting genetic analyses and require that sequence alignments used in publications be publicly available as supplementary material. This would not take up a large amount of space (the nexus file format, which I would propose as the standard, is both compact and informative -- it will preserve annotations of genomic regions if the authors wish to add them).
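
To give a sense of how little space this would take, here is a minimal sketch of writing an alignment out as a NEXUS file. The function name `to_nexus`, the taxon names, and the `stem1` region are all made up for illustration, and I am assuming DNA data with simple taxon names (no spaces or punctuation that would need quoting). Region annotations go in a SETS block as CHARSET entries.

```python
def to_nexus(alignment, charsets=None):
    """Render an alignment as NEXUS text.

    alignment: dict mapping taxon name -> aligned sequence (all equal length).
    charsets:  optional dict mapping region name -> (start, end),
               1-based and inclusive, used to annotate regions.
    """
    if not alignment:
        raise ValueError("alignment is empty")

    # Every row of an alignment must have the same number of columns.
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    nchar = lengths.pop()

    # Pad taxon names so the sequences line up in the matrix.
    width = max(len(name) for name in alignment)

    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(alignment)} NCHAR={nchar};",
        "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
        "  MATRIX",
    ]
    for name, seq in alignment.items():
        lines.append(f"    {name.ljust(width)}  {seq}")
    lines += ["  ;", "END;"]

    # Optional annotations of regions, stored as character sets.
    if charsets:
        lines.append("BEGIN SETS;")
        for region, (start, end) in charsets.items():
            lines.append(f"  CHARSET {region} = {start}-{end};")
        lines.append("END;")

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {"taxon_A": "ACG-TACG", "taxon_B": "ACGGTACG"}
    print(to_nexus(example, charsets={"stem1": (1, 4)}))
```

The whole file is plain text, so a reviewer or a later group can open it in any editor or load it into their phylogenetics software of choice.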

How would this improve science, and why is this being filed under "fixing peer review"? First and foremost, it allows future groups to build upon previously published work, and know concretely how their analysis differs from the published work. If you start with a published (as supplementary material online) alignment, and you have a serious problem with aspects of the older alignment (e.g., they neglected to take conserved secondary structures into account), this gives you an important point for discussion to explain why your results do not match up with previous results. We would no longer have to speculate -- we could test such an assertion. Secondly, it will speed up research, as fewer groups would have to reinvent the wheel and re-do the same work that has been done by other groups. And, as I argue in the first point, this does not mean that the published work would be used as an unevaluated black box -- it can, and hopefully will, be critiqued and refined. The reason it is on my mind at the moment, however, is the way this will assist peer review. I just spent far too many hours yesterday trying to replicate an alignment in a manuscript I was reviewing to test a theory I had about a phylogenetic tree in it. If my theory were correct, it would undermine all the paper's conclusions, but I could not see if my theory was even plausible without seeing at least a partial alignment. Had the authors been required to upload their alignment with their manuscript, this would have taken me less than five minutes to check. Under our current system... let's not discuss how long it took. I know some will say that I should have just suggested the reanalysis without attempting it, but I have learned the hard way that if you don't provide evidence (a hint of reanalysis, citations) for your comments in peer review, the authors sweet-talk the editor and convince them you didn't know what you were talking about.
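
As a rough illustration of "knowing concretely how the analyses differ", here is a small sketch that compares a published alignment with your own realignment of the same sequences. Everything in it is hypothetical: the function name `compare_alignments`, the report keys, and the toy sequences. It assumes each alignment is already loaded as a mapping from taxon name to aligned sequence, with `-` as the gap character.

```python
def compare_alignments(published, mine, gap="-"):
    """Summarize how two alignments differ, taxon by taxon.

    Both arguments map taxon name -> aligned sequence. Returns a dict of
    sorted taxon lists:
      missing            - taxa present in only one of the alignments
      different_sequence - shared taxa whose underlying (ungapped) data differ
      different_gaps     - shared taxa with identical data but gaps placed
                           differently, i.e. they were aligned differently
    """
    shared = set(published) & set(mine)
    report = {
        "missing": sorted(set(published) ^ set(mine)),
        "different_sequence": [],
        "different_gaps": [],
    }
    for taxon in sorted(shared):
        row_a, row_b = published[taxon], mine[taxon]
        # Strip gaps to compare the raw sequence data first.
        if row_a.replace(gap, "") != row_b.replace(gap, ""):
            report["different_sequence"].append(taxon)
        elif row_a != row_b:
            # Same data, different rows: only the gap placement changed.
            report["different_gaps"].append(taxon)
    return report


if __name__ == "__main__":
    published = {"A": "AC-GT", "B": "ACGGT", "C": "AC-GT"}
    mine = {"A": "ACG-T", "B": "ACGGT", "D": "ACGGT"}
    print(compare_alignments(published, mine))
```

A check like this only says where two alignments disagree, not which one is right, but that is exactly the starting point a reviewer or a follow-up study needs.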

When sequence analysis was a new field, journals published the entire alignments in the printed journals. This quickly became impractical as computing power and algorithms advanced, and it was no longer required or encouraged to publish your alignments. However, you are always supposed to supply your alignment to any and all people who contact you after your paper is published, and some attempts have been made at alignment repositories (EMBL's closed in 10/2008). After all, bioinformatic analyses based on nucleotide or amino acid sequences are only as accurate as the alignments underlying them. I think requiring alignments as supplementary online material would be a fantastic requirement for the increasingly digital age. Some journals already require this, but none of those I regularly read or review for. Maybe this is not what Cell meant at all... but I'm running with this.

Plus, I am in favor of anything that improves my time-to-completed-review. (I have peer reviews #5 and #6 of 2009 on my computer today. I need to speed things up or I'll never get tenure.)
