Archive for December, 2007

Mining the biological literature about a model organism

Excerpts from Wikipedia about the model organism Caenorhabditis elegans:

Caenorhabditis elegans … is a free-living nematode (roundworm), about 1 mm in length, which lives in temperate soil environments. Research into the molecular and developmental biology of C. elegans was begun in 1974 by Sydney Brenner [1] and it has since been used extensively as a model organism.

As for most model organisms, there is a dedicated online database for the species that is actively curated by scientists working in this field. The WormBase database attempts to collate all published information on C. elegans and other related nematodes.

There’s a “Find” search box at the top of the WormBase home page. Entering the phrase “open access” in the search box and selecting the “Literature Search” option in the drop-down menu yielded matches in 214 of the documents in the Literature database. The “Database Description” at the bottom of each page of results stated that the “Current database contains 9475 full text papers and 24612 abstracts”.

The first 25 papers identified in the results of this search were all openly accessible via an “online text” link to the website of the journal in which the document was published. However, it appears that only a small minority of the 9475 full text papers were identified as “open access” papers. (Probably fewer than 214/9475=2%, because some of the 214 “open access” documents appear to have been identified more than once).

Another “Literature Search”, using “stem cell” as the keyword phrase, yielded matches in 867 documents. An “online text” link was present for 19 of the first 25 of these documents. This link provided free access to the full text for 9 of these 19 papers. (Unfortunately, whether or not the “online text” is freely accessible can only be determined by attempting to obtain access).

Although this latter finding suggests that a much higher proportion of the documents in this database may be freely accessible than are found by searching for the key words “open access”, a much more extensive study would need to be carried out in order to obtain a reliable estimate of the proportion of freely-accessible documents in this particular database.
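To illustrate why a sample of 25 documents is too small for a reliable estimate, here is a minimal sketch in Python that computes an approximate 95% Wilson score interval for the 9 freely accessible papers among the first 25 “stem cell” results. The function name `wilson_interval` is my own, and the calculation assumes the 25 documents behave like a random sample of the database, which the first page of search results almost certainly is not.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    Assumes the n observations are an independent random sample.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# 9 of the first 25 "stem cell" results were freely accessible.
low, high = wilson_interval(9, 25)
print(f"observed: {9 / 25:.0%}; approx. 95% interval: {low:.0%} to {high:.0%}")
```

Under these assumptions the interval runs from roughly 20% to 55%, which is far too wide to say what proportion of the database is freely accessible.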

The software that powers the WormBase Literature Search is Textpresso. On the About Textpresso webpage, it’s stated that: “Textpresso is an information extracting and processing package for biological literature”.

I wasn’t aware of this software until I saw an item, Presentations from Harvard publishing conference, posted by Peter Suber to Open Access News (December 29, 2007). This item led me to the PDF version of a presentation by Robert Kiley, Head of Systems Strategy, Wellcome Library. He participated in a panel session at the Harvard conference on Publishing in the New Millennium (Cambridge, November 9, 2007). His presentation was entitled: “Wellcome Trust and open access”. The heading on Slide 7/12 was: “New resources from mining the literature: Textpresso”.

Textpresso looks like a useful demonstration of the methods that are being developed to “mine the literature”. (But consider how much more useful it would be if all of the documents in the Literature database were openly accessible.)



Repression of mouse mammary progenitor cells

In the December 15, 2007 issue of Genes & Development, one of 12 “Research Papers” and both of two “Research Communications” are labelled “Open Access Article”. One of the latter articles is: A role for microRNAs in maintenance of mouse mammary epithelial progenitor cells, by Ingrid Ibarra, Yaniv Erlich, Senthil K. Muthuswamy, Ravi Sachidanandam, and Gregory J. Hannon, Genes Dev 2007(Dec 15); 21(24): 3238-3243. The Abstract:

microRNA (miRNA) expression profiles are often characteristic of specific cell types. The mouse mammary epithelial cell line, Comma-Dβ, contains a population of self-renewing progenitor cells that can reconstitute the mammary gland. We purified this population and determined its miRNA signature. Several microRNAs, including miR-205 and miR-22, are highly expressed in mammary progenitor cells, while others, including let-7 and miR-93, are depleted. Let-7 sensors can be used to prospectively enrich self-renewing populations, and enforced let-7 expression induces loss of self-renewing cells from mixed cultures.

The final paragraph of the Results and Discussion section:

Overall, our results support the notion that miRNA expression patterns form both a characteristic signature of a given cell type and help to reinforce cell fate specification. Even within a single cell line, distinct compartments containing progenitor cells and more differentiated cells have unique miRNA patterns, suggesting that such signatures can be used not only to define and track rare cell populations in vitro and in vivo, but that manipulation of these signatures might be used to expand or deplete stem cell and tumor-initiating cell populations for therapeutic benefit.

For a commentary about this article, see: Scientists identify and repress breast cancer stem cells in mouse tissue, Medical Science News, December 19, 2007. Excerpt:

By manipulating highly specific gene-regulating molecules called microRNAs, scientists at Cold Spring Harbor Laboratory (CSHL) report that they have succeeded in singling out and repressing stem-like cells in mouse breast tissue – cells that are widely thought to give rise to cancer.

Added December 26, 2007:

Genes & Development is one of the journals published by Cold Spring Harbor Laboratory Press. It’s a high-quality journal in its field, “ranked number 1 in terms of cost-effectiveness in the field of Developmental Biology”, according to a statement attributed to

The journal has an Open Access Option:

All papers are freely available online six months after publication. In addition, Genes & Development is now offering an Open Access option in which authors may pay a surcharge of $2000 to make their paper freely available online immediately upon publication. Authors may choose this option when page proofs are returned to Journals Production; choosing this option will have no effect on acceptance and publication of submitted papers.

As of the end of 2007, all papers published after June 15 are freely available online. The 12 issues between July 1 and December 15 can be accessed via the Archive of 2007 Online Issues. These 12 issues contained a total of 155 contributions (other than “Errata”). Of these contributions, only 13 (8%) were freely available online immediately upon publication. The Open Access Option was utilized more often for “Perspectives”, “Reviews” and “Research Communications” (7/50=14%) than for “Research Papers” (6/105=6%).

Ten of the 13 freely available papers can also be accessed via the page of Articles With Immediate Access in the appropriate section of PubMed Central. Among these ten is the very interesting article highlighted above, entitled: A role for microRNAs in maintenance of mouse mammary epithelial progenitor cells.

This case study shows that only a small minority of authors adopted the Open Access Option during the last half of 2007. A six-month delay before free availability was evidently not a deterrent for most of the authors whose papers were accepted for publication in these issues of this journal. The case study therefore highlights the crucial importance of the length of the delay period, prior to free availability, for journals such as this one.

This six-month delay period is already shorter than the one-year delay permitted by the recently-imposed OA mandate at NIH. See: OA mandate at NIH now law, posted by Peter Suber to Open Access News on December 26, 2007.

I was unable to find a policy about Green OA (via self-archiving) for the journal Genes & Development.


Free contents in NEJM

The New England Journal of Medicine (NEJM) is a top-ranked medical journal with an impact factor of 44 (2005 data, see this FAQ).

Registered users have free access to research articles that are six months old or older. However, the tables of contents of each issue (such as the issues for 2007) of the NEJM indicate that the free full text is available immediately, without registration, for a minority of the contents of each issue.

I’ve looked at the 15 most recent issues (the issues from September 6, 2007 to December 13, 2007) and have tabulated the contents for which the free full text is already available. The section that lists the “Article Summaries” was omitted, as were the “Book Reviews”. A total of 315 individual items were identified in the contents of these 15 issues. The free full text was accessible upon publication for 103 of these items (33%). The largest number of items (92) in an individual section was in the “Correspondence” sections of these issues. The free full text was immediately accessible for 26 of these letters (28%).

The “Perspective” section of the NEJM “Provides a quick assessment of a single, important topic”. There were 36 items in the “Perspective” sections of these 15 issues, of which 25 (69%) were immediately freely accessible. Prompt free access was also provided to 20 of the 29 items (69%) in the “Images in Clinical Medicine” sections.

Of a total of 60 “Original Articles” in these 15 issues, only 8 (13%) were freely accessible upon publication, and of a total of 35 “Editorials”, 7 were already freely accessible (20%).
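The tabulation above can be checked with a short script. This is only a sketch: the counts are the ones reported in this post, while the dictionary layout and the helper name `percent` are my own choices for illustration. It also derives the counts for the sections that are not itemised above (the overall totals minus the five listed sections).

```python
# Counts reported above for the 15 NEJM issues (Sep 6 to Dec 13, 2007):
# section -> (items free upon publication, total items)
nejm_sections = {
    "Correspondence": (26, 92),
    "Perspective": (25, 36),
    "Images in Clinical Medicine": (20, 29),
    "Original Articles": (8, 60),
    "Editorials": (7, 35),
}
overall_free, overall_total = 103, 315


def percent(free: int, total: int) -> int:
    """Share of freely accessible items, rounded to a whole percent."""
    return round(100 * free / total)


for section, (free, total) in nejm_sections.items():
    print(f"{section:<28} {free:>3}/{total:<3} {percent(free, total):>3}%")
print(f"{'All sections':<28} {overall_free:>3}/{overall_total:<3} "
      f"{percent(overall_free, overall_total):>3}%")

# Items in sections that were counted but not itemised in the post.
listed_free = sum(free for free, _ in nejm_sections.values())
listed_total = sum(total for _, total in nejm_sections.values())
other_free = overall_free - listed_free
other_total = overall_total - listed_total
print(f"{'Other sections':<28} {other_free:>3}/{other_total:<3} "
      f"{percent(other_free, other_total):>3}%")
```

The five listed sections account for 252 of the 315 items, so the remaining sections contributed 63 items, 17 of which were free upon publication.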

Why are some items in the table of contents freely accessible immediately upon publication, while others are not? I’ve been unable to find an answer to this question in the FAQs that are available via the NEJM site. I’ve contacted the journal via email in an attempt to obtain an answer.

Added December 26, 2007:

A message, NEJM – Change in Access Policy for Research Articles, was posted to the Liblicense-L mailing list by Tom Richardson, Director, Institution Sales & Service for NEJM, on December 22, 2007. Excerpts:

As of December 19, 2007, the Journal now provides free access to original research articles (Original Articles and Special Articles), six-months after publication, with no registration required.

The Journal will continue to make articles of immediate public health importance available free to all visitors upon publication.

A copy of this message was posted by Peter Suber to the SPARC Open Access Forum on December 23, 2007: NEJM – Change in Access Policy for Research Articles.


Insight Journal is interesting

A comment that’s available via the BOAI Forum Archive, posted by Luis Ibanez on December 11, 2007, describes the Insight Journal, a very interesting “Post-Publication-Peer-Reviewed-Open-Access-E-Journal” for the medical image analysis community. An excerpt from the comment:

[A] main focus of this journal is to empower readers to perform verification of reproducibility by asking authors to post along with their papers *all* the material that is required for reproducing the work that is described in the paper. In this way, serious peer-review can actually be performed, instead of being limited to the decadent practice of most journals where only “opinions” from the reviewers are considered to be an acceptable peer-review.

Publication is automatic in the Insight Journal, and then papers are made available for public (anonymous and non-anonymous) peer-review online.

Added December 16, 2007:

For an example of an article that’s been published in the Insight Journal, see: Principles and Practices of Scientific Originology by Luis Ibanez, presented at the ISC/NA-MIC Workshop on Open Science at MICCAI 2007, Brisbane, Australia, November 2, 2007.

For an example of papers currently open for public review in the Insight Journal, see those under the heading: ISC/NA-MIC Workshop on Open Science at MICCAI 2007.

For an example of public reviews, see: Data, data everywhere, nor an image to read – Finding open image databases by David R Holmes III and Richard A Robb, Paper ID 167, ISC/NA-MIC Workshop on Open Science at MICCAI 2007. It’s about the concept of open-data centralization.

This journal seems to be able to attract post-publication reviewers. Other experiments along these lines have been less successful. See the examples mentioned by Mark Ware in: Why don’t researchers like to comment on journal articles? (putting down a marker blog, 24 August 2007).


Excitement about induced pluripotent stem cells

There’s been a lot of interest in this article: Generation of induced pluripotent stem cells without Myc from mouse and human fibroblasts, by Masato Nakagawa and 9 co-authors (including Shinya Yamanaka), Nature Biotechnology, Advance online publication, 30 November 2007. As of this morning, the full text is still freely accessible.

For the current news about this topic, see, for example, a Google News search for the key words: Yamanaka stem cell

Two earlier papers about induced pluripotent stem cells, both published online on November 20, 2007, also generated interest. They are (see also my comment posted to this blog on November 20):

1) Induction of Pluripotent Stem Cells from Adult Human Fibroblasts by Defined Factors, by Kazutoshi Takahashi and co-authors (including Shinya Yamanaka), published in Cell. [Cell DOI: 10.1016/j.cell.2007.11.019]. PDFs of the article and supplemental data are currently freely accessible via the Cell website.

2) Induced Pluripotent Stem Cell Lines Derived from Human Somatic Cells, by Junying Yu and co-authors (including James A. Thomson), published in Science. [Science DOI: 10.1126/science.1151526]. Only the abstract is currently freely accessible.

Cell has highlighted the free access to the first of these two articles published on November 20. See: Yamanaka paper in Cell captivates the world (undated). Excerpts:

Media from around the world were captivated by the Yamanaka paper published in Cell earlier this week. Over 300 stories about the research appeared online within 2 hours after being posted online, with nearly 800 stories by 3pm that day.

A full list of coverage can be found at:

However, as of today, most of the coverage was about the subsequent article that was published online in Nature Biotechnology on November 30.
