More baseline data from PubMed

Heather Morrison has pointed out that one can readily obtain data on the percentages of literature indexed by PubMed for which a link to the free full text is available (see her post: Cancer Literature: 13% Free, March 29, 2008).

The new NIH policy about open access will begin to be implemented on April 7, 2008. So, April 6 is a good time to collect baseline data about the portion of the literature that results from NIH-funded research. An indicator of the amount of such research can be obtained by entering these key words (without the quotation marks) in the PubMed search box: “Research Support, N.I.H., Extramural [pt] OR Research Support, N.I.H., Intramural [pt]”. The key words Research Support, N.I.H., Extramural yield a search for all articles noted in PubMed as resulting from extramural research at any NIH institute. Similarly, the key words Research Support, N.I.H., Intramural yield a search for all articles noted as resulting from intramural research at any NIH institute.
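For anyone who wants to repeat the search, here is a minimal sketch of how the search string above could be assembled in Python. The function name `build_nih_query` is hypothetical. The optional `free full text[sb]` clause is an assumption on my part: the searches reported here used PubMed's search limits rather than a typed filter.

```python
# The two publication types named in the post.
PUBLICATION_TYPES = (
    "Research Support, N.I.H., Extramural",
    "Research Support, N.I.H., Intramural",
)


def build_nih_query(publication_types=PUBLICATION_TYPES, free_full_text_only=False):
    """Join the publication types with OR, tagging each one with [pt]."""
    clause = " OR ".join(f"{pt} [pt]" for pt in publication_types)
    if free_full_text_only:
        # Assumption: PubMed's "free full text[sb]" subset filter.
        # The post applied this restriction as a search limit instead.
        return f"({clause}) AND free full text[sb]"
    return clause


print(build_nih_query())
```

The date restrictions (last 3 years, last 90 days, and so on) are left out of the sketch; they would still be set through PubMed's limits, as in the post.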

When these key words were used on April 6, with the search limited to articles published within the last 3 years, a total of 215424 articles was identified. When the same search was repeated, but limited to articles published within the last 3 years for which links to the free full text also were available, the result was 73631, or 34% of the total. Over the next couple of years, as the new NIH access mandate continues to be implemented, this percentage can be expected to increase substantially. Indeed, this indicator should provide a simple way of assessing the impact of the new mandate.

Some other data about the percentage of freely-accessible articles, obtained on the same day using the same key words:

Published within the last 2 years: 44366/144166=31%
Published within the last 1 year: 12241/67486=18%
Published within the last 180 days: 3894/30602=13%
Published within the last 90 days: 1012/11947=8%
Published within the last 60 days: 454/6475=7%
Published within the last 30 days: 143/2267=6%
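The percentages above are simply the free-full-text count divided by the total count, rounded to the nearest whole number. A short sketch of that arithmetic, using the counts reported in this post (the names `COUNTS` and `percent_free` are only illustrative):

```python
# (articles with free full text, all articles) for each publication window,
# copied from the April 6, 2008 results reported in the post.
COUNTS = {
    "3 years": (73631, 215424),
    "2 years": (44366, 144166),
    "1 year": (12241, 67486),
    "180 days": (3894, 30602),
    "90 days": (1012, 11947),
    "60 days": (454, 6475),
    "30 days": (143, 2267),
}


def percent_free(free, total):
    """Return the share of freely accessible articles as a whole-number percentage."""
    return round(100 * free / total)


for window, (free, total) in COUNTS.items():
    print(f"Published within the last {window}: {free}/{total}={percent_free(free, total)}%")
```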

These data suggest that, prior to the implementation of the new NIH policy, less than 10% of NIH-supported articles were freely accessible via PubMed within 90 days after publication. In contrast, once embargo periods of between 6 months and a year had passed, the proportion of NIH-supported articles that were freely accessible via PubMed within 2 years after publication rose to about 30%.

These proportions vary somewhat across topics. When the same search for NIH-supported articles was limited to specified “Topics”, the following results were obtained:

Topic: AIDS
Published within the last 2 years: 2149/6807=32%
Published within the last 90 days: 37/461=8%

Topic: Bioethics
Published within the last 2 years: 167/886=19%
Published within the last 90 days: 2/28=7%

Topic: Cancer
Published within the last 2 years: 16344/48587=34%
Published within the last 90 days: 254/3364=8%

Topic: Complementary Medicine
Published within the last 2 years: 1420/5337=27%
Published within the last 90 days: 36/332=11%

Topic: History of Medicine
Published within the last 2 years: 83/340=24%
Published within the last 90 days: 3/17=about 18%

Topic: Space Life Sciences
Published within the last 2 years: 672/2479=27%
Published within the last 90 days: 10/125=8%

Topic: Systematic Reviews
Published within the last 2 years: 360/1519=24%
Published within the last 90 days: 6/113=5%

Topic: Toxicology
Published within the last 2 years: 6528/21120=31%
Published within the last 90 days: 109/1256=9%

It’ll also be interesting to see whether the impact of the new NIH policy varies across topics.

Finally, it should be noted that some bloggers have proposed that the week beginning on April 7, 2008 should be OA week, in recognition of the beginning of implementation of the new NIH policy. Those bloggers who take part are asked to mention at some point during the week that the NIH is, at present, collecting public comments on the policy.

Added April 7, 2008:

A more elaborate search strategy for identification of NIH-supported publications is described at: www.nlm.nih.gov/bsd/funding_support.html

The search strategy involves “all the NIH 2-letter grant codes and institute acronyms as well as the two publication types, Research Support, N.I.H., Extramural and Research Support, N.I.H., Intramural”.

When this strategy was used, the results obtained for the percentages of publications identified in PubMed as freely accessible were:

Published within the last 2 years: 45361/145354=31%
Published within the last 90 days: 1039/11905=9%

These percentages are very similar to those obtained via the search strategy that involves only the two publication types, Research Support, N.I.H., Extramural and Research Support, N.I.H., Intramural.

Results for each of these two publication types were:

Research Support, N.I.H., Extramural:
Published within the last 2 years: 42622/139273=31%
Published within the last 90 days: 981/11425=9%

Research Support, N.I.H., Intramural:
Published within the last 2 years: 2678/7087=38%
Published within the last 90 days: 76/672=11%

The Intramural Support percentages are somewhat larger than the Extramural Support percentages. However, there were substantially fewer Intramural Support publications, so the combined percentages are dominated by the Extramural Support contributions.
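To illustrate why the Extramural figures dominate, here is a rough sketch using the 2-year counts above. Adding the two sets of counts together is an assumption: it ignores any articles tagged with both publication types, so the pooled total will not exactly match the combined searches reported earlier.

```python
# 2-year counts from the post: (free full text, all articles).
extramural_free, extramural_total = 42622, 139273
intramural_free, intramural_total = 2678, 7087

# Assumption: pool the two searches by simple addition (overlap ignored).
pooled_free = extramural_free + intramural_free
pooled_total = extramural_total + intramural_total

extramural_rate = 100 * extramural_free / extramural_total
intramural_rate = 100 * intramural_free / intramural_total
pooled_rate = 100 * pooled_free / pooled_total

# Share of the pooled articles that come from the Extramural search.
extramural_share = 100 * extramural_total / pooled_total

print(f"Extramural: {extramural_rate:.0f}%")
print(f"Intramural: {intramural_rate:.0f}%")
print(f"Pooled: {pooled_rate:.0f}%")
print(f"Extramural share of pooled articles: {extramural_share:.0f}%")
```

Because the Extramural search accounts for the great majority of the pooled articles, the pooled percentage stays close to the Extramural percentage.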

Added July 2, 2008: Cite as: Till J. More baseline data from PubMed. Be Openly Accessible or Be Obscure blog. Self-Archived at WebCite® 2008-Jul-2 [http://www.webcitation.org/5Z0oCNM85]

1 Comment »

  1. tillje said

    An “Open Access Quotient” (OAQ) has been defined as: (PubMed results with open access fulltext links for last 60 days)/(PubMed results with fulltext links for last 60 days). See: How open access is your research area? Matthew Cockerill, BioMed Central Blog, July 22, 2007.

    The OAQ does not have a focus on NIH-supported research. Instead, it’s an indicator of the fraction of PubMed-indexed publications in particular research fields that’s available with open access shortly after publication.

    One or more keywords can be entered into a search box, to calculate the OAQ for a given topic. Results for a variety of keywords have been posted as comments, at: How open access is your research area?

    On July 22, 2007, the results for the research field “cancer” were:

    * Open Access: 623
    * Total: 8622
    * Open Access Quotient: 7.2%

    On August 18, 2008, the results for “cancer”, obtained via the search box, were:

    * Open Access: 971
    * Total: 11542
    * Open Access Quotient: 8.4%

    So there was not much change in the OAQ for “cancer” over this period of about 13 months.
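The Open Access Quotient defined in the comment above is a simple ratio, and can be sketched as follows. The function name `open_access_quotient` is only illustrative, and the counts are the two “cancer” results quoted in the comment.

```python
def open_access_quotient(open_access, total):
    """OAQ as a percentage: open access fulltext results / all fulltext results."""
    return 100 * open_access / total


# "cancer" results quoted in the comment, for the last 60 days at each date.
oaq_2007 = open_access_quotient(623, 8622)    # July 22, 2007
oaq_2008 = open_access_quotient(971, 11542)   # August 18, 2008

print(f"OAQ on July 22, 2007: {oaq_2007:.1f}%")
print(f"OAQ on August 18, 2008: {oaq_2008:.1f}%")
print(f"Change: {oaq_2008 - oaq_2007:.1f} percentage points")
```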
