Andra Waagmeester * Chris T Evelo
Department of Bioinformatics – BiGCaT Maastricht University, Maastricht, the Netherlands
* Corresponding author: firstname.lastname@example.org
CitedIn is a web service available at www.citedin.org. It tracks online citations to the literature covered in Pubmed, as found in databases, blogs, wikis, and community sites. For biomedical scientific literature, Pubmed is by far the most interesting resource: it is the interface to the Medline repository, which covers more than a century of publications and contains almost all relevant publications in the field. Pubmed uses very simple, unique numeric identifiers, which makes it easy to cite publications by their Pubmed Identifiers (PMIDs) or to link to Pubmed itself based on them. Because of this ease of use, PMIDs are often used for online citations; they are often the standard in biomedical databases and, for instance, in Wikipedia.
CitedIn uses a federated search approach to find which publications contained in Pubmed are cited where across a large number of web resources. At this moment (April 2011), CitedIn searches wikis (including Wikipedia), search engines for scientific blogs (Nature blog and Google blog search), databases (including some major bioinformatics databases), Google Books, some special publication sets, and social network sites (such as Connotea and CiteULike). Searching through Twitter tweets will be implemented in the near future.
A CitedIn search can be run for any set of Pubmed references, either supplied as a list of PMIDs or retrieved from Pubmed through a set of keywords. This allows, for instance, a search for all papers produced by a single author, and thus lets you ask the question “where am I cited on the web, besides in scientific publications?”. CitedIn shows the publications it reviewed and indicates, for each of them, where it was cited. It is possible to get an overview of the whole set, showing the contribution of each resource, and also to review an individual publication and find the actual citations.
CitedIn also offers an application programming interface (API) for programmatic access, so that it can be used for automated analysis. While we initially thought of CitedIn as “just” a resource for finding online citations, it also provides information that is useful for estimating the online impact of a paper or a set of papers. This offers an opportunity to assess the online impact of an author, a group of authors, or a research topic. We propose to use this to calculate a metric for online scientific impact: the CI-number (the CitedIn number for online impact).
Traditionally, the impact of scientific publications, journals, and researchers is determined by how often publications are cited in other publications. This leads to the number of citations per article, impact factors per journal (the average number of citations for all articles in a journal), and, for instance, h-indices for researchers (the largest number h such that h of a researcher’s articles have each been cited at least h times) [1-3]. Debate is ongoing about how justifiable these indices are, but the importance of scientific literature for the advance of science and technology creates a need to somehow measure contributions, and the current system often determines academic careers, the fate of journals, and even decisions to close or fund whole research institutes or research programs. Since the current methods only consider structured citations in the reference lists of journal articles (and sometimes books), they miss important citations. These come from roughly four domains: 1) publications on the Internet (e.g. blogs, Wikipedia), 2) online databases (which contain structured knowledge derived from papers and often refer to them), 3) social network sites (such as Mendeley, CiteULike and Connotea, which are in part designed for sharing important publications), and 4) supplementary data (long lists of references, especially for reviews, are sometimes only published online).
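The h-index mentioned among the traditional metrics can be made concrete with a short sketch. This is the standard definition applied to a plain list of per-article citation counts. The function name and the example counts are illustrative assumptions and are not part of CitedIn.

```python
def h_index(citation_counts):
    """Return the largest h such that h articles have at least h citations each."""
    # Rank articles from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` articles all have at least `rank` citations
        else:
            break
    return h


# Made-up citation counts for five articles by one researcher.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))
```

Like the other traditional indices, this calculation only sees citations that appear in reference lists, which is the limitation the four domains above point to.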
We have defined the CI-number (CitedIn number for online impact) as a metric to assess the impact in online resources of a set of scientific publications contained in Pubmed. The metric is calculated dynamically while the numbers of citations in each resource are counted. In calculating the CI-number we normalize on the total number of citations covered by the resource under scrutiny, and we introduce a weight value for each resource. This weight indicates the “impact” of the resource: a citation in Wikipedia, for example, should be considered to have a higher impact than a citation in a blog. The total weight is first divided over the main groups of resources and then among the resources within each group. Individual weights and the relative total weights of the groups will be adjusted on a yearly basis (effectively leading to a yearly CI-number). We will start with the following arbitrarily selected group weights: Wikipedia (25%), blogs and social media (15%), small wikis (15%), databases (35%), and a rest category (10%). The equation below shows the formula that calculates the CI-number for a set of PMIDs:
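As a rough illustration of the calculation described in this paragraph, and not the authors' exact formula, the sketch below treats the CI-number as a weighted sum over resources: each resource contributes its weight multiplied by the number of citations to the PMID set found in that resource, divided by the total number of citations that resource covers. The group weights are taken from the text. The equal split of a group's weight over its resources, the function names, the resource names, and all counts are assumptions made for the example.

```python
# Group weights as listed in the text (fractions of the total weight).
GROUP_WEIGHTS = {
    "wikipedia": 0.25,
    "blogs_social": 0.15,
    "small_wikis": 0.15,
    "databases": 0.35,
    "rest": 0.10,
}


def resource_weights(resources_by_group, group_weights=GROUP_WEIGHTS):
    """Divide each group's weight over its resources.

    Assumption: the split within a group is equal; the text does not say how
    the weight is divided between the resources of a group.
    """
    weights = {}
    for group, resources in resources_by_group.items():
        if not resources:
            continue
        share = group_weights[group] / len(resources)
        for name in resources:
            weights[name] = share
    return weights


def ci_number(set_citations, total_citations, weights):
    """Weighted sum of normalized citation counts (assumed form of the metric)."""
    score = 0.0
    for name, weight in weights.items():
        total = total_citations.get(name, 0)
        if total == 0:
            continue  # a resource with no citations cannot be normalized
        score += weight * set_citations.get(name, 0) / total
    return score


# Hypothetical resources and made-up counts, for illustration only.
groups = {"wikipedia": ["Wikipedia"], "databases": ["DatabaseA", "DatabaseB"]}
weights = resource_weights(groups)
found = {"Wikipedia": 5, "DatabaseA": 20}  # citations to the PMID set
totals = {"Wikipedia": 1000, "DatabaseA": 4000, "DatabaseB": 2000}
print(ci_number(found, totals, weights))
```

Normalizing per resource keeps a very large resource from dominating the score purely through its size, while the weights express how much a citation in that kind of resource is taken to matter.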
[Equation: formula to calculate the CI-number of a given set of Pubmed Identifiers]
In the near future we plan to extend CitedIn to cover Digital Object Identifiers (DOIs) in addition to PMIDs, which will help to retrieve more citations and to cover more scientific domains. We would like to present the proposed CitedIn metric to the alt-metrics community and discuss it with them.
[1] Garfield. Citation indexes for science; a new dimension in documentation through association of ideas. Science (1955) vol. 122, 108-11 (PMID: 14385826)
[2] Hirsch. An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA (2005) vol. 102, 16569-72 (PMID: 16275915)
[3] Neylon and Wu. Article-level metrics and the evolution of scientific impact. PLoS Biol (2009) vol. 7, e1000242 (PMID: 19918558)