Who are we talking about?: the validity of online metrics for commenting on science [v0]

This is version 0 of an abstract to be presented at altmetrics11

Julie Birkholz
Shenghui Wang
Vrije Universiteit Amsterdam, the Netherlands

The work of research undertaken by scientists culminates in the communication of results. This act of dissemination, the researcher's gift to his peers (Hagstrom 1965), occurs through the publication of research in peer-reviewed journals, books, and collections of conference proceedings. The field of Scientometrics analyzes these practices of communication (namely, by investigating the end-line products of knowledge systems: scholarly publications) to assess the impact, influence, and processes of science. With the general rise of Internet use, an increasing portion of researchers' work takes place online: exchanging e-mail, accessing online bibliographic databases for the latest research, and traversing the web and blogosphere on topics of science. The rise in these online activities suggests that an increasing portion of knowledge production has moved from 'paper publication' to online platforms. This social phenomenon of science moving to the web presents a potential turning point for science studies. These traceable behaviors on the web provide an additional, arguably complementary, way to study science; but they spark a throng of additional questions about how to assess the online aspects of knowledge production in science.

A number of studies have begun to explore the online behaviors of science communities (see Priem & Hemminger 2010). These projects suggest that familiar communication conventions, such as the citation of academic articles, also occur on blogs (Groth & Gurney 2010) and Twitter (Priem & Costello 2010). A short list of tools exists to aid in further conceptualizing and understanding these online social behaviors in science, including methods to track online readership (Taraborelli 2010) and article-level impact metrics (Neylon & Wu 2009). These studies have set a path of inquiry into the effects of knowledge dissemination via the web, although the validity of the resulting metrics remains an open issue for theorizing about science.

We pose the question: in science, who is represented online? In answering it, we work to establish the level of external validity that alternative metrics can provide for science as a system. External validity refers to whether results can support causal inferences about an entire population (Lucas 2003), giving an indication of the generalizability of the research to a specific population. External validity therefore depends on the population we aim to generalize about: if the aim of developing alternative metrics is to complement or support traditional scientometrics, then we must be clear about the level of external validity of these online samples. Answering this question allows us not only to comment on the representativeness of these measures but also to provide a descriptive depiction of who in science is on the web.

In order to determine the external validity of researching the online behavior of scientists, we propose a method to test it among a specific set of actors on a number of social platforms on the web. The method requires starting data consisting of a list of names and institutions for a specific population (e.g. a discipline- or field-bound community, or the members of an organization). In this study we target online behavior related to one's own scholarly enterprise. This implies that we are not searching online bibliographic databases for evidence of publications, but rather isolating the existence of activity on the social web, including: blogs; micro-blogging (Twitter); activity on social platforms such as LinkedIn and Mendeley; and the sharing of presentations through Slideshare. A web crawler is being developed to automatically search these sites and confirm the existence of online activity.

The web crawler first crawls each person's homepage and searches for evidence of online presence, such as links to their blogs or 'follow me' links for LinkedIn or Twitter, as well as entries in Mendeley and Slideshare. If these activities are not mentioned on the personal homepage, the crawler searches LinkedIn, Twitter, Mendeley, and Slideshare individually to check whether the author has an account on these sites. An additional search is carried out in Google Blogs to see whether the author maintains a public blog. In this way, we gather sufficient information about the online activities of these authors. The crawling results are stored in RDF or Excel sheets. Standard statistical measures can then be run to analyze how online activity relates to any number of factors within standard metrics (e.g. academic performance). The combination of the starting data (the list of names) and the search limits (the set of platforms) bounds the crawler's results, allowing an analysis of the external validity of the online population with respect to the population we aim to generalize about.
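The homepage step of the crawler described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the profile-URL patterns are assumptions about how links to these platforms typically appear in homepage HTML, and a real crawler would also fetch pages and fall back to per-platform searches.

```python
import re

# Hypothetical profile-URL patterns for the platforms the crawler checks.
PLATFORMS = {
    "twitter":    re.compile(r"twitter\.com/(\w+)"),
    "linkedin":   re.compile(r"linkedin\.com/in/([\w-]+)"),
    "mendeley":   re.compile(r"mendeley\.com/profiles/([\w-]+)"),
    "slideshare": re.compile(r"slideshare\.net/(\w+)"),
}

def extract_presence(homepage_html):
    """Scan a homepage's HTML for links that indicate activity on each
    social platform; return {platform: account} for those found."""
    found = {}
    for platform, pattern in PLATFORMS.items():
        match = pattern.search(homepage_html)
        if match:
            found[platform] = match.group(1)
    return found
```

For example, a homepage containing `<a href="https://twitter.com/jdoe">follow me</a>` would yield `{"twitter": "jdoe"}`; an empty result would trigger the fallback searches on the platforms themselves.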

In this first explorative study, the automated procedures were carried out by hand, as follows. The population we aim to generalize about is currently tenured computer scientists working at 10 Dutch universities. The list of names was queried from the Digital Bibliography & Library Project (DBLP), an online bibliographic database for the field of computer science with publication streams from a number of top-rated journals, conference proceedings, and books within the field, for all published authors from January 2007 (the year after Twitter's inception) to March 2011. This query returned in total 4984 individual scientists published during the period, representing a list of all active scientists connected to Dutch computer scientists via co-authorship. This list of scientists was then sent to ArnetMiner.org, a search and mining service for computer science researchers that includes semantic data on names, contact information, tenure/position, homepage, and additional traditional scientometric statistics. The H-index, total citation counts, position, and affiliation were queried from ArnetMiner.org for all actors.

From this list we sorted the population by citation score and took the 50 actors on each side of the median, totaling 100 scientists in the middle range of performance levels in their field. A selection was also made of the 100 top-most and 100 bottom-most actors, giving us a selection that represents the full spectrum of productivity. We manually searched their online activities (mimicking the actions of the crawler described above) and applied statistical measures to the results to analyze how this selection of 300 actors, of varying performance levels, is represented online. Our results present a depiction of life on the web for the field of computer science. Meanwhile, the implementation of the web-crawling tool, which takes the names as input and automatically searches for these people's presence on the web, will greatly increase the number of people who can be analyzed, thus providing a more reliable extension of our initial manual test of the external validity of specific communities.
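The stratified selection above (bottom 100, 100 around the median, top 100 by citation score) can be sketched as follows; the record fields are illustrative, not the actual data schema used in the study.

```python
def sample_performance_strata(scientists, k=100):
    """Given records with a 'citations' field, select three strata of k
    actors each: the bottom k, the k centered on the median, and the
    top k (300 actors in total when the strata do not overlap)."""
    ranked = sorted(scientists, key=lambda s: s["citations"])
    mid = len(ranked) // 2
    bottom = ranked[:k]
    middle = ranked[mid - k // 2 : mid + k // 2]  # 50 on each side of the median
    top = ranked[-k:]
    return bottom, middle, top
```

With a population much larger than 300 (here, 4984 scientists), the three strata are disjoint and together cover the full range of performance levels.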

From an analytics perspective, then, we have worked to develop a method that provides a tool for accurately reflecting on a population so as to reliably establish a level of external validity (generalizability) with respect to a greater community of actors. In summary, we propose that when measuring the scientific impact and contribution of one's work on the web, it is necessary to be clear about the level of external validity of these web activities in order to better infer trends in science, and this tool can be a component of achieving that goal.


Groth, P. & Gurney, T. (2010). Studying Scientific Discourse on the Web using Bibliometrics: A Chemistry Blogging Case Study. In:  Proceedings of the WebSci10: Extending the Frontiers of Society On-Line, April 26-27th, 2010, Raleigh, NC: US.

Hagstrom, W.O. (1965). The Scientific Community. New York: Basic Books.

Lucas, J. W. (2003). Theory-Testing, Generalization, and the Problem of External Validity. Sociological Theory, 21: 236–253.

Neylon, C. & Wu, S. (2009). Article-Level Metrics and the Evolution of Scientific Impact. PLoS Biol, 7(11): e1000242.

Priem, J. & Hemminger, B. M. (2010). Scientometrics 2.0: Toward new metrics of scholarly impact on the social Web. First Monday, 15(7). Retrieved from: http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/2874/2570.

Priem, J., & Costello, K. (2010). How and why scholars cite on Twitter. Proceedings of the 73rd ASIS&T Annual Meeting. Pittsburgh, PA, USA.

Taraborelli, D. (2010). ReaderMeter: Crowdsourcing research impact. Academic Productivity. Retrieved April 5, 2011, from:  http://www.academicproductivity.com/2010/readermeter-crowdsourcing-research-impact/.


  1. Judit Bar-Ilan
    Posted May 9, 2011 at 12:56 pm | Permalink

    Dear Julie and Shenghui,

    Do you know what the source of Arnetminer’s citation data is? It seems to be a slightly outdated version of Google Scholar. However, the number of publications considered seems to be much smaller than what appears in Google Scholar, and accordingly the h-index seems to be much lower than the h-index derived from Google Scholar.

  2. j.m.birkholz
    Posted May 9, 2011 at 8:42 pm | Permalink

    According to email communication with Arnetminer.org: “The publication data is from DBLP and ACM, while the citation no. is from google scholar.”

    Thus the listed publications are comprehensive for the field of Computer Science, as they draw on two main bibliographic servers of the field. But the Google Scholar score reflects a wider disciplinary outlook. I would therefore assume the difference in scores arises because the Google Scholar count that Arnetminer takes is restricted to the list of publications within Arnetminer.org itself.

  3. Victoria Uren
    Posted May 19, 2011 at 4:10 pm | Permalink

    This work is potentially very interesting, particularly to identify knowledge of which online activities are most associated with improved performance and to target effort. Do you have any preliminary view on differences between the online activity of the three groups you tested?

  4. José Luis Ortega
    Posted April 14, 2013 at 6:28 pm | Permalink

    I think that you are doing the same task twice, because the publication source of Arnetminer is DBLP, so you do not need to extract names from DBLP; they are already in Arnetminer!
    Judit is right. Citations are taken from Google Scholar, which calculates them from a larger set of documents than Arnetminer (DBLP) holds, so you have to be careful when comparing citations and h-indexes with document counts, because they come from different sources.

  5. Posted April 15, 2013 at 7:32 am | Permalink

    José Luis,

    Yes, you are correct: one could just query Arnetminer, as the DBLP publication data is embedded in the Arnetminer database. For our purposes we already had a set of DBLP records of identified Dutch scientists with disambiguated names, part of a database used in another research project. This was used as our source list. Since DBLP does not store citation scores, we needed a reliable database where we could identify the scientists from our source list and compare metadata. Arnetminer was selected, since its coverage was comparable.

    For the complete results of this study, see our recently published paper: Birkholz, J. M., Wang, S., Groth, P., & Magliacane, S. (2013). Who are we talking about?: identifying scientific populations online. In J. Li et al. (eds.), Semantic Web and Web Science, Springer Proceedings in Complexity. DOI 10.1007/978-1-4614-6880-6_21.
