The 2015 Altmetrics Workshop
Amsterdam, 9 October 2015
Juan Pablo Alperin
There is a long-standing hope from the altmetrics community that
altmetrics will provide a means of capturing impact beyond the academy.
Statements speaking to altmetrics’ potential to capture public impact
can be seen as early as the *Altmetrics Manifesto* (Priem, Taraborelli,
Groth, & Neylon, 2010), and similar statements regarding various types
of impact are peppered throughout the literature (Alperin, 2013;
Lin2013; Taylor2013a; Bornmann, 2014a, among others). This is no
surprise, of course, as the very term altmetrics implies the measure of
impact beyond citations.
Yet, getting hung up on the *metrics* may be hindering our efforts to
capture the very things we intend altmetrics to measure. There is an
understanding, at least among some, that where this “broader impact of
research is concerned, it is much more important to learn who has used
an actual research product and why, than to simply know ‘how many’
people have in total” (Bornmann, 2014a, p. 901).
While the task of collecting counts (i.e., the ‘how many’) can easily be
automated, there has been little progress in finding reliable automated
ways of collecting further information about the individuals responsible
for generating the counts, and even less progress in uncovering their
motivations or the contexts in which they encountered and engaged with
the research product.
As a result, after five years of altmetrics research, we are certainly
better equipped to interpret and use social media metrics, but we still
only have a limited understanding of who and how the metrics are
To this end, we must devise new research instruments that move beyond
the data sources themselves, and that instead offer information
regarding the individuals who access or mention the research online.
This paper describes one such method, specifically designed to learn
more about individuals who have Tweeted research articles, along with
the experience of a pilot study that shows the method’s strengths and
potential, as well as its limitations.
This method builds on recent efforts by others, who have used sentiment
analysis (Friedrich, Bowman, Stock, & Haustein, 2015) and expert and
crowd-sourced (Mechanical Turk) classification of Tweets (Bowman,
2015), and complements efforts to identify social impact through
comparisons with expert evaluations of articles (using F1000
recommendations) (Bornmann, 2014b, 2015).
The first step is to identify Twitter users who have shared a research
article in the recent past. Users can be identified using data from
Twitter itself, either through the Twitter API or through any
third-party data reseller, including altmetrics providers. In the pilot study, the
screen-names of 6,397 Twitter users who either Tweeted or Retweeted a
link to an article published on the [SciELO Brazil](http://scielo.br)
platform over the 2013 calendar year were extracted from data provided
Second, a Twitter bot (a program that automatically posts to and
interacts with Twitter) must be set up to conduct the actual survey. A
short research question (within Twitter’s 140-character limit) can then be asked, and
directed at the identified users by ‘mentioning’ the user’s screen name
along with the question. Mentions are “any Twitter update [Tweet] that
contains ‘@username’ anywhere in the body of the Tweet” (Twitter, 2015,
n.p.). These mentions cause the Tweet to appear in the “mentions” and
“notifications” tabs of the person mentioned, alerting them to the Tweet
(Twitter, 2015). In the pilot study, an account with the screen name
@AcademicOrNot was set up, identifying itself as a research account
owned by me (Twitter screen name @juancommander). It Tweeted at the
6,397 screen names over a period of 11 days, starting on March 21, 2015.
Because the identified accounts had Tweeted a link to an article from
SciELO Brazil, which publishes primarily Portuguese articles, the survey
was conducted in Portuguese, regardless of the language of the
downloaded article. Every Tweet sent contained text that translates to
“This is a survey by @juancommander, please help by responding: Are you
affiliated with a university? Thank you @_username_!”.
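The composition of such a mention can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the pilot: the template is the English translation given above (the pilot's Portuguese wording is not reproduced here), the function and constant names are hypothetical, and the step that actually posts the text through the Twitter API is deliberately left out.

```python
# Illustrative sketch only; names and template are assumptions, not the pilot's code.
TEMPLATE = (
    "This is a survey by @juancommander, please help by responding: "
    "Are you affiliated with a university? Thank you @{screen_name}!"
)
MAX_TWEET_LENGTH = 140  # Twitter's character limit at the time of the pilot


def compose_survey_tweet(screen_name: str, template: str = TEMPLATE) -> str:
    """Return the survey text addressed to one user, or raise if it is too long."""
    # Accept screen names with or without a leading "@".
    text = template.format(screen_name=screen_name.lstrip("@"))
    if len(text) > MAX_TWEET_LENGTH:
        raise ValueError(
            f"Tweet is {len(text)} characters; the limit is {MAX_TWEET_LENGTH}"
        )
    return text


if __name__ == "__main__":
    print(compose_survey_tweet("example_user"))
```

Checking the length before sending matters because the mention itself consumes part of the character budget, so a question that fits for one screen name may not fit for a longer one.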
Third, responses to the questions must be collected and interpreted.
Responses in the pilot study were collected until April 30, 2015,
although most arrived within a week of the question being sent. In
total, 286 responses were received and coded manually. Of
these, 102 (36%) reported not being affiliated with a university at all,
with the remaining 184 (64%) having some affiliation.
Although responses in the pilot were coded manually, questions could be
formulated so that coding could easily be automated. Simple
Yes or No responses, for example, could be solicited. Alternatively,
machine learning or crowd-sourced interpretation of the responses could
be used.
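As a minimal sketch of what automated coding of Yes or No replies might look like, the Python function below applies a simple keyword rule. It is an illustration under stated assumptions, not the procedure used in the pilot (where coding was manual): the word lists are hypothetical, cover only the Portuguese "sim" and "não", and anything ambiguous is left for a human coder.

```python
import re
import unicodedata

# Hypothetical word lists; a real survey would need a tested vocabulary.
YES_WORDS = {"sim"}
NO_WORDS = {"nao"}  # "não" after accents are stripped


def strip_accents(text: str) -> str:
    """Remove diacritics so that 'não' and 'nao' are treated alike."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def code_response(reply: str) -> str:
    """Code a reply as 'yes', 'no', or 'unclear' using keyword matching."""
    without_mentions = re.sub(r"@\w+", " ", reply)  # drop @screen_name mentions
    words = set(re.findall(r"[a-z]+", strip_accents(without_mentions.lower())))
    has_yes = bool(words & YES_WORDS)
    has_no = bool(words & NO_WORDS)
    if has_yes and not has_no:
        return "yes"
    if has_no and not has_yes:
        return "no"
    return "unclear"  # left for manual coding


if __name__ == "__main__":
    print(code_response("@AcademicOrNot Sim, sou professor"))
```

Replies that contain both words, or neither, are returned as "unclear" and not guessed at, which keeps the automated step conservative.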
Finally, surveys need not be restricted to single questions. Follow-up
questions can be asked, including built-in logic to react to the
received response. In the pilot, users who responded to the survey
question in the affirmative were asked a single follow-up question:
“Are you a student, or do you work for the university?”. Those that
responded in the negative were asked “What line of work are you in?”
Responses were saved and coded as “Not affiliated”, “Affiliated –
Student” (23%), “Affiliated – Faculty/Staff” (24%), or “Affiliated –
Unspecified/Organization” (16%). The last category was used when an
initial response in the affirmative was received but no response was
received to the follow-up question, or in the few instances where the
account belonged to an academic organization.
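The branching described above can be sketched as a small lookup plus a categorization rule. The Python below is illustrative only: the question texts and category labels are those reported for the pilot, but the answer codes ("yes", "no", "student", "works") and the function names are assumptions introduced here, and in the pilot the coding itself was done by hand.

```python
# Category labels as reported for the pilot.
NOT_AFFILIATED = "Not affiliated"
STUDENT = "Affiliated – Student"
FACULTY_STAFF = "Affiliated – Faculty/Staff"
UNSPECIFIED = "Affiliated – Unspecified/Organization"

# Follow-up question keyed by the (hypothetical) code of the first answer.
FOLLOW_UPS = {
    "yes": "Are you a student, or do you work for the university?",
    "no": "What line of work are you in?",
}


def follow_up_question(first_answer):
    """Return the follow-up for a coded first answer, or None if it was unclear."""
    return FOLLOW_UPS.get(first_answer)


def final_category(first_answer, follow_up_answer=None):
    """Map coded answers to a category; None means the reply could not be coded."""
    if first_answer == "no":
        return NOT_AFFILIATED
    if first_answer == "yes":
        if follow_up_answer == "student":
            return STUDENT
        if follow_up_answer == "works":
            return FACULTY_STAFF
        # Affirmative first answer, but no usable follow-up answer.
        return UNSPECIFIED
    return None


if __name__ == "__main__":
    print(follow_up_question("yes"))
    print(final_category("yes", "student"))
```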
The obvious strength of this approach is that it uses the altmetrics
data to identify the target population, and automatically reaches out to
as many of those individuals as is desired. By having the source data
(i.e., the original Tweet) it is possible to link responses to the
critical incident (the sharing of the article). This allows responses to
be associated with the Tweeted article and all of its metadata (i.e.,
title, subject, etc.) as well as the time, date, and possibly
geolocation of the event.
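As a rough sketch of this linkage, the Python below joins survey responses to the original Tweets by screen name. The record types and field names are hypothetical and stand in for whatever the source data actually provides; the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SharedArticleTweet:
    # Hypothetical fields describing the critical incident.
    screen_name: str
    article_title: str
    tweeted_at: str  # ISO 8601 timestamp


@dataclass
class SurveyResponse:
    screen_name: str
    category: str


def link_responses(tweets, responses):
    """Pair each response with every article Tweet sent by the same user."""
    tweets_by_user = {}
    for tweet in tweets:
        # Screen names are compared case-insensitively.
        tweets_by_user.setdefault(tweet.screen_name.lower(), []).append(tweet)
    return [
        (response, tweet)
        for response in responses
        for tweet in tweets_by_user.get(response.screen_name.lower(), [])
    ]


if __name__ == "__main__":
    tweets = [SharedArticleTweet("Example_User", "An article", "2013-05-01T12:00:00Z")]
    responses = [SurveyResponse("example_user", "Not affiliated")]
    for response, tweet in link_responses(tweets, responses):
        print(response.category, "|", tweet.article_title, "|", tweet.tweeted_at)
```

A user who shared several articles yields several pairs, so each response can be analyzed against every article that user Tweeted.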
The main limitation of this approach was the low response rate. Of the
6,397 accounts that had shared an article from SciELO Brazil,
approximately 5% were no longer active, leaving 6,093 messages
successfully sent. Of these recipients, only 286 responded, corresponding to a 5%
response rate. This low response rate is perhaps not surprising, given
that messages were unsolicited, sent a long time after the critical
incident, over a medium where this type of message is uncommon, and, in
some cases, in a language that the recipient was not familiar with. A
follow-up study correcting some of the conditions of the pilot is
needed to see if response rates improve.
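The arithmetic behind these figures can be checked directly. The short Python sketch below uses only the counts reported in this paper; the variable names are illustrative.

```python
# Counts reported for the pilot study.
identified = 6397     # accounts that had shared a SciELO Brazil article
delivered = 6093      # messages successfully sent
responded = 286       # responses received
not_affiliated = 102  # respondents reporting no university affiliation

inactive_share = (identified - delivered) / identified
response_rate = responded / delivered
not_affiliated_share = not_affiliated / responded
affiliated_share = (responded - not_affiliated) / responded

print(f"Inactive accounts: {inactive_share:.1%}")        # 4.8%
print(f"Response rate:     {response_rate:.1%}")         # 4.7%
print(f"Not affiliated:    {not_affiliated_share:.1%}")  # 35.7%
print(f"Affiliated:        {affiliated_share:.1%}")      # 64.3%
```

Rounded to whole percentages, these match the approximately 5% inactive accounts, the 5% response rate, and the 36% and 64% split reported above.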
The description and results of the pilot study have been previously
published in the author’s dissertation (Alperin, 2015).
Alperin, J. P. (2013). Ask not what altmetrics can do for you, but what
altmetrics can do for developing countries. *Bulletin of the American
Society for Information Science and Technology*, *39*(4), 18–21.
Alperin, J. P. (2015). *The public impact of Latin America’s approach to
Bornmann, L. (2014a). Do altmetrics point to the broader impact of
research? An overview of benefits and disadvantages of altmetrics.
*Journal of Informetrics*, *8*(4), 895–903.
Bornmann, L. (2014b). Validity of altmetrics data for measuring societal
impact: A study using data from Altmetric and F1000Prime. *Journal of
Informetrics*, *8*(4), 935–950.
Bornmann, L. (2015). Usefulness of altmetrics for measuring the broader
impact of research. *Aslib Journal of Information Management*, *67*(3),
Bowman, T. D. (2015). Differences in personal and professional tweets of
scholars. *Aslib Journal of Information Management*, *67*(3), 356–371.
Friedrich, N., Bowman, T. D., Stock, W. G., & Haustein, S. (2015).
Adapting sentiment analysis for tweets linking to scientific papers.
Retrieved from <http://arxiv.org/abs/1507.01967>
Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). Altmetrics:
A manifesto. Retrieved from