What impact? Whose value? Citation metrics in a workflow perspective

James Pringle
Serials – 20(3), November 2007


Introduction
I will begin, in the Shakespearean spirit introduced by the title of this colloquium, by amending Mark Antony:

Friends, publishers, information professionals, lend me your ears;
It appears that we come to bury the journal impact factor, not to praise it.
Many scholarly articles and blogs have told you that the JIF is flawed.
If so, it is a grievous fault,
And grievously has the JIF answered for it …

In fact, the JIF does not cease each day to answer for its crimes, and we cannot read far in the literature without encountering a new criticism, a new proposal, a new alternative metric, a new funeral for the JIF. So, my talk today could be a eulogy, though I think we are finding that reports of the death of the journal impact factor are greatly exaggerated. In fact, one of the first things we need to address is why, given all this talk, the JIF remains so important. Why, year after year, each summer, about solstice time, does it rise from the ashes, take wing, and flourish again?
But our task today involves not only a single metric, but the much broader question of what the role of citation metrics, and other metrics, is in the world of electronic publishing. To answer this, we will need to investigate the nature of the 21st-century research community, about which we may have misconceptions that can blind us to the real issues we face. With a clearer understanding, we can better appreciate what I would call the 'new search for value' that is calling forth meetings like this one all over the world. And lastly, we can begin to consider what it would mean to construct true metrics for decision, which at Thomson Scientific is a question we place within a work-flow perspective.
It is strange that those who approach the JIF in the role of executioners often sing its praises in curious ways. There is much ambiguity in our attitudes towards this curious number. Indeed, it is not uncommon to see this ambivalence on display in a journal that adopts a critical stance, yet proudly shows the plumage of each year's newly risen JIF on its website.1 Articles and editorials sometimes reveal the same ambivalence. One recent editorial cried out against 'deliberate editorial practices and manipulation' supposed to account for rising JIFs among medical journals.2 Yet the article to which the editorial referred, a scholarly study of seven medical journals over eleven years, concluded that these 'deliberate editorial practices' included 'active recruitment of high-impact articles' by 'courting researchers' and 'hiring editorial staff', 'improving services to authors', 'boosting the journal's media profile', and 'careful article selection based on the quality of papers'.3 That is to say, by focusing on building a high-quality journal. This is hardly a nefarious practice.
I recently travelled to many of our customer sites in continental Europe – customers who use citation data for many purposes, including both scientific discovery and evaluation of researchers at their institutions. As always, I discussed issues concerning citation metrics with them, and posed a question I often do: Should we change the impact factor?
Their comments surprised me, because I am used to a mixed response. But they were adamant – don't change it. They made two strong arguments: first, because Thomson Scientific has produced this metric with a consistent set of data for so long, it has a unique value for analysis over time that no other measure can offer; and, second, in countries that are trying to upgrade their scientific publishing practices, the JIF gives researchers a publishing goal that encourages them to aim high, and thus improve the quality of their efforts.
As we toss around discussion of 'flaws' that may affect some minor percentage of comparative results, these fundamentals should also be kept in mind. But we have a broader task today – to look at both old and new measures of value and how they should affect electronic journal publishing.

Value in the 21st-century research community
What is going on when a young researcher looks at a JIF before choosing a journal for submission of a new article, or an administrator uses a JIF as shorthand for research evaluation, or a journal posts a high JIF while publishing articles of criticism about it?
The power of citation metrics is easy to understand. Citations, and thus analyses derived from them, are at the heart of scholarly research, the linchpin of understanding the scholarly research community. Citations are the most fundamental example of community-created content. They are built by a community with well-defined characteristics: a global community of scholarly authors who are engaged in a formal communication process. They have a complex sociology, it is true, but in comparison to something like usage, or social tagging, or peer review, the basics of this sociology are by now well understood.
Yet why is it under so much discussion, and why do alternative metrics like 'usage factors' evoke such a visceral feeling of hope and promise? It is because we are in the midst of a crisis in our understanding of value. And this crisis goes well beyond any particular measure we apply.
To grapple with 'value' in the scholarly research community, and how it is created, we must go well beyond any discussion of the adequacy or inadequacy of any particular set of measures. Instead, we should focus on what is occurring within the research community that we all serve. We will not get this from a single, simple survey, but rather from an appreciation of long-term and short-term trends.
Many of us misperceive the nature of the 21st-century research community. We secretly retain a vision of the scientific author inherited from the 18th and 19th centuries, which we Philadelphians can see finely illustrated in Thomas Eakins' painting The Gross Clinic: a Romantic view of the scientist drawing forth a new truth of nature heretofore unknown in a single heroic encounter with nature. An article is a record of the successful outcome of this single encounter.
This image is remarkably pervasive. We see vestiges of it in metrics like the 'h-index', which appears to enable us to compare one researcher's work directly with that of another.4 Those who apply this new metric implicitly assume that the units of comparison – the articles in a publication list – are units each produced by a single researcher, heroically writing up new truths and presenting these to the waiting scholarly audience.
Our journals seem to be struggling to escape a vision as inheritors of that heroic authorship model. The collection of research articles that a journal holds incarnates the formal communications of these authors, and journal metrics are bound up with this concept.
Today's academic community looks very different. An author's publication list is an immensely complex document, a sociological record of a career built in ways that our 19th-century predecessors could never have imagined. Today's academic is:
■ communicative at ever higher velocities
■ collaborative, as a fundamental research practice, and
■ competitive, at many levels.

Communication
Research today, first of all, moves at a faster pace, using more communications channels than ever before, and proceeds simultaneously on a global scale. William R Brody, President of The Johns Hopkins University, recently called this a result of 'IT/IT': cheap international travel and information technology.5 As he notes, 'Today, knowledge is disseminated in seconds …' and he further notes that 'expertise is now measured on a global rather than local scale'. In practice, this means that a researcher who presented a conference paper last week in Madrid finds that paper is now being discussed in Taipei thanks to an e-mail, listserv, or conference posting, and it will next week be referenced at a meeting in Beijing. This is not particularly a 'Web 2.0' phenomenon. It is simply a result of the technologies of information-sharing honed by scientists and scholars since the initial days of the Internet.
Because of communications velocity, scholarly reputations grow faster, and grow simultaneously around the world. Reputation is not based only on dissemination of articles, but on a whole set of communications activities. As reputations grow, a new culture of expertise is created, in which a researcher sits at the center of multiple scholarly networks and communications channels – a sort of 'Martha Stewart' of a small multimedia empire consisting of the textual, data and visual representations of his or her work that are moving around the world at the speed of light.
The research article, as published in a scholarly journal, is less and less at the forefront of this communication chain, and increasingly at the end point – the researcher's reputation grows faster than his or her publication list, and is based on a multiplicity of new types of documents and databases in addition to published articles. The types of documents stored in institutional repositories provide one indicator of the importance of new types of materials: though repositories are still largely composed of scholarly articles, databases, video clips, and other types are now growing among the mix.6
From the university's point of view, Brody notes a 'Michael Jordan' effect, in which universities bid higher for the top researchers who can make the institution grow. And increasingly, universities, as centers of the knowledge economy that is fueling economic growth, are willing to bid.

Collaboration
These trends might seem to bode well for the heroic scientist, seemingly reinvented for the 21st century as a scholarly entrepreneur. But today's researcher is not acting as an individual. Increasingly, it is his or her place in the social network that creates importance. An American researcher has built a genomic database, an Australian researcher has developed a new analytical technique, word spreads at an Italian conference, and a collaboration is born, enhancing the reputation of all participants. International collaboration is increasingly the foundation of scholarship.
This trend leaves a mark where we can all see it – in the research and corresponding addresses of papers. Scientists and other scholars have always been collaborators. What is surprising is the nature of the collaborations taking place today. Seen over a 25-year period, the growth is steady and inexorable, both in the average number of authors per paper and in the average number of countries represented. (See Figures 1 and 2.)
Within the new world of 21st-century scholarship, new sets of questions about value emerge:
1. Who of the many researchers listed on this paper is the next 'Michael Jordan'?
2. Is not the person who built the database underlying the research as deserving of credit as the person who wrote up the research?
3. Is not this database a valuable research product in its own right, and perhaps more valuable in the real network of scholarly communications today than the article that incarnates the conclusions of a point-in-time discovery made using it?
4. What portion of the scholarly network of value has actually been captured by the journal, and what is thus subject to any 'journal-level' metric?

Competition
Within the research community, such issues are important because they frequently involve large sums of money. Certainly, availability of funding varies around the world, but in many fields it appears that increasingly large sums within a fixed pool are going to fewer successful candidates, who must jump through more hoops to get them. Recent reports from the National Institutes of Health clearly illustrate this trend.

The new search for value
Having looked at the changing nature of the research community, we can focus more clearly on the search for value, and ways to assess it, that are underway among the various stakeholders. We have already looked at one of these stakeholders: the researcher building a career. Let us now look at three others: the funding community, the library community and the publishing community.

Decision support for funding
Funders – whether grant funding agencies or university administrators charged with managing scarce university resources – are today intensely outcomes driven. They must understand whether they are getting value for money. To do so, they are faced with the need to determine whether research has resulted in tangible outcomes – articles, patents, impacts on the health and welfare of society, scientific advances – and they seek measures to assist them. Not all these measures are neatly quantitative. The Wellcome Trust, to take one example, has developed sets of illustrative stories and histories that showcase how research money has been used to good effect.8 But quantitative measures are at the heart of their search.
Here, abuses of the JIF abound, but so do good uses of the JIF and other citation metrics. For example, I have seen one agency that divides JIFs by discipline into quartiles using a statistical technique called 'box-plots', and then uses the percentage of papers published by agency researchers within each quartile as a measure that can be compared across research centers and across years. Such an approach minimizes the effects of skewness in JIF distribution, adheres rigorously to the recognition that JIFs vary across disciplines, and ignores minor JIF variations that have little meaning. It also avoids the assumption that because a paper is in a high-JIF journal, it is highly cited. Instead, this approach simply signals that papers from an agency have achieved publication in well-recognized journals of global significance.9
Overall, this is an arena where there are few good guidelines and many experiments. It is the frontier of the metrics world, and at the heart of the changing nature of value in the community. Instead of condemning abuses occurring in this world, we need to engage it directly, because this is where keen observers of the changing research community are grappling most desperately to understand the changing nature of scholarly value.
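The agency's quartile approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the agency's actual procedure: the journal names, disciplines and JIF values are invented, the function names `jif_quartiles` and `quartile_shares` are hypothetical, and the quartile boundaries are taken from the standard library's `statistics.quantiles`, which yields the kind of cut points a box plot displays.

```python
from collections import Counter
from statistics import quantiles


def jif_quartiles(journals):
    """Return {journal: quartile}, where quartile 1 holds the highest JIFs.

    journals: list of (journal, discipline, jif) tuples (invented data).
    Quartile boundaries are computed separately inside each discipline,
    so a JIF is only ever compared with JIFs from the same field.
    Each discipline needs at least two journals.
    """
    by_discipline = {}
    for name, discipline, jif in journals:
        by_discipline.setdefault(discipline, []).append((name, jif))

    quartile_of = {}
    for members in by_discipline.values():
        q1, median, q3 = quantiles(
            [jif for _, jif in members], n=4, method="inclusive"
        )
        for name, jif in members:
            if jif >= q3:
                quartile_of[name] = 1
            elif jif >= median:
                quartile_of[name] = 2
            elif jif >= q1:
                quartile_of[name] = 3
            else:
                quartile_of[name] = 4
    return quartile_of


def quartile_shares(paper_journals, quartile_of):
    """Percentage of papers whose journal falls in each JIF quartile.

    paper_journals: one journal name per published paper (must be non-empty).
    """
    counts = Counter(quartile_of[journal] for journal in paper_journals)
    total = len(paper_journals)
    return {q: 100.0 * counts.get(q, 0) / total for q in (1, 2, 3, 4)}


# Invented example: two disciplines with very different JIF ranges.
journals = [
    ("a1", "A", 1.0), ("a2", "A", 2.0), ("a3", "A", 3.0),
    ("a4", "A", 4.0), ("a5", "A", 5.0),
    ("b1", "B", 10.0), ("b2", "B", 20.0), ("b3", "B", 30.0),
    ("b4", "B", 40.0), ("b5", "B", 50.0),
]
quartile_of = jif_quartiles(journals)
print(quartile_shares(["a5", "a4", "b1", "a3"], quartile_of))
```

In this toy data a JIF of 5.0 lands in the top quartile of discipline A, while a JIF of 10.0 lands in the bottom quartile of discipline B. That is why the ranking is done within each discipline before any percentages are compared across centers or years.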

Decision support in the library community
The library community looks at value from two points of view. The first is value for money where resource decisions are concerned. Since the era of the big deal, using individual journal data to make such decisions has become more difficult, but to the extent that this data can be applied, the kinds of questions that the library community poses, in my experience, tend to be local in nature. Knowing general journal metrics forms a part of the picture, but more basic are the kinds of data that are unique to an institution. The questions tend to be of this type: What journals are being used in my institution? Where are my researchers publishing? What are they citing?
For the library community, usage is local, supporting local purchasing decisions, though these may be executed through consortial arrangements. For this reason, 'small-world analysis' is especially interesting as a tool. Herbert Van de Sompel and colleagues conducted a study last year using log files at Cal State, in which they constructed a 'usage impact factor' that mimicked the calculation of the JIF. They found that differences between the two factors for a test set of journals were closely linked to the characteristics of the user community, which included a large group of undergraduates and graduate students as well as faculty. Variations were in most cases closely correlated to the ratios of undergraduates to graduate students and faculty at the disciplinary level.10 This is actionable information – the knowledge that the characteristics of my institution should determine what resources are needed. Though it uses similar approaches to the construction of a JIF, it uses them to considerably different effect.
The second search for value in the library community involves supporting the new tasks undertaken by the research administrator. The library is increasingly called upon to assist the academic administration as researchers prepare publication lists for evaluations and administrators attempt to answer new questions about research and its outcomes as these affect the institution's reputation and development. I have met a number of library administrators who have transformed themselves into bibliometricians in a search to assist a provost or research administrator.
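To make the parallel with the JIF concrete: the standard two-year JIF divides the citations received in a given year by items published in the two preceding years by the number of citable items published in those two years. The sketch below applies the same arithmetic to any event count, whether citations or full-text requests taken from local log files. It is an illustration only. The function name `two_year_factor` and all figures are invented, and it is an assumption, not something stated in the study, that its 'usage impact factor' used exactly this two-year window.

```python
def two_year_factor(census_year, events_by_pub_year, items_by_pub_year):
    """Events in census_year to items from the two prior years, per item.

    events_by_pub_year: {publication_year: events recorded during
        census_year that point at items published in publication_year}.
        The events may be citations (JIF-style) or full-text requests
        from local logs (a usage analogue).
    items_by_pub_year: {publication_year: number of items published}.
    """
    window = (census_year - 1, census_year - 2)
    events = sum(events_by_pub_year.get(year, 0) for year in window)
    items = sum(items_by_pub_year.get(year, 0) for year in window)
    if items == 0:
        raise ValueError("no items published in the two-year window")
    return events / items


# Invented figures for a single journal, census year 2006.
items = {2005: 120, 2004: 130}
citations_2006 = {2005: 300, 2004: 200}      # global citation counts
requests_2006 = {2005: 4000, 2004: 1000}     # one institution's log files

print(two_year_factor(2006, citations_2006, items))  # citation-based factor
print(two_year_factor(2006, requests_2006, items))   # usage-based factor
```

The arithmetic is identical in both calls. What differs is the population behind the numerator: a global community of citing authors in one case, and the readers of a single institution in the other, which is what makes the comparison informative for local decisions.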

Publishers: measuring the value of a journal
As more scholarly communication passes to other forums, and the weight of the importance of an article passes increasingly from communications to validation, publishers face some very new challenges. Everywhere around us they are testing solutions, such as new types of interactive environments and support for related databases and the media attached to the article.
We should expect that these new and varied experiments will yield new measures of value. Many of these measures are derivatives of the JIF. Approaches like the 'Bangkok impact factor', for example, weight the JIF with the cited half-life to correct for the velocity of citations in different scholarly disciplines.11 Approaches like the 'Eigenfactor' apply sophisticated mathematical techniques to the journal citing/cited tables from the Journal Citation Reports to attach importance to the fact that a journal which cites another journal is, in turn, highly cited.12
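The idea of weighting a citation by the standing of the journal that makes it can be illustrated with a small power iteration over a citing/cited table. The sketch below is a generic PageRank-style calculation on invented data, not the Eigenfactor algorithm itself, whose published method includes further steps that are not reproduced here. The function name `influence_scores`, the damping value and the four-journal table are all assumptions made for illustration.

```python
def influence_scores(cites, damping=0.85, iterations=200):
    """PageRank-style scores for a square citing/cited table.

    cites[i][j] is the number of citations from journal i to journal j.
    Each journal passes its current score to the journals it cites, in
    proportion to its citations; a journal that cites nothing spreads
    its score evenly. Scores always sum to 1.
    """
    n = len(cites)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [(1.0 - damping) / n] * n
        for i, row in enumerate(cites):
            outgoing = sum(row)
            for j in range(n):
                share = row[j] / outgoing if outgoing else 1.0 / n
                updated[j] += damping * scores[i] * share
        scores = updated
    return scores


# Invented table for journals A, B, C, D (rows cite, columns are cited).
# A and D each receive exactly 10 citations, but A's come from B, which
# is itself well cited, while D's come from C, which nobody cites.
cites = [
    [0, 10, 0, 0],   # A cites B
    [10, 0, 0, 0],   # B cites A
    [0, 0, 0, 10],   # C cites D
    [0, 0, 0, 0],    # D cites nothing
]
a, b, c, d = influence_scores(cites)
print(a > d)
```

A raw citation count treats journals A and D as equals. The iterative score separates them, because a citation from a well-cited journal carries more weight than one from a journal that is never cited.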

What should we do?
The right approach to future value metrics will involve accepting the realities of changing scholarly communications and focusing on the needs of the decision-makers who will use these metrics.
Given what we have seen about the changing research community, it is clear that publishers can help by aiming to improve and standardize the ways in which the new objects of evaluation (e.g. databases, images, videos) are cited and linked within their environments, by working towards clear and precise attribution of credit in research teams, and by helping to standardize the presentation of funding data within articles.
Publishers should also focus on the context and work flows within which metrics support decisions. As we have seen, some of the most interesting work on metrics is based on understanding local contexts and the nature of decisions that many administrators need to make. One initiative that Thomson Scientific has undertaken to support the local and specific nature of decisions was a product launched last year called the Journal Use Reports™ (JUR). The JUR contains tools to aggregate COUNTER-compliant usage statistics at the institutional level, to view publication and citation patterns of researchers at individual institutions, and to view journal metrics for the journal set to which an institution currently has access. It focuses on the specific decisions that need to be made at an institution, using local data appropriate to the decisions. The JUR is an example of a 'work-flow'-oriented product. This type of tool is based on the ways in which the right information and metrics can be put into the hands of decision-makers. The JUR includes the JIF, but also other metrics where needed for decision-making. It is for this reason that Thomson Scientific has adopted a 'work-flow' approach to understanding the information needs of the research community. This approach can provide a firm support for our efforts as we determine the best type of metrics needed for decisions about the value of journals, articles, and other outcomes of the scholarly research process in the future.
What, then, of a 'usage factor'? Given all these real needs, it would be an extremely odd and rather unexpected outcome if the best way to use usage statistics for decision turned out to be a metric that sought to replicate the JIF with usage data, rather than a metric that contributed to the kinds of decisions about value that will be needed in the research community. It is to be feared that, instead, all that such an initiative would accomplish would be a new series of debates about 'flaws' and a new source of bragging rights.
Indeed, this debate has already begun. As one commentator recently put it, 'while usage statistics are only slightly useful, their misuse can be enormously damaging'.13
A simple effort to envisage a 'usage factor' as a sort of parallel and equivalent 'counterweight' to the journal impact factor seems to lead us deeper in this direction, into a new round of discussion about flaws, abuses, etc., while the nature of value in a changing research community keeps eluding us. It would be far better to engage the real evaluative issues that confront us.

Figure 1. Growth in average number of authors per research article, 1981-2006. Source: Web of Science®
Figure 2.