The publication of the San Francisco Declaration on Research Assessment provides a welcome point of focus from which to debate the value of metrics that attempt to measure the volume and quality of the contributions made to scientific progress by a country, funding agency, institution or individual researcher.

A major stimulus for this multi-publisher/agency review was the growing animosity of academics and their funders toward Thomson Reuters’ Journal Impact Factor (IF), a statistic based on averaged citations originally designed to help librarians identify journals for purchase, but increasingly used as a quantitative indicator of a journal’s quality and, by implication, of the papers published therein and their authors. In reality, the citation counts of individual articles range over several orders of magnitude (see graphic), so whilst they are de facto a good predictor of the IF, the reverse is not true.

[Graphic: distribution of citation counts for individual Nature articles]
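
To see why a journal-level average says little about individual papers, consider a minimal sketch (the citation counts below are invented purely for illustration): the IF is in essence a mean over a heavily skewed distribution, so a handful of highly cited articles can dominate it while the typical article sits far below.

```python
import statistics

# Invented citation counts for the articles a journal published over a
# two-year window -- heavily skewed, as real citation distributions are.
citations = [0, 0, 1, 1, 1, 2, 2, 3, 4, 5, 8, 12, 40, 150, 600]

# The Journal Impact Factor is essentially this mean: citations received this
# year to items from the previous two years, divided by the number of citable items.
impact_factor = sum(citations) / len(citations)

print(f"Impact factor (mean): {impact_factor:.1f}")             # ~55, driven by two outliers
print(f"Median citations:     {statistics.median(citations)}")  # 3 -- what a typical article receives
```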

The pressure to cut the influence of the IF back down to size comes not because metrics are generally a bad thing, but because, with the growth in scientific production (papers, datasets, etc.), funders, institutions and researchers need some form of objective framework to assess and manage performance. The mistake, perhaps, has been to assume that estimators of quantity can also be used to assess quality, in this case contribution to scientific progress.

Open access publishing models and the digitization of STM content generally have stimulated a growing number of alternative metrics, whose variety and usefulness are championed by organisations such as ImpactStory and Altmetric. These new metrics include downloads, data from social media sites such as Twitter and Facebook, and information from online reference managers such as Mendeley, so there is plenty to choose from. But although the Declaration makes a number of recommendations for improving the way in which the quality of research output is evaluated, it fails to put its finger on exactly what should be measured and how.
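
As a concrete illustration of what such article-level data look like, here is a minimal, hedged sketch of pulling attention counts for a single DOI from Altmetric's public details API; the endpoint and field names follow that API as I understand it but should be checked against the current documentation, and the DOI shown is purely hypothetical.

```python
import requests

def altmetric_summary(doi: str) -> dict:
    """Fetch article-level attention data for one DOI from Altmetric's details API.

    The endpoint and field names are assumptions based on Altmetric's public v1 API
    and should be verified against the current documentation before use.
    """
    resp = requests.get(f"https://api.altmetric.com/v1/doi/{doi}", timeout=10)
    resp.raise_for_status()
    data = resp.json()
    return {
        "altmetric_score": data.get("score"),
        "tweets": data.get("cited_by_tweeters_count", 0),
        "facebook_posts": data.get("cited_by_fbwalls_count", 0),
        "mendeley_readers": (data.get("readers") or {}).get("mendeley", 0),
    }

# Hypothetical example:
# print(altmetric_summary("10.1234/example.doi"))
```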

For example, its advice to funding agencies is to “consider the value and impact of all research outputs (including datasets and software) in addition to research publications, and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice”, but not to use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles. But what are these new impact measures to be?

Let’s start with citation statistics. Surely the number of times an article appears in a reference list provides some indication of the value of the cited article to the scientific community?

Well, the answer may be not to use statistical metrics at all. Citation may be a powerful form of social communication, but it is not an impartial method of scholarly assessment. The distortions in article citation statistics include bias, amplification, and invention. Thus, according to Steven Greenberg, a neurologist who studies the meaning of citation networks, citation often ends up being used to support unfounded claims which can mislead researchers for decades. More to the point, it is possible to trace the evolution of our knowledge about a problem by characterising individual citations according to whether they present original authoritative ideas, or are supportive, critical or unfounded.

Qualitative citation typing ontologies, such as the one just mentioned, could be populated as part of the funding review process and/or as part of a formal program assessment process that reviews the contribution of individual projects. Once captured, the data could be added to, and graphed using, open access bibliometric databases such as PubMed.
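
As a rough sketch of how such typed citations might be recorded and queried, the snippet below builds a small citation graph in which each edge carries a qualitative label rather than contributing only to a raw count; the identifiers and type labels are hypothetical, and the graph library (networkx) is simply one convenient choice.

```python
import networkx as nx

# Directed graph: an edge A -> B means "paper A cites paper B",
# annotated with a qualitative citation type rather than a bare count.
g = nx.DiGraph()

# Hypothetical PubMed-style identifiers and citation types, for illustration only.
g.add_edge("PMID:111", "PMID:100", cite_type="supports")
g.add_edge("PMID:112", "PMID:100", cite_type="disputes")
g.add_edge("PMID:113", "PMID:100", cite_type="unfounded")  # claim not actually backed by the cited work
g.add_edge("PMID:114", "PMID:100", cite_type="supports")

# Raw citation count versus the qualitative breakdown for the same paper.
incoming = g.in_edges("PMID:100", data=True)
print("Total citations:", len(incoming))
breakdown = {}
for _, _, attrs in incoming:
    breakdown[attrs["cite_type"]] = breakdown.get(attrs["cite_type"], 0) + 1
print("By type:", breakdown)
```

Aggregated across a field, a graph like this would distinguish a heavily cited but heavily disputed claim from one whose citations are genuinely supportive.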

By making it clearer what the purpose of the measurement is, we stand a better chance of coming up with new metrics that work.