There is an increasing amount of data about STM publishing freely available on the web. Publisher/imprint/journal/ISSN relationships as developed for Elsevier’s Scopus database can be found here. Publicly accessible resources derived from Scopus and Thomson Reuters’ Web of Science can be viewed at Scimago and Eigenfactor. Scimago is especially useful because it contains journal statistics, such as the number of documents published, which can be downloaded as a CSV file. Elsewhere you can find information about journal pricing.

More detailed document-level data can be downloaded from PubMed. PubMed is obviously far more restricted in its field coverage, but it does contain information about article accessibility, which is useful for tracking the penetration of different open access business models.


And if you want to know more about the value of article-level metrics, then head for PLoS.

Most of these data resources can be zipped together, using the journal title, ISSN and document identifiers (DOIs) to link specific records.
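As a minimal sketch of what linking on ISSN might look like, the Python below normalises ISSNs (sources differ on whether they include the hyphen) and joins two sets of records on the result. The function names, field names and sample values are all hypothetical, made up for illustration; they are not taken from any of the sources mentioned above.

```python
import re

def normalize_issn(raw):
    """Return an ISSN as 'NNNN-NNNC', or None if it does not look like one."""
    # Keep only digits and the 'X' check character, then re-insert the hyphen.
    chars = re.sub(r"[^0-9Xx]", "", raw or "").upper()
    if len(chars) != 8:
        return None
    return chars[:4] + "-" + chars[4:]

def join_on_issn(left_rows, right_rows, left_key="issn", right_key="issn"):
    """Inner-join two lists of dicts on their normalised ISSN."""
    index = {}
    for row in right_rows:
        key = normalize_issn(row.get(right_key))
        if key:
            index[key] = row
    joined = []
    for row in left_rows:
        key = normalize_issn(row.get(left_key))
        if key in index:
            merged = dict(index[key])   # fields from the right-hand source
            merged.update(row)          # left-hand fields win on a name clash
            merged["issn"] = key
            joined.append(merged)
    return joined

# Hypothetical records: one source writes ISSNs without a hyphen.
volumes = [{"issn": "12345678", "title": "Journal A", "docs": 120}]
prices = [{"issn": "1234-5678", "price": 900},
          {"issn": "8765-4321", "price": 100}]
linked = join_on_issn(volumes, prices)
```

Journal titles are a much weaker key than ISSNs or DOIs, since spelling and abbreviation vary between sources, so a title match is probably best kept as a fallback that is checked by hand.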

The two panels alongside demonstrate the sorts of things that can be achieved. I have grouped individual journal publication volumes across publishers and subject areas, and calculated growth rates and citation ratios. The database takes a few hours to set up and debug, but then it is easy to formulate complex queries in a few minutes. For example, how do the citation impact and publication volume of a large society publisher depend on its output of Proceedings journals?
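The grouping step can be sketched in a few lines of Python. The post does not define its growth rate or citation ratio, so this sketch assumes a compound annual growth rate between two years and citations per document in the later year; the record layout and the function name `publisher_summary` are likewise hypothetical.

```python
from collections import defaultdict

def publisher_summary(journals, first_year, last_year):
    """Aggregate journal records by publisher.

    Each record is assumed to look like:
    {"publisher": str, "docs": {year: count}, "cites": int}
    """
    totals = defaultdict(lambda: {"docs_first": 0, "docs_last": 0, "cites": 0})
    for journal in journals:
        t = totals[journal["publisher"]]
        t["docs_first"] += journal["docs"][first_year]
        t["docs_last"] += journal["docs"][last_year]
        t["cites"] += journal["cites"]

    years = last_year - first_year
    summary = {}
    for publisher, t in totals.items():
        growth = None
        if t["docs_first"] > 0 and years > 0:
            # Assumed definition: compound annual growth rate of output.
            growth = (t["docs_last"] / t["docs_first"]) ** (1 / years) - 1
        # Assumed definition: citations per document published in last_year.
        ratio = t["cites"] / t["docs_last"] if t["docs_last"] else None
        summary[publisher] = {
            "docs": t["docs_last"],
            "annual_growth": growth,
            "cites_per_doc": ratio,
        }
    return summary

# Hypothetical data for a single made-up publisher.
sample = [
    {"publisher": "Publisher P", "docs": {2008: 60, 2010: 71}, "cites": 142},
    {"publisher": "Publisher P", "docs": {2008: 40, 2010: 50}, "cites": 100},
]
result = publisher_summary(sample, 2008, 2010)
```

Grouping by subject area as well would only need the dictionary key changed to a (publisher, subject) pair.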

The example here is taken from the social sciences. It maps publishers by overall size, growth and citation impact, and provides a quick reminder of who the major players in this journal ecosystem are.

The size of the circles in the upper panel correlates with the Scimago Journal Rank (SJR) scores (an algorithm similar to Google’s PageRank), which are presented in more detail (median, 95th percentiles, outliers) in the box plots below. It would be a straightforward matter to drill down to the journal level and to chart changes over time (a topic for a future post).
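For readers who want the numbers behind a box plot like this, here is a rough sketch using Python's standard `statistics` module. It assumes the whiskers sit at the 5th and 95th percentiles and that anything outside them counts as an outlier; the post does not spell out its exact convention, and `box_summary` is a hypothetical helper name.

```python
import statistics

def box_summary(scores):
    """Summarise one publisher's journal scores for a box plot.

    Assumption: whiskers at the 5th and 95th percentiles, with every
    value outside that range reported as an outlier.
    """
    # n=20 splits the data into 5% steps: the first cut point is the
    # 5th percentile and the last is the 95th.
    cuts = statistics.quantiles(scores, n=20, method="inclusive")
    low, high = cuts[0], cuts[-1]
    return {
        "median": statistics.median(scores),
        "p5": low,
        "p95": high,
        "outliers": sorted(s for s in scores if s < low or s > high),
    }

# Made-up scores, just to exercise the function.
summary = box_summary(list(range(1, 102)))
```

Running this once per publisher gives one box per publisher.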

This picture highlights the wealth of acquisition targets in the discipline (there are many more slightly smaller companies not shown in this view), all of which may shortly be challenged with implementing new open access options for their authors. Alternatively, there is a great opportunity for a PLoS ONE clone here, though the experience of SAGE Open seems to suggest that the field still feels uncomfortable with open access ethics and may well ultimately choose to take the green route as the lesser of two evils. This will change…