There is a growing number of data sources relating to individual articles available for analysis. By using a common data element such as a DOI, these data resources can be joined together into a single dataset, which can be analysed using packages such as STATA and Excel. These data sources can include internal production information and external usage data, as supplied by a growing number of publishers.
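As a rough illustration of joining on a DOI, here is a minimal Python sketch. The field names (`doi`, `issue`, `special_issue`, `citations`), the function names and the sample records are all hypothetical, invented for this example rather than taken from any real project. DOIs are case-insensitive, so the sketch lower-cases them and strips common resolver prefixes before matching.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip common prefixes so equivalent forms match."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def join_on_doi(production_rows, citation_rows):
    """Left-join citation records onto production records, keyed on DOI."""
    citations_by_doi = {normalise_doi(r["doi"]): r for r in citation_rows}
    joined = []
    for row in production_rows:
        match = citations_by_doi.get(normalise_doi(row["doi"]), {})
        merged = dict(row)
        # None marks an article with no matching citation record.
        merged["citations"] = match.get("citations")
        joined.append(merged)
    return joined


# Invented sample records (assumption: one row per article in each source).
production = [
    {"doi": "10.1000/example.1", "issue": 1, "special_issue": False},
    {"doi": "10.1000/EXAMPLE.2", "issue": 2, "special_issue": True},
]
citations = [
    {"doi": "https://doi.org/10.1000/example.2", "citations": 12},
]

combined = join_on_doi(production, citations)
```

A left join is used here so that articles without citation data stay in the combined dataset instead of silently dropping out; whether that suits a given analysis is a judgement call.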

For instance…

This is an example of the output from a recent analytical project. The main objective of the study was to take the metadata generated during the client’s production process and link it to citation data. The integrated dataset could then be investigated using statistical tools such as STATA.

Citation analysis reveals varying performance of special themed issues

The graph shown here represents the varying number of citations received by individual journal issues since 2007, with the oldest issues on the left of the graphic.

The red line represents the mean citations generated by each issue. Reading from left to right, that is from older to newer issues, these decline in a linear fashion. That is to say, both the mean and the variance of issue citation rates increase with age. However, whilst the mean increases in a linear fashion, the variance increases dramatically after about 50 issues.
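One simple way to examine how the mean and variance change with age is to group issues into bands and summarise each band. The sketch below uses Python's standard `statistics` module. The function name, the band size and the citation counts are invented for illustration; this is not the method or the data used in the study.

```python
from statistics import mean, pvariance


def summarise_by_age_band(issue_citations, band_size):
    """Summarise citation counts in consecutive bands of issues.

    issue_citations: citation counts per issue, oldest issue first.
    Returns a list of (start_index, mean, population_variance) tuples.
    """
    summary = []
    for start in range(0, len(issue_citations), band_size):
        band = issue_citations[start:start + band_size]
        summary.append((start, mean(band), pvariance(band)))
    return summary


# Invented counts, oldest issue first (assumption: one count per issue).
counts = [30, 10, 22, 18, 9, 11, 4, 6]
bands = summarise_by_age_band(counts, band_size=4)
```

With these made-up numbers the older band has both a higher mean and a higher variance than the newer one, which is the pattern described above.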

The special issues are labelled with blue spots. It is clear that the success of these initiatives has been quite variable. In fact, the highest peaks are generated by specially commissioned themed issues, whereas the smaller ones are associated with issues based around conference proceedings.


Editors can put a lot of effort into creating special journal issues. Analysis can provide evidence of their effect on usage and journal impact.

Current analysis projects focus on combining different types of metric and assessing their predictive value. We are also studying the growth of PLoS ONE and its clones.