

MEASURING EXCELLENCE IN RESEARCH AND RESEARCH TRAINING
Canberra, 22 June 2004


Excellence in the humanities, creative arts and media
Professor Iain McCalman


What is excellent research in the humanities and arts?

Innovative or original work that moves a field of knowledge forward, breaks fresh ground, stimulates further developments. It may do this by finding new areas to explore or alternative ways of examining existing problems, for example by discovering new data, interpreting new or existing bodies of knowledge in a new way; by systematically challenging the existing canons or ideologies; by experimenting with materials and ideas in search of new formulations; by questioning methodologies and forming new hypotheses and theories, by testing and implementing models and paradigms; or by acts of creative practice or performance.

Work that makes a substantive contribution to an existing field of study or discipline. It is important to stress that this can and does include scholarship, which is sometimes erroneously denigrated by comparison with what is deemed research. Scholarship is defined in the latest British RAE as ‘the creation, development and maintenance of the intellectual infrastructure of subjects and disciplines, in forms such as dictionaries, scholarly and critical editions, catalogues and contributions to major research databases’.

Work that feeds the development of new technologies or the application of technologies that bring benefit to peoples.

Work that is wealth creating or that represents good value for the money invested.

Excellent research in the humanities and arts may have an immediate and recognisable use-value in terms of wealth generation or social benefit, or it may be ostensibly non-utilitarian. It may be pure, academically driven investigative research, or applied research which is policy driven. It may have an ultimate significance that lies not in its applicability to any given current set of circumstances, but in its enhancement of the cultural or spiritual dimensions of the collective life of peoples. Its significance and effects may be long-term and subtle, contributing to shifts of thought and values, or long-term and speculative, with results that may not reach immediate fruition.

How do we achieve it?

We must identify and reward research of a good standard as well as the best possible research.

Excellent research must be tested and evaluated in the public domain, through publication or other comparable means of dissemination ranging from print, through electronic and digital forms, to performance or exhibition. It needs to be assessed both by national and international peers, according to clear, fair and agreed benchmarks.

If it includes a substantial component of practice or is embedded in creative practice, as in the visual, creative and performing arts (and similar to law, engineering and parts of medicine), it must bring enhancements of knowledge and understanding in the discipline or related disciplines and incorporate a scholarly apparatus and record of research activity that enables other researchers in the discipline or related disciplines to assess its value and significance, and its methods.

We need to foster a research culture or environment that is self-reflexive, accountable and innovative. It should encourage and allow individual scholars to evolve, formulate, publish and disseminate their research both for a national and international scholarly community of specialists and for general publics, as well as research which bridges the gap between scholarly communities and publics.

It should, where relevant, support and encourage team-based and collaborative research, interdisciplinary and multi-disciplinary as well as single discipline research, and cross-institutional as well as single institutional collaborations.

It must attract postgraduate research students and encourage their completions, and attract and foster the work of postdoctoral students and of early career teaching staff. Such an environment must have incentives to improve future research performance for existing staff and improve the supply and demand of researchers. Assessments of research excellence must be dynamic, rewarding both past performance and prospective plans or projects. It must be flexible so as to enable new fields to develop.

How do we measure such excellence?

There are four broad methods which have been variously used around the world to measure research excellence in the humanities and creative and performing arts: peer review, such as in the British RAE system; self-assessment; historical ratings; and algorithmic or quantitative measures. I want to focus particularly on the last because it has been the key system in Australia and because both the British and United States higher education systems are moving more and more in that direction.

Algorithmic assessment.

I am not going to waste much time on the crudely quantitative system currently employed by DEST except to say that by failing to incorporate any real component of qualitative assessment, it is worse than useless in that it skews research practices in directions that are almost uniformly calculated to prevent a healthy and innovative research environment. It rewards mediocrity rather than quality. I will give a few examples from my own research oeuvre.

Quantitative indicators.

One quantitative indicator commonly employed is the size of research grant income, but this is not necessarily a measure of research quality in a humanities and arts context where work is less dependent than science, engineering and technology on factors such as expensive equipment. In the arts and humanities, the number and competitive bases of research grants may be a better indicator than their size.

Another quantitative measure which has gained fairly wide acceptance is the numbers of postgraduate students attracted, and their completion rates, and the numbers of postdoctoral students attracted.

A third quantitative measure which has not been widely developed but which is being tested by various of the Research Councils in Britain and by the RAE assessors is a series of esteem indicators, such as membership of learned societies; invited national and international keynote lectures; editorial boards of scholarly journals; membership of learned boards such as the ARC, government inquiries, etc. These can be adjusted according to the particular shape of disciplines, to include performances, exhibitions, consultancies, etc.

Citation indices.

This system was invented more than 30 years ago by Eugene Garfield, who set up the Institute for Scientific Information, based at Philadelphia in the USA, which has published three citation indices: in science, in arts and humanities, and in social sciences. These are alphabetical lists of scholars whose writings are mentioned in a body of scholarly periodicals. By counting the number of times a particular researcher and their work has been mentioned in this literature you arrive at a number which is interpreted as a proxy of research excellence. A series of studies, especially in the field of library studies, conducted by researchers at Loughborough University, have claimed a close correlation between such indices and other systems of assessment, including peer evaluation. These studies have argued that the correlation holds across a very wide field of subjects and disciplines, including those in the humanities, and even in fields where the specialist groupings are small, the outputs slight and the citations rare. Because the system is claimed to be relatively cheap and objective compared with peer or self assessments, it is recommended as preferable to any other.
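The counting step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Institute for Scientific Information's actual procedure: the article list, the scholar names and the function name `citation_counts` are all invented for the example, and the choice to count a scholar at most once per citing article is an assumption.

```python
from collections import Counter

# Hypothetical data: each periodical article lists the scholars it cites.
# Titles and names are invented for illustration only.
articles = [
    {"title": "Article A", "cites": ["Smith, J.", "Nguyen, T."]},
    {"title": "Article B", "cites": ["Smith, J."]},
    {"title": "Article C", "cites": ["Smith, J.", "Okafor, C.", "Nguyen, T."]},
]


def citation_counts(articles):
    """Count how many articles mention each scholar.

    Assumption: a scholar is counted at most once per citing article,
    however many times that article mentions them.
    """
    counts = Counter()
    for article in articles:
        counts.update(set(article["cites"]))
    return counts


counts = citation_counts(articles)

# Print an alphabetical list of scholars with their raw citation counts.
for scholar in sorted(counts):
    print(scholar, counts[scholar])
```

The resulting number says nothing about why a work was cited or in what kind of publication, which is the sort of crudity the criticisms below are aimed at.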

Criticism of citation indices has come from two quarters: those who find serious fault in the crudity of existing citation methods for any field of research excellence; and those who argue that they have a particular problem with measuring excellence in the humanities and arts, as well as some other sub-fields in other disciplines.


Questions/discussion

Marie Carroll – I certainly agree with your sentiments that we need a combination of methods of assessment, but I was particularly interested in your advocacy of self-review. I was wondering if you had any evidence at all that that is a reliable method of measuring excellence. Do you have any knowledge of whether self-review is actually reliable?

Iain McCalman – Well, it has been tried and it is being tested, both in Canada and also in Britain. It is only reliable, I think, if you can constrain it. That’s the problem. You have to verify it, and that does mean the same problem as peer review, that it uses up resources and time. It has to have very definite benchmarks, as it were, which one assesses oneself against or groups assess themselves against. It certainly wouldn’t work, and I don’t think would be credible, as a sole system.

What I am suggesting is that it might narrow down the very cumbersome processes so that we end up focusing more specifically; the peer review elements are subjected to more intensive analysis and in a less cumbersome manner. But I agree, I think more work needs to be done on it. Fortunately, they are testing it in various ways at the AHRB and in Canada, so I guess we just have to follow and see what they come up with.

Robin Batterham – I agree very much with what you were saying there, and I would invite you, in the discussion and the thinking, to note in the UK system how they are distinguishing between peer review and expert review. I think that when we say ‘peer review’, actually a lot of the time we are meaning what they call ‘expert review’, which is subtly different.

Iain McCalman – Thanks for that, Robin. Yes, I noticed that there is a strong emphasis now on that. In the latest RAE exercise, particularly in international reviewing, they are going to put much greater emphasis on that. International reviewers are going to spend quite a sustained amount of time in the UK; there are going to be international benchmarks, both with like countries and with adjacent countries. So yes, that distinction is important to make.

Again we have the point that Bob Graham made earlier, that this is expensive and it presents some difficulties, but I think international benchmarking is one of the ways we can offset the problem of a very small group of peers here.

Sue Richardson – I would be interested in your view – and I am particularly referring now to the self-assessment – whether there are gender biases in these different strategies for reviewing quality. Women are well known for underselling ourselves relative to men. I don’t know who gets it right, but relative to men we undersell ourselves. But there may also be gender biases in the peer review if women are doing things that are a bit different from what the men are doing – the men command the heights here. I wonder whether from your perspective you have any insights you can offer us on that.

Iain McCalman – No, except that you are absolutely right. Look around! So many times I have sat in senior academic meetings and seen this sea of grey suits and masculine heads, and felt intimidated, almost, because the humanities are a notoriously feminised area. I have no doubt that there are gender biases in that. There are gender biases in the career structure of women and men at the moment, still, many of which have not been caught by this system, so you can get a problem that catches early-career people as well.

I presume that is the kind of thing we really do need to do work on, and this is where we need the social sciences to look into that. All I can say is, I suppose, that a diversity of assessment procedures may cancel out some of the biases, perhaps even by setting one against the other. But the ultimate fact is that in the system at the moment its seniority right through is heavily weighted in favour of men, and it is going to perpetuate that until that weighting alters.

Mark Finnane – I think Robin Batterham, at the beginning when he put up a brief indication of what the $2.8 million from DEST was to be for, contributed to a focus so far on the assessment of research.

Nearly $600 million a year from DEST goes into research training. The $2.8 million, I understand, is going to be to investigate the quality of research and research training. I don’t know if Iain has any comments on the issue as well of assessment of the quality of research training as well – which, after all, in the Australian system is strongly dependent in terms of its outcomes on a system of expert review.

Iain McCalman – That is a very good point. Funnily enough, from just looking at the literature – and this may be a problem with the literature – research training doesn’t seem as controversial. There isn’t as much analysis of it. And I think perhaps the problem is that we just take it for granted, that the existing system is accepted. It is simply stimulated by some galvanic action to its nerve system to make it jump further and faster, and we don’t look to see if it is really working in quality terms. I suspect that we need expertise looking at that. Especially, one of the things that has struck me for a long time is the isolation of our PhD system in the humanities and social sciences, and how little we have in connection with peer groups. We have tried to do something about that in our particular centre, but something like that to be built in to the system I think would enhance quality. But we have a long way to go.

Leon Mann – Bob Graham, I think, focused very much on individual excellence and Iain, I think, spoke most recently about ‘peer’ and about ‘team’. That complexity of individual research and team research, where we are being encouraged to collaborate and where a lot of the work that is being done across disciplinary areas, obviously is going to require teams for creativity. It adds a level of complexity in working out what it is that is the recognition of excellence. I am wondering whether the speakers this morning have got some thoughts about how we factor in the team variable in making that kind of assessment, in particular recognising that it is somewhat harder for teams and larger aggregates to work together, although work together they must.

Iain McCalman – I don’t really have any answer to that. Teamwork, collaborative work of a reasonably large-scale kind, is newish in the humanities and social sciences. It is a model that has been employed in the natural sciences for much longer, and they have particular mechanisms for assessing qualities of teams. It is a new problem for us. The only direct experience I have had is when I produced this book – An Oxford Companion to the Romantic Age – which in DEST points earns me virtually nothing but which took five years, has 360 contributors, 150 international contributors, and was the first Oxford Companion in the British area to be delegated outside of Britain. It was a massive team effort, on a very small outlay. We didn’t get ARC grants for it – it is scholarship, not innovative research, it seems, even though the reviewers seem to think it has changed the paradigm.

This is part of the problem, that it is not only how you assess teams but how you assess their product, and how the people involved in this get the benefits other than me. It’s only my name that appears on this book, but I have a key editorial team with three or four people who deserve the accolades much more than I do.

So there are difficulties, particularly in a culture like ours which has been used to individual books and individual researchers.

