MEASURING EXCELLENCE IN RESEARCH AND RESEARCH TRAINING
Canberra, 22 June 2004


Open forum discussion


Phil McFadden (Chair) – If I could do just a bit of a wrap-up.

One thing I heard throughout each group's report was the cry for diversity, the need to maintain diversity, the need not to have a one size fits all response – the need to move away from simple metric measures as a one size fits all, the need to take into account the different needs of each of the groups. One thing nobody has mentioned is that what we are actually seeking there is moving away from a very simplistic method that is currently being used, which measures quantity, to a much more complex system in order to be able to measure quality.

That has resource implications somewhere along the line, but I guess the thing is that we want this done properly, and if you are going to do it properly you are going to have to accept the consequences of doing it properly.

In the training I heard the need for courage at one’s centre to be able to look at their students and see whether in fact they are giving you quality. I remember I once made the mistake of trying to bring a PhD student to a halt at two years, and I still bear the scars, but I think you were dead right that it would be much better for the student had they gone on to doing something that suited them better. We need the courage.

We need to avoid the pressure of an artificial shortening in time for a piece of work that in fact deserves five years to train the student and develop the student into a really self-thinking creature. If it is going to take five years, then sometimes that is worth it and we shouldn’t have these artificial impositions that bring it to a halt in some shorter period of time.

We need to encourage and engender mobility between different universities. We need to engender different ways of measuring for the different communities.

So there was a lot of meat in that. In essence, every part of it spoke about the need for being responsive to the requirements of that particular area. One particularly strong statement that I heard from John Beaton, and that I think has got a lot of validity to it, is that need for faith in the peer review. If you don’t have faith in the peer review, where are you going to go to get advice about quality? I think one of the things that we need to do is to make certain that we can engender, in the funding bodies, faith in the peer system, faith in the expert advice. That is going to be central to our ability to convince people.

One thing that we were a bit short on, in terms of what was discussed – and it is of course because it is the very difficult thing – is that although we spoke a lot about the generalities of ‘We should avoid biasing against early-career researchers,’ the question is how. What method are we going to be able to use to actually judge that this person is excellent and help them, and not bias it against those people? So to some extent we need to think a little bit more deeply, I think, about the ‘how’. A lot of what was expressed was, of course, very valuable. But it is not valuable unless we can start articulating some of the measures that we can use to actually start achieving that.

Questions/discussion

Miriam Goodwin – One thing I also note that we haven’t really thought about here, and yet which is one of the great research challenges in Australia, is how this exercise can contribute to increasing the level of industrial R&D being done in this country. How can we increase the level of business expenditure on R&D through giving business a greater confidence in the research quality, perhaps giving business a greater guidance on where the best research is being done that is relevant to their needs, being able to then also allow business people to perhaps validate within their own organisations, ‘We are collaborating with Institution X or Group Y because they are shown to be the best in Australia in this field’?

There is a real potential here, and it is broader than reporting to government and it is broader than reporting to the community. It is about being able to really address one of the critical issues that we face.

Bob Watts – I think you will find that the industry people know where the top research is, and I think you will also find that the level of research that they support in the country is very closely determined by, if I dare say it, the sort of dollar value considerations that I gave this morning. So a reason why there is not more industrial research done in Australia, I think, is that the management doesn’t see the value in it.

Jim Angus – I just wonder whether or not this is an opportunity for getting the researchers from the universities either to join advisory boards or actually to be on the boards of some of these companies.

Bob Graham – I don’t have a lot to add, except that it worries me, it irks me that we don’t have much pre-clinical research, especially from pharmaceutical industries, here in Australia. In part that reflects, I think, a xenophobia on the part of the companies that are located over in the United States and other countries. They don’t think we can do it, and they think they can do it better. Again I think it would help if we have these expert panels with people coming from overseas: they can get to see what we have here, and that may filter back. And we may also try and get people from industry to come in on those expert panels, and that may help as well. That’s the only suggestion I would have.

Phil McFadden – I guess to some extent that is targeting some of the question of ‘how’. And I guess there is a question in my mind: do we in this country do enough evaluation, you might say, of programs and institutions, separated from the funding itself? People might want to consider that.

Mark Finnane – I just want to pick up something that was commented on by at least two groups. It is not going to be easy, whatever quality assessment framework we use, to arrive at estimations of quality in research training, but there seemed to be a bit of consensus in a couple of groups that quality was not necessarily related to time of completion.

Let’s hypothesise that University A, over a number of years, consistently completes its students in 3½ years and another university doesn’t worry about completions, and completes them in 4½ years. If we look at all the other factors in terms of assessing the quality of the outputs in terms of the students’ careers and they are all the same, wouldn’t we agree, on the balance of things and certainly with fairness to the students, getting them out of the poverty situation that they are in while they are students, that completing in 3½ years is better than in 4½ years?

I am not aware that there is any evidence that taking a very long time to complete improves the quality of research training, and a lot of discussion is going on in Australia about trying to improve the way in which we manage research training in a way that is compatible with shortening the period and getting people on into their research in university and industry tracks, and not getting hung up on the idea that you need a German-style PhD program that takes 10 or 20 years.

Jim Angus – If I could just by analogy talk about medical training: you can have a three-year medical degree in Canada, or in Australia now there are four-year graduate programs right through to six years from an undergraduate program, including the one at Melbourne, where as an undergraduate we have one year of research for every medical student. I hope the government doesn’t say, ‘Well look, you can do it five years or four years at this institution but it is six at Melbourne, therefore we are only going to pay for four at Melbourne, or five at Melbourne.’ I think that is a terrible way of looking at this.

I think we have to look at what is the quality of the product and take the individual case in terms of a PhD student to see what it is, why they have taken that extra length of time to get through their PhD. And I think then you will see why there are these differences.

I have got medical students who may take six years because they have moved from full-time to part-time as they have moved through their clinical training. I think we have to allow that flexibility.

Phil McFadden – I think you made quite a strong point in your comment there about ‘if all other things are equal’. If all other things are equal, and you have given the intrinsic capability, then of course my response is yes, it is better for the student and probably better for the country. As an employer I will quite often look around and have in mind quite specific things that I am seeking, and there are times when I would rather pick off somebody who has spent more time in training and is better quality as a consequence – if it actually does lead to better quality. I take your point: if all other things are equal, then sure. But I think very often other things are not all equal.

Gina Newton – I would just like to make a comment related to ‘all things are not equal’. I think this pressure to reduce completion time may seriously impact on some people who have various circumstances while undertaking their research training. In particular I am thinking of women who may find themselves in the position where they are having children, for example, or having to look after sick relatives or whatever, and particularly people who might go into the workforce and then come back into research training later on in their life when they may have, again, family circumstances that mean that they can’t do their training in a brief period of time. I think you have got to take those sorts of things into consideration as well, and not disadvantage those people.

John Byron – As one of the two former CAPA [Council of Australian Postgraduate Associations] Presidents in the room I think I should respond to that point as well, Mark [Finnane]. As someone who is nine weeks off from submitting a thesis after too long, I couldn’t agree with you more and I don’t think there is anybody, other than the long-suffering postgrad, that thinks that completing early is a good thing.

I think one of the frustrations, though, is that the three-year cut-off for APAs and the four-year cut-off for RTS are so arbitrary. It is a one size fits all model and clearly is not designed out of any commonsense approach that might fit particular projects, let alone variations between disciplines and institutions and personal circumstances. I think it is the wanton stupidity of a single algorithm to fit every postgraduate/HDR situation that is actually the problem, not so much the absolute numbers. I think that probably most people do agree that finishing earlier is better.

Better doesn’t necessarily mean it is higher quality, and I think that one of the problems that the groups might have been trying to strive with is that they could see that there could be some trade-offs in the quality of research undertaken while students are doing their research degrees, in order to get a rapid completion, and perhaps compromises or less beneficial quality outcomes for the entire research education experience, which might include things in addition to their actual research project, such as training in entrepreneurship, teaching experience, lab supervision and so on.

Phil McFadden – Could I just put in one comment, quickly. I hear you loud and clear. I once took over a student who had been 18 years so far on his PhD, and let me tell you, it didn’t benefit anybody at all. It is this one size fits all that is the problem.

Mark Finnane – I think it is just worth stressing that the RTS framework defines time of completion, not in calendar years but in full-time equivalent years of study. So it does allow, and universities do allow, address to equity issues. For the APA, the scholarship frameworks are much more circumscribed, I think, and John is right about that. But even the APA guidelines allow part-time scholarship funding for equity reasons as well.

Bob Graham – There is actually very good data, in at least the biomedical sciences, to show that the longer your training, the better chance you have of getting a good job later on. I would have to dig that out, but I am definite that there is.

Jim Peacock – One thing we didn’t discuss much today at all, if at all, was that I think we all would like to think that the framework or system that we put into place will not prostitute the qualities that we all think are very important in measuring the quality of research. And yet they have to be able to be appreciated by people who aren’t ‘like us’, who aren’t scientists or science students. That I think is a very important point.

Related to that: so we have this system, we haven’t talked at all about then who will make the decisions about how it will influence investment in R&D. I think over the next 18 months that is something that we should have some thoughts about, and try and have some inputs.

Phil McFadden – One of the things that I think we do need to bear in mind is that we have in place a funding mechanism that tends to drive things towards quantity rather than quality at the moment. That is a concern that we have heard expressed all the time, and I think one of the things we are going to have to look at in the future, for us coming up with those measures, for us coming up with the ‘how’, is to help people put in place a funding model that will actually drive it towards the quality that we value and the quality that we think will be of help. We have got to figure out how to do that.

Pascale Allotey – I just want to make a comment on the discussions that we have had about the peer review process and how important that is. I think one of the problems – and it has already been raised – is the small pool that we have of peers to do reviews in particular areas. I, for instance, with grants that I have submitted, have had people send them back saying they are not happy to review them because they have a conflict of interest because it is in the same area. So what happens is that the grants then get sent to junior researchers, and there is research that suggests that junior researchers tend to be a lot harsher and less constructive in the way that grants are reviewed.

I have, for instance, been able to compare reviews from an NIH grant where the NIH reviewers actually are paid as part of their work to review the grants and you get really constructive, really useful feedback, whereas reviews here, by comparison, are very much about ‘How can we not fund this grant?’ rather than ‘How can we bring it up?’

So I think one of the things that we should be thinking seriously about is some sort of training or some guidelines. We have given [the reviewing of] NHMRC grants to researchers in the department who are straight out of PhDs and have no idea what it is they are doing. There are criteria but there is no real training about what it is we are trying to achieve in the peer review process.

Phil McFadden – Yes, I think there is a lot to that. I would point out that I have had a couple of reviews from some young people who could have done with some training in how to review, no question about it. I think we have all experienced that.

The small pool for our peer review was raised several times during the day, and yet I am sure that each of us here ends up doing a lot of NIH and NSF grants, German grants and that sort of stuff, so they are quite happy to use us. I see no impediment at all to our using people around the world, and in fact ARC has expanded its pool to use a lot of Americans, a lot of English, a lot of Germans et cetera. So there is a much broader pool than just the Australian one, but I think you are quite right that this aspect actually does need some training, as to how you deal with that.

Mike Manton – Today we have focused a lot on assessing excellence in the work of individuals, but I think that as we develop metrics for looking at research and research training we need to take it into account that you will need different measures for the results of individuals, of teams and of institutions. They are not the same. Similarly, when you come to the reason for the research being done, there has been a lot of emphasis on academic research or research that has intrinsic value; we have also touched on research that has commercial value. But there is a third amount of research done which relates to the generation of public goods, and that needs different metrics as well. So those two things are somewhat linked, and as the metrics are developed we must take those variations into account.

Jim Peacock – I asked people in my breakout group to have a show of hands as to whether, if there is an evaluation system and there is a barcode or something that comes out as to the scores, you would like to wear on your lapel a barcode that was based on the evaluation of your institution, or an evaluation of a group, discipline or cross-discipline that operates mostly within your institution but could be other things, or one that operates at the level of you or your own group.

So I first asked for a show of hands, and I ask it here, for the ones that would like the barcode on the basis of the institution. Now we have the one about its being based really on a group of interacting workers, mostly in the discipline but cross-discipline as well. And then who would rather see it, if they had to choose one, on the basis of their own work and their own group?

The [respondents to the] last two seem to be splitting the money, but in our group the biggest show of hands was a no-show of hands.

Phil McFadden – I think that helps to articulate an issue. One of the things that is going to be quite important in trying to inform a funding model is the very question of whether you should be applying these kinds of criteria at the institutional level, at the group level, at the individual level. And yet you can see here, within ourselves, we haven’t resolved that question in our own minds. To ask that is still to ask a complex and unresolved question, and yet this is something that we are going to have to give some guidance on, in order to be able to affect the funding model.

Jim Peacock – Maybe the question was wrong. A single barcode may not be appropriate at all.

Mike Manton – The answer is that it is all three. A scientist or a researcher wants to be judged as an individual. It is like an individual in a community. You want to be judged as an individual in your family, in your community and in your nation. You don’t say that you are just one or the other. And that is what I was saying earlier: it is important that these metrics, as they are developed, take into account that people are associated with different groups. It is not impossible, it is just to recognise that we have to take different indicators for individual components and not try to get one indicator or set of indicators that does everything. It is recognising the differences. And it is important to recognise that those different institutions have to be recognised.

Phil McFadden – Do you have any suggestions, Mike, as to how that might be achieved?

Mike Manton – Yes. If you sit down and say as an individual what you want to achieve, that is when citations and so on can be grouped together, but with teams and institutions you get much more onto the outcomes: user satisfaction, generation of products, commercial outcomes and so on. And as you move to more institutional measures, that is when you tend not to get citations and so on as being as important as for the individual.

Phil McFadden – In several of the groups that I was in, I heard comments, ‘Whatever the framework is, it should be non-judgmental.’ I was wondering if people in the open forum had any comment on that, about whether a framework trying to measure excellence should be judgmental or non-judgmental.

Robert O’Connor – I would just like to pick up on that briefly and tie it to something you alluded to before. It would be nice if we had a non-judgmental quality and accessibility framework, but you and a lot of people in the room seem to be assuming that there is going to be money tied to this. If there is money tied to it, there is a judgment being made. So where do we end up sitting?

Phil McFadden – My response is that I personally would like to see instances where we were able to go out and make, I guess you might say, determinations about excellence, not judging but looking at excellence and that sort of stuff, but inevitably the moment you do that there is going to be funding attached to it somewhere along the line. In the long run, to some extent what we are looking at is wanting to inform funding policy, so money is going to be attached, in terms of the way things happen. I think that is inevitable. Bob Watts would certainly tell you that is the way of life.

Frank Howarth – Touching on that issue: one of the things that was tossed around in our group quite a lot and hasn’t really come up in this plenary is whether you can have excellence without relevance. Can you have excellence in its own right, or is it even worthwhile thinking about that at this point, when people are going to make judgments? Can we have a debate that simply talks about excellence, or do we need to roll relevance in at some measure or other?

Phil McFadden – Can I bring us back to a comment that Bob Graham made in his talk but that hasn’t resurfaced. There is no doubt about it that in the ordinary day-to-day business of science we have to do the kinds of things that Bob Watts is talking about, in terms of being able to show the value. But in the long run the fundamental value and purpose of science – this is what Bob Graham was talking about – is how to make the unimaginable imaginable. Somewhere along the line we have to have measures and have to be able to tell the stories and convince people that in fact the blue-sky kind of research is worth funding, does in the long run actually have intrinsic value – and extrinsic value as well. To some extent that is the question of being able to accept excellence without immediate relevance.

I would make a point that Pasteur, who had I don’t know how many patents – we all know who Pasteur is and he did all sorts of fantastic things, most of it applied science – made the comment that there is no such thing as applied science: there is science, and there are applications of science. He was putting his finger on the fact that no matter the purpose of what you are doing, it is the excellence that matters. If you look at what he did, you see that several times he turned the unimaginable into the imaginable. He separated out the excellence from the relevance, and what he did was to build a core of phenomenal excellence, most of which then turned out to be relevant.

Frank Howarth – Could I make a supplementary comment on that, then. I don’t disagree one bit with that, and no matter what, the bottom line is excellence. It has got to be there. But in terms of the real world, we are in Australia in 2004 talking about these issues. We can’t shy away from the issues that were raised by Michael Barber or indeed Bob Watts about the interest in mission-orientated or endgame research. It is growing, and the signs are everywhere. So I think we need to spend more time talking about how we add some measures of relevance on to complement that excellence.

Phil McFadden – We do have to make certain of the balance. Agencies such as the one I come from are entirely mission-oriented, CSIRO is mission-oriented, and we have to make certain we get the right balance between that kind of research and the blue-sky.

Michael Barber – I will make a comment and perhaps do a disservice to Robin Batterham. I don’t think I will betray any confidence if I perhaps reiterate a couple of things Robin said this morning in a bit more detail, and it really does address this question.

Robin has basically put on the table a very interesting proposition: if we have excellent research – in the context he was talking to me about, it was science, but I can take it more widely into excellent research – we will have excellent end-user engagement. That was in a sense part of, I think, the two themes of today: the measurement of excellence but also the focus on the delivery of that to outcomes.

So I actually think the question is a little simpler. I think the question is first: can we, across the agencies and across universities, come to some measure of the funding, which is not driven so much by the competitive funding formulas but is, in fact, in the block grants as Robin said, which does actually say that the science that we do – or the research that we do, taking it more widely – is valuable or is of quality as judged by relatively straightforward metrics?

The only complication in the balance of that framework as it relates to part-time versus full-time, which was picked up, full-time researchers and research-only agencies such as in your own, Phil [McFadden], or in mine, versus other people in universities, is a danger that we see quantity versus quality. So again interestingly Robin says very much that in the UK it is not about everything, it is a few publications, those publications being seen as being a significant benchmark of that activity.

In that context, then, picking up Frank Howarth’s comment, the challenge and the proposition are that if we have, if you like, that assessment of the fundamental contributions of our science background, we will actually see those linkages flowing, whether it is into the biological sciences, into medical research as Bob Graham was talking about or into the social sciences, with the impact upon social policy.

It is a very interesting proposition he has laid before us. I am not quite sure whether it is a valid proposition, but the first part does seem to me a relatively simple exercise, for us to have agreed in some broad sense that our output is able to be assessed in some metrics which are relevant for our own discipline.

I say this because the other point that he made this morning was that in the UK the metrics, the value you apply on that end of the spectrum to what is regarded by the social sciences as high-quality social science research and what is similarly regarded by the biomedical sciences need not be the same thing. We ought to relatively, between those frameworks, be able to differentiate. Then I think the relevance will be an interesting question to actually see, because that I think is more about the corpus, if you like, the whole discipline. What is the impact of the social sciences as a community on social science policy? What is the impact of the earth sciences on exploration et cetera? And we can tease out those dimensions.

Phil McFadden – Something I would just stress – I think it was something you actually said in your talk, Mike [Barber] – is that within the mission-oriented science you are given a goal, you know that you should be focusing on that. But curious people, people who are excellent, still find the opportunity to do absolutely stunning science within that, and add an enormous value as a consequence. I think that is critical.

Don Price – One of the questions that has confused me a little through this whole discussion is about time scales and timetables for measuring excellence, and whether we are focusing on measuring the excellence of science that has been done or on measuring the excellence of proposals for science that we intend to do. In many cases there is a relationship between somebody’s track record and the science that they are going to do, but in many cases it may be different. They may be quite different questions and the answers may be different according to when you measure them. If you try and measure the excellence of a piece of science immediately it has been finished, then the metrics and the recognition of it are going to be quite different – and indeed the apparent relevance of it might be quite different – from if you measure it two years later or something. If you are trying to make a judgment before a piece of science has been done, indeed, at the proposal stage, then the way of your assessment of it might be again a completely different question.

Phil McFadden – I think you are quite right. When Geoff Vaughan was talking, he touched on that question of pre-judging because you are looking at a CRC that you are about to give money to, and making a pre-judgment as to whether there is going to be excellence, as compared with the issue of post-judging some science. You came up with quite different criteria there, Geoff. So I think you [Don Price] are dead right; there is quite a significant issue there.

Robert Cribb – I think when we talk about relevance as a criterion for judging research excellence we have to bear it in mind that relevance is not to be judged simply in terms of economic or technological advantage. Research, for instance, which informs public awareness of Middle Eastern politics or of Indonesian politics or of the PNG economy is vitally important. Its value is not quantifiable but it…[inaudible].

Phil McFadden – [Bob Watts] touched on that with regard to Ok Tedi. It is the kind of thing that, although you don’t apply a value to it up front, you find has enormous negative value to you if you don’t get it right, and so it is something that is really very important to deal with.

Jim Peacock – I can’t see why we would want to bring this framework in if it isn’t judgmental. What do we want to do? Do we want to pin something up on a wall? Why do we want peer review? It’s judgmental. Perhaps what people are saying is that we don’t want the department people to make the decision about someone being fired from a university. Maybe that’s it. But what we want to bring this in for is to optimise the investment of government moneys in R&D and research training in Australia. How the judgments are made, and where they are made, is of course very important. But we do want it to be judgmental. I can’t see that it could be anything else. Furthermore, we want it not only to optimise but to be able to be used as an argument for increased support of R&D, to show why it is so important for Australia.

Bob Watts – Let me comment, then, on the political aspects et cetera raised at the back there. In fact, the company [BHP Billiton] does put a value on political and country risk. And it is expressed in dollar values. If we want to invest in, let’s say, Indonesia at the moment, the company requires a much higher effective rate of interest on the money that it is going to put in than it does if we are going into a nice safe country. So there is a value associated with the sort of information that we get from the political scientists and so on. It is absolutely built in to all their financial modelling. And there is a large group of people doing it.

A comment on quality as well: one of the rapporteurs said that excellence is an absolute measure. I disagree strongly with that. A simple example would show you why. How do you assess excellence? Let’s go back to our music, mentioned by Bob [Graham], who, catching me in the corridor, said something about Mozart. I don’t quite know what my teenage granddaughter would do, but it wouldn’t be repeatable, if we tried to say that Mozart was better than some pop guy who, in her opinion, has excellence in music. And our tastes and our assessments change over time. So we do have measurements of excellence that change, are not absolute, are relative and, frankly, are assessable.

Tom Clark – I want just to elaborate what that point was. As I understood it – I didn’t raise the point – it was that quality is a sliding scale from 0 through a range of fractions up to 1 as being perfect quality, whereas excellence is a value of 1 or 0, with no intermediate scale. You are excellent or you are not, whereas you have a varying amount of quality. That was the context in which that distinction was being drawn, and I am sure you can draw it in any number of other ways.

Phil McFadden (Chair) – I think we have finally quantified the qualitative! We have got some fascinating discussion going on, but we have reached 5 o’clock and I think we are going to have to draw it to a close because there are people who have to catch flights and that sort of thing.

I would like to emphasise that this does not bring this process to a halt. This is part of a journey. Today we have explored – and I think we have had a wonderful exploration – the complexities, all the different facets of the different things that we are going to have to think about. We are going to have to be involved now in drawing this together, and then in thinking about how we achieve some of the things that have been articulated here.

There have been some very strong statements made about what it is we would like to achieve, and certainly some very strong statements about what we would like to avoid, and I think the next step is for us to start thinking about how we actually go about achieving that.

