Today I happened upon a post on the blog of the Energy Research Centre (ERC), dated 5 June 2017, recording with great sadness that Bill Cowan had passed away.


Bill Cowan (right) with colleagues Patrick van Sleight and Gamieda Gierdien in 2003.

I remembered that name and found that more than 30 years ago Bill Cowan published an article entitled Is Job Evaluation Scientific? in (1985) 10 SA Labour Bulletin 4, pp 93-106.

Given how topical this subject is, I am providing some extracts from that article.  I need to point out that when he wrote the article there were only 6 “occupational levels”, whereas there are now 7 such levels.  But I argue that the new design is just as flawed as the previous ‘system’, given that it also fails to meet the statutory requirement of proportional pay differentials.

This table shows the adjustment made by creating an extra occupational level in the middle, and not on top as the new EEA9 has purported to do:
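The statutory idea of proportional differentials can be sketched numerically.  All figures below are hypothetical, chosen only to show what a constant ratio between consecutive occupational levels looks like:

```python
# Hypothetical sketch: proportional pay differentials across 7 occupational levels.
# A structure has proportional differentials when the ratio between consecutive
# levels is constant, i.e. pay forms a geometric progression.

BASE_PAY = 10_000   # hypothetical pay at level 1
RATIO = 1.5         # hypothetical constant differential between consecutive levels
LEVELS = 7

pay = [round(BASE_PAY * RATIO ** (level - 1)) for level in range(1, LEVELS + 1)]

for level, p in enumerate(pay, start=1):
    print(f"Level {level}: {p}")

# Every consecutive ratio equals RATIO (up to rounding), which is the
# "proportional" property; an uneven ratio at any step would break it.
assert all(abs(pay[i + 1] / pay[i] - RATIO) < 0.01 for i in range(LEVELS - 1))
```

A structure that inserts an extra level without preserving this constant ratio would show one consecutive ratio out of line with the others, which is the kind of defect the comparison above is meant to expose.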

The employees currently locked into the 4th occupational level in EEA9 are deprived of intentional, continuous and systematic development and are unlikely to receive the correct development attention.

Is Job Evaluation Scientific?

Extracts from (1985) 10 SA Labour Bulletin 4 pp 93-106 by the late Bill Cowan

“There are a number of reasons why job evaluation is not scientific, despite the appearance – the measurements, graphs, calculations.  The simplest reason is that the aim of job evaluation systems is to say what rewards people should receive.  In fact, job evaluation systems do not specify the actual amounts which people should be paid, but instead specify the order of value in which jobs should be placed, and how much should be paid for each job relative to the others.  But this doesn’t make any difference.  The issue which job evaluation is addressing is still one of “distributive justice” – of who should get what, how the cake should be divided.  And science does not provide answers to such questions.  Science depends on empirical proof to test its theories.  What empirical proof could be found for a theory which states that one person should be paid twice as much as another?

. . . . .

The methods of job evaluation

When negotiating about a job evaluation system, it is important to know at what phases in the job evaluation procedures these value judgments came in – for these are the “weak points” in the systems, the areas where (as far as a scientist is concerned) negotiating parties are quite entitled to disagree.

. . . . .

Nevertheless, even if a union decides that it does not want to work within a job evaluation system, it is useful to identify where the system is scientifically faulty, so that desired modifications can be argued for, without being stopped by the mistaken reply: “You can’t touch it – it’s scientific!”  We can start by looking at three phases in job evaluation:

    1. The first phase is selecting the criteria for job analysis and deciding how these will be applied. (For instance, in the Paterson system, the “level of decision-making” is chosen as the criterion for distinguishing between jobs.)
    2. The second phase is the actual measurement process, where different jobs are assessed in terms of the criteria selected in phase 1.
    3. The third phase is the grading of jobs on the basis of the measurements taken in phase 2.

We will stop there.  There is, of course, a final phase which is most vital both to employers and employees, and that is making the link between the grading of jobs and the grades of wages.  This last stage raises problems of a different kind.  Having decided (by phase 3) that one job is of higher “value” than another, the decision must then be made: how much more should that job receive?  This would require a separate, full discussion; here we rather spell out the criteria against which to judge the first three phases.

. . . . .

Reliability and validity

Now the three phases shown above are very familiar to social scientists who are used to making social measurements in their research.  The whole procedure is only reckoned to be methodologically acceptable if they are satisfied that the procedure is (a) “reliable” and (b) “valid”.

Reliability has to do with the way measurements are made.  In order for a measurement process to be reliable, we want to know that different ways of measuring the same thing will yield the same results, and that if different people measure the same thing – using the same or different methods – they will come up with the same results.

Reliability of measurement is basic to any scientific investigation, but it is not, by itself, enough to ensure “validity”.  Validity requires, amongst other things, that what people are actually measuring is what they say they are measuring.  For example, if an intelligence test was constructed which consisted only of mathematical problems, it might produce consistent, reliable results, but it would not be valid as a measure of intelligence, because intelligence is not just mathematical ability.

. . . . .

Phase 1: selection criteria

In this phase, criteria are selected and applied, to indicate the content of a job, relative to other jobs.  Some job evaluation systems employ only one primary criterion.  The Paterson system is the commonest example of this in South Africa, and in this system the criterion chosen is the “level of decision-making”.

. . . . .

An advantage of such a job evaluation system, which looks at only one primary factor in assigning jobs to “bands”, is that it is relatively simple and fast to implement.  This is particularly an advantage to management, but it could also be an advantage to unions if they expect their members to benefit from the successful implementation of the system.

. . . . .

Single-factor systems, such as Paterson’s, fail the test of scientifically valid procedure at the first hurdle, in phase 1.  Measuring the “level of decision-making” cannot provide a valid measurement of “job content”.  What about more complex systems?

The Peromnes System, for instance, which is quite popular in South Africa, pays attention to eight separate factors (problem solving, consequence of error of judgment, pressure of work, knowledge, the influence of one job on other jobs, the level of comprehension required by the job, educational qualifications required, and the degree of further training needed to do the job competently).  Some systems are even more complex than this, taking account of 26 or more separate factors in assessing job content.

. . . . .

Phase 2: the measurement process

Once the criteria have been selected, the next phase is to discriminate between different jobs on the basis of these criteria.  This is the measurement phase, and here (from the point of view of scientific method) the main thing we want to know is: are the methods of measurement reliable?  Will different people come to the same conclusions?

In general, this kind of measurement is likely to be more reliable if

(a) a fairly rigid measuring procedure is adhered to;

(b) subjective judgments are kept to a minimum; and

(c) the measurements are not too elaborate.

. . . . .

In the next stage in the Paterson system of grading, supervisory grades and further sub-grades are made within each of the bands, and here the judgments seem to become more subjective, with more scope for disagreement – i.e. more risk of unreliability.  The reason for this is that at this stage the judgments become more complex, taking account of various different factors such as work pressure, variety of tasks, etc.

. . . . .

It is worth noting that a fairly rigid measuring procedure, such as that in the Paterson system, makes it easier to manage consultations with workers and unions at this stage of job evaluation.  It leaves some scope for disagreements and negotiation, but not so much scope to threaten the overall design of the pay structure.  Both in the job description stage and in the grading stage workers may be consulted.  However, there is typically little, if any, consultation about how the terms have been set – why “decision-making” has been adopted as the primary criterion, and how the measuring schedule has been drawn up.

. . . . .

There are a number of possible measurement problems in a multi-factor system.  First of all, because there are more aspects of a job to be measured than in a single-factor system, and because some of these factors require subjective assessment (for example, in the Peromnes system, the “pressure of work”) there is more chance of going wrong than in a simpler system.  On the other hand, the errors made in measuring one factor may be averaged out by errors in the opposite direction in measuring other factors.  So one can’t be sure – which is itself, of course, a problem.

. . . . .

In summary, a scientific enquirer would treat the relative scores awarded to jobs on the different factors in a multi-factor system with a great deal of scepticism.  There is no good reason to suppose that the figures are accurate.  Even so, this is probably not the major problem with multi-factor job evaluation systems.  The great unsolvable problem comes when you try to combine the separate scores awarded for different aspects of a job, in order to come up with a single overall score for the job.

Phase 3: grading

Having selected the criteria for distinguishing between jobs in phase 1, and having made measurements according to these criteria in phase 2, the task in phase 3 is to convert these measurements into a means for ranking jobs.  In the case of a single-factor system such as Paterson there is no immediate problem at this stage.  Once jobs have been categorised according to their “level of decision-making” (phases 1 and 2), this automatically places them in one of the six bands.
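The mechanical character of phase 3 in a single-factor system can be sketched as follows.  The decision-level labels and band letters below are illustrative, loosely following the usual descriptions of the Paterson bands, not official definitions:

```python
# Sketch of a single-factor (Paterson-style) grading: once a job's "level of
# decision-making" has been measured (phases 1 and 2), its band follows
# automatically -- phase 3 involves no further judgment at all.
# Labels and band letters are illustrative only.

BANDS = {
    "defined": "A",
    "automatic": "B",
    "routine": "C",
    "interpretive": "D",
    "programming": "E",
    "policy": "F",
}

def grade(decision_level: str) -> str:
    """Phase 3 is mechanical: the measured decision level fixes the band."""
    return BANDS[decision_level]

print(grade("routine"))   # -> C
print(grade("policy"))    # -> F
```

The point of the sketch is that all the contestable judgment sits in the measurement, not in the grading step itself.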

. . . . .

What are the “points of detail” in the Paterson system which can lead to difficulties in phases 2 and 3?  First there is the problem of borderline cases, where a grading committee disagrees about which decision-making band a particular job should occupy.  This is a problem of measurement (phase 2).  But the more important “details”, as far as employees and their representatives are concerned, are probably to do with the sub-grades which are allocated to jobs within a “band”.

. . . . .

As suggested earlier, sub-grading can provide an area for negotiation, precisely because the rules for sub-grading are not clear.  But this should not obscure two facts.

  • Firstly, the area for negotiation has already been decided, because the additional factors now being considered cannot move a job out of its “decision-making” band.
  • Secondly, when the additional factors are being considered, there is no scientifically valid way of deciding whether one factor, such as work pressure, should carry more weight than anoth­er factor, such as the “variety of tasks”.

This difficulty in knowing how to play off different factors against one another only hits the Paterson system at the level of sub-grading.  But it is a problem which is right at the heart of multi-factor systems.

Multi-factor systems, which look at more aspects of job content than simply “decision-making”, have a better chance of preserving phase 1 validity.  They may be complex and generally less reliable in the measuring phase, but it is in the third phase that they really knock down the hurdle and fall to the ground in a crunch of non-scientific calculations.  The problem is this: that having taken measurements of different aspects of a job, you now have to combine these figures to come up with a single score.
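The combination problem can be illustrated with invented figures.  The same factor scores, combined under two equally defensible weightings, rank two jobs in opposite order; the jobs, factors and numbers below are hypothetical:

```python
# Sketch of the weighting problem in multi-factor job evaluation: there is no
# scientific way to choose between w1 and w2, yet the choice decides which
# job comes out "worth more".  All figures are invented for illustration.

scores = {
    "Job X": {"work pressure": 8, "variety of tasks": 3},
    "Job Y": {"work pressure": 4, "variety of tasks": 7},
}

def overall(job: str, weights: dict) -> int:
    """Combine factor scores into one overall score as a weighted sum."""
    return sum(weights[f] * s for f, s in scores[job].items())

w1 = {"work pressure": 2, "variety of tasks": 1}  # pressure counts double
w2 = {"work pressure": 1, "variety of tasks": 2}  # variety counts double

rank1 = sorted(scores, key=lambda j: overall(j, w1), reverse=True)
rank2 = sorted(scores, key=lambda j: overall(j, w2), reverse=True)

print(rank1)  # Job X ranks first under w1 (19 vs 15)
print(rank2)  # Job Y ranks first under w2 (18 vs 14)
```

Neither weighting is measurably "more correct" than the other, which is exactly the value judgment the article says cannot be settled by scientific method.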

. . . . .

So job evaluation is caught in a trap.  Either it starts off with invalid assumptions, or it ends up with an invalid procedure for combining scores on different factors.

This does not really come as a surprise, because we saw at the outset that the aim of job evaluation is to make value judgments.  These value judgments may appear to be hidden, by the trappings of scientificity and a confusion of numbers, but like a disappearing scorpion, they will turn up again if you lift up all the stones.  Either the non-scientific value-judgments are made at the beginning, in choosing the criteria, or they come back at the end, in making unscientific decisions about how to weight different aspects of a job.”