Unscientific job evaluation: Stones disappearing scorpions

Today I happened upon a post in the blog of the Energy Research Centre (ERC) on 5 June 2017 recording with great sadness that Bill Cowan had passed away.

Bill Cowan (right) with colleagues Patrick van Sleight and Gamieda Gierdien in 2003.

I remembered the name and found that more than 30 years ago, in 1985, Bill Cowan published an article entitled Is Job Evaluation Scientific? in (1985) 10 SA Labour Bulletin 4 pp 93-106.

Given how topical this subject is, I am providing some extracts from that article. I need to point out that when he wrote the article there were only 6 “occupational levels”, whereas there are now 7. But I argue that the new design is just as flawed as the previous ‘system’, given that it also fails to meet the statutory requirement of proportional pay differentials.

This table shows the adjustment made by creating an extra occupational level in the middle, and not on top as the new EEA9 purports to do:

The employees currently locked into the 4th occupational level in EEA9 are deprived of intentional, continuous and systematic development and are unlikely to receive the correct development attention.

Is Job Evaluation Scientific?

Extracts from (1985) 10 SA Labour Bulletin 4 pp 93-106 by the late Bill Cowan

“There are a number of reasons why job evaluation is not scientific, despite the appearance – the measurements, graphs, calculations. The simplest reason is that the aim of job evaluation systems is to say what rewards people should receive. In fact, job evaluation systems do not specify the actual amounts which people should be paid, but instead specify the order of value in which jobs should be placed, and how much should be paid for each job relative to the others. But this doesn’t make any difference. The issue which job evaluation is addressing is still one of “distributive justice” – of who should get what, how the cake should be divided. And science does not provide answers to such questions. Science depends on empirical proof to test its theories. What empirical proof could be found for a theory which states that one person should be paid twice as much as another?

. . . . .

The methods of job evaluation

When negotiating about a job evaluation system, it is important to know at what phases in the job evaluation procedures these value judgments came in – for these are the “weak points” in the systems, the areas where (as far as a scientist is concerned) negotiating parties are quite entitled to disagree.

. . . . .

Nevertheless, even if a union decides that it does not want to work within a job evaluation system, it is useful to identify where the system is scientifically faulty, so that desired modifications can be argued for, without being stopped by the mistaken reply: “You can’t touch it – it’s scientific!” We can start by looking at three phases in job evaluation:

1. The first phase is selecting the criteria for job analysis and deciding how these will be applied. (For instance, in the Paterson system, the “level of decision-making” is chosen as the criterion for distinguishing between jobs.)
2. The second phase is the actual measurement process, where different jobs are assessed in terms of the criteria selected in phase 1.
3. The third phase is the grading of jobs on the basis of the measurements taken in phase 2.

We will stop there. There is, of course, a final phase which is most vital both to employers and employees, and that is making the link between the grading of jobs and the grades of wages. This last stage raises problems of a different kind. Having decided (by phase 3) that one job is of higher “value” than another, the decision must then be made: how much more should that job receive? This would require a separate full discussion, and we will instead spell out the criteria against which to judge the first three phases.

. . . . .

Reliability and validity

Now the three phases shown above are very familiar to social scientists who are used to making social measurements in their research. The whole procedure is only reckoned to be methodologically acceptable if they are satisfied that the procedure is (a) “reliable” and (b) “valid”.

Reliability has to do with the way measurements are made. In order for a measurement process to be reliable, we want to know that different ways of measuring the same thing will yield the same results, and that if different people measure the same thing – using the same or different methods – they will come up with the same results.
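(An illustration of my own, not part of Cowan's article.) One common way to put a number on this kind of reliability is to have two grading committees band the same jobs independently and then compare the results. The sketch below computes simple percent agreement and Cohen's kappa, a standard statistic that corrects for agreement expected by chance. Cowan does not name either measure, and the committees, jobs and bands are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of jobs that both raters placed in the same band."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement.

    Note: undefined (division by zero) if chance agreement is exactly 1,
    i.e. both raters put every job in one and the same band.
    """
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same band
    # if each assigned bands independently at their own observed rates.
    expected = sum(counts_a[band] * counts_b[band] for band in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical bandings of the same eight jobs by two committees.
committee_1 = ["A", "A", "B", "B", "C", "C", "D", "D"]
committee_2 = ["A", "B", "B", "B", "C", "D", "D", "D"]

agreement = percent_agreement(committee_1, committee_2)
kappa = cohens_kappa(committee_1, committee_2)
```

A kappa near 1 would suggest the procedure is reliable in Cowan's sense, while a value near 0 would mean the committees agree no more often than chance. Neither figure says anything about validity.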

Reliability of measurement is basic to any scientific investigation, but it is not, by itself, enough to ensure “validity”. Validity requires, amongst other things, that what people are actually measuring is what they say they are measuring. For example, if an intelligence test was constructed which consisted only of mathematical problems, it might produce consistent, reliable results, but it would not be valid as a measure of intelligence, because intelligence is not just mathematical ability.

. . . . .

Phase 1: selection criteria

In this phase, criteria are selected and applied, to indicate the content of a job, relative to other jobs. Some job evaluation systems employ only one primary criterion. The Paterson system is the commonest example of this in South Africa, and in this system the criterion chosen is the “level of decision-making”.

. . . . .

An advantage of such a job evaluation system, which looks at only one primary factor in assigning jobs to “bands”, is that it is relatively simple and fast to implement. This is particularly an advantage to management, but it could also be an advantage to unions if they expect their members to benefit from the successful implementation of the system.

. . . . .

Single factor systems, such as Paterson’s, fail the test of scientifically valid procedure at the first hurdle, in phase 1. Measuring the “level of decision-making” cannot provide a valid measurement of “job content”. What about more complex systems?

The Peromnes System, for instance, which is quite popular in South Africa, pays attention to eight separate factors (problem solving, consequence of error of judgment, pressure of work, knowledge, the influence of one job on other jobs, the level of comprehension required by the job, educational qualifications required, and the degree of further training needed to do the job competently). Some systems are even more complex than this, taking account of 26 or more separate factors in assessing job content.

. . . . .

Phase 2: the measurement process

Once the criteria have been selected, the next phase is to discriminate between different jobs on the basis of these criteria. This is the measurement phase, and here (from the point of view of scientific method) the main thing we want to know is: are the methods of measurement reliable? Will different people come to the same conclusions?

In general, this kind of measurement is likely to be more reliable if

(a) a fairly rigid measuring procedure is adhered to;

(b) if subjective judgments are kept to a minimum; and

. . . . .

In the next stage in the Paterson system of grading, supervisory grades and further sub-grades are made within each of the bands, and here the judgments seem to become more subjective, with more scope for disagreement – i.e. more risk of unreliability. The reason for this is that at this stage the judgments become more complex, taking account of various different factors such as work pressure, variety of tasks, etc.

. . . . .

It is worth noting that a fairly rigid measuring procedure, such as that in the Paterson system, makes it easier to manage consultations with workers and unions at this stage of job evaluation. It leaves some scope for disagreements and negotiation, but not so much scope to threaten the overall design of the pay structure. Both in the job description stage and in the grading stage workers may be consulted. However there is typically little, if any, consultation about how the terms have been set – why “decision-making” has been adopted as the primary criterion, and how the measuring schedule has been drawn up.

. . . . .

There are a number of possible measurement problems in a multi-factor system. First of all, because there are more aspects of a job to be measured than in a single-factor system, and because some of these factors require subjective assessment (for example, in the Peromnes system, the “pressure of work”) there is more chance of going wrong than in a simpler system. On the other hand, the errors made in measuring one factor may be averaged out by errors in the opposite direction in measuring other factors. So one can’t be sure – which is itself, of course, a problem.
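(An illustration of my own, not part of Cowan's article.) The point that errors on separate factors may cancel, or may not, can be seen in a toy calculation. The factor names echo the Peromnes factors mentioned above, but the scores are invented and the "true" scores are an assumption made purely for the sake of the example. In practice nobody knows them, which is exactly the difficulty.

```python
def total_error(true_scores, measured_scores):
    """Net error in the summed score: measured total minus true total."""
    return sum(measured_scores[f] - true_scores[f] for f in true_scores)


# Assumed "true" factor scores for one hypothetical job.
true_scores = {"pressure_of_work": 5, "knowledge": 6, "problem_solving": 4}

# Case 1: errors in opposite directions (+2, -2, 0) cancel in the total.
cancelling = {"pressure_of_work": 7, "knowledge": 4, "problem_solving": 4}

# Case 2: errors in the same direction (+2, +2, +1) add up in the total.
compounding = {"pressure_of_work": 7, "knowledge": 8, "problem_solving": 5}

net_cancelling = total_error(true_scores, cancelling)
net_compounding = total_error(true_scores, compounding)
```

In the first case the total happens to be right even though two factors were mismeasured. In the second the job is over-scored by five points. An observer who sees only the totals cannot tell which case applies.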

. . . . .

In summary, a scientific enquirer would treat the relative scores awarded to jobs on the different factors in a multifactor system with a great deal of scepticism. There is no good reason to suppose that the figures are accurate. Even so, this is probably not the major problem with multi-factor job evaluation systems. The great unsolvable problem comes when you try to combine the separate scores awarded for different aspects of a job, in order to come up with a single overall score for the job.

Phase 3: grading

Having selected the criteria for distinguishing between jobs in phase 1, and having made measurements according to these criteria in phase 2, the task in phase 3 is to convert these measurements into a means for ranking jobs. In the case of a single-factor system such as Paterson there is no immediate problem at this stage. Once jobs have been categorised according to their “level of decision-making” (phases 1 and 2), this automatically places them in one of the six bands.

. . . . .

What are the “points of detail” in the Paterson system which can lead to difficulties in phases 2 and 3? First there is the problem of borderline cases, where a grading committee disagrees about which decision-making band a particular job should occupy. This is a problem of measurement (phase 2). But the more important “details”, as far as employees and their representatives are concerned, are probably to do with the sub-grades which are allocated to jobs within a “band”.

. . . . .

As suggested earlier, sub-grading can provide an area for negotiation, precisely because the rules for sub-grading are not clear. But this should not obscure two facts.

Firstly, the area for negotiation has already been decided, because the additional factors now being considered cannot move a job out of its “decision-making” band.
Secondly, when the additional factors are being considered, there is no scientifically valid way of deciding whether one factor, such as work pressure, should carry more weight than another factor, such as the “variety of tasks”.

This difficulty in knowing how to play off different factors against one another only hits the Paterson system at the level of sub-grading. But it is a problem which is right at the heart of multi-factor systems.

Multi-factor systems, which look at more aspects of job content than simply “decision-making”, have a better chance of preserving phase 1 validity. They may be complex and generally less reliable in the measuring phase, but it is in the third phase that they really knock down the hurdle and fall to the ground in a crunch of non-scientific calculations. The problem is this: that having taken measurements of different aspects of a job, you now have to combine these figures to come up with a single score.
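(An illustration of my own, not part of Cowan's article.) The combining problem can be shown with a weighted sum, which is one simple way to turn factor scores into a single figure. It is not necessarily how Peromnes or any other named system does it. The jobs, factor scores and weights below are all hypothetical.

```python
def overall_score(factor_scores, weights):
    """Combine factor scores into one figure using a weighted sum."""
    return sum(weights[factor] * score for factor, score in factor_scores.items())


def rank_jobs(jobs, weights):
    """Return job names ordered from highest to lowest overall score."""
    return sorted(
        jobs,
        key=lambda name: overall_score(jobs[name], weights),
        reverse=True,
    )


# Hypothetical factor scores for two jobs.
jobs = {
    "machine operator": {"work_pressure": 8, "knowledge": 3},
    "clerk": {"work_pressure": 3, "knowledge": 7},
}

# Two equally arbitrary weighting schemes.
favour_pressure = {"work_pressure": 2, "knowledge": 1}
favour_knowledge = {"work_pressure": 1, "knowledge": 2}

ranking_1 = rank_jobs(jobs, favour_pressure)   # operator 19, clerk 13
ranking_2 = rank_jobs(jobs, favour_knowledge)  # operator 14, clerk 17
```

The same measurements produce opposite rankings depending on the weights, and nothing in the measurements themselves says which weights are right. That choice is the value judgment Cowan is pointing to.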

. . . . .

So job evaluation is caught in a trap. Either it starts off with invalid assumptions, or it ends up with an invalid procedure for combining scores on different factors.

This does not really come as a surprise, because we saw at the outset that the aim of job evaluation is to make value judgments. These value judgments may appear to be hidden, by the trappings of scientificity and a confusion of numbers, but like a disappearing scorpion, they will turn up again if you lift up all the stones. Either the non-scientific value-judgments are made at the beginning, in choosing the criteria, or they come back at the end, in making unscientific decisions about how to weight different aspects of a job.”

3 Comments

mimahlangu on May 13, 2019 at 11:50 am

Good day

May I find out on table 1 above (Audit step 3) what do those acronyms/abbreviations stand for – there is AL , AU/BL, BU/CL etc?

Yours Sincerely

Isaac Mahlangu

GilesFiles on May 13, 2019 at 2:26 pm

Many thanks for your question and the explanation is that two frameworks are shown in one table and there are 3 grades within each level. The 7 levels are identified from A to F and the grades are lower (L), mid and upper (U).
So when one adopts 7 as opposed to 6 levels it will be seen that some ‘jobs’ move sideways to ensure that there is a straight-line curve with proportional differentials across the 7 levels.
Trust this all makes sense.

