A QUALITATIVE EXPLORATION OF THE RELATIONSHIP BETWEEN PERFORMANCE APPRAISAL AND THE DEVELOPMENT OF INSPIRED BUSINESS LEADERS

GEORGE P. SILLUP

St. Joseph’s University

Haub School of Business

Assistant Professor, Department of Pharmaceutical Marketing

Fellow, Pedro Arrupe Center for Business Ethics

5600 City Avenue

Philadelphia, PA 19131-1395

(610) 660 – 3443

Fax: (610) 660 – 1229

E-Mail:

May 31, 2006

The author wishes to acknowledge the financial support from the Saint Joseph’s University Pedro Arrupe Center for Business Ethics and the intellectual guidance from John McCall and Stephen J. Porth.

A Qualitative Exploration of the Relationship between Performance Appraisal and the Development of Inspired Business Leaders

Abstract

A review of the literature and a summary of structured interviews with employees of four Fortune 100 companies (Aetna, IBM, Johnson & Johnson and Wyeth) were conducted to understand whether, and if so how, performance appraisal (PA) influences the development of inspired business leaders (IBLs) in the US. IBLs were defined as employees who contributed to the fiscal success of their respective companies, and to the establishment and maintenance of those companies’ corporate social responsibility, by exceeding their individual annual objectives. The structured interviews explored the way PA, a systematic process that requires descriptions of job-relevant strengths and weaknesses and uses observation, judgment and solicited feedback to assess performance, helped and/or hindered development of an IBL. Qualitative findings suggest that PA often attributes success to outcomes of behavior (e.g., units sold) rather than to behavior leading to job success (e.g., motivating others to sell units). Furthermore, PA may impair development of IBLs through shortcomings in the PA system’s implementation, such as not training those conducting PA or not allowing performance assessors enough time to conduct PA properly. Results suggest that PA and its implementation can influence development of IBLs and merit further investigation using quantitative analyses.

Table of Contents

Connecting Performance Appraisal and Inspired Business Leaders

Review of the Literature about Performance Appraisal Systems

Research Methodology

Results

Discussion

Concluding Remarks / Implications for Further Research

References

Tables and Figures

Table 1: Behavioral Checklist Example

Table 2: Graphic Rating Scale Example

Table 3: Successful Uses of Performance Appraisal Data

Table 4: Profile of Respondents

Table 5: Frequency of Interview Responses

Figure 1: Balanced Scorecard Example

N.B. Contributors to the content of this paper include employees of Aetna, IBM, Johnson & Johnson and Wyeth Pharmaceuticals, as well as Susan Givens-Skeaton, for her input about managing human resources and introduction to Wayne Cascio’s research, and Christopher Field, for his critical review.

Connecting Performance Appraisal and Inspired Business Leaders

Performance Appraisal (PA) arose as a concept with the emergence of “big” business in the US in the early twentieth century, when employees were considered a means of production. Karl Marx’s work can be said to have pioneered the moral critique of capitalism, subjected capitalistic business enterprise to moral scrutiny and encouraged big business to place some value on an individual employee’s contributions rather than only on the production of its products.1 But progress in raising this concept to an issue for consideration was slow; big business was busy building products for consumption around the defining events of the first half of the last century (World War I, the Great Depression, the New Deal, World War II).

Following World War II, big business began to use PA, usually informally and not as an assessment for all employees. However, the importance of PA grew over the last three decades of the twentieth century. US big business encountered competition from new domestic and global competitors and began to connect individual employee contributions with improving its overall performance. Additionally and importantly, “ethical” treatment of employees was advanced by the civil rights and environmental movements, along with questions about American business practice arising from direct and indirect involvement with the Nixon campaign. These events caused a revolution of rising moral expectations for American business and created a new way of doing business that resulted in numerous legislative acts benefiting the employee, e.g., the Civil Rights Act of 1964 at the national level, Massachusetts’ Right-to-Know Law at the state level and collective bargaining at the union contract level.

Despite this enhanced treatment of employees, US business employees in most instances (e.g., non-union employees) still work “at will,” without contractual assurance of their jobs, based on the right to property.2 An employee is expected to contribute to the business’s success, where success is defined by well-known financial metrics (e.g., ability to generate revenue).3 However, emphasis on these financial metrics often fails to place value on other attributes of performance, such as integrity, or to nurture business practices that help to develop inspired business leaders (IBLs).

Recognizing this potential shortcoming, US companies have embellished PA systems over the past 20 years with questions to assess performance attributes and/or company values (e.g., dependability). These embellishments add complexity and may require more time for implementation. Consequently, businesses may not be allowing sufficient time for implementation of the PA system they are using. Considering that the basis of PA is people making judgments about other people in an organized setting, insufficient time for PA can be unfair to an employee in the short term and jeopardize his/her development as an IBL in the long term.4 A review of the literature about PA systems and the potential impediments to using and implementing them is in order to answer the following research questions:

  • Does implementation of an effective PA system help to develop IBLs?
  • What are the key impediments to using a PA system? This includes impediments before and/or during implementation.
  • How can implementation be improved?
  • What are the implications of this research for the development of IBLs?

Review of the Literature about PA Systems

PA is the systematic description of the job-relevant strengths and weaknesses of an employee or an employee team. PA combines two processes, observation and judgment, to assess job performance on the basis of objective and subjective indices. Objective indices, such as sales data, are appealing because they measure a quantifiable endpoint. However, they often measure factors beyond an employee’s control, or outcomes of behavior (e.g., number of units sold), rather than the job performance behavior itself (e.g., what did it take to sell the units). Subjective indices measure the job performance behavior.

PA is an integral part of performance management (PM), which requires a willingness and commitment to improve the everyday performance of the employee. PM systems provide instantaneous, real-time information that describes the difference between employees’ current and desired courses by giving timely feedback about performance while constantly focusing attention on implementing strategies that will help the business grow.

To work successfully, PM requires that managers do three things well:

  • Define performance (through goals, measures and assessments),
  • Facilitate performance (by identifying obstacles to good performance and providing resources to accomplish objectives),
  • Encourage performance (by providing a sufficient number of rewards that people value in a timely and fair manner).5

As an integral part of PM, PA helps employees improve their work performance and realize their full potential by providing information to employees and managers for use in making work-related decisions. PA is also a feedback process, an organizational intervention, a measurement process as well as an intensely emotional process. Above all, it is an inexact, human process, which, not surprisingly, can be subject to bias during implementation. However, PA can also fail before implementation if any of the following tasks is handled poorly:

  • Setting fair performance standards,
  • Meeting requirements of effective PA systems,
  • Assessing the types of PA systems/determining which one to use,
  • Identifying who should rate performance,
  • Overcoming judgmental biases in rating,
  • Establishing training programs for raters,
  • Determining how often PA should be done,
  • Understanding how PA fits with Total Quality Management.

Setting Fair Performance Standards

To understand how these failures can occur, it is necessary to consider how a business sets its performance standards (PS). As mentioned earlier, PA involves observation and judgment. Observation processes include the detection, perception and recall or recognition of specific behavioral events. Judgment processes include the categorization, integration and evaluation of information.6 Observation and judgment help PA make distinctions among people, especially among people in the same job. PS provide the critical link in the process between job analysis, which identifies the components of a particular job or job description, and PA.

Ultimately, it is management’s responsibility to establish PS at levels of performance deemed acceptable or unacceptable for each of the job-relevant, critical areas of performance identified through job analysis. For some jobs (e.g., production or sales), PS can be set on the basis of quantitative performance measures, such as the number of units sold. For others, such as new product development, setting PS is considerably more subjective and is frequently a matter of manager and employee agreement.7

PS are essential in all types of goods-producing and service organizations because they help ensure consistency in supervisory judgments across individuals in the same job. Unfortunately, it is often the case that charges of unequal treatment and unfair discrimination arise because no clear PS exist.8 Once PS are in place, PA can be done by gathering job performance information using observation and by evaluating the adequacy of individual performance using judgment.

Meeting Requirements of Effective PA Systems

The requirements of effective PA systems are relevance, sensitivity, reliability, acceptability and practicality. Relevance implies there are clear links between up-to-date PS for a particular job and an organization’s objectives and between the critical job elements identified through job analysis and the dimensions to be rated. In short, relevance is determined by answering the question “What really makes the difference between success and failure in a job according to the customer?” The customer may be internal (e.g., your manager) or external (e.g., the recipient of the business’s product or service). In all cases, it is important to pay attention to what the customer believes is important (e.g., on-time delivery).9

Sensitivity implies that a PA system is capable of distinguishing effective from ineffective performance. If it cannot, and the best employees are rated no differently from the worst employees, then the PA is not useful. PA that lacks sensitivity will not help employees develop and will undermine the motivation of both managers (who will be likely to view PA as pointless paperwork) and employees. Such measures include acknowledgement of different endpoints of factors in one’s work environment.

Reliability is the third requirement of sound PA and refers to consistency of judgment. For any given employee, PA by raters working independently of one another should agree closely. Raters with different perspectives (e.g., managers, peers and employees) often view the same employee’s job performance very differently.10 To provide reliable data, each rater must have an adequate opportunity to observe what the employee has done and the conditions under which he or she has done it; otherwise unreliability may be confused with unfamiliarity. Note that there has been no mention of the validity or accuracy of appraisal judgments because there is no thoroughly objective “truth” in PA. By making PA systems relevant, sensitive and reliable, the resulting judgments can be considered valid as well.

In practice, acceptability is often considered the most important requirement of all. Any PA system must have the support of those who use it, or else employees will undermine it. Unfortunately, many businesses have not put enough effort into garnering the front-end support and participation of those who will use the PA system. This is also true for practicality, which requires that PA systems be easy for managers and employees to understand. Consequently, PA systems often do not work because they were designed with limited input from managers and even less input from the employees.11

In a broader context, relevance, sensitivity and reliability are simply technical components of a PA system designed to make decisions about employees. Using a PA system that ensures consistent evaluation of employees’ performance will help the PA system’s acceptability. However, because some degree of error is possible in all employment decisions, determining the optimal PA system to use will result in the greatest benefit for the business and its employees.

Assessing the Types of PA Systems / Determining Which One to Use

There are two primary types of PA systems: Behavior-Oriented Rating Methods and Results-Oriented Rating Methods. The type selected should depend on what the system needs to accomplish; this requires a strategy for the management of performance. A study of work motivation reinforced a fairly well-established principle: “things that get rewarded get done.” At least one author has termed this “the greatest management principle in the world.”12 So, a fundamental issue for managers is “What kind of behavior do I want to encourage in my employees?” If employees are rewarded for generating short-term results (e.g., sales during a business quarter), then they will work towards short-term results. If they are rewarded for achieving long-term results (e.g., generating repeat business), then they will aspire to those instead. To be most effective, however, the manager must think strategically so that, when all of her/his employees’ objectives are completed, the business gains competitive advantage in its market (e.g., faster delivery to customers, higher quality at lower cost).13 With the strategic intent of PA in mind, a review of key Behavior-Oriented and Results-Oriented Rating Methods follows.

Behavior-Oriented Rating Methods

There are several types of Behavior-Oriented Rating Methods. Narrative Essay is the simplest type of PA rating system.14 In this method, a rater describes in writing an employee’s strengths, weaknesses and potential, together with suggestions for improvement. This approach assumes that a candid statement from a rater who is knowledgeable about an employee’s performance is just as valid as more formal and complicated rating methods. When done well, Narrative Essays can provide detailed feedback to employees regarding their performance. However, when essays are totally unstructured and vary widely in length and content, comparisons across employees or departments are almost impossible, because different essays touch on different aspects of each employee’s performance and do not objectively compare employees relative to one another.

Ranking is the next Behavior-Oriented Rating Method. Simple Ranking requires only that a rater order all employees from highest to lowest or from “best” employee to “worst” employee. Alternation Ranking requires that a rater initially list all employees. From this list he or she first chooses the best employee (No. 1), then the worst employee (No. n), then the second best (No. 2), then the second worst (No. n - 1), and so forth, alternating from the top to the bottom of the list until all employees have been ranked. Both simple and alternation ranking implicitly require a rater to compare each ratee with every other ratee, but neither method makes that ratee-to-ratee comparison systematically.
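The alternation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the employee names and the numeric scores are hypothetical, and the scores merely stand in for a rater’s overall judgment, which in practice is not a number.

```python
def alternation_rank(scores):
    """Return names ordered from rank 1 (best) to rank n (worst).

    `scores` is a hypothetical stand-in for the rater's overall judgment:
    a dict mapping each employee name to a number, higher meaning better.
    """
    remaining = list(scores)
    top, bottom = [], []
    pick_best = True
    while remaining:
        if pick_best:
            choice = max(remaining, key=scores.get)
            top.append(choice)       # fills ranks 1, 2, 3, ...
        else:
            choice = min(remaining, key=scores.get)
            bottom.append(choice)    # fills ranks n, n - 1, n - 2, ...
        remaining.remove(choice)
        pick_best = not pick_best    # alternate between top and bottom
    return top + bottom[::-1]


# Illustrative data only.
ratings = {"Avery": 7, "Blake": 3, "Casey": 9, "Devon": 5, "Ellis": 1}
ranking = alternation_rank(ratings)
```

With these made-up scores the rater would pick Casey (No. 1), then Ellis (No. 5), then Avery (No. 2), then Blake (No. 4), leaving Devon in the middle.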

To get a ratee-to-ratee comparison, Paired Comparisons need to be used. Each employee is compared with every other employee, usually in terms of an overall category such as “current value to the organization.” The number of pairs of ratees to be compared may be calculated from the formula n(n - 1) ÷ 2. For example, if 10 individuals were being compared, (10 × 9) ÷ 2, or 45, comparisons would be required. The rater’s task is simply to choose the “better” of each pair, and each employee’s rank is determined by counting the number of times she or he was rated superior. However, the results may be questioned because comparisons made in terms of a single overall suitability category (that is, “Who is better?”) lack behavioral specificity.15 On the other hand, paired comparisons are easy to explain and are helpful in making personnel decisions. They also provide useful data in validation studies, for they effectively control leniency, severity and central tendency bias.
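A minimal Python sketch of the paired-comparison arithmetic and tally follows. The function names, the employee names and the `value` scores are assumptions introduced for illustration; `prefer` stands in for the rater’s choice of the “better” member of each pair.

```python
from itertools import combinations


def pair_count(n):
    """Number of distinct pairs among n ratees: n(n - 1) / 2."""
    return n * (n - 1) // 2


def paired_comparison_ranking(employees, prefer):
    """Rank employees by how often each is judged the better of a pair.

    `prefer(a, b)` is a hypothetical stand-in for the rater's choice and
    must return whichever of the two names the rater considers better.
    """
    wins = {name: 0 for name in employees}
    for a, b in combinations(employees, 2):
        wins[prefer(a, b)] += 1
    ranked = sorted(employees, key=lambda name: wins[name], reverse=True)
    return ranked, wins


# Illustrative data only: a made-up "current value to the organization".
value = {"Avery": 7, "Blake": 3, "Casey": 9, "Devon": 5}
ranked, wins = paired_comparison_ranking(
    list(value), lambda a, b: a if value[a] > value[b] else b
)
```

Here `pair_count(10)` reproduces the 45 comparisons mentioned in the text, and the four made-up employees require six comparisons. Because the count grows roughly with the square of the group size, the method becomes laborious for large groups.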

Forced Distribution is another method of comparing employees with one another. As the name implies, the overall distribution of ratings is forced into a normal, or bell-shaped, curve under the assumption that a relatively small portion of employees is truly outstanding, a relatively small portion is unsatisfactory and everybody else falls in between, with ratings sorted into five categories based on a normal distribution curve. Its primary advantage is that it controls leniency, severity and central tendency biases rather effectively. However, it assumes that ratees conform to a normal distribution. This has the potential to introduce a great deal of error if a significant group of ratees is either outstanding or unsatisfactory.
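The following Python sketch illustrates how a forced distribution might assign ratees to five categories. The paper does not state the category shares or the names of the middle categories, so the 10/20/40/20/10 percent split and the labels “above average,” “average” and “below average” are assumptions made only for this example, as are the function and employee names.

```python
from collections import Counter
from itertools import accumulate

# Assumed labels and percentage shares (not taken from the paper).
LABELS = ("outstanding", "above average", "average",
          "below average", "unsatisfactory")
SHARES = (10, 20, 40, 20, 10)  # must sum to 100


def forced_distribution(ranked_best_first, shares=SHARES, labels=LABELS):
    """Assign each ratee a category from his or her position in the ranking."""
    n = len(ranked_best_first)
    cutoffs = list(accumulate(shares))  # cumulative percentages
    placed = {}
    for i, name in enumerate(ranked_best_first):
        # Midpoint of this ratee's slice of the ranking, as a percentage.
        percentile = (i + 0.5) * 100 / n
        for cutoff, label in zip(cutoffs, labels):
            if percentile < cutoff:
                placed[name] = label
                break
    return placed


# Ten hypothetical ratees, already ordered from best to worst.
staff = [f"E{k}" for k in range(1, 11)]
placed = forced_distribution(staff)
counts = Counter(placed.values())
```

With ten ratees the assumed shares yield one outstanding, two above average, four average, two below average and one unsatisfactory rating, regardless of how well the group actually performed. That is the source of error noted above when many ratees are in fact outstanding or unsatisfactory.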

Another Behavior-Oriented Rating Method is the Behavioral Checklist, in which raters are provided a series of statements that describe job-related behavior. The rater’s task is simply to “check” which of the statements describes the employee’s job-related behavior. In this approach, raters are more reliable as reporters of job behavior than as evaluators of job behavior.16 Table 1 below is an example of the Likert method of summated ratings for one declarative statement, “She or he follows up on customer complaints.” It is followed by categories, such as “always,” “very often,” “fairly often,” “infrequently” and “never.” The rater checks the response category that he or she thinks best describes the employee’s performance for that statement. Each category is weighted, for example, from 5 (“always”) to 1 (“never”), and an overall numerical rating (or score) for each employee is then derived by summing the weights of the responses that were checked for each item.

Table 1: Behavioral Checklist Example

Job-Related Behavior / Always (5) / Very Often (4) / Fairly Often (3) / Infrequently (2) / Never (1)
Good follow-up on customer complaints / [ ] / [ ] / [ ] / [ ] / [ ]

There can be several problems with summated rating scales, e.g., selection of response categories is often arbitrary. Additionally, behavioral checklists have no control for leniency but do not have a significant problem with halo effect because they only yield an overall summary rating. It is also difficult for a rater to give diagnostic feedback based on checklist ratings because they are not cast in terms of specific behaviors. On balance, practicality probably accounts for their widespread popularity.
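The summated rating described for the Behavioral Checklist can be sketched as follows. The weights of 5 (“always”) through 1 (“never”) come from the text; the function name and the second and third checklist items are hypothetical and are included only so that there is more than one response to sum.

```python
# Category weights as described in the text: 5 ("always") to 1 ("never").
WEIGHTS = {
    "always": 5,
    "very often": 4,
    "fairly often": 3,
    "infrequently": 2,
    "never": 1,
}


def summated_rating(checked):
    """Sum the weights of the response checked for each checklist item."""
    return sum(WEIGHTS[response] for response in checked.values())


# One employee's checked responses (illustrative data only).
responses = {
    "Follows up on customer complaints": "very often",
    "Returns customer calls promptly": "always",       # hypothetical item
    "Documents each customer complaint": "infrequently",  # hypothetical item
}
score = summated_rating(responses)  # 4 + 5 + 2
```

The single total illustrates the feedback problem noted above: a score of 11 does not reveal which behaviors were rated high or low.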