To align employees’ interests with the firm’s goals, employers often use performance-based pay, but designing such a compensation plan is challenging because performance is typically multifaceted. For example, a sales employee should be incentivized to sell the company’s product, but a focus on current sales without rewarding the salespeople according to the quality of the product and/or customer service may result in fewer future sales. To solve this problem, firms often increase the number of metrics by which they evaluate their employees, but complex compensation plans may be difficult for employees to understand.
Performance-based pay can increase productivity.
In response to additional performance metrics, employees adjust their behavior because they are no longer incentivized to focus on only a few measurable and rewarded tasks.
Nonlinear incentive plans attract highly motivated employees.
With nonlinear incentive plans, less productive employees are motivated to match the productivity levels of their more productive peers.
Experienced staff will be more likely to understand how to respond to multiple performance metrics.
In the absence of a broad range of metrics, employees will only focus on those tasks that are rewarded at the expense of other tasks that may also be important to the firm.
Employees can attempt to “game” the system in pursuit of rewards, which can be costly to the firm.
Adding more performance metrics may make the reward system too complicated for employees to understand.
When subjective performance evaluations are used to reward hard-to-measure tasks, the firm is likely to incur significant costs managing this process.
When multiple performance metrics are used, firms incur costs in the process of determining the appropriate weight for each metric.
Author's main message
Pay-for-performance plans should reward all dimensions of an employee’s performance. If the firm only measures and rewards a subset of dimensions, employees will devote their efforts to those that are rewarded and ignore the others. This “multitasking” problem is more likely to arise in “knowledge jobs,” which place a heavier emphasis on “non-routine” problem-solving tasks (e.g. managers, software engineers, lawyers, physicians, and academics). However, as firms design more comprehensive compensation plans, training programs will likely be necessary to help employees understand the metrics and how their behaviors can influence them.
Pay for performance is increasingly used to align employees’ interests with those of the firm. Classic examples include commissions for salespeople and piece rates for manufacturing workers. However, very few jobs can be measured along a single dimension. In the case of salespeople, rewarding them for sales rather than profits may reduce profits if they are permitted to give discounts. Paying a manufacturing worker by the piece may induce the worker to stint on quality. Economists refer to these examples as the multitasking problem or the “you get what you pay for” problem, in that the incentive pay scheme is based on a subset of the tasks the worker actually performs. Empirical evidence has shown that this problem exists in a wide variety of settings (manufacturing, health care, education, professional service firms). Multitasking problems are more likely to arise in “knowledge jobs.” These are jobs in which the required tasks are ambiguous ex ante and employees must exercise discretion and be creative about what they do and how they do it. Examples of such jobs are managers, software engineers, scientists, lawyers, physicians, and academics.
Theoretically, the multitasking problem can be solved by increasing the number of metrics by which an employee’s performance is measured and rewarded. However, for this to work effectively, the firm needs a methodology to measure all of the worker’s tasks, and then must decide on the weights to assign the various metrics. In some cases, it can be very difficult to measure performance across all relevant tasks, and even if the tasks can be measured, adding more metrics may make the evaluation system too complicated for employees to understand.
Discussion of pros and cons
Employees and firms respond to metrics
Many studies have found that a pay-for-performance system can increase performance. For example, in the 1990s, the Safelite Auto Glass Company found that its windshield installers were installing 2.5 windshields in an eight-hour day, even though it only took about 1–2 hours to install a windshield. The company instituted a performance pay plan whereby the installer was paid a piece rate for each installed windshield; one year after the plan was instituted, productivity rose by over 40%. However, the company found that while productivity rose, the quality of the installations fell.
Similar experiences have occurred at other companies; for example, one study introduced experimental treatments at five Chinese firms that manufactured electronics. Before the intervention, some factories paid their workers a flat hourly wage, while the others paid piece rates. The experimental treatment involved offering workers monetary incentives based on productivity in addition to their flat or piece rate base salaries; the control group did not receive monetary incentives. This study found evidence of multitasking; productivity increased as a result of the monetary incentive, but so did the defect rate. Interestingly, this quality–quantity trade-off was only present for the workers whose base salary was a flat rate, which suggests that it is difficult to generalize results based on the impact of the introduction of incentives. The authors speculated that their finding could reflect the fact that workers who were paid a piece rate were already producing near their productivity frontiers, while the flat rate workers had much more room for productivity increases. Another example of multitasking comes from Australian manufacturing workers, who offered less help to co-workers when promotion decisions emphasized individual performance.
Multitasking is a particularly serious problem when a job combines tasks that are easy to measure with tasks that are hard to measure. An elementary school teacher offers a good example. Teachers often focus on two main tasks: making sure their students are at “grade-level” in reading and mathematics, and inspiring students to love learning for its own sake. Measuring the teacher’s performance on the first task is relatively easy; at the end of each school year, students should be given a reading and mathematics test. The school’s principal need only compare the students’ test scores at the end of the year to their scores at the end of the previous academic year; the difference would measure the teacher’s performance on the first task. By contrast, measuring the teacher’s performance on the second task, i.e. “love of learning,” is almost impossible. As such, when tracking and rewarding teacher performance, only the first task is measured. For teachers in this situation, the incentive is to “teach to the test,” potentially causing them to neglect or ignore entirely the second task. In general, when a job combines tasks that are easy to measure with tasks that are hard to measure, firms should consider whether it might be preferable not to use incentives at all because of the risk that the incentive plan will encourage the employee to ignore the hard-to-measure task.
There are many other examples of the multitasking problem. One example from the health care sector is based on the Nursing Home Quality Initiative (NHQI), which was launched by the Centers for Medicare and Medicaid Services (CMS) in 2002 in the US. CMS chose to publicly report only a subset of the quality measures that were tracked as part of its random inspection program of CMS-certified nursing homes. The fact that NHQI only reported a subset of the quality measures suggests that multitasking could play a role in understanding the impact of the mandated information disclosure on nursing home quality. For example, one study found that quality measures along the NHQI-reported dimensions improved, but along the NHQI-unreported dimensions there was a significant deterioration (Figure 1). In other words, the nursing homes reallocated resources across dimensions of quality, based on whether NHQI did or did not disclose those specific quality measures. However, due to insufficient data, the study was unable to conclude whether this reallocation was harmful to consumers. Another study of childcare providers found that regulations that instituted quality requirements resulted in a reduction in staff wages that could have negative effects on the quality of the services provided.
Employees can game the system
Employees often manipulate incentive systems to maximize their compensation. When employers use non-linear period-based compensation systems, employees have an incentive to engage in “timing gaming.” An example of a non-linear period-based compensation system is when salespeople are compensated according to an accelerated commission scale, whereby the commission for the same deal could be very different depending on the quarter in which the deal closes. The commission depends not only on the size of the deal but also the monetary value of the deals that have already been closed in the quarter.
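The dependence of a deal's commission on quarter-to-date sales can be illustrated with a minimal sketch. The tier thresholds and rates below are hypothetical assumptions chosen purely for illustration; they are not taken from any firm or study discussed in this article.

```python
# Hypothetical accelerated commission schedule:
# (quarter-to-date sales threshold, marginal commission rate).
# These tiers are illustrative assumptions, not figures from any study.
TIERS = [(0, 0.02), (250_000, 0.05), (500_000, 0.10)]


def commission_on_deal(quarter_to_date: float, deal_size: float) -> float:
    """Commission earned on one deal, given sales already closed this quarter."""
    start = quarter_to_date
    end = quarter_to_date + deal_size
    total = 0.0
    for i, (lower, rate) in enumerate(TIERS):
        # The upper bound of a tier is the next tier's threshold (or unbounded).
        upper = TIERS[i + 1][0] if i + 1 < len(TIERS) else float("inf")
        # Portion of the deal that falls inside this tier.
        overlap = max(0.0, min(end, upper) - max(start, lower))
        total += overlap * rate
    return total


# The same deal, closed at two different points in a quarter.
slow_quarter = commission_on_deal(quarter_to_date=0, deal_size=100_000)
strong_quarter = commission_on_deal(quarter_to_date=500_000, deal_size=100_000)
```

Under these assumed tiers, the same 100,000 deal earns a commission of 2,000 if it is the first deal of the quarter but 10,000 if it closes after 500,000 in prior sales. A gap of this kind is what makes the timing of a deal worth manipulating.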
A study of a leading enterprise software vendor found that the closure of deals within a given quarter was not smooth (Figure 2). Rather, there was a large spike in completed deals on the last day of the quarter, as well as smaller spikes early in the quarter and in its last few weeks. These spikes were correlated with the average discount offered by salespeople to customers. At the start and end of the quarter, discounts were above 35%, whereas in the middle of the quarter they were below 30%. Salespeople were incentivized to offer larger discounts at the end of each quarter because their commission would be higher since they had already closed deals in that quarter. Similarly, if the salesperson was having a bad quarter, he/she had an incentive to “push” deals into the subsequent quarter. As a result of this “timing gaming,” the vendor in this study experienced a 6–8% loss of revenue because prices were lower than they needed to be.
Despite the cost of “timing gaming” to companies, non-linear compensation systems are commonly used for salespeople. Research in behavioral economics suggests that this may be due to behavioral biases; an experimental study found that non-linear systems attract and retain highly motivated subjects who are top performers. In addition, non-linear compensation systems have the advantage of incentivizing less productive employees to strive to achieve the levels achieved by top performers. Employers thus face a dilemma in weighing the costs and benefits of these compensation plans.
Adding metrics changes employees’ behavior
Gaming by employees can be addressed by adding metrics to the compensation scheme. The purpose of these additional metrics is to address unintended drawbacks associated with pay-for-performance schemes (e.g. quality reductions), or to better align firms’ and employees’ goals. For example, in the case of Safelite, quality problems were addressed by requiring the installers to repair defective installations before they could take on new jobs.
In another case, an international law firm faced a multitasking problem when it paid its partners solely based on their billable hours. The partners were not incentivized to share billable hours with members of their team or to spend time on non-billable activities that would benefit the firm, such as presenting at conferences, attending firm meetings, and training and mentoring associates. The firm revised its compensation plan to address this multitasking problem by reducing the commission for billable hours and introducing a bonus that included objective and subjective metrics that measured a variety of non-billable activities. In response to the new compensation plan, the partners significantly increased their non-billable hours and decreased their billable hours. As the partners spent more time on non-billable leadership activities, billable work was shifted to team members. The researchers found that profits generated by team members rose in the short term; however, they were unable to estimate the long-term impact of the new compensation plan on profitability. While law firms have historically relied on billable hours to compensate their partners, today, fewer than 10% use this formula approach. This dramatic change demonstrates how law firms are responding to the multitasking problem.
Another example is that of physicians in Quebec who were originally paid on a fee-for-service basis. When their compensation plan was changed to a mixed system (combining a fixed fee per day with a partial fee for services provided), they decreased their billable services but increased the average time spent per service, i.e. providing better quality care to their patients.
Subjective performance evaluations
Some firms use subjective performance evaluations to solve the multitasking problem. This is especially true when some components of performance are difficult to measure. For example, in evaluating the performance of a manager or team leader, senior management might want to measure the person’s skill as a mentor of junior colleagues. To do this, information about the quality of the mentorship could be gathered from the mentees themselves. Another component of performance that might require subjective evaluation is teamwork, in which case, members of the team could be asked to evaluate how well the employee in question functions within the team. While subjective performance evaluations may enable the firm to reward hard-to-measure tasks, they have significant drawbacks. First, they are very time-consuming; senior management needs to gather data from many individuals and process this data to come up with a summary recommendation. Second, the employees being evaluated will have an incentive to spend time developing good relationships with those evaluating them, rather than spending time on their jobs. Third, they can have a detrimental impact on employee morale, as employees may question the fairness and accuracy of the evaluation process. Subjective performance evaluations are thus more likely to be successful when the employees trust the managers who are evaluating them.
Many large organizations evaluate performance at the group level (e.g. business unit, branch, or team) and often use a multidimensional performance measurement system called the “balanced scorecard.” With the balanced scorecard, the group’s performance on a large number of metrics is monitored and rewarded. Proponents of the balanced scorecard argue that it helps managers to improve performance by monitoring and rewarding a range of activities that cover multiple perspectives (such as financial, customer, internal, and innovation/learning). However, the balanced scorecard may not be an effective incentive mechanism because it provides no information on how managers and workers should trade off different objectives.
One study examined a multinational distributor of heating and plumbing products that introduced a balanced scorecard in one of its UK divisions. The researchers compared the performance of this division to that of another division that had similar characteristics but did not employ the balanced scorecard. Prior to introduction of the balanced scorecard, employees in both divisions received a bonus that was based on branch-level profits. Managers were incentivized to adjust the timing of capital investments to affect bonus payments, and they had an incentive to compete for business against other local branches within their own division. After the first division adopted the balanced scorecard, the performance of each of its branches was evaluated using 16 metrics that were designed to encourage workers to put more effort into activities that were previously not rewarded but had a positive impact on firm-level long-term profits. The list of metrics included both financial and nonfinancial indicators of performance. Examples of nonfinancial indicators were customer satisfaction, customer retention, employee satisfaction, and employee retention. The design of the balanced scorecard was supervised by senior management who consulted regional and branch managers for their input. The fact that so many metrics were included in the final scorecard indicates that the design committee could not reach consensus on a small number of metrics. In addition, the design committee decided that each of the 16 metrics would have equal weights in calculating an overall measure of branch performance, which, in turn, would determine the monetary bonus amount offered to employees.
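To make the equal-weighting rule concrete, the sketch below computes an overall branch score as a weighted average of metric scores. It is only an illustration: the function name, the placeholder metric names, the common 0–1 scale, and the sample values are all assumptions, and the study's actual scoring formula is not reproduced here.

```python
from typing import Dict, Optional


def scorecard_score(scores: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted average of metric scores, each assumed to lie on a 0-1 scale."""
    if weights is None:
        # Equal weighting: every metric counts the same toward the total.
        weights = {name: 1.0 for name in scores}
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical branch with 16 metrics, all scored at 0.5 (placeholder names).
baseline = {f"metric_{i}": 0.5 for i in range(1, 17)}
improved = dict(baseline, metric_1=1.0)  # one metric raised to its maximum

gain = scorecard_score(improved) - scorecard_score(baseline)
```

With 16 equally weighted metrics, moving any single metric from 0.5 to 1.0 raises the overall score by the same amount, 0.03125. The formula therefore treats every metric as equally valuable and offers no guidance on which objective deserves effort when objectives conflict.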
The researchers found that, compared to the division that did not have the balanced scorecard, sales increased in the division with the balanced scorecard, but costs also increased, so profits did not rise. However, further analysis showed that profits did increase in branches that were run by more experienced managers (Figure 3). This result is important because it demonstrates that multidimensional performance measurement systems can have a positive effect on a firm’s financial performance if managers and staff understand how to act on them. Interviews allowed the researchers to determine specific ways that the experienced managers overcame some of the difficulties with the balanced scorecard. This finding suggests that more experienced managers may be able to train less experienced managers in how to respond to multidimensional performance measurement systems.
Limitations and gaps
The studies discussed in this article draw on data from individual companies. While many of the researchers believe that their findings are generalizable, this cannot be stated with certainty. More research should be conducted on the existence and impact of multitasking across a broader range of firms and industries, with a particular focus on knowledge workers. To accomplish this, researchers need access to detailed compensation and performance data from participating firms; however, firms may be reluctant to share such data for fear of confidentiality breaches. Many of the studies discussed here addressed these concerns by employing rigorous data security procedures that enabled the researchers to guarantee confidentiality to participating firms. Academics and human resource professionals should work together to ensure that the interests of both parties can be met satisfactorily. Moreover, researchers must be able to collect the data needed to evaluate the actual impact on firm profits, which has been a persistent challenge in many studies.
Another concern with the existing research is whether the decision to introduce new compensation plans to deal with multitasking problems was exogenous, i.e. not related to any characteristics of the employees, the department, or the division. If there is a correlation between the introduction of the new compensation plan and any of these characteristics, it will be very difficult to attribute the observed change in performance to the introduction of the plan. For example, suppose the new compensation plan was introduced at the same time that other organizational changes were introduced, e.g. a new training program, a new approach to recruiting, or new reporting relationships. In cases like this, any observed changes in performance could also be due, at least partially, to the other changes taking place in the firm. Ideally, the response to a new compensation plan should be studied in a quasi-experimental context where the researchers can identify a suitable control group and use a difference-in-differences methodology. With this methodology, trends prior to and after a change in the compensation plan for both the control and experimental groups can be studied. The balanced scorecard study did exactly this, as did the study on Chinese manufacturers. However, additional studies conducted in alternative contexts are needed.
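The logic of a difference-in-differences comparison can be sketched in a few lines. The function and the productivity figures below are hypothetical and serve only to show the arithmetic; the estimate is credible only under the assumption that, absent the new plan, both groups would have followed parallel trends.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group's mean minus change in the control group's mean."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical productivity data (units per worker), invented for illustration.
treated_pre = [100.0, 102.0, 98.0]    # division that later adopts the new plan
treated_post = [112.0, 110.0, 114.0]
control_pre = [99.0, 101.0, 100.0]    # comparable division with no plan change
control_post = [104.0, 106.0, 105.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
```

In this invented example the treated group improves by 12 units and the control group by 5, so the estimated effect of the plan is 7 units. The control group's change stands in for whatever would have happened to the treated group anyway, such as firm-wide changes that coincided with the new plan.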
Summary and policy advice
Firms frequently face a multitasking or “you get what you pay for” problem, which arises when a performance pay plan only rewards employees for a subset of their jobs’ tasks. In theory, firms could address this issue by using multiple metrics to measure and reward all the dimensions of an employee’s performance. However, in practice there are two main challenges. The first is identifying all of an employee’s performance dimensions; the second is that firms must be able to actually measure each dimension in order to compile a complete portfolio of metrics. Information technology will make it easier for firms to measure some tasks, but it will likely remain difficult to accurately measure all performance dimensions. This is most applicable when it comes to knowledge workers, for whom the tasks that the firm values may be ambiguous ex ante (e.g. managers, software engineers, scientists, lawyers, physicians, and academics). In this situation, pay-for-performance compensation schemes may not be appropriate and the firm may prefer that a larger share of the compensation package be in the form of a fixed salary.
Moreover, even if a firm can measure all relevant components of an employee’s performance, a compensation system with many metrics may be too difficult for employees (especially those with less experience on the job) to understand and respond to. Both employees and managers will require training in understanding how a metric is actually measured and how their behavior can influence the metrics. Hence, as compensation plans become more complex, firms will need to invest in training and education programs to ensure that the intended results of the new plans come to fruition.
The author thanks two anonymous referees and the IZA World of Labor editors for many helpful suggestions on earlier drafts. Previous work of the author contains a larger number of background references for the material presented here and has been used intensively in all major parts of this article.
The IZA World of Labor project is committed to the IZA Guiding Principles of Research Integrity. The author declares to have observed these principles.
© Ann P. Bartel