The Pros and Cons of Continually Assessing Performance

Summary:

As AI rapidly changes the division of labor between people and machines, organizations need ways of understanding employee capabilities that are more dynamic than just job titles, résumés, and periodic reviews. Continuous-assessment systems use signals generated during everyday work to track how skills are applied, how tasks are shifting to AI, and where new capabilities are emerging.

For much of aviation history, pilot training programs were geared toward two outcomes: the accumulation of a prescribed number of hours in the cockpit, and certification to fly a certain kind of aircraft. Airlines treated those outcomes as hard-earned evidence of competence.

As cockpits have become increasingly automated, however, modern airlines have had to develop new ways of understanding capability. Knowing that pilots have logged thousands of flight hours, for example, tells airlines little about how those pilots might perform during edge cases that involve conditions of uncertainty. What counts in those cases is how pilots perceive risk and make judgments. To gauge those capabilities, airlines now use advanced data-monitoring systems, which capture thousands of signals during every flight and use them to detect patterns in decision-making, reaction time, and procedural adherence under changing conditions. Airlines now assess pilots continuously, in other words, based on how they actually fly.

The changes we’re seeing in airline cockpits are a preview of what many organizations will soon need across roles and industries: systems of continuous assessment that monitor performance, detect patterns and problems, and enable learning in real time. In the years ahead, for example, we may see something similar happen in the operating room. AI systems already analyze endoscopic video to track instrument movement and flag deviations in technique while a procedure is underway. This points toward a future where hospitals assess how surgeons actually operate, case by case, rather than certifying competence once and assuming it holds.

In this article, we’ll discuss the risks and benefits of creating such systems, and we’ll offer some preliminary guidance on how to create them.

The Risks and Rewards Are Real

To many workers, a system that monitors the flow of everyday work can easily feel less like a learning architecture and more like a system of surveillance. That dynamic surfaced recently at Meta after the company began installing software on the computers of its U.S.-based employees to capture mouse movements, clicks, keystrokes, and occasional screenshots. Meta said the data would be used to train AI systems rather than evaluate employee performance, but the initiative triggered substantial internal backlash and demonstrated how quickly systems of continuous assessment in the workplace can lose legitimacy when employees experience them as mandatory, opaque, and extractive.

Such systems can also lead to a kind of organizational myopia, in which employees optimize for what is measured rather than what matters. Organizations instead need to build in room for employees to experiment, develop new skills, and explore approaches that have not yet been captured in the data. If the system is designed well, continuous assessment won’t constrain behavior but instead will create a clearer environment in which people can learn.

“The goal cannot be surveillance for surveillance’s sake,” we were told by Carrol Chang, the CEO of Andela, a global tech-talent marketplace, who prior to that role spent eight years at Uber as the global head of driver and courier operations, building deep experience with distributed workforces at scale. The next generation of assessment systems, she argued, will move beyond measuring output toward understanding how people work with AI in practice. “The healthiest systems,” she said, “use assessment to support growth, mentorship, and adaptation. If workers only experience measurement without support, organizations create fear. If assessment is paired with coaching, reskilling, and transparency, people are much more willing to engage with change.”

And change they must, given how rapidly AI is being adopted in the workplace. To contextualize what’s happening and what the implications are, let’s consider a little history.

Organizations used to treat jobs as fixed containers for work, even as the work itself grew more varied. The job-based model rested on the idea that you could assign people to predetermined bundles of tasks and update those bundles only occasionally. By the 1990s and 2000s, as more knowledge work became organized around projects and cross-functional initiatives, the traditional job-and-manager model began to show its limits. Work increasingly required specific combinations of skills, but companies lacked an internal market that could reveal who had those skills, who was available, and where they could be best deployed. The system could no longer connect capability to opportunity efficiently.

The idea of the skills-based organization emerged to correct this failure. Companies began cataloguing the underlying skills that powered their operations—an approach that provided a more granular representation of capability and allowed organizations to start matching work to employees with the relevant skills, no matter what their job title, making it easier to assemble cross-functional teams.

The skills-based model succeeded for a while, because capability requirements remained stable long enough to guide decisions. But with the advent of AI, that stability is now disappearing. As AI keeps improving, the division of labor keeps shifting. Machines are taking over a growing number of tasks that once required human expertise, and organizations now have to constantly reassess which human capabilities matter, and why. Not only that, once machines take over skills, those skills become easy to commoditize and make widely available—which means that companies need to find new sources of advantage. A skill that once held its value for years can now be devalued in a single product cycle if a competitor learns faster, or if the provider of an AI tool absorbs that skill and commoditizes it.

This creates a new kind of organizational problem: To succeed today, companies have to sense how capabilities are changing during the actual performance of work, and they need to use what they learn to match employees accurately to the tasks they’re best suited to do. Continuous assessment may offer a powerful solution to this problem.

Three Necessities

Some companies are already experimenting with continuous assessment in small ways. Individually, these initiatives may not look like continuous assessment. Many are positioned as productivity, training, staffing, or workflow-improvement efforts rather than assessment systems. But once these components are considered together, they point toward the architecture of a more mature system, in which assessment is based less on periodic review and more on continuously captured evidence from actual work.

We’ve identified three main actions that will be necessary to make such a system work:

1. Change what you treat as evidence of capability. What matters here are signals from actual work. To assess performance, organizations can no longer rely on periodic reviews or self-reports. Instead, they have to continuously monitor such real-time signals as code commits, customer calls, collaboration patterns, and tool usage.

That’s the job done by Microsoft’s Skills Agent, launched as part of the company’s People Skills system. The tool captures information across emails, documents, meetings, chats, and collaboration patterns and uses it to infer what people are actually working on, what expertise they are applying, who they collaborate with, and how their capabilities are evolving. Those signals are then mapped against Microsoft’s skills taxonomy, powered in part by LinkedIn’s Skills Graph, which captures relationships among more than 39,000 skills. This creates a dynamic employee-skill profile that constantly updates as new work is performed. The system doesn’t offer a perfect view of capability, of course, nor is it a substitute for human judgment. But it marks a fundamental shift: from periodically recording someone’s self-declared skills to continuously inferring what capabilities they demonstrate in the flow of work.

2. Analyze work at the level of individual tasks. Continuous sensing—that is, the ongoing capture of work signals at the individual and workflow level—is what makes continuous assessment possible by enabling organizations to better understand how work is being distributed and reconfigured between human employees and AI. This helps determine who is adapting well to AI and which skills are being absorbed by tools. If AI can perform a task that once carried a skill premium, the organization needs to know that quickly. Otherwise, it will continue hiring and organizing around capabilities that may already be losing their scarcity.

Stripe’s internal coding agents offer one of the clearest public examples of this shift. Known as “minions,” the agents independently write blocks of code and submit them for human review before anything goes live. In February 2026, Stripe reported that more than 1,300 such submissions were merged into production each week—fully AI-written, human-reviewed, and containing no human code at all. Each submission becomes a data point on the shifting division of labor. The organization can see not only that AI was used but where it was used: which parts of the task were delegated to minions, which parts were rewritten or corrected by the engineer, and which outputs passed review, testing, and deployment. Over time, these signals show which tasks AI is absorbing and which engineers are learning to deconstruct, supervise, and integrate AI-generated work most effectively. Stripe doesn’t clearly state whether it uses this data for formal capability assessment or workforce redeployment, but the task-level signal is clearly visible.
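To make the idea of a task-level signal concrete, here is a minimal sketch of how such records might be rolled up. It is not Stripe's method, which the company has not described. The `Submission` record, its fields, and the task categories are all hypothetical, chosen only to illustrate how the AI-written share of merged work could be compared across kinds of tasks.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """One code submission (hypothetical schema, for illustration only)."""
    task_type: str    # e.g. "test_scaffolding"; categories are assumed
    ai_lines: int     # lines written by an AI agent
    human_lines: int  # lines written or rewritten by an engineer
    merged: bool      # whether the submission passed review and was merged


def ai_share_by_task(submissions):
    """Return the AI-written share of merged lines for each task type."""
    totals = defaultdict(lambda: [0, 0])  # task_type -> [ai_lines, all_lines]
    for s in submissions:
        if not s.merged:
            continue  # only count work that survived review
        totals[s.task_type][0] += s.ai_lines
        totals[s.task_type][1] += s.ai_lines + s.human_lines
    return {t: ai / total for t, (ai, total) in totals.items() if total > 0}


# Made-up sample data, not real figures.
submissions = [
    Submission("test_scaffolding", 80, 0, True),
    Submission("test_scaffolding", 40, 0, True),
    Submission("api_migration", 30, 30, True),
    Submission("api_migration", 50, 0, False),
]
shares = ai_share_by_task(submissions)
```

In this made-up sample, merged test scaffolding is entirely AI-written while merged migration work is split evenly between agent and engineer. A gap of that kind is what would tell an organization which tasks AI is absorbing.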

Similar signals are emerging across software development more broadly. AI-assisted work now generates data on which suggestions were accepted, which were rejected, where developers used chat or agentic assistance, and whether the output survived review, testing, and deployment. GitHub’s Copilot usage metrics, for instance, give enterprise administrators visibility into how Copilot is adopted and used across an organization, and enterprise studies are already drawing on these signals. A ZoomInfo deployment of GitHub Copilot across more than 400 developers measured suggestion- and line-acceptance rates alongside developer feedback, reporting a 33% average suggestion-acceptance rate and a 20% line-acceptance rate.
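The two acceptance rates can be read as simple ratios. The sketch below assumes one plausible definition of each: suggestions accepted divided by suggestions shown, and lines accepted divided by lines suggested. The study's exact definitions are not given here, and the `SuggestionEvent` record is a hypothetical schema, not GitHub's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuggestionEvent:
    """One AI suggestion shown to a developer (hypothetical schema)."""
    developer: str
    lines_suggested: int
    lines_accepted: int  # 0 when the suggestion was rejected outright


def acceptance_rates(events):
    """Return (suggestion_acceptance_rate, line_acceptance_rate)."""
    if not events:
        return 0.0, 0.0
    # Assumed definition: a suggestion counts as accepted if any line was kept.
    accepted = sum(1 for e in events if e.lines_accepted > 0)
    lines_suggested = sum(e.lines_suggested for e in events)
    lines_accepted = sum(e.lines_accepted for e in events)
    suggestion_rate = accepted / len(events)
    line_rate = lines_accepted / lines_suggested if lines_suggested else 0.0
    return suggestion_rate, line_rate


# Made-up sample data, not figures from any study.
events = [
    SuggestionEvent("dev_a", 10, 10),
    SuggestionEvent("dev_a", 5, 0),
    SuggestionEvent("dev_b", 20, 4),
    SuggestionEvent("dev_b", 5, 0),
]
suggestion_rate, line_rate = acceptance_rates(events)
```

The two numbers can diverge, as they do in the reported figures, because a developer may accept a suggestion but keep only part of it.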

That makes the emerging assessment problem very different from traditional productivity measurement. The organization can begin to see which parts of software work are being absorbed by AI, which developers are using AI mainly for acceleration, and which are learning to supervise, evaluate, and integrate AI-generated output into reliable systems.

3. Close the loop from insight to action. Continuous sensing is useful only if it changes how the organization allocates work, develops people, redesigns roles, and plans for the future. r.Potential, a venture from Adecco in strategic partnership with Salesforce, suggests where we’re headed on this front. The company says the venture is designed to generate a “dynamic understanding of the work humans should continue to do and the work agents can do,” and to that end it draws on global labor market data, company-specific workforce data, and real-time data on how AI agents are performing to help leaders model what their workforce should look like. The hope is that all of this information will allow r.Potential to recommend new ways of organizing work and combining human and AI roles.

Finally, insight must translate into action. This is where the organization moves from training people outside the workflow to adapting them inside the workflow. Cresta’s Agent Assist capitalizes on this opportunity, offering reminders to call-center personnel, along with knowledge relevant to specific customer interactions, before a call is over. Agents learn while customers are still on the line, in other words, rather than having to wait for the next training cycle. Some companies have begun to create similar AI-enabled coaching models for their engineering teams.

From the What to the Why

Once the organization can see what is happening, it must understand why. It has to learn which changes in performance reflect skill development, for example, and which signal fatigue.

Let’s continue with the aviation example. A delayed response in the cockpit can’t be interpreted in isolation as a capability issue. It may reflect fatigue or issues in crew coordination. That’s why aviation-monitoring systems combine operational data with context rather than treating every performance variation as an individual competence problem.

The same logic is now moving into knowledge work. AI coaching systems used in contact centers can help train call-center agents, but they can also help distinguish weak performance from issues relating to a broken workflow. Over time, a learning organization develops this interpretive capacity, turning continuous assessment from surveillance into organizational learning.

This AI-driven transformation of work forces a fundamental shift in strategic planning. In an environment reshaping itself in real time, organizations can no longer rely on fixed assumptions about roles, skills, or the division of labor. Instead, strategy must become a dynamic exercise anchored by a high-resolution, real-time view of where capability is forming, eroding, and shifting between people and machines.

Transitioning to this model of continuous assessment is ultimately a governance challenge, not a technological one. Leaders must clearly define how performance signals are used, ensuring the system is transparently positioned to support learning and adaptation rather than surveillance and control.

Historically, organizations were structured around functions, processes, and projects. The coming era will organize around readiness: that is, an organization’s continually updated capacity to act at the moving boundary of human-machine collaboration. The firms that master this governance will transcend outdated labels, allocating work based on real-time capability and adapting faster than their competitors can plan.
