Designing for the Humans in the Loop – Part III

This is the third part of three on how AI regulation is a chance to fix deeper problems.
Part I introduced the EU’s AI Act as the likely global standard for responsible AI development.
Part II reframed what it is we are actually fighting for: human agency in tech design.
This final part is directed at businesses who want to take care of their people in a future of work made up of cognitive assembly lines.

I’m starting to have a love/hate relationship with the term “future of work”.

It’s happening, it’s here, but, as William Gibson puts it, it’s unevenly distributed.

Some are getting the best of it, some are getting the worst of it.

Many of those getting the worst of it are doing “ghost work”. A lot of human labor is critical to making AI-enabled systems function, yet it rarely has a visible role in the major narratives about digital transformation. As consumers benefit from near-instant delivery and social media feeds that are free of obscene images, behind the scenes are hundreds of thousands, if not millions, keeping the engines running and struggling to keep up with the pace of these souped-up machines.

The book Ghost Work, by Mary L. Gray and Siddharth Suri, has made a start on detailing the class of workers carrying forward the AI revolution. They are data annotators, testers, and delivery drivers, almost all gig workers outside of traditional modes and protections of employment. And I think this human labor component in AI automation will become far more central in most narratives very soon.

As enterprises deploy AI in their organizations, more traditional jobs inside of companies are seeing the same challenges troubling ghost workers. And for all of the effort and attention put into mitigating AI risks for end consumers (like privacy, unfair bias, and social media recommendations), AI governance and regulation have barely considered the future of work that is already here.

AI and causality: Why AI needs so many humans in the loop

Understanding cause-and-effect relationships is hard for AI. A lot of today’s AI is based on predicting probabilities of patterns in large amounts of data. That pattern detection has led to AI developing “senses” of sight and hearing and predicting certain behavior, like clicking a link. But it often operates without a model of causal relationships.

Viewing a slow-motion video of a baseball player hitting a ball, we know that the player is swinging the bat to hit the ball and repel it, and not the other way around. An AI model might be able to determine that the pitch was a curveball thrown by a left-handed pitcher, moving at 93.6 mph with 23 inches of lateral movement, but it has no understanding of the causal relationships between the batter, the bat, and the ball.

In medicine, this gets especially tricky when determining the effects of a drug and whether confounding variables, rather than the drug itself, have caused some outcome. Take the question of whether a drug given to someone with high blood pressure poses a risk of heart attack. This judgement call is very challenging for a machine because it may not see all the needed variables, yet it is a fairly routine task for a doctor.
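To make the confounding problem concrete, here is a minimal sketch in Python. The patient counts are made up purely for illustration, and the function names are my own. In this toy data, people with high blood pressure are both more likely to be given the drug and more likely to have a heart attack, so a naive comparison blames the drug even though it has no effect within either group.

```python
# Illustrative, made-up counts: (patients, heart_attacks) for each group.
# Blood pressure is the confounder: it drives both who gets the drug
# and who has a heart attack.
COUNTS = {
    "high_bp":   {"drug": (800, 160), "no_drug": (200, 40)},
    "normal_bp": {"drug": (200, 10),  "no_drug": (800, 40)},
}


def risk(patients, events):
    """Share of patients in a group who had a heart attack."""
    return events / patients


def naive_risk_difference(counts):
    """Compare drug vs. no drug while ignoring blood pressure entirely."""
    totals = {"drug": [0, 0], "no_drug": [0, 0]}
    for stratum in counts.values():
        for arm, (patients, events) in stratum.items():
            totals[arm][0] += patients
            totals[arm][1] += events
    return risk(*totals["drug"]) - risk(*totals["no_drug"])


def adjusted_risk_difference(counts):
    """Compare within each blood-pressure group, then weight by group size."""
    total = sum(p for stratum in counts.values() for p, _ in stratum.values())
    difference = 0.0
    for stratum in counts.values():
        size = sum(p for p, _ in stratum.values())
        within = risk(*stratum["drug"]) - risk(*stratum["no_drug"])
        difference += (size / total) * within
    return difference


if __name__ == "__main__":
    print("naive:   ", round(naive_risk_difference(COUNTS), 3))
    print("adjusted:", round(adjusted_risk_difference(COUNTS), 3))
```

The naive comparison shows a nine-point gap in heart attack risk, and the adjusted one shows none. The catch is that the adjustment only works because someone knew blood pressure was the variable to split on, which is exactly the domain knowledge a pattern-matching model does not have on its own.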

The AI field is working on incorporating causal models into machine learning, but adding that domain expertise at the R&D level is hard. Up to now, the field has been able to move very quickly with pattern recognition techniques through trial and error on large data sets. These experiments can be done by even a single researcher or engineer tuning the model until it works well enough on the test data. Incorporating causal models means adding collaborators from other domains, each with their own vocabularies and methods, not to mention the competition between approaches for how to even introduce causal models into ML. It’s easy to see there will be a lot of friction to overcome.

The fix is that causal models can be applied at the data annotation and application levels by human experts. At the data annotation level, experts can train data wranglers to identify causal variables in the training data. This allows the pattern recognition models to be scaled very efficiently with the traditional methods, given all the data we have to feed them. However, without causal models actually incorporated in the model itself, the AI will still be very narrow: it will have difficulty adapting to new situations without guidance.

At the application level, domain expertise is needed again to guide the machine with the right objectives and environmental awareness to ensure it doesn’t make mistakes when it encounters scenarios outside of its training. As an AI system can contain many different models, this makes for multiple points at which a human is needed “in the loop”. For the foreseeable future, most AI systems will be deployed with many humans in the loop and will be very dependent on those people’s ability to apply their domain expertise of cause and effect.
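One common way this shows up in practice is a routing rule that decides which cases the machine handles and which go to a person. The sketch below is only an illustration under my own assumptions: the `Prediction` fields, the `in_training_range` flag (standing in for some out-of-distribution check), and the 0.9 threshold are all hypothetical, not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str
    confidence: float        # the model's own score, between 0 and 1
    in_training_range: bool  # hypothetical flag: does the input resemble the training data?


def route(prediction, min_confidence=0.9):
    """Return 'machine' to apply the prediction automatically,
    or 'human' to queue the case for someone with domain expertise."""
    if not prediction.in_training_range:
        # Scenarios outside the training data always go to a person.
        return "human"
    if prediction.confidence < min_confidence:
        # The model is unsure, so a person makes the judgement call.
        return "human"
    return "machine"


if __name__ == "__main__":
    cases = [
        Prediction("approve", 0.97, True),
        Prediction("approve", 0.62, True),
        Prediction("reject", 0.99, False),
    ]
    for case in cases:
        print(case.label, case.confidence, "->", route(case))
```

Every model in a system can have a rule like this, which is why the number of people in the loop multiplies so quickly. Note that the threshold is also a workload dial: lowering it sends fewer cases to people, and raising it sends more.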

Cognitive Assembly Lines

I believe that, just as automation of industrial manufacturing commoditized physical tasks that led to the assembly line, we will see the same thing happen with assembly lines of cognitive tasks. In the “future of work” that is already here, AI automates the pipeline, packaging, and delivery of work, but is still very dependent on human “cogs”. The efficient frontier of these systems is a neat categorization of work into tasks that can then be delegated to machines or humans depending on who does it better.

The conversion of work into fungible tasks is making its way to knowledge worker employees. We see this in emails and growing lists of tasks thanks to automated systems for organizing and distributing work. As AI advances, we will see more knowledge work commodified into tasks, especially in roles such as:

  • Contact service agents
  • Project coordinators
  • Expert anomaly detectors
  • Edge case investigators

This shift has gone largely unnoticed, in part because more people were working from home during the pandemic while their jobs were being transformed into production line tasks. The shift has also been gradual: the work is done at the same computer and desk where we’ve always done our jobs, and we’ve grown accustomed to lists upon lists of notifications or task assignments from various work and productivity tools landing in our inboxes. The dynamics of the cognitive assembly line are not unfamiliar.

In insurance, claims generated automatically from a few photos of a car need human help with the edge cases. For all the automation coming to the insurance industry, claims checkers are still needed for their human judgement and expertise. The AI systems will make them much more productive, but the business sentiment seems to be to downsize or hire less-skilled people to do the work. Further, the repetitive nature of the work will take its toll and prevent people from performing it as fast as the machine delivers it.

One very visible example of this is the airport security checkpoint for carry-on luggage. Each checkpoint has a security officer sitting in front of a screen checking the x-ray scans of each bag. In most parts of the world, these are AI-assisted systems, with object detection that triggers warnings about dangerous items. Those warnings aren’t good enough on their own, which is why there is still a human sitting in that chair to make the judgement call on whether someone should go through or be picked for extra screening.

That screener’s job has been boiled down to just watching image after image on a screen, deciding whether to accept or reject the warnings. This is incredibly efficient in terms of getting bags through while seeing the contents in high fidelity, keeping the lines at security moving. But just as Ford’s assembly line brought new problems along with all its great benefits, so does this one.

An article from an AI company promoting its software for airport screening detection aptly points out all the challenges of this assembly line work, including:

  • The 2.5 seconds a screener has to observe each bag
  • The chaotic environment distracting them from making a good judgement
  • A long list of items to be screening for, plus emergent threats they need to be aware of
  • That their accuracy starts to quickly fall after just 10 minutes of work
  • True threats appear very infrequently and so true positives are even less likely
  • Little-to-no reward for catching threats leads to low job satisfaction, worse performance, and high turnover
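A bit of back-of-the-envelope arithmetic shows why these conditions are so punishing. The 2.5 seconds and 10 minutes come from the list above. The threat rate, detection rate, and false alarm rate below are hypothetical numbers I picked only to illustrate the base-rate effect, not figures from the article or from any real checkpoint.

```python
def bags_reviewed(seconds_per_bag, minutes):
    """How many bags a screener sees in a stretch of work."""
    return int(minutes * 60 / seconds_per_bag)


def share_of_alarms_that_are_real(threat_rate, detection_rate, false_alarm_rate):
    """Bayes' rule: of all the warnings raised, what fraction are true threats?"""
    true_alarms = detection_rate * threat_rate
    false_alarms = false_alarm_rate * (1 - threat_rate)
    return true_alarms / (true_alarms + false_alarms)


if __name__ == "__main__":
    # From the list above: 2.5 seconds per bag, accuracy dropping after 10 minutes.
    print("bags in 10 minutes:", bags_reviewed(2.5, 10))

    # Hypothetical rates, chosen only for illustration.
    share = share_of_alarms_that_are_real(
        threat_rate=1 / 10_000,   # one bag in ten thousand holds a real threat
        detection_rate=0.90,      # the detector flags 90% of real threats
        false_alarm_rate=0.05,    # and wrongly flags 5% of harmless bags
    )
    print(f"share of alarms that are real: {share:.4f}")
```

At 2.5 seconds a bag, a screener has already looked at 240 bags by the time their accuracy starts to slip. With the made-up rates above, fewer than two in every thousand warnings would point to a real threat, so nearly every call the screener makes is a dismissal. That is a hard setting in which to stay sharp, let alone feel rewarded.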

AI systems have many people in the loop to provide oversight and input, and there is a great deal of legal and organizational governance to guide that oversight. The problem is that the governance doesn’t cover how those humans in the loop are impacted. And retroactively fitting HR policies and labor laws won’t do the trick: these human factors need to be built into the design of the systems.

[Image: the different people who are part of an AI system and its governance oversight. Caption: AI systems have many people in the loop]

Designing for the Humans-in-the-Loop

The shift to the assembly line really got going in 1913, when Henry Ford introduced the moving assembly line at his manufacturing plant with wild success. As the methods took hold across industry, with them came a rash of new types of injuries from highly repetitive work. It wasn’t until the 1950s that ergonomics became a well-established practice outside of the military, and the 1970s that the practice commonly considered cognitive factors in addition to physical ones.

There is growing recognition of the need to include human factors in AI systems engineering, but by all accounts there is still a long way to go.  This paper has mapped the research, noting how little there is, especially when it comes to including workers in the design of ergonomic systems.

Expecting AI to carry all the load of additional productivity is expecting humans to do it, and they won’t be able to meet the standards we set for machines. It will lead to unhealthy work forces and unhealthy businesses. 

This isn’t just a safety thing. In all the companies I’ve worked with, this kind of consideration of who is in the loop is THE difference that separates those who are successful at implementing AI systems at scale. Even if you’re an automation maximalist, this isn’t so bad: economists have been saying that the true new value creation from AI is in the creation of new jobs, and this forces organizations to think about how best to use their people’s expertise rather than just automate it away. 

With the talent shortage, businesses will also find they have an edge if they can add a layer of governance to their AI systems that takes into account these human factors: how workers perform, how they feel, how to retain them, and how to drive better overall productivity end-to-end.

Designing for human-machine collaboration is what will allow for net-new value and productivity. Here’s a framework I’ve been working on for the factors to consider for the human in the loop.

Trust: Explainability, Feedback Mechanisms, Transparent Performance

Trust is something we talk about a lot in governance. It should matter as much with your workers as it does with your customers. Just like in human-human collaboration, if workers don’t trust the machine they won’t work with it faithfully. They will give it the wrong data or won’t accept the input from the machine. The main problem right now is that the interfaces we have were never designed for collaboration in the first place.

First, make the machine understandable. Prioritize explainable outputs for why a machine is giving a recommendation so that the worker can combine it with the context they have. Then give your people clear and easy-to-use mechanisms for giving that context back to the machine to incorporate in its future predictions. Finally, give transparent and honest displays of accuracy and performance. If the machine is not performing well, it should say so. That empowers employees to work with the machine to get it up to speed.
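As a rough illustration of those three pieces working together, here is a minimal sketch of a feedback log. Everything in it is hypothetical: the class, the field names, and the choice of "how often the worker agreed with the machine" as the number to display. Agreement is only a proxy for accuracy, but it is one that can be shown honestly from day one.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackLog:
    """Keeps each machine recommendation next to what the worker decided."""
    records: list = field(default_factory=list)

    def record(self, recommendation, explanation, worker_decision, worker_note=""):
        self.records.append({
            "recommendation": recommendation,    # what the machine suggested
            "explanation": explanation,          # why, in terms the worker can check
            "worker_decision": worker_decision,  # what the worker actually did
            "worker_note": worker_note,          # context the machine did not have
        })

    def agreement_rate(self):
        """Share of cases where the worker accepted the recommendation."""
        if not self.records:
            return None
        agreed = sum(
            1 for r in self.records
            if r["recommendation"] == r["worker_decision"]
        )
        return agreed / len(self.records)

    def overrides(self):
        """Cases to feed back into the machine, with the worker's notes."""
        return [
            r for r in self.records
            if r["recommendation"] != r["worker_decision"]
        ]


if __name__ == "__main__":
    log = FeedbackLog()
    log.record("approve", "damage matches photo estimate", "approve")
    log.record("approve", "damage matches photo estimate", "reject",
               worker_note="photo is from an earlier claim")
    print("agreement rate:", log.agreement_rate())
    print("overrides to review:", len(log.overrides()))
```

The design choice that matters here is that an override is treated as information rather than as an error by the worker. The note attached to it is the context the machine needs in order to improve.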

Motivation: Training, Social Connection, Value Sharing

If you want a highly precise, accurate, and adaptable machine, you need these people for the foreseeable future because only they can provide the agility, robustness, accuracy, and precision needed to compete. The people who have been doing the work for many years are the ones who will be able to spot the opportunities not seen by the machines. Make sure they feel they are part of creating the new value for the company, rather than targets for automation.

Training is the number one factor. It will show them the unique value they bring and how to best use it in human-machine collaboration. There is no such thing as best practices when it comes to these new systems. Open up ways for workers to connect socially and share their techniques and discoveries in these new collaborations. It will reinforce the sense of being on a team even with new machine teammates.

The last bit of this may be hard for some companies: give employees a piece of their value creation. This goes beyond stock options and profit sharing. They are contributing their intellectual property, and for them to have career longevity with the domain expertise they have built up, they need not only the human-machine collaboration skills, but also to be able to port some of the tuned aspects of the machine to bring with them. This last part is highly theoretical, but inescapable. The more finely grained we can make portability, the better. Otherwise sloppy solutions will leave everyone unhappy.

Safety: Ergonomics, Critical Thinking, Privacy

This has been covered a lot already: mitigate physical and cognitive injuries. Avoid the fantasy that you are dealing only with an AI system, and be hyper-aware of where humans are in the process to avoid putting mechanized expectations on them. With the research so far behind, it will be extra important to listen to employees when they say something is challenging about working with the machine. Listening will have the double effect of reinforcing trust and giving you an opportunity to shape the systems to the worker’s domain expertise.

Both humans and machines are going to make mistakes and be biased at times. The system should be designed to not allow for over-trust of the machine or of the human’s inputs. Create an experience that reinforces and rewards critical thinking, along with fail safes and operating boundaries to prevent major errors on either side.

In all of this, workers will need to have their privacy. Some of the feedback mechanisms will require a degree of surveillance of employees far beyond what our society generally accepts. Ensuring privacy, a system of recourse for possible violations, and protection of workers’ value contribution will contribute to critical psychological safety.

The path to the future is uncertain.

It is going to take a lot of human effort to get AI to work until it can be fully autonomous, if it ever can be.

There are many risks and we will likely see work transform many times over along this path.

In general, I think we will continue to see the convergence of knowledge work and line work, where knowledge workers are doing more commoditized and repetitive work, and line workers are actually needed more for their human intelligence and unique contextual awareness.

These workers will be bearing the brunt of the transition to the future of work, and we can’t just hope that we will get to an equilibrium suddenly such that everyone will quickly forget the human cost along the way.

Many of the policies we need for worker protection already exist. The AI Act is largely an extension of existing human rights law to the new technological context, and it forces these considerations into the design. However, the AI Act is really just a set of guidelines and a few hard lines in extremely risky cases.

Companies should ensure the health and happiness of those doing repetitive work and effectively value the unique knowledge workers bring to their jobs, so that the organization can realize it.

Obviously, companies will have a different calculus of where they want to be on the spectrum of governance. The law sets some hard lines for some applications, but a lot of it is up to the organization.

My bet is that the companies that reach for a higher level of excellence with their employees’ input into machine systems will create the most value.

The place to start is just recognizing the humans in the loop of these automated systems and putting at least as much attention on their well-being as on the functioning of the overall system. Ask: who will be in the loop, and what experience should we create for them?


Part I – “AI Regulation is a Chance to Fix Deeper Problems”

Part II – “‘F*** the Algorithm’, It’s Time to Get Our Agency Back”
