Umar Taj

Owain Service

Edward Gardiner

June 13th, 2024

“Behavioural AI” can help improve human decision-making

Estimated reading time: 5 minutes

Umar Taj, Owain Service and Edward Gardiner are building the “behavioural AI” programme, initially to help call centre handlers be more empathetic when fielding calls. Over time, they believe the programme could evolve into a “large behavioural model” (LBM), with wider applications.


If you work for a large organisation anywhere in the world, chances are that you will have been asked at some point: “What can we do with AI?” Hundreds of millions of people, ourselves included, would answer that they have prompted ChatGPT or its growing band of competitors (such as Meta AI, Mistral AI and Google’s Gemini) to complete tasks more accurately or more quickly than a human being could.

This technology is called generative AI because its foundational models have been developed to take raw data and “generate” outputs when prompted to do so. We ourselves are using “generative AI” in this way. But as we will explain below, we are also interested in how we can use this technology to better understand and improve human decision-making, by combining the latest developments in the field with insights from behavioural science.

The practical application of behavioural science involves designing new systems, processes and products that help to encourage or support people to make better decisions for themselves. We believe that training generative AI models with the aim of helping human beings to improve their decision-making is part of the answer to the doomsday scenarios put forward by many commentators, who talk of AI heralding the “end of the human era” or even the extinction of the human species.

While we should be cognisant of the nefarious potential of uncontrolled AI, we also believe it would be wrong to suggest that human beings and machines are currently competing in a singular Darwinian struggle for survival. As behavioural scientists, we believe that we need to focus on the potential symbiotic opportunities between human decision-making and artificial intelligence.

We are currently building a programme that we hope will help support this vision. We call it “behavioural AI”.

There are potentially limitless applications for behavioural AI. But it will be useful to explain what we mean by the term in relation to a concrete situation of which we all have some experience: what happens when you call a call centre, hoping to get a problem resolved.

With the possible exception of large technology companies, almost every large organisation that deals with large numbers of individuals has some form of call centre in place to handle support, advice or complaints from customers. And almost all these organisations want to know how they can help their call handlers to give advice in a way that resolves issues more quickly and leads to better outcomes for their customers.

One particular question that we have been examining is how to help call handlers be more empathetic when fielding calls. Empathy is the ability to understand, at an emotional level, the feelings of someone else. It is a quality that large organisations often believe their call handlers (and automated systems such as chatbots) fail to display.

In a recent project undertaken with The Foundation, we learned of a call from a distraught customer of a large high street bank. The customer couldn’t make her next mortgage payment. “Our system will only let me help you when you have already missed three payments, so call us back then”, came the reply from the call centre agent. There was no sympathy expressed for the plight of the customer, no focus on solving the problem. No putting themselves in the shoes of their customers.

The bank knew that they wanted their call handlers to “be” more empathetic. But they lacked the tools to understand what was happening across the organisation and what could be done to make things better.

This is where behavioural AI comes in.

We can think about its use in terms of three different levels, each of which adds an additional layer of complexity and sophistication in understanding human behaviour and developing new working practices.

Level 1 of behavioural AI uses the productivity-enhancing power of AI together with the knowledge and training of a behavioural scientist. At this level, we need to determine what we mean by empathy and develop a scale for measuring it in practice. To do this, we need to undertake a large-scale literature review and examine existing validated scales for measuring empathy, which enable us to determine what “high” and “low” empathy look like in practice.

We do this with trained behavioural scientists searching the existing literature. This will include an assessment of the strength of different academic papers, not all of which should be treated equally. But generative AI can be used to “read”, summarise and search for patterns in a much broader range of papers and sources than even the smartest human being could.

So we use these foundational models to enhance a process that would otherwise have taken longer and been less comprehensive. By the end of this phase of work, we have a new construct for measuring empathy that can be used for all subsequent analysis.
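To give a sense of how generative AI can support this review step, here is a minimal sketch in Python. It asks a chat model to summarise a paper abstract and flag whether it describes a validated empathy scale; the model name, prompt wording and example abstracts are our own illustrative assumptions, not the exact workflow described above.

# Illustrative sketch: use a chat model to help screen paper abstracts for
# validated empathy scales. The prompt and model name are assumptions; a
# behavioural scientist still reviews everything the model flags.
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

SCREENING_PROMPT = (
    "You are supporting a literature review on empathy in customer service. "
    "Summarise the abstract in two sentences and state whether it describes "
    "a validated scale for measuring empathy (yes/no)."
)

def screen_abstract(abstract: str) -> str:
    """Return a two-sentence summary plus a yes/no flag for one abstract."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SCREENING_PROMPT},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content

# Hypothetical inputs: in practice these would come from a reference manager.
abstracts = {
    "Paper A": "We validate a multi-dimensional self-report empathy scale ...",
    "Paper B": "A qualitative study of frontline service conversations ...",
}
for title, text in abstracts.items():
    print(title, "->", screen_abstract(text))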

Level 2 is where the latest developments in AI come to the fore, enabling us to complete tasks that would have previously been nigh on impossible. If we were doing a project of this kind in the past, we would have had to get human beings to find a way of gathering a representative sample of calls from a call centre and listen to each and every one of them. We would then have needed to use human beings to analyse these calls against the measures developed in level 1.

This task would have been either hugely time consuming or, more likely, impossible to do with any degree of rigour. And it would have resulted in a relatively high degree of subjectivity. But with generative AI, so long as we have a good way of accessing recordings or live conversations from the call centre, we can now transcribe every single call made over, say, the past year. And we can then analyse each and every one of those calls, to understand why people are calling and where challenges arise.

Having transcribed these calls, we can also prompt our generative AI models to measure and score how empathetic the call handlers have been in each and every call that they have handled. In one pair of calls we analysed, for example, the higher empathy call (scoring 90 on our empathy scale) was characterised by a balanced conversation between caller and agent, in contrast to the lower empathy call (scoring 25 on our empathy scale), in which one side (the caller) had a complex complaint that was not being fully addressed by the operative.
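As an illustration of how this scoring step might be wired up, the sketch below transcribes a recorded call with a speech-to-text model and then prompts a chat model to return a 0-100 empathy score with supporting evidence. The model names, rubric wording and JSON output format are assumptions made for the sake of the example, not the exact pipeline we run.

# Illustrative sketch: transcribe one call recording and ask a chat model to
# score it against a 0-100 empathy rubric. Model names, rubric text and the
# JSON output format are assumptions for illustration only.
import json
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

EMPATHY_RUBRIC = (
    "Score the call handler's empathy from 0 (no acknowledgement of the "
    "caller's feelings) to 100 (feelings acknowledged, problem owned, "
    "balanced conversation). Return JSON: {\"score\": <int>, \"evidence\": <str>}."
)

def transcribe_call(path: str) -> str:
    """Turn a call recording into text with a speech-to-text model."""
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio)
    return result.text

def score_empathy(transcript: str) -> dict:
    """Ask a chat model to score one transcript against the rubric."""
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": EMPATHY_RUBRIC},
            {"role": "user", "content": transcript},
        ],
    )
    return json.loads(response.choices[0].message.content)

if __name__ == "__main__":
    transcript = transcribe_call("calls/example_call.mp3")  # hypothetical path
    print(score_empathy(transcript))  # e.g. {"score": 25, "evidence": "..."}

Scoring a year’s worth of calls is then simply a matter of looping this over every recording in the archive.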

We can also pull out specific examples from each and every call, which can be used to generate feedback for call handlers and, over time, encourage them to change the way they interact with customers.

Below is an example of the kind of insight that can be generated directly from this analysis. As can be seen, we are moving beyond “high” or “low” empathy to specific examples of behaviours that should be kept up (or changed), and how.

Note that because we have now analysed every single call made by every single handler, these prompts are tailored to each individual, using examples from their calls. And because these call handlers will invariably be using online platforms that contain scripts that help them to respond to specific types of questions, we can integrate these prompts into in-the-moment feedback that improves performance over time.

[Image: an example of the tailored feedback generated for an individual call handler]

This kind of direct, specific, in-the-moment feedback is at the heart of behavioural AI programmes: the focus on helping to encourage human beings to (positively) change their behaviour. How and when this feedback is given can itself draw on behavioural science research. For example, the literature on habit and routine formation shows that new practices only become embedded when repeated in a similar context over time. So this step will usually involve ensuring that call handlers are encouraged to focus on one bit of insight until it becomes embedded in their working practice (rather than everything all at once).
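A minimal sketch of that “one bit of insight at a time” idea is shown below: each handler’s scored calls are grouped together and the single weakest behaviour is surfaced as their next piece of feedback. The example records and the “lowest score first” selection rule are illustrative assumptions.

# Illustrative sketch: surface one focused piece of feedback per handler from
# their scored calls. The example records and the "lowest score first" rule
# are assumptions for illustration.
from collections import defaultdict

scored_calls = [
    {"handler": "agent_01", "score": 25,
     "feedback": "Acknowledge the caller's worry before quoting policy."},
    {"handler": "agent_01", "score": 60,
     "feedback": "Summarise the problem back to the caller."},
    {"handler": "agent_02", "score": 90,
     "feedback": "Keep offering concrete next steps."},
]

calls_by_handler = defaultdict(list)
for call in scored_calls:
    calls_by_handler[call["handler"]].append(call)

# One nudge per handler, starting with the behaviour from their weakest call,
# so a new practice can bed in before the next piece of feedback is introduced.
for handler, calls in calls_by_handler.items():
    weakest = min(calls, key=lambda c: c["score"])
    print(f"{handler}: {weakest['feedback']}")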

By analysing empathy scores over time, we can then determine whether individual call handlers are becoming more empathetic, and how the organisation as a whole performs relative to other organisations. We can also look to test and trial new ideas for improving empathy scores, using another method beloved of behavioural scientists: a randomised controlled trial, in which different call handlers are “randomised” into different intervention groups and we then measure how effective the new practices are relative to a control group.
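For the trial element, a rough sketch of the randomisation and comparison is given below: call handlers are randomly assigned to an intervention or a control group and the difference in their mean empathy scores is tested. The simulated scores and the simple two-sample t-test are assumptions for illustration; a real trial would be properly powered and pre-registered.

# Illustrative sketch of a randomised controlled trial on empathy scores:
# handlers are randomised into intervention and control groups, and the
# difference in mean scores is tested. The scores are simulated here.
import random
from statistics import mean
from scipy import stats

random.seed(42)

handlers = [f"handler_{i:03d}" for i in range(200)]
random.shuffle(handlers)                      # random assignment
intervention, control = handlers[:100], handlers[100:]

# In practice these would be each handler's average post-intervention empathy
# score from the level 2 analysis; here they are drawn from made-up distributions.
intervention_scores = [random.gauss(72, 10) for _ in intervention]
control_scores = [random.gauss(65, 10) for _ in control]

t_stat, p_value = stats.ttest_ind(intervention_scores, control_scores)
print(f"Intervention mean: {mean(intervention_scores):.1f}")
print(f"Control mean:      {mean(control_scores):.1f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")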

Level 3 is our final level, which goes beyond our specific project and focuses on the foundational models we are using to power our behavioural AI analysis. While levels 1 and 2 have involved using existing foundational models (think GPT-4), which we then use to analyse external datasets, in level 3 we start to modify how the underpinning foundational models operate, by integrating the learnings from the behavioural science literature.

In level 3, we assume that we cannot create a more sophisticated large language model (LLM) than OpenAI, Google or Meta can. But we can add insights that tweak and train these foundational models to make them more appropriate for behavioural analysis and solution development. In relation to our empathy work, for example, this will involve training foundational models on our empathy score cards, improving their performance on these tasks and increasing their accuracy overall. Over time, we are looking to go from training a large language model with behavioural insights to a more sophisticated, all-encompassing model that we might call a “large behavioural model” (LBM).
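As a rough sketch of what level 3 could involve in practice, the example below converts human-validated empathy score cards into a fine-tuning dataset in the JSONL chat format accepted by OpenAI’s fine-tuning API. The example score cards, field names and the framing of each card as a prompt/response pair are our illustrative assumptions.

# Illustrative sketch: turn human-validated empathy score cards into a
# fine-tuning dataset (JSONL of chat messages). The example cards are
# invented for illustration; the file layout follows OpenAI's fine-tuning API.
import json

score_cards = [
    {"excerpt": "Our system will only let me help you when you have already "
                "missed three payments, so call us back then.",
     "score": 25,
     "feedback": "Acknowledge the caller's distress before explaining policy."},
    {"excerpt": "I can hear this is really worrying you. Let's look at the "
                "options we have before the payment is due.",
     "score": 90,
     "feedback": "Keep naming the caller's concern and offering next steps."},
]

with open("empathy_finetune.jsonl", "w") as outfile:
    for card in score_cards:
        record = {
            "messages": [
                {"role": "system",
                 "content": "Score this call excerpt for empathy (0-100) and "
                            "give one piece of feedback."},
                {"role": "user", "content": card["excerpt"]},
                {"role": "assistant",
                 "content": json.dumps({"score": card["score"],
                                        "feedback": card["feedback"]})},
            ]
        }
        outfile.write(json.dumps(record) + "\n")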

This is a hugely exciting time to be involved in behavioural science and artificial intelligence. Our belief is that there will always be a need to focus on the “behavioural” side of AI, because there will remain a need to think about the implications of AI-generated insights for humans and human behaviour.

So, if you’re one of those individuals who is currently musing on the question of how to use AI but wants to bring the human back into the frame, drop us a line.

 


  • This blog post represents the views of the author(s), not the position of LSE Business Review or the London School of Economics and Political Science.
  • Featured image created by the authors with the assistance of DALL·E 2

 

About the authors

Umar Taj

Umar Taj is Associate Professor in Behavioural Science at the Warwick Business School. He also teaches executive courses on behavioural decision science at the London School of Economics and Political Science.

Owain Service

Owain Service is the CEO of CogCo.

Edward Gardiner

Edward Gardiner is Co-founder and Director of CogCo.

