Turns out Gen AI isn’t just coming for the artists and nerds. CEOs and other high-level strategic decision makers are also in the crosshairs – especially executives who are less flexible and more short-term oriented in their thinking.
Earlier this year, Cambridge-based researchers created a simulation of the US auto industry for human and AI agents to navigate as chief executives. The simulation was designed to allow more than 500,000 possible decision combinations per round (each round representing a fiscal year) and to have no fixed winning formula. For players, the game’s objective was simple — maximize market cap as CEO and don’t get fired by the virtual board.
Humanity was represented in the experiment by a mixed cohort of undergraduate and graduate students from Central and South Asian universities and senior executives from a South Asian bank. The Gen AI agent was OpenAI’s GPT-4o.
As the researchers detailed in their recent HBR piece (which we linked in a previous Briefing), the AI outperformed human players pretty much across the board on the game’s metrics. The LLM “designed products with surgical precision, maximizing appeal while maintaining tight cost controls… [and] responded well to market signals, keeping its non-generative AI competitors on edge.”
Score one for the machines.
And yet… the AI agent tended to be fired by the virtual board more quickly than the human players were – particularly the top-performing students. Those students pursued longer-term strategies and less aggressive approaches to near-term growth. They valued adaptability and flexibility and were better able to respond to shifting market conditions and shocks.
By contrast, the AI’s tendency to lock into short-term optimization strategies ran into trouble with big market shifts and black swan-type events. In other words, it excelled at maximizing advantages in relatively stable, known conditions but struggled to navigate an uncertain and volatile environment – much like the top human executives involved in the study. GPT-4o and those executives shared a key weakness: “Overconfidence in a system that rewards flexibility and long-term thinking as much as aggressive ambition.”
I’ve been fascinated by this result, and I have a few thoughts. First, the experiment nicely illustrates the dynamics and risks of the kind of future Bob Johansen has described as a decision environment that rewards clarity while punishing certainty. As such, it offers a great opportunity for self-reflection. Second, I’m supremely curious about the possibility that a centaur-style approach of university student + Gen AI (i.e., relative novice + AI) might outperform both the AI alone and the human executive players in the simulation – a possibility that IRL human executives everywhere should consider.
Third – and finally – I’ve been thinking about this experiment as the debate over whether LLMs are (or will be) capable of genuine reasoning continues to heat up. If today’s executives tend to share the same strategic strengths and weaknesses as a pattern recognition tool that overvalues past performance, struggles outside of controlled environments, and may only appear to be reasoning… what does that say about their ability to lead in an uncertain future? And what does it say about the ways in which the business world has elevated a model of leadership that perhaps undervalues our greatest human strengths?
@Jeffrey