The following essay builds on remarks delivered by Ingvild Bode as part of the Expert Workshop “AI and Related Technologies in Military Decision-Making on the Use of Force”, organised by the International Committee of the Red Cross (ICRC) & Geneva Academy Joint Initiative on the Digitalization of Armed Conflict on 8 November 2022.
I want to make three sets of comments on the consequences of using AI-based decision-making support systems for affected populations. These are based on my expertise in investigating the normative dimensions of developing and using weapon systems with autonomous and AI-based features, including different understandings of appropriateness regarding the use of force. Many of these weapon systems include decision-making support elements, so there is no perfect separation into neat categories.
(1) First, I will discuss examples where the use of AI-based decision-making support systems contributed to unintended harms, reflecting on system failures, errors, and human-machine interaction issues.
Here, I want to speak to the history of machine-assisted decision-making processes on the use of force and their associated failures. I consider AI-based systems as continuing a longer trajectory of machine-assisted decision-making and targeting, although of course the complexity is increasing. This trajectory started with air defence systems and guided missiles and is currently visible, for example, in the use of loitering munitions.
What have been some of the consequences of “delegating” not only motor and sensory tasks but also cognitive tasks to decision-making support systems and what does this mean for affected populations?
First, this “delegation” has reduced the tasks conducted by humans but has made decision-making on the use of force not necessarily easier, as perhaps expected, but more complex. Such complexity relates in particular to human-machine interaction and arguably makes decision-making on the use of force harder for human operators to manage. The delegation of motor and cognitive skills to decision-making support systems means that operators become passive supervisors rather than active controllers for most of a system’s operation. When they are called upon to act, they often have very limited time to regain situational understanding in the context of “machine speed”. This makes it more difficult to question system outputs and to deliberate in a reasoned way about engaging specific targets. Understanding such complexities of human-machine interaction can shed light on prominent failures associated with air defence systems, such as the downing of Ukraine International Airlines Flight 752 by a Tor-M1 in January 2020.
Second, integrating AI-based features into decision-making support entails accepting a certain unpredictability. The inherent complexity of decision-making support systems means not only accepting potential failures as “normal”. As other experts have shown, it extends to both technical and operational forms of unpredictability in operating AI-based systems. Technically, as systems come to include machine learning, there are clear limits to training and testing them against all possible scenarios they may encounter in real life, which constrains how predictably they can be used. Operationally, not all actions performed by an AI-based system may be anticipated by its human operators. A certain level of unpredictability is therefore characteristic of AI-based systems.
(2) Second, I will explore how the use of AI-based decision-support may affect conflict-affected populations, in particular civilians or certain groups of civilians.
On this point, I would like to speak about the functionality of AI-based decision support systems. I want to draw on research on civilian applications of AI technologies that addresses the so-called functionality fallacy of AI, especially in cases involving complex social problems; in fact, most social problems are complex.
In military discourse, we can still often hear AI technologies described as a kind of “silver bullet”. Following this argument, AI technologies make military decision-making “better” by providing more reliable, thorough, and objective information, and thereby make targeting more precise.
For those of us who are outside observers and not privy to the specifics of military decision-making, it is practically impossible to assess the veracity of these claims. States such as Russia claiming to use “high-precision” weapons integrating AI while striking civilian targets do not help. But we can compare these claims to what we know about civilian applications of AI technologies. Research in this sphere offers two main insights:
First, in discussions about AI, there is more focus on whether these technologies can be ethical and value-aligned, that is, trustworthy, democratic, fair, and interpretable, rather than on whether they are functional in the first place. The functionality of AI technologies is often simply assumed. This is not to say that questions of ethical AI are unimportant, only that we should start with the more basic question of AI functionality.
Second, AI technologies often do not work. Even a cursory look at recent reports demonstrates this. Examples include hospital bed assignment algorithms prioritising healthy over sick patients and AI-based content moderation tools regularly flagging safe content. Further, such failures have been found to disproportionately affect minority groups: the aforementioned hospital resource allocation algorithm’s misjudgements, for instance, have mostly affected Black and lower-income patients. Since 2019, the “AI, Algorithmic and Automation Incident and Controversy Repository” (AAAIRC) has catalogued such cases based on open-source data and can serve as an important source of information. Assessing AI technologies should therefore start with the question of whether they are functional at all. Scholarship by Raji et al., for example, presents a taxonomy of four types of AI failure or dysfunctionality that have led to harm: impossible tasks, engineering failures, post-deployment failures, and communication failures.
As we hear much about the dual-use nature of AI and the fact that AI innovation tends to originate in the civilian rather than the military sphere, we should not expect things to turn out differently for military applications. The consequences of machine learning failures or dysfunctional AI technologies can be even worse in the military sphere, as they may be irreversible for those affected by AI-based decision-making on the use of lethal force.
(3) Third and finally, I want to reflect on whether an “over-reliance” on AI-based military decision-support systems might degrade human decision-making in the military sphere.
The “ideal” of integrating AI technologies in the military is to free up operators’ thinking space. Rather than being bogged down by mundane or painstaking tasks, AI technologies supposedly allow, for example, intelligence analysts to think “better” about the data they receive and what it means, and to pass on that information in a timely manner thanks to the AI-based increase in speed. But what is the quality of the thinking space that remains for operators of AI-based systems in the military?
Lessons learned from humans operating existing systems with automated and autonomous features already suggest that operators’ thinking space is limited. My research on human operators of air defence systems with automated and autonomous features, for example, has shown two things. First, operators often lack a functional understanding of how the system’s decision-making process works and of the logic informing it. Second, this in turn limits their capacity to scrutinise the system’s decision-making, leading to potential over-trust in the system. The problem of automation bias, the over-reliance on outputs provided by a machine rather than on one’s own critical faculties, is indeed well explored in the human factors literature.
Relying increasingly on AI-based decision-making systems in scenarios of human-machine interaction therefore means ceding critical deliberative mental space. This can have fundamental effects on what it means to be a human decision-maker in the military and in warfare. As human operators lack the capacity to assess how algorithms reach conclusions and make decisions, trusting the systems becomes a fundamental requirement for operating them. For human-machine interaction to function, the possibility of human doubt risks being eliminated; indeed, ruling out human doubt seems to be the very point of using AI in decision-making. What place can human decision-making still have when teaming up with AI-based systems?