
Issue #4: When AI drinks the water

Introduction

Consider this: “It is summertime. A heatwave hits. Water restrictions are in place. A major data centre has priority access to water under an existing contract. Fire risk is high. Nothing has failed yet, but the allocation has already been decided.”

Welcome to issue 4. If you have been following this newsletter, you will know that we spend a lot of time looking at what happens when AI systems and tools meet crisis environments, and finding that the results are rarely as straightforward as your AI consultant suggested. This issue, we are looking at something I am not even sure has a name yet: the conflict between the resource demands of AI infrastructure and the resource needs of a disaster response. For the purposes of this issue, let's call it a resource conflict, ie, a situation in which AI infrastructure and critical public services depend on the same finite resource under stress conditions, with no clear governance mechanism to determine priority.

In January 2025, wildfires tore through Greater Los Angeles. At least 30 people died and more than 16,000 structures were destroyed. The Eaton and Palisades fires were among the most destructive in California's history, and as firefighters struggled to contain the blazes, hydrants across LA County began to run dry.

The loss of water pressure was attributed to multiple factors: an ongoing drought, ageing infrastructure, and the sheer scale of simultaneous demand. Much less attention was paid to the fact that, at the same time as the hydrants were failing, AI data centres across California were consuming enormous quantities of the same resource. There is no evidence that data centres caused the hydrant failures, but the fires exposed something harder to ignore: AI infrastructure and emergency response infrastructure are drawing on the same constrained resource, under the same stress conditions.

Let's remind ourselves of our four lenses from issue one:

  • Power & Agency

  • Data & Consent

  • Accountability & Governance

  • Operational Reality

Today's topic: When the infrastructure that powers AI competes with the infrastructure that fights fires, who decides who gets the water?

Primary lens: Power & Agency | Secondary lens: Accountability & Governance

Let's go….

How much water does AI actually use?

This is harder to answer than it should be, but the short answer is: a lot. The longer answer, still broadly speaking, looks like this:

Data centres, the physical infrastructure that runs AI systems, use water in two main ways. First, directly on-site, where water is used to cool servers that generate enormous amounts of heat. Second, indirectly, through the power plants that supply their electricity, many of which rely on water-intensive steam generation. The water used in cooling largely evaporates rather than being returned as treatable wastewater - basically, it disappears.
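
To make those two pathways concrete, here is a minimal back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption (facility size, water usage effectiveness, the water embedded in grid electricity), not a reported figure for any real facility, so treat the output as an order-of-magnitude feel rather than a measurement.

```python
# Back-of-the-envelope estimate of one data centre's annual water footprint.
# All figures are illustrative assumptions, not measurements of a real facility.

IT_LOAD_MW = 50              # assumed average IT load, in megawatts
PUE = 1.2                    # assumed power usage effectiveness (total power / IT power)
ONSITE_WUE_L_PER_KWH = 1.8   # assumed litres of cooling water per kWh of IT energy
GRID_WATER_L_PER_KWH = 3.0   # assumed litres of water embedded in each kWh of grid electricity

HOURS_PER_YEAR = 24 * 365

it_energy_kwh = IT_LOAD_MW * 1_000 * HOURS_PER_YEAR        # IT energy drawn per year
total_energy_kwh = it_energy_kwh * PUE                      # including cooling and overheads

onsite_water_l = it_energy_kwh * ONSITE_WUE_L_PER_KWH       # pathway 1: evaporative cooling on-site
indirect_water_l = total_energy_kwh * GRID_WATER_L_PER_KWH  # pathway 2: water used to generate the power

print(f"On-site cooling water:  {onsite_water_l / 1e9:.2f} billion litres per year")
print(f"Indirect (electricity): {indirect_water_l / 1e9:.2f} billion litres per year")
print(f"Combined:               {(onsite_water_l + indirect_water_l) / 1e9:.2f} billion litres per year")
```

Under these assumptions, a single large facility lands in the low billions of litres a year, and the indirect, power-generation pathway is the bigger of the two - which is exactly the part that tends to fall out of public discussion.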

A peer-reviewed study published in Patterns estimated that AI systems' water footprint could reach between 312 and 764 billion litres in 2025 alone, which is a range comparable to the global annual consumption of bottled water.  The same study also highlights a more basic problem, in that AI-specific impacts are rarely reported separately, and most estimates rely on approximation rather than direct reporting.  In other words, we are not entirely sure how much water AI uses, partly because the companies operating these systems are not required to tell us. I don’t know about you, but this sets off alarm bells for me. However, let's continue….

A separate analysis found that data centres in Texas alone may use 49 billion gallons of water in 2025, rising to as much as 399 billion gallons by 2030, which is equivalent to drawing down Lake Mead, the largest reservoir in the United States, by more than 16 feet in a single year.
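
If you want to sanity-check that comparison, the arithmetic is short. The only assumption needed beyond the cited figure is Lake Mead's surface area; the round number used below is an assumed value for the lake at its recent, reduced levels, so the result is indicative rather than exact.

```python
# Rough conversion of the projected 2030 Texas figure into an equivalent Lake Mead drawdown.
# The surface area is an assumed round figure for the lake at recent, reduced levels.

GALLONS_2030 = 399e9          # projected annual use in gallons, from the analysis cited above
LITRES_PER_GALLON = 3.785
LAKE_MEAD_AREA_KM2 = 300      # assumed surface area, illustrative only

volume_m3 = GALLONS_2030 * LITRES_PER_GALLON / 1_000   # litres -> cubic metres
area_m2 = LAKE_MEAD_AREA_KM2 * 1e6

drawdown_m = volume_m3 / area_m2
drawdown_ft = drawdown_m * 3.281

print(f"Equivalent drawdown: {drawdown_m:.1f} m ({drawdown_ft:.1f} ft)")   # roughly 5 m, or about 16 ft
```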

California, where the Los Angeles fires burned, presents a particularly acute risk profile: chronic drought conditions, competing water demands, and climate-driven variability that places increasing pressure on already stressed water systems.

Where the data centres are

Where AI infrastructure is sited is a critical consideration, and on current evidence, it is not being given nearly enough weight. According to the Environmental Law Institute, more than 160 new AI data centres have been built across the United States in the past three years in places with scarce water resources. A statistic so incredible that I actually said “whyyy” out loud when I first read it. Anyway, it appears that siting decisions tend to prioritise proximity to data demand hubs and low latency, which does not always align with local environmental or water conditions.

California alone hosts 286 data centres, including 69 in Los Angeles County.  These are not abstract cloud resources. They are physical buildings, drawing on the same municipal water systems as the residents, businesses, and fire hydrants of the communities around them.

An analysis of 9,055 data centre facilities found that by the 2050s, nearly 45% may face high exposure to water stress.  A further MSCI analysis of 13,558 data centre assets found that about 30% of projects currently under construction are in regions where water scarcity is expected to intensify significantly by 2050.  This is not a future problem. It is already happening. The LA wildfires made it visible in the starkest possible way.

The governance gap

Three California lawmakers introduced bills specifically targeting AI data centre water consumption in the weeks following the fires. Assemblymember Diane Papan stated plainly: “Water's a limited resource. I’m trying to make it so we are prepared and ahead of the curve as we pursue new technology.”

But here is the structural problem: A peer-reviewed study in the Journal of Cleaner Production noted that utilities and regulators have already signalled the possibility of restricting data centre water access during droughts or peak demand, yet water sustainability is still not being treated as a core pillar of responsible AI infrastructure planning.

Major hyperscalers including Google, Microsoft, and Amazon have pledged to become “water positive” by 2030, meaning they aim to return more water to the environment than they consume.  However, these replenishment efforts do not necessarily address localised supply constraints faced by municipalities during droughts or heatwaves, which were the conditions that existed in Los Angeles in January 2025. 

In other words: the water offset might be happening somewhere. But if the hydrant outside your burning house has no pressure, that is not much comfort.  The seasoned disaster managers among you will have your head in your hands at this point, I’m sure.

Through the lenses

Power & Agency

The siting of data centres in drought-prone regions, the prioritisation of latency and cost over local water resilience, and the voluntary rather than mandatory nature of water reporting are all choices made by actors with the power to make them, in contexts where the communities most affected by those choices have the least influence over them.  When the hydrants run dry, the consequences are not distributed equally.

The deeper question is who gets to make these trade-offs before a crisis hits: utilities, private operators, regulators, emergency services, or some combination of them. At present, that answer is often fragmented, contractual, or opaque, which means the allocation may already be decided long before the emergency begins.

Accountability & Governance

You cannot govern what you cannot measure.

The Patterns study highlighted that the lack of distinction between AI and non-AI workloads in environmental reporting makes it extremely difficult to assess the true water footprint of AI systems, and called for mandatory disclosure standards.  The California legislative response is a start. But reactive legislation after a disaster is a poor substitute for the kind of proactive, transparent governance that might have prompted different siting and infrastructure decisions in the first place.

The thing nobody is saying

There is a conversation happening in the AI governance space about how to make AI systems more ethical, more explainable, and fairer. It is an important conversation, but it is almost entirely focused on what happens inside the model, ie, the data it was trained on, the decisions it makes, the outputs it produces.

What the Los Angeles wildfires make visible is something different: the physical infrastructure that makes AI possible has its own footprint, its own resource demands, and its own consequences, and those consequences do not respect the boundary between the AI system and the world it operates in. We talk about AI as though it exists in the cloud, but it doesn't; it exists in buildings, drawing on water, consuming electricity, built in communities that may or may not have had any say in whether they wanted it there.

In a scenario where communities face a wildfire, their hydrants run dry, and the data centres next door keep humming along, the issue is not whether one directly caused the other. The issue is that no credible governance framework appears to exist for managing that resource tension when it matters most.

So what do we do about this?

This is not an argument against AI data centres; that ship has sailed, and the trajectory we are on with AI means they are becoming part of our critical infrastructure. However, efficiency and equity aren’t the same thing, and a system that optimises for one at the expense of the other is not a system that is being governed well.

The minimum governance requirements seem clear enough:

  • Mandatory, standardised water disclosure at facility level

  • Regulatory frameworks that treat water as a critical constraint in siting decisions, especially in drought-prone regions

  • Clear accountability mechanisms when AI infrastructure competes with public safety needs

The technology industry has spent years arguing that it should be trusted to self-regulate on environmental impact. Los Angeles in January 2025 offered one answer to that argument.

This week’s question

So, it’s over to you:  When a city is on fire and water is scarce, who decides whether it goes to cooling servers or saving homes, and what governance frameworks should manage this?

See you in a fortnight. RB.

References

Brookings Institution (2025) AI, data centres, and water. Available at: https://www.brookings.edu/articles/ai-data-centers-and-water/

de Vries-Gao, A. (2026) ‘The carbon and water footprints of data centres and what this could mean for artificial intelligence’, Patterns, 7(1), 101430. https://doi.org/10.1016/j.patter.2025.101430

Environmental Law Institute (ELI) (2025) AI's cooling problem: how data centres are transforming water use. Available at: https://www.eli.org/vibrant-environment-blog/ais-cooling-problem-how-data-centers-are-transforming-water-use

Herrera, M., Xie, X., Menapace, A., Zanfei, A. and Brentan, B.M. (2025) ‘Sustainable AI infrastructure: A scenario-based forecast of water footprint under uncertainty’, Journal of Cleaner Production. Available at: https://www.sciencedirect.com/science/article/pii/S0959652625018785

InformationWeek (2025) LA wildfires raise burning questions about AI's data center water drain. Available at: https://www.informationweek.com/it-infrastructure/la-wildfires-raise-burning-questions-about-ai-s-data-center-water-drain

Lincoln Institute of Land Policy (2025) Data drain: the land and water impacts of the AI boom. Available at: https://www.lincolninst.edu/publications/land-lines-magazine/articles/land-water-impacts-data-centers/

MSCI (2025) When AI meets water scarcity: data centres in a thirsty world. Available at: https://www.msci.com/research-and-insights/blog-post/when-ai-meets-water-scarcity-data-centers-in-a-thirsty-world

Net Zero Insights (2025) How AI growth is intensifying data centre water consumption. Available at: https://netzeroinsights.com/resources/how-ai-intensifying-data-center-water-consumption/

U.S. Geological Survey (2023) Drought in California. Available at: https://www.usgs.gov/centers/california-water-science-center/science/drought-california


Issue #3: The data she's not in

Introduction

Welcome to issue 3! This issue lands around International Women's Day (March 8th 2026), and while that's not the reason for the topic, it's not a coincidence either. The question of how women experience disasters differently to men is not new. What is relatively new is what happens when that uneven experience gets baked into the datasets that AI systems learn from.

In the last issue, we asked: when you know an AI system's accuracy is unevenly distributed, what does responsible deployment actually look like? One place to start answering that is to ask where the unevenness comes from. What if it doesn't start with the system, but with the data?

Let’s remind ourselves of our four lenses from issue one:

  1. Power and Agency

  2. Data and Consent

  3. Accountability and Governance

  4. Operational Reality

Each issue we will be looking at a topic through one or more of these lenses.

Today's topic: If the data that shapes crisis decisions doesn't see women clearly, what happens to the help that follows?

Primary lens: Data & Consent | Secondary lens: Power & Agency.

Let's go….

Who disasters hit hardest

Let's start with what we know. A landmark study published in the Annals of the Association of American Geographers analysed disaster impacts from 141 countries over two decades. It found that natural disasters on average kill more women than men, and that the stronger the disaster, the wider the gap becomes. But here's the critical finding: the higher women's socioeconomic status in a given country, the weaker this effect. In other words, the vulnerability isn't biological. It is structural. It is built into the everyday patterns of who has access to information, resources, and decision-making power - and historically, that has usually been men.

This plays out in specific ways. During Bangladesh's 1991 cyclone, women were three to five times more likely to die than men. Research attributed this primarily to women's limited access to warning information and their lack of agency in deciding how to respond to the hazard. Evacuation decisions often depended on male household members, reflecting social norms that limited women’s independent mobility and access to warnings. 

Now, before we go further, this is not a simple story of "women always lose", nor does it diminish the challenges unique to men that arise during a disaster response. As always, context matters enormously. Men account for the majority of flood-related deaths in Europe and the United States (often around two-thirds), largely linked to risk-taking behaviour and rescue activity. So the gendered impact of disasters is not uniform; it depends on the social, cultural and economic context in which the disaster occurs. Globally, however, the pattern is clear: existing inequalities are amplified, not equalised, by crisis.

Falling into the gap

So we know that men and women experience disasters differently, and you could reasonably expect that the data we collect during and after disasters would reflect this. However - yes, you guessed it - it often doesn't.

A World Bank report on gender and disaster risk found that disaster risk management lags behind other sectors in the collection and reporting of sex- and age-disaggregated data. Reviews of disaster risk management under the Hyogo Framework found that sex- and age-disaggregated data was rarely collected or analysed, and gender considerations were often absent from post-disaster needs assessments. In addition, many countries still do not report disaggregated data even on the most basic indicators: deaths, injuries, and direct losses.

This matters for a simple reason. If you don't count it, you can't see it. And if you can't see it, you can't act on it.

There are also subtler ways the data gets skewed. The same report found that information on affected populations is often limited to aggregated numbers at the household level, rarely capturing individual-level data on damages and losses. When women lack access to bank accounts and hold a larger share of their assets in tangible form (eg, livestock, jewellery, stored crops etc), those assets are both more vulnerable to disaster and less visible in standard loss assessments. A destroyed house may get counted, but informal or unregistered assets will not.  The report explicitly called for collecting more information on damages and losses at the individual rather than household level. The result is that data collection practices can systematically make women's experiences less visible, not because anyone set out to exclude them, but because the default methods weren't designed to include them.
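
To see how much the unit of recording alone changes what a dataset "knows", here is a small illustrative sketch in Python. The households, people, assets, and values are entirely invented; the point is simply that when losses are recorded per household and limited to formally registered assets, the losses women were more likely to hold never enter the data at all.

```python
# Illustrative only: invented records showing how household-level, formal-asset loss
# recording makes individually held, informal assets invisible in the final dataset.

individual_losses = [
    # (household, asset holder, asset type, loss value, formally registered?)
    ("HH-01", "man",   "house",        12_000, True),
    ("HH-01", "woman", "livestock",     4_000, False),
    ("HH-01", "woman", "jewellery",     2_500, False),
    ("HH-02", "man",   "house",         9_000, True),
    ("HH-02", "woman", "stored crops",  3_000, False),
]

# What a standard assessment records: one figure per household, formal assets only.
recorded_by_household = {}
for household, holder, asset, loss, is_formal in individual_losses:
    if is_formal:
        recorded_by_household[household] = recorded_by_household.get(household, 0) + loss

# What was actually lost, broken down by who held the asset.
actual_by_holder = {}
for household, holder, asset, loss, is_formal in individual_losses:
    actual_by_holder[holder] = actual_by_holder.get(holder, 0) + loss

print("Recorded (household level, formal assets):", recorded_by_household)
print("Actual losses by asset holder:            ", actual_by_holder)
# The 9,500 in losses held by women never appears in the recorded data -
# and a model trained on the recorded data has no way of knowing it existed.
```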

And it goes deeper still. A study of the Yemen humanitarian response, published in the International Journal of Information Management, conducted 25 interviews with humanitarian managers and analysts and reviewed 47 reports and datasets. The researchers found evidence of a cycle of bias reinforcement, in which biased data at the field level cascaded upward through headquarters and donor decision making levels of the response. Among the four types of bias they identified, sampling bias was particularly revealing: respondent groups were frequently gender-imbalanced, with males overrepresented. In some cases, questions on sexual and gender-based violence were removed from surveys in order to obtain approval from local authorities to conduct them. So the very harms that disproportionately affect women were being edited out of the data before it was even collected.  Although Yemen is a conflict rather than a natural disaster context, the dynamics of bias reinforcement in data collection apply across crisis types.

What this means for AI

If you read Issue 2, you'll recognise the shape of this problem. Last time, we looked at how the vantage point of a damage assessment system could create blind spots, with satellite imagery systematically under-reporting damage that was visible at closer range. The issue wasn't the AI model itself, it was the data the model learned from.

This issue extends that logic, in that if disaster data systematically underrepresents women's experiences, then any AI system trained on that data will inherit the gap. It will learn that the patterns in the data are the patterns that matter.

Think about what an AI system trained on this kind of data would absorb. It would learn that the household is the relevant unit for needs assessment. It would learn damage patterns from loss data that captures property and infrastructure but not the assets women are more likely to hold. It would learn to prioritise the types of harm that are most visible in existing datasets, which are the types of harm that existing collection methods were designed to capture, which are not the same as the types of harm that disproportionately affect women.

A scoping review in BMC Medical Ethics examining dozens of studies on AI in humanitarian crises found that biased data processing leading to inequitable assistance distribution was the most frequently cited ethical concern. The problem is not theoretical: as UNDP has noted, AI models trained on publicly available content tend to reflect the structural inequalities of the societies that produced the data, and these patterns are often reproduced or even amplified as models generate new outputs.

The quiet compounding

Let's pause on something. None of this requires malice and none of it even requires negligence in any individual decision. A needs assessment team interviews the people who are available. Asset registrations follow existing legal and property norms. Survey instruments are adapted to what authorities will permit. Data gets aggregated at the household level because that's the standard unit. Each of these is a practical, defensible choice.  However, the combination produces a cumulative effect: a systematic underrepresentation of women's disaster experiences in the datasets that increasingly drive decisions, and when those datasets feed AI systems, the underrepresentation doesn't just persist - it scales.

This is what makes data bias different from data absence. Absence is visible: you can see a blank column. Bias looks like complete data. The dataset has entries, the model produces outputs, the dashboard displays results, and everything appears to be working. The gap is not in what the system shows; it is in what the system was never given to learn from.

Through the lenses

Through the Data & Consent lens, the issue is foundational. Data collection in crisis settings is already ethically fraught: consent is constrained, people have limited ability to refuse or control how their information is used, and the power imbalance between data collectors and affected populations is significant. When collection methods are also structurally gendered, failing to capture women's experiences or actively excluding certain types of harm from surveys, the data does not just reflect reality unevenly, it constructs a version of reality in which women's needs are systematically smaller than they actually are.

Through the Power & Agency lens, that constructed reality then shapes who gets what. If AI-driven resource allocation models, risk scores, or needs assessments are trained on gender-skewed data, they will produce outputs that appear objective while encoding existing inequalities. The communities and individuals whose experiences were underrepresented in the data will be underrepresented in the response. Not because someone decided they mattered less, but because the system's understanding of the situation was shaped by data that didn't fully see them.

So what do we do about this?

This is not an argument against data collection, or against AI in crisis response - both have value. However, there is a difference between collecting data and collecting data well, and there is a difference between a dataset that is large and a dataset that is representative.

The researchers who identified the bias reinforcement cycle in Yemen did not suggest abandoning data-driven response. They argued that practitioners and policymakers need to become aware of the biases in the data they use for decision-making, and that response organisations need to invest in identifying and mitigating those biases. The same applies to any AI system built on crisis data.

Sex- and age-disaggregated data is not a new concept. Frameworks exist. Standards exist. What often doesn't exist is the operational will, the funding, or the time to implement them properly under crisis conditions. That is an honest constraint, but it is also a choice about what we prioritise.

So it's over to you with this week's question:

If data collection practices in disasters have historically underrepresented women's experiences, and AI systems are now being trained on that data, who is responsible for the gap? And at what point does "we didn't have the data" stop being an acceptable answer?

See you in a fortnight. RB.

Sources & further reading

– Neumayer, E. & Plümper, T. (2007). "The Gendered Nature of Natural Disasters: The Impact of Catastrophic Events on the Gender Gap in Life Expectancy, 1981–2002." Annals of the Association of American Geographers, 97(3), 551–566. https://www.tandfonline.com/doi/full/10.1111/j.1467-8306.2007.00563.x

– Ikeda, K. (1995). "Gender Differences in Human Loss and Vulnerability in Natural Disasters: A Case Study from Bangladesh." Indian Journal of Gender Studies, 2(2), 171–193. https://journals.sagepub.com/doi/10.1177/097152159500200202

– Doocy, S., Daniels, A., Murray, S. & Kirsch, T.D. (2013). "The Human Impact of Floods: A Historical Review of Events 1980–2009 and Systematic Literature Review." PLOS Currents Disasters. https://pmc.ncbi.nlm.nih.gov/articles/PMC3644291/

– World Bank / GFDRR (2021). "Gender Dimensions of Disaster Risk and Resilience: Existing Evidence." https://www.worldbank.org/en/topic/disasterriskmanagement/publication/gender-dynamics-of-disaster-risk-and-resilience

– Paulus, D., de Vries, G., Janssen, M. & Van de Walle, B. (2023). "Reinforcing data bias in crisis information management: The case of the Yemen humanitarian response." International Journal of Information Management, 72, Article 102663. https://www.sciencedirect.com/science/article/pii/S0268401223000440

– Kreutzer, T., Orbinski, J., Appel, L., An, A., Marston, J., Boone, E. & Vinck, P. (2025). "Ethical implications related to processing of personal data and artificial intelligence in humanitarian crises: a scoping review." BMC Medical Ethics, 26, 49. https://pmc.ncbi.nlm.nih.gov/articles/PMC11998222/

– UNDP (2025). "AI, gender bias and development." https://www.undp.org/eurasia/blog/ai-gender-bias-and-development


Issue #2: It worked, just not equally

Introduction

In crisis contexts, generally speaking, decisions are made in good faith and are defensible at the time. Yet the nature of crisis environments means that, no matter how well intended a course of action, the results can often be questionable. It is the very reasonableness of such decision-making that throws up the most ethical questions. As such, this newsletter is less interested in villains and "gotchas" and more in the structural and systemic environments that allowed a particular choice and decision to be made.

Note: There are of course plenty of examples of bad faith decision making within the sector, but that is a topic for another issue! You will remember the four lenses from issue 1.

1. Power and Agency

2. Data and Consent

3. Accountability and Governance

4. Operational Reality

Each issue we will be looking at a topic through one or more of these lenses.

Today's topic: If an AI system can only act on what it can see, what happens to the damage it can't?

Primary lens: Operational Reality | Secondary lens: Power & Agency

Let's go….

By way of example, let us consider the aftermath of major hurricanes. The pressure to assess damage quickly is enormous. Emergency managers need to know which areas were hit hardest and how to allocate limited resources. Traditionally, this has meant sending inspection teams into affected areas to assess the damage. This is usually a slow, dangerous, and labour-intensive process that can take days or weeks to produce a usable picture. We also need to know where to send these teams in the first place, which is itself a difficult call to make.

Immediately we can see the appeal of AI in this space. Fly drones or reposition satellites, capture imagery, and let machine learning models classify damage across thousands of buildings in minutes rather than days. Faster information + faster decisions = better help.

Of course there is a 'but…' coming…

Before we launch ourselves head first into the doom and gloom, it's important to remember that some real achievements have been made in this domain. For example, when AI-based damage assessment tools were deployed during the 2024 hurricane season, the results were genuinely impressive. One system assessed over 400 buildings in approximately 18 minutes. The technology worked. But worked for whom? (There's that 'but' I was just talking about).

What we can't see from the sky

Aerial based damage assessment, whether done by humans or machines, depends on a basic assumption: that what is visible from above reflects what has happened below. (I can feel the collective eye roll from you seasoned disaster managers! But stick with me, and let's get into this, that's what we are here for).

To understand why this matters, it helps to know how these systems are built. First, humans look at aerial images from a previous disaster and label each building: no damage, minor damage, major damage, destroyed. An AI model is then trained on those labels and it learns the visual patterns that correspond to each category. The model is tested against more human labels from the same imagery source, and if it performs well enough, it gets deployed. When the next hurricane hits, fresh imagery is captured and the model classifies each building based on what it previously learned.
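
For readers who like to see the shape of that chain in code, here is a stripped-down, purely illustrative sketch in Python using scikit-learn. The features and labels are synthetic stand-ins (the real systems learn from aerial imagery and human annotations, not random numbers); what the sketch preserves is the structure: the model is trained and scored against labels from a single imagery source.

```python
# Toy sketch of the damage-assessment training chain. Features and labels are synthetic
# stand-ins; in the real pipeline, features come from aerial imagery and labels from
# human annotators who looked at that same imagery.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_buildings = 2_000

# Synthetic "satellite-derived" human labels: 0 = no damage ... 3 = destroyed.
satellite_labels = rng.integers(0, 4, size=n_buildings)

# Synthetic image features that, by construction, reflect the satellite view,
# so the model can learn the satellite labelling convention well.
features = satellite_labels.reshape(-1, 1) + rng.normal(0, 0.4, size=(n_buildings, 5))

X_train, X_test, y_train, y_test = train_test_split(
    features, satellite_labels, test_size=0.3, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy against satellite-derived labels: {accuracy:.2f}")

# A high score here only means the model reproduces the satellite labels.
# Whether those labels reflect conditions on the ground is a separate question.
```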

The quality of every step in that chain depends on the quality of the first one: the human labels. And the quality of those labels depends on what the humans could see. Research published at the ACM Conference on Fairness, Accountability, and Transparency audited damage labels for over 15,000 buildings across three major US hurricanes. The study compared assessments made from satellite imagery with those made from drone imagery for the same buildings, using the same damage scale, labelled by human assessors.

The disagreement rate was 29%, meaning that nearly one in three buildings was classified differently depending on which imagery source was used. And the pattern was not random: satellite-derived labels systematically under-reported damage compared to drone-derived labels, meaning that buildings which appeared undamaged from satellite altitude showed clear evidence of harm when viewed at closer range.
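
Mechanically, the audit is a simple comparison of paired labels for the same buildings. The sketch below shows that comparison on a handful of invented labels; the 29% figure and the direction of the skew come from the study itself, not from this toy data.

```python
# Comparing paired per-building damage labels from two imagery sources.
# Labels: 0 = no damage, 1 = minor, 2 = major, 3 = destroyed. Invented toy data.
satellite = [0, 1, 0, 2, 0, 1, 3, 0, 1, 2]
drone     = [1, 1, 2, 2, 1, 1, 3, 0, 2, 2]

pairs = list(zip(satellite, drone))
disagreements = [(s, d) for s, d in pairs if s != d]

disagreement_rate = len(disagreements) / len(pairs)
satellite_lower = sum(1 for s, d in disagreements if s < d)   # satellite saw less damage than drone
satellite_higher = sum(1 for s, d in disagreements if s > d)  # satellite saw more damage than drone

print(f"Disagreement rate: {disagreement_rate:.0%}")
print(f"Satellite label lower in {satellite_lower} of {len(disagreements)} disagreements")
print(f"Satellite label higher in {satellite_higher} of {len(disagreements)} disagreements")
```

The interesting output is not the rate itself but the asymmetry: disagreements that consistently fall on one side are a structural blind spot, not noise.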

Strictly speaking, this isn't an AI failure story. In fact, a large proportion of the most consequential problems with AI often aren't in the AI itself, they're in everything that happens before the AI is switched on. This matters because it tells us something important about where bias enters a system, and it enters earlier than most people might think. The problem is not that an AI model made a mistake, the problem is that the data the model learns from already carries a structural blind spot.

The study didn't test an AI model specifically, and it didn't need to in order to offer useful findings. Namely: if you train a system on satellite imagery that under-reports damage, the system will learn to under-report damage. It will do so confidently, at scale, and without flagging the gap. It doesn't matter how good your model is if what it's learning from doesn't reflect what's actually on the ground.

And here's the uncomfortable part: the model can appear to perform well at every stage. However, because it is tested against labels drawn from the same imagery source it was trained on, its accuracy is measured against the same skewed standard. Without ground-level verification, i.e. someone physically going to the building, there is no way to know which aerial view was closer to reality. You just know they disagreed nearly a third of the time, and that the disagreement went in one direction.

These concerns are not unique to a single study. Independent reviews of AI in emergency management and humanitarian aid have identified data collection practices as a recurring source of structural bias across the field.

The buildings that don't look damaged enough

Let's do a hypothetical walkthrough. After a hurricane, some types of damage are highly visible from the air: collapsed roofs, debris fields, destroyed structures. These are the patterns that imagery-based models are best at detecting. Communities with this kind of damage should be prioritised. But other types of damage are less visible from above: flooding that devastated ground floors while leaving roofs intact, structural compromise that is not apparent from aerial angles, damage to informal or non-standard housing that does not match the patterns in training data, and damage in densely vegetated areas where buildings are partially obscured.

None of this is invisible. It is simply harder to see from the vantage point the system relies on. The result is a quiet but consequential skew. Areas with dramatic, visible destruction are identified quickly and accurately - excellent! But areas with serious but less photogenic damage risk being categorised as lower priority, not because someone decided they mattered less, but because the system's view of the world made their damage harder to detect. This is not a failure of the technology. It is a failure of the perspective.

Reasonable decisions, uneven consequences

It is worth pausing here, because the instinct at this point may be to look for the person who got it wrong. But that is precisely what makes this example ethically interesting. Every decision in this chain is understandable. Using satellite imagery makes sense because it covers vast areas quickly and is available soon after a disaster. Training models on the largest available datasets is standard practice. Deploying automated tools to speed up assessment is a rational response to the data avalanche that modern disasters produce - one operational deployment documented receiving up to 369GB of drone imagery per day, which is far more than any team can manually review.

Of course, no one set out to create a system that would deprioritise less visible damage. But the combination of practical, defensible decisions about imagery sources, training data, deployment timelines, and coverage priorities produced exactly that outcome. So what would this mean if these skewed labels were feeding into real-world decisions?

Through the Operational Reality lens, the system's outputs would appear sound. Buildings classified, maps generated, assessments delivered on time. The model would perform as designed. But its performance would be shaped by assumptions about resolution, vantage point, and what "damage" looks like. These are the very assumptions that, as the study shows, do not hold equally across imagery sources.

Through the Power & Agency lens, those outputs would then shape downstream decisions. Damage assessments inform where inspection teams go first, which communities receive priority aid, and how recovery funding is allocated. An AI system that detects damage unevenly would not just produce inaccurate data, it would redistribute attention, resources, and urgency, without anyone explicitly making that choice.

Speed isn't neutral

Going back to the ACM conference paper: the researchers who built and deployed these systems deserve credit for something. They documented not just their successes but their limitations.

Their published work openly identifies the operational challenges, including connectivity failures, resolution variations and spatial misalignment that degraded or delayed their outputs in the field. They also conducted the audit that revealed the 29% disagreement between imagery sources. More of this kind of transparency please!!

But it also raises an uncomfortable question: if the people who build these systems are openly acknowledging the gaps, what happens when organisations adopt the tools without reading the fine print? In fast-moving disaster response, the pressure to act on available information is immense. A damage assessment that arrives quickly, looks comprehensive, and provides clear classifications is hard to second-guess, especially when the alternative is waiting days for ground-level verification. Speed creates its own authority. But speed is not neutral. When a system produces results faster than they can be validated, the window for challenge shrinks. And when those results carry the visual authority of a colour-coded damage map, they tend to be treated as more definitive than their creators intended.

The risk is not that emergency managers blindly trust AI. Most are experienced, sceptical, and aware of limitations. The risk is that under pressure, with incomplete information and urgent demands for action, a "good enough" automated assessment becomes the baseline and the communities it overlooks become harder to advocate for, because the data does not make their case visible.

So, what are we going to do about this?

This is not an argument against AI-assisted damage assessment. The technology addresses a real and growing problem, and early deployments have shown it can deliver results at a speed that manual assessment simply cannot match. But efficiency and equity are not the same thing. And a system that delivers both speed and confidence can make the gap between them harder to see, not easier.

So it's over to you with this week's question: When an AI system is faster and more scalable than any alternative, but you know its accuracy is unevenly distributed, what does responsible deployment actually look like?

Leave your thoughts in the comments and I will see you in a fortnight.

RB.


Sources & further reading

– Manzini, T., Perali, P., Tripathi, J. & Murphy, R. (2025). "Now you see it, Now you don't: Damage Label Agreement in Drone & Satellite Post-Disaster Imagery." ACM Conference on Fairness, Accountability, and Transparency (FAccT '25). https://dl.acm.org/doi/10.1145/3715275.3732135

– Manzini, T., Perali, P. & Murphy, R. (2025). "Deploying Rapid Damage Assessments from sUAS Imagery for Disaster Response." arXiv preprint, Texas A&M University. https://arxiv.org/abs/2511.03132

– Manzini, T., Perali, P., Murphy, R. & Merrick, D. (2025). "Challenges and Research Directions from the Operational Use of a Machine Learning Damage Assessment System." arXiv preprint, Texas A&M University / Florida State University. https://arxiv.org/abs/2506.15890

– Lythreatis, S. et al. (2025). "Artificial Intelligence in Humanitarian Aid: A Review and Future Research Agenda." Technovation. https://www.sciencedirect.com/science/article/pii/S0166497225002470




Issue #1. Ethical AI Wasn’t Designed for Disasters

Why read this newsletter?

Every disaster movie starts with someone ignoring a scientist.

Artificial intelligence is everywhere. Whether organisations are already using it or still exploring the possibilities, there is no shortage of new tools and terms to keep up with - from systems like ChatGPT to concepts such as large language models, prompt engineering, hallucinations, and retrieval-augmented generation (RAG). The list grows daily.

AI is also increasingly present in disaster and emergency management - from early warning systems and damage assessment to logistics planning and decision support. But what does this actually mean in practice?

In crisis settings, decisions are made under pressure, with incomplete information, and often with life-altering consequences. The promise of AI in these environments is clear: speed, scale, and analytical reach beyond human capacity. But with that promise comes real risk. Errors, bias, and opaque systems can be introduced precisely where there is the least room for them.

So what are we going to do about this?

Ethical guidance for AI exists across many documents, organisations, and initiatives - but it is fragmented and uneven in quality. Even less clear is how these principles hold up in crisis environments, where consent is constrained, accountability is scattered, and “human-in-the-loop” may be more symbolic than real.

This newsletter exists to sit in that gap - without claiming to fill it.

It does not argue for or against the use of AI in crisis contexts. Instead, it slows things down enough to ask how AI is being used, who it serves, and who bears the risk when it fails.

Why do disasters change the ethical rules so much anyway?

Disasters create unusual environments. They compress time, concentrate power, and narrow the range of choices available to individuals and communities. Decisions that would normally involve consultation, consent, or deliberation are often made quickly, by a small number of actors, under conditions of uncertainty.

This matters for AI because many ethical assumptions baked into technology design do not hold in crisis settings. Consent may be nominal or impossible. Data may be collected from people with little ability to refuse, limited understanding of how it will be used, or no control over its future reuse. Accountability can become scattered as responsibility is distributed across agencies, vendors, models, and workflows.

In these contexts, AI systems do not simply support decisions. They can shape strategy - influencing what is seen as urgent, relevant, or even possible. A risk score, forecast, or generated summary can quietly steer attention and resources, even when a human remains formally in charge.

The challenge is not only whether an AI system is accurate, but whether its influence is visible, contestable, and appropriate for the moment in which it is used.

What may be acceptable in planning or low-stakes settings can become ethically fraught when lives, livelihoods, and trust are on the line.

How will this newsletter approach this?

Ethical risks in disaster settings rarely appear as isolated technical failures. More often, they emerge from predictable patterns across tools, organisations, and crisis contexts.

To make those patterns visible, this newsletter examines AI developments through four recurring lenses - each highlighting a different way ethical risk tends to surface when AI is introduced into high-stakes, human-centred decision-making.

These lenses are not exhaustive, but they capture the most common ways ethical risk surfaces when AI enters crisis decision-making.

The four lenses

1. Power & Agency

Who decides, who benefits, and who bears the risk?

In crisis settings, AI systems can acquire authority they have not earned. Confident forecasts, scores, or summaries may narrow the range of options under consideration, embedding value judgements about what matters most - including implicit assumptions about fairness and whose needs are prioritised.

In time-critical environments, these outputs can shortcut human deliberation rather than support it, shifting decision-making power away from people and towards systems whose assumptions are rarely explicit or contested at the moment decisions are made.

What appears to be neutral technical support can, in practice, define urgency, shape priorities, and influence who receives assistance - with consequences that are not evenly distributed.

2. Data & Consent

How is data collected, owned, reused, and protected?

Human-centred crises are environments where meaningful consent is often constrained or impossible. Ethical data practices designed for stable settings - informed consent, clear purpose limitation, restrictions on reuse - can quickly erode under emergency conditions.

Bias can be introduced not only through what data is collected, but through whose data is missing, outdated, or over-represented. The ethical risk is not limited to collection itself, but extends to what happens afterwards: when crisis data is retained, repurposed, or combined in ways that expose communities to harm long after the emergency has passed.

3. Accountability & Governance

Who is responsible when AI influences life-critical decisions?

When AI systems are embedded into complex, multi-agency crisis workflows, responsibility can become blurred. Decisions may be shaped by a combination of data pipelines, models, vendors, internal teams, and partner organisations.

Influence ≠ responsibility.

Systems can shape outcomes without being accountable for them, while organisations may retain formal accountability without the power, time, or transparency needed to intervene meaningfully.

Calls for transparency or explainability do not resolve this on their own if decision-makers cannot realistically challenge, override, or pause an AI system under operational pressure. When something goes wrong, it is often unclear who can intervene - or where accountability ultimately sits.

4. Operational Reality

What actually happens on the ground - not what the model promises?

AI systems often rely on historical or aggregated data that struggles to keep pace with rapidly changing crisis conditions. Infrastructure damage, population movement, political constraints, and access limitations can quickly invalidate model assumptions.

In these moments, AI rarely fails loudly. Instead, it risks failing quietly - producing outputs that appear reasonable, explainable, or technically sound while no longer reflecting the realities facing responders and affected communities.

Does this feel uncomfortably familiar? Then you’re probably in the right place.

Rather than treating these lenses as abstract principles, this newsletter uses them as practical tools. Some issues will explore a single lens in depth; others will examine specific cases through several lenses at once.

As this series begins, the aim is not to provide answers, but to ask better questions - together.

Week one question

If AI systems reshape human-centred crisis decision-making in subtle ways, how would we notice - and what would it take to intervene in time?

Leave your thoughts in the comments and I’ll see you in a fortnight.

RB.
