In short
- Google built the largest-ever flash flood dataset by using Gemini to mine twenty years of global news reports.
- The dataset now powers an AI model that predicts urban flash floods up to 24 hours in advance.
- The system fills a major data gap that long blocked flash flood forecasting.
Flash floods kill thousands of people every year. They strike fast, hit cities hardest, and for decades there was almost nothing scientists could do to see them coming, because the data to train prediction models simply did not exist.
On Thursday, Google said it found a way around that problem: by reading the news.
The company unveiled Groundsource, a system that uses Gemini AI to comb through millions of news articles published since 2000, pull out references to flood events, and pin each one to a location and a date. The result is a dataset of 2.6 million historical flash floods spanning more than 150 countries, now open for anyone to download and use.
That dataset was then used to train a new AI model capable of forecasting whether a flash flood is likely to hit an urban area within the next 24 hours. The forecasts are now live on Google's Flood Hub, the same platform the company already uses to warn roughly 2 billion people about river-related flooding worldwide.
The problem Groundsource solves is surprisingly basic. Rivers have physical gauges, sensors sitting in the water that have been recording levels for decades. That is how forecasters learned to predict when a river would overflow. City streets have nothing like that. When intense rain hits pavement and overwhelms drainage systems, the flooding happens too fast and too locally to track with traditional instruments.
Without historical records, you can't train an AI model to recognize the pattern. Google's fix was to treat news articles as the missing sensor.
"By turning public information into actionable data, we aren't just analyzing the past; we're building a more resilient future for everyone, toward our goal that no one is surprised by a natural disaster," Google said.

After filtering out ads, navigation menus, and duplicates, and translating articles from other languages into English, the team turned millions of messy text descriptions into clean, geolocated time-series data.
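To make the shape of that pipeline concrete, here is a minimal sketch of turning raw article records into deduplicated, geolocated flood events. Everything here is illustrative: the tiny `GAZETTEER` lookup table, the keyword matching, and the record format all stand in for the Gemini extraction and geocoding the real system performs.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical gazetteer mapping place names to (lat, lon); the real
# pipeline relies on Gemini plus a proper geocoding service.
GAZETTEER = {
    "jakarta": (-6.2088, 106.8456),
    "lagos": (6.5244, 3.3792),
}

@dataclass(frozen=True)
class FloodEvent:
    place: str
    lat: float
    lon: float
    day: date

def extract_events(articles):
    """Turn raw article records into geolocated flood events.

    Each article is a dict with 'text' and 'published' (a date). An
    article counts as a flood report if it mentions flooding and a
    known place name; reports of the same place on the same day
    collapse into a single event.
    """
    seen = set()
    events = []
    for art in articles:
        text = art["text"].lower()
        if "flood" not in text:
            continue  # not a flood report, skip
        for place, (lat, lon) in GAZETTEER.items():
            if place in text:
                key = (place, art["published"])
                if key not in seen:  # deduplicate same-day coverage
                    seen.add(key)
                    events.append(FloodEvent(place, lat, lon, art["published"]))
    return events
```

Collapsing same-place, same-day reports matters because a single flood often generates dozens of articles; without deduplication the dataset would count media attention, not events.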
The model trained on that data uses an LSTM neural network, a type of AI built for processing sequences over time, to ingest hourly weather forecasts along with local factors like urbanization density, soil absorption rates, and topography. It then outputs a simple signal: medium or high flood risk in the next 24 hours, for any urban area with a population density above 100 people per square kilometer.
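The basic mechanics can be sketched with a toy LSTM cell that folds 24 hourly feature vectors into a single flood probability, then maps that probability onto the two alert levels the article describes. This is a from-scratch NumPy illustration of how such a classifier is wired, not Google's model; the feature layout, weights, and thresholds are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """Minimal LSTM reading an hourly feature sequence (e.g. rainfall
    forecast plus static factors like urban density and soil
    absorption) and emitting a flood probability."""

    def __init__(self, n_in, n_hidden):
        self.n_hidden = n_hidden
        # one stacked weight matrix for the input/forget/cell/output gates
        self.W = rng.normal(0, 0.1, (4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.w_out = rng.normal(0, 0.1, n_hidden)

    def forward(self, seq):
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        for x in seq:  # one recurrent step per forecast hour
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
            h = sigmoid(o) * np.tanh(c)                   # update hidden state
        return sigmoid(self.w_out @ h)  # probability of flooding in 24 h

def risk_label(p, medium=0.5, high=0.8):
    """Collapse the probability into the two alert levels the article
    mentions; the threshold values here are invented."""
    if p >= high:
        return "high"
    if p >= medium:
        return "medium"
    return "none"
```

The recurrence is the point: each hour's forecast updates a running cell state, so the network can accumulate evidence (say, several consecutive hours of intense rain over poorly draining ground) rather than judging any single hour in isolation.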
The system has real limitations. It only covers areas of about 20 square kilometers at a time, cannot tell you how severe a flood will be, and won't perform well in regions where news coverage is thin.
Still, the early results are telling. A regional disaster authority in Southern Africa received a Flood Hub alert during the beta phase, confirmed the flood on the ground, and dispatched a humanitarian worker to manage the response. According to Google's crisis resilience director Juliet Rothenberg, "that chain of events from a prediction in Flood Hub to boots on the ground is exactly what Flood Hub was built for."