
DeepMind’s WeatherNext 2: Functional Generative Networks Power Faster 15-Day Probabilistic Forecasts

DeepMind's WeatherNext 2 leverages Functional Generative Networks and a large ensemble to sample full 15-day global weather trajectories, producing faster, more accurate probabilistic forecasts now integrated across Google products.

What WeatherNext 2 does

Google DeepMind has released WeatherNext 2, an AI-driven medium-range global forecasting system that now powers upgraded forecasts across Google Search, Gemini, Pixel Weather and the Google Maps Platform Weather API, with full Maps integration coming soon. The system produces probabilistic 15-day global trajectories at 0.25° resolution with a 6-hour timestep, and is available as data products in Earth Engine and BigQuery and as an early-access model on Vertex AI.

Functional Generative Network architecture

At the core of WeatherNext 2 is the Functional Generative Network (FGN). Rather than predicting a single deterministic field, FGN directly samples from the joint distribution over full 15-day global weather trajectories. Each state X_t comprises six atmospheric variables at 13 pressure levels plus six surface variables on a 0.25° latitude-longitude grid, updated every 6 hours.
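
For concreteness, here is the arithmetic on those dimensions as a small Python sketch. The 721 × 1440 grid size is the standard 0.25° latitude-longitude grid; the variable names in the comments are illustrative, and the exact layout in WeatherNext 2 may differ.

```python
# Size of one weather state X_t under the dimensions described above.
ATMOS_VARS = 6        # e.g. temperature, winds, humidity, geopotential
PRESSURE_LEVELS = 13  # vertical levels per atmospheric variable
SURFACE_VARS = 6      # e.g. 2 m temperature, 10 m winds, surface pressure

channels = ATMOS_VARS * PRESSURE_LEVELS + SURFACE_VARS  # 84 channels
lat, lon = 721, 1440  # standard 0.25-degree global grid

print(f"channels per state: {channels}")
print(f"values per state:   {channels * lat * lon:,}")  # ~87 million
```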

FGN maps the regular grid to a latent representation on a spherical icosahedral mesh refined six times. A graph neural network encoder and decoder handle the grid-to-mesh and mesh-to-grid transforms, and a graph transformer operates on the mesh nodes. The production model is substantially larger than the earlier GenCast denoiser: roughly 180 million parameters per model seed, a latent dimension of 768 and 24 transformer layers, compared with GenCast's 57 million parameters, latent dimension of 512 and 16 layers. The model is run autoregressively from two initial analysis frames to generate ensemble trajectories.
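
A heavily simplified sketch of that encode-process-decode rollout loop is shown below. Plain linear layers and a vanilla Transformer stand in for the GNN grid-to-mesh transforms and the graph transformer; the module names and structure are illustrative, not DeepMind's implementation.

```python
import torch
import torch.nn as nn

class FGNSketch(nn.Module):
    """Simplified stand-in for the FGN encode-process-decode step.

    The production model uses GNN-based grid<->mesh transforms and a
    graph transformer on icosahedral mesh nodes (a 6-times-refined
    icosahedron has 40,962 nodes); plain linear layers and a vanilla
    Transformer are used here purely to show the data flow.
    """

    def __init__(self, channels=84, latent=768, layers=24):
        super().__init__()
        self.encoder = nn.Linear(2 * channels, latent)   # two input frames
        block = nn.TransformerEncoderLayer(latent, nhead=8, batch_first=True)
        self.processor = nn.TransformerEncoder(block, num_layers=layers)
        self.decoder = nn.Linear(latent, channels)

    def forward(self, x_prev, x_curr):
        # x_prev, x_curr: (batch, nodes, channels) -- mesh-space features
        z = self.encoder(torch.cat([x_prev, x_curr], dim=-1))
        z = self.processor(z)
        return x_curr + self.decoder(z)  # residual update to the state

def rollout(model, x_prev, x_curr, steps=60):
    """60 steps of 6 hours = one 15-day trajectory."""
    traj = []
    for _ in range(steps):
        x_next = model(x_prev, x_curr)
        traj.append(x_next)
        x_prev, x_curr = x_curr, x_next
    return torch.stack(traj, dim=1)  # (batch, steps, nodes, channels)
```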

Modeling epistemic and aleatoric uncertainty

WeatherNext 2 separates epistemic and aleatoric uncertainty in a scalable way. Epistemic uncertainty is represented by a deep ensemble of four independently initialized and trained FGN seeds. Each seed then draws multiple stochastic samples, so the combined ensemble is both large and diverse: spread across seeds reflects model uncertainty, while spread within a single seed reflects the intrinsic variability of the weather.
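
A sketch of how the two sources of spread combine, reusing the `rollout` helper from the architecture sketch above; the number of samples per seed here is an illustrative choice, not the production configuration.

```python
import torch

def generate_ensemble(models, x_prev, x_curr, samples_per_seed=8, steps=60):
    """Combine epistemic (across seeds) and aleatoric (within seed) spread.

    `models` holds the four independently initialized and trained FGN
    seeds; `samples_per_seed` is illustrative. Each call to `rollout`
    (from the sketch above) is assumed to draw fresh functional-
    perturbation noise internally.
    """
    members = []
    for model in models:                    # epistemic: independent seeds
        for _ in range(samples_per_seed):   # aleatoric: noise samples
            members.append(rollout(model, x_prev, x_curr, steps))
    return torch.stack(members)             # (n_members, batch, steps, ...)
```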

Aleatoric uncertainty is handled via shared functional perturbations. At each forecast step the model samples a 32-dimensional Gaussian noise vector epsilon_t and feeds it through parameter-shared conditional normalization layers in the network. This effectively samples a new set of weights theta_t for that forward pass. Different noise realizations create dynamically coherent alternative forecasts for the same initial condition, so ensemble members represent distinct plausible outcomes rather than independent gridpoint noise.
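
One plausible way to render this mechanism in code is noise-conditioned normalization, sketched below. The module and its dimensions are illustrative rather than the paper's exact layers, but they show the key property: a single epsilon_t is shared by every conditioned layer in one forward pass, so the whole network is perturbed coherently.

```python
import torch
import torch.nn as nn

class NoiseConditionedLayerNorm(nn.Module):
    """LayerNorm whose scale and shift are predicted from a shared
    32-dimensional noise vector, so one epsilon_t perturbs the whole
    network coherently (an illustrative rendering of shared
    functional perturbations, not the exact production layer)."""

    def __init__(self, dim, noise_dim=32):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(noise_dim, 2 * dim)

    def forward(self, x, eps):
        scale, shift = self.to_scale_shift(eps).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale) + shift

# One noise draw per forecast step, shared by every conditioned layer:
eps_t = torch.randn(32)                    # epsilon_t ~ N(0, I_32)
layer = NoiseConditionedLayerNorm(dim=768)
x = torch.randn(4, 768)
y = layer(x, eps_t)                        # same eps_t reused network-wide
```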

Training on marginals with CRPS and learning joint structure

FGN is trained on per-location, per-variable marginals using the Continuous Ranked Probability Score (CRPS) as the loss. CRPS is estimated with a fair (unbiased) ensemble estimator at each grid point and averaged across variables, levels and lead times, encouraging sharp, well-calibrated scalar predictive distributions. Later training stages introduce short autoregressive rollouts (up to eight steps) and backpropagate through them to improve longer-range stability.
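
A minimal numpy rendering of the fair ensemble CRPS estimator at a single grid point:

```python
import numpy as np

def fair_crps(ensemble, obs):
    """Fair ensemble estimator of CRPS at a single grid point.

    ensemble: (M,) array of ensemble samples for one variable/location
    obs: scalar verifying value
    Uses the unbiased form: mean |x_i - y| minus the mean pairwise
    member distance with the 1/(M(M-1)) correction.
    """
    m = len(ensemble)
    term1 = np.mean(np.abs(ensemble - obs))
    pairwise = np.abs(ensemble[:, None] - ensemble[None, :]).sum()
    term2 = pairwise / (2 * m * (m - 1))
    return term1 - term2

# Toy check: a sharp, well-centered ensemble scores lower (better).
rng = np.random.default_rng(0)
print(fair_crps(rng.normal(0.0, 0.5, 32), obs=0.0))  # sharp, centered
print(fair_crps(rng.normal(2.0, 0.5, 32), obs=0.0))  # biased -> higher
```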

Although supervision is marginal, the low-dimensional global noise and shared functional perturbations force the model to learn realistic joint spatial and cross-variable structure. A single 32-dimensional noise vector influencing the entire field incentivizes encoding physically consistent spatial correlations and cross-variable relationships, which reduces CRPS more effectively than independent fluctuations would. Experiments show the ensemble captures realistic regional aggregates and derived quantities.

Performance vs GenCast and traditional baselines

On marginal metrics WeatherNext 2's FGN ensemble improves over GenCast in 99.9% of variable, level and lead time combinations, with average CRPS improvements around 6.5% and maximum gains near 18% for some variables at shorter lead times. Ensemble mean RMSE also improves while maintaining reliable spread-error relationships out to 15 days.

To assess joint behavior, the team pooled CRPS over spatial windows and evaluated derived quantities such as 10-meter wind speed and the geopotential thickness between 300 hPa and 500 hPa. FGN shows lower pooled CRPS than GenCast, indicating better modeling of regional aggregates and multivariate relationships.
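
A minimal sketch of both ideas, reusing `fair_crps` from the training section above: the derived quantities combine variables or levels nonlinearly, and pooling averages fields over spatial windows before scoring, so the metric rewards correct spatial correlations rather than just per-point marginals. The window size here is illustrative.

```python
import numpy as np

# Derived quantities evaluated jointly across variables/levels:
def wind_speed_10m(u10, v10):
    return np.sqrt(u10**2 + v10**2)   # nonlinear in (u, v) jointly

def thickness_300_500(z300, z500):
    return z300 - z500                # geopotential thickness of the layer

def pooled_crps(ens_field, obs_field, window=7):
    """Average CRPS of spatially mean-pooled fields.

    ens_field: (M, lat, lon); obs_field: (lat, lon). Both are pooled
    over `window` x `window` boxes before scoring with `fair_crps`
    (defined in the CRPS sketch above).
    """
    M, H, W = ens_field.shape
    h, w = H // window, W // window
    ens = ens_field[:, :h*window, :w*window] \
        .reshape(M, h, window, w, window).mean(axis=(2, 4))
    obs = obs_field[:h*window, :w*window] \
        .reshape(h, window, w, window).mean(axis=(1, 3))
    return np.mean([fair_crps(ens[:, i, j], obs[i, j])
                    for i in range(h) for j in range(w)])
```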

Tropical cyclone tracking is a critical application. Using an external tracker, the research shows FGN yields ensemble-mean track errors equivalent to roughly one extra day of useful predictive skill compared with GenCast. Even when constrained to a 12-hour timestep, FGN outperforms GenCast beyond two-day lead times. Relative economic value analyses on track probability fields also favor FGN across a range of cost-loss ratios, which matters for evacuation and asset-protection decisions.
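
For readers unfamiliar with relative economic value, the sketch below implements the standard cost-loss formulation, comparing forecast-driven expenses against climatology and a perfect forecast. This is the textbook computation, not necessarily the exact analysis used in the WeatherNext 2 evaluation.

```python
def relative_economic_value(hit_rate, false_alarm_rate, base_rate, alpha):
    """Standard cost-loss relative economic value (REV).

    alpha = C/L: the cost of protecting divided by the loss avoided.
    REV = 1 means forecast-driven decisions are as cheap as with a
    perfect forecast; 0 means no better than climatology.
    """
    s, H, F = base_rate, hit_rate, false_alarm_rate
    # Mean expense per unit loss when acting on the forecast:
    expense_forecast = alpha * (H * s + F * (1 - s)) + (1 - H) * s
    expense_climate = min(alpha, s)   # cheaper of always/never protect
    expense_perfect = alpha * s       # protect exactly when needed
    return ((expense_climate - expense_forecast)
            / (expense_climate - expense_perfect))

# Value across cost-loss ratios for one illustrative operating point:
for alpha in (0.02, 0.1, 0.3):
    v = relative_economic_value(hit_rate=0.8, false_alarm_rate=0.1,
                                base_rate=0.05, alpha=alpha)
    print(f"alpha={alpha:.2f}  REV={v:.3f}")
```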

Availability and integrations

WeatherNext 2 is exposed as data products in Earth Engine and BigQuery, and as an early-access model on Vertex AI. It already powers updated weather features in Google Search, Gemini, Pixel Weather and the Google Maps Platform Weather API, with broader Maps integration planned next. The research paper and project resources provide further technical detail and code references for those who want to explore the models and tutorials.
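
As a starting point, access via the BigQuery client looks like the sketch below. The table path and columns are placeholders, not real dataset IDs; consult the official Earth Engine and BigQuery documentation for the actual product names.

```python
# Minimal sketch of querying a WeatherNext data product from BigQuery.
# The table path below is a PLACEHOLDER -- see the official docs for
# the real dataset IDs.
from google.cloud import bigquery

client = bigquery.Client()
sql = """
    SELECT *
    FROM `your-project.weathernext_dataset.forecast_table`  -- placeholder
    LIMIT 10
"""
df = client.query(sql).to_dataframe()
print(df.head())
```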
