Trend Analysis API Reference¶
The climate_trends
accessor provides trend analysis capabilities for climate data.
Overview¶
The Trends module extends xarray Datasets with a .climate_trends
accessor that provides:
Time series trend calculation
Spatial trend analysis across geographic domains
STL decomposition with linear regression
Performance optimization for large-scale trend analysis
Memory-efficient parallel processing
Quick Example¶
import xarray as xr
import climate_diagnostics
ds = xr.open_dataset("temperature_data.nc")
# Calculate spatial trends
trends = ds.climate_trends.calculate_spatial_trends(
variable="air",
num_years=10 # Trend per decade
)
Accessor Class¶
- class climate_diagnostics.TimeSeries.Trends.TrendsAccessor(xarray_obj)[source]¶
Bases:
object
Accessor for analyzing and visualizing trend patterns in climate datasets.
This accessor provides methods to analyze climate data trends from xarray Datasets using statistical decomposition techniques. It supports trend analysis using STL decomposition and linear regression, with proper spatial (area-weighted) averaging, seasonal filtering, and robust visualization options.
The accessor handles common climate data formats with automatic detection of coordinate names (lat, lon, time, level) for maximum compatibility across different datasets and model output conventions.
- optimize_for_trends(variable: str | None = None, operation_type: str = 'timeseries', performance_priority: str = 'balanced') Dataset | None [source]¶
Optimize dataset chunking specifically for trend analysis workflows.
This method applies dynamic, pre-configured chunking strategies optimized for various trend analysis scenarios using the new dynamic chunk calculator.
- Parameters:
variable (str, optional) – Variable to optimize for. If None, optimizes for all variables.
operation_type (str, optional) – Specific trend analysis use case. Options: - ‘timeseries’: General trend calculation (default) - ‘spatial’: Grid-point trend analysis - ‘statistical’: For statistical operations - ‘io’: For I/O heavy operations
performance_priority (str, optional) – Optimization priority. Options: ‘memory’, ‘speed’, ‘balanced’.
- Returns:
Optimized dataset with trend-analysis-friendly chunking.
- Return type:
xr.Dataset or None
- calculate_trend(variable='air', latitude=None, longitude=None, level=None, frequency='M', season='annual', area_weighted=True, period=12, plot=True, return_results=False, save_plot_path=None, title=None)[source]¶
Calculate and visualize the trend of a time series for a specified variable and region.
This method performs the following steps: 1. Selects the data for the given variable and spatial/level domain. 2. Applies a seasonal filter. 3. Computes a spatial average (area-weighted or simple) to get a 1D time series. 4. Applies Seasonal-Trend decomposition using LOESS (STL) to isolate the trend component. 5. Fits a linear regression to the trend component to calculate the slope, p-value, etc. 6. Optionally plots the STL trend and the linear fit.
- Parameters:
variable (str, optional) – Name of the variable to analyze. Defaults to ‘air’.
latitude (float, slice, list, or None, optional) – Latitude selection for the analysis domain. Can be a single point, a slice, or a list of values. If None, the full latitude range is used.
longitude (float, slice, list, or None, optional) – Longitude selection for the analysis domain.
level (float, slice, or None, optional) – Vertical level selection. If a slice is provided, data is averaged over the levels. If None and multiple levels exist, the first level is used by default.
frequency ({'Y', 'M', 'D'}, optional) – The time frequency used to report the slope of the trend line. ‘Y’ for per year, ‘M’ for per month, ‘D’ for per day. Defaults to ‘M’.
season (str, optional) – Seasonal filter to apply before analysis. Supported: ‘annual’, ‘jjas’, ‘djf’, ‘mam’, ‘jja’, ‘son’. Defaults to ‘annual’.
area_weighted (bool, optional) – If True, performs area-weighted spatial averaging using latitude weights. Defaults to True. Ignored for point selections.
period (int, optional) – The periodicity of the seasonal component for STL decomposition. For monthly data, this is typically 12. Defaults to 12.
plot (bool, optional) – If True, a plot of the trend component and its linear fit is generated. Defaults to True.
return_results (bool, optional) – If True, a dictionary containing the detailed results of the analysis is returned. Defaults to False.
save_plot_path (str or None, optional) – If provided, the path where the plot will be saved.
title (str, optional) – The title for the plot. If not provided, a descriptive title will be generated automatically.
- Returns:
If return_results is True, returns a dictionary containing the analysis results, including the trend component (pandas Series), the predicted trend line, region details, and a DataFrame with trend statistics (slope, p-value, etc.). Otherwise, returns None.
- Return type:
dict or None
- Raises:
ValueError – If the variable is not found, no time coordinate is present, or if the data selection and processing result in an empty time series.
- calculate_spatial_trends(variable='air', latitude=None, longitude=None, time_range=None, level=None, season='annual', num_years=1, n_workers=4, robust_stl=True, period=12, plot_map=True, land_only=False, save_plot_path=None, cmap='coolwarm', optimize_chunks=True, performance_priority: str = 'balanced', title=None, projection='PlateCarree')[source]¶
Calculate and visualize spatial trends across a geographic domain.
This method computes the trend at each grid point over a specified time period and spatial domain. It leverages Dask for parallel processing to efficiently handle large datasets. The trend is calculated by applying STL decomposition and linear regression to the time series of each grid cell.
The trend is calculated robustly by performing a linear regression against time (converted to fractional years), making the calculation independent of the data’s native time frequency.
- Parameters:
variable (str, optional) – Name of the variable for which to calculate trends. Defaults to ‘air’.
latitude (slice, optional) – A slice defining the latitude range for the analysis. Defaults to the full range.
longitude (slice, optional) – A slice defining the longitude range for the analysis. Defaults to the full range.
time_range (slice, optional) – A slice defining the time period for the trend analysis. Defaults to the full range.
level (float or None, optional) – A single vertical level to select for the analysis. If None and multiple levels exist, the first level is used by default.
season (str, optional) – Seasonal filter to apply before analysis. Defaults to ‘annual’.
num_years (int, optional) – The number of years over which the trend should be reported (e.g., 1 for trend per year, 10 for trend per decade). Defaults to 1.
n_workers (int, optional) – The number of Dask workers to use for parallel computation. Defaults to 4.
robust_stl (bool, optional) – If True, use a robust version of the STL algorithm, which is less sensitive to outliers. Defaults to True.
period (int, optional) – The periodicity of the seasonal component for STL. Defaults to 12.
plot_map (bool, optional) – If True, plots the resulting spatial trend map. Defaults to True.
land_only (bool, optional) – If True, the output map will mask ocean areas. Defaults to False.
save_plot_path (str or None, optional) – Path to save the output trend map plot.
cmap (str, optional) – The colormap to use for the trend map plot. Defaults to ‘coolwarm’.
optimize_chunks (bool, optional) – Whether to optimize chunking for spatial trends calculation. Defaults to True.
performance_priority (str, optional) – Optimization priority for chunking. Options: ‘memory’, ‘speed’, ‘balanced’.
title (str, optional) – The title for the plot. If not provided, a descriptive title will be generated automatically.
projection (str, optional) – The name of the cartopy projection to use. Defaults to ‘PlateCarree’.
- Returns:
A DataArray containing the calculated trend values for each grid point, typically in units of [variable_units / num_years].
- Return type:
xr.DataArray
- Raises:
ValueError – If essential coordinates (time, lat, lon) are not found, or if the data selection results in insufficient data for trend calculation.
Available Methods¶
Trend Calculation¶
- TrendsAccessor.calculate_trend(variable='air', latitude=None, longitude=None, level=None, frequency='M', season='annual', area_weighted=True, period=12, plot=True, return_results=False, save_plot_path=None, title=None)[source]
Calculate and visualize the trend of a time series for a specified variable and region.
This method performs the following steps: 1. Selects the data for the given variable and spatial/level domain. 2. Applies a seasonal filter. 3. Computes a spatial average (area-weighted or simple) to get a 1D time series. 4. Applies Seasonal-Trend decomposition using LOESS (STL) to isolate the trend component. 5. Fits a linear regression to the trend component to calculate the slope, p-value, etc. 6. Optionally plots the STL trend and the linear fit.
- Parameters:
variable (str, optional) – Name of the variable to analyze. Defaults to ‘air’.
latitude (float, slice, list, or None, optional) – Latitude selection for the analysis domain. Can be a single point, a slice, or a list of values. If None, the full latitude range is used.
longitude (float, slice, list, or None, optional) – Longitude selection for the analysis domain.
level (float, slice, or None, optional) – Vertical level selection. If a slice is provided, data is averaged over the levels. If None and multiple levels exist, the first level is used by default.
frequency ({'Y', 'M', 'D'}, optional) – The time frequency used to report the slope of the trend line. ‘Y’ for per year, ‘M’ for per month, ‘D’ for per day. Defaults to ‘M’.
season (str, optional) – Seasonal filter to apply before analysis. Supported: ‘annual’, ‘jjas’, ‘djf’, ‘mam’, ‘jja’, ‘son’. Defaults to ‘annual’.
area_weighted (bool, optional) – If True, performs area-weighted spatial averaging using latitude weights. Defaults to True. Ignored for point selections.
period (int, optional) – The periodicity of the seasonal component for STL decomposition. For monthly data, this is typically 12. Defaults to 12.
plot (bool, optional) – If True, a plot of the trend component and its linear fit is generated. Defaults to True.
return_results (bool, optional) – If True, a dictionary containing the detailed results of the analysis is returned. Defaults to False.
save_plot_path (str or None, optional) – If provided, the path where the plot will be saved.
title (str, optional) – The title for the plot. If not provided, a descriptive title will be generated automatically.
- Returns:
If return_results is True, returns a dictionary containing the analysis results, including the trend component (pandas Series), the predicted trend line, region details, and a DataFrame with trend statistics (slope, p-value, etc.). Otherwise, returns None.
- Return type:
dict or None
- Raises:
ValueError – If the variable is not found, no time coordinate is present, or if the data selection and processing result in an empty time series.
Spatial Trend Analysis¶
- TrendsAccessor.calculate_spatial_trends(variable='air', latitude=None, longitude=None, time_range=None, level=None, season='annual', num_years=1, n_workers=4, robust_stl=True, period=12, plot_map=True, land_only=False, save_plot_path=None, cmap='coolwarm', optimize_chunks=True, performance_priority: str = 'balanced', title=None, projection='PlateCarree')[source]
Calculate and visualize spatial trends across a geographic domain.
This method computes the trend at each grid point over a specified time period and spatial domain. It leverages Dask for parallel processing to efficiently handle large datasets. The trend is calculated by applying STL decomposition and linear regression to the time series of each grid cell.
The trend is calculated robustly by performing a linear regression against time (converted to fractional years), making the calculation independent of the data’s native time frequency.
- Parameters:
variable (str, optional) – Name of the variable for which to calculate trends. Defaults to ‘air’.
latitude (slice, optional) – A slice defining the latitude range for the analysis. Defaults to the full range.
longitude (slice, optional) – A slice defining the longitude range for the analysis. Defaults to the full range.
time_range (slice, optional) – A slice defining the time period for the trend analysis. Defaults to the full range.
level (float or None, optional) – A single vertical level to select for the analysis. If None and multiple levels exist, the first level is used by default.
season (str, optional) – Seasonal filter to apply before analysis. Defaults to ‘annual’.
num_years (int, optional) – The number of years over which the trend should be reported (e.g., 1 for trend per year, 10 for trend per decade). Defaults to 1.
n_workers (int, optional) – The number of Dask workers to use for parallel computation. Defaults to 4.
robust_stl (bool, optional) – If True, use a robust version of the STL algorithm, which is less sensitive to outliers. Defaults to True.
period (int, optional) – The periodicity of the seasonal component for STL. Defaults to 12.
plot_map (bool, optional) – If True, plots the resulting spatial trend map. Defaults to True.
land_only (bool, optional) – If True, the output map will mask ocean areas. Defaults to False.
save_plot_path (str or None, optional) – Path to save the output trend map plot.
cmap (str, optional) – The colormap to use for the trend map plot. Defaults to ‘coolwarm’.
optimize_chunks (bool, optional) – Whether to optimize chunking for spatial trends calculation. Defaults to True.
performance_priority (str, optional) – Optimization priority for chunking. Options: ‘memory’, ‘speed’, ‘balanced’.
title (str, optional) – The title for the plot. If not provided, a descriptive title will be generated automatically.
projection (str, optional) – The name of the cartopy projection to use. Defaults to ‘PlateCarree’.
- Returns:
A DataArray containing the calculated trend values for each grid point, typically in units of [variable_units / num_years].
- Return type:
xr.DataArray
- Raises:
ValueError – If essential coordinates (time, lat, lon) are not found, or if the data selection results in insufficient data for trend calculation.
Optimization Methods¶
- TrendsAccessor.optimize_for_trends(variable: str | None = None, operation_type: str = 'timeseries', performance_priority: str = 'balanced') Dataset | None [source]
Optimize dataset chunking specifically for trend analysis workflows.
This method applies dynamic, pre-configured chunking strategies optimized for various trend analysis scenarios using the new dynamic chunk calculator.
- Parameters:
variable (str, optional) – Variable to optimize for. If None, optimizes for all variables.
operation_type (str, optional) – Specific trend analysis use case. Options: - ‘timeseries’: General trend calculation (default) - ‘spatial’: Grid-point trend analysis - ‘statistical’: For statistical operations - ‘io’: For I/O heavy operations
performance_priority (str, optional) – Optimization priority. Options: ‘memory’, ‘speed’, ‘balanced’.
- Returns:
Optimized dataset with trend-analysis-friendly chunking.
- Return type:
xr.Dataset or None
Basic Examples¶
Simple Trend Analysis¶
# Calculate trend for a time series
result = ds.climate_trends.calculate_trend(
variable="air",
latitude=slice(60, 90), # Arctic region
plot=True
)
Spatial Trends¶
# Calculate spatial trends (creates plot automatically)
trends = ds.climate_trends.calculate_spatial_trends(
variable="air",
num_years=1, # Trend per year
plot_map=True
)
Regional Trend Comparison¶
# Compare trends for different regions
regions = {
"Arctic": {"latitude": slice(90, 60)},
"Tropics": {"latitude": slice(23.5, -23.5)}
}
regional_trends = {}
for name, bounds in regions.items():
regional_data = ds.sel(**bounds)
# Calculate trend for the region using spatial trends
trend_result = regional_data.climate_trends.calculate_spatial_trends(
variable="air",
num_years=30,
plot_map=False # Don't plot individual regions
)
# Get mean trend for the region
regional_trends[name] = trend_result.mean()
# Plot comparison
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
trend_values = list(regional_trends.values())
region_names = list(regional_trends.keys())
plt.bar(region_names, trend_values)
plt.ylabel("Temperature Trend (K/30 years)")
plt.title("Regional Trend Comparison")
plt.show()
Working with Trend Results¶
# Calculate trends and work with results
trends = ds.climate_trends.calculate_spatial_trends(
variable="air",
num_years=10,
plot_map=False # Don't auto-plot
)
# Print statistics
print(f"Mean global trend: {trends.mean().values:.4f} K/decade")
print(f"Max trend: {trends.max().values:.4f} K/decade")
print(f"Min trend: {trends.min().values:.4f} K/decade")
# Create custom plot
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(12, 8))
trends.plot.contourf(ax=ax, levels=20, cmap="RdBu_r", center=0)
ax.set_title("Temperature Trends (K/decade)")
plt.show()
Performance Optimization for Trends¶
Optimizing Large Datasets¶
# Optimize dataset for trend analysis
ds_optimized = ds.climate_trends.optimize_for_trends(
variable="air",
use_case='trend_analysis'
)
# Calculate trends on optimized dataset
trends = ds_optimized.climate_trends.calculate_spatial_trends(
variable="air",
num_years=30,
optimize_chunks=True
)
Memory-Efficient Trend Workflows¶
# Complete workflow for large datasets
import xarray as xr
import climate_diagnostics
# Load large climate dataset
ds = xr.open_dataset("large_climate_model.nc")
# Optimize for trend calculations
ds_opt = ds.climate_trends.optimize_for_trends(
variable="temperature"
)
# Calculate long-term trends efficiently
long_term_trends = ds_opt.climate_trends.calculate_spatial_trends(
variable="temperature",
num_years=50, # 50-year trends
n_workers=8, # Use 8 parallel workers
optimize_chunks=True,
chunk_target_mb=150
)
# Calculate regional trend time series
regional_trend = ds_opt.climate_trends.calculate_trend(
variable="temperature",
latitude=slice(30, 60),
longitude=slice(-120, -80),
plot=True
)
See Also¶
Time Series Analysis API Reference - Time series analysis methods
Plotting API Reference - Plotting functions