Time Series Analysis API Reference

The climate_timeseries accessor provides time series analysis capabilities for climate data.

Overview

The TimeSeries module extends xarray Datasets with a .climate_timeseries accessor that provides:

Time series plotting and visualization
Spatial standard deviation analysis
STL decomposition for trend and seasonal analysis
Advanced chunking optimization for large datasets
Memory-efficient processing strategies
Performance tuning and diagnostics

Quick Example

import xarray as xr
import climate_diagnostics

ds = xr.open_dataset("temperature_data.nc")

# Plot a time series (returns a matplotlib Axes, or None if nothing could be plotted)
ax = ds.climate_timeseries.plot_time_series(
    variable="air",
    latitude=slice(30, 60)
)

Accessor Class

class climate_diagnostics.TimeSeries.TimeSeries.TimeSeriesAccessor(xarray_obj)

Bases: object

Accessor for analyzing and visualizing climate time series from xarray datasets. Provides methods for extracting, processing, and visualizing time series with support for weighted spatial averaging, seasonal filtering, and time series decomposition.

__init__(xarray_obj)

Initialize the accessor with a Dataset object.

optimize_chunks(target_mb: float = 50, max_mb: float = 200, variable: str | None = None, time_freq: str | None = None, inplace: bool = False)

Optimize dataset chunking for better time series analysis performance.

This method applies intelligent chunking strategies specifically tuned for time series operations. It analyzes the dataset structure and applies memory-efficient chunking that balances:

- Memory usage (target chunk sizes)
- Computational efficiency (parallel processing)
- I/O performance (disk access patterns)

The chunking strategy preserves spatial chunks from disk when beneficial and optimizes time dimension chunking for typical time series workflows.

Parameters:

target_mb (float, optional) – Target chunk size in megabytes. Defaults to 50 MB. This is the “sweet spot” for most time series operations.
max_mb (float, optional) – Maximum chunk size in megabytes. Defaults to 200 MB. Hard limit to prevent memory exhaustion.
variable (str, optional) – Variable to optimize chunking for. If None, optimizes for all variables. Focusing on a specific variable can yield better optimization.
time_freq (str, optional) – Time frequency hint (‘daily’, ‘monthly’, ‘hourly’, ‘6hourly’). Helps the algorithm make better chunking decisions.
inplace (bool, optional) – DEPRECATED: In-place modification is not supported for xarray datasets. This parameter is kept for API compatibility but will raise an error if True. Always use False and reassign the result.

Returns:

Optimally chunked dataset if successful, None if chunking unavailable.

Return type:

xr.Dataset or None

Raises:

ValueError – If inplace=True is requested (not supported).

Examples

>>> # Basic optimization for time series analysis
>>> ds_optimized = ds.climate_timeseries.optimize_chunks(target_mb=100)
>>>
>>> # Focus on specific variable with time frequency hint
>>> ds_opt = ds.climate_timeseries.optimize_chunks(
...     variable='temperature', time_freq='daily', target_mb=75
... )

Notes

This method is a simplified interface to the sophisticated chunking system. For advanced control, use optimize_chunks_advanced().

print_chunking_info(detailed: bool = False)

Print information about current dataset chunking.

Provides a clear overview of current chunking configuration, which is essential for understanding memory usage and performance characteristics.

Parameters:

detailed (bool, optional) – Whether to print detailed per-variable information. Defaults to False. When True, shows chunking info for each variable individually.

optimize_chunks_advanced(operation_type: str = 'timeseries', memory_limit_gb: float | None = None, performance_priority: str = 'balanced', variable: str | None = None, use_disk_chunks: bool = True, inplace: bool = False)

Advanced chunking optimization using sophisticated strategies.

This method implements the advanced chunking strategy that:

- Inspects on-disk chunking from file encoding
- Calculates bytes per time step for optimal memory management
- Chooses time chunks based on target memory usage and parallelization
- Preserves spatial chunking when beneficial
- Adapts to different operation types and performance priorities

Parameters:

operation_type (str, optional) – Type of operation to optimize for. Default is ‘timeseries’. Options:
- ‘timeseries’: Time series analysis (trends, decomposition)
- ‘spatial’: Spatial analysis and plotting
- ‘statistical’: Statistical computations
- ‘general’: General purpose chunking
- ‘io’: Input/output operations
memory_limit_gb (float, optional) – Memory limit in GB. If None, uses 25% of available system memory.
performance_priority (str, optional) – Performance optimization priority. Default is ‘balanced’. Options:
- ‘memory’: Minimize memory usage
- ‘speed’: Maximize computational speed
- ‘balanced’: Balance memory and speed
variable (str, optional) – Variable to optimize chunking for. If None, optimizes for all variables.
use_disk_chunks (bool, optional) – Whether to use disk-aware chunking strategy. Defaults to True.
inplace (bool, optional) – DEPRECATED: In-place modification is not supported for xarray datasets. This parameter is kept for API compatibility but will raise an error if True.

Returns:

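Optimally chunked dataset when inplace=False (the only supported mode, since inplace=True raises an error). The return type also allows None, so check the result before using it.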

Return type:

xr.Dataset or None

Examples

>>> # Optimize for time series analysis with memory priority
>>> ds_opt = ds.climate_timeseries.optimize_chunks_advanced(
...     operation_type='timeseries',
...     performance_priority='memory'
... )

>>> # Optimize for spatial analysis with speed priority
>>> ds = ds.climate_timeseries.optimize_chunks_advanced(
...     operation_type='spatial',
...     performance_priority='speed'
... )
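The time-chunk selection described for this method (bytes per time step, then a chunk length that fits the memory target) can be illustrated with a small standalone calculation. This is a minimal sketch, not the library's implementation: the helper name `suggest_time_chunk`, the 4-byte element size, and the assumption that each chunk spans the full spatial domain are all illustrative.

```python
def suggest_time_chunk(n_time, spatial_shape, target_mb=50.0, max_mb=200.0, itemsize=4):
    """Return a time-chunk length whose chunk size is close to target_mb.

    Hypothetical helper: assumes every chunk covers the whole spatial domain
    and that elements are `itemsize` bytes each (4 for float32).
    """
    # Bytes needed to hold one time step of the full spatial field
    bytes_per_step = itemsize
    for n in spatial_shape:
        bytes_per_step *= n

    mb = 1024 ** 2
    # Number of time steps that fit in the target and in the hard limit
    target_steps = max(1, int(target_mb * mb // bytes_per_step))
    max_steps = max(1, int(max_mb * mb // bytes_per_step))

    # Never exceed the hard limit or the length of the time axis
    return min(target_steps, max_steps, n_time)


# A 0.25-degree global grid (721 x 1440) with 14600 time steps
chunk = suggest_time_chunk(n_time=14600, spatial_shape=(721, 1440))
print(chunk)
```

One time step of this grid takes about 4 MB, so roughly a dozen steps fit in the 50 MB target. Raising `target_mb` gives longer time chunks at the cost of more memory per task.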

analyze_chunking_strategy(variable: str | None = None)

Analyze and suggest optimal chunking strategies for different use cases.

This method inspects the dataset and provides recommendations for different types of climate analysis operations.

Parameters:

variable (str, optional) – Variable to analyze. If None, analyzes all variables.

Examples

>>> ds.climate_timeseries.analyze_chunking_strategy()

optimize_for_decomposition(variable: str | None = None)

Optimize chunking specifically for STL time series decomposition.

STL decomposition benefits from larger time chunks and smaller spatial chunks to minimize memory usage while maintaining good performance.

Parameters:

variable (str, optional) – Variable to optimize for. If None, optimizes for all variables.

Returns:

Optimally chunked dataset for STL decomposition.

Return type:

xr.Dataset or None
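The idea of larger time chunks with smaller spatial chunks can be sketched as follows. This is an illustration under stated assumptions, not the method's actual algorithm: the helper `decomposition_chunks`, the 75 MB budget, and the halving rule are hypothetical.

```python
def decomposition_chunks(sizes, time_dim="time", budget_mb=75.0, itemsize=4):
    """Keep the whole time axis in one chunk and shrink spatial chunks to fit.

    Hypothetical helper: `sizes` maps dimension names to lengths.
    """
    chunks = dict(sizes)  # start with one chunk spanning every dimension
    budget = budget_mb * 1024 ** 2

    def chunk_bytes():
        total = itemsize
        for n in chunks.values():
            total *= n
        return total

    while chunk_bytes() > budget:
        # Pick the largest non-time dimension and halve its chunk length
        dim = max(
            (d for d in chunks if d != time_dim),
            key=lambda d: chunks[d],
            default=None,
        )
        if dim is None or chunks[dim] == 1:
            break  # nothing left to shrink
        chunks[dim] = (chunks[dim] + 1) // 2
    return chunks


chunks = decomposition_chunks({"time": 1460, "lat": 180, "lon": 360})
print(chunks)
```

A mapping like this could be passed to `Dataset.chunk()`. The time axis stays contiguous so that each grid point's full series is available to the decomposition without gathering data across chunks.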

plot_time_series(variable='air', latitude=None, longitude=None, level=None, time_range=None, season='annual', year=None, area_weighted=True, figsize=(16, 10), save_plot_path=None, optimize_chunks=True, chunk_target_mb=50, title=None)

Plot a time series of a spatially averaged variable.

This function selects data for a given variable, performs spatial averaging over the specified domain, and plots the resulting time series.

Parameters:

variable (str, optional) – Name of the variable to plot. Defaults to ‘air’.
latitude (float, slice, or list, optional) – Latitude range for spatial averaging.
longitude (float, slice, or list, optional) – Longitude range for spatial averaging.
level (float, slice, or list, optional) – Vertical level selection.
time_range (slice, optional) – Time range for the series.
season (str, optional) – Seasonal filter. Defaults to ‘annual’.
year (int, optional) – Filter for a specific year.
area_weighted (bool, optional) – If True, use latitude-based area weighting for the spatial mean. Defaults to True.
figsize (tuple, optional) – Figure size. Defaults to (16, 10).
save_plot_path (str or None, optional) – If provided, the path to save the plot figure.
optimize_chunks (bool, optional) – Whether to automatically optimize chunking for performance. Defaults to True.
chunk_target_mb (float, optional) – Target chunk size in MB for optimization. Defaults to 50 MB.
title (str, optional) – The title for the plot. If not provided, a descriptive title will be generated automatically.

Returns:

The Axes object of the plot, or None if no data could be plotted.

Return type:

matplotlib.axes.Axes or None
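To make the effect of `area_weighted=True` concrete, the sketch below computes a latitude-weighted mean for one time step in plain Python. It assumes the common cos(latitude) weighting; the reference above only says "latitude-based area weighting", so the exact scheme used by the library is an assumption here, and `area_weighted_mean` is a hypothetical helper.

```python
import math


def area_weighted_mean(values, lats_deg):
    """Mean over a lat x lon grid, weighting each latitude row by cos(latitude).

    `values` is a list of rows (one row per latitude); NaN cells are skipped.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(values, lats_deg):
        w = math.cos(math.radians(lat))  # grid cells shrink toward the poles
        for v in row:
            if math.isnan(v):
                continue
            weighted_sum += w * v
            weight_total += w
    return weighted_sum / weight_total if weight_total else float("nan")


# Two latitude rows (equator and 60N), two longitudes each
field = [[10.0, 10.0], [20.0, 20.0]]
mean = area_weighted_mean(field, [0.0, 60.0])
print(round(mean, 3))
```

The unweighted mean of this field is 15, while the weighted mean is about 13.3 because the 60N row counts half as much as the equatorial row. With xarray, the same weighting is commonly written as `da.weighted(np.cos(np.deg2rad(da.lat))).mean(("lat", "lon"))`, assuming dimensions named `lat` and `lon`.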

plot_std_space(variable='air', latitude=None, longitude=None, level=None, time_range=None, season='annual', year=None, area_weighted=True, figsize=(16, 10), save_plot_path=None, title=None)

Plot a time series of the spatial standard deviation of a variable.

This function calculates the standard deviation across the spatial domain for each time step and plots the resulting time series. This can be used to analyze the spatial variability of a field over time.

Parameters:

variable (str, optional) – Name of the variable to plot. Defaults to ‘air’.
latitude (float, slice, or list, optional) – Latitude range for the calculation.
longitude (float, slice, or list, optional) – Longitude range for the calculation.
level (float, slice, or list, optional) – Vertical level selection.
time_range (slice, optional) – Time range for the series.
season (str, optional) – Seasonal filter. Defaults to ‘annual’.
year (int, optional) – Filter for a specific year.
area_weighted (bool, optional) – If True, use latitude-based area weighting for the standard deviation. Defaults to True.
figsize (tuple, optional) – Figure size. Defaults to (16, 10).
save_plot_path (str or None, optional) – If provided, the path to save the plot figure.
title (str or None, optional) – Custom plot title. A default title is generated if not provided.

Returns:

The Axes object of the plot, or None if no data could be plotted.

Return type:

matplotlib.axes.Axes or None
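The quantity plotted by this method, one spatial standard deviation per time step, can be sketched in plain Python. This is illustrative only: it assumes cos(latitude) weights and a population (not sample) standard deviation, neither of which is confirmed by the reference above, and `weighted_spatial_std` is a hypothetical helper.

```python
import math


def weighted_spatial_std(values, lats_deg):
    """Weighted population standard deviation over a lat x lon grid."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    total_w = sum(w * len(row) for w, row in zip(weights, values))
    # Weighted mean of the field
    mean = sum(w * v for w, row in zip(weights, values) for v in row) / total_w
    # Weighted mean of squared deviations from that mean
    var = sum(
        w * (v - mean) ** 2 for w, row in zip(weights, values) for v in row
    ) / total_w
    return math.sqrt(var)


lats = [0.0, 60.0]
fields = [
    [[1.0, 1.0], [1.0, 1.0]],  # time step 0: spatially uniform field
    [[0.0, 0.0], [3.0, 3.0]],  # time step 1: strong north-south contrast
]
# One standard deviation per time step gives the plotted series
series = [weighted_spatial_std(field, lats) for field in fields]
print(series)
```

A uniform field gives a value near zero, and the value grows as the field becomes more spatially heterogeneous.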

decompose_time_series(variable='air', level=None, latitude=None, longitude=None, time_range=None, season='annual', year=None, stl_seasonal=13, stl_period=12, area_weighted=True, plot_results=True, figsize=(16, 10), save_plot_path=None, optimize_chunks=True, chunk_target_mb=75, title=None)

Decompose a time series into trend, seasonal, and residual components using STL.

Seasonal-Trend decomposition using LOESS (STL) is a robust method for decomposing a time series. This function first creates a spatially-averaged time series and then applies the STL algorithm.

Parameters:

variable (str, optional) – Name of the variable to decompose. Defaults to ‘air’.
level (float, slice, or list, optional) – Vertical level selection.
latitude (float, slice, or list, optional) – Latitude range for spatial averaging.
longitude (float, slice, or list, optional) – Longitude range for spatial averaging.
time_range (slice, optional) – Time range for the series.
season (str, optional) – Seasonal filter. Defaults to ‘annual’.
year (int, optional) – Filter for a specific year.
stl_seasonal (int, optional) – Length of the seasonal smoother for STL. Must be an odd integer. Defaults to 13.
stl_period (int, optional) – The period of the seasonal component. For monthly data, this is typically 12. Defaults to 12.
area_weighted (bool, optional) – If True, use area weighting for the spatial mean. Defaults to True.
plot_results (bool, optional) – If True, plot the original series and its decomposed components. Defaults to True.
figsize (tuple, optional) – Figure size for the plot. Defaults to (16, 10).
save_plot_path (str or None, optional) – Path to save the decomposition plot.
optimize_chunks (bool, optional) – Whether to automatically optimize chunking for STL performance. Defaults to True.
chunk_target_mb (float, optional) – Target chunk size in MB for optimization. Defaults to 75 MB.
title (str, optional) – The title for the plot. If not provided, a descriptive title will be generated automatically.

Returns:

If plot_results is False, returns a dictionary containing the ‘original’, ‘trend’, ‘seasonal’, and ‘residual’ components as pandas Series. If plot_results is True, returns a tuple of (dictionary, figure object). In error cases: returns None if plot_results is False, or (None, None) if plot_results is True.

Return type:

dict or (dict, matplotlib.figure.Figure) or None or (None, None)
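Because the return shape depends on `plot_results`, calling code may find it convenient to normalize the result before use. The helper below is a hypothetical convenience wrapper, not part of the library. It relies only on the return shapes documented above.

```python
def split_decomposition_result(result, plot_results=True):
    """Normalize the documented return shapes to (components, figure).

    Hypothetical helper. `components` is the dictionary of series, or None on
    error; `figure` is None whenever no plot was requested or produced.
    """
    if plot_results:
        # Documented shape: (dict, Figure), or (None, None) in error cases
        components, figure = result if result is not None else (None, None)
    else:
        # Documented shape: dict, or None in error cases
        components, figure = result, None
    return components, figure


# Stand-in values in place of a real decomposition result
ok = split_decomposition_result(({"trend": [1.0, 2.0]}, "figure"), plot_results=True)
failed = split_decomposition_result(None, plot_results=False)
print(ok, failed)
```

After normalizing, a single `if components is None:` check covers both error cases before accessing `components['trend']` or the other keys.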

Available Methods

Time Series Plotting

TimeSeriesAccessor.plot_time_series(variable='air', latitude=None, longitude=None, level=None, time_range=None, season='annual', year=None, area_weighted=True, figsize=(16, 10), save_plot_path=None, optimize_chunks=True, chunk_target_mb=50, title=None)

Plot a time series of a spatially averaged variable.

Statistical Analysis

TimeSeriesAccessor.plot_std_space(variable='air', latitude=None, longitude=None, level=None, time_range=None, season='annual', year=None, area_weighted=True, figsize=(16, 10), save_plot_path=None, title=None)

Plot a time series of the spatial standard deviation of a variable.

Decomposition Methods

TimeSeriesAccessor.decompose_time_series(variable='air', level=None, latitude=None, longitude=None, time_range=None, season='annual', year=None, stl_seasonal=13, stl_period=12, area_weighted=True, plot_results=True, figsize=(16, 10), save_plot_path=None, optimize_chunks=True, chunk_target_mb=75, title=None)

Decompose a time series into trend, seasonal, and residual components using STL.

Chunking and Optimization

TimeSeriesAccessor.optimize_chunks(target_mb: float = 50, max_mb: float = 200, variable: str | None = None, time_freq: str | None = None, inplace: bool = False)

Optimize dataset chunking for better time series analysis performance.

TimeSeriesAccessor.optimize_chunks_advanced(operation_type: str = 'timeseries', memory_limit_gb: float | None = None, performance_priority: str = 'balanced', variable: str | None = None, use_disk_chunks: bool = True, inplace: bool = False)

Advanced chunking optimization using sophisticated strategies.

TimeSeriesAccessor.print_chunking_info(detailed: bool = False)

Print information about current dataset chunking.

TimeSeriesAccessor.analyze_chunking_strategy(variable: str | None = None)

Analyze and suggest optimal chunking strategies for different use cases.

TimeSeriesAccessor.optimize_for_decomposition(variable: str | None = None)

Optimize chunking specifically for STL time series decomposition.

Basic Examples

Comprehensive Analysis Workflow

This example demonstrates a complete workflow, from optimizing data chunks to decomposition and visualization.

import xarray as xr
import matplotlib.pyplot as plt
import climate_diagnostics

# Load a sample dataset
ds = xr.tutorial.load_dataset("air_temperature")

# 1. Optimize chunking for decomposition analysis
#    (optimize_for_decomposition accepts only the `variable` argument)
optimized_ds = ds.climate_timeseries.optimize_for_decomposition(
    variable="air"
)

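# 2. Decompose the time series for a specific region.
#    plot_results=False returns only the components dictionary;
#    the default (True) returns a (dictionary, figure) tuple instead.
decomposition = optimized_ds.climate_timeseries.decompose_time_series(
    variable="air",
    latitude=slice(30, 40),
    longitude=slice(-100, -90),
    plot_results=False
)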

# 3. Plot the original and decomposed time series components
fig, ax = plt.subplots(figsize=(12, 8))
decomposition['original'].plot(ax=ax, label="Original")
decomposition['trend'].plot(ax=ax, label="Trend")
decomposition['seasonal'].plot(ax=ax, label="Seasonal")
ax.legend()
ax.set_title("Time Series Decomposition")
plt.show()

# 4. Analyze spatial standard deviation of the original data
ax_std = ds.climate_timeseries.plot_std_space(
    variable="air",
    title="Spatial Standard Deviation of Air Temperature"
)
plt.show()

Performance Optimization

Chunking for Large Datasets

# Basic chunking optimization
ds_optimized = ds.climate_timeseries.optimize_chunks(
    target_mb=100,
    variable="air"
)

# Advanced chunking with custom strategies
ds_advanced = ds.climate_timeseries.optimize_chunks_advanced(
    operation_type='timeseries',
    performance_priority='memory',
    variable="air"
)

Chunking Analysis and Diagnostics

# Print current chunking information
ds.climate_timeseries.print_chunking_info(detailed=True)

# Analyze chunking strategies
ds.climate_timeseries.analyze_chunking_strategy(variable="air")

# Optimize specifically for decomposition
ds_decomp = ds.climate_timeseries.optimize_for_decomposition(
    variable="air"
)

Memory-Efficient Workflows

# Complete workflow with optimization
import xarray as xr
import climate_diagnostics

# Load large dataset
ds = xr.open_dataset("large_climate_data.nc")

# Optimize chunking for time series analysis
ds_opt = ds.climate_timeseries.optimize_chunks_advanced(
    operation_type='timeseries',
    performance_priority='balanced',
    memory_limit_gb=8.0
)

# Perform analysis on optimized dataset
ax = ds_opt.climate_timeseries.plot_time_series(
    variable="temperature",
    latitude=slice(60, 90)
)

# Decompose with optimized chunking. With the default plot_results=True,
# the call returns a (components, figure) tuple.
decomp, decomp_fig = ds_opt.climate_timeseries.decompose_time_series(
    variable="temperature",
    optimize_chunks=True
)

Working with Regional Data

# Calculate regional mean using utilities
from climate_diagnostics.utils import get_spatial_mean

# Select region
arctic_data = ds.sel(latitude=slice(60, 90))

# Get mean time series
arctic_ts = get_spatial_mean(arctic_data.air, area_weighted=True)

# Plot using matplotlib
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
arctic_ts.plot()
plt.title("Arctic Temperature")
plt.show()
