Processing

scmdata has some support for processing ScmRun instances to calculate statistics of interest. Here we provide examples of how to use them.

At present, we can calculate:

  • crossing times (e.g. 1.5C crossing times)

  • crossing time quantiles

  • exceedance probabilities

  • peak

  • peak year

  • categorisation in line with SR1.5

  • a set of summary variables

Load some data

For this demonstration, we are going to use MAGICC output from [RCMIP Phase 2] as available at https://zenodo.org/record/4624566/files/data-processed-submission-database-hadcrut5-target-MAGICCv7.5.1.tar.gz?download=1. Here we have just extracted the air temperature output for the SSPs from 1995 to 2100.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

import scmdata.processing
from scmdata import ScmRun
/tmp/ipykernel_836/579853008.py:3: DeprecationWarning: 
Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/database/_database.py:9: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  import tqdm.autonotebook as tqdman
magicc_output = ScmRun("magicc-rcmip-phase-2-gsat-output.csv")
magicc_output
<ScmRun (timeseries: 4800, timepoints: 106)>
Time:
	Start: 1995-01-01T00:00:00
	End: 2100-01-01T00:00:00
Meta:
	     climate_model  ensemble_member        model region scenario unit  \
	0     MAGICCv7.5.1                0  unspecified  World   ssp585    K   
	1     MAGICCv7.5.1                1  unspecified  World   ssp585    K   
	2     MAGICCv7.5.1                2  unspecified  World   ssp585    K   
	3     MAGICCv7.5.1                3  unspecified  World   ssp585    K   
	4     MAGICCv7.5.1                4  unspecified  World   ssp585    K   
	...            ...              ...          ...    ...      ...  ...   
	4795  MAGICCv7.5.1              595  unspecified  World   ssp119    K   
	4796  MAGICCv7.5.1              596  unspecified  World   ssp119    K   
	4797  MAGICCv7.5.1              597  unspecified  World   ssp119    K   
	4798  MAGICCv7.5.1              598  unspecified  World   ssp119    K   
	4799  MAGICCv7.5.1              599  unspecified  World   ssp119    K   
	
	                            variable  
	0     Surface Air Temperature Change  
	1     Surface Air Temperature Change  
	2     Surface Air Temperature Change  
	3     Surface Air Temperature Change  
	4     Surface Air Temperature Change  
	...                              ...  
	4795  Surface Air Temperature Change  
	4796  Surface Air Temperature Change  
	4797  Surface Air Temperature Change  
	4798  Surface Air Temperature Change  
	4799  Surface Air Temperature Change  
	
	[4800 rows x 7 columns]

Crossing times

The first thing we do is show how to calculate the crossing times of a given threshold.

crossing_time_15 = scmdata.processing.calculate_crossing_times(
    magicc_output,
    threshold=1.5,
)
crossing_time_15
climate_model  ensemble_member  model        region  scenario  unit  variable                      
MAGICCv7.5.1   0                unspecified  World   ssp585    K     Surface Air Temperature Change    2025.0
               1                unspecified  World   ssp585    K     Surface Air Temperature Change    2029.0
               2                unspecified  World   ssp585    K     Surface Air Temperature Change    2024.0
               3                unspecified  World   ssp585    K     Surface Air Temperature Change    2023.0
               4                unspecified  World   ssp585    K     Surface Air Temperature Change    2023.0
                                                                                                        ...  
               595              unspecified  World   ssp119    K     Surface Air Temperature Change    2023.0
               596              unspecified  World   ssp119    K     Surface Air Temperature Change       NaN
               597              unspecified  World   ssp119    K     Surface Air Temperature Change    2026.0
               598              unspecified  World   ssp119    K     Surface Air Temperature Change    2019.0
               599              unspecified  World   ssp119    K     Surface Air Temperature Change    2024.0
Length: 4800, dtype: float64

The output is a pd.Series, which is useful for many other pieces of work.

For example, we could make a plot with e.g. seaborn.

label = "1.5C crossing time"
pdf = crossing_time_15.reset_index().rename({0: label}, axis="columns")
sns.histplot(data=pdf, x=label, hue="scenario")
<Axes: xlabel='1.5C crossing time', ylabel='Count'>
../_images/49c85ff137e7e956ba965ae670cd6265e2cdcb354cb3827ba2c1e4742c07fcb5.png
label = "2.0C crossing time"
crossing_time_20 = scmdata.processing.calculate_crossing_times(
    magicc_output,
    threshold=2.0,
)

pdf = crossing_time_20.reset_index().rename({0: label}, axis="columns")
sns.histplot(data=pdf, x=label, hue="scenario")
<Axes: xlabel='2.0C crossing time', ylabel='Count'>
../_images/c1dd0ec4773cf080255b2101980e657c8733efd9af878690ed890f2853a16e5e.png

Crossing time quantiles

Calculating the quantiles of crossing times is a bit fiddly because some ensemble members will not cross the threshold at all. We show how quantiles can be calculated sensibly below. The calculation will return nan if that quantile in the crossing times corresponds to an ensemble member which never crosses the threshold.

scmdata.processing.calculate_crossing_times_quantiles(
    crossing_time_15,
    groupby=["climate_model", "model", "scenario"],
    quantiles=(0.05, 0.17, 0.5, 0.83, 0.95),
)
climate_model  model        scenario     quantile
MAGICCv7.5.1   unspecified  ssp119       0.05        2021.00
                                         0.17        2023.00
                                         0.50        2027.00
                                         0.83            NaN
                                         0.95            NaN
                            ssp126       0.05        2021.00
                                         0.17        2023.00
                                         0.50        2028.00
                                         0.83        2038.17
                                         0.95            NaN
                            ssp245       0.05        2021.00
                                         0.17        2023.00
                                         0.50        2028.00
                                         0.83        2034.00
                                         0.95        2039.00
                            ssp370       0.05        2021.00
                                         0.17        2024.00
                                         0.50        2028.00
                                         0.83        2032.00
                                         0.95        2036.00
                            ssp434       0.05        2021.00
                                         0.17        2023.00
                                         0.50        2027.00
                                         0.83        2034.00
                                         0.95        2041.00
                            ssp460       0.05        2021.00
                                         0.17        2023.00
                                         0.50        2027.00
                                         0.83        2032.00
                                         0.95        2036.00
                            ssp534-over  0.05        2020.00
                                         0.17        2022.00
                                         0.50        2025.00
                                         0.83        2030.00
                                         0.95        2033.05
                            ssp585       0.05        2020.00
                                         0.17        2022.00
                                         0.50        2025.00
                                         0.83        2030.00
                                         0.95        2034.00
dtype: float64

In the above, we can see that the 83rd percentile of crossing times is in fact to not cross at all under the SSP1-1.9 scenario.

Datetime output

If desired, data could be interpolated first before calculating the crossing times. In such cases, returning the output as datetime rather than year might be helpful.

scmdata.processing.calculate_crossing_times(
    magicc_output.resample("MS"),
    threshold=2.0,
    return_year=False,
)
climate_model  ensemble_member  model        region  scenario  unit  variable                      
MAGICCv7.5.1   0                unspecified  World   ssp585    K     Surface Air Temperature Change   2042-02-01
               1                unspecified  World   ssp585    K     Surface Air Temperature Change   2041-10-01
               2                unspecified  World   ssp585    K     Surface Air Temperature Change   2035-11-01
               3                unspecified  World   ssp585    K     Surface Air Temperature Change   2035-09-01
               4                unspecified  World   ssp585    K     Surface Air Temperature Change   2036-01-01
                                                                                                         ...    
               595              unspecified  World   ssp119    K     Surface Air Temperature Change          NaT
               596              unspecified  World   ssp119    K     Surface Air Temperature Change          NaT
               597              unspecified  World   ssp119    K     Surface Air Temperature Change          NaT
               598              unspecified  World   ssp119    K     Surface Air Temperature Change   2027-07-01
               599              unspecified  World   ssp119    K     Surface Air Temperature Change          NaT
Length: 4800, dtype: datetime64[ns]

Exceedance probabilities

Next we show how to calculate exceedance probabilities.

exceedance_probability_2C = scmdata.processing.calculate_exceedance_probabilities(
    magicc_output,
    process_over_cols=["ensemble_member", "variable"],
    threshold=2.0,
)
exceedance_probability_2C
climate_model  model        region  scenario     unit         
MAGICCv7.5.1   unspecified  World   ssp119       dimensionless    0.091667
                                    ssp126       dimensionless    0.350000
                                    ssp245       dimensionless    0.995000
                                    ssp370       dimensionless    1.000000
                                    ssp434       dimensionless    0.868333
                                    ssp460       dimensionless    1.000000
                                    ssp534-over  dimensionless    0.983333
                                    ssp585       dimensionless    1.000000
Name: 2.0 exceedance probability, dtype: float64

We can make a plot to compare exceedance probabilities over multiple scenarios.

exceedance_probability_15C = scmdata.processing.calculate_exceedance_probabilities(
    magicc_output,
    process_over_cols=["ensemble_member", "variable"],
    threshold=1.5,
)

pdf = (
    pd.DataFrame(
        [
            exceedance_probability_2C,
            exceedance_probability_15C,
        ]
    )
    .T.melt(ignore_index=False, value_name="Exceedance probability")
    .reset_index()
)
display(pdf)  # noqa

ax = sns.barplot(data=pdf, x="scenario", y="Exceedance probability", hue="variable")
ax.tick_params(labelrotation=30)
ax.set_ylim([0, 1])
ax.legend(loc="center left", bbox_to_anchor=(1.01, 0.5), title="threshold")
plt.tight_layout()
climate_model model region scenario unit variable Exceedance probability
0 MAGICCv7.5.1 unspecified World ssp119 dimensionless 2.0 exceedance probability 0.091667
1 MAGICCv7.5.1 unspecified World ssp126 dimensionless 2.0 exceedance probability 0.350000
2 MAGICCv7.5.1 unspecified World ssp245 dimensionless 2.0 exceedance probability 0.995000
3 MAGICCv7.5.1 unspecified World ssp370 dimensionless 2.0 exceedance probability 1.000000
4 MAGICCv7.5.1 unspecified World ssp434 dimensionless 2.0 exceedance probability 0.868333
5 MAGICCv7.5.1 unspecified World ssp460 dimensionless 2.0 exceedance probability 1.000000
6 MAGICCv7.5.1 unspecified World ssp534-over dimensionless 2.0 exceedance probability 0.983333
7 MAGICCv7.5.1 unspecified World ssp585 dimensionless 2.0 exceedance probability 1.000000
8 MAGICCv7.5.1 unspecified World ssp119 dimensionless 1.5 exceedance probability 0.743333
9 MAGICCv7.5.1 unspecified World ssp126 dimensionless 1.5 exceedance probability 0.940000
10 MAGICCv7.5.1 unspecified World ssp245 dimensionless 1.5 exceedance probability 1.000000
11 MAGICCv7.5.1 unspecified World ssp370 dimensionless 1.5 exceedance probability 1.000000
12 MAGICCv7.5.1 unspecified World ssp434 dimensionless 1.5 exceedance probability 1.000000
13 MAGICCv7.5.1 unspecified World ssp460 dimensionless 1.5 exceedance probability 1.000000
14 MAGICCv7.5.1 unspecified World ssp534-over dimensionless 1.5 exceedance probability 1.000000
15 MAGICCv7.5.1 unspecified World ssp585 dimensionless 1.5 exceedance probability 1.000000
../_images/c544ffc462b6e8d9361946a4d5ec4ede62795fae5c82c0b379b94f1817e96a89.png

Exceedance probabilities over time

It is also possible to calculate exceedance probabilities over time.

res = scmdata.processing.calculate_exceedance_probabilities_over_time(
    magicc_output,
    process_over_cols="ensemble_member",
    threshold=1.5,
)
res = scmdata.ScmRun(res)
res.lineplot(style="variable")
<Axes: xlabel='time', ylabel='dimensionless'>
../_images/f9edfeebabde25187d3e2ab1a242345bdc71e84ed5ff445b1af2f88afff736c6.png

Note that taking the maximum exceedance probability over all time will be less than or equal to the exceedance probability calculated with calculate_exceedance_probabilities because the order of operations matters: calculating whether each ensemble member exceeds the threshold or not then seeing how many ensemble members out of the total exceed the threshold is not the same as seeing how many ensemble members exceed the threshold at each timestep and then taking the maximum over all timesteps. In general, taking the maximum value from calculate_exceedance_probabilities_over_time will be less than or equal to the results of calculate_exceedance_probabilities, as demonstrated below.

comparison = (
    pd.DataFrame(
        {
            "calculate_exceedance_probabilities": exceedance_probability_15C,
            "max of calculate_exceedance_probabilities_over_time": (
                res.timeseries(meta=exceedance_probability_15C.index.names).max(axis=1)
            ),
        }
    )
    * 100
)
comparison.round(1)
calculate_exceedance_probabilities max of calculate_exceedance_probabilities_over_time
climate_model model region scenario unit
MAGICCv7.5.1 unspecified World ssp119 dimensionless 74.3 73.8
ssp126 dimensionless 94.0 93.7
ssp245 dimensionless 100.0 100.0
ssp370 dimensionless 100.0 100.0
ssp434 dimensionless 100.0 100.0
ssp460 dimensionless 100.0 100.0
ssp534-over dimensionless 100.0 100.0
ssp585 dimensionless 100.0 100.0

Peak

We can calculate the peaks in each timeseries.

peak_warming = scmdata.processing.calculate_peak(magicc_output)
peak_warming
climate_model  ensemble_member  model        region  scenario  unit  variable                           
MAGICCv7.5.1   0                unspecified  World   ssp585    K     Peak Surface Air Temperature Change    4.448044
               1                unspecified  World   ssp585    K     Peak Surface Air Temperature Change    5.372057
               2                unspecified  World   ssp585    K     Peak Surface Air Temperature Change    5.942238
               3                unspecified  World   ssp585    K     Peak Surface Air Temperature Change    5.220904
               4                unspecified  World   ssp585    K     Peak Surface Air Temperature Change    5.637467
                                                                                                              ...   
               595              unspecified  World   ssp119    K     Peak Surface Air Temperature Change    1.809161
               596              unspecified  World   ssp119    K     Peak Surface Air Temperature Change    1.288471
               597              unspecified  World   ssp119    K     Peak Surface Air Temperature Change    1.669414
               598              unspecified  World   ssp119    K     Peak Surface Air Temperature Change    2.459078
               599              unspecified  World   ssp119    K     Peak Surface Air Temperature Change    1.746321
Name: 0, Length: 4800, dtype: float64

From this we can then calculate median peak warming by scenario (or other quantiles).

peak_warming.groupby(["model", "scenario"]).median()
model        scenario   
unspecified  ssp119         1.653154
             ssp126         1.878890
             ssp245         2.838238
             ssp370         4.372581
             ssp434         2.371833
             ssp460         3.485626
             ssp534-over    2.613975
             ssp585         5.278692
Name: 0, dtype: float64

Or make a plot.

label = "Peak warming"
pdf = peak_warming.reset_index().rename({0: label}, axis="columns")
sns.histplot(data=pdf, x=label, hue="scenario")
<Axes: xlabel='Peak warming', ylabel='Count'>
../_images/27a040c178d5e5f11355cbe10e8b5413837e2a35523e220060648c82e4b32567.png

Similarly to exceedance probabilties, the order of operations matters: calculating the median of the peaks is different to calculating the median then taking the peak of this median timeseries. In general, the max of the median timeseries is less than or equal to the median of the peak in each timeseries.

comparison = pd.DataFrame(
    {
        "median of peak warming": peak_warming.groupby(["model", "scenario"]).median(),
        "max of median timeseries": (
            magicc_output.process_over(
                list(set(magicc_output.meta.columns) - {"model", "scenario"}),
                "median",
            ).max(axis=1)
        ),
    }
)
comparison.round(3)
median of peak warming max of median timeseries
model scenario
unspecified ssp119 1.653 1.645
ssp126 1.879 1.874
ssp245 2.838 2.838
ssp370 4.373 4.373
ssp434 2.372 2.368
ssp460 3.486 3.486
ssp534-over 2.614 2.613
ssp585 5.279 5.279

Peak time

We can calculate the peak time in each timeseries.

peak_warming_year = scmdata.processing.calculate_peak_time(magicc_output)
peak_warming_year
climate_model  ensemble_member  model        region  scenario  unit  variable                                   
MAGICCv7.5.1   0                unspecified  World   ssp585    K     Year of peak Surface Air Temperature Change    2100
               1                unspecified  World   ssp585    K     Year of peak Surface Air Temperature Change    2100
               2                unspecified  World   ssp585    K     Year of peak Surface Air Temperature Change    2100
               3                unspecified  World   ssp585    K     Year of peak Surface Air Temperature Change    2100
               4                unspecified  World   ssp585    K     Year of peak Surface Air Temperature Change    2100
                                                                                                                    ... 
               595              unspecified  World   ssp119    K     Year of peak Surface Air Temperature Change    2048
               596              unspecified  World   ssp119    K     Year of peak Surface Air Temperature Change    2036
               597              unspecified  World   ssp119    K     Year of peak Surface Air Temperature Change    2039
               598              unspecified  World   ssp119    K     Year of peak Surface Air Temperature Change    2059
               599              unspecified  World   ssp119    K     Year of peak Surface Air Temperature Change    2048
Name: 0, Length: 4800, dtype: int64

From this we can then calculate median peak warming time by scenario (or other quantiles).

peak_warming_year.groupby(["model", "scenario"]).median()
model        scenario   
unspecified  ssp119         2048.0
             ssp126         2069.0
             ssp245         2100.0
             ssp370         2100.0
             ssp434         2094.0
             ssp460         2100.0
             ssp534-over    2060.0
             ssp585         2100.0
Name: 0, dtype: float64
label = "Peak warming year"
pdf = peak_warming_year.reset_index().rename({0: label}, axis="columns")
ax = sns.histplot(data=pdf, x=label, hue="scenario", bins=np.arange(2025, 2100 + 1))
../_images/ab466cc4defaadfed4f95657199755b3ff0bcc8a704fa818ff75fc8ea58a75f8.png

SR1.5 categorisation

It is also possible to categorise the scenarios using the same categorisation as in SR1.5. To do this we have to first calculate the appropriate quantiles.

magicc_output_categorisation_quantiles = scmdata.ScmRun(
    magicc_output.quantiles_over("ensemble_member", quantiles=[0.33, 0.5, 0.66])
)
magicc_output_categorisation_quantiles
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
<ScmRun (timeseries: 24, timepoints: 106)>
Time:
	Start: 1995-01-01T00:00:00
	End: 2100-01-01T00:00:00
Meta:
	   climate_model        model  quantile region     scenario unit  \
	0   MAGICCv7.5.1  unspecified      0.33  World       ssp119    K   
	1   MAGICCv7.5.1  unspecified      0.33  World       ssp126    K   
	2   MAGICCv7.5.1  unspecified      0.33  World       ssp245    K   
	3   MAGICCv7.5.1  unspecified      0.33  World       ssp370    K   
	4   MAGICCv7.5.1  unspecified      0.33  World       ssp434    K   
	5   MAGICCv7.5.1  unspecified      0.33  World       ssp460    K   
	6   MAGICCv7.5.1  unspecified      0.33  World  ssp534-over    K   
	7   MAGICCv7.5.1  unspecified      0.33  World       ssp585    K   
	8   MAGICCv7.5.1  unspecified      0.50  World       ssp119    K   
	9   MAGICCv7.5.1  unspecified      0.50  World       ssp126    K   
	10  MAGICCv7.5.1  unspecified      0.50  World       ssp245    K   
	11  MAGICCv7.5.1  unspecified      0.50  World       ssp370    K   
	12  MAGICCv7.5.1  unspecified      0.50  World       ssp434    K   
	13  MAGICCv7.5.1  unspecified      0.50  World       ssp460    K   
	14  MAGICCv7.5.1  unspecified      0.50  World  ssp534-over    K   
	15  MAGICCv7.5.1  unspecified      0.50  World       ssp585    K   
	16  MAGICCv7.5.1  unspecified      0.66  World       ssp119    K   
	17  MAGICCv7.5.1  unspecified      0.66  World       ssp126    K   
	18  MAGICCv7.5.1  unspecified      0.66  World       ssp245    K   
	19  MAGICCv7.5.1  unspecified      0.66  World       ssp370    K   
	20  MAGICCv7.5.1  unspecified      0.66  World       ssp434    K   
	21  MAGICCv7.5.1  unspecified      0.66  World       ssp460    K   
	22  MAGICCv7.5.1  unspecified      0.66  World  ssp534-over    K   
	23  MAGICCv7.5.1  unspecified      0.66  World       ssp585    K   
	
	                          variable  
	0   Surface Air Temperature Change  
	1   Surface Air Temperature Change  
	2   Surface Air Temperature Change  
	3   Surface Air Temperature Change  
	4   Surface Air Temperature Change  
	5   Surface Air Temperature Change  
	6   Surface Air Temperature Change  
	7   Surface Air Temperature Change  
	8   Surface Air Temperature Change  
	9   Surface Air Temperature Change  
	10  Surface Air Temperature Change  
	11  Surface Air Temperature Change  
	12  Surface Air Temperature Change  
	13  Surface Air Temperature Change  
	14  Surface Air Temperature Change  
	15  Surface Air Temperature Change  
	16  Surface Air Temperature Change  
	17  Surface Air Temperature Change  
	18  Surface Air Temperature Change  
	19  Surface Air Temperature Change  
	20  Surface Air Temperature Change  
	21  Surface Air Temperature Change  
	22  Surface Air Temperature Change  
	23  Surface Air Temperature Change  
scmdata.processing.categorisation_sr15(
    magicc_output_categorisation_quantiles, ["climate_model", "scenario"]
)
climate_model  scenario   
MAGICCv7.5.1   ssp119         1.5C high overshoot
               ssp126                   Higher 2C
               ssp245                    Above 2C
               ssp370                    Above 2C
               ssp434                    Above 2C
               ssp460                    Above 2C
               ssp534-over               Above 2C
               ssp585                    Above 2C
Name: category, dtype: object

Set of summary variables

It is also possible to calculate a set of summary variables using the convenience function calculate_summary_stats. The documentation is given below.

print(scmdata.processing.calculate_summary_stats.__doc__)
    Calculate common summary statistics

    Parameters
    ----------
    scmrun : :class:`scmdata.ScmRun`
        Data of which to calculate the stats

    index : list[str]
        Columns to use in the index of the output (unit is added if not
        included)

    exceedance_probabilities_thresholds : list[float]
        Thresholds to use for exceedance probabilities

    exceedance_probabilities_variable : str
        Variable to use for exceedance probability calculations

    exceedance_probabilities_naming_base : str
        String to use as the base for naming the exceedance probabilities. Each
        exceedance probability output column will have a name given by
        ``exceedance_probabilities_naming_base.format(threshold)`` where
        threshold is the exceedance probability threshold to use. If not
        supplied, the default output of
        :func:`scmdata.processing.calculate_exceedance_probabilities` will be
        used.

    peak_quantiles : list[float]
        Quantiles to report in peak calculations

    peak_variable : str
        Variable of which to calculate the peak

    peak_naming_base : str
        Base to use for naming the peak outputs. This is combined with the
        quantile. If not supplied, ``"{} peak"`` is used so the outputs will be
        named e.g. "0.05 peak", "0.5 peak", "0.95 peak".

    peak_time_naming_base : str
        Base to use for naming the peak time outputs. This is combined with the
        quantile. If not supplied, ``"{} peak year"`` is used (unless
        ``peak_return_year`` is ``False`` in which case ``"{} peak time"`` is
        used) so the outputs will be named e.g. "0.05 peak year", "0.5 peak
        year", "0.95 peak year".

    peak_return_year : bool
        If ``True``, return the year of the peak of ``peak_variable``,
        otherwise return full dates

    categorisation_variable : str
        Variable to use for categorisation. Note that this variable point to
        timeseries that contain global-mean surface air temperatures  (GSAT)
        relative to 1850-1900 (using another reference period will not break
        this function, but is inconsistent with the original algorithm).

    categorisation_quantile_cols : list[str]
        Columns which represent individual ensemble members in the output (e.g.
        ["ensemble_member"]). The quantiles are taking over these columns
        before the data is passed to
        :func:`scmdata.processing.categorisation_sr15`.

    progress : bool
        Should a progress bar be shown whilst the calculations are done?

    Returns
    -------
    :class:`pd.DataFrame`
        Summary statistics, with each column being a statistic and the index
        being given by ``index``
    

It can be used to calculate summary statistics as shown below.

summary_stats = scmdata.processing.calculate_summary_stats(
    magicc_output,
    ["climate_model", "model", "scenario", "region"],
    exceedance_probabilities_thresholds=np.arange(1, 2.51, 0.1),
    exceedance_probabilities_variable="Surface Air Temperature Change",
    exceedance_probabilities_naming_base="Exceedance Probability|{:.2f}K",
    peak_quantiles=[0.05, 0.5, 0.95],
    peak_variable="Surface Air Temperature Change",
    peak_naming_base="Peak Surface Air Temperature Change|{}th quantile",
    peak_time_naming_base="Year of peak Surface Air Temperature Change|{}th quantile",
    peak_return_year=True,
    progress=True,
)
summary_stats
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
/home/docs/checkouts/readthedocs.org/user_builds/scmdata/checkouts/stable/src/scmdata/run.py:197: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df.reset_index(inplace=True)
  0%|          | 0/23 [00:00<?, ?it/s]
 22%|██▏       | 5/23 [00:00<00:00, 49.64it/s]
 48%|████▊     | 11/23 [00:00<00:00, 50.38it/s]
 74%|███████▍  | 17/23 [00:00<00:00, 53.27it/s]
100%|██████████| 23/23 [00:00<00:00, 54.97it/s]
100%|██████████| 23/23 [00:00<00:00, 53.52it/s]

climate_model  model        scenario     region  unit           statistic                   
MAGICCv7.5.1   unspecified  ssp119       World   dimensionless  Exceedance Probability|1.00K         1.0
                                                                Exceedance Probability|1.10K         1.0
                                                                Exceedance Probability|1.20K         1.0
                                                                Exceedance Probability|1.30K    0.976667
                                                                Exceedance Probability|1.40K    0.888333
                                                                                                  ...   
                            ssp370       World                  SR1.5 category                  Above 2C
                            ssp434       World                  SR1.5 category                  Above 2C
                            ssp460       World                  SR1.5 category                  Above 2C
                            ssp534-over  World                  SR1.5 category                  Above 2C
                            ssp585       World                  SR1.5 category                  Above 2C
Name: value, Length: 184, dtype: object

We can then use pandas to create summary tables of interest.

summary_stats.unstack(["climate_model", "statistic", "unit"])
climate_model MAGICCv7.5.1
statistic Exceedance Probability|1.00K Exceedance Probability|1.10K Exceedance Probability|1.20K Exceedance Probability|1.30K Exceedance Probability|1.40K Exceedance Probability|1.50K Exceedance Probability|1.60K Exceedance Probability|1.70K Exceedance Probability|1.80K Exceedance Probability|1.90K ... Exceedance Probability|2.30K Exceedance Probability|2.40K Exceedance Probability|2.50K Peak Surface Air Temperature Change|0.05th quantile Peak Surface Air Temperature Change|0.5th quantile Peak Surface Air Temperature Change|0.95th quantile Year of peak Surface Air Temperature Change|0.05th quantile Year of peak Surface Air Temperature Change|0.5th quantile Year of peak Surface Air Temperature Change|0.95th quantile SR1.5 category
unit dimensionless dimensionless dimensionless dimensionless dimensionless dimensionless dimensionless dimensionless dimensionless dimensionless ... dimensionless dimensionless dimensionless K K K K K K
model scenario region
unspecified ssp119 World 1.0 1.0 1.0 0.976667 0.888333 0.743333 0.608333 0.418333 0.286667 0.156667 ... 0.008333 0.003333 0.0 1.345672 1.653154 2.071348 2037.0 2048.0 2050.0 1.5C high overshoot
ssp126 World 1.0 1.0 1.0 1.0 0.988333 0.94 0.836667 0.735 0.618333 0.465 ... 0.096667 0.056667 0.025 1.478604 1.87889 2.425129 2059.0 2069.0 2081.0 Higher 2C
ssp245 World 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 0.911667 0.846667 0.783333 2.218657 2.838238 3.642782 2100.0 2100.0 2100.0 Above 2C
ssp370 World 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 3.476406 4.372581 5.35643 2100.0 2100.0 2100.0 Above 2C
ssp434 World 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.996667 0.978333 0.946667 ... 0.58 0.463333 0.356667 1.898714 2.371833 2.958278 2092.0 2094.0 2100.0 Above 2C
ssp460 World 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 0.995 2.753816 3.485626 4.373157 2100.0 2100.0 2100.0 Above 2C
ssp534-over World 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.996667 ... 0.811667 0.725 0.63 2.100912 2.613975 3.299928 2058.0 2060.0 2069.0 Above 2C
ssp585 World 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 4.094323 5.278692 6.75008 2100.0 2100.0 2100.0 Above 2C

8 rows × 23 columns

index = ["climate_model", "scenario"]
pivot_merge_unit = summary_stats.to_frame().reset_index()
pivot_merge_unit["statistic"] = pivot_merge_unit["statistic"] + pivot_merge_unit[
    "unit"
].apply(lambda x: f"({x})" if x else "")
pivot_merge_unit = pivot_merge_unit.drop("unit", axis="columns")
pivot_merge_unit = pivot_merge_unit.set_index(
    list(set(pivot_merge_unit.columns) - {"value"})
).unstack("statistic")
pivot_merge_unit
value
statistic Exceedance Probability|1.00K(dimensionless) Exceedance Probability|1.10K(dimensionless) Exceedance Probability|1.20K(dimensionless) Exceedance Probability|1.30K(dimensionless) Exceedance Probability|1.40K(dimensionless) Exceedance Probability|1.50K(dimensionless) Exceedance Probability|1.60K(dimensionless) Exceedance Probability|1.70K(dimensionless) Exceedance Probability|1.80K(dimensionless) Exceedance Probability|1.90K(dimensionless) ... Exceedance Probability|2.30K(dimensionless) Exceedance Probability|2.40K(dimensionless) Exceedance Probability|2.50K(dimensionless) Peak Surface Air Temperature Change|0.05th quantile(K) Peak Surface Air Temperature Change|0.5th quantile(K) Peak Surface Air Temperature Change|0.95th quantile(K) SR1.5 category Year of peak Surface Air Temperature Change|0.05th quantile(K) Year of peak Surface Air Temperature Change|0.5th quantile(K) Year of peak Surface Air Temperature Change|0.95th quantile(K)
region scenario model climate_model
World ssp119 unspecified MAGICCv7.5.1 1.0 1.0 1.0 0.976667 0.888333 0.743333 0.608333 0.418333 0.286667 0.156667 ... 0.008333 0.003333 0.0 1.345672 1.653154 2.071348 1.5C high overshoot 2037.0 2048.0 2050.0
ssp126 unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 0.988333 0.94 0.836667 0.735 0.618333 0.465 ... 0.096667 0.056667 0.025 1.478604 1.87889 2.425129 Higher 2C 2059.0 2069.0 2081.0
ssp245 unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 0.911667 0.846667 0.783333 2.218657 2.838238 3.642782 Above 2C 2100.0 2100.0 2100.0
ssp370 unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 3.476406 4.372581 5.35643 Above 2C 2100.0 2100.0 2100.0
ssp434 unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.996667 0.978333 0.946667 ... 0.58 0.463333 0.356667 1.898714 2.371833 2.958278 Above 2C 2092.0 2094.0 2100.0
ssp460 unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 0.995 2.753816 3.485626 4.373157 Above 2C 2100.0 2100.0 2100.0
ssp534-over unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.996667 ... 0.811667 0.725 0.63 2.100912 2.613975 3.299928 Above 2C 2058.0 2060.0 2069.0
ssp585 unspecified MAGICCv7.5.1 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 4.094323 5.278692 6.75008 Above 2C 2100.0 2100.0 2100.0

8 rows × 23 columns