Changelog

Versions follow Semantic Versioning (<major>.<minor>.<patch>).

Backward incompatible (breaking) changes will only be introduced in major versions with advance notice in the Deprecations section of releases.
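
As a rough illustration of the versioning scheme above, the sketch below classifies the step between two version strings. The helper names `parse_version` and `bump_kind` are hypothetical and are not part of scmdata; the sketch assumes plain `<major>.<minor>.<patch>` strings with an optional leading `v`.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split '<major>.<minor>.<patch>' (optional leading 'v') into integers."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def bump_kind(old: str, new: str) -> str:
    """Return 'major', 'minor' or 'patch' for the step from old to new."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    if new_v[0] != old_v[0]:
        return "major"  # the only kind of release that may break compatibility
    if new_v[1] != old_v[1]:
        return "minor"
    return "patch"


print(bump_kind("v0.16.1", "v1.0.0"))
print(bump_kind("v0.16.0", "v0.16.1"))
```

Under the policy stated above, only a step reported as "major" would be expected to carry breaking changes.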

scmdata v1.0.0 (2024-01-29)

Improvements

Update to avoid hitting DeprecationWarning in pandas and seaborn

This should reduce the number of warnings that appear during common operations. (#298)

Trivial/Internal Changes

#298

scmdata v0.16.1 (2023-10-18)

Improved Documentation

Fixed documentation examples (#274)

Trivial/Internal Changes

#274

scmdata v0.16.0 (2023-10-18)

Breaking Changes

Dropped support for Python 3.8 and relaxed requirements of pint and pyam-iamc

This has led to a number of follow-up issues:
- documentation of our supported dependency versions (#277)
- moving to NEP29 (#276)
- a full review of dependencies (#278)
- need to test against development versions of upstream repositories (#279)
(#275)

Improvements

Added support for scmdata.run_append() to append pd.DataFrame objects

This provides some performance benefits when performing large groupby operations in certain circumstances by reducing the number of required append operations. (#262)

Trivial/Internal Changes

#267, #268

scmdata v0.15.3 (2023-10-12)

Improvements

Added support for pandas>=2

Requirement now set to ‘pandas>=1.1’ (#235)

Migrated to use the Climate Resource copier template

This migration adds support for ruff and pre-commit hooks to improve code quality (#260)

Trivial/Internal Changes

#235

v0.15.2

(#257) Updated to support the latest version of notebook
(#252) Add py.typed file so downstream packages can use the provided type hints. Improved the coverage of the type hints in run.py
(#255) Unpin upper limit of the version of numpy
(#248) Correctly filter the time index of empty ScmRuns. Resolves #245
(#247) Better performance for ScmRun.__setitem__()

v0.15.1

(#239) Move notebooks into the documentation and an update of the documentation configuration
(#238) Support scmdata.ScmRun() reading and writing files using pathlib.Path objects.
(#232) Update inplace operations to always return a result (closes #230). Removes support for pandas==1.0.5

v0.15.0

(#223) Loosen the pandas requirement to cover pandas>=1.4.3. Also officially support Python 3.10 and 3.11
(#222) Decrease the minimum number of time points for interpolation to 2
(#221) Add option to scmdata.ScmRun.interpolate() to allow for interpolation which ignores leap years. This also fixes a bug where scmdata.ScmRun.interpolate() converted integer values into unix time. That behaviour was inconsistent with the TimePoints class, where integers are converted into years.
(#218) Replaced internal calls to scmdata.groupby.RunGroupBy.map() with scmdata.groupby.RunGroupBy.apply()
(#210) Update github actions to avoid the use of a deprecated workflow command
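
The integer handling described in #221 above can be illustrated with the standard library alone. This is a minimal sketch and not scmdata's implementation: the function names are hypothetical, and seconds are assumed as the unix time unit purely for illustration.

```python
from datetime import datetime, timezone


def int_as_unix_seconds(value: int) -> datetime:
    """Interpret an integer as seconds since 1970-01-01 (unix time)."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def int_as_year(value: int) -> datetime:
    """Interpret an integer as a calendar year (start of that year)."""
    return datetime(value, 1, 1, tzinfo=timezone.utc)


# The same integer gives very different time points under the two readings
print(int_as_unix_seconds(2020))  # 1970-01-01 00:33:40+00:00
print(int_as_year(2020))          # 2020-01-01 00:00:00+00:00
```

Treating integers as years everywhere avoids the first, surprising reading.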

v0.14.2

(#209) Lazy import plotting modules to speed up startup time
(#208) Ensure that all unit operations in scmdata use scmdata.units.UNIT_REGISTRY. This now defaults to openscm_units.unit_registry instead of a unit registry unique to scmdata.
(#202) Add scmdata.run.ScmRun.set_meta() to enable setting of metadata for a subset of timeseries
(#194) Deprecate scmdata.groupby.RunGroupBy.map() in favour of scmdata.groupby.RunGroupBy.apply(), which is identical in functionality. Add scmdata.ScmRun.apply() for applying a function to each timeseries
(#195) Refactor scmdata.database to a package. The database backends have been moved to scmdata.database.backends.
(#197) Work around a regression in pandas’ handling of xarray’s xr.CFTimeIndex
(#193) Pin the version of black used for code formatting to ensure consistency
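
The deprecation in #194 above follows a common pattern: keep the old method as a thin alias that warns and then delegates to the new one. The sketch below shows that pattern in isolation. `RunGroupBySketch` is a hypothetical stand-in that groups plain lists; it is not scmdata's class and makes no claim about how scmdata implements the warning.

```python
import warnings
from typing import Callable, Iterable, List


class RunGroupBySketch:
    """Hypothetical stand-in for a group-by object holding plain lists."""

    def __init__(self, groups: Iterable[list]):
        self._groups = list(groups)

    def apply(self, func: Callable[[list], object]) -> List[object]:
        """Apply func to every group and collect the results."""
        return [func(group) for group in self._groups]

    def map(self, func: Callable[[list], object]) -> List[object]:
        """Deprecated alias of apply(), kept so existing callers keep working."""
        warnings.warn(
            "map() is deprecated, use apply() instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this method
        )
        return self.apply(func)


grouped = RunGroupBySketch([[1, 2], [3]])
print(grouped.apply(sum))  # [3, 3]
```

Callers of `map()` get the same result as before plus a `DeprecationWarning`, which gives them time to switch to `apply()`.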

v0.14.1

(#192) Bugfix for the versioning of the package
(#191) Add check of PyPI distribution to CI

v0.14.0

(#190) Add special case for extrapolating timeseries containing a single timestep using constant extrapolation. Moved scmdata.errors.InsufficientDataError from scmdata.time to scmdata.errors
(#186 and #187) Fix the handling of non-alphanumeric characters in filenames on Windows for scmdata.database.ScmDatabase. * values are no longer included in scmdata.database.ScmDatabase filenames
(#186) Move to pyproject.toml for setup etc.

v0.13.2

(#185) Allow scmdata.run.ScmRun to read remote files by providing a URL to the constructor
(#183) Deprecate scmdata.ops.integrate(), replacing it with scmdata.ops.cumsum() and scmdata.ops.cumtrapz()
(#184) Add scmdata.run.ScmRun.round()
(#182) Updated incorrect conda install instructions

v0.13.1

(#181) Allow the initialisation of empty scmdata.ScmRun objects
(#180) Add scmdata.processing.calculate_crossing_times_quantiles() to handle quantile calculations with nan values involved
(#176) Add as_run argument to scmdata.ScmRun.process_over() (closes #160)

v0.13.0

(#174) Add scmdata.processing.categorisation_sr15() (also added functionality for this to scmdata.processing.calculate_summary_stats())
(#173) Add scmdata.processing.calculate_peak() and scmdata.processing.calculate_peak_time() (also added functionality for these to scmdata.processing.calculate_summary_stats())
(#175) Remove unused scmdata.REQUIRED_COLS (closes #166)
(#172) Add scmdata.processing.calculate_summary_stats()
(#171) Add scmdata.processing.calculate_exceedance_probabilities(), scmdata.processing.calculate_exceedance_probabilities_over_time() and scmdata.ScmRun.get_meta_columns_except()
(#170) Added scmdata.ScmRun.groupby_all_except() to allow greater use of the concept of grouping by columns except a given set
(#169) Make scmdata.processing.calculate_crossing_times() able to be used as a standalone function rather than being intended to be called via scmdata.ScmRun.process_over()
(#168) Improve the error messages when checking that scmdata.ScmRun objects are identical
(#165) Add scmdata.processing.calculate_crossing_times()
(#164) Added scmdata.ScmRun.append_timewise() to allow appending of data along the time axis with broadcasting along multiple meta dimensions
(#164) Sort time axis internally (ensures that scmdata.ScmRun.__repr__() renders properly)
(#164) Added scmdata.errors.DuplicateTimesError, raised when duplicate times are passed to scmdata.ScmRun
(#164) Unified capitalisation of error messages in scmdata.errors and added the meta table to exc_info of NonUniqueMetadataError
(#163) Added scmdata.ScmRun.adjust_median_to_target() to allow for the median of an ensemble of timeseries to be adjusted to a given value
(#163) Update scmdata.plotting.RCMIP_SCENARIO_COLOURS to new AR6 colours

v0.12.1

(#162) Fix bug which led to data being read incorrectly when the saved data started before the year 1000
(#162) Allowed scmdata.ScmRun.plumeplot() to handle the case where not all data will make complete plumes or have a best-estimate line if pre_calculated is True. This allows a dataset with one source that has only a best estimate to be plotted alongside a dataset that also has a range, with a single call to scmdata.ScmRun.plumeplot().

v0.12.0

(#161) Loosen requirements and drop Python 3.6 support

v0.11.0

(#159) Allow access to more functions in scmdata.run.BaseScmRun.process_over, including arbitrary functions
(#158) Return cftime.DatetimeGregorian rather than cftime.datetime from scmdata.time.TimePoints.as_cftime() and scmdata.offsets.generate_range() to ensure better interoperability with other libraries (e.g. xarray’s plotting functionality). Add date_cls argument to scmdata.time.TimePoints.as_cftime() and scmdata.offsets.generate_range() so that the output date type can be user specified.
(#148) Refactor scmdata.database.ScmDatabase to be able to use custom backends
(#157) Add disable_tqdm parameter to scmdata.database.ScmDatabase.load() and scmdata.database.ScmDatabase.save() to disable displaying progress bars
(#156) Fix pandas and xarray documentation links
(#155) Simplify flake8 configuration

v0.10.1

(#154) Refactor common binary operators for scmdata.run.BaseScmRun and scmdata.timeseries.Timeseries into a mixin following the removal of xarray.core.ops.inject_binary_ops() in xarray==0.18.0

v0.10.0

(#151) Add ScmRun.to_xarray() (improves conversion to xarray and ability of user to control dimensions etc. when writing netCDF files)
(#149) Fix bug in testcase for xarray<=0.16.1
(#147) Re-do netCDF reading and writing to make use of xarray and provide better handling of extras (results in speedups of 10-100x)
(#146) Update CI-CD workflow to include more sensible dependencies and also test Python 3.9
(#145) Allow ScmDatabase.load() to handle lists as filter values

v0.9.1

(#144) Fix ScmRun.plumeplot() style handling (previously, if dashes was not supplied each line would be a different style even if all the lines had the same value for style_var)

v0.9.0

(#143) Alter time axis when serialising to netCDF so that time axis is easily read by other tools (e.g. xarray)

v0.8.0

(#139) Update filter to handle metadata columns which contain a mix of data types
(#139) Add ScmRun.plumeplot()
(#140) Add workaround for installing scmdata with Python 3.6 on Windows to handle lack of cftime 1.3.1 wheel
(#138) Add ScmRun.quantiles_over()
(#137) Fix scmdata.ScmRun.to_csv() so that writing and reading is circular (i.e. you end up where you started if you write a file and then read it straight back into a new scmdata.ScmRun instance)

v0.7.6

(#136) Make filtering by year able to handle a np.ndarray of integers (previously this would raise a TypeError)
(#135) Make scipy lazy loading in scmdata.time follow lazy loading seen in other modules
(#134) Add CI run in which seaborn is not installed to check scipy importing

v0.7.5

(#133) Pin pandas<1.2 to avoid pint-pandas installation failure (see pint-pandas #51)

v0.7.4

(#132) Update to new openscm-units 0.2
(#130) Add stack info to warning message when filtering results in an empty scmdata.run.ScmRun

v0.7.3

(#124) Add scmdata.run.BaseScmRun and scmdata.run.BaseScmRun.required_cols so new sub-classes can be defined which use a different set of required columns from scmdata.run.ScmRun. Also added scmdata.errors.MissingRequiredColumn and tidied up the docs.
(#75) Add test to ensure that scmdata.ScmRun.groupby() cannot pick up the same timeseries twice even if metadata is changed by the function being applied
(#125) Fix edge-case when filtering an empty scmdata.ScmRun
(#123) Add scmdata.database.ScmDatabase to read/write data using multiple files. (closes #103)

v0.7.2

(#121) Faster implementation of scmdata.run.run_append(). The original timeseries indexes and order are no longer maintained after an append.
(#120) Check the type and length of the runs argument in scmdata.run.run_append() (closes #101)

v0.7.1

(#119) Make groupby support grouping by metadata with integer values
(#119) Ensure using scmdata.run.run_append() does not mangle the index to pd.DatetimeIndex

v0.7.0

(#118) Make scipy an optional dependency
(#117) Sort timeseries index ordering (closes #97)
(#116) Update scmdata.ScmRun.drop_meta() inplace behaviour
(#115) Add na-override argument to scmdata.ScmRun.process_over() for handling nan metadata (closes #113)
(#114) Add operations: scmdata.ScmRun.linear_regression(), scmdata.ScmRun.linear_regression_gradient(), scmdata.ScmRun.linear_regression_intercept() and scmdata.ScmRun.linear_regression_scmrun()
(#111) Add operation: scmdata.ScmRun.delta_per_delta_time()
(#112) Ensure unit conversion doesn’t fall over when the target unit is in the input
(#110) Revert to using pd.DataFrame with pd.Categorical series as meta indexes.
(#108) Remove deprecated ScmDataFrame (closes #60)
(#105) Add performance benchmarks for ScmRun
(#106) Add ScmRun.integrate() so we can integrate timeseries with respect to time
(#104) Fix bug when reading csv/excel files which use integer years and lowercase_cols=True (closes #102)

v0.6.4

(#96) Fix non-unique timeseries metadata checks for ScmRun.timeseries()
(#100) When initialising ScmRun from file, make the default be to read with pd.read_csv(). This means that initialising from gzipped CSV files now works.
(#99) Hotfix failing notebook test
(#94) Fix edge-case issue with drop_meta (closes #92)
(#95) Add drop_all_nan_times keyword argument to ScmRun.timeseries() so time points with no data of interest can easily be removed

v0.6.3

(#91) Provide support for pandas==1.1

v0.6.2

(#87) Upgrade workflow to use isort>=5
(#82) Add support for adding Pint scalars and vectors to scmdata.Timeseries and scmdata.ScmRun instances
(#85) Allow required columns to be read as extras from netCDF files (closes #83)
(#84) Raise a DeprecationWarning if no inplace argument is provided for ScmRun.drop_meta(). The default inplace behaviour is scheduled to change to False in v0.7.0
(#81) Add scmdata.run.ScmRun.metadata to track ScmRun instance-specific metadata (closes #77)
(#80) No longer use pandas.tseries.offsets.BusinessMixin to determine Business-related offsets in scmdata.offsets.to_offset(). (closes #78)
(#79) Introduce scmdata.errors.NonUniqueMetadataError. Update handling of duplicate metadata so default behaviour of run_append is to raise a NonUniqueMetadataError. (closes #76)

v0.6.1

(#74) Update handling of unit conversion context during unit conversions
(#73) Only reindex timeseries when dealing with different time points

v0.5.2

(#65) Use pint for ops, making them automatically unit aware
(#71) Start adding arithmetic support via scmdata.ops. So far only add and subtract are supported.
(#70) Automatically set y-axis label to units if it makes sense in ScmRun’s lineplot() method

v0.5.1

(#68) Rename scmdata.run.df_append() to scmdata.run.run_append(). scmdata.run.df_append() is deprecated and will be removed in v0.6.0
(#67) Update the documentation for ScmRun.append()
(#66) Raise ValueError if index/columns arguments are not provided when instantiating a ScmRun object with a numpy array. Add lowercase_cols argument to coerce the column names in CSV files to lowercase

v0.5.0

(#64) Remove spurious warning from ScmRun’s filter() method
(#63) Remove set_meta() from ScmRun in preference for using the __setitem__() method
(#62) Fix interpolation when the data contains nan values
(#61) Hotfix filters to also include caret (“^”) in pseudo-regexp syntax. Also adds empty() property to ScmRun
(#59) Deprecate ScmDataFrame. To be removed in v0.6.0
(#58) Use cftime datetimes when appending ScmRun objects to avoid OutOfBounds errors when datetimes span many centuries
(#55) Add time_axis keyword argument to ScmRun.timeseries, ScmRun.long_data and ScmRun.lineplot to give greater control of the time axis when retrieving data
(#54) Add drop_meta() to ScmRun for dropping metadata columns
(#53) Don’t convert case of variable names written to file. No longer convert case of serialized dataframes
(#51) Refactor relative_to_ref_period_mean() so that it returns an instance of the input data type (rather than a pd.DataFrame) and puts the reference period in separate meta columns rather than mangling the variable name.
(#47) Update README and setup.py to make it easier for new users
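
The pseudo-regexp filter syntax mentioned in #61 above treats `*` as a wildcard and every other character, including the caret, as a literal. The sketch below shows one way such a pattern can be turned into a real regular expression. It is an illustration under those assumptions, not scmdata's implementation, and the function names are hypothetical.

```python
import re


def pseudo_regexp_to_regex(pattern: str) -> "re.Pattern":
    """Escape all regex metacharacters, then restore '*' as a wildcard."""
    escaped = re.escape(pattern).replace(r"\*", ".*")
    # Anchor both ends so the whole value has to match
    return re.compile(f"^{escaped}$")


def matches(pattern: str, value: str) -> bool:
    """Return True if value matches the pseudo-regexp pattern."""
    return pseudo_regexp_to_regex(pattern).match(value) is not None


print(matches("Emissions|*", "Emissions|CO2"))  # True
print(matches("W/m^2", "W/m^2"))                # True
print(matches("W/m^2", "W/m2"))                 # False
```

Without escaping, a caret in a unit such as "W/m^2" would be read as a regex anchor and the literal value would never match.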

v0.4.3

(#46) Add test of conda installation

v0.4.2

(#45) Make installing seaborn optional

v0.4.1

(#44) Add multi-dimensional handling to scmdata.netcdf
(#43) Fix minor bugs in netCDF handling and address minor code coverage issues
(#41) Update documentation of the data model. Additionally:
- makes .time_points attributes consistently return scmdata.time.TimePoints instances
- ensures .meta is used consistently throughout the code base (removing .metadata)
(#33) Remove dependency on pyam. Plotting is done with seaborn instead.
(#34) Allow the serialization/deserialization of scmdata.run.ScmRun and scmdata.ScmDataFrame as netCDF4 files.
(#30) Swap to using openscm-units for unit handling (hence remove much of the scmdata.units module)
(#21) Added scmdata.run.ScmRun as a proposed replacement for scmdata.dataframe.ScmDataFrame. This new class provides the same interface as ScmDataFrame but uses a different underlying data structure. The purpose of ScmRun is to provide performance improvements when handling large sets of time-series data. Removed support for Python 3.5 until the pyam dependency is optional
(#31) Tidy up repository after changing location

v0.4.0

(#28) Expose scmdata.units.unit_registry

v0.3.1

(#25) Make scipy an optional dependency
(#24) Fix missing “N2O” unit (see #14). Also updates test of year to day conversion, it is 365.25 to within 0.01% (but depends on the Pint release).

v0.3.0

(#20) Add support for python=3.5
(#19) Add support for python=3.6

v0.2.2

(#16) Only rename columns when initialising data if needed

v0.2.1

(#13) Ensure LICENSE is included in package
(#11) Add SO2F2 unit and update to Pyam v0.3.0
(#12) Add get_unique_meta convenience method
(#10) Fix extrapolation bug which prevented any extrapolation from occurring

v0.2.0

(#9) Add time_mean method
(#8) Add make docs target

v0.1.2

(#7) Add notebook tests
(#4) Unit conversions for CH4 and N2O contexts now work for compound units (e.g. ‘Mt CH4 / yr’ to ‘Gt C / day’)
(#6) Add auto-formatting

v0.1.1

(#5) Add scmdata.dataframe.df_append to __init__.py

v0.1.0

(#3) Added documentation for the api and Makefile targets for releasing
(#2) Refactored scmdataframe from openclimatedata/openscm@077f9b5 into a standalone package
(#1) Add docs folder