Changelog
Versions follow Semantic Versioning (<major>.<minor>.<patch>).
Backward incompatible (breaking) changes will only be introduced in major versions with advance notice in the Deprecations section of releases.
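As a rough illustration of the versioning policy above, the sketch below checks whether moving between two version strings could include breaking changes. The names Version, parse_version and upgrade_may_break are hypothetical helpers written for this example; they are not part of scmdata. The treatment of pre-1.0 releases (a minor bump may break) is an assumption based on the general Semantic Versioning convention for 0.y.z versions.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A <major>.<minor>.<patch> version (hypothetical helper, not scmdata API)."""

    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse '<major>.<minor>.<patch>', tolerating a leading 'v'."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return Version(major, minor, patch)


def upgrade_may_break(installed: str, candidate: str) -> bool:
    """Return True if moving from installed to candidate could be breaking."""
    old, new = parse_version(installed), parse_version(candidate)
    if old.major == 0:
        # Assumption: before 1.0.0, a minor bump may also be breaking
        return (new.major, new.minor) != (old.major, old.minor)
    # From 1.0.0 onwards, only a change of major version may be breaking
    return new.major != old.major
```

Under these assumptions, moving from 0.15.3 to 0.16.0 is flagged as potentially breaking, while moving from 1.0.0 to a later 1.x release is not.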
scmdata v1.0.0 (2024-01-29)
Improvements
Update to avoid hitting DeprecationWarning in pandas and seaborn
This should reduce the number of warnings that appear during common operations. (#298)
Trivial/Internal Changes
scmdata v0.16.1 (2023-10-18)
Improved Documentation
Fixed documentation examples (#274)
Trivial/Internal Changes
scmdata v0.16.0 (2023-10-18)
Breaking Changes
Dropped support for Python 3.8 and relaxed requirements of pint and pyam-iamc
This has led to a number of follow-up issues:
documentation of our supported dependency versions (#277)
moving to NEP29 (#276)
a full review of dependencies (#278)
need to test against development versions of upstream repositories (#279)
(#275)
Improvements
Added support for scmdata.run_append() to append pd.DataFrame objects
This provides some performance benefits when performing large groupby operations in certain circumstances by reducing the number of required append operations. (#262)
Trivial/Internal Changes
scmdata v0.15.3 (2023-10-12)
Improvements
Trivial/Internal Changes
v0.15.2
(#257) Updated to support the latest version of notebook
(#252) Add py.typed file so downstream packages can use the provided type-hints. Improved the coverage of the type-hints in run.py
(#255) Unpin upper limit of the version of numpy
(#248) Correctly filter the time index of empty ScmRuns. Resolves #245
(#247) Better performance for ScmRun.setitem
v0.15.1
v0.15.0
(#223) Loosen the pandas requirement to cover pandas>=1.4.3. Also officially support Python 3.10 and 3.11
(#222) Decrease the minimum number of time points for interpolation to 2
(#221) Add option to scmdata.ScmRun.interpolate() to allow for interpolation which ignores leap-years. This also fixes a bug where scmdata.ScmRun.interpolate() converts integer values into unix time. This functionality isn’t consistent with the behaviour of the TimePoints class where integers are converted into years.
(#218) Replaced internal calls to scmdata.groupby.RunGroupby.map() with scmdata.groupby.RunGroupby.apply()
(#210) Update github actions to avoid the use of a deprecated workflow command
v0.14.2
(#209) Lazy import plotting modules to speed up startup time
(#208) Ensure that all unit operations in scmdata use scmdata.units.UNIT_REGISTRY. This now defaults to openscm_units.unit_registry instead of a unique unit registry for scmdata.
(#202) Add scmdata.run.ScmRun.set_meta() to enable setting of metadata for a subset of timeseries
(#194) Deprecate scmdata.groupby.RunGroupBy.map() in preference to scmdata.groupby.RunGroupBy.apply() which is identical in functionality. Add scmdata.ScmRun.apply for applying a function to each timeseries
(#195) Refactor scmdata.database to a package. The database backends have been moved to scmdata.database.backends.
(#197) Workaround regression in pandas’ handling of xarray’s xr.CFTimeIndex
(#193) Pin the version of black used for code formatting to ensure consistency
v0.14.1
v0.14.0
(#190) Add special case for extrapolating timeseries containing a single timestep using constant extrapolation. Moved scmdata.errors.InsufficientDataError from scmdata.time to scmdata.errors
(#186 and #187) Fix the handling of non-alphanumeric characters in filenames on Windows for scmdata.database.ScmDatabase. * values are no longer included in scmdata.database.ScmDatabase filenames
(#186) Move to pyproject.toml for setup etc.
v0.13.2
(#185) Allow scmdata.run.ScmRun to read remote files by providing a URL to the constructor
(#183) Deprecate scmdata.ops.integrate(), replacing it with scmdata.ops.cumsum() and scmdata.ops.cumtrapz()
(#184) Add scmdata.run.ScmRun.round()
(#182) Updated incorrect conda install instructions
v0.13.1
(#181) Allow the initialisation of empty scmdata.ScmRun objects
(#180) Add scmdata.processing.calculate_crossing_times_quantiles() to handle quantile calculations with nan values involved
(#176) Add as_run argument to scmdata.ScmRun.process_over() (closes #160)
v0.13.0
(#174) Add scmdata.processing.categorisation_sr15() (also added functionality for this to scmdata.processing.calculate_summary_stats())
(#173) Add scmdata.processing.calculate_peak() and scmdata.processing.calculate_peak_time() (also added functionality for these to scmdata.processing.calculate_summary_stats())
(#171) Add scmdata.processing.calculate_exceedance_probabilities(), scmdata.processing.calculate_exceedance_probabilities_over_time() and scmdata.ScmRun.get_meta_columns_except()
(#170) Added scmdata.ScmRun.groupby_all_except() to allow greater use of the concept of grouping by columns except a given set
(#169) Make scmdata.processing.calculate_crossing_times() able to be used as a standalone function rather than being intended to be called via scmdata.ScmRun.process_over()
(#168) Improve the error messages when checking that scmdata.ScmRun objects are identical
(#164) Added scmdata.ScmRun.append_timewise() to allow appending of data along the time axis with broadcasting along multiple meta dimensions
(#164) Sort time axis internally (ensures that scmdata.ScmRun.__repr__() renders properly)
(#164) Added scmdata.errors.DuplicateTimesError, raised when duplicate times are passed to scmdata.ScmRun
(#164) Unified capitalisation of error messages in scmdata.errors and added the meta table to exc_info of NonUniqueMetadataError
(#163) Added scmdata.ScmRun.adjust_median_to_target() to allow for the median of an ensemble of timeseries to be adjusted to a given value
(#163) Update scmdata.plotting.RCMIP_SCENARIO_COLOURS to new AR6 colours
v0.12.1
(#162) Fix bug which led to a bad read in when the saved data spanned from before year 1000
(#162) Allowed scmdata.ScmRun.plumeplot() to handle the case where not all data will make complete plumes or have a best-estimate line if pre_calculated is True. This allows a dataset with one source that has a best-estimate only to be plotted at the same time as a dataset which has a range too with only a single call to scmdata.ScmRun.plumeplot().
v0.12.0
(#161) Loosen requirements and drop Python3.6 support
v0.11.0
(#159) Allow access to more functions in scmdata.run.BaseScmRun.process_over, including arbitrary functions
(#158) Return cftime.DatetimeGregorian rather than cftime.datetime from scmdata.time.TimePoints.as_cftime() and scmdata.offsets.generate_range() to ensure better interoperability with other libraries (e.g. xarray’s plotting functionality). Add date_cls argument to scmdata.time.TimePoints.as_cftime() and scmdata.offsets.generate_range() so that the output date type can be user specified.
(#148) Refactor scmdata.database.ScmDatabase to be able to use custom backends
(#157) Add disable_tqdm parameter to scmdata.database.ScmDatabase.load() and scmdata.database.ScmDatabase.save() to disable displaying progress bars
(#155) Simplify flake8 configuration
v0.10.1
(#154) Refactor common binary operators for scmdata.run.BaseScmRun and scmdata.timeseries.Timeseries into a mixin following the removal of xarray.core.ops.inject_binary_ops() in xarray==0.18.0
v0.10.0
(#151) Add ScmRun.to_xarray() (improves conversion to xarray and ability of user to control dimensions etc. when writing netCDF files)
(#149) Fix bug in testcase for xarray<=0.16.1
(#147) Re-do netCDF reading and writing to make use of xarray and provide better handling of extras (results in speedups of 10-100x)
(#146) Update CI-CD workflow to include more sensible dependencies and also test Python3.9
(#145) Allow ScmDatabase.load() to handle lists as filter values
v0.9.1
(#144) Fix ScmRun.plumeplot() style handling (previously, if dashes was not supplied each line would be a different style even if all the lines had the same value for style_var)
v0.9.0
(#143) Alter time axis when serialising to netCDF so that time axis is easily read by other tools (e.g. xarray)
v0.8.0
(#139) Update filter to handle metadata columns which contain a mix of data types
(#139) Add ScmRun.plumeplot()
(#140) Add workaround for installing scmdata with Python 3.6 on windows to handle lack of cftime 1.3.1 wheel
(#138) Add ScmRun.quantiles_over()
(#137) Fix scmdata.ScmRun.to_csv() so that writing and reading is circular (i.e. you end up where you started if you write a file and then read it straight back into a new scmdata.ScmRun instance)
v0.7.6
v0.7.5
(#133) Pin pandas<1.2 to avoid pint-pandas installation failure (see pint-pandas #51)
v0.7.4
(#132) Update to new openscm-units 0.2
(#130) Add stack info to warning message when filtering results in an empty scmdata.run.ScmRun
v0.7.3
(#124) Add scmdata.run.BaseScmRun and scmdata.run.BaseScmRun.required_cols so new sub-classes can be defined which use a different set of required columns from scmdata.run.ScmRun. Also added scmdata.errors.MissingRequiredColumn and tidied up the docs.
(#75) Add test to ensure that scmdata.ScmRun.groupby() cannot pick up the same timeseries twice even if metadata is changed by the function being applied
(#125) Fix edge-case when filtering an empty scmdata.ScmRun
(#123) Add scmdata.database.ScmDatabase to read/write data using multiple files. (closes #103)
v0.7.2
(#121) Faster implementation of scmdata.run.run_append(). The original timeseries indexes and order are no longer maintained after an append.
(#120) Check the type and length of the runs argument in scmdata.run.run_append() (closes #101)
v0.7.1
(#119) Make groupby support grouping by metadata with integer values
(#119) Ensure using scmdata.run.run_append() does not mangle the index to pd.DatetimeIndex
v0.7.0
(#118) Make scipy an optional dependency
(#116) Update scmdata.ScmRun.drop_meta() inplace behaviour
(#115) Add na-override argument to scmdata.ScmRun.process_over() for handling nan metadata (closes #113)
(#114) Add operations: scmdata.ScmRun.linear_regression(), scmdata.ScmRun.linear_regression_gradient(), scmdata.ScmRun.linear_regression_intercept() and scmdata.ScmRun.linear_regression_scmrun()
(#111) Add operation: scmdata.ScmRun.delta_per_delta_time()
(#112) Ensure unit conversion doesn’t fall over when the target unit is in the input
(#110) Revert to using pd.DataFrame with pd.Categorical series as meta indexes.
(#105) Add performance benchmarks for ScmRun
(#106) Add ScmRun.integrate() so we can integrate timeseries with respect to time
(#104) Fix bug when reading csv/excel files which use integer years and lowercase_cols=True (closes #102)
v0.6.4
(#96) Fix non-unique timeseries metadata checks for ScmRun.timeseries()
(#100) When initialising ScmRun from file, make the default be to read with pd.read_csv(). This means we now support reading from gzipped CSV files.
(#99) Hotfix failing notebook test
(#95) Add drop_all_nan_times keyword argument to ScmRun.timeseries() so time points with no data of interest can easily be removed
v0.6.3
(#91) Provide support for pandas==1.1
v0.6.2
(#87) Upgrade workflow to use isort>=5
(#82) Add support for adding Pint scalars and vectors to scmdata.Timeseries and scmdata.ScmRun instances
(#85) Allow required columns to be read as extras from netCDF files (closes #83)
(#84) Raise a DeprecationWarning if no default inplace argument is provided for ScmRun.drop_meta(). inplace default behaviour scheduled to be changed to False in v0.7.0
(#81) Add scmdata.run.ScmRun.metadata to track ScmRun instance-specific metadata (closes #77)
(#80) No longer use pandas.tseries.offsets.BusinessMixin to determine Business-related offsets in scmdata.offsets.to_offset(). (closes #78)
(#79) Introduce scmdata.errors.NonUniqueMetadataError. Update handling of duplicate metadata so default behaviour of run_append is to raise a NonUniqueMetadataError. (closes #76)
v0.6.1
v0.5.2
(#65) Use pint for ops, making them automatically unit aware
(#71) Start adding arithmetic support via scmdata.ops. So far only add and subtract are supported.
(#70) Automatically set y-axis label to units if it makes sense in ScmRun’s lineplot() method
v0.5.1
(#68) Rename scmdata.run.df_append() to scmdata.run.run_append(). scmdata.run.df_append() deprecated and will be removed in v0.6.0
(#67) Update the documentation for ScmRun.append()
(#66) Raise ValueError if index/columns arguments are not provided when instantiating a ScmRun object with a numpy array. Add lowercase_cols argument to coerce the column names in CSV files to lowercase
v0.5.0
(#64) Remove spurious warning from ScmRun’s filter() method
(#63) Remove set_meta() from ScmRun in preference for using the __setitem__() method
(#62) Fix interpolation when the data contains nan values
(#61) Hotfix filters to also include caret (“^”) in pseudo-regexp syntax. Also adds empty() property to ScmRun
(#59) Deprecate ScmDataFrame. To be removed in v0.6.0
(#58) Use cftime datetimes when appending ScmRun objects to avoid OutOfBounds errors when datetimes span many centuries
(#55) Add time_axis keyword argument to ScmRun.timeseries, ScmRun.long_data and ScmRun.lineplot to give greater control of the time axis when retrieving data
(#54) Add drop_meta() to ScmRun for dropping metadata columns
(#53) Don’t convert case of variable names written to file. No longer convert case of serialized dataframes
(#51) Refactor relative_to_ref_period_mean() so that it returns an instance of the input data type (rather than a pd.DataFrame) and puts the reference period in separate meta columns rather than mangling the variable name.
(#47) Update README and setup.py to make it easier for new users
v0.4.3
(#46) Add test of conda installation
v0.4.2
(#45) Make installing seaborn optional
v0.4.1
(#44) Add multi-dimensional handling to scmdata.netcdf
(#43) Fix minor bugs in netCDF handling and address minor code coverage issues
(#41) Update documentation of the data model. Additionally:
makes .time_points attributes consistently return scmdata.time.TimePoints instances
ensures .meta is used consistently throughout the code base (removing .metadata)
(#33) Remove dependency on pyam. Plotting is done with seaborn instead.
(#34) Allow the serialization/deserialization of scmdata.run.ScmRun and scmdata.ScmDataFrame as netCDF4 files.
(#30) Swap to using openscm-units for unit handling (hence remove much of the scmdata.units module)
(#21) Added scmdata.run.ScmRun as a proposed replacement for scmdata.dataframe.ScmDataFrame. This new class provides an identical interface as a ScmDataFrame, but uses a different underlying data structure to the ScmDataFrame. The purpose of ScmRun is to provide performance improvements when handling large sets of time-series data. Removed support for Python 3.5 until pyam dependency is optional
(#31) Tidy up repository after changing location
v0.4.0
(#28) Expose scmdata.units.unit_registry
v0.3.1
v0.3.0
v0.2.2
(#16) Only rename columns when initialising data if needed
v0.2.1
v0.2.0
v0.1.2
v0.1.1
(#5) Add scmdata.dataframe.df_append to __init__.py
v0.1.0
(#3) Added documentation for the api and Makefile targets for releasing
(#2) Refactored scmdataframe from openclimatedata/openscm@077f9b5 into a standalone package
(#1) Add docs folder