scmdata.filters

Helpers for filtering data in scmdata.run.ScmRun.

Based upon pyam.utils.

scmdata.filters.datetime_match(data: List, dts: List[datetime] | datetime) → ndarray

Match datetimes in time columns for data filtering.

Parameters:
  • data – Input data to perform filtering on

  • dts – Datetimes to use for filtering

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

Raises:

TypeError – If dts contains int
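The behaviour described above can be illustrated without scmdata installed. The sketch below is a plain-Python stand-in, not the library's implementation: the name datetime_match_sketch is hypothetical, and a list of bool stands in for the numpy.ndarray that the real function returns.

```python
from datetime import datetime


def datetime_match_sketch(data, dts):
    """Hypothetical stand-in: flag entries of ``data`` that equal one of ``dts``."""
    # Accept a single datetime as well as a list of datetimes.
    dts = dts if isinstance(dts, list) else [dts]
    # Mirror the documented error: integers are not valid datetimes.
    if any(isinstance(d, int) for d in dts):
        raise TypeError("`dts` must contain datetime objects, not int")
    return [d in dts for d in data]


times = [datetime(2000, 1, 1), datetime(2010, 1, 1), datetime(2020, 1, 1)]
mask = datetime_match_sketch(times, datetime(2010, 1, 1))
# mask == [False, True, False]
```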

scmdata.filters.day_match(data: List, days: List[str] | List[int] | int | str) → ndarray

Match days in time columns for data filtering.

Parameters:
  • data – Input data to perform filtering on

  • days – Days to match

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

scmdata.filters.find_depth(meta_col: Series, s: str, level: int | str, separator: str = '|') → ndarray

Find all values which match given depth from a filter keyword.

Parameters:
  • meta_col – Column in which to find values which match the given depth

  • s – Filter keyword, from which level should be applied

  • level – Depth of value to match, defined as the number of occurrences of separator in the value name. If an int, the depth is matched exactly. If a str, the depth can be matched as either “X-”, for all levels up to level “X”, or “X+”, for all levels above level “X”.

  • separator – The string used to separate levels in s. Defaults to a pipe (“|”).

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

Raises:

ValueError – If level cannot be understood
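The level syntax is easiest to see on a small hierarchy. The sketch below is an illustration in plain Python, not scmdata's code. It makes three assumptions: the depth is counted after removing the filter keyword (with any “*” stripped) from each value, “X-” means a depth of at most X, and “X+” means a depth of at least X, so level X itself matches in both cases. The name depth_matches is hypothetical, and a list of bool stands in for the returned array.

```python
def depth_matches(values, keyword, level, separator="|"):
    """Hypothetical stand-in for the depth test described above."""
    if isinstance(level, int):
        def test(depth):
            return depth == level
    elif isinstance(level, str) and level.endswith("-"):
        limit = int(level[:-1])

        def test(depth):
            return depth <= limit
    elif isinstance(level, str) and level.endswith("+"):
        limit = int(level[:-1])

        def test(depth):
            return depth >= limit
    else:
        raise ValueError(f"Unknown level type: {level}")

    # Depth is the number of separators left once the keyword is removed.
    prefix = keyword.replace("*", "")
    return [test(v.replace(prefix, "", 1).count(separator)) for v in values]


variables = ["Emissions", "Emissions|CO2", "Emissions|CO2|Energy"]
exact = depth_matches(variables, "Emissions*", 1)       # [False, True, False]
up_to = depth_matches(variables, "Emissions*", "1-")    # [True, True, False]
at_least = depth_matches(variables, "Emissions*", "1+")  # [False, True, True]
```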

scmdata.filters.hour_match(data: List, hours: List[int] | int) → ndarray

Match hours in time columns for data filtering.

Parameters:
  • data – Input data to perform filtering on

  • hours – Hours to match

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

scmdata.filters.is_in(vals: List, items: List) → ndarray

Find elements of vals which are in items.

Parameters:
  • vals – The list of values to check

  • items – The options used to determine whether each element of vals is in the desired subset or not

Returns:

Array of the same length as vals where the element is True if the corresponding element of vals is in items and False otherwise

Return type:

numpy.ndarray of bool
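As a minimal illustration of the element-wise membership test described above, the following plain-Python sketch returns a list of bool in place of the numpy.ndarray. The name is_in_sketch is hypothetical and the code is not taken from scmdata.

```python
def is_in_sketch(vals, items):
    """Hypothetical stand-in: one bool per element of ``vals``."""
    return [v in items for v in vals]


regions = ["World", "Europe", "Asia"]
mask = is_in_sketch(regions, ["World", "Asia"])
# mask == [True, False, True]
```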

scmdata.filters.month_match(data: List, months: List[str] | List[int] | int | str) → ndarray

Match months in time columns for data filtering.

Parameters:
  • data – Input data to perform filtering on

  • months – Months to match

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

scmdata.filters.pattern_match(meta_col: Series, values: Iterable[str] | str, level: str | int | None = None, regexp: bool = False, separator: str = '|') → ndarray

Filter data by matching metadata columns to given patterns.

Parameters:
  • meta_col – Column to perform filtering on

  • values – Values to match

  • level – Passed to find_depth(). For usage, see docstring of find_depth().

  • regexp – If True, match using regexp rather than our pseudo regexp syntax.

  • has_nan – If True, convert all nan values in meta_col to empty string before applying filters. This means that “” and “*” will match rows with numpy.nan. If False, the conversion is not applied and so a search in a string column which contains numpy.nan will result in a TypeError.

  • separator – String used to separate the hierarchy levels in values. Defaults to ‘|’

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

Raises:

TypeError – Filtering is performed on a string metadata column which contains numpy.nan and has_nan is False
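The pseudo regexp syntax is not spelled out here. The sketch below assumes the pyam-style convention in which “*” is a wildcard and every other character is matched literally, with the whole value having to match. It is a plain-Python illustration with a hypothetical name (pseudo_regexp_match), not the library's implementation, and it ignores level, regexp and nan handling.

```python
import re


def pseudo_regexp_match(values, pattern):
    """Hypothetical stand-in: treat ``*`` as a wildcard, all else literally."""
    # Escape regex metacharacters (including the "|" separator),
    # then turn the escaped "*" back into "match anything".
    regex = re.escape(pattern).replace(r"\*", ".*")
    return [re.fullmatch(regex, v) is not None for v in values]


variables = ["Emissions|CO2", "Emissions|CO2|Energy", "Emissions|CH4"]
with_wildcard = pseudo_regexp_match(variables, "Emissions|CO2*")
# with_wildcard == [True, True, False]
exact_only = pseudo_regexp_match(variables, "Emissions|CO2")
# exact_only == [True, False, False]
```

Escaping the pattern first is what lets the default separator “|” be used in values without it being read as regex alternation.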

scmdata.filters.time_match(data: List, times: List[str] | List[int] | int | str, conv_codes: List[str], strptime_attr: str, name: str) → ndarray

Match times by applying conversion codes to filtering list.

Parameters:
  • data – Input data to perform filtering on

  • times – Times to match

  • conv_codes – If times contains strings, conversion codes to try passing to time.strptime() to convert times to datetime.datetime

  • strptime_attr – If times contains strings, the datetime.datetime attribute to finalize the conversion of strings to integers

  • name – Name of the part of a datetime to extract, used to produce useful error messages.

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

Raises:

ValueError – If input times cannot be converted or understood, or if input strings do not lead to increasing integers (e.g. “Nov-Feb” will not work; use [“Nov-Dec”, “Jan-Feb”] instead)
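The string conversion and the “increasing integers” rule can be sketched as follows. This is an illustration under stated assumptions, not scmdata's code: the helper name expand_time_range is hypothetical, the conversion codes “%b” and “%B” with the attribute “tm_mon” are example arguments chosen for months, and month names are parsed in the default (English) locale.

```python
import time


def expand_time_range(spec, conv_codes, strptime_attr, name):
    """Hypothetical helper: turn "Nov-Dec" into [11, 12] via time.strptime."""
    parts = spec.split("-")
    if len(parts) > 2:
        raise ValueError(f"Could not understand {name} '{spec}'")

    # Try each conversion code until one parses every part.
    for code in conv_codes:
        try:
            ints = [getattr(time.strptime(p, code), strptime_attr) for p in parts]
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"Could not convert {name} '{spec}' to integer")

    if len(ints) == 1:
        return ints
    start, end = ints
    if start > end:
        # "Nov-Feb" gives 11 then 2, which is not increasing.
        raise ValueError(f"{name} '{spec}' does not lead to increasing integers")
    return list(range(start, end + 1))


codes = ["%b", "%B"]
late = expand_time_range("Nov-Dec", codes, "tm_mon", "months")     # [11, 12]
spring = expand_time_range("March-May", codes, "tm_mon", "months")  # [3, 4, 5]
```

A wrap-around range is therefore written as two increasing pieces, for example [“Nov-Dec”, “Jan-Feb”], each of which converts cleanly.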

scmdata.filters.years_match(data: List, years: List[int] | ndarray | int) → ndarray

Match years in time columns for data filtering.

Parameters:
  • data – Input data to perform filtering on

  • years – Years to match

Returns:

Array where True indicates a match

Return type:

numpy.ndarray of bool

Raises:

TypeError – If years is not int or list of int
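The type check can be illustrated with a plain-Python sketch. It assumes data holds the year of each time point as an int, handles only built-in int values (not numpy integer types), and returns a list of bool in place of the array. The name years_match_sketch is hypothetical.

```python
def years_match_sketch(data, years):
    """Hypothetical stand-in: flag years in ``data`` that appear in ``years``."""
    years = [years] if isinstance(years, int) else list(years)
    # Mirror the documented error for anything that is not an int.
    if not all(isinstance(y, int) for y in years):
        raise TypeError("`years` must be an int or a list of int")
    return [y in years for y in data]


mask = years_match_sketch([2005, 2010, 2015], [2010, 2015])
# mask == [False, True, True]
```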