scmdata.filters

Helpers for filtering data in scmdata.run.ScmRun.

Based upon pyam.utils.

scmdata.filters.datetime_match(data: List, dts: Union[List[datetime.datetime], datetime.datetime]) → numpy.ndarray

Match datetimes in time columns for data filtering.

Parameters
  • data – Input data to perform filtering on

  • dts – Datetimes to use for filtering

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

Raises

TypeError – If dts contains int
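
The behaviour described above can be illustrated with a minimal plain-Python sketch. It is not scmdata's implementation: the name datetime_mask_sketch is hypothetical, a list of bool stands in for the numpy.ndarray of bool, and the error message is invented for illustration.

```python
import datetime as dt


def datetime_mask_sketch(data, dts):
    """Return one bool per element of data: True where it equals one of dts."""
    # Accept a single datetime as well as a list of them.
    if isinstance(dts, dt.datetime):
        dts = [dts]
    # Mirror the documented behaviour: integers are rejected with a TypeError.
    if any(isinstance(d, int) for d in dts):
        raise TypeError("dts must contain datetime.datetime values, not int")
    return [d in dts for d in data]


time_points = [
    dt.datetime(2015, 1, 1),
    dt.datetime(2016, 1, 1),
    dt.datetime(2017, 1, 1),
]
mask = datetime_mask_sketch(time_points, dt.datetime(2016, 1, 1))
print(mask)  # [False, True, False]
```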

scmdata.filters.day_match(data: List, days: Union[List[str], List[int], int, str]) → numpy.ndarray

Match days in time columns for data filtering.

Parameters
  • data – Input data to perform filtering on

  • days – Days to match

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

scmdata.filters.find_depth(meta_col: pandas.core.series.Series, s: str, level: Union[int, str], separator: str = '|') → numpy.ndarray

Find all values which match given depth from a filter keyword.

Parameters
  • meta_col – Column in which to find values which match the given depth

  • s – Filter keyword, from which level should be applied

  • level – Depth of the values to match, defined as the number of separators in the value name. If an int, the depth is matched exactly. If a str, the depth can be given as either “X-”, for all levels up to level “X”, or “X+”, for all levels above level “X”.

  • separator – The string used to separate levels in s. Defaults to a pipe (“|”).

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

Raises

ValueError – If level cannot be understood
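
To make the level syntax concrete, here is a rough plain-Python sketch of depth matching. It rests on several assumptions that the text above does not confirm: the depth is counted in the part of each value that follows the keyword (with a trailing “*” stripped), values that do not start with the keyword never match, and both “X-” and “X+” include level X itself. The name depth_mask_sketch is hypothetical and a list of bool stands in for the numpy.ndarray of bool.

```python
def depth_mask_sketch(values, keyword, level, separator="|"):
    """Illustrative depth filter: count separators after the keyword prefix."""
    prefix = keyword.rstrip("*")

    if isinstance(level, int):
        def test(depth):
            return depth == level
    elif isinstance(level, str) and level[-1:] in ("-", "+") and level[:-1].isdigit():
        bound = int(level[:-1])
        if level.endswith("-"):
            # Assumption: "X-" means depth <= X.
            def test(depth):
                return depth <= bound
        else:
            # Assumption: "X+" means depth >= X.
            def test(depth):
                return depth >= bound
    else:
        raise ValueError(f"Cannot understand level: {level!r}")

    mask = []
    for value in values:
        if not value.startswith(prefix):
            mask.append(False)
            continue
        mask.append(test(value[len(prefix):].count(separator)))
    return mask


variables = [
    "Emissions|CO2",
    "Emissions|CO2|Fossil",
    "Emissions|CO2|Fossil|Coal",
    "Concentrations|CO2",
]
print(depth_mask_sketch(variables, "Emissions|*", 1))     # [False, True, False, False]
print(depth_mask_sketch(variables, "Emissions|*", "1-"))  # [True, True, False, False]
print(depth_mask_sketch(variables, "Emissions|*", "1+"))  # [False, True, True, False]
```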

scmdata.filters.hour_match(data: List, hours: Union[List[int], int]) → numpy.ndarray

Match hours in time columns for data filtering.

Parameters
  • data – Input data to perform filtering on

  • hours – Hours to match

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

scmdata.filters.is_in(vals: List, items: List) → numpy.ndarray

Find elements of vals which are in items.

Parameters
  • vals – The list of values to check

  • items – The options used to determine whether each element of vals is in the desired subset or not

Returns

Array of the same length as vals where the element is True if the corresponding element of vals is in items and False otherwise

Return type

numpy.ndarray of bool
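
The element-wise membership test described above amounts to the following plain-Python sketch. The name is_in_sketch is hypothetical, and a list of bool stands in for the numpy.ndarray of bool that the real function returns.

```python
def is_in_sketch(vals, items):
    """Return a mask of the same length as vals: True where the element is in items."""
    return [v in items for v in vals]


mask = is_in_sketch(["a", "b", "c"], ["a", "c", "z"])
print(mask)  # [True, False, True]
```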

scmdata.filters.month_match(data: List, months: Union[List[str], List[int], int, str]) → numpy.ndarray

Match months in time columns for data filtering.

Parameters
  • data – Input data to perform filtering on

  • months – Months to match

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

scmdata.filters.pattern_match(meta_col: pandas.core.series.Series, values: Union[Iterable[str], str], level: Optional[Union[str, int]] = None, regexp: bool = False, separator: str = '|') → numpy.ndarray

Filter data by matching metadata columns to given patterns.

Parameters
  • meta_col – Column to perform filtering on

  • values – Values to match

  • level – Passed to find_depth(). For usage, see docstring of find_depth().

  • regexp – If True, match using regexp rather than our pseudo regexp syntax.

  • has_nan – If True, convert all nan values in meta_col to empty string before applying filters. This means that “” and “*” will match rows with numpy.nan. If False, the conversion is not applied and so a search in a string column which contains numpy.nan will result in a TypeError.

  • separator – String used to separate the hierarchy levels in values. Defaults to a pipe (“|”).

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

Raises

TypeError – Filtering is performed on a string metadata column which contains numpy.nan and has_nan is False
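
The difference between the pseudo regexp syntax and a true regular expression can be sketched as follows. This is an illustration only, under the assumptions that “*” is the sole wildcard in the pseudo syntax, that every other character is matched literally, and that patterns must match the whole value. The name pattern_mask_sketch is hypothetical, the level, has_nan and separator options are left out, and a list of bool stands in for the numpy.ndarray of bool.

```python
import re


def pattern_mask_sketch(values, patterns, regexp=False):
    """Return True for each value matching at least one of the patterns."""
    if isinstance(patterns, str):
        patterns = [patterns]

    mask = [False] * len(values)
    for pattern in patterns:
        if regexp:
            # Treat the pattern as a real regular expression.
            compiled = re.compile(pattern)
        else:
            # Escape everything, then turn the escaped "*" back into a wildcard.
            compiled = re.compile(re.escape(pattern).replace(r"\*", ".*") + "$")
        mask = [
            found or compiled.match(value) is not None
            for found, value in zip(mask, values)
        ]
    return mask


variables = ["Emissions|CO2", "Emissions|CH4", "Atmospheric Concentrations|CO2"]
print(pattern_mask_sketch(variables, "Emissions|*"))  # [True, True, False]
print(pattern_mask_sketch(variables, "*CO2"))         # [True, False, True]
print(pattern_mask_sketch(variables, r"^Emissions\|C.*$", regexp=True))  # [True, True, False]
```

In the pseudo syntax the pipe is an ordinary character, so “Emissions|*” does not need escaping; with regexp=True the same pipe has to be written as “\|”.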

scmdata.filters.time_match(data: List, times: Union[List[str], List[int], int, str], conv_codes: List[str], strptime_attr: str, name: str) → numpy.ndarray

Match times by applying conversion codes to filtering list.

Parameters
  • data – Input data to perform filtering on

  • times – Times to match

  • conv_codes – If times contains strings, conversion codes to try passing to time.strptime() to convert times to datetime.datetime

  • strptime_attr – If times contains strings, the datetime.datetime attribute to finalize the conversion of strings to integers

  • name – Name of the part of a datetime to extract, used to produce useful error messages.

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

Raises

ValueError – If input times cannot be understood or if input strings do not lead to increasing integers (i.e. “Nov-Feb” will not work, one must use [“Nov-Dec”, “Jan-Feb”] instead)
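
The conversion of strings such as “Mar-May” into integers, and the reason “Nov-Feb” is rejected, can be sketched as below. This is not scmdata's code: the helper names are hypothetical, the “start-end” range handling is inferred from the error description above, and “tm_mon” is assumed as the attribute read from the result of time.strptime(). Month-name parsing depends on the locale, so English names are assumed.

```python
import time


def to_int_sketch(value, conv_codes, strptime_attr, name):
    """Convert one time specifier to an int, trying each conversion code in turn."""
    if isinstance(value, int):
        return value
    for code in conv_codes:
        try:
            return getattr(time.strptime(value, code), strptime_attr)
        except ValueError:
            continue
    raise ValueError(f"Could not convert {name} {value!r} to an integer")


def expand_times_sketch(spec, conv_codes, strptime_attr, name):
    """Expand an int, a string, or a 'start-end' string into a list of ints."""
    if isinstance(spec, str) and "-" in spec:
        start, end = (
            to_int_sketch(part, conv_codes, strptime_attr, name)
            for part in spec.split("-")
        )
        if start > end:
            # A range such as "Nov-Feb" would wrap around the year end.
            raise ValueError(f"{name} range {spec!r} does not give increasing integers")
        return list(range(start, end + 1))
    return [to_int_sketch(spec, conv_codes, strptime_attr, name)]


codes = ["%b", "%B"]  # abbreviated and full month names
print(expand_times_sketch("Mar-May", codes, "tm_mon", "month"))  # [3, 4, 5]
print(expand_times_sketch("March", codes, "tm_mon", "month"))    # [3]

# The wrap-around case has to be split into two increasing ranges.
wrapped = []
for part in ["Nov-Dec", "Jan-Feb"]:
    wrapped.extend(expand_times_sketch(part, codes, "tm_mon", "month"))
print(wrapped)  # [11, 12, 1, 2]
```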

scmdata.filters.years_match(data: List, years: Union[List[int], numpy.ndarray, int]) → numpy.ndarray

Match years in time columns for data filtering.

Parameters
  • data – Input data to perform filtering on

  • years – Years to match

Returns

Array where True indicates a match

Return type

numpy.ndarray of bool

Raises

TypeError – If years is not int or list of int