ColumnMapper

class ColumnMapper(sidecar=None, tag_columns=None, column_prefix_dictionary=None, optional_tag_columns=None, warn_on_missing_column=False)

Maps the columns of a base input file to HED tags.

Notes

  • All column numbers are 0-based.

Methods

ColumnMapper.__init__([sidecar, ...])

Constructor for ColumnMapper.

ColumnMapper.check_for_blank_names(...)

Validate there are no blank column names.

ColumnMapper.check_for_mapping_issues([...])

Find all issues given the current column_map, tag_columns, etc.

ColumnMapper.get_column_mapping_issues()

Get all the issues with finalizing the column mapping (duplicate columns, missing required columns, etc.).

ColumnMapper.get_def_dict(hed_schema[, ...])

Return def dicts from every column description.

ColumnMapper.get_tag_columns()

Return the column numbers or names that are mapped to be HedTags.

ColumnMapper.get_transformers()

Return the transformers to use on a dataframe.

ColumnMapper.set_column_map([new_column_map])

Set the column number to name mapping.

ColumnMapper.set_column_prefix_dictionary(...)

Set the column prefix dictionary.

ColumnMapper.set_tag_columns([tag_columns, ...])

Set tag columns and optional tag columns.

Attributes

ColumnMapper.column_prefix_dictionary

Return the column_prefix_dictionary with numbers turned into names where possible.

ColumnMapper.sidecar_column_data

Pass through to get the sidecar ColumnMetadata.

ColumnMapper.tag_columns

Return the known tag and optional tag columns with numbers as names when possible.

ColumnMapper.__init__(sidecar=None, tag_columns=None, column_prefix_dictionary=None, optional_tag_columns=None, warn_on_missing_column=False)

Constructor for ColumnMapper.

Parameters:
  • sidecar (Sidecar) – A sidecar to gather column data from.

  • tag_columns (list) – A list of ints or strings identifying the columns that contain HED tags. Sidecar column definitions take precedence if they conflict with tag_columns.

  • column_prefix_dictionary (dict) – Dictionary with keys that are column numbers/names and values are HED tag prefixes to prepend to the tags in that column before processing.

  • optional_tag_columns (list) – A list of ints or strings identifying columns that contain HED tags when present. If such a column is otherwise unspecified, it is treated as a HEDTags column.

  • warn_on_missing_column (bool) – If True, issue mapping warnings on column names that are missing from the sidecar.

Notes

  • All column numbers are 0-based.

  • The column_prefix_dictionary may be deprecated or renamed in the future.
    • Its entries are no longer prefixes; they are converted to value columns. For example, {"key": "Description", 1: "Label/"} becomes the value columns {"key": "Description/#", 1: "Label/#"}.
    • It is a validation issue if column 1 is named "key" in the example above.
    • As a result, these columns now accept only the value portion of a tag.
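The conversion described in the note above can be sketched in plain Python. This is an illustration only, not the library's implementation: prefixes_to_value_templates is a hypothetical helper, and stripping a trailing slash before appending "/#" is an assumption inferred from the example in the note.

```python
def prefixes_to_value_templates(column_prefix_dictionary):
    """Turn {column: prefix} into {column: value template ending in '/#'}.

    Hypothetical helper; mirrors only the example given in the notes.
    """
    templates = {}
    for column, prefix in column_prefix_dictionary.items():
        # Drop any trailing slash so "Label/" and "Label" are handled alike.
        templates[column] = prefix.rstrip("/") + "/#"
    return templates


templates = prefixes_to_value_templates({"key": "Description", 1: "Label/"})
print(templates)  # {'key': 'Description/#', 1: 'Label/#'}
```

Keys are left untouched, so a dictionary may mix column names and 0-based column numbers as in the example.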

static ColumnMapper.check_for_blank_names(column_map, allow_blank_names)

Validate there are no blank column names.

Parameters:
  • column_map (iterable) – A list of column names.

  • allow_blank_names (bool) – Blank names are flagged only if False, consistent with check_for_mapping_issues().

Returns:

A list of dicts, one per issue.

Return type:

list

ColumnMapper.check_for_mapping_issues(allow_blank_names=False)

Find all issues given the current column_map, tag_columns, etc.

Parameters:

allow_blank_names (bool) – Only flag blank names if False.

Returns:

All issues found as a list of dicts.

Return type:

list of dict
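As a rough illustration of the blank-name part of this check, the sketch below scans a list of column names and returns one dict per blank entry. It assumes blank names are flagged only when allow_blank_names is False, as stated above. The function name find_blank_names, the issue keys "code" and "column", and the treatment of whitespace-only names as blank are all hypothetical and are not taken from the library.

```python
def find_blank_names(column_names, allow_blank_names=False):
    """Return one issue dict per blank column name (hypothetical issue format)."""
    if allow_blank_names:
        # Blank names are permitted, so there is nothing to report.
        return []
    issues = []
    for column_number, name in enumerate(column_names):
        # Treat None, empty, and whitespace-only names as blank.
        if name is None or not str(name).strip():
            issues.append({"code": "BLANK_COLUMN_NAME", "column": column_number})
    return issues


issues = find_blank_names(["onset", "", "HED", None])
print(issues)
```

Column numbers in the reported issues are 0-based, matching the convention used throughout this class.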

ColumnMapper.get_column_mapping_issues()

Get all the issues with finalizing the column mapping (duplicate columns, missing required columns, etc.).

Notes

  • This method is deprecated and is now a wrapper for check_for_mapping_issues().

Returns:

A list of dictionaries, one for each issue found while mapping column names to numbers.

Return type:

list

ColumnMapper.get_def_dict(hed_schema, extra_def_dicts=None)

Return def dicts from every column description.

Parameters:
  • hed_schema (Schema) – A HED schema object to use for extracting definitions.

  • extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.

Returns:

A single definition dict representing all the data (including any extra def dicts).

Return type:

DefinitionDict

ColumnMapper.get_tag_columns()

Return the column numbers or names that are mapped to be HedTags.

Note: This is NOT the tag_columns or optional_tag_columns parameter, though they set it.

Returns:

A list of the column numbers or names that are mapped to ColumnType.HedTags. Integer entries are 0-based column numbers; all other entries are column names.

Return type:

list

ColumnMapper.get_transformers()

Return the transformers to use on a dataframe.

Returns:

A tuple with two elements:

  • dict({str or int: func}) – The functions to use to transform each column.

  • need_categorical (list of int) – A list of columns to treat as categorical.

Return type:

tuple(dict, list)
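To show how a dictionary of per-column functions like the one returned here might be applied, the sketch below uses a plain dict of column lists as a stand-in for a dataframe. The helper apply_transformers and the sample data are hypothetical. The sketch does not cover need_categorical, and it does not claim to reproduce how the library applies these functions.

```python
def apply_transformers(table, transformers):
    """Apply each column's function to every cell in that column.

    table: dict mapping a column name or number to a list of cell values
           (a simple stand-in for a dataframe).
    transformers: dict mapping a column name or number to a one-argument function.
    """
    result = dict(table)
    for column, func in transformers.items():
        # Columns without a matching transformer are left unchanged.
        if column in result:
            result[column] = [func(value) for value in result[column]]
    return result


table = {"onset": ["0.5", "1.2"], "Label": ["go", "stop"]}
transformers = {"Label": lambda value: f"Label/{value}"}
transformed = apply_transformers(table, transformers)
print(transformed["Label"])  # ['Label/go', 'Label/stop']
```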

ColumnMapper.set_column_map(new_column_map=None)

Set the column number to name mapping.

Parameters:

new_column_map (list or dict) – Either an ordered list of the column names or a dictionary mapping column number to column name. In both cases, column numbers start at 0.

Returns:

List of issues. Each issue is a dictionary.

Return type:

list
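The two accepted input forms describe the same number-to-name mapping. A minimal sketch of that equivalence, using a hypothetical helper normalize_column_map that is not part of the library (treating None as an empty mapping is also an assumption):

```python
def normalize_column_map(new_column_map=None):
    """Return a {column_number: column_name} dict from a list or a dict."""
    if new_column_map is None:
        # Assumption: no map supplied means an empty mapping.
        return {}
    if isinstance(new_column_map, dict):
        # Already number -> name; copy so the caller's dict is not shared.
        return dict(new_column_map)
    # An ordered list of names: positions become 0-based column numbers.
    return dict(enumerate(new_column_map))


from_list = normalize_column_map(["onset", "duration", "HED"])
from_dict = normalize_column_map({0: "onset", 1: "duration", 2: "HED"})
print(from_list == from_dict)  # True
```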

ColumnMapper.set_column_prefix_dictionary(column_prefix_dictionary, finalize_mapping=True)

Set the column prefix dictionary.

ColumnMapper.set_tag_columns(tag_columns=None, optional_tag_columns=None, finalize_mapping=True)

Set tag columns and optional tag columns.

Parameters:
  • tag_columns (list) – A list of ints or strings identifying the columns that contain HED tags. If None, clears the existing tag_columns.

  • optional_tag_columns (list) – A list of ints or strings identifying columns that contain HED tags but whose absence is not an error. If None, clears the existing optional_tag_columns.

  • finalize_mapping (bool) – Re-generate the internal mapping if True, otherwise no effect until finalize.

ColumnMapper.column_prefix_dictionary

Return the column_prefix_dictionary with numbers turned into names where possible.

Returns:

A column_prefix_dictionary with column labels as keys.

Return type:

dict

ColumnMapper.sidecar_column_data

Pass through to get the sidecar ColumnMetadata.

Returns:

The column metadata defined by this sidecar.

Return type:

dict({str: ColumnMetadata})

ColumnMapper.tag_columns

Return the known tag and optional tag columns with numbers as names when possible.

Returns:

A list of all tag and optional tag columns as labels.

Return type:

list of str or int
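The phrase "numbers as names when possible" can be pictured with a small sketch: a column number is replaced by its name when the column map knows that number, and is otherwise left as is. The helper columns_as_labels and the sample column map are hypothetical and only illustrate the idea.

```python
def columns_as_labels(columns, column_map):
    """Replace column numbers with names wherever column_map has an entry."""
    labels = []
    for column in columns:
        if isinstance(column, int):
            # Fall back to the number itself when no name is known.
            labels.append(column_map.get(column, column))
        else:
            labels.append(column)
    return labels


column_map = {0: "onset", 2: "HED"}
labels = columns_as_labels([2, "extra", 5], column_map)
print(labels)  # ['HED', 'extra', 5]
```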