ColumnMapper

class ColumnMapper(sidecar=None, tag_columns=None, column_prefix_dictionary=None, optional_tag_columns=None, warn_on_missing_column=False)[source]

Mapping of a base input file columns into HED tags.

Notes

  • All column numbers are 0 based.

Methods

hed.models.column_mapper.ColumnMapper.__init__([...])

Constructor for ColumnMapper.

hed.models.column_mapper.ColumnMapper.check_for_blank_names(...)

Validate there are no blank column names

hed.models.column_mapper.ColumnMapper.check_for_mapping_issues([...])

Find all issues given the current column_map, tag_columns, etc.

hed.models.column_mapper.ColumnMapper.get_column_mapping_issues()

Get all the issues with finalizing column mapping(duplicate columns, missing required, etc)

hed.models.column_mapper.ColumnMapper.get_def_dict(...)

Return def dicts from every column description.

hed.models.column_mapper.ColumnMapper.get_tag_columns()

Returns the column numbers or names that are mapped to be HedTags

hed.models.column_mapper.ColumnMapper.get_transformers()

Return the transformers to use on a dataframe

hed.models.column_mapper.ColumnMapper.set_column_map([...])

Set the column number to name mapping.

hed.models.column_mapper.ColumnMapper.set_column_prefix_dictionary(...)

Sets the column prefix dictionary

hed.models.column_mapper.ColumnMapper.set_tag_columns([...])

Set tag columns and optional tag columns

Attributes

hed.models.column_mapper.ColumnMapper.column_prefix_dictionary

Returns the column_prefix_dictionary with numbers turned into names where possible

hed.models.column_mapper.ColumnMapper.sidecar_column_data

Pass through to get the sidecar ColumnMetadata

hed.models.column_mapper.ColumnMapper.tag_columns

Returns the known tag and optional tag columns with numbers as names when possible

ColumnMapper.__init__(sidecar=None, tag_columns=None, column_prefix_dictionary=None, optional_tag_columns=None, warn_on_missing_column=False)[source]

Constructor for ColumnMapper.

Parameters:
  • sidecar (Sidecar) – A sidecar to gather column data from.

  • tag_columns – (list): A list of ints or strings containing the columns that contain the HED tags. Sidecar column definitions will take precedent if there is a conflict with tag_columns.

  • column_prefix_dictionary (dict) – Dictionary with keys that are column numbers/names and values are HED tag prefixes to prepend to the tags in that column before processing.

  • optional_tag_columns (list) – A list of ints or strings containing the columns that contain the HED tags. If the column is otherwise unspecified, convert this column type to HEDTags.

  • warn_on_missing_column (bool) – If True, issue mapping warnings on column names that are missing from the sidecar.

Notes

  • All column numbers are 0 based.

  • The column_prefix_dictionary may be deprecated/renamed in the future.
    • These are no longer prefixes, but rather converted to value columns: {“key”: “Description”, 1: “Label/”} will turn into value columns as {“key”: “Description/#”, 1: “Label/#”} It will be a validation issue if column 1 is called “key” in the above example. This means it no longer accepts anything but the value portion only in the columns.

static ColumnMapper.check_for_blank_names(column_map, allow_blank_names)[source]

Validate there are no blank column names

Parameters:
  • column_map (iterable) – A list of column names

  • allow_blank_names (bool) – Only find issues if this is true

Returns:

A list of dicts, one per issue.

Return type:

issues(list)

ColumnMapper.check_for_mapping_issues(allow_blank_names=False)[source]

Find all issues given the current column_map, tag_columns, etc.

Parameters:

allow_blank_names (bool) – Only flag blank names if False

Returns:

Returns all issues found as a list of dicts

Return type:

issue_list(list of dict)

ColumnMapper.get_column_mapping_issues()[source]

Get all the issues with finalizing column mapping(duplicate columns, missing required, etc)

Notes

  • This is deprecated and now a wrapper for “check_for_mapping_issues()”

Returns:

A list dictionaries of all issues found from mapping column names to numbers.

Return type:

list

ColumnMapper.get_def_dict(hed_schema, extra_def_dicts=None)[source]

Return def dicts from every column description.

Parameters:
  • hed_schema (Schema) – A HED schema object to use for extracting definitions.

  • extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.

Returns:

A single definition dict representing all the data(and extra def dicts)

Return type:

DefinitionDict

ColumnMapper.get_tag_columns()[source]

Returns the column numbers or names that are mapped to be HedTags

Note: This is NOT the tag_columns or optional_tag_columns parameter, though they set it.

Returns:

A list of column numbers or names that are ColumnType.HedTags.

0-based if integer-based, otherwise column name.

Return type:

column_identifiers(list)

ColumnMapper.get_transformers()[source]

Return the transformers to use on a dataframe

Returns:

dict({str or int: func}): the functions to use to transform each column need_categorical(list of int): a list of columns to treat as categoriacl

Return type:

tuple(dict, list)

ColumnMapper.set_column_map(new_column_map=None)[source]

Set the column number to name mapping.

Parameters:

new_column_map (list or dict) – Either an ordered list of the column names or column_number:column name dictionary. In both cases, column numbers start at 0

Returns:

List of issues. Each issue is a dictionary.

Return type:

list

ColumnMapper.set_column_prefix_dictionary(column_prefix_dictionary, finalize_mapping=True)[source]

Sets the column prefix dictionary

ColumnMapper.set_tag_columns(tag_columns=None, optional_tag_columns=None, finalize_mapping=True)[source]

Set tag columns and optional tag columns

Parameters:
  • tag_columns (list) – A list of ints or strings containing the columns that contain the HED tags. If None, clears existing tag_columns

  • optional_tag_columns (list) – A list of ints or strings containing the columns that contain the HED tags, but not an error if missing. If None, clears existing tag_columns

  • finalize_mapping (bool) – Re-generate the internal mapping if True, otherwise no effect until finalize.

ColumnMapper.column_prefix_dictionary

Returns the column_prefix_dictionary with numbers turned into names where possible

Returns:

A column_prefix_dictionary with column labels as keys

Return type:

column_prefix_dictionary(list of str or int)

ColumnMapper.sidecar_column_data

Pass through to get the sidecar ColumnMetadata

Returns:

ColumnMetadata}): the column metadata defined by this sidecar

Return type:

dict({str

ColumnMapper.tag_columns

Returns the known tag and optional tag columns with numbers as names when possible

Returns:

A list of all tag and optional tag columns as labels

Return type:

tag_columns(list of str or int)