ColumnMapper
- class ColumnMapper(sidecar=None, tag_columns=None, column_prefix_dictionary=None, optional_tag_columns=None, warn_on_missing_column=False)
Maps the columns of a base input file into HED tags.
Notes
All column numbers are 0-based.
Methods
__init__() | Constructor for ColumnMapper.
check_for_blank_names() | Validate that there are no blank column names.
check_for_mapping_issues() | Find all issues given the current column_map, tag_columns, etc.
get_column_mapping_issues() | Get all the issues with finalizing column mapping (duplicate columns, missing required, etc.).
get_def_dict() | Return def dicts from every column description.
get_tag_columns() | Return the column numbers or names that are mapped to be HedTags.
get_transformers() | Return the transformers to use on a dataframe.
set_column_map() | Set the column number to name mapping.
set_column_prefix_dictionary() | Set the column prefix dictionary.
set_tag_columns() | Set tag columns and optional tag columns.
Attributes
column_prefix_dictionary | Return the column_prefix_dictionary with numbers turned into names where possible.
sidecar_column_data | Pass through to get the sidecar ColumnMetadata.
tag_columns | Return the known tag and optional tag columns with numbers as names when possible.
- ColumnMapper.__init__(sidecar=None, tag_columns=None, column_prefix_dictionary=None, optional_tag_columns=None, warn_on_missing_column=False)
Constructor for ColumnMapper.
- Parameters:
sidecar (Sidecar) – A sidecar to gather column data from.
tag_columns (list) – A list of ints or strings identifying the columns that contain HED tags. Sidecar column definitions take precedence if they conflict with tag_columns.
column_prefix_dictionary (dict) – Dictionary with keys that are column numbers/names and values are HED tag prefixes to prepend to the tags in that column before processing.
optional_tag_columns (list) – A list of ints or strings identifying columns that contain HED tags but need not be present. If such a column is otherwise unspecified, it is treated as a HedTags column.
warn_on_missing_column (bool) – If True, issue mapping warnings on column names that are missing from the sidecar.
Notes
All column numbers are 0-based.
- The column_prefix_dictionary may be deprecated or renamed in the future.
Its entries are no longer treated as prefixes; they are converted to value columns. For example, {“key”: “Description”, 1: “Label/”} becomes the value columns {“key”: “Description/#”, 1: “Label/#”}. In this example, it is a validation issue if column 1 is named “key”, since both keys would then refer to the same column. As a result, these columns may now contain only the value portion of a tag.
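The conversion described in the note can be sketched in plain Python. This is only an illustration of the documented example, not the library's code: the helper name prefixes_to_value_templates is hypothetical, and treating a trailing slash as optional is an assumption drawn from the two entries shown above.

```python
def prefixes_to_value_templates(column_prefix_dictionary):
    """Hypothetical helper: turn each prefix into a '#' value template."""
    templates = {}
    for column, prefix in column_prefix_dictionary.items():
        # Drop any trailing slash so "Label/" and "Label" both become "Label/#".
        templates[column] = prefix.rstrip("/") + "/#"
    return templates


prefixes = {"key": "Description", 1: "Label/"}
templates = prefixes_to_value_templates(prefixes)
print(templates)

# A cell holding only the value portion fills the '#' placeholder.
filled = templates[1].replace("#", "Square")
print(filled)
```

Under this reading, a cell in column 1 should hold only the value ("Square"), not an already prefixed tag.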
- static ColumnMapper.check_for_blank_names(column_map, allow_blank_names)
Validate that there are no blank column names.
- Parameters:
column_map (iterable) – A list of column names
allow_blank_names (bool) – If True, blank names are allowed and no issues are reported; blank names are flagged only if this is False.
- Returns:
A list of dicts, one per issue.
- Return type:
list
- ColumnMapper.check_for_mapping_issues(allow_blank_names=False)
Find all issues given the current column_map, tag_columns, etc.
- Parameters:
allow_blank_names (bool) – Blank column names are flagged only if this is False.
- Returns:
All issues found, as a list of dicts.
- Return type:
list of dict
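A minimal sketch of a blank-name check, assuming blank names are flagged only when allow_blank_names is False, as stated for check_for_mapping_issues. The function find_blank_names and the keys of its issue dicts are hypothetical stand-ins; the library's actual issue dictionaries may look different.

```python
def find_blank_names(column_map, allow_blank_names):
    """Simplified stand-in for a blank-column-name check (not the library's code)."""
    if allow_blank_names:
        # Blank names are permitted, so there is nothing to report.
        return []
    issues = []
    for column_number, name in enumerate(column_map):
        if name is None or not str(name).strip():
            # These keys are illustrative; real issue dicts may differ.
            issues.append({"column": column_number, "message": "Blank column name"})
    return issues


names = ["onset", "", "trial_type", None]
blank_issues = find_blank_names(names, allow_blank_names=False)
print(blank_issues)
print(find_blank_names(names, allow_blank_names=True))
```

Column numbers in the reported issues are 0-based, matching the convention used throughout this class.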
- ColumnMapper.get_column_mapping_issues()
Get all the issues with finalizing column mapping (duplicate columns, missing required, etc.).
Notes
This method is deprecated and is now a wrapper for “check_for_mapping_issues()”.
- Returns:
A list of dictionaries, one for each issue found while mapping column names to numbers.
- Return type:
list
- ColumnMapper.get_def_dict(hed_schema, extra_def_dicts=None)
Return def dicts from every column description.
- Parameters:
hed_schema (Schema) – A HED schema object to use for extracting definitions.
extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.
- Returns:
A single definition dict representing all the data (and extra def dicts).
- Return type:
DefinitionDict
- ColumnMapper.get_tag_columns()
Return the column numbers or names that are mapped to be HedTags.
Note: This is NOT the tag_columns or optional_tag_columns parameter, though they set it.
- Returns:
A list of the column numbers or names whose type is ColumnType.HedTags. Integer entries are 0-based column numbers; all other entries are column names.
- Return type:
list
- ColumnMapper.get_transformers()
Return the transformers to use on a dataframe.
- Returns:
A tuple of two items: a dict ({str or int: func}) of the functions used to transform each column, and need_categorical (list of int), a list of columns to treat as categorical.
- Return type:
tuple(dict, list)
- ColumnMapper.set_column_map(new_column_map=None)
Set the column number to name mapping.
- Parameters:
new_column_map (list or dict) – Either an ordered list of the column names or a dictionary of column_number: column_name pairs. In both cases, column numbers start at 0.
- Returns:
List of issues. Each issue is a dictionary.
- Return type:
list
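The two accepted forms of new_column_map describe the same mapping. The sketch below shows that equivalence in plain Python; normalize_column_map is a hypothetical helper, and returning an empty mapping for None is an assumption rather than documented behavior.

```python
def normalize_column_map(new_column_map):
    """Hypothetical helper: return a {column_number: column_name} dict."""
    if new_column_map is None:
        # Assumption: no map means no known column names.
        return {}
    if isinstance(new_column_map, dict):
        return dict(new_column_map)
    # An ordered list assigns numbers by position, starting at 0.
    return dict(enumerate(new_column_map))


from_list = normalize_column_map(["onset", "duration", "trial_type"])
from_dict = normalize_column_map({0: "onset", 1: "duration", 2: "trial_type"})
print(from_list)
print(from_list == from_dict)
```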
- ColumnMapper.set_column_prefix_dictionary(column_prefix_dictionary, finalize_mapping=True)
Set the column prefix dictionary.
- ColumnMapper.set_tag_columns(tag_columns=None, optional_tag_columns=None, finalize_mapping=True)
Set tag columns and optional tag columns.
- Parameters:
tag_columns (list) – A list of ints or strings identifying the columns that contain HED tags. If None, clears the existing tag_columns.
optional_tag_columns (list) – A list of ints or strings identifying columns that contain HED tags; it is not an error if these columns are missing. If None, clears the existing optional_tag_columns.
finalize_mapping (bool) – Re-generate the internal mapping if True, otherwise no effect until finalize.
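The difference between the two lists is what happens when a listed column is absent from the file. The following sketch illustrates that distinction only; split_missing_columns is a hypothetical function, and how the library actually reports a missing required column is not shown here.

```python
def split_missing_columns(column_names, tag_columns, optional_tag_columns):
    """Hypothetical check: which requested tag columns are absent from the file?"""

    def is_present(column):
        # Integers are 0-based positions; strings are column names.
        if isinstance(column, int):
            return 0 <= column < len(column_names)
        return column in column_names

    required_missing = [c for c in tag_columns if not is_present(c)]
    optional_missing = [c for c in optional_tag_columns if not is_present(c)]
    return required_missing, optional_missing


header = ["onset", "duration", "HED"]
required_missing, optional_missing = split_missing_columns(
    header, tag_columns=["HED", 5], optional_tag_columns=["notes"]
)
print(required_missing)  # candidates for a reported issue
print(optional_missing)  # absent, but not an error
```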
- ColumnMapper.column_prefix_dictionary
Return the column_prefix_dictionary with numbers turned into names where possible.
- Returns:
A column_prefix_dictionary with column labels as keys.
- Return type:
dict (keys are str or int)
- ColumnMapper.sidecar_column_data
Pass through to get the sidecar ColumnMetadata.
- Returns:
The column metadata defined by this sidecar.
- Return type:
dict({str: ColumnMetadata})
- ColumnMapper.tag_columns
Return the known tag and optional tag columns with numbers as names when possible.
- Returns:
A list of all tag and optional tag columns as labels.
- Return type:
list of str or int
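"Numbers as names when possible" can be read as a lookup against the current column map, with unknown numbers left unchanged. The sketch below shows that reading; columns_as_labels is a hypothetical helper and the fallback for unmapped numbers is an assumption.

```python
def columns_as_labels(columns, column_map):
    """Hypothetical helper: replace column numbers with names when known."""
    labels = []
    for column in columns:
        if isinstance(column, int):
            # Fall back to the number itself if the map has no name for it.
            labels.append(column_map.get(column, column))
        else:
            labels.append(column)
    return labels


column_map = {0: "onset", 1: "duration", 2: "HED"}
labels = columns_as_labels([2, "trial_type", 7], column_map)
print(labels)
```

This is why the return type mixes str and int: a number stays a number when no name is known for it.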