BaseInput

class BaseInput(file, file_type=None, worksheet_name=None, has_column_names=True, mapper=None, name=None, allow_blank_names=True): Superclass representing a basic columnar file.

Methods

`hed.models.base_input.BaseInput.__init__`(file)	Constructor for the BaseInput class.
`hed.models.base_input.BaseInput.assemble`([...])	Assembles the hed strings
`hed.models.base_input.BaseInput.column_metadata`()	Get the metadata for each column
`hed.models.base_input.BaseInput.combine_dataframe`(...)	Combines all columns in the given dataframe into a single HED string series, skipping empty columns and columns with empty strings.
`hed.models.base_input.BaseInput.convert_to_form`(...)	Convert all tags in underlying dataframe to the specified form.
`hed.models.base_input.BaseInput.convert_to_long`(...)	Convert all tags in underlying dataframe to long form.
`hed.models.base_input.BaseInput.convert_to_short`(...)	Convert all tags in underlying dataframe to short form.
`hed.models.base_input.BaseInput.expand_defs`(...)	Expands any def tags found in the underlying dataframe.
`hed.models.base_input.BaseInput.get_column_refs`()	Returns a list of column refs for this file.
`hed.models.base_input.BaseInput.get_def_dict`(...)	Returns the definition dict for this file
`hed.models.base_input.BaseInput.get_worksheet`([...])	Get the requested worksheet.
`hed.models.base_input.BaseInput.reset_mapper`(...)	Set mapper to a different view of the file.
`hed.models.base_input.BaseInput.set_cell`(...)	Replace the specified cell with transformed text.
`hed.models.base_input.BaseInput.shrink_defs`(...)	Shrinks any def-expand found in the underlying dataframe.
`hed.models.base_input.BaseInput.to_csv`([file])	Write to file or return as a string.
`hed.models.base_input.BaseInput.to_excel`(file)	Output to an Excel file.
`hed.models.base_input.BaseInput.validate`(...)	Creates a SpreadsheetValidator and returns all issues with this file.

Attributes

`hed.models.base_input.BaseInput.COMMA_DELIMITER`
`hed.models.base_input.BaseInput.EXCEL_EXTENSION`
`hed.models.base_input.BaseInput.FILE_EXTENSION`
`hed.models.base_input.BaseInput.FILE_INPUT`
`hed.models.base_input.BaseInput.STRING_INPUT`
`hed.models.base_input.BaseInput.TAB_DELIMITER`
`hed.models.base_input.BaseInput.TEXT_EXTENSION`
`hed.models.base_input.BaseInput.columns`	Returns a list of the column names.
`hed.models.base_input.BaseInput.dataframe`	The underlying dataframe.
`hed.models.base_input.BaseInput.dataframe_a`	Return the assembled dataframe
`hed.models.base_input.BaseInput.has_column_names`	True if dataframe has column names.
`hed.models.base_input.BaseInput.loaded_workbook`	The underlying loaded workbooks.
`hed.models.base_input.BaseInput.name`	Name of the data.
`hed.models.base_input.BaseInput.onsets`	Returns the onset column if it exists
`hed.models.base_input.BaseInput.series_a`	Return the assembled dataframe as a series
`hed.models.base_input.BaseInput.series_filtered`	Return the assembled dataframe as a series, with rows that have the same onset combined
`hed.models.base_input.BaseInput.worksheet_name`	The worksheet name.

BaseInput.__init__(file, file_type=None, worksheet_name=None, has_column_names=True, mapper=None, name=None, allow_blank_names=True)

Constructor for the BaseInput class.

Parameters:

file (str or file-like or pandas dataframe) – An xlsx/tsv file to open.
file_type (str or None) – “.xlsx” (Excel), “.tsv” or “.txt” (tab-separated text). Derived from file if file is a filename. Ignored if file is a pandas dataframe.
worksheet_name (str or None) – Name of the Excel worksheet to use. (Not applicable to tsv files.)
has_column_names (bool) – True if file has column names. This value is ignored if you pass in a pandas dataframe.
mapper (ColumnMapper or None) – Indicates which columns have HED tags. See SpreadsheetInput or TabularInput for examples of how to use a built-in ColumnMapper.
name (str or None) – Optional field for how this file will report errors.
allow_blank_names (bool) – If True, column names can be blank

Raises:

HedFileError –

file is blank
An invalid dataframe was passed with size 0
An invalid extension was provided
A duplicate or empty column name appears
Cannot open the indicated file
The specified worksheet name does not exist
If the sidecar file or tabular file had invalid format and could not be read.
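
The file_type rule described above (derived from the filename when omitted, and limited to the extensions listed in FILE_EXTENSION) can be sketched in isolation. This is a minimal standalone illustration, not the library's code: `resolve_file_type` and `is_excel` are hypothetical helpers, and the sketch raises ValueError where BaseInput is documented to raise HedFileError.

```python
import os

# Values copied from the class attributes documented on this page.
EXCEL_EXTENSION = [".xlsx"]
TEXT_EXTENSION = [".tsv", ".txt"]
FILE_EXTENSION = [*TEXT_EXTENSION, *EXCEL_EXTENSION]


def resolve_file_type(file, file_type=None):
    """Hypothetical helper: choose a file type as the parameter docs describe."""
    if file_type is None and isinstance(file, str):
        # Derive the type from the filename's extension.
        file_type = os.path.splitext(file)[1]
    if file_type not in FILE_EXTENSION:
        # BaseInput is documented to raise HedFileError; ValueError is a stand-in.
        raise ValueError(f"Unsupported or missing file type: {file_type!r}")
    return file_type


def is_excel(file_type):
    """True when worksheet_name is applicable (Excel input only)."""
    return file_type in EXCEL_EXTENSION


print(resolve_file_type("events.tsv"))
print(is_excel(resolve_file_type("sheet.xlsx")))
```

An explicit file_type takes precedence over the filename in this sketch, which matches the "derived from file" wording only when file_type is left as None.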

BaseInput.assemble(mapper=None, skip_curly_braces=False)

Assembles the HED strings.

Parameters:

mapper (ColumnMapper or None) – Generally pass None here unless you want special behavior.
skip_curly_braces (bool) – If True, don’t plug curly brace values into columns.

Returns:

the assembled dataframe

Return type:

Dataframe

BaseInput.column_metadata()

Get the metadata for each column

Returns:: number/ColumnMeta pairs
Return type:: dict

static BaseInput.combine_dataframe(dataframe)

Combines all columns in the given dataframe into a single HED string series, skipping empty columns and columns with empty strings.

Parameters:: dataframe (Dataframe) – The dataframe to combine
Returns:: the assembled series
Return type:: Series
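
The combining rule can be pictured without pandas. The sketch below is an illustration under stated assumptions, not the library's implementation: `combine_rows` is a hypothetical helper, plain lists of strings stand in for dataframe rows, and the ", " separator is assumed.

```python
def combine_rows(rows):
    """Join the non-empty cells of each row into one string.

    `rows` is a list of lists of strings standing in for dataframe rows.
    The ", " separator is an assumption made for this sketch.
    """
    return [", ".join(cell for cell in row if cell) for row in rows]


rows = [
    ["Sensory-event", "", "Red"],   # the empty middle cell is skipped
    ["", "", ""],                   # an all-empty row yields an empty string
    ["Agent-action", "Press", ""],
]
combined = combine_rows(rows)
for line in combined:
    print(repr(line))
```

Each input row produces exactly one output string, so the result has the same length as the input, as a Series built from a dataframe would.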

BaseInput.convert_to_form(hed_schema, tag_form)

Convert all tags in underlying dataframe to the specified form.

Parameters:

hed_schema (HedSchema) – The schema to use to convert tags.
tag_form (str) – HedTag property to convert tags to. Most cases should use convert_to_short or convert_to_long below.

BaseInput.convert_to_long(hed_schema)

Convert all tags in underlying dataframe to long form.

Parameters:: hed_schema (HedSchema or None) – The schema to use to convert tags.

BaseInput.convert_to_short(hed_schema)

Convert all tags in underlying dataframe to short form.

Parameters:: hed_schema (HedSchema) – The schema to use to convert tags.

BaseInput.expand_defs(hed_schema, def_dict)

Expands any def tags found in the underlying dataframe.

Parameters:

hed_schema (HedSchema or None) – The schema to use to identify defs
def_dict (DefinitionDict) – The definitions to expand

BaseInput.get_column_refs()

Returns a list of column refs for this file.

Default implementation returns none.

Returns:: A list of unique column refs found
Return type:: column_refs(list)

BaseInput.get_def_dict(hed_schema, extra_def_dicts=None)

Returns the definition dict for this file.

Note: The base class implementation returns just extra_def_dicts.

Parameters:

hed_schema (HedSchema) – Used to identify tags to find definitions (if needed).
extra_def_dicts (list, DefinitionDict, or None) – Extra dicts to add to the list.

Returns:

A single definition dict representing all the data (and extra def dicts).

Return type:

DefinitionDict

BaseInput.get_worksheet(worksheet_name=None)

Get the requested worksheet.

Parameters:: worksheet_name (str or None) – The name of the requested worksheet, or None for the first worksheet.
Returns:: The requested worksheet.
Return type:: openpyxl.workbook.Workbook

Notes

If None, returns the first worksheet.

Raises:

KeyError –

The specified worksheet name does not exist

BaseInput.reset_mapper(new_mapper)

Set mapper to a different view of the file.

Parameters:: new_mapper (ColumnMapper) – A column mapper to be associated with this base input.

BaseInput.set_cell(row_number, column_number, new_string_obj, tag_form='short_tag')

Replace the specified cell with transformed text.

Parameters:

row_number (int) – The row number of the spreadsheet to set.
column_number (int) – The column number of the spreadsheet to set.
new_string_obj (HedString) – Object with text to put in the given cell.
tag_form (str) – Version of the tags (short_tag, long_tag, base_tag, etc.).

Notes

Any attribute of a HedTag that returns a string is a valid value of tag_form.

Raises:

ValueError –
- There is not a loaded dataframe
KeyError –
- the indicated row/column does not exist
AttributeError –
- The indicated tag_form is not an attribute of HedTag
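
The note that any string-valued HedTag attribute is a valid tag_form suggests attribute lookup by name. The sketch below shows that pattern with a stand-in class; `DemoTag` and `render` are hypothetical, the tag values are placeholders, and whether the library resolves tag_form this way internally is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DemoTag:
    """Hypothetical stand-in for HedTag with two string-valued attributes."""
    short_tag: str
    long_tag: str


def render(tag, tag_form="short_tag"):
    # getattr raises AttributeError for an unknown tag_form, which lines up
    # with the AttributeError documented for set_cell.
    return getattr(tag, tag_form)


# Placeholder values, not taken from a real schema.
tag = DemoTag(short_tag="Red", long_tag="Example-parent/Red")

print(render(tag))
print(render(tag, "long_tag"))
```

Passing a name that the tag object does not define, such as `render(tag, "no_such_form")`, raises AttributeError in this sketch.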

BaseInput.shrink_defs(hed_schema)

Shrinks any def-expand found in the underlying dataframe.

Parameters:: hed_schema (HedSchema or None) – The schema to use to identify defs
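
For orientation, shrinking replaces a written-out definition group of the form `(Def-expand/Name, (...))` with the compact tag `Def/Name`. The standalone sketch below illustrates that text transformation on a single string. `shrink_def_expands` is a hypothetical helper: it ignores the schema and case variations, and it is not the library's implementation.

```python
def shrink_def_expands(hed_string):
    """Replace each '(Def-expand/Name, (...))' group with 'Def/Name'."""
    marker = "(Def-expand/"
    result = []
    i = 0
    while i < len(hed_string):
        if hed_string.startswith(marker, i):
            # Scan forward to the parenthesis that closes this group.
            depth = 0
            j = i
            while j < len(hed_string):
                if hed_string[j] == "(":
                    depth += 1
                elif hed_string[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            # The name is everything between the marker and the first comma.
            group = hed_string[i + len(marker):j]
            name = group.split(",", 1)[0].strip()
            result.append("Def/" + name)
            i = j + 1
        else:
            result.append(hed_string[i])
            i += 1
    return "".join(result)


print(shrink_def_expands("Red, (Def-expand/MyColor, (Blue, Green)), Square"))
```

Text outside a Def-expand group passes through unchanged, so strings without such groups are returned as-is.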

BaseInput.to_csv(file=None)

Write to file or return as a string.

Parameters:

file (str, file-like, or None) – Location to save this file. If None, return as string.

Returns:

None if file is given or the contents as a str if file is None.

Return type:

None or str

Raises:

OSError –

Cannot open the indicated file
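
The return contract (a string when file is None, otherwise None after writing) can be sketched with the standard library. `write_rows` is a hypothetical helper, not the library's method, and the tab-separated output is an assumption suggested by TAB_DELIMITER.

```python
import csv
import io


def write_rows(rows, file=None):
    """Hypothetical helper mirroring the documented to_csv contract."""
    if file is None:
        # No destination: build the text in memory and return it.
        buffer = io.StringIO()
        csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
        return buffer.getvalue()
    if isinstance(file, str):
        # A path: open() raises OSError if the file cannot be opened.
        with open(file, "w", newline="", encoding="utf-8") as handle:
            csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)
    else:
        # A file-like object: write to it directly.
        csv.writer(file, delimiter="\t", lineterminator="\n").writerows(rows)
    return None


text = write_rows([["onset", "HED"], ["0.5", "Red"]])
print(repr(text))
```

Callers that need the text in memory pass nothing; callers that pass a path or an open handle get None back and read the result from that destination.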

BaseInput.to_excel(file)

Output to an Excel file.

Parameters:

file (str or file-like) – Location to save this base input.

Raises:

ValueError –
- if empty file object was passed
OSError –
- Cannot open the indicated file

BaseInput.validate(hed_schema, extra_def_dicts=None, name=None, error_handler=None)

Creates a SpreadsheetValidator and returns all issues with this file.

Parameters:

hed_schema (HedSchema) – The schema to use for validation
extra_def_dicts (list of DefDict or DefDict) – all definitions to use for validation
name (str) – The name to report errors from this file as
error_handler (ErrorHandler) – Error context to use. Creates a new one if None

Returns:

A list of issues for hed string

Return type:

issues (list of dict)

BaseInput.COMMA_DELIMITER = ','

BaseInput.EXCEL_EXTENSION = ['.xlsx']

BaseInput.FILE_EXTENSION = ['.tsv', '.txt', '.xlsx']

BaseInput.FILE_INPUT = 'file'

BaseInput.STRING_INPUT = 'string'

BaseInput.TAB_DELIMITER = '\t'

BaseInput.TEXT_EXTENSION = ['.tsv', '.txt']

BaseInput.columns

Returns a list of the column names.

Empty if no column names.

Returns:: the column names
Return type:: columns(list)

BaseInput.dataframe: The underlying dataframe.

BaseInput.dataframe_a

Return the assembled dataframe. Probably a placeholder name.

Returns:: the assembled dataframe
Return type:: Dataframe

BaseInput.has_column_names: True if dataframe has column names.

BaseInput.loaded_workbook: The underlying loaded workbooks.

BaseInput.name: Name of the data.

BaseInput.onsets: Returns the onset column if it exists

BaseInput.series_a

Return the assembled dataframe as a series

Returns:: the assembled dataframe with columns merged
Return type:: Series

BaseInput.series_filtered

Return the assembled dataframe as a series, with rows that have the same onset combined

Returns:: the assembled dataframe with columns merged, and the rows filtered together
Return type:: Series

BaseInput.worksheet_name: The worksheet name.