df_util

Functions

calculate_attribute_type(attribute_entry)

Returns the type of this attribute (annotation, object, or data).

convert_filenames_to_dict(filenames[, ...])

Infers filename meaning based on suffix, e.g. _Tag for the tags sheet.

create_empty_dataframes()

Returns the default empty dataframes

get_attributes_from_row(row)

Get the tag attributes from a line.

get_library_name_and_id(schema)

Get the library ("Standard" for the standard schema) and first id for a schema range.

load_dataframes(filenames[, include_prefix_dfs])

Load the dataframes from the source folder or series of files.

remove_prefix(text, prefix)

save_dataframes(base_filename, dataframe_dict)

Writes out the dataframes using the provided suffixes.

calculate_attribute_type(attribute_entry)

Returns the type of this attribute (annotation, object, or data).

Returns:

The attribute type: “annotation”, “object”, or “data”.

Return type:

str

convert_filenames_to_dict(filenames, include_prefix_dfs=False)

Infers filename meaning based on suffix, e.g. _Tag for the tags sheet.

Parameters:
  • filenames (str or None or list or dict) – The filenames to convert to a dict. If a string with a .tsv suffix: use that location, adding the suffix to each .tsv file. If a string with no .tsv suffix: use that folder, with the contents being the separate .tsv files.

  • include_prefix_dfs (bool) – If True, include the prefixes and external annotation dataframes.

Returns:

The required suffix-to-filename mapping (filename_dict).

Return type:

dict of str: str
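
The naming rule described above can be sketched without the library. This is a minimal illustration, not the actual implementation: the function name sketch_filenames_to_dict and the two-entry suffix list are hypothetical (the real names live in DF_SUFFIXES), and it assumes a file-style base has its .tsv ending dropped before the suffix is appended. Only the single-string form of filenames is covered.

```python
from pathlib import PurePosixPath

# Hypothetical suffix list for illustration; the real names live in DF_SUFFIXES.
EXAMPLE_SUFFIXES = ["Tag", "Unit"]


def sketch_filenames_to_dict(base, suffixes=EXAMPLE_SUFFIXES):
    """Map each suffix to a .tsv path derived from a single base string."""
    path = PurePosixPath(base)
    if path.suffix == ".tsv":
        # File-style base (assumption): drop ".tsv", then append "_{suffix}.tsv".
        stem = str(path.with_suffix(""))
    else:
        # Folder-style base: files go inside the folder and reuse its name.
        stem = str(path / path.name)
    return {suffix: f"{stem}_{suffix}.tsv" for suffix in suffixes}


print(sketch_filenames_to_dict("out/HED8.3.0.tsv"))
print(sketch_filenames_to_dict("HED8.3.0"))
```

The folder-style call yields paths of the form HED8.3.0/HED8.3.0_Tag.tsv, matching the example given for save_dataframes.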

create_empty_dataframes()

Returns the default empty dataframes

get_attributes_from_row(row)

Get the tag attributes from a line.

Parameters:

row (pd.Series) – A tag line.

Returns:

Dictionary of attributes.

Return type:

dict

get_library_name_and_id(schema)

Get the library (“Standard” for the standard schema) and first id for a schema range.

Parameters:

schema (HedSchema) – The schema to check

Returns:

  • library_name (str) – The capitalized library name.

  • first_id (int) – The first id for the given library.

load_dataframes(filenames, include_prefix_dfs=False)

Load the dataframes from the source folder or series of files.

Parameters:
  • filenames (str or None or list or dict) – The input filenames. If a string with a .tsv suffix: load from that location, adding the suffix to each .tsv file. If a string with no .tsv suffix: load from that folder, with the contents being the separate .tsv files.

  • include_prefix_dfs (bool) – If True, include the prefixes and external annotation dataframes.

Returns:

The suffix-to-dataframe dict (dataframes_dict).

Return type:

dict of str: dataframe

remove_prefix(text, prefix)

save_dataframes(base_filename, dataframe_dict)

Writes out the dataframes using the provided suffixes.

Does not validate contents or suffixes.

If base_filename has a .tsv suffix, save directly to the indicated location. If base_filename is a directory (does NOT have a .tsv suffix), save the contents into a directory of that name. The subfiles share the directory's name, e.g. HED8.3.0/HED8.3.0_Tag.tsv

Parameters:
  • base_filename (str) – The base filename to use. Output is {base_filename}_{suffix}.tsv. See DF_SUFFIXES for all expected names.

  • dataframe_dict (dict of str: pd.DataFrame) – The dataframes to save out, keyed by suffix. No validation is done.
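
The output layout described above can be sketched using only the standard library. This is a minimal sketch under stated assumptions, not the library's code: sketch_save_tables is a hypothetical name, plain lists of rows stand in for the dataframes, the column names "name" and "parent" are invented for illustration, and a file-style base is assumed to have its .tsv ending dropped before the suffix is appended.

```python
import csv
import os
import tempfile


def sketch_save_tables(base_filename, table_dict):
    """Write each table to {base}_{suffix}.tsv and return the paths written."""
    if base_filename.endswith(".tsv"):
        # File-style base (assumption): strip ".tsv" before adding the suffix.
        stem = base_filename[: -len(".tsv")]
    else:
        # Directory-style base: create it and reuse its name for the files.
        folder = os.path.normpath(base_filename)
        os.makedirs(folder, exist_ok=True)
        stem = os.path.join(folder, os.path.basename(folder))
    written = []
    for suffix, rows in table_dict.items():
        path = f"{stem}_{suffix}.tsv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            # Tab-delimited output; contents are written as-is, unvalidated.
            csv.writer(handle, delimiter="\t").writerows(rows)
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "HED8.3.0")
    # Hypothetical table contents: a header row and one data row.
    paths = sketch_save_tables(base, {"Tag": [["name", "parent"], ["Event", ""]]})
    names = [os.path.relpath(p, tmp) for p in paths]
    with open(paths[0], encoding="utf-8") as handle:
        first_line = handle.readline().rstrip("\n")
    print(names, first_line)
```

Like the documented function, the sketch performs no validation of contents or suffixes; whatever keys appear in the dict become file suffixes.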