analysis_util

Utilities for assembly, analysis, and searching.

Functions

assemble_hed(data_input, sidecar, schema[, ...])

Return assembled HED annotations in a dataframe.

get_expression_parsers(queries[, query_names])

Returns a list of expression parsers and query_names.

hed_to_str(contents[, remove_parentheses])

search_strings(hed_strings, queries[, ...])

Returns a DataFrame of factors based on results of queries.

assemble_hed(data_input, sidecar, schema, columns_included=None, expand_defs=False)[source]

Return assembled HED annotations in a dataframe.

Parameters:
  • data_input (TabularInput) – The tabular input file whose HED annotations are to be assembled.

  • sidecar (Sidecar) – Sidecar with definitions.

  • schema (HedSchema) – The HED schema.

  • columns_included (list or None) – A list of additional column names to include. If None, only the list of assembled tags is included.

  • expand_defs (bool) – If True, definitions are expanded when the events are assembled.

Returns:

A DataFrame with the assembled events (or None), and a dictionary with definition names as keys and definition content strings as values.

Return type:

tuple of (DataFrame or None, dict)

get_expression_parsers(queries, query_names=None)[source]

Returns a list of expression parsers and query_names.

Parameters:
  • queries (list) – A list of query strings or QueryParser objects

  • query_names (list) – A list of column names for results of queries. If missing, the names default to query_1, query_2, etc.

Returns:

A list of expression parsers and the list of query names.

Raises:

ValueError

  • If query names are invalid or duplicated.
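The naming rule described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: `resolve_query_names` is a hypothetical helper, and treating a length mismatch as an "invalid" set of names is an assumption, since the page does not say what makes names invalid.

```python
def resolve_query_names(queries, query_names=None):
    """Hypothetical helper (not part of analysis_util).

    Fills in default names query_1, query_2, ... when none are given and
    raises ValueError for names that are duplicated or (by assumption)
    do not match the number of queries.
    """
    if query_names is None:
        # Default names follow the documented pattern: query_1, query_2, ...
        query_names = [f"query_{i}" for i in range(1, len(queries) + 1)]

    # Assumption: a name list of the wrong length counts as "invalid".
    if len(query_names) != len(queries):
        raise ValueError("Number of query names must match number of queries.")

    # Duplicated names would produce ambiguous result columns.
    if len(set(query_names)) != len(query_names):
        raise ValueError("Query names must be unique.")

    return list(query_names)


default_names = resolve_query_names(["Sensory-event", "Agent-action"])
custom_names = resolve_query_names(["Sensory-event"], ["sensory"])
```

With no names supplied, the two queries above receive the names query_1 and query_2.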

hed_to_str(contents, remove_parentheses=False)[source]

search_strings(hed_strings, queries, query_names=None)[source]

Returns a DataFrame of factors based on results of queries.

Parameters:
  • hed_strings (list) – A list of HedString objects. Empty or None entries yield 0 in the results.

  • queries (list) – A list of query strings or QueryParser objects

  • query_names (list) – A list of column names for results of queries. If missing, the names default to query_1, query_2, etc.

Returns:

DataFrame - containing the factor vectors with results of the queries

Raises:

ValueError

  • If query names are invalid or duplicated.
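To make the idea of factor vectors concrete, here is a minimal sketch in plain Python. It does not use the library: `factor_columns` is a hypothetical function, the inputs are plain strings rather than HedString objects, the result is a dict of lists rather than a DataFrame, and the match test (case-insensitive lookup of a single tag among comma-separated tags) is a stand-in assumption for real HED query evaluation, which is considerably richer.

```python
def factor_columns(hed_strings, queries, query_names=None):
    """Hypothetical stand-in for building 0/1 factor vectors.

    Each query becomes one column; each input string becomes one row.
    Empty or None entries are left as 0 in every column.
    """
    if query_names is None:
        query_names = [f"query_{i}" for i in range(1, len(queries) + 1)]

    # Start with all zeros so empty or None rows need no special handling.
    columns = {name: [0] * len(hed_strings) for name in query_names}

    for row, hed_string in enumerate(hed_strings):
        if not hed_string:
            continue  # None or "" stays 0
        # Stand-in matcher: split on commas and drop parentheses and spaces.
        tags = {tag.strip(" ()").lower() for tag in hed_string.split(",")}
        for name, query in zip(query_names, queries):
            if query.lower() in tags:
                columns[name][row] = 1

    return columns


strings = ["Sensory-event, Red", None, "", "(Agent-action, Red)"]
factors = factor_columns(strings, ["Red", "Agent-action"])
```

Each column has one entry per input string, so the result lines up row by row with the original events.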