TabularSummary

class TabularSummary(value_cols=None, skip_cols=None, name='')[source]

Summarize the contents of columnar files.

Methods

TabularSummary.__init__([value_cols, ...])

Constructor for a BIDS tabular file summary.

TabularSummary.extract_sidecar_template()

Extract a BIDS sidecar-compatible dictionary.

TabularSummary.extract_summary(summary_info)

Create a TabularSummary object from a serialized summary.

TabularSummary.get_columns_info(dataframe[, ...])

Extract unique value counts for columns.

TabularSummary.get_number_unique([column_names])

Return the number of unique values in columns.

TabularSummary.get_summary([as_json])

Return the summary in dictionary format.

TabularSummary.make_combined_dicts(...[, ...])

Return combined and individual summaries.

TabularSummary.update(data[, name])

Update the counts based on data.

TabularSummary.update_summary(tab_sum)

Add TabularSummary values to this object.

TabularSummary.__init__(value_cols=None, skip_cols=None, name='')[source]

Constructor for a BIDS tabular file summary.

Parameters:
  • value_cols (list, None) – List of columns to be treated as value columns.

  • skip_cols (list, None) – List of columns to be skipped.

  • name (str) – Name associated with the dictionary.

TabularSummary.extract_sidecar_template()[source]

Extract a BIDS sidecar-compatible dictionary.

Returns:

A sidecar template that can be converted to JSON.

Return type:

dict
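As a rough illustration of what "sidecar-compatible" can mean, the sketch below builds a BIDS-style dictionary keyed by column name, with `Description`, `Levels`, and `HED` entries. The helper `sidecar_template`, its arguments, and the placeholder strings are hypothetical; the exact text the library places in its template is not shown here and may differ.

```python
import json


def sidecar_template(categorical, value_cols):
    """Build a BIDS-style sidecar skeleton (hypothetical helper).

    categorical: {column name: iterable of observed values}
    value_cols:  list of column names holding free-form values
    """
    template = {}
    for column, values in categorical.items():
        template[column] = {
            "Description": f"Description for {column}",
            # One entry per observed value, to be filled in by hand.
            "Levels": {v: f"Description for {v}" for v in values},
            "HED": {v: "TODO: HED tags" for v in values},
        }
    for column in value_cols:
        # Value columns get a single annotation rather than one per level.
        template[column] = {
            "Description": f"Description for {column}",
            "HED": "TODO: HED tags with # placeholder",
        }
    return template


template = sidecar_template({"trial_type": ["go", "stop"]}, ["response_time"])
print(json.dumps(template, indent=4))
```

The result contains only strings and nested dictionaries, so it converts to JSON without custom encoders.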

static TabularSummary.extract_summary(summary_info)[source]

Create a TabularSummary object from a serialized summary.

Parameters:

summary_info (dict or str) – A JSON string or a dictionary containing contents of a TabularSummary.

Returns:

A TabularSummary object containing the information in summary_info.

Return type:

TabularSummary
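Because summary_info may be either a dictionary or a JSON string, a caller-side normalization step might look like the following minimal sketch. The helper `load_summary_info` and the sample keys are hypothetical and only illustrate the "dict or str" contract; they do not reproduce the library's internal parsing or its actual summary keys.

```python
import json


def load_summary_info(summary_info):
    """Return summary_info as a dict, parsing it first if it is a JSON string."""
    if isinstance(summary_info, str):
        return json.loads(summary_info)
    if isinstance(summary_info, dict):
        return summary_info
    raise TypeError("summary_info must be a dict or a JSON string")


# Hypothetical summary content; real summaries use the library's own keys.
as_dict = {"name": "example", "columns": {"trial_type": {"go": 2, "stop": 1}}}
as_text = json.dumps(as_dict)

print(load_summary_info(as_text) == load_summary_info(as_dict))
```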

static TabularSummary.get_columns_info(dataframe, skip_cols=None)[source]

Extract unique value counts for columns.

Parameters:
  • dataframe (DataFrame) – The DataFrame to be analyzed.

  • skip_cols (list) – List of names of columns to be skipped in the extraction.

Returns:

A dictionary with keys that are column names and values that are dictionaries of unique value counts.

Return type:

dict
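To make the return shape concrete, here is a minimal standard-library sketch that builds the same kind of nested dictionary from a list of row dictionaries. The helper `column_value_counts` and the sample rows are hypothetical; this is not the library's implementation, which operates on a DataFrame.

```python
from collections import Counter


def column_value_counts(rows, skip_cols=None):
    """Count unique values per column, ignoring columns in skip_cols.

    Hypothetical stand-in that only illustrates the nested-dictionary shape.
    """
    skip = set(skip_cols or [])
    counts = {}
    for row in rows:
        for column, value in row.items():
            if column in skip:
                continue
            counts.setdefault(column, Counter())[value] += 1
    # Convert Counters to plain dicts so the result serializes cleanly.
    return {column: dict(counter) for column, counter in counts.items()}


rows = [
    {"onset": "0.5", "trial_type": "go", "response": "left"},
    {"onset": "1.5", "trial_type": "stop", "response": "left"},
    {"onset": "2.5", "trial_type": "go", "response": "right"},
]
info = column_value_counts(rows, skip_cols=["onset"])
print(info)
```

Each outer key is a column name, and each inner dictionary maps a value seen in that column to the number of rows containing it.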

TabularSummary.get_number_unique(column_names=None)[source]

Return the number of unique values in columns.

Parameters:

column_names (list, None) – A list of column names to analyze or all columns if None.

Returns:

A dictionary whose keys are column names and whose values are the number of unique values in each column.

Return type:

dict
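The relationship between per-column value counts and the number of unique values can be sketched as follows. The helper `number_unique` is hypothetical, and silently skipping unknown column names is an assumption of this sketch rather than documented library behavior.

```python
def number_unique(value_counts, column_names=None):
    """Return {column: number of distinct values} (hypothetical helper).

    value_counts: {column name: {value: count}}
    column_names: columns to report, or None for all columns
    """
    if column_names is None:
        column_names = list(value_counts)
    # Assumption: names that are not present are skipped, not reported as 0.
    return {
        name: len(value_counts[name])
        for name in column_names
        if name in value_counts
    }


value_counts = {
    "trial_type": {"go": 2, "stop": 1},
    "response": {"left": 2, "right": 1, "n/a": 4},
}
all_unique = number_unique(value_counts)
some_unique = number_unique(value_counts, ["response"])
print(all_unique, some_unique)
```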

TabularSummary.get_summary(as_json=False)[source]

Return the summary in dictionary format.

Parameters:

as_json (bool) – If False, return the summary as a Python dictionary; otherwise return it serialized as JSON.

static TabularSummary.make_combined_dicts(file_dictionary, skip_cols=None)[source]

Return combined and individual summaries.

Parameters:
  • file_dictionary (FileDictionary) – Dictionary with file names as keys and full paths as values.

  • skip_cols (list) – List of names of columns to be skipped.

Returns:

  • TabularSummary: Summary of the file dictionary.

  • dict: Dictionary of individual TabularSummary objects.

Return type:

tuple
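The "combined plus individual" return pattern can be illustrated with plain counters. In this minimal sketch, the helper `combine_counts` and the in-memory rows are hypothetical stand-ins for reading the files in a file dictionary; the real method returns TabularSummary objects rather than counters.

```python
from collections import Counter


def combine_counts(per_file_rows):
    """Return (combined, individual) value counts for a {name: rows} mapping."""
    combined = {}
    individual = {}
    for name, rows in per_file_rows.items():
        file_counts = {}
        for row in rows:
            for column, value in row.items():
                file_counts.setdefault(column, Counter())[value] += 1
        individual[name] = file_counts
        # Fold this file's counts into the running overall totals.
        for column, counter in file_counts.items():
            combined.setdefault(column, Counter()).update(counter)
    return combined, individual


per_file_rows = {
    "run-1": [{"trial_type": "go"}, {"trial_type": "stop"}],
    "run-2": [{"trial_type": "go"}],
}
combined, individual = combine_counts(per_file_rows)
print(combined["trial_type"], sorted(individual))
```

Keeping the per-file results alongside the total makes it possible to compare any single file against the whole collection without recounting.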

TabularSummary.update(data, name=None)[source]

Update the counts based on data.

Parameters:
  • data (DataFrame, str, or list) – The data used to update the counts, given as a DataFrame, a str, or a list.

  • name (str) – Name of the summary.

TabularSummary.update_summary(tab_sum)[source]

Add TabularSummary values to this object.

Parameters:

tab_sum (TabularSummary) – A TabularSummary to be combined.

Notes

  • The value_cols and skip_cols are updated as long as they are not contradictory.

  • A new skip column cannot be used.