SummarizeColumnNamesOp

class SummarizeColumnNamesOp(parameters)[source]

Summarize the column names in a collection of tabular files.

Required remodeling parameters:
  • summary_name (str) – The name of the summary.

  • summary_filename (str) – Base filename of the summary.

The purpose is to check that all the tabular files have the same columns in the same order.
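
As an illustration of how the parameters above might be supplied, the following sketch builds one operation entry as a plain dictionary and serializes it to JSON. The operation name and parameter keys are taken from the PARAMS attribute documented on this page; the "description" key, the list-of-entries layout, and all values are assumptions made for the example, not a confirmed file format.

```python
import json

# Keys under "parameters" come from PARAMS; the values are hypothetical.
operation = {
    "operation": "summarize_column_names",
    # Assumption: a free-text description accompanies each entry.
    "description": "Check that all event files share the same columns.",
    "parameters": {
        "summary_name": "column_name_check",
        "summary_filename": "column_name_check",
        # Optional parameter listed in PARAMS; False is an illustrative value.
        "append_timecode": False,
    },
}

# Assumption: operations are stored as a JSON list of such entries.
remodel_json = json.dumps([operation], indent=4)
print(remodel_json)
```

The inner "parameters" dictionary is the kind of object that the constructor's parameters argument describes.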

Methods

hed.tools.remodeling.operations.summarize_column_names_op.SummarizeColumnNamesOp.__init__(...)

Constructor for summarize column names operation.

hed.tools.remodeling.operations.summarize_column_names_op.SummarizeColumnNamesOp.check_parameters(...)

Verify that the parameters meet the operation specification.

hed.tools.remodeling.operations.summarize_column_names_op.SummarizeColumnNamesOp.do_op(...)

Create a column name summary for df.

Attributes

hed.tools.remodeling.operations.summarize_column_names_op.SummarizeColumnNamesOp.PARAMS

hed.tools.remodeling.operations.summarize_column_names_op.SummarizeColumnNamesOp.SUMMARY_TYPE

SummarizeColumnNamesOp.__init__(parameters)[source]

Constructor for summarize column names operation.

Parameters:

parameters (dict) – Dictionary with the parameter values for required and optional parameters.

Raises:
  • KeyError

    • If a required parameter is missing.

    • If an unexpected parameter is provided.

  • TypeError

    • If a parameter has the wrong type.

SummarizeColumnNamesOp.check_parameters(parameters)

Verify that the parameters meet the operation specification.

Parameters:

parameters (dict) – Dictionary of parameters for this operation.

Raises:
  • KeyError

    • If a required parameter is missing.

    • If an unexpected parameter is provided.

  • TypeError

    • If a parameter has the wrong type.
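
The error conditions listed above can be illustrated with a small stand-alone validator. This is a minimal sketch, not the library's implementation: the function name check_parameters_sketch is hypothetical, and only the PARAMS dictionary is copied from this page.

```python
# Copied from the PARAMS attribute documented for this class.
PARAMS = {
    "operation": "summarize_column_names",
    "required_parameters": {"summary_name": str, "summary_filename": str},
    "optional_parameters": {"append_timecode": bool},
}


def check_parameters_sketch(parameters, spec=PARAMS):
    """Hypothetical validator mirroring the documented KeyError/TypeError cases."""
    required = spec["required_parameters"]
    optional = spec["optional_parameters"]

    # A required parameter is missing.
    for key in required:
        if key not in parameters:
            raise KeyError(f"missing required parameter: {key}")

    for key, value in parameters.items():
        expected = required.get(key, optional.get(key))
        # An unexpected parameter is provided.
        if expected is None:
            raise KeyError(f"unexpected parameter: {key}")
        # A parameter has the wrong type.
        if not isinstance(value, expected):
            raise TypeError(
                f"{key} must be {expected.__name__}, got {type(value).__name__}"
            )


# A valid dictionary passes silently.
check_parameters_sketch({"summary_name": "cols", "summary_filename": "cols"})
```

Checking for missing keys before checking types means a dictionary with several problems reports the missing parameter first; the real operation may order its checks differently.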

SummarizeColumnNamesOp.do_op(dispatcher, df, name, sidecar=None)[source]

Create a column name summary for df.

Parameters:
  • dispatcher (Dispatcher) – Manages the operation I/O.

  • df (DataFrame) – The DataFrame to be remodeled.

  • name (str) – Unique identifier for the DataFrame, often the original file path.

  • sidecar (Sidecar or file-like) – Not needed for this operation.

Returns:

A copy of df.

Return type:

DataFrame

Side-effect:

Updates the relevant summary.
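
To make the side-effect concrete, the sketch below shows one way a column name summary could accumulate information across files. It is an assumption-laden stand-in, not the library's summary class: ColumnNameSummarySketch and its methods are hypothetical, the file names are invented, and plain lists of names stand in for the columns of each DataFrame passed to do_op.

```python
class ColumnNameSummarySketch:
    """Hypothetical accumulator: groups file names by their ordered column names."""

    def __init__(self):
        # Maps a tuple of column names (order preserved) to the files that use it.
        self.files_by_header = {}

    def update(self, name, columns):
        # Tuples are hashable and order-sensitive, so reordered columns
        # produce a different key.
        header = tuple(columns)
        self.files_by_header.setdefault(header, []).append(name)

    def is_consistent(self):
        # True when every file seen so far has the same columns in the same order.
        return len(self.files_by_header) <= 1


summary = ColumnNameSummarySketch()
summary.update("sub-01_events.tsv", ["onset", "duration", "trial_type"])
summary.update("sub-02_events.tsv", ["onset", "duration", "trial_type"])
# Same names, different order: counted as a separate header pattern.
summary.update("sub-03_events.tsv", ["onset", "trial_type", "duration"])

for header, files in summary.files_by_header.items():
    print(list(header), "->", files)
print("consistent:", summary.is_consistent())
```

Keying on the ordered tuple rather than a set is what lets the summary flag files whose columns match by name but differ in order.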

SummarizeColumnNamesOp.PARAMS = {'operation': 'summarize_column_names', 'optional_parameters': {'append_timecode': <class 'bool'>}, 'required_parameters': {'summary_filename': <class 'str'>, 'summary_name': <class 'str'>}}
SummarizeColumnNamesOp.SUMMARY_TYPE = 'column_names'