SummarizeColumnNamesOp

class SummarizeColumnNamesOp(parameters)[source]

Summarize the column names in a collection of tabular files.

Required remodeling parameters:
  • summary_name (str): The name of the summary.

  • summary_filename (str): Base filename of the summary.

Optional remodeling parameters:
  • append_timecode (bool): If True, the timecode is appended to the summary filename so that each run has a unique name. Defaults to False.

The purpose is to check that all of the tabular files have the same columns in the same order.
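
As a minimal sketch, the parameters above might be supplied in an operation entry like the one below. The outer keys ("operation", "description", "parameters") and all example values are assumptions for illustration; only the operation name, the parameter names, and their types come from this page.

```python
import json

# Hypothetical operation entry; the outer keys and all values are illustrative.
operation = {
    "operation": "summarize_column_names",  # matches SummarizeColumnNamesOp.NAME
    "description": "Check that all tabular files share the same columns.",
    "parameters": {
        "summary_name": "column_check",              # required (str)
        "summary_filename": "column_check_summary",  # required (str)
        "append_timecode": False,                    # optional (bool)
    },
}

# Assumption: operations are serialized together as a JSON list of entries.
remodel_json = json.dumps([operation], indent=4)
print(remodel_json)
```

The inner "parameters" dictionary is what would be passed as the parameters argument of the constructor.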

Methods

SummarizeColumnNamesOp.__init__(parameters)

Constructor for the summarize column names operation.

SummarizeColumnNamesOp.do_op(dispatcher, df, ...)

Create a column name summary for df.

SummarizeColumnNamesOp.validate_input_data(...)

Additional validation of operation parameters that the JSON schema validator does not perform.

Attributes

SummarizeColumnNamesOp.NAME

SummarizeColumnNamesOp.PARAMS

SummarizeColumnNamesOp.SUMMARY_TYPE

SummarizeColumnNamesOp.__init__(parameters)[source]

Constructor for the summarize column names operation.

Parameters:

parameters (dict) – Dictionary with the parameter values for required and optional parameters.

SummarizeColumnNamesOp.do_op(dispatcher, df, name, sidecar=None)[source]

Create a column name summary for df.

Parameters:
  • dispatcher (Dispatcher) – Manages the operation I/O.

  • df (DataFrame) – The DataFrame to be remodeled.

  • name (str) – Unique identifier for the dataframe – often the original file path.

  • sidecar (Sidecar or file-like) – Not needed for this operation.

Returns:

A copy of df.

Return type:

DataFrame

Side effect:

Updates the relevant summary.
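
The idea behind a column name summary can be sketched without the library: record each file's ordered column names, then group the files that share an identical ordering. The function group_by_columns and the sample file names below are hypothetical and are not part of this class's API; this is an illustration of the concept, not how do_op is implemented.

```python
from collections import defaultdict


def group_by_columns(file_columns):
    """Group file names by their exact ordered tuple of column names.

    file_columns: dict mapping a file name to its list of column names.
    Returns a dict mapping each distinct column tuple to the files using it.
    """
    groups = defaultdict(list)
    for name, columns in file_columns.items():
        groups[tuple(columns)].append(name)
    return dict(groups)


# Hypothetical files: the third has the same columns in a different order.
file_columns = {
    "sub-01_events.tsv": ["onset", "duration", "trial_type"],
    "sub-02_events.tsv": ["onset", "duration", "trial_type"],
    "sub-03_events.tsv": ["onset", "trial_type", "duration"],
}

groups = group_by_columns(file_columns)
all_consistent = len(groups) == 1
```

More than one group means that at least one file deviates in its column names or in their order.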

static SummarizeColumnNamesOp.validate_input_data(parameters)[source]

Additional validation of operation parameters that the JSON schema validator does not perform.

SummarizeColumnNamesOp.NAME = 'summarize_column_names'
SummarizeColumnNamesOp.PARAMS = {'additionalProperties': False, 'properties': {'append_timecode': {'description': 'If true, the timecode is appended to the base filename so each run has a unique name.', 'type': 'boolean'}, 'summary_filename': {'description': 'Name to use for the summary file name base.', 'type': 'string'}, 'summary_name': {'description': 'Name to use for the summary in titles.', 'type': 'string'}}, 'required': ['summary_name', 'summary_filename'], 'type': 'object'}
SummarizeColumnNamesOp.SUMMARY_TYPE = 'column_names'
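
To illustrate what the PARAMS schema above constrains, here is a hand-rolled check that uses only the standard library. The helper check_params is hypothetical and handles only the required, additionalProperties, and type keywords that appear in this schema; it is a sketch, not the validator the library uses.

```python
# Copy of the schema above with the descriptions omitted for brevity.
PARAMS = {
    "type": "object",
    "properties": {
        "summary_name": {"type": "string"},
        "summary_filename": {"type": "string"},
        "append_timecode": {"type": "boolean"},
    },
    "required": ["summary_name", "summary_filename"],
    "additionalProperties": False,
}

# Map the JSON schema type names used above to Python types.
TYPE_MAP = {"string": str, "boolean": bool}


def check_params(parameters, schema=PARAMS):
    """Return a list of error messages; an empty list means the check passed."""
    errors = []
    # Every required key must be present.
    for key in schema["required"]:
        if key not in parameters:
            errors.append(f"missing required parameter: {key}")
    # Every supplied key must be known and have the declared type.
    for key, value in parameters.items():
        spec = schema["properties"].get(key)
        if spec is None:
            if not schema.get("additionalProperties", True):
                errors.append(f"unexpected parameter: {key}")
        elif not isinstance(value, TYPE_MAP[spec["type"]]):
            errors.append(f"{key} must be of type {spec['type']}")
    return errors


print(check_params({"summary_name": "column_check"}))
print(check_params({"summary_name": "a", "summary_filename": "b", "extra": 1}))
```

Because additionalProperties is False, a misspelled parameter name is reported as an error instead of being silently ignored.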