SummarizeColumnNamesOp
- class SummarizeColumnNamesOp(parameters)
Summarize the column names in a collection of tabular files.
- Required remodeling parameters:
summary_name (str): The name of the summary.
summary_filename (str): Base filename of the summary.
- Optional remodeling parameters:
append_timecode (bool): If True, the timecode is appended to the summary filename so that each run has a unique name. If False (default), the timecode is not appended.
The purpose is to check that all the tabular files have the same columns in the same order.
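As an illustration, the parameters above might be supplied as follows. This is a minimal sketch: the parameter names and the operation name `summarize_column_names` come from this page, while the wrapper keys `operation` and `description` and all example values are assumptions about how a remodeling file is typically laid out.

```python
import json

# Parameter names are taken from this page; the values are made-up examples.
parameters = {
    "summary_name": "Column name check",
    "summary_filename": "column_name_check",
    "append_timecode": False,  # optional; False is the documented default
}

# Assumed wrapper: one entry in a list of remodeling operations.
operation_entry = {
    "operation": "summarize_column_names",
    "description": "Check that all tabular files share the same columns.",
    "parameters": parameters,
}

# Serialize the list of operations as it might appear in a JSON file.
remodel_json = json.dumps([operation_entry], indent=4)
print(remodel_json)
```

Only `summary_name` and `summary_filename` are required, so `append_timecode` could be dropped from the dictionary without changing the behavior shown here.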
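Methods
| __init__(parameters) | Constructor for summarize column names operation. |
| do_op(dispatcher, df, name, sidecar=None) | Create a column name summary for df. |
| validate_input_data(parameters) | Additional validation required of operation parameters not performed by JSON schema validator. |
Attributes
| NAME |
| PARAMS |
| SUMMARY_TYPE |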
- SummarizeColumnNamesOp.__init__(parameters)
Constructor for summarize column names operation.
- Parameters:
parameters (dict) – Dictionary with the parameter values for required and optional parameters.
- SummarizeColumnNamesOp.do_op(dispatcher, df, name, sidecar=None)
Create a column name summary for df.
- Parameters:
dispatcher (Dispatcher) – Manages the operation I/O.
df (DataFrame) – The DataFrame to be remodeled.
name (str) – Unique identifier for the dataframe – often the original file path.
sidecar (Sidecar or file-like) – Not needed for this operation.
- Returns:
A copy of df.
- Return type:
DataFrame
- Side effect:
Updates the relevant summary.
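The contract described above (return a copy, update a summary as a side effect) can be sketched without the library. The following is not the library's implementation: `ColumnSummarySketch` and `do_op_sketch` are hypothetical names, and a table is modeled as a plain dict mapping column names to value lists rather than as a DataFrame.

```python
import copy


class ColumnSummarySketch:
    """Hypothetical stand-in for a column-name summary."""

    def __init__(self):
        # Maps each distinct ordered tuple of column names to the files using it.
        self.patterns = {}

    def update(self, name, columns):
        self.patterns.setdefault(tuple(columns), []).append(name)

    def is_consistent(self):
        # True when every file seen so far has the same columns in the same order.
        return len(self.patterns) <= 1


def do_op_sketch(summary, table, name):
    """Record the column names of `table` under `name` and return a copy."""
    summary.update(name, table.keys())  # dicts preserve insertion order
    return copy.deepcopy(table)


summary = ColumnSummarySketch()
run1 = {"onset": [0.0, 1.5], "duration": [0.5, 0.5], "trial_type": ["go", "stop"]}
# Same columns as run1, but in a different order.
run2 = {"onset": [0.2], "trial_type": ["go"], "duration": [0.5]}

out1 = do_op_sketch(summary, run1, "sub-01_run-1_events.tsv")
out2 = do_op_sketch(summary, run2, "sub-01_run-2_events.tsv")
print(summary.is_consistent())
```

Keying the summary on the ordered tuple of names means that two files with the same columns in a different order are reported as different patterns, which matches the stated purpose of the operation.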
- static SummarizeColumnNamesOp.validate_input_data(parameters)
Additional validation required of operation parameters not performed by JSON schema validator.
- SummarizeColumnNamesOp.NAME = 'summarize_column_names'
- SummarizeColumnNamesOp.PARAMS = {'additionalProperties': False, 'properties': {'append_timecode': {'description': 'If true, the timecode is appended to the base filename so each run has a unique name.', 'type': 'boolean'}, 'summary_filename': {'description': 'Name to use for the summary file name base.', 'type': 'string'}, 'summary_name': {'description': 'Name to use for the summary in titles.', 'type': 'string'}}, 'required': ['summary_name', 'summary_filename'], 'type': 'object'}
- SummarizeColumnNamesOp.SUMMARY_TYPE = 'column_names'
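To make the PARAMS schema concrete, the sketch below checks a parameter dictionary against it using only the standard library. `check_parameters` is a hypothetical helper that covers just the keywords appearing in this schema (`required`, `additionalProperties`, and the `string` and `boolean` types). It is not how the library validates parameters and is not a general JSON Schema validator.

```python
# The schema from this page, with the description strings omitted for brevity.
PARAMS = {
    "type": "object",
    "properties": {
        "summary_name": {"type": "string"},
        "summary_filename": {"type": "string"},
        "append_timecode": {"type": "boolean"},
    },
    "required": ["summary_name", "summary_filename"],
    "additionalProperties": False,
}

# Only the two JSON types used by this schema are mapped.
PYTHON_TYPES = {"string": str, "boolean": bool}


def check_parameters(parameters, schema=PARAMS):
    """Return a list of error strings; an empty list means the check passed."""
    errors = []
    for key in schema["required"]:
        if key not in parameters:
            errors.append(f"missing required parameter: {key}")
    for key, value in parameters.items():
        spec = schema["properties"].get(key)
        if spec is None:
            if not schema["additionalProperties"]:
                errors.append(f"unexpected parameter: {key}")
        elif not isinstance(value, PYTHON_TYPES[spec["type"]]):
            errors.append(f"{key} should be of type {spec['type']}")
    return errors


good = check_parameters({"summary_name": "Columns", "summary_filename": "columns"})
bad = check_parameters({"summary_name": "Columns", "append_timecode": "yes", "extra": 1})
print(good)
print(bad)
```

The second call omits a required parameter, passes a string where a boolean is expected, and includes a key the schema does not define, so each of the three rules is exercised once.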