MergeConsecutiveOp
- class MergeConsecutiveOp(parameters)
Merge consecutive rows of a columnar file that have the same column value.
- Required remodeling parameters:
column_name (str): name of column whose consecutive values are to be compared (the merge column).
event_code (str or int or float): the particular value in the merge column (column_name) whose consecutive rows are to be merged.
set_durations (bool): If true, set the duration of the merged event to the extent of the merged events.
ignore_missing (bool): If true, missing match_columns are ignored.
- Optional remodeling parameters:
match_columns (list): A list of columns whose values have to be matched for two events to be the same.
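For illustration, the parameters above might be collected in a dictionary like the following before being passed to the constructor. This is a minimal sketch: the column names (`trial_type`, `task`) and the event code (`block_marker`) are hypothetical and not taken from any real dataset.

```python
# Hypothetical values: the column names and event code are invented for illustration.
parameters = {
    "column_name": "trial_type",     # the merge column
    "event_code": "block_marker",    # value whose consecutive rows are merged
    "set_durations": True,           # merged event spans all of the merged rows
    "ignore_missing": True,          # tolerate match columns absent from the file
    "match_columns": ["task"],       # optional: these values must also agree
}
```

Note that the merge column does not appear in `match_columns`; the documented input check rejects that combination.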
Notes
This operation is meant for time-based tabular files that have an onset column.
Methods
__init__(parameters): Constructor for the merge consecutive operation.
do_op(dispatcher, df, name, sidecar=None): Merge consecutive rows with the same column value.
validate_input_data(parameters): Verify that the column name is not in match columns.
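Attributes
NAME: Name of the operation ('merge_consecutive').
PARAMS: Schema describing the parameters of the operation.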
- MergeConsecutiveOp.__init__(parameters)
Constructor for the merge consecutive operation.
- Parameters:
parameters (dict) – Actual values of the parameters for the operation.
- MergeConsecutiveOp.do_op(dispatcher, df, name, sidecar=None)
Merge consecutive rows with the same column value.
- Parameters:
dispatcher (Dispatcher) – Manages the operation I/O.
df (DataFrame) – The DataFrame to be remodeled.
name (str) – Unique identifier for the dataframe – often the original file path.
sidecar (Sidecar or file-like) – Not needed for this operation.
- Returns:
A new dataframe after processing.
- Return type:
DataFrame
- Raises:
ValueError –
If the dataframe does not have the merge column (column_name) and ignore_missing is False.
If a match column is missing and ignore_missing is False.
If set_durations is true and the dataframe does not have an onset column.
If set_durations is true and the dataframe does not have a duration column.
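The merge behavior described above can be sketched in plain Python. This is not the library's implementation: the helper `merge_consecutive` is hypothetical, a list of dicts stands in for the DataFrame so the sketch runs without dependencies, and it assumes rows are ordered by onset with numeric onset and duration values. The merged duration runs from the start of the first row to the end of the last row in the run.

```python
def merge_consecutive(rows, column_name, event_code, match_columns=(), set_durations=True):
    """Collapse runs of consecutive rows whose merge column equals event_code.

    rows is a list of dicts standing in for DataFrame rows (hypothetical helper).
    """
    if set_durations and rows and not {"onset", "duration"} <= set(rows[0]):
        raise ValueError("set_durations requires onset and duration columns")
    merged = []
    run_open = False  # True while the last kept row can absorb further rows
    for row in rows:
        is_code = row[column_name] == event_code
        if is_code and run_open and all(row[c] == merged[-1][c] for c in match_columns):
            if set_durations:
                first = merged[-1]
                # Extent: start of the first row to the end of the current row.
                first["duration"] = row["onset"] + row["duration"] - first["onset"]
            continue  # row is absorbed into the first row of the run
        merged.append(dict(row))  # copy so the input rows are left untouched
        run_open = is_code
    return merged


# Hypothetical data: column names and values are invented for illustration.
rows = [
    {"onset": 0.0, "duration": 1.0, "trial_type": "go", "task": "a"},
    {"onset": 1.0, "duration": 0.5, "trial_type": "stop", "task": "a"},
    {"onset": 2.0, "duration": 0.5, "trial_type": "stop", "task": "a"},
    {"onset": 3.0, "duration": 0.5, "trial_type": "stop", "task": "b"},
    {"onset": 4.0, "duration": 1.0, "trial_type": "go", "task": "b"},
    {"onset": 5.0, "duration": 1.0, "trial_type": "go", "task": "b"},
]
result = merge_consecutive(rows, "trial_type", "stop", match_columns=["task"])
print([r["onset"] for r in result])  # [0.0, 1.0, 3.0, 4.0, 5.0]
print(result[1]["duration"])         # 1.5
```

In this example the two `stop` rows with task `a` are merged, the `stop` row with task `b` starts a new event because a match column differs, and the consecutive `go` rows are left alone because `go` is not the event code.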
- static MergeConsecutiveOp.validate_input_data(parameters)
Verify that the column name is not in match columns.
- Parameters:
parameters (dict) – Dictionary of the actual parameter values for the operation.
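A check of this kind might look like the sketch below. The function name `check_merge_parameters` is hypothetical, and returning a list of message strings is an assumption; the source does not state what the real method returns.

```python
def check_merge_parameters(parameters):
    """Return a list of error messages; an empty list means the check passed.

    Hypothetical stand-in for the documented check, not the library's code.
    """
    match_columns = parameters.get("match_columns", [])
    column_name = parameters.get("column_name")
    if column_name in match_columns:
        return [f"column_name '{column_name}' cannot also be a match column."]
    return []


print(check_merge_parameters({"column_name": "trial_type", "match_columns": ["task"]}))  # []
```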
- MergeConsecutiveOp.NAME = 'merge_consecutive'
- MergeConsecutiveOp.PARAMS = {
    'additionalProperties': False,
    'properties': {
        'column_name': {
            'description': 'The name of the column to check for repeated consecutive codes.',
            'type': 'string'
        },
        'event_code': {
            'description': 'The event code to match for duplicates.',
            'type': ['string', 'number']
        },
        'ignore_missing': {
            'description': 'If true, missing match columns are ignored.',
            'type': 'boolean'
        },
        'match_columns': {
            'description': 'List of columns whose values must also match to be considered a repeat.',
            'items': {'type': 'string'},
            'type': 'array'
        },
        'set_durations': {
            'description': 'If true, then the duration should be computed based on start of first to end of last.',
            'type': 'boolean'
        }
    },
    'required': ['column_name', 'event_code', 'set_durations', 'ignore_missing'],
    'type': 'object'
}
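The schema lists four required keys and sets `additionalProperties` to `False`, so unknown keys are rejected. A rough, standard-library-only sketch of those two constraints follows. The helper `check_keys` is hypothetical and covers key names only; checking value types as well would need a full JSON Schema validator.

```python
# Key names copied from the 'required' and 'properties' entries of PARAMS.
REQUIRED = ["column_name", "event_code", "set_durations", "ignore_missing"]
ALLOWED = REQUIRED + ["match_columns"]


def check_keys(parameters):
    """Return (missing, unexpected) key lists for a parameter dictionary."""
    missing = [key for key in REQUIRED if key not in parameters]
    unexpected = [key for key in parameters if key not in ALLOWED]
    return missing, unexpected


# Hypothetical input with one required key absent and one unknown key present.
missing, unexpected = check_keys(
    {"column_name": "trial_type", "event_code": 1, "set_durations": True, "extra": 0}
)
print(missing)     # ['ignore_missing']
print(unexpected)  # ['extra']
```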