FactorColumnOp
- class FactorColumnOp(parameters)
Append factor columns to a tabular file based on the values in a specified column.
- Required remodeling parameters:
column_name (str): The name of a column in the DataFrame to compute factors from.
- Optional remodeling parameters:
factor_names (list): Names to use as the factor columns.
factor_values (list): Values in the column column_name to create factors for.
Notes
If no factor_values are provided, a factor is computed for each unique value in the column_name column.
If factor_names are provided, then factor_values must also be provided, and the two lists must be the same length.
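The parameter combinations described in these notes can be illustrated with plain dictionaries. This is only a sketch: the column name `trial_type` and its values are hypothetical, and the `operation`/`description`/`parameters` wrapper is an assumption about how a remodeling entry is commonly written, not something stated on this page.

```python
import json

# Hypothetical column name and values, chosen only for illustration.

# 1. Only the required parameter: a factor for every unique value in the column.
all_values = {"column_name": "trial_type"}

# 2. Restrict the factors to selected values of the column.
selected_values = {
    "column_name": "trial_type",
    "factor_values": ["go", "stop"],
}

# 3. Custom factor column names: factor_values must be given as well,
#    and the two lists must have the same length.
named_factors = {
    "column_name": "trial_type",
    "factor_values": ["go", "stop"],
    "factor_names": ["is_go", "is_stop"],
}

# Assumed wrapper for a remodeling entry; "factor_column" is FactorColumnOp.NAME.
entry = {
    "operation": "factor_column",
    "description": "Factor the trial_type column.",
    "parameters": named_factors,
}
entry_json = json.dumps(entry, indent=4)
print(entry_json)
```

Each of the three dictionaries satisfies the rules in the notes; supplying `factor_names` without `factor_values` would not.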
Methods
__init__(parameters) | Constructor for the factor column operation.
do_op(dispatcher, df, name[, sidecar]) | Create factor columns based on values in a specified column.
validate_input_data(parameters) | Check that factor_names and factor_values have same length if given.
Attributes
- FactorColumnOp.__init__(parameters)
Constructor for the factor column operation.
- Parameters:
parameters (dict) – Parameter values for required and optional parameters.
- FactorColumnOp.do_op(dispatcher, df, name, sidecar=None)
Create factor columns based on values in a specified column.
- Parameters:
dispatcher (Dispatcher) – Manages the operation I/O.
df (DataFrame) – The DataFrame to be remodeled.
name (str) – Unique identifier for the DataFrame, often the original file path.
sidecar (Sidecar or file-like) – Not needed for this operation.
- Returns:
A new DataFrame with the factor columns appended.
- Return type:
DataFrame
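The effect of this operation can be sketched without the library. The example below is an illustrative stand-in, not the library's code: it works on a list of dictionaries rather than a DataFrame to stay dependency-free, the helper name `add_factor_columns` and the sample rows are hypothetical, and the default `column.value` naming used when no factor names are given is an assumption of this sketch. The 0/1 encoding follows the "one-hot factors" wording in the parameter description.

```python
def add_factor_columns(rows, column_name, factor_values=None, factor_names=None):
    """Return new rows with one 0/1 factor column per factor value.

    Illustrative stand-in for the operation; not the library's implementation.
    """
    if factor_values is None:
        # Default: one factor per unique value, in order of first appearance.
        factor_values = list(dict.fromkeys(row[column_name] for row in rows))
    if factor_names is None:
        # Assumed naming scheme, used by this sketch only.
        factor_names = [f"{column_name}.{value}" for value in factor_values]
    new_rows = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        for name, value in zip(factor_names, factor_values):
            new_row[name] = 1 if row[column_name] == value else 0
        new_rows.append(new_row)
    return new_rows


# Hypothetical event rows.
rows = [
    {"onset": 0.5, "trial_type": "go"},
    {"onset": 1.5, "trial_type": "stop"},
    {"onset": 2.5, "trial_type": "go"},
]
result = add_factor_columns(rows, "trial_type", ["go", "stop"], ["is_go", "is_stop"])
for row in result:
    print(row)
```

As in the documented return value, the sketch builds new rows with the factor columns appended and leaves its input untouched.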
- static FactorColumnOp.validate_input_data(parameters)
Check that factor_names and factor_values have same length if given.
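A check of this kind might look like the following minimal sketch. The function name `check_factor_lengths`, the error wording, and the choice to return a list of messages are all assumptions made for illustration; this page does not state what the real method returns.

```python
def check_factor_lengths(parameters):
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper mirroring the documented rule, not the library's code.
    """
    names = parameters.get("factor_names")
    values = parameters.get("factor_values")
    errors = []
    if names is not None and values is None:
        # factor_names may not be given without factor_values.
        errors.append("factor_names requires factor_values")
    elif names is not None and len(names) != len(values):
        errors.append("factor_names and factor_values must have the same length")
    return errors


# Hypothetical parameter sets.
ok = {"column_name": "trial_type", "factor_values": ["go", "stop"],
      "factor_names": ["is_go", "is_stop"]}
bad = {"column_name": "trial_type", "factor_values": ["go", "stop"],
       "factor_names": ["is_go"]}
print(check_factor_lengths(ok))
print(check_factor_lengths(bad))
```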
- FactorColumnOp.NAME = 'factor_column'
- FactorColumnOp.PARAMS = {
    'additionalProperties': False,
    'dependentRequired': {'factor_names': ['factor_values']},
    'properties': {
        'column_name': {
            'description': 'Name of the column for which to create one-hot factors for unique values.',
            'type': 'string'
        },
        'factor_names': {
            'description': 'Names of the resulting factor columns. If given must be same length as factor_values',
            'items': {'type': 'string'},
            'minItems': 1,
            'type': 'array',
            'uniqueItems': True
        },
        'factor_values': {
            'description': 'Specific unique column values to compute factors for (otherwise all unique values).',
            'items': {'type': 'string'},
            'minItems': 1,
            'type': 'array',
            'uniqueItems': True
        }
    },
    'required': ['column_name'],
    'type': 'object'
}