FactorColumnOp

class FactorColumnOp(parameters)[source]

Append factor columns to a tabular file based on the values in a specified column.

Required remodeling parameters:
  • column_name (str): The name of a column in the DataFrame to compute factors from.

Optional remodeling parameters:
  • factor_names (list): Names to use as the factor columns.

  • factor_values (list): Values in column_name for which to create factors.

Notes

  • If no factor_values are provided, factors are computed for each unique value in the column_name column.

  • If factor_names are provided, then factor_values must also be provided, and the two lists must be the same length.
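
The three parameter combinations described in the notes above can be sketched as plain dictionaries. This is an illustration only: the column name "trial_type" and the values and names in the lists are hypothetical and do not come from the library.

```python
# Hypothetical parameter dictionaries; "trial_type", "go", "stop",
# "is_go", and "is_stop" are made-up names used only for illustration.

# Case 1: only the required parameter.
# Factors are computed for every unique value in the column.
params_all_values = {"column_name": "trial_type"}

# Case 2: restrict the factors to selected column values.
params_selected = {
    "column_name": "trial_type",
    "factor_values": ["go", "stop"],
}

# Case 3: name the factor columns explicitly.
# factor_values is then required, and both lists must have the same length.
params_named = {
    "column_name": "trial_type",
    "factor_values": ["go", "stop"],
    "factor_names": ["is_go", "is_stop"],
}
```

Any of these dictionaries could serve as the parameters argument to the constructor, assuming the named column exists in the data being remodeled.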

Methods

FactorColumnOp.__init__(parameters)

Constructor for the factor column operation.

FactorColumnOp.do_op(dispatcher, df, name[, ...])

Create factor columns based on values in a specified column.

FactorColumnOp.validate_input_data(parameters)

Check that factor_names and factor_values have the same length if given.

Attributes

FactorColumnOp.NAME

FactorColumnOp.PARAMS

FactorColumnOp.__init__(parameters)[source]

Constructor for the factor column operation.

Parameters:

parameters (dict) – Parameter values for required and optional parameters.

FactorColumnOp.do_op(dispatcher, df, name, sidecar=None)[source]

Create factor columns based on values in a specified column.

Parameters:
  • dispatcher (Dispatcher) – Manages the operation I/O.

  • df (DataFrame) – The DataFrame to be remodeled.

  • name (str) – Unique identifier for the dataframe – often the original file path.

  • sidecar (Sidecar or file-like) – Not needed for this operation.

Returns:

A new DataFrame with the factor columns appended.

Return type:

DataFrame
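
As a rough picture of what appending factor columns means, the following pandas sketch builds one 0/1 column per factor value. It is not the library's implementation: the function factor_column_sketch, the default naming scheme "<column_name>.<value>", the string comparison, and the sample data are all assumptions made for illustration.

```python
import pandas as pd


def factor_column_sketch(df, column_name, factor_values=None, factor_names=None):
    """Return a copy of df with one 0/1 column appended per factor value."""
    if not factor_values:
        # No values given: use every unique value in the column.
        factor_values = list(df[column_name].unique())
    if not factor_names:
        # Assumed naming scheme for illustration: "<column_name>.<value>".
        factor_names = [f"{column_name}.{value}" for value in factor_values]

    df_new = df.copy()  # leave the caller's DataFrame untouched
    for value, factor_name in zip(factor_values, factor_names):
        # Compare as strings so numeric and text values are treated alike.
        matches = df_new[column_name].astype(str) == str(value)
        df_new[factor_name] = matches.astype(int)
    return df_new


# Hypothetical events table.
events = pd.DataFrame({"onset": [0.5, 1.5, 2.5], "trial_type": ["go", "stop", "go"]})
result = factor_column_sketch(events, "trial_type", ["go", "stop"], ["is_go", "is_stop"])
print(result)
```

Returning a copy rather than modifying df in place matches the documented return value of a new DataFrame.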

static FactorColumnOp.validate_input_data(parameters)[source]

Check that factor_names and factor_values have the same length if given.
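
A check of this kind might look like the sketch below. The function name check_factor_lists, the error messages, and the choice to return a list of messages are assumptions; the actual return value of validate_input_data is not documented here.

```python
def check_factor_lists(parameters):
    """Return a list of error messages; an empty list means no problems found."""
    names = parameters.get("factor_names")
    values = parameters.get("factor_values")
    errors = []
    if names and not values:
        # factor_names depends on factor_values being present.
        errors.append("factor_names was given without factor_values")
    elif names and values and len(names) != len(values):
        errors.append("factor_names and factor_values must have the same length")
    return errors


# Hypothetical parameters with mismatched list lengths.
bad_parameters = {
    "column_name": "trial_type",
    "factor_values": ["go", "stop"],
    "factor_names": ["is_go"],
}
print(check_factor_lists(bad_parameters))
```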

FactorColumnOp.NAME = 'factor_column'
FactorColumnOp.PARAMS = {
    'additionalProperties': False,
    'dependentRequired': {'factor_names': ['factor_values']},
    'properties': {
        'column_name': {
            'description': 'Name of the column for which to create one-hot factors for unique values.',
            'type': 'string'
        },
        'factor_names': {
            'description': 'Names of the resulting factor columns. If given must be same length as factor_values',
            'items': {'type': 'string'},
            'minItems': 1,
            'type': 'array',
            'uniqueItems': True
        },
        'factor_values': {
            'description': 'Specific unique column values to compute factors for (otherwise all unique values).',
            'items': {'type': 'string'},
            'minItems': 1,
            'type': 'array',
            'uniqueItems': True
        }
    },
    'required': ['column_name'],
    'type': 'object'
}