KeyMap
- class KeyMap(key_cols, target_cols=None, name='')
A map of unique column values for remapping columns.
- key_cols
A list of column names that will be hashed into the keys for the map.
- Type:
list
- target_cols
Optional list of column names that will be inserted into data and later remapped.
- Type:
list or None
- name
An optional name of this remap for identification purposes.
- Type:
str
Notes: The mapping converts every column it manages to strings. Remapping columns of other types is not supported.
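The idea of building map keys from stringified key columns can be sketched in plain Python. This is a minimal illustration, not the class's implementation: the helper `row_key`, the column names, and the use of a tuple as the key are all assumptions made for the example, and lists of dicts stand in for DataFrames.

```python
def row_key(row, key_cols):
    """Return a hashable key built from the string form of each key column."""
    return tuple(str(row[col]) for col in key_cols)


# Hypothetical event rows; the column names are illustrative only.
rows = [
    {"event_type": "go", "response": 1},
    {"event_type": "go", "response": 1},
    {"event_type": "stop", "response": 0},
]
key_cols = ["event_type", "response"]

# Map each unique key combination to the position at which it was first seen.
unique_keys = {}
for row in rows:
    unique_keys.setdefault(row_key(row, key_cols), len(unique_keys))

print(unique_keys)  # {('go', '1'): 0, ('stop', '0'): 1}
```

Because every value passes through `str`, the integer `1` and the string `'1'` produce the same key, which is the practical consequence of the string conversion described in the note.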
Methods
KeyMap.__init__ | Information for remapping columns of tabular files. |
KeyMap.make_template | Return a dataframe template. |
KeyMap.remap | Remap the columns of a dataframe or columnar file. |
KeyMap.remove_quotes | Remove quotes from the specified columns and convert to string. |
| Sort the col_map in place by the key columns. |
KeyMap.update | Update the existing map with information from data. |
Attributes
KeyMap.columns | Return the column names of the columns managed by this map. |
- KeyMap.__init__(key_cols, target_cols=None, name='')
Information for remapping columns of tabular files.
- Parameters:
key_cols (list) – List of columns to be replaced (assumed to be in the DataFrame).
target_cols (list or None) – Optional list of replacement columns (assumed not to be in the DataFrame).
name (str) – Name associated with this remap (usually a pathname of the events file).
- KeyMap.make_template(additional_cols=None, show_counts=True)
Return a dataframe template.
- Parameters:
additional_cols (list or None) – Optional list of additional columns to append to the returned dataframe.
show_counts (bool) – If True, the number of times each key combination appears is placed in the first column and the rows are sorted in descending order of that count.
- Returns:
A dataframe containing the template.
- Return type:
DataFrame
- Raises:
If additional columns are not disjoint from the key columns.
Notes
The template consists of the unique key columns in this map plus additional columns.
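The behavior described above (unique key combinations, optional extra columns, and an optional count column sorted in descending order) can be sketched without pandas. This is an illustrative sketch under stated assumptions, not the method itself: the function name `make_template_sketch`, the `key_counts` column name, the `"n/a"` filler for additional columns, and the use of `ValueError` are all hypothetical, since the source does not name them.

```python
from collections import Counter


def make_template_sketch(rows, key_cols, additional_cols=None, show_counts=True):
    """Return a list of dict rows, one per unique key combination."""
    additional_cols = additional_cols or []
    overlap = set(additional_cols) & set(key_cols)
    if overlap:
        # Assumed exception type; the documented one is not named in the source.
        raise ValueError(f"Additional columns overlap key columns: {sorted(overlap)}")

    # Count how often each stringified key combination occurs.
    counts = Counter(tuple(str(row[col]) for col in key_cols) for row in rows)
    # most_common() orders by descending count; otherwise keep first-seen order.
    ordered = counts.most_common() if show_counts else list(counts.items())

    template = []
    for key, count in ordered:
        entry = {"key_counts": count} if show_counts else {}
        entry.update(zip(key_cols, key))
        entry.update({col: "n/a" for col in additional_cols})
        template.append(entry)
    return template


example_rows = [{"trial": "go"}, {"trial": "stop"}, {"trial": "stop"}]
template = make_template_sketch(example_rows, ["trial"], additional_cols=["label"])
for entry in template:
    print(entry)
```

A template of this shape is typically filled in by hand (here, the `label` column) and then used as the source of a remapping.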
- KeyMap.remap(data)
Remap the columns of a dataframe or columnar file.
- Parameters:
data (DataFrame or str) – Columnar data (either DataFrame or filename) whose columns are to be remapped.
- Returns:
DataFrame: New dataframe with columns remapped.
list: List of row numbers that had no correspondence in the mapping.
- Return type:
tuple
- Raises:
If data is missing some of the key columns.
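The return contract (a new table plus the row numbers that had no match) can be illustrated with a small dependency-free sketch. The names `remap_sketch` and `mapping`, the `"n/a"` value written for unmatched rows, and the use of `KeyError` are assumptions for this example only; the real method works on DataFrames and its exception type is not stated above.

```python
def remap_sketch(rows, key_cols, target_cols, mapping):
    """Return (remapped_rows, missing_row_numbers) for a list of dict rows."""
    remapped, missing = [], []
    for index, row in enumerate(rows):
        absent = [col for col in key_cols if col not in row]
        if absent:
            # Assumed exception type for data that lacks key columns.
            raise KeyError(f"Row {index} is missing key columns: {absent}")
        key = tuple(str(row[col]) for col in key_cols)
        new_row = dict(row)  # copy so the input rows are left untouched
        if key in mapping:
            new_row.update(zip(target_cols, mapping[key]))
        else:
            # Assumed placeholder for rows with no correspondence in the mapping.
            new_row.update({col: "n/a" for col in target_cols})
            missing.append(index)
        remapped.append(new_row)
    return remapped, missing


# Hypothetical mapping from (event_type, response) to a single target column.
mapping = {("go", "1"): ("correct",), ("stop", "0"): ("inhibited",)}
data = [
    {"event_type": "go", "response": 1},
    {"event_type": "stop", "response": 1},
]
remapped, missing = remap_sketch(data, ["event_type", "response"], ["outcome"], mapping)
print(remapped[0]["outcome"])  # correct
print(missing)                 # [1]
```

Returning the unmatched row numbers alongside the data lets the caller decide whether an incomplete mapping is an error or merely something to report.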
- static KeyMap.remove_quotes(df, columns=None)
Remove quotes from the specified columns and convert to string.
- Parameters:
df (DataFrame) – DataFrame to process by removing quotes.
columns (list or None) – List of column names. If None, all columns are used.
Notes
Replacement is done in place.
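As a rough illustration of in-place quote removal with string conversion, the sketch below operates on a list of dicts instead of a DataFrame. The function name `remove_quotes_sketch` is hypothetical, and removing every single and double quote character (rather than only enclosing quotes) is an assumption about the intended behavior.

```python
def remove_quotes_sketch(rows, columns=None):
    """Convert the selected columns to strings and drop quote characters, in place."""
    for row in rows:
        # With columns=None, every column of the row is processed.
        selected = columns if columns is not None else list(row)
        for col in selected:
            row[col] = str(row[col]).replace('"', "").replace("'", "")


table = [{"label": '"face"', "count": 3}]
remove_quotes_sketch(table)
print(table)  # [{'label': 'face', 'count': '3'}]
```

Like the documented method, the sketch returns nothing and modifies its argument, so callers that need the original values should copy the data first.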
- KeyMap.update(data, allow_missing=True)
Update the existing map with information from data.
- Parameters:
data (DataFrame or str) – DataFrame or filename of an events file or event map.
allow_missing (bool) – If True, missing keys are allowed and are added as n/a columns.
- Raises:
If there are missing keys and allow_missing is False.
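The effect of `allow_missing` can be sketched as follows. This is a simplified, hypothetical model: `update_sketch` and `key_counts` are names invented for the example, a plain dict of counts stands in for the map, and `KeyError` is an assumed exception type.

```python
def update_sketch(key_counts, rows, key_cols, allow_missing=True):
    """Add the key combinations found in rows to key_counts, in place."""
    for index, row in enumerate(rows):
        absent = [col for col in key_cols if col not in row]
        if absent and not allow_missing:
            # Assumed exception type for missing keys when they are not allowed.
            raise KeyError(f"Row {index} is missing key columns: {absent}")
        # Absent key columns are filled with the "n/a" placeholder.
        key = tuple(str(row.get(col, "n/a")) for col in key_cols)
        key_counts[key] = key_counts.get(key, 0) + 1
    return key_counts


key_counts = {}
events = [
    {"event_type": "go", "response": 1},
    {"event_type": "go"},  # no "response" column in this row
]
update_sketch(key_counts, events, ["event_type", "response"])
print(key_counts)  # {('go', '1'): 1, ('go', 'n/a'): 1}
```

Calling the same function with `allow_missing=False` on `events` raises instead of filling in the placeholder, which mirrors the documented Raises condition.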
- KeyMap.columns
Return the column names of the columns managed by this map.
- Returns:
Column names of the columns managed by this map.
- Return type:
list