seed.lib.mappings package¶
Submodules¶
seed.lib.mappings.mapper module¶
:copyright (c) 2014 - 2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Department of Energy) and contributors. All rights reserved. :author Dan Gunter <dkgunter@lbl.gov>
- seed.lib.mappings.mapper.create_column_regexes(raw_columns)¶
- Take the raw column names, sanitize the keys, and add a regex for each.
- Parameters
- raw_columns – list of strings (column names from the imported file) 
- Returns
- list of dict 
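The sanitization rules and the keys of the returned dicts are not shown on this page, so the following is only a sketch under assumptions. The helper `sketch_column_regexes` and the keys `raw`, `key`, and `regex` are hypothetical; the sketch assumes that sanitizing means lowercasing and removing whitespace.

```python
import re


def sketch_column_regexes(raw_columns):
    """Hypothetical stand-in, not the SEED implementation."""
    results = []
    for raw in raw_columns:
        # Assumed sanitization: lowercase and drop all whitespace.
        key = re.sub(r"\s+", "", raw.lower())
        # re.escape makes characters such as "(" match literally.
        results.append({"raw": raw, "key": key, "regex": re.escape(key)})
    return results


columns = sketch_column_regexes(["Property Id", "Site EUI (kBtu/ft2)"])
for column in columns:
    print(column)
```

Escaping the sanitized key matters because raw column names often contain characters such as parentheses that would otherwise be read as regex syntax.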
 
- seed.lib.mappings.mapper.get_pm_mapping(raw_columns, mapping_data=None, resolve_duplicates=True)¶
- Create and return the Portfolio Manager (PM) mapping for a given version of PM and the given list of column names.
- The method takes the raw_columns (from the CSV/XLSX file) and attempts to normalize the column names so that they can be mapped to the data in pm-mapping.json['from_field']. 
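As a rough illustration of matching normalized column names against `from_field` entries, the sketch below uses a hypothetical helper `sketch_pm_lookup`. The normalization rule (ignore case and whitespace), the `to_field` key, and the sample entries are assumptions for illustration; the real pm-mapping.json and the real return value of get_pm_mapping may differ.

```python
def normalize(name):
    # Assumed normalization: ignore case and all whitespace.
    return "".join(name.lower().split())


def sketch_pm_lookup(raw_columns, mapping_data):
    """Hypothetical lookup of raw columns against 'from_field' entries."""
    by_from_field = {normalize(m["from_field"]): m for m in mapping_data}
    matches = {}
    for raw in raw_columns:
        entry = by_from_field.get(normalize(raw))
        # Unmatched columns are recorded as None rather than dropped.
        matches[raw] = entry["to_field"] if entry else None
    return matches


# Illustrative entries only, not copied from pm-mapping.json.
mapping_data = [
    {"from_field": "Address 1", "to_field": "address_line_1"},
    {"from_field": "Year Built", "to_field": "year_built"},
]
matches = sketch_pm_lookup(["address 1", "YEAR  BUILT", "Unknown"], mapping_data)
print(matches)
```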
seed.lib.mappings.mapping_columns module¶
:copyright (c) 2014 - 2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Department of Energy) and contributors. All rights reserved. :author Nicholas Long <nicholas.long@nrel.gov>
- class seed.lib.mappings.mapping_columns.MappingColumns(raw_columns, dest_columns, previous_mapping=None, map_args=None, default_mappings=None, threshold=0)¶
- Bases: object
- This class handles the probabilistic mapping of unknown columns to defined fields. This is mainly used in the build_column_mapping API endpoint.
 - add_mappings(raw_column, mappings, previous_mapping=False)¶
- Add mappings to the data structure for later processing.
- Parameters
- raw_column – list of strings 
- mappings – list of tuples of potential mappings and confidences 
- previous_mapping – boolean, if true these mappings will take precedence 
 
- Returns
- Bool, whether or not the mapping was added 
 
 - apply_threshold(threshold)¶
- Remove mapping suggestions that do not meet the defined threshold.
- This method is currently always run as part of the workflow, but could easily be made a separate call.
- Parameters
- threshold – int, minimum confidence; a suggestion is kept only if its confidence is greater than or equal to this value. 
- Returns
- None 
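The threshold rule can be sketched as a simple filter. The function `keep_at_or_above` is a hypothetical name, and the sketch assumes suggestions are `(table, field, confidence)` tuples as described for first_suggested_mapping; the real method works on the class's internal data and returns None.

```python
def keep_at_or_above(mappings, threshold):
    """Keep suggestions whose confidence is >= threshold (hypothetical helper)."""
    return [m for m in mappings if m[2] >= threshold]


# Illustrative suggestions; table and field names are placeholders.
suggestions = [
    ("table", "address_line_1", 90),
    ("table", "city", 42),
    ("table", "postal_code", 25),
]
kept = keep_at_or_above(suggestions, 42)
print(kept)
```

A suggestion whose confidence equals the threshold is kept, matching the "greater than or equal to" wording above.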
 
 - property duplicates¶
- Check for duplicate initial mapping results. Duplicates exist if the first suggested mappings for two different raw_columns are the same. The example below would be one of those cases.
- Returns
- List of raw col 
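One way to picture the duplicate check is to group raw columns by their first suggested `(table, field)` pair and report any group with more than one member. This is a sketch under assumptions: `find_duplicates` is a hypothetical helper, and it returns a dict keyed by the contested target, which may not match the shape the property actually returns.

```python
from collections import defaultdict


def find_duplicates(first_suggestions):
    """Group raw columns by first suggestion; keep groups with more than one."""
    groups = defaultdict(list)
    for raw_column, mapping in first_suggestions.items():
        if mapping:  # an empty tuple means there is no suggestion
            table, field, _confidence = mapping
            groups[(table, field)].append(raw_column)
    return {target: cols for target, cols in groups.items() if len(cols) > 1}


# Illustrative data: two raw columns share the same first suggestion.
first_suggestions = {
    "Addr": ("table", "address_line_1", 95),
    "Address": ("table", "address_line_1", 100),
    "Town": ("table", "city", 80),
    "Notes": (),
}
duplicates = find_duplicates(first_suggestions)
print(duplicates)
```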
 
 - property final_mappings¶
- Return the final mappings in a format that can be used downstream from this method:
    {
        "raw_column_1": ('table', 'db_column_1', confidence),
        "raw_column_2": ('table', 'db_column_1', confidence),
    }
 - first_suggested_mapping(raw_column)¶
- Grab the first suggested mapping for a raw column.
- Parameters
- raw_column – String 
- Returns
- tuple of the mapping (‘table’, ‘field’, confidence), or () 
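The return contract (a tuple, or `()` when nothing is suggested) can be illustrated with a small stand-alone function. `first_suggestion` and the shape of `data` (raw column name mapped to a ranked list of tuples) are assumptions made for this sketch, not the class's actual internals.

```python
def first_suggestion(data, raw_column):
    """Return the top-ranked (table, field, confidence) tuple, or ()."""
    suggestions = data.get(raw_column, [])
    return suggestions[0] if suggestions else ()


# Illustrative data, already ordered from highest to lowest confidence.
data = {
    "Address": [("table", "address_line_1", 100), ("table", "city", 10)],
    "Notes": [],
}
print(first_suggestion(data, "Address"))
print(first_suggestion(data, "Notes"))
```

Returning an empty tuple instead of None lets callers use a plain truthiness check before unpacking.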
 
 - resolve_duplicate(dup_map_field, raw_columns)¶
- If there are duplicates, that is, two raw_columns are trying to map to the same suggested column, then select the next available suggestion for the raw columns that lose the duplicate. The one with the highest confidence will 'win' the duplicate battle.
- Parameters
- dup_map_field – String, name of the field that is a duplicate 
- raw_columns – list, raw columns that mapped to the same result 
 
- Returns
- None 
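The resolution rule can be sketched as follows. This is an assumption-laden illustration, not the class's code: `sketch_resolve_duplicate` is hypothetical, `data` is assumed to map each raw column to a ranked list of `(table, field, confidence)` tuples, and unlike the real method it returns the winner for easy inspection.

```python
def sketch_resolve_duplicate(data, dup_map_field, raw_columns):
    """Hypothetical resolution; mutates `data` and returns the winning column."""
    # The raw column whose first suggestion has the highest confidence wins.
    winner = max(raw_columns, key=lambda raw: data[raw][0][2])
    for raw in raw_columns:
        if raw != winner:
            # Losers drop every suggestion for the contested field, so their
            # next-ranked suggestion (if any) moves to the front.
            data[raw] = [m for m in data[raw] if m[1] != dup_map_field]
    return winner


# Illustrative data: both columns first suggest address_line_1.
data = {
    "Addr": [("table", "address_line_1", 95), ("table", "address_line_2", 60)],
    "Address": [("table", "address_line_1", 100), ("table", "city", 10)],
}
winner = sketch_resolve_duplicate(data, "address_line_1", ["Addr", "Address"])
print(winner, data["Addr"])
```

The sketch compares field names only and ignores the table, which is a simplification.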
 
 - set_initial_mapping_cmp(raw_column)¶
- Set the initial_mapping_cmp helper item in the self.data hash. This is used to detect if there are any duplicates. The initial mapping cmp will be the first match in the list (i.e., the one with the highest confidence).
- Parameters
- raw_column – String, name of the raw column to set the initial_mapping_cmp 
- Returns
- None 
 
 
- seed.lib.mappings.mapping_columns.sort_duplicates(a, b)¶
- Custom sort for the duplicate hash to decide which raw column will get the mapping suggestion based on the confidence. 
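A two-argument comparator like this one is typically passed to `sorted` through `functools.cmp_to_key`. The sketch below assumes each item is a dict with a `confidence` key and that higher confidence should sort first; the comparator name and the item shape are hypothetical, since the structure of the duplicate hash is not shown here.

```python
from functools import cmp_to_key


def by_confidence_desc(a, b):
    """Hypothetical comparator: negative when `a` should come before `b`."""
    return b["confidence"] - a["confidence"]


# Illustrative candidates competing for the same mapping suggestion.
candidates = [
    {"raw_column": "Addr", "confidence": 95},
    {"raw_column": "Address", "confidence": 100},
    {"raw_column": "Street", "confidence": 40},
]
ranked = sorted(candidates, key=cmp_to_key(by_confidence_desc))
print([c["raw_column"] for c in ranked])
```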
seed.lib.mappings.mapping_data module¶
seed.lib.mappings.test_mapper module¶
seed.lib.mappings.test_mapping_columns module¶
seed.lib.mappings.test_mapping_data module¶
Module contents¶
:copyright (c) 2014 - 2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Department of Energy) and contributors. All rights reserved. :author