Data Importer Package

Submodules

Managers

:copyright (c) 2014 - 2022, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Department of Energy) and contributors. All rights reserved.

class seed.data_importer.managers.NotDeletedManager(*args, **kwargs)

Bases: django.db.models.manager.Manager

get_all(*args, **kwargs): Returns all ImportFiles, including those where deleted == True, which are normally excluded. This is used for database/filesystem cleanup.

get_queryset(*args, **kwargs): Return a new QuerySet object. Subclasses can override this method to customize the behavior of the Manager.

use_for_related_fields = True

Models

URLs

Utils


class seed.data_importer.utils.CoercionRobot

Bases: object

lookup_hash(uncoerced_value, destination_model, destination_field)

make_key(value, model, field)

seed.data_importer.utils.acquire_lock(name, expiration=None)

Tries to acquire a lock from the cache. Also sets the lock’s value to the current time, allowing us to see how long it has been held.

Returns False if the lock is already held by another process.

seed.data_importer.utils.chunk_iterable(iterlist, chunk_size): Breaks an iterable (e.g. a list) into smaller chunks, returning a generator of the chunks.
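The chunking behavior described above can be illustrated with a minimal standalone sketch. This is not the SEED implementation; the name chunk_iterable_sketch and the choice to yield lists are assumptions made for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk_iterable_sketch(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most chunk_size items (illustrative only)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    iterator = iter(items)
    while True:
        # islice pulls the next chunk_size items without copying the whole input.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


chunks = list(chunk_iterable_sketch([1, 2, 3, 4, 5], 2))
```

Because the function is a generator, large inputs are processed one chunk at a time rather than materialized up front.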

seed.data_importer.utils.get_core_pk_column(table_column_mappings, primary_field)

seed.data_importer.utils.get_lock_time(name): Examines a lock to see when it was acquired.

seed.data_importer.utils.kbtu_thermal_conversion_factors(country)

Returns thermal conversion factors provided by Portfolio Manager. In the PM app, using NREL’s test account, a property was created for each of US and CAN. All possible Meters of different Type and Units were added. Readings of value 1 were added to deduce the factors provided below.

Consideration was given to having the provided ‘country’ value align with Organizations’ thermal_conversion_assumption enums. Even though these two should be aligned, the concept of and need for these factors are not specific solely to Orgs, so the ‘country’ value here is expected to be a string. Specifically, there are instances in the codebase where the factors are needed irrespective of any Organization’s preferences.

seed.data_importer.utils.release_lock(name): Frees a lock.
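The way acquire_lock, get_lock_time, and release_lock fit together can be sketched with an in-memory dictionary standing in for the cache. Everything here is an assumption made for illustration: the _sketch function names, the _FAKE_CACHE dictionary, and the omission of the expiration argument. A plain dictionary is also not safe across processes, which a real cache-backed lock has to be.

```python
import time
from typing import Dict, Optional

# Hypothetical stand-in for the cache backend; maps lock name -> acquisition time.
_FAKE_CACHE: Dict[str, float] = {}


def acquire_lock_sketch(name: str) -> bool:
    """Take the lock and record the current time, or return False if it is held."""
    if name in _FAKE_CACHE:
        return False
    _FAKE_CACHE[name] = time.time()
    return True


def get_lock_time_sketch(name: str) -> Optional[float]:
    """Return the timestamp stored when the lock was acquired, or None if not held."""
    return _FAKE_CACHE.get(name)


def release_lock_sketch(name: str) -> None:
    """Free the lock; releasing a lock that is not held is a no-op."""
    _FAKE_CACHE.pop(name, None)
```

Storing the acquisition time as the lock's value is what lets a caller work out how long the lock has been held.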

seed.data_importer.utils.usage_point_id(raw_source_id): Extracts and returns the usage point ID of a GreenButton full uri ID.
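One way such an extraction could work is sketched below. The regular expression, the example URI, and the name usage_point_id_sketch are all assumptions: the sketch supposes the ID is the run of digits following a /UsagePoint/ path segment, which may differ from what the SEED function actually returns.

```python
import re
from typing import Optional

# Assumed pattern: the ID is the digit run that follows a "/UsagePoint/" segment.
_USAGE_POINT_RE = re.compile(r"/UsagePoint/(\d+)")


def usage_point_id_sketch(raw_source_id: str) -> Optional[str]:
    """Return the usage point ID found in the URI, or None if there is none."""
    match = _USAGE_POINT_RE.search(raw_source_id)
    return match.group(1) if match else None


# Made-up URI used purely for illustration.
example_uri = "https://example.com/espi/1_1/resource/Subscription/1/UsagePoint/409483/MeterReading/1"
extracted = usage_point_id_sketch(example_uri)
```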

Views¶


class seed.data_importer.views.ImportFileViewSet(**kwargs)

Bases: rest_framework.viewsets.ViewSet

data_quality_progress(request, pk=None)

Return the progress of the data quality check.

type:
    status:
        required: true
        type: string
        description: either success or error
    progress:
        type: integer
        description: status of background data quality task
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path

destroy(request, pk)

Deletes the import file identified by pk.

type:
    status:
        required: true
        type: string
        description: Either success or error
parameter_strategy: replace
parameters:
    - name: pk
      description: import_file_id
      required: true
      paramType: path
    - name: organization_id
      description: The organization_id for this user’s organization
      required: true
      paramType: query

filtered_mapping_results(request, pk=None)

Retrieves a paginated list of Properties and Tax Lots for an import file after mapping.

parameter_strategy: replace
parameters:
    - name: pk
      description: Import File ID (Primary key)
      type: integer
      required: true
      paramType: path
response_serializer: MappingResultsResponseSerializer

first_five_rows(request, pk=None)

Retrieves the first five rows of an ImportFile.

type:
    status:
        required: true
        type: string
        description: either success or error
    first_five_rows:
        type: array of strings
        description: list of strings for each of the first five rows for this import file
parameter_strategy: replace
parameters:
    - name: pk
      description: "Primary Key"
      required: true
      paramType: path

static has_coparent(state_id, inventory_type, fields=None)

Return the coparent of the current state id based on the inventory type. If fields are given (as a list), then it will only return the fields specified of the state model object as a dictionary.

Parameters

state_id – int, ID of PropertyState or TaxLotState
inventory_type – string, either properties | taxlots
fields – list, either None or list of fields to return

Returns

dict or state object; if fields is None the state object is returned, otherwise a dict of the requested fields

mapping_done(request, pk=None)

Tell the backend that the mapping is complete.

type:
    status:
        required: true
        type: string
        description: either success or error
    message:
        required: false
        type: string
        description: error message, if any
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path

mapping_suggestions(request, pk)

Returns suggested mappings from an uploaded file’s headers to known data fields.

type:
    status:
        required: true
        type: string
        description: Either success or error
    suggested_column_mappings:
        required: true
        type: dictionary
        description: Dictionary where (key, value) = (the column header from the file, array of tuples (destination column, score))
    building_columns:
        required: true
        type: array
        description: A list of all possible columns
    building_column_types:
        required: true
        type: array
        description: A list of column types corresponding to the building_columns array
parameter_strategy: replace
parameters:
    - name: pk
      description: import_file_id
      required: true
      paramType: path
    - name: organization_id
      description: The organization_id for this user’s organization
      required: true
      paramType: query

matching_and_geocoding_results(request, pk=None)

Retrieves the number of matched and unmatched properties and tax lots for a given ImportFile record, specifically for new imports.

GET: Expects import_file_id corresponding to the ImportFile in question.

Returns:

{
    'status': 'success',
    'properties': {
        'matched': Number of PropertyStates that have been matched,
        'unmatched': Number of PropertyStates that are unmatched new imports
    },
    'tax_lots': {
        'matched': Number of TaxLotStates that have been matched,
        'unmatched': Number of TaxLotStates that are unmatched new imports
    }
}

perform_mapping(request, pk=None)

Starts a background task to convert imported raw data into PropertyState and TaxLotState, using the user’s column mappings.

type:
    status:
        required: true
        type: string
        description: either success or error
    progress_key:
        type: integer
        description: ID of background job, for retrieving job progress
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path

queryset = <QuerySet [<ImportFile: /Users/nlong/working/seed/seed/media/uploads/example-data-properties.xlsx>, <ImportFile: /Users/nlong/working/seed/seed/media/uploads/building-performance-standards-sample-2023.xlsx>]>

raise_exception = True

raw_column_names(request, pk=None)

Retrieves a list of all column names from an ImportFile.

type:
    status:
        required: true
        type: string
        description: either success or error
    raw_columns:
        type: array of strings
        description: list of strings of the header row of the ImportFile
parameter_strategy: replace
parameters:
    - name: pk
      description: "Primary Key"
      required: true
      paramType: path

retrieve(request, pk=None)

Retrieves details about an ImportFile.

type:
    status:
        required: true
        type: string
        description: either success or error
    import_file:
        type: ImportFile structure
        description: full detail of import file
parameter_strategy: replace
parameters:
    - name: pk
      description: "Primary Key"
      required: true
      paramType: path

save_column_mappings(request, pk=None)

Saves the mappings between the raw headers of an ImportFile and the destination fields in the to_table_name model, which should be either PropertyState or TaxLotState.

Valid source_type values are found in seed.models.SEED_DATA_SOURCES.

Payload:

{
    "import_file_id": ID of the ImportFile record,
    "mappings": [
        {
            'from_field': 'eui',  # raw field in import file
            'from_units': 'kBtu/ft**2/year', # pint-parsable units, optional
            'to_field': 'energy_use_intensity',
            'to_field_display_name': 'Energy Use Intensity',
            'to_table_name': 'PropertyState',
        },
        {
            'from_field': 'gfa',
            'from_units': 'ft**2', # pint-parsable units, optional
            'to_field': 'gross_floor_area',
            'to_field_display_name': 'Gross Floor Area',
            'to_table_name': 'PropertyState',
        }
    ]
}

Returns:

{'status': 'success'}
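A client could assemble the payload shown above along the following lines. This is only a sketch: build_mapping is a hypothetical helper, the import_file_id value is a placeholder, and the check on to_table_name simply mirrors the two table names mentioned in the description rather than any validation SEED itself performs.

```python
import json
from typing import Any, Dict, Optional

# The two destination tables named in the description above.
VALID_TABLES = ("PropertyState", "TaxLotState")


def build_mapping(from_field: str, to_field: str, to_field_display_name: str,
                  to_table_name: str, from_units: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical helper: assemble one entry of the "mappings" list."""
    if to_table_name not in VALID_TABLES:
        raise ValueError(f"to_table_name must be one of {VALID_TABLES}")
    mapping: Dict[str, Any] = {
        "from_field": from_field,
        "to_field": to_field,
        "to_field_display_name": to_field_display_name,
        "to_table_name": to_table_name,
    }
    # from_units is optional, so it is only included when supplied.
    if from_units is not None:
        mapping["from_units"] = from_units
    return mapping


payload = {
    "import_file_id": 1,  # placeholder ID, not a real record
    "mappings": [
        build_mapping("eui", "energy_use_intensity", "Energy Use Intensity",
                      "PropertyState", "kBtu/ft**2/year"),
        build_mapping("gfa", "gross_floor_area", "Gross Floor Area",
                      "PropertyState", "ft**2"),
    ],
}
body = json.dumps(payload)  # JSON string suitable for a request body
```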

save_raw_data(request, pk=None)

Starts a background task to import raw data from an ImportFile into PropertyState objects as extra_data. If the cycle_id is set to year_ending then the cycle ID will be set to the year_ending column for each record in the uploaded file. Note that the year_ending flag is not yet enabled.

type:
    status:
        required: true
        type: string
        description: either success or error
    message:
        required: false
        type: string
        description: error message, if any
    progress_key:
        type: integer
        description: ID of background job, for retrieving job progress
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path
    - name: cycle_id
      description: The ID of the cycle or the string "year_ending"
      paramType: string
      required: true

start_data_quality_checks(request, pk=None)

Starts a background task to run data quality checks on the records in an ImportFile.

type:
    status:
        required: true
        type: string
        description: either success or error
    progress_key:
        type: integer
        description: ID of background job, for retrieving job progress
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path

start_system_matching_and_geocoding(request, pk=None)

Starts a background task to attempt automatic matching between buildings in an ImportFile and other existing buildings within the same org.

type:
    status:
        required: true
        type: string
        description: either success or error
    progress_key:
        type: integer
        description: ID of background job, for retrieving job progress
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path

validate_use_cases(request, pk=None)

Starts a background task to call BuildingSync’s use case validation tool.

type:
    status:
        required: true
        type: string
        description: either success or error
    message:
        required: false
        type: string
        description: error message, if any
    progress_key:
        type: integer
        description: ID of background job, for retrieving job progress
parameter_strategy: replace
parameters:
    - name: pk
      description: Import file ID
      required: true
      paramType: path

class seed.data_importer.views.LocalUploaderViewSet(**kwargs)

Bases: rest_framework.viewsets.ViewSet

Endpoint for uploading data files when using local file storage. Valid source_type values are found in seed.models.SEED_DATA_SOURCES.

Returns:

{
    'success': True,
    'import_file_id': The ID of the newly-uploaded ImportFile
}

create(request)

Upload a new file to an import_record. This is a multipart/form upload.

parameters:
    - name: import_record
      description: the ID of the ImportRecord to associate this file with.
      required: true
      paramType: body
    - name: source_type
      description: the type of file (e.g. ‘Portfolio Raw’ or ‘Assessed Raw’)
      required: false
      paramType: body
    - name: source_program_version
      description: the version of the file as related to the source_type
      required: false
      paramType: body
    - name: file or qqfile
      description: In-memory file object
      required: true
      paramType: Multipart

create_from_pm_import(request)

Create an import_record from a PM import request. This allows the PM import workflow to be treated essentially the same as a standard file upload.

TODO: The properties key here is going to be an enormous amount of XML data at times; this needs to change.

The process comprises the following steps:

Get a unique file name for this portfolio manager import

parameters:
    - name: import_record
      description: the ID of the ImportRecord to associate this file with.
      required: true
      paramType: body
    - name: properties
      description: In-memory list of properties from PM import
      required: true
      paramType: body

class seed.data_importer.views.MappingResultsPropertySerializer(*args, **kwargs): Bases: rest_framework.serializers.Serializer

class seed.data_importer.views.MappingResultsResponseSerializer(*args, **kwargs): Bases: rest_framework.serializers.Serializer

class seed.data_importer.views.MappingResultsTaxLotSerializer(*args, **kwargs): Bases: rest_framework.serializers.Serializer

seed.data_importer.views.convert_first_five_rows_to_list(header, first_five_rows)

Return the first five rows. This is a complicated method because it converts the persisted format of the first five rows into a list of dictionaries. It handles some basic logic if there are CRLF characters in the fields. Note that this method does not cover all the use cases, and cannot because of the custom delimiter. See the tests in test_views.py:test_get_first_five_rows_newline_should_work for the limitation.

Parameters

header – list, ordered list of headers as strings
first_five_rows – string, long string with |#*#| delimiter.

Returns

list
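The basic conversion can be sketched as follows for the simple case only. The sketch assumes that each persisted row sits on its own line, that cells within a row are joined by the |#*#| delimiter, and that no cell contains a newline; the embedded-newline handling mentioned above is deliberately left out. The name rows_to_dicts_sketch is hypothetical.

```python
from typing import Dict, List

ROW_DELIMITER = "|#*#|"  # delimiter named in the parameter description above


def rows_to_dicts_sketch(header: List[str], first_five_rows: str) -> List[Dict[str, str]]:
    """Split the persisted string into one dictionary per row (simple case only)."""
    rows: List[Dict[str, str]] = []
    for line in first_five_rows.splitlines():
        if not line:
            continue  # skip blank lines
        cells = line.split(ROW_DELIMITER)
        # Pair each header with the cell in the same position.
        rows.append(dict(zip(header, cells)))
    return rows


parsed = rows_to_dicts_sketch(["id", "name"], "1|#*#|Alpha\n2|#*#|Beta")
```

A cell that itself contains a line break would be split into two rows by this sketch, which is the kind of case the description says cannot be fully resolved.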

seed.data_importer.views.get_upload_details(self, request, *args, **kwargs)

Retrieves details about how to upload files to this instance.

Returns:

{
    'upload_path': The url to POST files to (see local_uploader)
}

Module contents
