Datasets¶
Datasets Loader¶
-
class
demod.datasets.base_loader.
DatasetLoader
(allow_pickle=False, version=None, clear_parsed_data=False)¶ Base class for loading Datasets.
Helps with parsing, saving and loading files, and with retrieving the access paths to datasets. It also provides methods for raising warnings and errors.
-
DATASET_NAME
¶ The name of the dataset folder, should always be specified.
- Type
str
-
raw_path
¶ The path of the raw data folder, which can be used to access raw files.
- Type
str
-
parsed_path
¶ The path of the parsed data folder, which can be used to access parsed files.
- Type
str
-
version
¶ The version name of the data.
- Type
str
-
-
class
demod.datasets.tou_loader.
LoaderTOU
(activity_type='4_States', /, **kwargs)¶ Loader for Time Of Use survey data.
Supports different kinds of activity parsing.
-
activity_type
¶ The name of the activity type loaded.
- Type
str
-
refresh_time
¶ A time object specifying the time at which the TOU is refreshed.
- Type
datetime.time
-
load_tpm
(subgroup)¶ Load a transition probability matrix for the requested subgroup.
- Parameters
subgroup (Subgroup) – The desired subgroup
- Returns
the transition probability matrix, the labels and the initial pdf
- Return type
Tuple[numpy.ndarray, Union[List[str], List[int], numpy.ndarray], numpy.ndarray]
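A minimal sketch of how the three returned values could be used to draw a sequence of states. It assumes, which is not stated above, that the TPM is stacked per time step with shape (n_times, n_states, n_states) and that each row sums to 1. The toy arrays and the helper simulate_states are invented for illustration and stand in for the output of load_tpm().

```python
import numpy as np

# Toy stand-ins for the values returned by load_tpm() (hypothetical data).
labels = ["away", "asleep", "active"]
initial_pdf = np.array([0.1, 0.7, 0.2])
# Assumed layout: one (n_states, n_states) matrix per time step.
tpm = np.array([
    [[0.8, 0.0, 0.2], [0.0, 0.9, 0.1], [0.3, 0.1, 0.6]],
    [[0.7, 0.1, 0.2], [0.0, 0.8, 0.2], [0.2, 0.2, 0.6]],
])


def simulate_states(tpm, initial_pdf, rng):
    """Draw an initial state, then one transition per time step."""
    n_states = len(initial_pdf)
    state = rng.choice(n_states, p=initial_pdf)
    states = [state]
    for step_matrix in tpm:
        # Row `state` holds the probabilities of the next state.
        state = rng.choice(n_states, p=step_matrix[state])
        states.append(state)
    return np.array(states)


rng = np.random.default_rng(seed=0)
states = simulate_states(tpm, initial_pdf, rng)
print([labels[s] for s in states])
```

The labels are only used to translate state indices back into readable names.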
-
load_tpm_with_duration
(subgroup)¶ Load TPMs and durations for the requested subgroup.
- Parameters
subgroup (Subgroup) – The desired subgroup
- Returns
the transition probability matrix, the duration pdfs, the labels, the initial state pdf and the initial durations pdf
- Return type
Tuple[numpy.ndarray, numpy.ndarray, Union[List[str], List[int], numpy.ndarray], numpy.ndarray, numpy.ndarray]
-
load_sparse_tpm
(subgroup)¶ Load a sparse transition probability matrix.
This can be used as data input by any
SparseStatesSimulator.
- Parameters
subgroup (Subgroup) – The subgroup of the requested TPM
- Returns
sparse_TPM, states_labels, activity_labels, initial_pdf
- Return type
Tuple[demod.utils.sparse.SparseTPM, Union[List[str], List[int], numpy.ndarray], Union[List[str], numpy.ndarray], numpy.ndarray]
-
load_activity_probability_profiles
(subgroup)¶ Return the activity probability profiles for a subgroup.
This can be used by an
ApplianceSimulator
that requires the activity probability profiles. The probability profiles are based on how many active occupants are in the house. If you only want the probability that the activity is occurring, use
load_activity_probabilities().
Activity profiles come as a dict mapping key -> np.array:
* key: the activity name
* array shape: DIM0: time, DIM1: active occupants
- Parameters
subgroup (Subgroup) – requested subgroup activities profile
- Returns
activity_profiles_dict, A dictionary of daily activity profiles, where the key is the activity, and the profiles are arrays of shape DIM0:n_times, DIM1:active_occupancy.
- Return type
Dict[str, numpy.ndarray]
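As an illustration of the array layout described above, the sketch below looks up a single probability. The activity names, the 144 daily steps and the convention that column 0 means "nobody active" are assumptions made for the example, not facts about the dataset.

```python
import numpy as np

n_times = 144             # assumed: 10-minute steps over one day
max_active_occupants = 2  # assumed household size

# Hypothetical profiles standing in for the returned dictionary.
# Axis 0 is the time step, axis 1 the number of active occupants
# (column 0 is assumed to mean "nobody active").
activity_profiles = {
    "cooking": np.zeros((n_times, max_active_occupants + 1)),
    "watching_tv": np.zeros((n_times, max_active_occupants + 1)),
}
activity_profiles["cooking"][72, 1:] = [0.20, 0.35]       # around noon
activity_profiles["watching_tv"][126, 1:] = [0.40, 0.60]  # around 21:00


def activity_probability(profiles, activity, time_index, active_occupants):
    """Look up one probability from a (time, active occupants) profile."""
    return float(profiles[activity][time_index, active_occupants])


print(activity_probability(activity_profiles, "cooking", 72, 2))
```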
-
load_daily_activity_starts
(subgroup)¶ Return the probability of performing an activity in a day.
Can vary depending on the subgroup.
A dictionary containing pdfs of how many times an activity is performed during a day. The i-th element is the probability that the activity is performed i times in a day.
Arrays can be of variable length.
# For example { 'activity1': [0.3, 0.2, 0.5, 0.0], 'activity2': [0.3, 0.2, 0.0, 0.3, 0.2], ... }
- Parameters
subgroup (Subgroup) –
- Return type
Dict[str, numpy.ndarray]
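The pdfs can be sampled directly, since the index of each element is the number of daily occurrences. The sketch below reuses the example values shown above; the helper names are hypothetical.

```python
import numpy as np

# Same values as the example above, wrapped in arrays.
daily_starts = {
    "activity1": np.array([0.3, 0.2, 0.5, 0.0]),
    "activity2": np.array([0.3, 0.2, 0.0, 0.3, 0.2]),
}


def sample_daily_starts(pdfs, rng):
    """Draw how many times each activity is performed in one day."""
    return {name: int(rng.choice(len(pdf), p=pdf)) for name, pdf in pdfs.items()}


def expected_daily_starts(pdfs):
    """Mean number of daily occurrences of each activity."""
    return {name: float(np.arange(len(pdf)) @ pdf) for name, pdf in pdfs.items()}


rng = np.random.default_rng(seed=0)
sampled = sample_daily_starts(daily_starts, rng)
expected = expected_daily_starts(daily_starts)
print(sampled, expected)
```

Because the arrays can have different lengths, each activity is handled separately rather than stacking the pdfs into one matrix.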
-
load_activity_duration
(subgroup)¶ Return the probability of activity duration.
Can vary depending on the subgroup.
A dictionary containing pdfs of how long each activity lasts. The i-th element is the probability that the duration is i*step_size.
Element 0 means that the duration is smaller than step_size.
# For example { 'activity1': [0.0, 0.3, 0.2, 0.5, 0.0], 'activity2': [0.0, 0.3, 0.2, 0.0, 0.3, 0.2], ... }
- Parameters
subgroup (Subgroup) –
- Return type
Dict[str, numpy.ndarray]
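A short sketch of turning a duration pdf into actual durations, using the example values above. The 10-minute step_size is an assumption chosen for the example, and both helpers are hypothetical.

```python
import datetime

import numpy as np

step_size = datetime.timedelta(minutes=10)  # assumed step size

duration_pdfs = {
    "activity1": np.array([0.0, 0.3, 0.2, 0.5, 0.0]),
    "activity2": np.array([0.0, 0.3, 0.2, 0.0, 0.3, 0.2]),
}


def sample_duration(pdf, step_size, rng):
    """Draw a duration; index 0 (shorter than one step) maps to zero here."""
    n_steps = int(rng.choice(len(pdf), p=pdf))
    return n_steps * step_size


def mean_duration(pdf, step_size):
    """Expected duration implied by the pdf."""
    return float(np.arange(len(pdf)) @ pdf) * step_size


rng = np.random.default_rng(seed=0)
drawn = sample_duration(duration_pdfs["activity1"], step_size, rng)
mean1 = mean_duration(duration_pdfs["activity1"], step_size)
print(drawn, mean1)
```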
-
load_activity_probabilities
(subgroup)¶ Return the probability that the activity is performed.
Probabilities are given at each step during the day, for each activity. Each value is the probability of doing that activity at that time rather than another one.
- Parameters
subgroup (Subgroup) –
- Return type
Dict[str, numpy.ndarray]
-
-
class
demod.datasets.base_loader.
PopulationLoader
(**kwargs)¶ Loader for population data.
-
load_population_subgroups
(population_type)¶ Load the subgroups of a population and their proportions.
The population refers to the population of households. Returns the list of subgroups, the proportion of each subgroup in the population and the total number of households for this population. A different splitting can be specified using the
population_type
argument.
- Returns
subgroups_list, subgroup_prob, total_population
- Parameters
population_type (str) –
- Return type
Tuple[List[Subgroup], List[float], int]
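The returned proportions can be used to split a simulated sample of households between subgroups. The sketch below is only an illustration: the subgroup keys, the proportions and the total are invented, and subgroups are assumed to be plain dictionaries.

```python
import numpy as np

# Hypothetical values standing in for the output of load_population_subgroups().
subgroups_list = [{"n_residents": 1}, {"n_residents": 2}, {"n_residents": 3}]
subgroup_prob = [0.40, 0.35, 0.25]
total_population = 1000


def households_per_subgroup(probabilities, n_households, rng):
    """Randomly split n_households between subgroups following the proportions."""
    return rng.multinomial(n_households, probabilities)


rng = np.random.default_rng(seed=1)
counts = households_per_subgroup(subgroup_prob, 50, rng)
for subgroup, count in zip(subgroups_list, counts):
    print(subgroup, int(count))
```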
-
-
class
demod.datasets.base_loader.
ClimateLoader
(**kwargs)¶ Loader providing methods for loading climate data.
-
step_size
¶ The time between two different data points from the historical data.
- Type
datetime.timedelta
-
Create a climate loader.
- Parameters
version – The version of the dataset used.
allow_pickle – Whether to allow pickle. Keep it False unless you know what you are doing.
-
load_clearness_tpms
()¶ Return TPM for the clearness of the sky, with the labels.
The TPM contains the probabilities that the sky clearness changes at each step.
- Returns
The TPM of clearness.
The labels, containing the clearness value of each TPM state.
The step size of the TPM, i.e. the resolution of the transitions.
- Return type
Tuple[numpy.ndarray, Union[List[str], List[int], numpy.ndarray], datetime.timedelta]
-
load_temperatures_arma
()¶ Load the parameters for a temperature ARMA model.
- Returns
A dictionary containing the parameters of the ARMA model.
- Return type
arma_dict
-
load_geographic_data
()¶ Return a dictionary with geographic information on the dataset.
- Returns
geo_dic, the geographic data dictionary, with the available keys:
'country': the country where the data is collected
'latitude': in degrees
'longitude': in degrees
'meridian': in degrees
'use_daylight_saving_time': whether to use the time shift
- Return type
Dict[str, Union[str, float]]
-
load_historical_climate_data
(start_datetime)¶ Load historical data starting from the requested time.
Provides the data point at the start of the simulation using
start_datetime
, or the closest one before the start.
- Parameters
start_datetime (datetime.datetime) – a datetime object, specifying the start of the required data.
- Returns
climate_dict, a dictionary with the following possible keys:
'datetime': the time stored as numpy 'datetime64'; this is the only mandatory key. The datetime array should be in UTC.
'outside_temperature': the temperature of the air [°C]
'radiation_diffuse': diffuse radiation at surface [W/m^2]
'radiation_direct': direct radiation at surface [W/m^2]
'radiation_global': global radiation at surface [W/m^2]
'radiation': same as radiation_global [W/m^2]
- Return type
Dict[str, numpy.ndarray]
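The rule "the data point at start_datetime, or the closest one before it" can be sketched with np.searchsorted on the 'datetime' array. The hourly toy data and the helper index_at_or_before are assumptions made for the example; the loader's own implementation may differ.

```python
import datetime

import numpy as np

# Hypothetical hourly data standing in for a returned climate_dict (UTC).
start = np.datetime64("2014-01-01T00:00")
climate_dict = {
    "datetime": start + np.arange(24) * np.timedelta64(1, "h"),
    "outside_temperature": np.linspace(-2.0, 3.0, 24),
}


def index_at_or_before(times, start_datetime):
    """Index of the last data point that is not after start_datetime."""
    # start_datetime is assumed to be a naive datetime expressed in UTC.
    target = np.datetime64(start_datetime).astype(times.dtype)
    index = int(np.searchsorted(times, target, side="right")) - 1
    if index < 0:
        raise ValueError("start_datetime is before the first data point")
    return index


idx = index_at_or_before(
    climate_dict["datetime"], datetime.datetime(2014, 1, 1, 5, 30)
)
print(climate_dict["datetime"][idx], climate_dict["outside_temperature"][idx])
```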
-
-
class
demod.datasets.base_loader.
ApplianceLoader
(allow_pickle=False, version=None, clear_parsed_data=False)¶ Loader that provides methods for loading appliance data.
Children of this class need to implement the following methods:
_parse_appliance_dict()
-
load_appliance_dict
()¶ Load the appliance dictionary.
Tries to call self._parse_appliance_dict() if the parsed data is not available.
- Returns
The appliance dictionary.
- Return type
Dict[str, numpy.ndarray]
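The "parse only if the parsed data is not available" behaviour can be pictured with the stand-in class below. It is a hypothetical illustration, not demod's ApplianceLoader: the file name, the dictionary keys and the use of an .npz file are all assumptions.

```python
import tempfile
from pathlib import Path

import numpy as np


class ToyApplianceLoader:
    """Hypothetical stand-in showing the parse-then-cache pattern."""

    def __init__(self, parsed_path):
        self.parsed_path = Path(parsed_path)
        self.parse_calls = 0

    def _parse_appliance_dict(self):
        # A real child class would read the raw files here.
        self.parse_calls += 1
        return {
            "type": np.array(["fridge", "oven"]),
            "mean_power": np.array([100.0, 2000.0]),
        }

    def load_appliance_dict(self):
        file_path = self.parsed_path / "appliance_dict.npz"
        if not file_path.exists():
            # Parsed data is missing: parse once and store the result.
            np.savez(file_path, **self._parse_appliance_dict())
        with np.load(file_path, allow_pickle=False) as data:
            return {key: data[key] for key in data.files}


with tempfile.TemporaryDirectory() as tmp:
    loader = ToyApplianceLoader(tmp)
    first = loader.load_appliance_dict()
    second = loader.load_appliance_dict()

print(loader.parse_calls, sorted(first))
```

The second call reads the stored file, so the parsing method runs only once.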
-
load_real_profiles_dict
(profiles_type='full')¶ Load a dictionary containing real load profiles.
Tries to call self._parse_real_profiles_dict() if the parsed data is not available.
- Parameters
profiles_type (str) –
The type of profile to load. Possibilities:
'full', the whole profiles are returned
'switchedON', only profiles when appliances are ON
'switchedOFF', only profiles when appliances are OFF
- Returns
The appliance dictionary, of the form {app_type: {app_name: array}}, such that it is easy to retrieve the profiles based on the type of the appliances
- Return type
Dict[str, Dict[str, numpy.ndarray]]
-
load_appliance_ownership_dict
(subgroup={})¶ Return the dictionary with the probability of owning appliances.
A subgroup can be specified for datasets that differentiate between subgroups.
get_ownership_from_dict()
can then be used to sample the ownership using an appliance dictionary.
- Parameters
subgroup (Subgroup) – The subgroup of the desired ownership probabilities.
- Returns
probability of ownership for each appliance
- Return type
Dict[str, float]
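To show what sampling ownership from such probabilities means, the sketch below draws one Bernoulli trial per household and appliance. It is not the implementation of get_ownership_from_dict(); the appliance names, the probabilities and the helper are invented for the example.

```python
import numpy as np

# Hypothetical ownership probabilities, one per appliance.
ownership_probabilities = {"fridge": 0.99, "dishwasher": 0.70, "tumble_dryer": 0.40}


def sample_ownership(probabilities, n_households, rng):
    """Return, per appliance, a boolean array telling which households own it."""
    return {
        name: rng.random(n_households) < probability
        for name, probability in probabilities.items()
    }


rng = np.random.default_rng(seed=0)
owned = sample_ownership(ownership_probabilities, n_households=5, rng=rng)
for name, mask in owned.items():
    print(name, mask)
```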
-
load_yearly_target_switchons
(subgroup={})¶ Return the target number of switchons in a year for each appliance.
A subgroup can be specified for datasets that differentiate between subgroups.
get_target_from_dict()
can then be used to sample the target number of yearly switchons using an appliance dictionary.
- Parameters
subgroup (Subgroup) – The subgroup of the desired target numbers of switchons.
- Returns
Number of target switchons for each appliance type.
- Return type
Dict[str, float]
-
load_yearly_target_consumption
(subgroup={})¶ Return the target consumption in a year for each appliance.
A subgroup can be specified for datasets that differentiate between subgroups.
get_target_from_dict()
can then be used to sample the target yearly consumption using an appliance dictionary.
- Unit
kWh/y
- Parameters
subgroup (Subgroup) – The subgroup of the desired target consumption.
- Returns
Target consumption for each appliance type.
- Return type
Dict[str, float]
-
load_yearly_target_duration
(subgroup={})¶ Return the target duration in a year for each appliance.
The duration is the number of simulated steps during which the appliance should be on.
A subgroup can be specified for datasets that differentiate between subgroups.
get_target_from_dict()
can then be used to sample the target yearly duration using an appliance dictionary.
- Unit
number of steps
- Parameters
subgroup (Subgroup) – The subgroup of the desired target duration.
- Returns
Target duration for each appliance type.
- Return type
Dict[str, float]
-
-
class
demod.datasets.base_loader.
LightingLoader
(**kwargs)¶ Loader for lighting simulators components.
-
load_fisher_lighting
()¶ Load data for
FisherLightingSimulator
- Returns
Fisher lighting simulator parameters dict.
- Return type
Dict[str, Any]
-
load_crest_lighting
()¶ Load data for
CrestLightingSimulator
- Returns
CREST lighting simulator parameters dict.
- Return type
Dict[str, Any]
-
load_bulbs
()¶ Load data for each light bulb, consumption and penetration.
The returned arrays have one element per bulb type. Consumption is in watts. Penetration is given as a probability.
- Returns
consumption, penetration
- Return type
Tuple[numpy.ndarray, numpy.ndarray]
-
load_bulbs_config
(subgroup={})¶ Return the light config of some houses.
The light config is a 2-D array, where Dim0 is the different houses and Dim1 the different bulbs of each house. The values correspond to the bulb consumption in watts.
- Parameters
subgroup (Subgroup) – The subgroup corresponding to the config. Defaults to {}.
- Returns
The light bulbs config.
- Return type
config
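A small sketch of reading such a config array. The values are invented, and padding houses that have fewer bulbs with zeros is an assumption made for the example.

```python
import numpy as np

# Hypothetical config: 2 houses (Dim0) with up to 4 bulbs each (Dim1), in watts.
# A zero is assumed to mean "no bulb in this slot".
config = np.array([
    [60, 40, 11, 0],
    [11, 11, 7, 7],
])

# Total installed lighting power of each house, in watts.
installed_power = config.sum(axis=1)
# Number of bulbs actually present in each house.
n_bulbs = np.count_nonzero(config, axis=1)

print(installed_power, n_bulbs)
```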
-
-
class
demod.datasets.base_loader.
HeatingLoader
(**kwargs)¶ Loader for heating simulators components.
-
load_buildings_dict
(subgroup={})¶ Load the buildings dictionary.
Tries to call self._parse_buildings_dict() if the parsed data is not available.
- Returns
The buildings dictionary.
- Parameters
subgroup (Subgroup) –
- Return type
Dict[str, numpy.ndarray]
-
Available Datasets¶
German Time-Of-Use Survey¶
-
class
demod.datasets.GermanTOU.loader.
GTOU
(activity_type='4_States', **kwargs)¶ German Time-Of-Use Survey dataset loader.
This loads data for different types of activity models. It can also split the households into different subgroups of the whole population.
- Currently implements activity_types:
‘Sparse9States’
‘4_States’
‘DemodActivities_0’
‘Bottaccioli2018’ https://doi.org/10.1109/ACCESS.2018.2886201
German data from HERUS¶
-
class
demod.datasets.Germany.loader.
GermanDataHerus
(version='v0.1', **kwargs)¶ Data for demod for Germany.
This data originates from a Master project at the HERUS lab (EPFL). The data merges different sources:
Parts of the raw data are in an Excel sheet.
- Parameters
version – A versioning system that can be used for modifications of the data.
CREST¶
Ninja Renewables¶
-
class
demod.datasets.RenewablesNinja.loader.
NinjaRenewablesClimate
(country_name, update_raw_data=False, weighted_type='population', **kwargs)¶ Loader of the climate.
Data comes from Renewables.ninja. The raw datasets are downloaded on demand by this data loader. It corresponds to MERRA-2 (global).
Available data:
‘datetime’ in UTC
‘precipitation’
‘snowfall’
‘snow_mass’
‘clearness’
‘air_density’
‘outside_temperature’
‘irradiance’
- Parameters
weighted_type – The method used to weight the climate. This was performed by Renewables.ninja. Can be ‘population’ or ‘land_area’.
update_raw_data – Whether the raw data file should be updated. As time goes by, new data might be collected by Renewables.ninja.
- Loaders: