datasets

Module for loading datasets

Classes

`scilightcon.datasets.LogsReader`

Reader for time-dependent data stored in log folders created by various software (Argos, CEP, ThermoLoggers, etc.).

Examples:

>>> from scilightcon.datasets import LogsReader # doctest: +SKIP
>>> import datetime # doctest: +SKIP
>>> directory = r'\\konversija\kleja\ThermologgerLogs\v5' # doctest: +SKIP
>>> reader = LogsReader(directory) # doctest: +SKIP
>>> loggers_names_list = reader.list_loggers() # doctest: +SKIP
>>> loggers_names_list # doctest: +SKIP
['Location 2B 314', 'Location 2D 3.14 Logger 1-4', 'Location 2D 3.14 Logger 5-8', ...]
>>> logger_name = 'Location 2B 314' # doctest: +SKIP
>>> measurables_list = reader.list_measurables(logger_name) # doctest: +SKIP
>>> measurables_list # doctest: +SKIP
['A1-H Stalas 1', 'A1-H', 'A1-T Stalas 1', 'A1-T']
>>> measurable = 'A1-H Stalas 1' # doctest: +SKIP
>>> from_date = datetime.datetime(2023,7,20) # doctest: +SKIP
>>> to_date = datetime.datetime(2023,7,21) # doctest: +SKIP
>>> times, values = reader.get_data(logger_name=logger_name, measurable=measurable, from_date=from_date, to_date=to_date) # doctest: +SKIP

Functions

`scilightcon.datasets.LogsReader.get_data(logger_name, measurable, from_date=None, to_date=None)`

Checks that the given `logger_name` and `measurable` are valid, then collects the timestamps and values recorded in the given time period.

Parameters:

Name	Type	Description	Default
`logger_name`	`str`	Logger name, for example: "Location 2B 314"	required
`measurable`	`str`	Measurable name, for example: "A1-H Stalas 1"	required
`from_date`	`datetime`	Date from which the data will be collected	`None`
`to_date`	`datetime`	Date to which the data will be collected	`None`

Returns:

Name	Type	Description
`times`	`List[datetime]`	A list of timestamps
`values`	`List[float]`	A list of the corresponding values
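
Since `get_data` returns two parallel lists, a common next step is to reduce them to a few summary numbers. The sketch below is not part of scilightcon: `summarize_series` is a hypothetical helper, and the sample lists only stand in for the output of `reader.get_data(...)`.

```python
import datetime
import statistics


def summarize_series(times, values):
    """Return basic statistics for parallel lists of timestamps and values.

    Hypothetical helper, not part of scilightcon.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if not values:
        return None
    return {
        "start": min(times),
        "end": max(times),
        "count": len(values),
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
    }


# Made-up data in the shape documented for get_data().
times = [datetime.datetime(2023, 7, 20, hour) for hour in (0, 6, 12, 18)]
values = [41.0, 43.5, 47.0, 44.5]
summary = summarize_series(times, values)
```

With a real reader, `times` and `values` would come straight from `reader.get_data(...)` instead of the made-up lists.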

`scilightcon.datasets.LogsReader.list_loggers()`

Collects names of available loggers

Returns:

Type	Description
`List[str]`	A list of Logger names

`scilightcon.datasets.LogsReader.list_measurables(logger_name)`

Collects names of measurables of a given logger

Parameters:

Name	Type	Description	Default
`logger_name`	`str`	Logger name	required

Returns:

Type	Description
`List[str]`	A list of measurables that can be found for the specific logger
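
`list_loggers` and `list_measurables` can be combined to get an overview of everything a logs folder contains. The following is a minimal sketch: `build_catalog` is a hypothetical helper, and `FakeReader` is a stand-in that only mimics the two documented methods so the example runs without a logs folder.

```python
def build_catalog(reader):
    """Map each logger name to its list of measurables.

    `reader` is any object exposing list_loggers() and
    list_measurables(logger_name), as documented for LogsReader.
    """
    return {name: reader.list_measurables(name) for name in reader.list_loggers()}


class FakeReader:
    """Stand-in used only so the sketch runs without a logs folder."""

    def __init__(self, content):
        self._content = content

    def list_loggers(self):
        return list(self._content)

    def list_measurables(self, logger_name):
        return list(self._content[logger_name])


catalog = build_catalog(FakeReader({"Location 2B 314": ["A1-H", "A1-T"]}))
```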

Functions

`scilightcon.datasets.load_EKSMA_OPTICS_mirror_reflections(material)`

Loads the wavelength-dependent reflection dataset for metal-coated mirrors by EKSMA OPTICS.

Examples:

>>> import numpy as np
>>> from scilightcon.datasets import load_EKSMA_OPTICS_mirror_reflections
>>> data, header = load_EKSMA_OPTICS_mirror_reflections('Ag')
>>> np.shape(data)
(172, 2)
>>> header
['Wavelength (nm)', 'Reflection (%)']

Parameters:

Name	Type	Description	Default
`material`	`str`	`Ag`, `Au` or `Al`	required

Returns:

Name	Type	Description
`data`	`ndarray`	A 2D array of data with headers excluded. Shape (n_samples, n_columns)
`header`	`List`	Column names or empty strings. Shape (n_columns)
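
The returned table is sampled at discrete wavelengths, so a value at an arbitrary wavelength has to be interpolated. The sketch below shows one way to do that in plain Python. `reflection_at` is a hypothetical helper, the rows are made-up values rather than EKSMA OPTICS data, and the code assumes rows of (wavelength, reflection) sorted by ascending wavelength.

```python
from bisect import bisect_left


def reflection_at(data, wavelength_nm):
    """Linearly interpolate the second column at a given wavelength.

    Assumes rows are (wavelength, reflection) pairs sorted by
    ascending wavelength. Hypothetical helper, not part of scilightcon.
    """
    wavelengths = [float(row[0]) for row in data]
    reflections = [float(row[1]) for row in data]
    if not wavelengths[0] <= wavelength_nm <= wavelengths[-1]:
        raise ValueError("wavelength outside the tabulated range")
    i = bisect_left(wavelengths, wavelength_nm)
    if wavelengths[i] == wavelength_nm:
        return reflections[i]
    # Interpolate between the two samples that bracket the wavelength.
    x0, x1 = wavelengths[i - 1], wavelengths[i]
    y0, y1 = reflections[i - 1], reflections[i]
    return y0 + (y1 - y0) * (wavelength_nm - x0) / (x1 - x0)


# Made-up rows in the documented (n_samples, 2) layout.
sample = [[400.0, 90.0], [500.0, 96.0], [600.0, 98.0]]
value = reflection_at(sample, 450.0)
```

For a NumPy array, `np.interp(wavelength_nm, data[:, 0], data[:, 1])` should give the same result inside the tabulated range.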

`scilightcon.datasets.load_EO_filter_transmissions(filter)`

Loads the wavelength-dependent transmission dataset of the chosen filter from an EO file. The stock number is given in the second line of the dataset.

Examples:

>>> import numpy as np
>>> from scilightcon.datasets import load_EO_filter_transmissions
>>> data, header = load_EO_filter_transmissions('lp_450nm')
>>> np.shape(data)
(293, 2)
>>> header
['Wavelength (nm)', 'Transmission (%)']

Parameters:

Name	Type	Description	Default
`filter`	`str`	`lp_400nm`, `lp_450nm`, `lp_500nm`, `lp_550nm`, `lp_600nm`, `lp_600nm`, `lp_700nm`, `lp_750nm`, `sp_400nm`, `sp_500nm`, `sp_600nm` or `sp_700nm`	required

Returns:

Name	Type	Description
`data`	`ndarray`	A 2D array of data with headers excluded. Shape (n_samples, n_columns)
`header`	`List`	Column names or empty strings. Shape (n_columns)
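
A typical use of a long-pass transmission curve is to locate the wavelength at which transmission rises through a given level. The sketch below illustrates that under stated assumptions: `cut_on_wavelength` is a hypothetical helper, the rows are made-up values rather than real filter data, and the wavelengths are assumed to be sorted in ascending order.

```python
def cut_on_wavelength(data, level_percent=50.0):
    """Return the first wavelength where transmission rises through level_percent.

    Assumes rows are (wavelength, transmission %) pairs sorted by ascending
    wavelength. Returns None if the level is never crossed.
    Hypothetical helper, not part of scilightcon.
    """
    rows = [(float(w), float(t)) for w, t in data]
    for (w0, t0), (w1, t1) in zip(rows, rows[1:]):
        if t0 < level_percent <= t1:
            # Linear interpolation between the two bracketing samples.
            return w0 + (w1 - w0) * (level_percent - t0) / (t1 - t0)
    return None


# Made-up rows in the documented (n_samples, 2) layout.
sample = [[440.0, 10.0], [450.0, 30.0], [460.0, 70.0], [470.0, 92.0]]
cut_on = cut_on_wavelength(sample)
```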

`scilightcon.datasets.load_THORLABS_filter_transmissions(filter)`

Loads the wavelength-dependent transmission dataset of the chosen filter from a Thorlabs file.

Examples:

>>> import numpy as np
>>> from scilightcon.datasets import load_THORLABS_filter_transmissions
>>> data, header = load_THORLABS_filter_transmissions('DMLP425')
>>> np.shape(data)
(2251, 2)
>>> header
['Wavelength  (nm)', 'Transmission (%)']

Parameters:

Name	Type	Description	Default
`filter`	`str`	`DMLP425`, `DMLP550`, `DMLP650`, `FB340-10`, `FBH343-10`, `FBH400-40`, `FBH515-10`, `FBH520-40`, `FBH550-40`, `FEL0400`, `FEL0450`, `FEL0500`, `FEL0550`, `FEL0600`, `FEL0650`, `FEL0700`, `FEL0750`, `FEL0800`, `FEL0850`, `FEL0900`, `FEL0950`, `FEL1000`, `FEL1050`, `FEL1100`, `FEL1150`, `FEL1200`, `FEL1250`, `FEL01300`, `FEL1350`, `FEL1400`, `FEL1450`, `FEL1500`, `FELH1000`, `FELH1050`, `FELH1100`, `FELH1250`, `FELH1500`, `FES0450`, `FES0500`, `FES0550`, `FES0600`, `FES0650`, `FES0700`, `FES0750`, `FES0800`, `FES0850`, `FES0900`, `FES0950`, `FES1000`, `FESH0450`, `FES0500`, `FES0600`, `FES0700`, `FES0750`, `FGB37`, `FGB39`, `FGS550`, `FGS700`, `FGS900`, `FGUV5`, `FGUV11`, `FL514.5-10`, `FL530-10`, `MF460-60`, `NDUV01B`, `NDUV02B`, `NDUV06B`, `NDUV10B`, `NDUV20B`, `NDUV30B`, `NDUV40B`, `NE01B`, `NE06B`, `NE10B`, `NE20B`, `NE30B`, `NE40B`, `NE50B` or `NE60B`	required

Returns:

Name	Type	Description
`data`	`ndarray`	A 2D array of data with headers excluded. Shape (n_samples, n_columns)
`header`	`List`	Column names or empty strings. Shape (n_columns)
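
Transmission in percent is often converted to optical density, OD = -log10(T), where T is the transmission as a fraction. The sketch below shows that conversion. `optical_density` is a hypothetical helper and is not part of scilightcon.

```python
import math


def optical_density(transmission_percent):
    """Convert transmission in percent to optical density, OD = -log10(T)."""
    if transmission_percent <= 0:
        raise ValueError("transmission must be positive")
    return -math.log10(transmission_percent / 100.0)


# Made-up transmission values in percent, as found in the second column.
densities = [optical_density(t) for t in (100.0, 10.0, 1.0)]
```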

`scilightcon.datasets.load_atmospheric_data()`

Loads atmospheric data.

Examples:

>>> from scilightcon.datasets import load_atmospheric_data
>>> data, header = load_atmospheric_data()

Returns:

Name	Type	Description
`data`	`ndarray`	A 2D array of data with headers excluded. Shape (n_samples, n_columns)
`header`	`List`	Column names or empty strings. Shape (n_columns)

`scilightcon.datasets.load_csv_data(data_file_name, *, data_module=DATA_MODULE)`

Loads `data_file_name` from `data_module` using `importlib.resources`.

Examples:

>>> from scilightcon.datasets import load_csv_data
>>> data, header = load_csv_data('Hg_lines.csv')

Parameters:

Name	Type	Description	Default
`data_file_name`	`str`	Name of csv file to be loaded from `data_module/data_file_name`.	required
`data_module`	`str or module`	Module where data lives. The default is `'scilightcon.datasets.data'`	`DATA_MODULE`

Returns:

Name	Type	Description
`data`	`ndarray`	A 2D array of data with headers excluded. Shape (n_samples, n_columns)
`header`	`List`	Column names or empty strings. Shape (n_columns)
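
The example above unpacks the result as `data, header`. To illustrate that split, here is a minimal standard-library sketch that separates CSV text into a header row and numeric rows. It is not the library's implementation: `split_csv_text` is a hypothetical helper, the column names and values are made up, and the code assumes the first row holds column names and every other cell parses as a float.

```python
import csv
import io


def split_csv_text(text):
    """Split CSV text into (data, header): numeric rows and column names.

    Hypothetical helper; assumes the first row holds column names and
    every remaining cell parses as a float.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header = rows[0]
    data = [[float(cell) for cell in row] for row in rows[1:]]
    return data, header


# Made-up CSV content.
text = "Wavelength (nm),Intensity\n404.66,0.4\n435.83,1.0\n"
data, header = split_csv_text(text)
```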

`scilightcon.datasets.load_zipped_csv_data(data_file_name, *, data_module=DATA_MODULE)`

Loads data from a gzip-compressed CSV file.

Examples:

>>> from scilightcon.datasets import load_zipped_csv_data
>>> data_file_name = r'C:\Code\lightcon-scipack\scilightcon\datasets\data\data_test_detect_peaks.csv.gz'
>>> data, header = load_zipped_csv_data(data_file_name)

Parameters:

Name	Type	Description	Default
`data_file_name`	`str`	Path of the file that needs to be extracted	required
`data_module`	`str or module`	Module where data lives. The default is `'scilightcon.datasets.data'`	`DATA_MODULE`

Returns:

Name	Type	Description
`data`	`ndarray`	A 2D array of data with headers excluded. Shape (n_samples, n_columns)
`header`	`List`	Column names or empty strings. Shape (n_columns)
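
Reading a `.csv.gz` file does not require unpacking it to disk first. The sketch below shows the general technique with the standard `gzip` and `csv` modules. It is not the library's implementation: `read_gzipped_csv` is a hypothetical helper, the file content is made up, and the code assumes a header row followed by numeric rows.

```python
import csv
import gzip
import os
import tempfile


def read_gzipped_csv(path):
    """Read a gzip-compressed CSV file into (data, header).

    Hypothetical helper; assumes a header row followed by numeric rows.
    """
    # Text mode ("rt") decompresses and decodes on the fly.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        rows = [row for row in csv.reader(handle) if row]
    header = rows[0]
    data = [[float(cell) for cell in row] for row in rows[1:]]
    return data, header


# Round trip through a temporary file with made-up values.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv.gz")
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        handle.write("x,y\n0,1.5\n1,2.5\n")
    data, header = read_gzipped_csv(path)
```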