twodlearn.datasets.tsdataset module
-
class
twodlearn.datasets.tsdataset.
AsynchronousRecord
(data, start_time=None, prop=None, name='') Bases:
object
-
class
twodlearn.datasets.tsdataset.
Record
(data, start_time=None, prop=None, name='') Bases:
object
-
get_group
(group_name) Get the np.ndarray corresponding to the given group.
- Parameters
group_name (str) – name of the group.
- Returns
Array corresponding to the data of the group.
- Return type
np.ndarray
dictionary with the names of the groups and the associated columns
-
set_groups
(group_tags) Group the features in the data according to the provided tags.
- Parameters
group_tags (dict) – dictionary whose keys are the group names and whose values are the columns that belong to each group.
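The grouping idea can be sketched in plain NumPy. This is only an illustration and not the library's implementation: the `data` array, the use of integer column indices in `group_tags`, and the standalone `get_group` helper are assumptions made for the example.

```python
import numpy as np

# Hypothetical stand-in for a record's data: 5 time steps, 4 feature columns.
data = np.arange(20, dtype=float).reshape(5, 4)

# Hypothetical tags: group name -> column indices (an assumed representation).
group_tags = {"inputs": [0, 1], "outputs": [2, 3]}

def get_group(data, group_tags, group_name):
    """Return the columns of `data` that belong to `group_name`."""
    return data[:, group_tags[group_name]]

inputs = get_group(data, group_tags, "inputs")
outputs = get_group(data, group_tags, "outputs")
print(inputs.shape, outputs.shape)  # (5, 2) (5, 2)
```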
-
-
class
twodlearn.datasets.tsdataset.
RecordSaveData
(data, start_time, prop, name) Bases:
tuple
-
class
twodlearn.datasets.tsdataset.
TSDataset
(records=[]) Bases:
object
-
class
Cursor
(dataset, global_pointer) Bases:
object
Manages one of the continuous elements of the batch, so there are as many cursors as elements in the batch.
-
add_record
(other) Add a record to the dataset.
- Parameters
other (Record) – record to be added to the dataset.
-
get_stats
(groups=None) Obtain the mean and standard deviation of the dataset, to be used for normalization.
- Parameters
groups (list) – names of the groups to measure.
-
next_batch
(window_size, batch_size, reset=False) Returns the next batch_size sequences of length window_size.
- Parameters
window_size – length of each sequence in the batch.
batch_size – number of sequences in the batch.
reset – if True, reset the cursors that track where data is currently being extracted.
- Returns
A dictionary with the batch samples, the format is:
batch[group] = array[window_size, batch_size, n_vars(group)]
- Return type
dict
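The batch layout described above can be illustrated with a small NumPy sketch. It shows only the shape convention `batch[group] = array[window_size, batch_size, n_vars(group)]`. The `record` dictionary and the `sample_batch` helper are hypothetical, and the windows here start at random positions, whereas the class keeps cursors that move through the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical record with two groups, 50 time steps each.
record = {"x": rng.normal(size=(50, 3)), "y": rng.normal(size=(50, 1))}

def sample_batch(record, window_size, batch_size, rng):
    """Build batch[group] with shape [window_size, batch_size, n_vars(group)]."""
    n_steps = next(iter(record.values())).shape[0]
    # One start index per batch element; each window is contiguous in time.
    starts = rng.integers(0, n_steps - window_size + 1, size=batch_size)
    return {
        group: np.stack([values[s:s + window_size] for s in starts], axis=1)
        for group, values in record.items()
    }

batch = sample_batch(record, window_size=10, batch_size=4, rng=rng)
print({group: array.shape for group, array in batch.items()})
# {'x': (10, 4, 3), 'y': (10, 4, 1)}
```

Putting time on the first axis means `batch[group][t]` is the slice of all sequences at step `t`, which is convenient when stepping a recurrent model through the window.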
-
next_batch_discontinuous
(batch_size) Get a batch when window_size is 1.
This function is used by next_batch and is not intended to be used outside the class.
-
next_windowed_batch
(sequences_length, batch_size, window_size, groups=None, reset=False) Returns the next batch, where each sample contains a sequence of window_size elements.
- Parameters
sequences_length – length of the sequences
batch_size – number of sequences
window_size – size of the window
- Returns
A dictionary with the batch samples. The format is:
batch[group] = array[sequences_length, batch_size, n_vars(group)*window_size]
- Return type
dict
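A rough NumPy sketch of the windowed layout follows. It is not the library's code: the helpers `flatten_windows` and `windowed_batch`, the explicit `starts` argument, and the order in which the window is flattened into the last axis are all assumptions chosen for illustration.

```python
import numpy as np

def flatten_windows(values, window_size):
    """Row t holds the window values[t:t + window_size], flattened to 1-D."""
    n_steps = values.shape[0]
    return np.stack([values[t:t + window_size].reshape(-1)
                     for t in range(n_steps - window_size + 1)])

def windowed_batch(values, sequences_length, window_size, starts):
    """Return an array [sequences_length, len(starts), n_vars * window_size]."""
    windows = flatten_windows(values, window_size)
    return np.stack([windows[s:s + sequences_length] for s in starts], axis=1)

# Hypothetical group data: 20 time steps, 2 variables.
values = np.arange(40, dtype=float).reshape(20, 2)
batch = windowed_batch(values, sequences_length=5, window_size=3,
                       starts=[0, 3, 6, 9])
print(batch.shape)  # (5, 4, 6)
```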
-
normalize
(groups=None, mu=None, std=None) Normalize the dataset using the mean (mu) and standard deviation (std).
- Parameters
groups (list) – names of the groups to normalize.
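As a minimal sketch of how statistics and normalization usually fit together, the following NumPy example computes a per-column mean and standard deviation and applies them. The standalone functions and the fallback to the data's own statistics when `mu` or `std` is `None` are assumptions and may not match the methods' actual behavior.

```python
import numpy as np

def get_stats(data):
    """Per-column mean and standard deviation over the time axis."""
    return data.mean(axis=0), data.std(axis=0)

def normalize(data, mu=None, std=None):
    """Standardize columns; missing statistics are computed from `data`."""
    if mu is None:
        mu = data.mean(axis=0)
    if std is None:
        std = data.std(axis=0)
    return (data - mu) / std

# Hypothetical group data: 3 time steps, 2 variables.
data = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
mu, std = get_stats(data)
normalized = normalize(data, mu, std)
```

Passing `mu` and `std` explicitly makes it possible to reuse statistics measured on a training set when normalizing validation or test data.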
-
split_continuous
(column_name, min_samples=None) Splits the records into the continuous chunks of data found in the provided column.
-
to_dense
() Return a dense representation of the dataset.
- Returns
a tuple of the dense array and the length of each record. The records are padded with nan values.
- Return type
(array, length)
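The padding scheme can be sketched in NumPy as follows. The free function `to_dense` and the list-of-arrays input are assumptions for the example; the method itself works on the dataset's records.

```python
import numpy as np

def to_dense(records):
    """Stack variable-length records into one NaN-padded array."""
    lengths = np.array([r.shape[0] for r in records])
    n_features = records[0].shape[1]
    # Start from an all-NaN array so any unfilled tail acts as padding.
    dense = np.full((len(records), lengths.max(), n_features), np.nan)
    for i, r in enumerate(records):
        dense[i, :r.shape[0]] = r
    return dense, lengths

# Two hypothetical records with 3 and 5 time steps, 2 features each.
records = [np.ones((3, 2)), np.ones((5, 2))]
dense, lengths = to_dense(records)
print(dense.shape, lengths)  # (2, 5, 2) [3 5]
```

Returning the lengths alongside the array lets later code tell real samples apart from padding without testing for nan.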
-
to_tf_dataset
(dtype=<class 'numpy.float32'>) Get a tf.data.Dataset with a dense representation of the dataset.
- Returns
A dataset with elements ‘data’ and ‘length’. ‘data’ is a dense tensor representation of the dataset formatted as (record, time, features). Records are padded with nan values.
- Return type
tf.data.Dataset
-
class
twodlearn.datasets.tsdataset.
TSDatasetSaver
(records_data, group_tags) Bases:
tuple
Alias for field number 1
-
class
twodlearn.datasets.tsdataset.
TSDatasets
(train=None, valid=None, test=None) Bases:
object
-
twodlearn.datasets.tsdataset.
sample_batch_window
(data, length, window_size, batch_size=None) Sample continuous windows of window_size from tensor data.
- Parameters
data (tf.Tensor) – dense representation of a set of continuous records. The format should be (record, time, features).
length – length of each record.
window_size – size of the window to sample.
batch_size – batch size of data. If not provided, batch_size = data.shape[0].
- Returns
Randomly sampled continuous windows. The format is (record, time, features).
- Return type
tf.Tensor
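The sampling idea can be sketched with a NumPy analogue. The library function operates on tf.Tensor inputs, so this is not its implementation: the name `sample_batch_window_np`, the explicit `rng` argument, and the requirement that every record is at least `window_size` long are assumptions of the example.

```python
import numpy as np

def sample_batch_window_np(data, length, window_size, rng):
    """Sample one contiguous window per record, avoiding the NaN padding."""
    n_records = data.shape[0]
    # Exclusive upper bound for each record's start index.
    high = np.asarray(length) - window_size + 1
    starts = rng.integers(0, high)
    # Broadcast each start against the offsets 0..window_size-1.
    time_idx = starts[:, None] + np.arange(window_size)[None, :]
    return data[np.arange(n_records)[:, None], time_idx]

# Hypothetical dense data: 2 records padded to 6 steps, 2 features.
data = np.full((2, 6, 2), np.nan)
data[0, :4] = 1.0
data[1, :6] = 2.0
length = np.array([4, 6])

windows = sample_batch_window_np(data, length, window_size=3,
                                 rng=np.random.default_rng(0))
print(windows.shape)  # (2, 3, 2)
```

Limiting the start index by each record's own length keeps every sampled window inside the valid, unpadded part of that record.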
-
twodlearn.datasets.tsdataset.
sample_window
(data, length, window_size) Sample continuous windows of window_size from tensor data.
- Parameters
data (tf.Tensor) – dense representation of a set of continuous records. The format should be (time, features).
length – length of each record.
window_size – size of the window to sample.
- Returns
Randomly sampled continuous windows. The format is (record, time, features)
- Return type
tf.Tensor