data
Module for describing data processing.
All data structures are described as nested combinations of dict or list of ndarray.
Data processing is a transformation from one data structure to another data structure, or to a plain ndarray.
Data caching can be implemented based on the dynamic features of list and dict.
The full data structure is
{
"particle":{
"A":{"p":...,"m":...}
...
},
"decay":[
{
"A->R1+B": {
"R1": {
"ang": {
"alpha":[...],
"beta": [...],
"gamma": [...]
},
"z": [[x1,y1,z1],...],
"x": [[x2,y2,z2],...]
},
"B" : {...}
},
"R->C+D": {
"C": {
...,
"aligned_angle":{
"alpha":[...],
"beta":[...],
"gamma":[...]
}
},
"D": {...}
},
},
{
"A->R2+C": {...},
"R2->B+D": {...}
},
...
],
"weight": [...]
}
- data_cut(data, expr, var_map=None)[source]
Cut data with a boolean expression.
- Parameters:
data – data to be cut
expr – cut expression
var_map – variable map between parameters in expr and data, [optional]
- Returns:
data after the cut
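A minimal sketch of what such an expression cut could look like over a flat dict of ndarrays (this is illustrative, not the library's actual implementation; the `var_map` direction, expr-name to data-key, is an assumption):

```python
import numpy as np

def data_cut_sketch(data, expr, var_map=None):
    """Illustrative sketch: keep only entries where `expr` evaluates True.
    `var_map` (assumed expr-name -> data-key) renames expression variables."""
    var_map = var_map or {}
    namespace = dict(data)  # expression variables default to the data keys
    for expr_name, data_key in var_map.items():
        namespace[expr_name] = data[data_key]
    mask = eval(expr, {"np": np}, namespace)  # boolean ndarray over events
    return {k: v[mask] for k, v in data.items()}

data = {"m": np.array([1.0, 2.5, 3.0]), "w": np.array([0.1, 0.2, 0.3])}
cut = data_cut_sketch(data, "m > 2.0")
print(cut["m"])  # [2.5 3. ]
```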
- data_generator(data, fun=<function _data_split>, args=(), kwargs=None, MAX_ITER=1000)[source]
Data generator: calls fun on each data item, as a generator. The extra arguments will be passed to fun.
- data_map(data, fun, args=(), kwargs=None)[source]
Apply fun to each data leaf. It returns the same nested structure.
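Given the nested dict/list convention above, such a map is naturally recursive. A sketch (not the library's actual code) of applying a function to every ndarray leaf while preserving the structure:

```python
import numpy as np

def data_map_sketch(data, fun, args=(), kwargs=None):
    """Illustrative sketch: apply `fun` to every ndarray leaf,
    keeping the nested dict/list structure intact."""
    kwargs = kwargs or {}
    if isinstance(data, dict):
        return {k: data_map_sketch(v, fun, args, kwargs) for k, v in data.items()}
    if isinstance(data, list):
        return [data_map_sketch(v, fun, args, kwargs) for v in data]
    return fun(data, *args, **kwargs)  # leaf: apply the function

data = {"a": [np.array([1.0, 2.0])], "b": {"c": np.array([3.0])}}
doubled = data_map_sketch(data, lambda x: x * 2)
print(doubled["b"]["c"])  # [6.]
```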
- data_mask(data, select)[source]
Select data using a boolean mask.
- Parameters:
data – data to select
select – 1-d boolean array for selection
- Returns:
data after selection
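Because every leaf shares the same event axis, a single 1-d boolean array can be pushed down the whole structure. A sketch of that behavior (illustrative, not the library's code):

```python
import numpy as np

def data_mask_sketch(data, select):
    """Illustrative sketch: index every ndarray leaf with the same
    1-d boolean array `select`."""
    if isinstance(data, dict):
        return {k: data_mask_sketch(v, select) for k, v in data.items()}
    if isinstance(data, list):
        return [data_mask_sketch(v, select) for v in data]
    return data[select]  # leaf: boolean indexing along the event axis

select = np.array([True, False, True])
data = {"a": np.array([1.0, 2.0, 3.0]), "b": {"c": np.array([4.0, 5.0, 6.0])}}
result = data_mask_sketch(data, select)
print(result["a"])  # [1. 3.]
```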
- data_shape(data, axis=0, all_list=False)[source]
Get data size.
- Parameters:
data – Data array
axis – Integer. ???
all_list – Boolean. ???
- Returns:
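The parameter descriptions above are incomplete in the source, so the following is only a guess at the intended behavior: since all leaves share the event axis, the size of the whole structure can be read off any one leaf. A sketch under that assumption:

```python
import numpy as np

def data_shape_sketch(data, axis=0):
    """Illustrative guess at the documented behavior: return the size
    along `axis` of the first ndarray leaf found in the structure."""
    if isinstance(data, dict):
        return data_shape_sketch(next(iter(data.values())), axis)
    if isinstance(data, list):
        return data_shape_sketch(data[0], axis)
    return data.shape[axis]  # leaf: read the size directly

data = {"a": {"b": np.array([1.0, 2.0, 3.0])}}
print(data_shape_sketch(data))  # 3
```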
- data_split(data, batch_size, axis=0)[source]
Split data into batches of batch_size along axis.
- Parameters:
data – structured data
batch_size – Integer, data size for each split batch
axis – Integer, axis for the split, [optional]
- Returns:
a generator over the split data
>>> data = {"a": [np.array([1.0, 2.0]), np.array([3.0, 4.0])], "b": {"c": np.array([5.0, 6.0])}, "d": [], "e": {}}
>>> for i, data_i in enumerate(data_split(data, 1)):
...     print(i, data_to_numpy(data_i))
...
0 {'a': [array([1.]), array([3.])], 'b': {'c': array([5.])}, 'd': [], 'e': {}}
1 {'a': [array([2.]), array([4.])], 'b': {'c': array([6.])}, 'd': [], 'e': {}}
- flatten_dict_data(data, fun=<built-in method format of str object>)[source]
Flatten nested data into a one-level dict, with keys named by fun.
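A sketch of such a flattening, assuming the default key function joins nested keys as "parent/child" via `"{}/{}".format` (the assumed default is inferred from the signature above, not confirmed):

```python
def flatten_dict_data_sketch(data, fun="{}/{}".format):
    """Illustrative sketch: flatten nested dicts into one level,
    combining keys with `fun` (assumed "parent/child" joining)."""
    if not isinstance(data, dict):
        return data
    out = {}
    for k, v in data.items():
        if isinstance(v, dict):
            # Recurse first, then prefix the child keys with the parent key.
            for kk, vv in flatten_dict_data_sketch(v, fun).items():
                out[fun(k, kk)] = vv
        else:
            out[k] = v
    return out

data = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
flat = flatten_dict_data_sketch(data)
print(flat)  # {'a/b': 1, 'a/c/d': 2, 'e': 3}
```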
- load_dat_file(fnames, particles, dtype=None, split=None, order=None, _force_list=False, mmap_mode=None)[source]
Load *.dat file(s) of the 4-momenta of the final particles.
- Parameters:
fnames – String or list of strings. File names.
particles – List of Particle. Final particles.
dtype – Data type.
split – sizes of each split dat file
order – transpose order
- Returns:
Dictionary of data indexed by Particle.
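A sketch of the kind of loading this describes, under the assumption (not confirmed by the source) that a *.dat file holds one 4-momentum per line with the final-state particles' rows interleaved event by event; plain particle-name strings stand in for Particle objects:

```python
import numpy as np
import os
import tempfile

# Two events, two final particles: rows interleaved event by event (assumed layout).
momenta = np.array([
    [3.0, 0.1, 0.0, 0.0],   # event 0, particle "B"
    [2.0, -0.1, 0.0, 0.0],  # event 0, particle "C"
    [3.1, 0.2, 0.0, 0.0],   # event 1, particle "B"
    [2.1, -0.2, 0.0, 0.0],  # event 1, particle "C"
])
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as f:
    np.savetxt(f, momenta)
    fname = f.name

particles = ["B", "C"]
# Reshape to (n_events, n_particles, 4) and index out one array per particle.
raw = np.loadtxt(fname).reshape(-1, len(particles), 4)
data = {p: raw[:, i, :] for i, p in enumerate(particles)}
os.unlink(fname)
print(data["B"].shape)  # (2, 4)
```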
- load_data(file_name, **kwargs)[source]
Load a data file written by save_data. The arguments will be passed to numpy.load().
- save_data(file_name, obj, **kwargs)[source]
Save structured data to a file. The arguments will be passed to numpy.save().
- save_dataz(file_name, obj, **kwargs)[source]
Save compressed structured data to a file. The arguments will be passed to numpy.save().
- split_generator(data, batch_size, axis=0)
Split data into batches of batch_size along axis.
- Parameters:
data – structured data
batch_size – Integer, data size for each split batch
axis – Integer, axis for the split, [optional]
- Returns:
a generator over the split data
>>> data = {"a": [np.array([1.0, 2.0]), np.array([3.0, 4.0])], "b": {"c": np.array([5.0, 6.0])}, "d": [], "e": {}}
>>> for i, data_i in enumerate(data_split(data, 1)):
...     print(i, data_to_numpy(data_i))
...
0 {'a': [array([1.]), array([3.])], 'b': {'c': array([5.])}, 'd': [], 'e': {}}
1 {'a': [array([2.]), array([4.])], 'b': {'c': array([6.])}, 'd': [], 'e': {}}