Commit 1f1e1fe9 authored by [ACC] Elena Operation

improved doc

parent d9d18118
Pipeline #3415487 passed
@@ -9,6 +9,25 @@ The main purpose is to use it as a dependency of `pyjapcscout` in the control room
acquired from the control system as parquet files, and then on the user's "GPN" computer for data
analysis without the need for Java or other dependencies needed to interact with the control system.
This package provides the following main functions. Many of them are thin wrappers around external functions (from `pandas`, `pyarrow`, `awkward`), sometimes with small tweaks so that data types and shapes are preserved as far as possible.
- `dict_to_pandas(input_dict)`: Creates a `pandas` dataframe from a (list of) `dict`.
- `dict_to_awkward(input_dict)`: Creates an `awkward` array from a (list of) `dict`.
- `dict_to_parquet(input_dict, filename)`: Saves a (list of) `dict` into a `parquet` file. **In order to do so, 2D arrays are split into 1D arrays of 1D arrays.**
- `dict_to_pickle(input_dict, filename)`: Saves a (list of) `dict` into a `pickle` file.
- `dict_to_json(input_dict, filename)`: Saves a (list of) `dict` into a `json` file.
- `json_to_pandas(filename)`: Loads a `json` file into a `pandas` dataframe. Data types and shapes are not preserved, so this function is of limited use and is provided only for convenience.
- `pandas_to_dict(input_pandas)`: Converts a `pandas` dataframe back into a (list of) `dict`.
- `awkward_to_dict(input_awkward)`: Converts an `awkward` array back into a (list of) `dict`. **In order to preserve data type/shape, it re-merges 1D arrays of 1D arrays into 2D arrays.**
- `parquet_to_dict(filename)`: Loads a (list of) `dict` from a `parquet` file. **In order to preserve data type/shape, it re-merges 1D arrays of 1D arrays into 2D arrays.**
- `pickle_to_dict(filename)`: Loads a (list of) `dict` from a `pickle` file.
- `pandas_to_awkward(input_pandas)`: Creates an `awkward` array from a `pandas` dataframe.
- `awkward_to_pandas(input_awkward)`: Creates a `pandas` dataframe from an `awkward` array.
- `parquet_to_pandas(filename)`: Loads a `parquet` file into a `pandas` dataframe. **Instead of using the method provided by `pandas` (which does not preserve single value types and 2D arrays), it first loads the parquet file as a `dict`, and then converts it into a `pandas` dataframe.**
- `parquet_to_awkward(filename)`: Loads a `parquet` file into an `awkward` array.
- `save_dict(dictData, folderPath = None, filename = None, fileFormat='parquet')`: Convenience wrapper around some of the functions above that saves a `dict` to a file in a supported format (`parquet` and `dict` for the time being).
- `load_dict(filename, fileFormat='parquet')`: Reads a file assuming a given format and returns its content as a `dict` (which can then be converted to other formats).
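The `save_dict`/`load_dict` pair above selects a file format from a string argument. The following is a minimal sketch of that dispatch pattern using only the Python standard library. It is not the package's implementation: the names `save_dict_sketch` and `load_dict_sketch` are hypothetical, the sample data is made up, and `pickle`/`json` stand in for the formats the package actually supports.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_dict_sketch(data, folder, name, file_format="pickle"):
    """Hypothetical helper: write `data` to <folder>/<name>.<ext>, return the path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    if file_format == "pickle":
        path = folder / f"{name}.pkl"
        with open(path, "wb") as fh:
            pickle.dump(data, fh)
    elif file_format == "json":
        path = folder / f"{name}.json"
        with open(path, "w") as fh:
            json.dump(data, fh)
    else:
        raise ValueError(f"unsupported format: {file_format!r}")
    return path


def load_dict_sketch(path, file_format="pickle"):
    """Hypothetical helper: read a file written by save_dict_sketch."""
    if file_format == "pickle":
        with open(path, "rb") as fh:
            return pickle.load(fh)
    if file_format == "json":
        with open(path) as fh:
            return json.load(fh)
    raise ValueError(f"unsupported format: {file_format!r}")


# Made-up sample data: a scalar, a list and a tuple.
original = {"value": 1.5, "samples": [1.0, 2.5], "shape": (2, 1)}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_dict_sketch(original, tmp, "acquisition")
    restored = load_dict_sketch(saved_path)
```

In this sketch the pickle round trip returns a `dict` equal to the original, with the tuple still a tuple.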
Installation
------------
......
@@ -22,6 +22,7 @@ A simple example of use could be the following:
The generated dict can now be converted to Awkward Arrays, PyArrow Arrays or pandas DataFrame:

.. testcode::

    ds.dict_to_awkward(my_dict)
    ds.dict_to_pyarrow(my_dict)
    ds.dict_to_pandas(my_dict)
@@ -29,6 +30,7 @@ The generated dict can now be converted to Awkward Arrays, PyArrow Arrays or pan
The generated dict can also be stored to file:

.. testcode::

    # Parquet
    ds.dict_to_parquet(my_dict, "my_test_data.parquet")
    my_dict_load = ds.parquet_to_dict("my_test_data.parquet")
@@ -44,6 +46,7 @@ If one decides to store data to JSON, one should know that data type and precisi
**NOT preserved**!

.. testcode::

    # JSON - no exact data preservation
    ds.dict_to_json(my_dict, "my_test_data.json")
    my_pandas_load = ds.json_to_pandas("my_test_data.json")
@@ -54,6 +57,7 @@ If one decides to store data to JSON, one should know that data type and precisi
The loss of precision is due to the JSON data format, and not to data conversion to pandas:

.. testcode::

    # note: going to pandas and back to dict does keep the data integrity
    my_pandas = ds.dict_to_pandas(my_dict)
    my_dict_from_pandas = ds.pandas_to_dict(my_pandas)
......