Commit d0dbc9cc authored by Kyle Bringmans

merge

parents d33d2ad2 a9c15159
1 merge request: !24 Add ipoc resampling
@@ -3,8 +3,8 @@ load_data:
preprocess:
task: MKI
time_intervals:
- '2017-01-01 00:00:00.000'
- '2017-12-31 00:00:00.000'
- '2017-05-01 00:00:00.000'
- '2017-05-02 00:00:00.000'
features:
# json here cause of several arrays, csv takes less space but doesn't allow these
dataformat: json
@@ -141,7 +141,6 @@ class DataProcessor(ABC):
# resample data to 'step'
self.data = self.data.withColumn(_dfindex, f.explode(date_range_udf(f.col(_dfindex), f.col('next_ts'))))
self.data = self.data.drop('next_ts')
# join dataframes on timestamp index (rounded to nearest second)
self.data = self.data.where(f.col(_dfindex).isin(timestamps))
@@ -354,7 +353,6 @@ class BetsProcessor(DataProcessor):
self.combine_sensor_data()
# Not made static in DataProcessor because it breaks pickling in Spark
def date_range(t1, t2, step=1):
    """
    Create range of timestamps with step_size step between t1 and t2
@@ -362,9 +360,10 @@ def date_range(t1, t2, step=1):
    :param t2: (DateType) upper bound of time-interval
    :param step: (int) step size in seconds
    """
    # Ensure timestamps are integers
    t1 = int(t1)
    t2 = int(t2)
    nr_of_ns_in_s = 1000000000
    step = step * nr_of_ns_in_s
    return [t1 + step * x for x in range(int((t2 - t1) / step) + 1)]
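For readers who want to try `date_range` outside Spark, here is a minimal standalone sketch. It assumes the timestamps are integer nanoseconds since the epoch (suggested by `nr_of_ns_in_s`, but not confirmed by the diff), and it swaps in floor division (`//`) for `int(... / step)` so the arithmetic stays in exact integers. The two forms agree whenever `t2 >= t1`. The constant name `NS_PER_S` and the UTC start time are illustrative choices, not part of the commit.

```python
from datetime import datetime, timezone

NS_PER_S = 1_000_000_000  # nanoseconds per second


def date_range(t1, t2, step=1):
    """Return timestamps (ns) from t1 up to and including t2, every `step` seconds."""
    # Ensure timestamps are integers
    t1 = int(t1)
    t2 = int(t2)
    step_ns = step * NS_PER_S
    # Floor division avoids a float round-trip; same result as int(x / step) for t2 >= t1
    return [t1 + step_ns * x for x in range((t2 - t1) // step_ns + 1)]


# Illustrative start time (assumed UTC): 2017-05-01 00:00:00
start = int(datetime(2017, 5, 1, tzinfo=timezone.utc).timestamp()) * NS_PER_S
stamps = date_range(start, start + 3 * NS_PER_S)
print(len(stamps))  # four timestamps: both interval ends are included
```

Note that both endpoints are included, so a range that ends exactly on the next sample's timestamp also emits that timestamp.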
  • Contributor

    testing by querying all data of 2017

  • Contributor

    speed of new resampling seems quite good

    • Nice. Not immediately clear to me from the code: are you forward-filling or backward-filling when doing the 1s resampling? Should be forward-fill.

    • Contributor

      I tested it on a small example yesterday and the result showed a forward fill.
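A small plain-Python sketch of why exploding a per-row timestamp range gives a forward fill: every generated timestamp between a sample and the next one carries the earlier sample's value. This is an illustration only, not the Spark code from the commit. The function name `forward_fill_resample`, the use of whole seconds, the half-open interval, and the way the last sample is handled are all assumptions.

```python
def forward_fill_resample(samples, step=1):
    """Resample (timestamp_s, value) pairs, sorted by timestamp, to a fixed step."""
    rows = []
    for i, (ts, value) in enumerate(samples):
        # Hypothetical stand-in for 'next_ts': the following sample's timestamp,
        # or ts + step for the last sample so that it still yields one row.
        next_ts = samples[i + 1][0] if i + 1 < len(samples) else ts + step
        # Half-open range [ts, next_ts): each new row repeats the current value,
        # which is exactly a forward fill.
        rows.extend((t, value) for t in range(ts, next_ts, step))
    return rows


print(forward_fill_resample([(0, "a"), (3, "b"), (5, "c")]))
# → [(0, 'a'), (1, 'a'), (2, 'a'), (3, 'b'), (4, 'b'), (5, 'c')]
```

A backward fill would instead attach each generated timestamp to the value of the following sample, so checking one gap in a small example like this is enough to tell the two apart.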
