Bug when producing skims in distrdf mode
I am currently trying to produce skims after MR218.
When I run without --distrdf-be there is no issue, but when I include it I get:
File "/home/users/f/b/fbury/bamboodev/bamboovenv103/lib/python3.9/site-packages/bamboo/dataframebackend.py", line 568, in df_origColNames
return [str(cN) for cN in self.rootDF.GetColumnNames() if str(cN) != "_zero_for_stats"]
AttributeError: 'DistDataframeBackend' object has no attribute 'rootDF'
The problem comes from here, which calls this line; the call fails because rootDF is not set. My understanding is that createRDF has not been called yet at that point, hence the failure.
This is a minor issue: the line only retrieves the columns of the RDF so that they can be kept in the produced skim if needed (which is not my case), so a small hack to bypass it works fine. Still, this should be fixed in the long run.
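For illustration, here is one possible shape for such a workaround, as a toy sketch rather than the actual bamboo code. The classes `FakeRootDF` and `DistDataframeBackendSketch` are hypothetical stand-ins, and the sketch assumes `df_origColNames` is a plain method and that `rootDF` only gets assigned inside `createRDF`. The guard returns an empty column list when `rootDF` does not exist yet.

```python
class FakeRootDF:
    """Hypothetical stand-in for an RDataFrame; only mimics GetColumnNames()."""

    def __init__(self, columns):
        self._columns = list(columns)

    def GetColumnNames(self):
        return list(self._columns)


class DistDataframeBackendSketch:
    """Toy model (not bamboo code): rootDF only exists once createRDF() has run."""

    def createRDF(self, columns):
        # Assumption: in distrdf mode the dataframe is created late, so
        # rootDF is absent until this method has been called.
        self.rootDF = FakeRootDF(columns)

    def df_origColNames(self):
        # Guard: if the dataframe has not been created yet, report no columns
        # instead of raising AttributeError.
        rdf = getattr(self, "rootDF", None)
        if rdf is None:
            return []
        return [str(cN) for cN in rdf.GetColumnNames() if str(cN) != "_zero_for_stats"]


be = DistDataframeBackendSketch()
cols_before = be.df_origColNames()   # rootDF not set yet
be.createRDF(["nMuon", "_zero_for_stats", "Muon_pt"])
cols_after = be.df_origColNames()    # rootDF now available
print(cols_before, cols_after)
```

With a guard like this, the original columns would silently be dropped from the skim whenever the dataframe is not yet available, which is why it can only be a stopgap.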
I have no experience with Dask yet, and it is very hard for me to understand how it has been implemented here. @swertz, could I get a pointer or clue?