Once Allen!1492 (merged) is deployed, HLT2 needs to be able to handle the large events passed through by HLT1. One solution could be to add a default hlt1_filter_code that rejects these events.
Since I'm having a boring Sunday shift, let me add my 2 cents: these types of pathological events can be quite interesting for understanding reconstruction corner cases. In the past we had events which blew up the reconstruction time or memory budget for various reasons; one I remember involved collimated jets produced by material interactions, resulting in huge numbers of ghosts. Anyway, it would be nice if they could be permanently archived in a dedicated offline disk area so they can be easily accessed for later studies. This assumes their rate will be very small, but if that turns out not to be the case we probably have bigger issues anyway.
I think something like an ErrorStream would be the correct place for them, we had preliminary discussions about this a few days ago (with similar motivations as Vava pointed out).
However, I am not sure it makes sense to have a dedicated Hlt2ErrorStream just for this, rather than putting them into e.g. the CalibStream. Other people will probably know better what is preferable from an Online and Offline system point of view.
Very small streams are not good for the transfer to offline; that's why during pp we also moved the BGI lines to Hlt2Calib. I think the best option would be to send them to the Hlt2Calib stream at a low rate.
What other kinds of events will populate Hlt2Calib? It isn't obvious to me that we want to put events which could (by construction) cause processing problems together with other kinds of calibration events. I understand small streams are not good for transfer to offline but isn't this a bit of a special case?
What is even worse is when a stream has few events compared to the rest. This implies that files are kept on the movers for a long time before they can be transferred, hence using memory for nothing.
I don't see a big issue with having them in the Hlt2Calib stream. The offline processing will likely be a simple passthrough sprucing. And then users of the stream anyhow have to select on the Hlt2 decision to only process their events.
Or you try to pick them up somewhere in the online system after Hlt1 and never send them offline. I don't know if that is possible.
I think the second option might be better, yes, in particular if the rate is low. These events will mainly be used for debugging: you aren't really going to fill histograms from them, but rather profile, run sanitizers, etc. So you will by construction need to rerun over these events many times, which is not the typical workflow even for offline calibration jobs. It can then be fairly annoying to have to skip past a bunch of calibration triggers every time you do this. If we don't isolate them, I guess anyone actually using them will write a job to copy a subset to a separate file for convenience anyway, so why not do this for them centrally.
I don't have a particularly strong opinion on sending them offline or through the Calib stream. I think we need a solution that 1) suits the needs of the people who want to study these events 2) can be implemented without overhead.
I know that we write TAE events to /calib in the Online area via a routing bit, but I don't know how scalable that approach is to many additional scenarios. I'll try to find out. Shipping them offline via the Calib stream would, I guess, be technically easier.
Not sure where to comment, so I'll do it here. There are probably more technical and calibration lines which should be excluded from the filter used by normal Hlt2 lines, so the current implementation ('Hlt1(?!PassthroughLargeEvent).*Decision') in !3212 (merged) is probably not complete.
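To make that last point concrete, here is a small sketch (plain Python, not actual Moore configuration) of how the negative-lookahead filter behaves and how it could be extended with further exclusions. The extra names in the exclusion list below are hypothetical placeholders, not a vetted set of HLT1 lines:

```python
import re

# The HLT1 filter from !3212: accept any Hlt1*Decision except the
# large-event passthrough line, via a negative lookahead after 'Hlt1'.
current = re.compile(r'Hlt1(?!PassthroughLargeEvent).*Decision')

assert current.fullmatch('Hlt1TrackMVADecision') is not None
assert current.fullmatch('Hlt1PassthroughLargeEventDecision') is None

# Excluding more technical/calibration lines means growing the alternation
# inside the lookahead.  NOTE: 'SomeTechnicalLine' and 'SomeCalibLine' are
# illustrative placeholders only.
excluded = ['PassthroughLargeEvent', 'SomeTechnicalLine', 'SomeCalibLine']
extended = re.compile(r'Hlt1(?!' + '|'.join(excluded) + r').*Decision')

assert extended.fullmatch('Hlt1SomeTechnicalLineDecision') is None
assert extended.fullmatch('Hlt1TrackMVADecision') is not None
```

Note that the lookahead excludes by name prefix, so any line whose name starts with one of the excluded strings is filtered out; whether the real filter anchors the match at both ends as fullmatch does here is an assumption in this sketch.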