REANA run fails
As @dparedes has pointed out on Mattermost, when following the current "Running RECAST on REANA" instructions
# Install recast-atlas and a compatible reana client
python -m pip install --upgrade 'recast-atlas[reana]'
# Clone the helloworld repo
git clone ssh://git@gitlab.cern.ch:7999/recast-atlas/examples/helloworld.git
cd helloworld
# Set up environment variables to access the REANA instance at CERN
export REANA_SERVER_URL=https://reana.cern.ch/
export REANA_ACCESS_TOKEN=XXXXXXXXXXXXXXXX
# Add the helloworld example workflow to the catalogue
$(recast catalogue add $PWD)
# Submit to reana using the '--backend reana' option
recast submit examples/helloworld --backend reana --tag helloworld
the job fails with the output of
reana-client logs --workflow recast-helloworld
showing an issue on the backend:
==> Workflow engine logs
Workflow exited unexpectedly.
workflow finished but failed
2022-12-06 21:39:15,366 | yadage.creators | MainThread | INFO | initializing workflow with initdata: {'input_name': 'standard model'} discover: True relative: True
2022-12-06 21:39:15,366 | adage.pollingexec | MainThread | INFO | preparing adage coroutine.
2022-12-06 21:39:15,367 | adage | MainThread | INFO | starting state loop.
2022-12-06 21:39:15,405 | yadage.wflowview | MainThread | INFO | added </init:0|defined|unknown>
2022-12-06 21:39:15,516 | yadage.wflowview | MainThread | INFO | added </hello_world_stage:0|defined|unknown>
2022-12-06 21:39:15,612 | adage.pollingexec | MainThread | INFO | submitting nodes [</init:0|defined|known>]
2022-12-06 21:39:15,664 | pack.init.step | INFO | publishing data: <TypedLeafs: {'input_name': 'standard model'}>
2022-12-06 21:39:15,665 | adage | MainThread | INFO | unsubmittable: 0 | submitted: 0 | successful: 0 | failed: 0 | total: 2 | open rules: 0 | applied rules: 2
2022-12-06 21:39:30,795 | adage.node | MainThread | INFO | node ready </init:0|success|known>
2022-12-06 21:39:30,795 | adage.pollingexec | MainThread | INFO | submitting nodes [</hello_world_stage:0|defined|known>]
2022-12-06 21:39:31,376 | reana-workflow-engine-yadage | MainThread | INFO | Submitted job with id: 157dd74c-2bd6-4cfa-bbcf-ff7bf497f6d1
2022-12-06 21:39:31,376 | adage | MainThread | INFO | unsubmittable: 0 | submitted: 0 | successful: 1 | failed: 0 | total: 2 | open rules: 0 | applied rules: 2
2022-12-06 21:39:46,510 | adage.node | MainThread | INFO | node ready </hello_world_stage:0|failed|known>
2022-12-06 21:39:46,510 | adage | MainThread | ERROR | node: </hello_world_stage:0|failed|known> failed. reason: unknown
2022-12-06 21:39:46,510 | adage | MainThread | INFO | unsubmittable: 0 | submitted: 0 | successful: 1 | failed: 1 | total: 2 | open rules: 0 | applied rules: 2
2022-12-06 21:40:01,529 | adage.controllerutils | MainThread | INFO | no nodes can be run anymore and no rules are applicable
2022-12-06 21:40:01,529 | adage.controllerutils | MainThread | INFO | no nodes can be run anymore and no rules are applicable
2022-12-06 21:40:01,529 | adage | MainThread | ERROR | some weird exception caught in adage process loop
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/adage/__init__.py", line 51, in run_polling_workflow
for stepnum, controller in enumerate(coroutine):
File "/usr/local/lib/python3.8/site-packages/adage/pollingexec.py", line 89, in adage_coroutine
raise RuntimeError('workflow finished but failed')
RuntimeError: workflow finished but failed
2022-12-06 21:40:01,530 | adage | MainThread | ERROR | node: </hello_world_stage:0|failed|known> failed. reason: unknown
2022-12-06 21:40:01,530 | adage | MainThread | INFO | unsubmittable: 0 | submitted: 0 | successful: 1 | failed: 1 | total: 2 | open rules: 0 | applied rules: 2
2022-12-06 21:40:16,537 | reana-workflow-engine-yadage | MainThread | INFO | Finalizing the progress tracking for: <yadage.wflow.YadageWorkflow object at 0x7fc218afcbe0>
2022-12-06 21:40:16,544 | yadage.steering_api | MainThread | INFO | done. dumping workflow to disk.
==> Job logs
==> Step: hello_world_stage
==> Workflow ID: 5075bae0-ee01-4b52-8d70-4d57022a728a
==> Compute backend: Kubernetes
==> Job ID: reana-run-job-438977c5-ca67-478a-b5df-16e17af06b66
==> Docker image: busybox:1.33
==> Command: echo Hello my Name is standard model | tee /var/reana/users/bdd05db5-6c7d-40ee-9079-c0def4365340/workflows/5075bae0-ee01-4b52-8d70-4d57022a728a/hello_world_stage/hello_world.txt
==> Status: failed
==> Finished: 2022-12-06T21:39:46
==> Logs:
job: :
StartError
This will require checking which versions of yadage are deployed on the REANA backend, so after the REANA v0.9.0 upgrade this should be checked again and followed up with the REANA team.
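As a starting point for that comparison, here is a minimal sketch for recording the client-side versions to include when following up. The helper name `workflow_pkg_versions` and the list of package names are assumptions made for illustration, not part of recast-atlas or REANA. It only filters `pip list`-style output, so it does not reveal what is deployed on the backend.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `pip list`-style output on stdin and keep only
# the lines for workflow-related packages. The package list is an assumption.
workflow_pkg_versions() {
  # Anchor on the full package name followed by whitespace, so that
  # e.g. "yadage-schemas" is not reported as "yadage".
  grep -i -E '^(yadage|adage|packtivity|recast-atlas|reana-client|reana-commons)[[:space:]]'
}
```

It could be used as `python -m pip list | workflow_pkg_versions` in the environment used for submission. Running `reana-client ping` alongside it should report the REANA server and client versions, assuming `REANA_SERVER_URL` and `REANA_ACCESS_TOKEN` are set as above. The yadage version inside the workflow engine image would still need to be confirmed with the REANA team.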
Edited by Matthew Feickert