Theory Task having crazy time
Author | Message |
---|---|
Joined: 26 Apr 21 Posts: 2 Credit: 784,747 RAC: 0 |
So I had this weird Theory task show up with 9 days of remaining compute time. I decided to look into the task settings to see what the hell kind of task that was. In the end the job finished after just 2 hours and 8 minutes of runtime. Still pretty strange. The estimated time is usually a little off for all tasks, but 9 days ... |
Joined: 2 May 07 Posts: 2220 Credit: 173,695,307 RAC: 24,745 |
Short runtime and exit code 1:
2022-02-07 03:36:34 (3824): Guest Log: 03:36:32 CET +01:00 2022-02-07: cranky: [INFO] ===> [runRivet] Mon Feb 7 02:36:31 UTC 2022 [boinc pp jets 8000 25 - herwig7 7.2.0 softTune 100000 225]
2022-02-07 03:38:44 (3824): Guest Log: job: run exitcode=1
2022-02-07 03:38:44 (3824): Guest Log: job: diskusage=5148
2022-02-07 03:38:44 (3824): Guest Log: job: logsize=12 k
2022-02-07 03:38:44 (3824): Guest Log: job: times=
2022-02-07 03:38:44 (3824): Guest Log: 0m0.007s 0m0.022s
2022-02-07 03:38:44 (3824): Guest Log: 0m11.909s 0m3.401s
2022-02-07 03:38:44 (3824): Guest Log: job: cpuusage=15
2022-02-07 03:38:44 (3824): Guest Log: 03:38:43 CET +01:00 2022-02-07: cranky: [INFO] Container 'runc' finished with status code 1.
2022-02-07 03:38:44 (3824): Guest Log: 03:38:43 CET +01:00 2022-02-07: cranky: [INFO] Preparing output.
2022-02-07 03:38:44 (3824): Guest Log: [INFO] Job Finished
2022-02-07 03:38:44 (3824): Guest Log: [INFO] Shutting Down.
2022-02-07 03:38:44 (3824): VM Completion File Detected.
2022-02-07 03:38:44 (3824): VM Completion Message: Job Finished |
Joined: 14 Jan 10 Posts: 1409 Credit: 9,325,730 RAC: 9,392 |
exit code 0 = success |
Joined: 31 Jan 11 Posts: 12 Credit: 3,557,813 RAC: 0 |
Hi maeax and Crystal, I posted a message to Erich56 on a similar thread just now, and I repeat it here. I agree it's frustrating, and I don't actually understand what is happening with these runs. In the past, we had argued that, for some generators, we had to accept a small failure rate, since we otherwise could not do comparisons to those generators at all. We had then hoped that updating them to the latest versions would gradually fix the issues we were seeing, but this has not really been the case. Having to operate with a non-negligible rate of failing jobs is not nice, especially when this fraction does not seem to decrease with time. I regret it if we have been too slow to react, but at least now, for 2022, we have come up with a plan to revitalize T4T. To start with, we are going to stop sending out jobs for the generators that are problematic, at least until we can sit down for a proper debugging session with the authors of those codes and fully iron out their issues, so that they are ready and steady to be sent back out in T4T again. During 2022, we plan to start by focusing our attention on getting (back) to the virtual equivalent of what the LHC machine people would call 'stable beams' for the most widely used generator, Pythia, setting a new baseline for future T4T operation. At least for that generator, our team has author-level in-house expertise, so we are confident we can do this if we put in the hours. At the same time, we think this can allow us to try out some new and possibly even more useful tests, which I hope we will also be able to announce down the track. So despite the issues you and others have been experiencing, I hope you will choose to stick with our project a little longer and see if things improve during 2022. Best regards, Peter Skands |
Joined: 2 May 07 Posts: 2220 Credit: 173,695,307 RAC: 24,745 |
This is a new Herwig7 task with exitcode=0 and exitcode=1 in the same task:
Theory_2390-1104641-235_0
Workunit 182509350
2022-02-11 11:50:07 (3432): Guest Log: job: unpack exitcode=0
2022-02-11 11:50:07 (3432): Guest Log: 11:50:06 CET +01:00 2022-02-11: cranky: [INFO] ===> [runRivet] Fri Feb 11 10:50:05 UTC 2022 [boinc pp jets 13000 170,-,2960 - herwig7 7.2.0 softTune 100000 235]
2022-02-11 11:52:25 (3432): Guest Log: job: run exitcode=1 |
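The logs quoted in this thread report a separate exit code for each job stage (`job: unpack exitcode=0`, `job: run exitcode=1`), with 0 meaning success. For anyone who wants to check their own tasks, here is a minimal sketch of how those per-stage codes could be pulled out of a saved copy of the task's stderr output. The regular expression is inferred only from the excerpts above, and the helper names `stage_exit_codes` and `failed_stages` are made up for this example; they are not part of BOINC or cranky.

```python
import re

# Matches records such as "... Guest Log: job: run exitcode=1".
# Assumption: the pattern is inferred from the log excerpts in this thread only.
STAGE_EXIT = re.compile(r"Guest Log: job: (\w+) exitcode=(-?\d+)")

def stage_exit_codes(log_text):
    """Return {stage: exit_code} for every 'job: <stage> exitcode=<n>' record."""
    return {m.group(1): int(m.group(2)) for m in STAGE_EXIT.finditer(log_text)}

def failed_stages(log_text):
    """Return the stages whose exit code is non-zero (0 = success)."""
    return [stage for stage, code in stage_exit_codes(log_text).items() if code != 0]

# Two records copied from the Herwig7 task quoted above.
sample = (
    "2022-02-11 11:50:07 (3432): Guest Log: job: unpack exitcode=0\n"
    "2022-02-11 11:52:25 (3432): Guest Log: job: run exitcode=1\n"
)
print(stage_exit_codes(sample))  # {'unpack': 0, 'run': 1}
print(failed_stages(sample))     # ['run']
```

For that Herwig7 task this shows that unpacking succeeded and only the `run` stage failed, which is the mixed exitcode=0 and exitcode=1 picture described in the post.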
©2024 CERN