No HITS File But Still Granted Credit?
Author | Message |
---|---|
Joined: 17 Sep 04 Posts: 105 Credit: 32,824,862 RAC: 131 |
A few of my completed work units have not produced a HITS file but were still granted credit. Is this correct? "2022-11-27 16:22:00 (22224): Guest Log: No HITS file was produced" Regards, Bob P. |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
"Is this correct?" Yes. Under certain circumstances the scripts can't reliably detect where an error comes from. In those cases the project grants credit although it doesn't get its own reward (the HITS file). In the past, a few users complained about that and suggested not to reward the user in any case of an error. But the project team decided to keep it as it is now. What can be seen from one of your logs is a long break starting during the task's setup phase. This might have played a role (just a guess, without clear evidence). You may want to avoid those breaks, especially the long ones. |
Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 677 |
2022-11-27 16:21:13 (19880): Guest Log: "pilotErrorCode": 1150, 2022-11-27 16:21:13 (19880): Guest Log: "pilotErrorDiag": "Looping job killed by pilot", 2022-11-27 16:21:13 (19880): Guest Log: *** Listing of results directory *** You can check VirtualBox Manager for a yellow triangle. |
Joined: 14 Sep 08 Posts: 52 Credit: 64,094,999 RAC: 37,964 |
"In the past, a few users complained about that and suggested not to reward the user in any case of an error." In this case, does the result show as "Completed and validated" or "Error while computing"? Even if the team decides to grant credit, I would like to get some signal that things went wrong. Unless the user is familiar with the internals of a WU, the result status and credits are the only signals available to us to determine if anything is off. Those signals are common across all BOINC projects too. I know this is the ATLAS forum, but I feel my experience with Theory is very relevant to this discussion. When I first started running native Theory, I thought it was unexpected for some WUs to run very long given the average behavior, so I had a cron job kill the worker process (e.g., Sherpa, rivetvm.exe, etc.) if it ran for more than 12 (or 24?) hours. I didn't abort the task directly simply because finding the offending long-running process with ps is simpler. The results were "Completed and validated", so I assumed my action had no side effects. If the WUs had failed, I certainly wouldn't have continued to do this. Later I tried the same on another machine running vbox, and killing the vboxwrapper failed the task, which led me to look closer. Finally I came to the forum and soon learnt it's normal for some Theory WUs to run long. Needless to say, I don't kill any processes anymore, but not before I had generated a few dozen bogus results. Thus I prefer some clear way of knowing my results were bad, whether I get credit or not. "In those cases the project grants credit although it doesn't get its own reward (the HITS file)." Hmm, even for people going after credits, the fact that we all picked BOINC and a specific project, instead of some pointless workload, should mean the science results are at least remotely relevant to us. I don't know how people would feel about getting credits while not actually helping.
I personally would rather not get credit for errors so I can investigate further. |
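
For anyone who wants the kind of local signal asked for above, here is a minimal Python sketch that scans a task's log text for the warning lines quoted in this thread. It is not part of ATLAS or BOINC. The function names (`scan_task_log`, `looks_suspicious`) are made up for illustration, the patterns are taken only from the excerpts posted here, and treating a `pilotErrorCode` of 0 as "no error" is an assumption. How you get the log text (for example by copying the stderr output from the task's result page) is left to you.

```python
import re

# Patterns copied from the log excerpts quoted in this thread.
# Other tasks may word things differently, so treat them as assumptions.
NO_HITS = "No HITS file was produced"
ERROR_CODE = re.compile(r'"pilotErrorCode":\s*(\d+)')
ERROR_DIAG = re.compile(r'"pilotErrorDiag":\s*"([^"]*)"')

# Sample built from lines posted above (two different tasks, combined
# here purely for illustration).
SAMPLE = """2022-11-27 16:21:13 (19880): Guest Log: "pilotErrorCode": 1150,
2022-11-27 16:21:13 (19880): Guest Log: "pilotErrorDiag": "Looping job killed by pilot",
2022-11-27 16:22:00 (22224): Guest Log: No HITS file was produced"""


def scan_task_log(text):
    """Collect warning signs from a task's log text (hypothetical helper)."""
    findings = {"no_hits": False, "error_codes": [], "error_diags": []}
    for line in text.splitlines():
        if NO_HITS in line:
            findings["no_hits"] = True
        code = ERROR_CODE.search(line)
        # Assumption: a code of 0 means the pilot reported no error.
        if code and int(code.group(1)) != 0:
            findings["error_codes"].append(int(code.group(1)))
        diag = ERROR_DIAG.search(line)
        if diag and diag.group(1):
            findings["error_diags"].append(diag.group(1))
    return findings


def looks_suspicious(findings):
    """True if the log hints that no useful result was produced."""
    return findings["no_hits"] or bool(findings["error_codes"])


if __name__ == "__main__":
    result = scan_task_log(SAMPLE)
    print(result)
    print("suspicious:", looks_suspicious(result))
```

A task flagged this way may still show as credited on the project site, so the check is only a hint to look closer, not a verdict on the result.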
©2024 CERN