file_xfer_error
Author | Message |
---|---|
Send message Joined: 17 Aug 17 Posts: 84 Credit: 8,806,362 RAC: 17,335 |
After about 10 days of work, these seemed to fail on completion. I saw a few marked as failed and watched this one tick over to check, and sure enough it failed at 100%: https://lhcathome.cern.ch/lhcathome/result.php?resultid=406318361 It's giving a file_xfer_error? </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>Theory_2687-2527341-808_1_r686114138_result</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
Send message Joined: 28 Sep 04 Posts: 736 Credit: 49,884,924 RAC: 35,291 |
The default maximum run time for a Theory task is 10 days (same as the deadline). After that it gets aborted automatically. |
Send message Joined: 4 Sep 22 Posts: 95 Credit: 16,235,003 RAC: 16,787 |
"After about 10 days of work, these seemed to fail on completion. I saw a few marked as failed and watched this one tick over to check, and sure enough it failed at 100%" If you look up a bit in the log, you will find this: 2024-03-04 19:29:47 (2245072): Status Report: Job Duration: '864000.000000' 2024-03-04 19:29:47 (2245072): Status Report: Elapsed Time: '860427.000000' 2024-03-04 19:29:47 (2245072): Status Report: CPU Time: '4690.400000' Note the CPU time. In 10 days, the CPU has been in use for barely one hour and 18 minutes. Ordinarily, I think the CPU time should always lag elapsed time by less than one hour. After an hour, if CPU time doesn't increase very nearly as fast as elapsed time, I personally believe there is no point in keeping the task running: just abort it and get another one. |
Send message Joined: 4 Sep 22 Posts: 95 Credit: 16,235,003 RAC: 16,787 |
"The default maximum run time for a Theory task is 10 days (same as the deadline). After that it gets aborted automatically." I have never had any Theory task fail because it ran into the maximum run time. I have had numerous tasks run for 9 days and around 22 or 23 hours, then fail for no apparent reason. In all those instances, I do not recall total CPU time ever being more than one hour behind elapsed time; moreover, the tasks have always run to about 99.95% completion, only to fail with a "computation error". My memory on this next bit is a little foggy, but I do believe the most common reason for failure has been "too many results". |
Send message Joined: 17 Aug 17 Posts: 84 Credit: 8,806,362 RAC: 17,335 |
BOINC hasn't been paused much in that time, and the chip is a 3950X. Any idea why it's seemingly been idle whilst reporting that it's working? |
Send message Joined: 14 Jan 10 Posts: 1429 Credit: 9,539,339 RAC: 5,065 |
"BOINC hasn't been paused much in that time, and the chip is a 3950X. Any idea why it's seemingly been idle whilst reporting that it's working?" This is the reason: 2024-02-23 14:23:35 (2189670): Guest Log: Probing /cvmfs/sft.cern.ch... Failed! That was 2 minutes and 10 seconds after the start. At that moment your system could not connect to CERN. Unfortunately, the software is not written so that after ... retries the task is aborted automatically. |
Send message Joined: 17 Aug 17 Posts: 84 Credit: 8,806,362 RAC: 17,335 |
But it still carried on for another 10 days before failing? |
Send message Joined: 2 May 07 Posts: 2245 Credit: 174,025,522 RAC: 9,726 |
Under Jobs - Theory in this website's header, you can check how the results are for this Theory task. |
Send message Joined: 14 Jan 10 Posts: 1429 Credit: 9,539,339 RAC: 5,065 |
"But it still carried on for another 10 days before failing?" It did not really start, so it cannot fail. The virtual machine for this task got a shutdown signal from vboxwrapper. The default is after 864,000 seconds (10 days), but if you see a Theory task running without using any CPU, it's better to kill it. If you like hanky-panky, such a task could be saved as follows: - Suspend the task without "leave in memory" set. - The task will be saved to disk. - Remove the saved state with VirtualBox Manager. - Start the task with VBox Manager. - After the task is processing its first events, stop the task with VBox Manager (save to disk). - Start the task again with BOINC Manager. |
Send message Joined: 17 Aug 17 Posts: 84 Credit: 8,806,362 RAC: 17,335 |
Cheers, I think I might just stop doing Theory altogether. The project seems a mess at the moment; even the native apps are failing almost instantly. |
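
For anyone who wants to spot such a task early, the two checks suggested in the thread (compare CPU time against elapsed time in the Status Report lines, and look for a failed CVMFS probe in the guest log) can be sketched roughly as below. This is only a minimal sketch: the log lines are the ones quoted in the posts above, the function name `summarize` and the regular expressions are made up for illustration, and the one-hour threshold is the rule of thumb from one of the posts, not a project setting.

```python
import re

# Log lines copied from the posts above; in practice the text would come
# from the task's stderr output (where you read it from is up to you).
SAMPLE_LOG = """\
2024-02-23 14:23:35 (2189670): Guest Log: Probing /cvmfs/sft.cern.ch... Failed!
2024-03-04 19:29:47 (2245072): Status Report: Job Duration: '864000.000000'
2024-03-04 19:29:47 (2245072): Status Report: Elapsed Time: '860427.000000'
2024-03-04 19:29:47 (2245072): Status Report: CPU Time: '4690.400000'
"""

# Hypothetical patterns, written only to match the lines quoted in this thread.
STATUS_RE = re.compile(
    r"Status Report: (Job Duration|Elapsed Time|CPU Time): '([0-9.]+)'"
)
PROBE_FAIL_RE = re.compile(r"Guest Log: Probing (\S+?)\.\.\. Failed!")


def summarize(log_text, max_lag_seconds=3600.0):
    """Collect the latest status values and any failed probes from a log.

    max_lag_seconds is the "CPU time should lag elapsed time by less than
    one hour" rule of thumb from the thread (an assumption, not a BOINC rule).
    """
    status = {}
    failed_probes = []
    for line in log_text.splitlines():
        status_match = STATUS_RE.search(line)
        if status_match:
            # Later status reports overwrite earlier ones.
            status[status_match.group(1)] = float(status_match.group(2))
        probe_match = PROBE_FAIL_RE.search(line)
        if probe_match:
            failed_probes.append(probe_match.group(1))

    elapsed = status.get("Elapsed Time")
    cpu = status.get("CPU Time")
    lag = None if elapsed is None or cpu is None else elapsed - cpu
    stalled = bool(failed_probes) or (lag is not None and lag > max_lag_seconds)
    return {
        "elapsed": elapsed,
        "cpu": cpu,
        "lag": lag,
        "failed_probes": failed_probes,
        "looks_stalled": stalled,
    }


if __name__ == "__main__":
    report = summarize(SAMPLE_LOG)
    print("CPU minutes used:", report["cpu"] / 60.0)
    print("Failed probes:", report["failed_probes"])
    print("Looks stalled:", report["looks_stalled"])
```

On the log from this thread, both signals point the same way: the probe failed shortly after the start, and the CPU time never caught up with the elapsed time.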