jobs is empty
Author | Message |
---|---|
Send message Joined: 9 Jan 15 Posts: 151 Credit: 431,596,822 RAC: 0 |
Looking at MCPlots, the job count appears to have hit 0, contributed CPU time is also 0, and MCPlots spends no time generating new jobs. I have seen a post about a server issue and a backup server replacement, but I wonder whether new jobs are the cause of this. Hosts are running and get some jobs, but they may not run full time and sit idle; I have checked a few tasks and the number of jobs done differs between them. Could a project admin announce whether new jobs will be released to fill the tasks that are sent out? Should we suspend the tasks until the batch system is back to normal? Any info and guidelines for users would be appreciated. |
Send message Joined: 9 Jan 15 Posts: 151 Credit: 431,596,822 RAC: 0 |
Looks like we are dry now: Exit status 207 (0x000000CF) EXIT_NO_SUB_TASKS |
Send message Joined: 18 Dec 15 Posts: 1833 Credit: 119,671,026 RAC: 49,011 |
"Looks like we are dry now" Indeed, the Project Status Page shows "0" for unsent tasks. |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0 |
New Tasks are ending after 20 mins or so with the No_subtasks error, yet strangely, older Tasks that made their connection before this problem started ARE able to pick up new work. No_subtasks over at -dev too. MCPlots still broken. |
Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
Same error here too... No subtasks. Is it coincidence that CMS tasks are getting the same error? I have LHCb tasks running that are getting subtasks but I think they were downloaded before the troubles started. |
Send message Joined: 28 Sep 04 Posts: 736 Credit: 49,896,655 RAC: 35,403 |
I always thought that the subtasks (jobs) are something that is downloaded inside VirtualBox when the task is running. This is why you need a constant connection to the different servers at CERN. I think this would apply to all VM tasks. It would be much more fault tolerant and convenient (to us, the crunchers) if everything necessary was downloaded when BOINC downloads the task and results were uploaded back when everything was finished. |
Send message Joined: 24 Oct 04 Posts: 1182 Credit: 55,600,112 RAC: 50,910 |
You should just suspend all your Theory tasks, if you have any left, since they will just run for 25 mins and then become the usual computer error (aka server error). And you could run LHCb tasks if you want to d/l the 940.47MB vdi. |
Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"I always thought that the subtasks (jobs) are something that is downloaded inside the virtualbox when the task is running. This is why you need the constant connection to the different servers in Cern. I think this would apply to all VM tasks." LHCb tasks are still getting jobs, but if I understand correctly they run under a pilot, not Condor. "It would be much more fault tolerant and convenient (to us, the crunchers) if everything necessary was downloaded when Boinc downloads the task and results were uploaded back when everything was finished." That would require putting result files in the ../slots/#/shared folder, where they could be tampered with. Then they would maybe need to run 2 iterations of every task to verify results. Hiding results in the VM makes it more secure. Also, there may be thousands of non-BOINC hosts crunching Theory, LHCb and CMS native apps under Condor. Keeping everything in a VM means users don't have to compile and install CVMFS, Singularity, etc., because that's all built into the VM. And when the result files are ready, they conveniently go to the same destination (Condor or pilot) that non-BOINC hosts submit their results to. |
Send message Joined: 2 May 07 Posts: 2245 Credit: 174,025,522 RAC: 9,726 |
This task finished successfully last night: https://lhcathome.cern.ch/lhcathome/result.php?resultid=207036080 MCProd says 100% lost ratio. |
Send message Joined: 18 Dec 15 Posts: 1833 Credit: 119,671,026 RAC: 49,011 |
What I do not understand is why new tasks are made available for download (as seen on the Project Status Page) as long as there are no jobs available in the background. Some automatic mechanism should be established to stop the creation of new tasks whenever job creation fails. |
Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"what I do not understand is why new tasks are made available for download (as seen from the Project Status Page) as long as there are no job available in the background." Maybe such a mechanism already exists. Maybe it broke under circumstances they never anticipated. Remember... they're just physicists and IT pros, not rocket scientists. |
Send message Joined: 18 Dec 15 Posts: 1833 Credit: 119,671,026 RAC: 49,011 |
Could anyone from LHC give us information as to when jobs will be available again for the tasks that can still be downloaded? |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
I am investigating. |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
Due to a blockage with results being moved back to the mcplots server, it only gave out a trickle of new jobs. This resulted in the queue emptying quickly, but not for long enough that tasks would stop being created. It has now been unblocked and jobs should start flowing again shortly. |
Send message Joined: 18 Dec 15 Posts: 1833 Credit: 119,671,026 RAC: 49,011 |
Many thanks, Laurence, for investigating and re-activating the job delivery. From what I can see so far, it seems to work now. However, the Project Status Page is showing a continuous drop in the number of "unsent" Theory tasks (right now, it's at 92). Which may mean that now, or soon, we have jobs, but no tasks :-) |
Send message Joined: 18 Dec 15 Posts: 1833 Credit: 119,671,026 RAC: 49,011 |
"However, the Project Status Page is showing a continuous drop in the number of "unsent" Theory tasks (right now, it's at 92). Which may mean that now, or soon, we have jobs, but no tasks :-)" Right now the number of "unsent" tasks is ascending :-) so all looks good! |
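
The thread asks for an automatic mechanism that stops task creation when no jobs are available, and the admin reply suggests the queue was not empty for long enough to trigger one. A minimal Python sketch of one possible approach is a gate with two thresholds, so that a brief trickle of jobs does not switch task creation back on. This is not how the LHC@home server works: `TaskGate`, `low_water`, `high_water` and all the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskGate:
    """Hypothetical gate deciding whether a server should create new tasks.

    The thresholds are illustrative assumptions, not values used by LHC@home.
    """
    low_water: int = 10     # pause task creation at or below this many queued jobs
    high_water: int = 100   # resume task creation at or above this many queued jobs
    creating: bool = True   # current state of task creation

    def update(self, queued_jobs: int) -> bool:
        """Update the gate from the current job-queue depth and return its state."""
        if self.creating and queued_jobs <= self.low_water:
            # Queue is nearly empty: stop handing out tasks that would find no jobs.
            self.creating = False
        elif not self.creating and queued_jobs >= self.high_water:
            # Queue has clearly recovered: start creating tasks again.
            self.creating = True
        return self.creating


if __name__ == "__main__":
    gate = TaskGate()
    # Made-up queue depths: a drain, a small trickle, then a full recovery.
    for depth in [500, 50, 5, 0, 40, 120, 300]:
        state = "create tasks" if gate.update(depth) else "hold tasks"
        print(f"queued jobs = {depth:4d} -> {state}")
```

With two separate thresholds, a trickle of 40 jobs in this example keeps task creation paused until the queue reaches the upper threshold. That is the situation described above, where new tasks ran for about 20 minutes and then failed with EXIT_NO_SUB_TASKS.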
©2025 CERN