Message boards : Number crunching : BOINC manager skipping LHC project
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 31647 - Posted: 27 Jul 2017, 7:12:35 UTC

So BOINC Manager (BM for short) is skipping over LHC and pushing more WCG tasks. WCG is 154K points below LHC in total work, but LHC is lower in RAC than WCG.

At the same time, Rosetta, which was my first project, is higher than LHC in both categories, yet BM gives more tasks to Rosetta and skips LHC.

I raised the LHC resource share to 125; all the other projects are at 100.
I reset the project as well.
Neither changed how BM treats LHC.

What is causing this?
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 31649 - Posted: 27 Jul 2017, 7:53:09 UTC - in response to Message 31647.  
Last modified: 27 Jul 2017, 9:03:57 UTC

When no tasks are running at high priority, projects are prioritized in order of REC (recent estimated credit).
Tasks of the project with the lowest REC start first.
Total credit, RAC and resource share play no role in deciding which task starts first.
A project's REC rises while it has running tasks.

The REC is stored and maintained in client_state.xml.
If you are familiar with the command prompt, open one,
navigate to your BOINC data directory and run these 2 commands:
find "<master_url>" client_state_prev.xml <RETURN> and
find "<rec>" client_state_prev.xml <RETURN>
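
For anyone who would rather not read two separate listings, here is a rough Python sketch that does the same line-by-line search and pairs each REC with its project URL. It is only an illustration: the function name rec_per_project is made up, and it assumes that within each project section the <master_url> line appears before the <rec> line.

```python
import re
from pathlib import Path

# Patterns for the two tags named in the post above.
# "<rec>" does not match other tags that merely start with "rec".
MASTER_URL = re.compile(r"<master_url>(.*?)</master_url>")
REC = re.compile(r"<rec>(.*?)</rec>")


def rec_per_project(text):
    """Return (master_url, rec) pairs, lowest REC first.

    Assumption: inside each project section the <master_url> line comes
    before the <rec> line, so each <rec> belongs to the last URL seen.
    """
    pairs = []
    current_url = None
    for line in text.splitlines():
        url_match = MASTER_URL.search(line)
        if url_match:
            current_url = url_match.group(1)
            continue
        rec_match = REC.search(line)
        if rec_match and current_url is not None:
            pairs.append((current_url, float(rec_match.group(1))))
            current_url = None  # do not pair a later <rec> with the same URL
    return sorted(pairs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Assumes the script is started from the BOINC data directory.
    state_file = Path("client_state_prev.xml")
    if state_file.exists():
        for url, rec in rec_per_project(state_file.read_text(errors="replace")):
            print(f"{rec:14.6f}  {url}")
    else:
        print(f"{state_file} not found; run this from the BOINC data directory")
```

The output is sorted so the project with the lowest REC is on the first line.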
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,147,992
RAC: 15,989
Message 31650 - Posted: 27 Jul 2017, 7:56:41 UTC - in response to Message 31647.  

Different projects give different amounts of credit for the same work. An equal resource share does not mean equal credit or RAC across projects. Here's a comparison between different projects: https://boincstats.com/en/stats/-1/cpcs
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 31654 - Posted: 27 Jul 2017, 12:36:27 UTC - in response to Message 31649.  

So where in Client_State do I find what the REC total is per project?
I see <rec> and <rectime> labels

Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 31659 - Posted: 27 Jul 2017, 14:49:03 UTC - in response to Message 31654.  

In each project section in client_state you'll find

<rec>xxxx.xxxxxx</rec>

The project with the lowest value will start first, provided it has tasks 'Ready to start' and a thread becomes free, for example because a task finished, was suspended or is waiting to run, or the number of available CPUs increased.
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 31663 - Posted: 27 Jul 2017, 17:20:17 UTC - in response to Message 31659.  
Last modified: 27 Jul 2017, 17:21:40 UTC

Well, in that case there is a problem. LHC has only 99.xxxxx time.
Einstein and all the rest, like GPU, have 12-14,000 in time.
Does a reset solve this issue, or what does?

Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 31666 - Posted: 27 Jul 2017, 17:28:47 UTC

GPU projects have their own rules, because such a project does not use much CPU and will always ask for work when the GPU is idle or does not have enough work in its cache.

Not sure what you mean by 'time'.

Could you extract the lines with <master_url> and the lines with <rec> and post them here?
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 31667 - Posted: 27 Jul 2017, 17:55:34 UTC - in response to Message 31666.  
Last modified: 27 Jul 2017, 18:08:22 UTC

By 'time' I mean the values listed in the <rec> section.


<master_url>https://lhcathome.cern.ch/lhcathome/</master_url>
<project_name>LHC@home</project_name>

<rec>99.594310</rec>

Now look at Rosetta, my first project, which I have been running forever and which is CPU only:
<rec>946.911213</rec>

And WCG (World Community Grid):
<rec>944.458780</rec>

Milkyway:
<rec>13309.114918</rec>

So what's going on?
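
To make the rule described earlier in the thread concrete (lowest REC starts first, but only if that project has a task 'Ready to start'), here is a minimal Python sketch applied to the values above. The Project class, the next_project function and the ready-task counts are invented for illustration, and the real BOINC client takes more into account than this.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    rec: float         # value of <rec> as posted above
    ready_tasks: int   # tasks 'Ready to start' (made-up numbers)


def next_project(projects):
    """Lowest REC wins, but only among projects with a task ready to start."""
    candidates = [p for p in projects if p.ready_tasks > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.rec)


projects = [
    Project("LHC@home", 99.594310, ready_tasks=0),   # lowest REC, nothing downloaded
    Project("Rosetta", 946.911213, ready_tasks=3),
    Project("WCG", 944.458780, ready_tasks=3),
    Project("Milkyway", 13309.114918, ready_tasks=3),
]

chosen = next_project(projects)
print(chosen.name if chosen else "nothing ready")  # prints "WCG"
```

With no LHC tasks on the machine, the lowest REC alone cannot make LHC run; the next-lowest project with work ready is picked instead.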
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 31669 - Posted: 27 Jul 2017, 19:39:16 UTC - in response to Message 31667.  

You don't have any LHC tasks in progress, so the real question is why no tasks are being requested from LHC.
When you update LHC@home manually, does it request CPU tasks?
How many tasks have you configured in the LHC project preferences?
greg_be

Joined: 28 Dec 08
Posts: 318
Credit: 4,148,677
RAC: 2,010
Message 31672 - Posted: 27 Jul 2017, 20:42:54 UTC - in response to Message 31669.  
Last modified: 27 Jul 2017, 20:47:33 UTC

I think I found the problem.
In the past I had disabled SixTrack work because I wanted to test VirtualBox again after some problems there. After the other projects ran out of VirtualBox work, I was pretty sure I had enabled SixTrack again, but I guess I hadn't.

Now SixTrack and SixTrack test are enabled along with all the other projects, and I will increase my work buffer by half a day and see if SixTrack loads.
I currently store just 1 day of work, and the job queue is full right now for just 1 day.

I got a bunch of SixTrack tasks now, and I see that the other projects are out of work:
7/27/2017 10:45:17 PM | LHC@home | No tasks are available for CMS Simulation
7/27/2017 10:45:17 PM | LHC@home | No tasks are available for LHCb Simulation
7/27/2017 10:45:17 PM | LHC@home | No tasks are available for Theory Simulation
7/27/2017 10:45:17 PM | LHC@home | No tasks are available for ATLAS Simulation
7/27/2017 10:45:17 PM | LHC@home | No tasks are available for ALICE Simulation
7/27/2017 10:45:17 PM | LHC@home | No tasks are available for Benchmark Application

This is even though the server appears to have work to send.
djoser
Joined: 30 Aug 14
Posts: 145
Credit: 10,847,070
RAC: 0
Message 31699 - Posted: 29 Jul 2017, 15:48:35 UTC

I have a similar thing going on.

One of my machines (ID: 10486800) is not getting any ATLAS tasks.
The BOINC message is "No work available for ATLAS-Simulation", which obviously is not correct.

I didn't try other VB tasks, but SixTrack works well.
Why mine when you can research? - GRIDCOIN - Real cryptocurrency without wasting hashes! https://gridcoin.us
djoser
Joined: 30 Aug 14
Posts: 145
Credit: 10,847,070
RAC: 0
Message 31710 - Posted: 29 Jul 2017, 21:30:39 UTC - in response to Message 31699.  

Nonsense... it's computer ID 10491710 that is not getting ATLAS tasks.
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,335,921
RAC: 102,416
Message 31713 - Posted: 30 Jul 2017, 5:21:22 UTC - in response to Message 31710.  
Last modified: 30 Jul 2017, 5:24:22 UTC


This is a problem that some people have had for the past 3 days (including myself, on all my PCs). It first started with all tasks failing after some 10-15 minutes (probably due to a connectivity problem with the CERN server); later on, no tasks could be downloaded any more.
For more details, see here:

https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4378
djoser
Joined: 30 Aug 14
Posts: 145
Credit: 10,847,070
RAC: 0
Message 31715 - Posted: 30 Jul 2017, 7:23:41 UTC - in response to Message 31713.  

Thanks for that, I have been following that thread already.

I have the feeling that this situation is somehow related to SixTrack. Whenever SixTrack has thousands of workunits in the queue, ATLAS seems to get "hiccups". I recall similar problems a few weeks ago, the last time SixTrack had so many WUs to distribute.

Could these be related?



©2024 CERN