Message boards : ATLAS application : No tasks available

David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 33082 - Posted: 19 Nov 2017, 20:47:04 UTC

A full disk on one of our servers is blocking submission of new tasks. Unfortunately, we'll probably have to wait until tomorrow morning (European time) to fix it.
ID: 33082
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 33087 - Posted: 20 Nov 2017, 9:35:51 UTC - in response to Message 33082.  

The problem has been fixed and new WUs are available now.
ID: 33087
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33579 - Posted: 30 Dec 2017, 17:37:02 UTC

A few minutes ago, the freshly updated Project Status Page showed 400+ unsent tasks for ATLAS.
However, BOINC always tells me "no tasks are available for ATLAS simulation" - how come?
ID: 33579
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,083,051
RAC: 105,919
Message 33580 - Posted: 30 Dec 2017, 17:48:14 UTC

A few minutes ago this ATLAS task finished - but with the Linux native app!
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=83133640
ID: 33580
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33581 - Posted: 30 Dec 2017, 18:26:45 UTC - in response to Message 33580.  

A few minutes ago this ATLAS task finished - but with the Linux native app!
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=83133640

Yes, I have gotten several on my Ubuntu machines. They pass them out along with the VirtualBox ones. I wish we could select which one.
ID: 33581
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33582 - Posted: 30 Dec 2017, 18:50:17 UTC - in response to Message 33581.  

They pass them out along with the VirtualBox ones. I wish we could select which one.
This would indeed be important!
ID: 33582
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33586 - Posted: 31 Dec 2017, 8:01:08 UTC

On all my Windows machines, I still cannot download any ATLAS task, although - according to the Project Status Page - there are always a few hundred "unsent" tasks.
Are they all for Linux only?
ID: 33586
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,362
Message 33587 - Posted: 31 Dec 2017, 9:00:19 UTC - in response to Message 33586.  

On all my Windows machines, I still cannot download any ATLAS task, although - according to the Project Status Page - there are always a few hundred "unsent" tasks.
Are they all for Linux only?
No, I have one running on my Windows machine, but they are hard to get.
I think BOINC's feeder gets confused when there is a massive number of SixTrack workunits in the queue, as we've seen before.
Even when requesting SixTrack tasks, you often get the message 'No tasks available'.
ID: 33587
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33588 - Posted: 31 Dec 2017, 9:04:11 UTC - in response to Message 33587.  

I think BOINC's feeder gets confused when there is a massive number of SixTrack workunits in the queue, as we've seen before.
Even when requesting SixTrack tasks, you often get the message 'No tasks available'.
Indeed, I also often get the message "no tasks available" for SixTrack, as well as for all the other sub-projects.
You may be right that something gets confused by the tons of SixTrack workunits in the queue :-(
ID: 33588
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33589 - Posted: 31 Dec 2017, 9:19:27 UTC - in response to Message 33588.  

The server is overloaded and deferring requests. It should say "project is backed off" in your message log. You have to enable showing [work_fetch] messages first.
ID: 33589
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33590 - Posted: 31 Dec 2017, 9:30:39 UTC - in response to Message 33589.  

You have to enable showing [work_fetch] messages first.
Do you mean "work_fetch_debug" in the BOINC diagnostic log flags settings?
ID: 33590
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33591 - Posted: 31 Dec 2017, 9:38:17 UTC - in response to Message 33590.  
Last modified: 31 Dec 2017, 9:41:21 UTC

You have to enable showing [work_fetch] messages first.
Do you mean "work_fetch_debug" in the BOINC diagnostic log flags settings?
I tried this now; obviously it was the wrong one :-(
I got the message "missing applications", virtually the same message as before.
ID: 33591
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 33592 - Posted: 31 Dec 2017, 9:53:33 UTC - in response to Message 33586.  
Last modified: 31 Dec 2017, 10:43:41 UTC

I can't answer you, Erich,
but after having a look at the situation on other crunchers, I find it curious that for the same volunteer, one of their hosts is filled with workunits while another host is still waiting for workunits.
It looks as if the sharing is not equal within a single user (and globally between different users, of course).
"Maybe" it would be advantageous for the project to hand out workunits with as small a local buffer as possible, so as to share the jobs better among volunteers.
Doing this, for instance, by overriding the local crunchers' preferences while available workunits are scarce,
would probably increase the global efficiency of the server because

    the total turnaround time of a workunit would decrease (a shorter queue on a host before it is run, and a higher probability of getting that host's result back sooner)
    disk space on the server holding pending results could be reduced (above all for SixTrack, which needs 2 results before being validated)

Is it possible (only under overload conditions, not in a normal situation...)?
In a normal situation it is fine for someone to keep a local buffer for himself, where he can store the workunits of his choice, but in a critical situation (overload or breakdown) maybe the user has to change his habits, too.

ID: 33592
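For volunteers who want to shrink their own work buffer along the lines suggested above: BOINC reads a global_prefs_override.xml file from its data directory, and the two work-buffer settings from the Manager's computing preferences can be set there directly. A minimal sketch - the 0.1 / 0.25 day values are only an illustration, not a project recommendation:

<global_preferences>
    <!-- "Store at least ... days of work" -->
    <work_buf_min_days>0.1</work_buf_min_days>
    <!-- "Store up to an additional ... days of work" -->
    <work_buf_additional_days>0.25</work_buf_additional_days>
</global_preferences>

Re-reading local preferences in the BOINC Manager (or restarting the client) applies the file.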
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33593 - Posted: 31 Dec 2017, 10:10:41 UTC - in response to Message 33592.  

It is really strange: even the latest update of the Project Status Page shows plenty of unsent ATLAS tasks. And I have been pushing the "Update" button almost every minute, for at least 2 hours now, and each time it said "no tasks available" :-(
ID: 33593
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33594 - Posted: 31 Dec 2017, 11:23:01 UTC - in response to Message 33592.  

I disagree. I like to have a two-day buffer, and I'd prefer even more, because the run-time estimates for SixTrack are way off (at times I get through a two-day buffer in half a day), and also because the project has become very unreliable.

By the way, limiting WUs per user wouldn't solve the current issues.
The servers are hamstrung because of failing retries and because of queues that are unnecessarily long (a million-plus entries), not because I have downloaded 40 WUs that are waiting to run.
Each download is only a database entry for a waiting file, and the database isn't the issue, the file system is. WUs would actually finish if I could return my results. Instead I'm jamming the server with retries, which can't complete because the file system is clogged.

Distribution of WUs is of course per host (i.e. per system, not per account), as far as I know.
ID: 33594
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33595 - Posted: 31 Dec 2017, 11:37:53 UTC - in response to Message 33593.  

[work_fetch_debug] is the correct flag. Check your log once it is set and it should show a more detailed account of what is going on.

I stopped receiving WUs yesterday (SixTrack in my case; I got off ATLAS once this mess started) because the project backed off.
The message reads something like this:

[work_fetch] share 0.000 project is backed off (resource backoff: 3891.77, inc 9600.00)
ID: 33595
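As a side note, the flag discussed above can also be switched on by editing cc_config.xml in the BOINC data directory by hand. A minimal sketch; restart the client (or have it re-read its config files) afterwards:

<cc_config>
    <log_flags>
        <!-- print the [work_fetch] lines quoted above in the event log -->
        <work_fetch_debug>1</work_fetch_debug>
    </log_flags>
</cc_config>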
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 33599 - Posted: 31 Dec 2017, 13:11:59 UTC - in response to Message 33595.  
Last modified: 31 Dec 2017, 13:17:17 UTC

@AuxRx :

Thanks for expressing your point of view.
It's always important to have different opinions on the same subject, in order to better understand the environment and to let other people form their own idea of it.
ID: 33599
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33600 - Posted: 31 Dec 2017, 14:49:01 UTC - in response to Message 33595.  

The message reads something like this:
[work_fetch] share 0.000 project is backed off (resource backoff: 3891.77, inc 9600.00)
Here is a copy of the event log:

31/12/2017 10:34:45 | LHC@home | [work_fetch] share 1.000
31/12/2017 10:34:45 | LHC@home | [work_fetch] share 0.000 no applications
31/12/2017 10:34:55 | LHC@home | update requested by user
31/12/2017 10:34:55 | | [work_fetch] Request work fetch: project updated by user
31/12/2017 10:35:00 | | [work_fetch] ------- start work fetch state -------
31/12/2017 10:35:00 | | [work_fetch] target work buffer: 34560.00 + 34560.00 sec
31/12/2017 10:35:00 | LHC@home | [work_fetch] REC 6463.464 prio -1.031 can request work
31/12/2017 10:35:00 | LHC@home | [work_fetch] share 1.000
31/12/2017 10:35:00 | LHC@home | [work_fetch] share 0.000 no applications
31/12/2017 10:35:00 | LHC@home | [work_fetch] set_request() for CPU: ninst 12 nused_total 16.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 270833.28
31/12/2017 10:35:00 | LHC@home | [work_fetch] request: CPU (270833.28 sec, 0.00 inst) NVIDIA GPU (0.00 sec, 0.00 inst)
31/12/2017 10:35:00 | LHC@home | Sending scheduler request: Requested by user.
31/12/2017 10:35:00 | LHC@home | Requesting new tasks for CPU
31/12/2017 10:35:02 | LHC@home | Scheduler request completed: got 0 new tasks
31/12/2017 10:35:02 | LHC@home | No tasks sent
31/12/2017 10:35:02 | LHC@home | No tasks are available for ATLAS Simulation
ID: 33600
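(For reference: the "target work buffer: 34560.00 + 34560.00 sec" line in the log above works out to 0.4 + 0.4 days, since 34560 / 86400 = 0.4 - i.e. the two work-buffer settings mentioned earlier in the thread.)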
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33601 - Posted: 31 Dec 2017, 15:00:57 UTC

I see the same thing of course - "no work available" even when there is. And I just enabled ATLAS again after a few days, and picked up a native ATLAS task immediately. But it is not a problem for the project if they are getting their work done; it doesn't matter who does it.

(Erich - don't miss the Vienna New Year's Concert by spending too much time pressing the button.)
ID: 33601
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,334,692
RAC: 102,649
Message 33603 - Posted: 31 Dec 2017, 15:22:05 UTC - in response to Message 33601.  

(Erich - don't miss the Vienna New Year's Concert by spending too much time pressing the button.)
haha, I definitely won't :-)

Happy New Year to everybody !
ID: 33603