Message boards : Number crunching : Invalid tasks

Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 26486 - Posted: 18 May 2014, 11:37:41 UTC - in response to Message 26485.  

Your Linux server is fine. With an APR of 6.1685501327024 for the current application, its statistics haven't been perturbed by short-running 'chaotic' tasks, and you run no danger of 'EXIT_TIME_LIMIT_EXCEEDED'.

For your Windows machine, I'd never suggest running tasks otherwise than the programmers intended. But the programmers intended the tasks to run to completion, not to be prematurely terminated.

You could have a look at your list of Tasks in progress for host 10308609, and selectively cull the jobs which are at risk of failure. Click on the number in the 'Workunit' column for each task in turn.

The top one, WU 17293121, was completed by your wingmate in under three minutes - that's a keeper, it's safe to run.

Towards the bottom of page 2 is WU 17279782: that one took 8 hours - too long for you with the server in its current state. I'd abort it and let someone else have a go.

And so on down the list. When your wingmate hasn't reported yet, suspend the task and have another look tomorrow.
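
The keep/abort/suspend rule above can be sketched in a few lines. This is only an illustration: the function name `triage` and the `limit_seconds` cut-off are hypothetical, the thread gives no exact threshold, and a wingmate's run time is only a rough guide because their host may be faster or slower than yours.

```python
from typing import Optional


def triage(wingmate_seconds: Optional[float], limit_seconds: float) -> str:
    """Suggest what to do with one task in progress.

    wingmate_seconds: run time the wingmate reported, or None if the
                      wingmate has not reported yet.
    limit_seconds:    hypothetical cut-off for "too long for this host";
                      the thread does not give an exact figure.
    """
    if wingmate_seconds is None:
        return "suspend"   # no evidence yet: look again tomorrow
    if wingmate_seconds < limit_seconds:
        return "keep"      # short enough to be safe to run
    return "abort"         # let someone else have a go


# Illustrative values only: "under three minutes" and "8 hours" come from
# the post, the one-hour limit is an assumption.
limit = 3600.0
print(triage(170.0, limit))
print(triage(8 * 3600.0, limit))
print(triage(None, limit))
```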
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26487 - Posted: 18 May 2014, 12:28:13 UTC

As a possible quick fix I have set the fpops_bound to
100 times the "normal" value. SixTrack never loops
infinitely! :-) New WUs as of now should have the
new value. Have to work this out properly (tomorrow).
Eric.
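
A rough sketch of why a larger fpops_bound helps, assuming the client's run-time limit is roughly the bound divided by the speed the server projects for the host (the APR, taken here to be in GFLOPS). That relationship is an assumption for illustration, not something stated in this thread. `normal_bound` and the inflated APR are made-up values; only the Linux APR is quoted from the earlier post.

```python
GFLOPS = 1e9


def time_limit_seconds(fpops_bound: float, apr_gflops: float) -> float:
    """Approximate elapsed-time limit before EXIT_TIME_LIMIT_EXCEEDED.

    Assumes limit = fpops_bound / projected speed, with the projected
    speed taken from the host's APR in GFLOPS.
    """
    if apr_gflops <= 0:
        raise ValueError("APR must be positive")
    return fpops_bound / (apr_gflops * GFLOPS)


normal_bound = 1e15            # hypothetical "normal" fpops_bound
linux_apr = 6.1685501327024    # APR quoted earlier in the thread
inflated_apr = 100.0           # hypothetical APR skewed by short tasks

for label, apr in (("linux", linux_apr), ("inflated", inflated_apr)):
    before = time_limit_seconds(normal_bound, apr)
    after = time_limit_seconds(100 * normal_bound, apr)
    print(f"{label}: {before / 3600:.1f} h -> {after / 3600:.1f} h")
```

Under that assumption the limit scales linearly with the bound, so a factor of 100 gives 100 times the allowed run time whatever the host's APR is.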
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 742
Credit: 540,637,655
RAC: 297,130
Message 26488 - Posted: 18 May 2014, 17:05:26 UTC - in response to Message 26487.  

That would be kind of cool if someone did get a WU that looped forever!
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26490 - Posted: 18 May 2014, 19:48:59 UTC - in response to Message 26488.  

I think I should also stop and restart BOINC, doing a
"reset credit statistics" as suggested by Richard.
I can't do that, so it will have to wait and be done
carefully tomorrow. Eric.
©2023 CERN