Message boards : Sixtrack Application : Exceeded elapsed time limit

USTL-FIL (Lille Fr)

Joined: 11 Dec 09
Posts: 27
Credit: 236,761,681
RAC: 1,344
Message 33456 - Posted: 21 Dec 2017, 14:46:40 UTC

Hello!
I currently have an error rate of 10%.
850 of my tasks have failed because the time limit was exceeded:
>exceeded elapsed time limit 5960.74 (180000000.00G/30197.62G)
For example, this one: 82198734
Can you do something to reduce this waste of time and energy?
Thanks! And happy New Year celebrations to all lhc@home users and staff!
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1274
Credit: 8,480,242
RAC: 2,028
Message 33465 - Posted: 22 Dec 2017, 6:46:51 UTC - in response to Message 33456.  

The measured integer speed of that machine is reported as 23.06 billion ops/sec. That seems a bit high for an i5-4570.

You could consider rerunning BOINC's benchmark.
USTL-FIL (Lille Fr)

Joined: 11 Dec 09
Posts: 27
Credit: 236,761,681
RAC: 1,344
Message 33467 - Posted: 22 Dec 2017, 14:46:59 UTC
Last modified: 22 Dec 2017, 14:51:14 UTC

Hello,

It is not only this host; I now have 1225 tasks in error across all my hosts:
https://lhcathome.cern.ch/lhcathome/results.php?userid=175863&offset=0&show_names=0&state=6&appid=

I reran the benchmark (BOINC client 7.6.31, x86_64-pc-linux-gnu):

ven. 22 déc. 2017 15:05:12 CET | | Running CPU benchmarks
ven. 22 déc. 2017 15:05:12 CET | | Suspending computation - CPU benchmarks in progress
ven. 22 déc. 2017 15:05:43 CET | | Benchmark results:
ven. 22 déc. 2017 15:05:43 CET | | Number of CPUs: 4
ven. 22 déc. 2017 15:05:43 CET | | 4492 floating point MIPS (Whetstone) per CPU
ven. 22 déc. 2017 15:05:43 CET | | 23007 integer MIPS (Dhrystone) per CPU
ven. 22 déc. 2017 15:05:45 CET | | Resuming computation

After re-syncing with lhc@home, the host page shows:

Measured floating point speed 4.49 billion ops/sec
Measured integer speed 23.01 billion ops/sec
Average upload rate 269.21 KB/sec
Average download rate 17151.04 KB/sec
Average turnaround time 0.1 days
Application details Show
Tasks 115
Number of times client has contacted server 1075
Last contact 22 Dec 2017, 14:05:59 UTC

The CPU:
Processor: 4 GenuineIntel Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz [Family 6 Model 60 Stepping 3]

Of the 115 tasks on this host, 109 failed with "time limit exceeded" after only 1h39 of CPU time:

Client state Compute error
Exit status 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
Computer ID 10504896
Run time 1 hour 39 min 22 sec
CPU time 1 hour 38 min 59 sec

What is wrong with my hosts?

Thanks.

I have stopped requesting new tasks until this problem is solved.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1274
Credit: 8,480,242
RAC: 2,028
Message 33470 - Posted: 22 Dec 2017, 15:52:30 UTC - in response to Message 33467.  

> ...
> Of the 115 tasks on this host, 109 failed with "time limit exceeded" after only 1h39 of CPU time:
>
> Exit status 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
> Computer ID 10504896
> Run time 1 hour 39 min 22 sec
> CPU time 1 hour 38 min 59 sec
>
> What is wrong with my hosts?
>
> Thanks.
>
> I have stopped requesting new tasks until this problem is solved.

Not requesting new tasks for that host is a good idea.
On a machine similar to yours (i5-4670), SixTrack tasks need about 20,000 seconds. Somehow BOINC thinks
your machine can run them much faster, and when a task has not finished within that shorter time, the client aborts it with time-limit-exceeded.
While you still have tasks on that machine, you could look in client_state.xml for the rsc_fpops_est and rsc_fpops_bound values of a sixtrack workunit.
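
If you would rather script that lookup than search by hand, here is a minimal Python sketch. It assumes client_state.xml contains `<workunit>` elements with `<name>`, `<app_name>`, `<rsc_fpops_est>` and `<rsc_fpops_bound>` children. The function name and the sample XML are made up for illustration; on a real host you would read the file from the BOINC data directory instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a real client_state.xml;
# the workunit names are invented for illustration.
SAMPLE = """<client_state>
  <workunit>
    <name>example_wu</name>
    <app_name>sixtrack</app_name>
    <rsc_fpops_est>180000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>180000000000000000.000000</rsc_fpops_bound>
  </workunit>
  <workunit>
    <name>other_wu</name>
    <app_name>other_app</app_name>
    <rsc_fpops_est>1000.0</rsc_fpops_est>
    <rsc_fpops_bound>2000.0</rsc_fpops_bound>
  </workunit>
</client_state>"""


def workunit_fpops(xml_text, app_name="sixtrack"):
    """Return (name, rsc_fpops_est, rsc_fpops_bound) for each workunit of app_name."""
    root = ET.fromstring(xml_text)
    rows = []
    for wu in root.iter("workunit"):
        if wu.findtext("app_name") != app_name:
            continue
        rows.append((
            wu.findtext("name"),
            float(wu.findtext("rsc_fpops_est")),
            float(wu.findtext("rsc_fpops_bound")),
        ))
    return rows


for name, est, bound in workunit_fpops(SAMPLE):
    # The ratio shows how much headroom the bound leaves over the estimate.
    print(f"{name}: est={est:.3e} bound={bound:.3e} ratio={bound / est:.0f}")
```

If the bound is a large multiple of the estimate, the workunit itself is probably not what makes the limit so short, and the speed the client assumes for the host is the next thing to check.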
USTL-FIL (Lille Fr)

Joined: 11 Dec 09
Posts: 27
Credit: 236,761,681
RAC: 1,344
Message 33471 - Posted: 22 Dec 2017, 16:14:10 UTC - in response to Message 33470.  
Last modified: 22 Dec 2017, 16:52:58 UTC

Thanks for your reply,

Here is an extract of client_state.xml for an LHC task:

<workunit>
    <name>LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__78_1_sixvf_boinc47429</name>
    <app_name>sixtrack</app_name>
    <version_num>4630</version_num>
    <rsc_fpops_est>180000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>180000000000000000.000000</rsc_fpops_bound>
    <rsc_memory_bound>100000000.000000</rsc_memory_bound>
    <rsc_disk_bound>200000000.000000</rsc_disk_bound>
    <file_ref>
        <file_name>LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__78_1_sixvf_boinc47429.zip</file_name>
        <open_name>Sixin.zip</open_name>
        <copy_file/>
    </file_ref>
</workunit>


In the task's slot directory I see this:

/var/lib/boinc-client/slots/0# cat init_data.xml |grep fpops
<rsc_fpops_est>180000000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>180000000000000000.000000</rsc_fpops_bound>
    <p_fpops>4492264896.653359</p_fpops>


And here is the BOINC client log:

ven. 22 déc. 2017 17:09:38 CET | LHC@home | Aborting task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__70.5_1_sixvf_boinc47424_0: exceeded elapsed time limit 5960.74 (180000000.00G/30197.62G)
ven. 22 déc. 2017 17:09:40 CET | LHC@home | Computation for task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__70.5_1_sixvf_boinc47424_0 finished
ven. 22 déc. 2017 17:09:40 CET | LHC@home | Output file LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__70.5_1_sixvf_boinc47424_0_r1509978979_0 for task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__70.5_1_sixvf_boinc47424_0 absent
ven. 22 déc. 2017 17:09:40 CET | LHC@home | Starting task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__76.5_1_sixvf_boinc47428_1
ven. 22 déc. 2017 17:12:28 CET | LHC@home | Aborting task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__73.5_1_sixvf_boinc47426_0: exceeded elapsed time limit 5960.74 (180000000.00G/30197.62G)
ven. 22 déc. 2017 17:12:29 CET | LHC@home | Computation for task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__73.5_1_sixvf_boinc47426_0 finished
ven. 22 déc. 2017 17:12:29 CET | LHC@home | Output file LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__73.5_1_sixvf_boinc47426_0_r309012229_0 for task LHC_2015_LHC_2015_260_BOINC_errors__14__s__62.31_60.32__4.3_4.4__5__73.5_1_sixvf_boinc47426_0 absent


5960.74 s is about 1h39m.
The tasks are only 27% done when they are killed.
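
A quick sanity check on those two numbers, as a Python sketch. It assumes progress is linear in elapsed time, which is only an approximation, and the helper names are mine.

```python
def projected_runtime(elapsed_s, fraction_done):
    """Extrapolate total runtime, assuming progress is linear in elapsed time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s / fraction_done


def fmt_hms(seconds):
    """Format a duration in seconds as XhYYmZZs (truncated to whole seconds)."""
    s = int(seconds)
    return f"{s // 3600}h{s % 3600 // 60:02d}m{s % 60:02d}s"


limit_s = 5960.74  # elapsed time limit taken from the client log
total_s = projected_runtime(limit_s, 0.27)
print(fmt_hms(limit_s), "->", fmt_hms(total_s))
```

Under that assumption a full run would take a bit over six hours, the same order of magnitude as the roughly 20,000 seconds quoted for a similar CPU.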

Here is a host with no problem (10498212):
ven. 22 déc. 2017 07:37:07 CET |  | Processor: 4 GenuineIntel Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz [Family 6 Model 94 Stepping 3]
ven. 22 déc. 2017 17:42:22 CET |  | Benchmark results:
ven. 22 déc. 2017 17:42:22 CET |  | Number of CPUs: 4
ven. 22 déc. 2017 17:42:22 CET |  | 4670 floating point MIPS (Whetstone) per CPU
ven. 22 déc. 2017 17:42:22 CET |  | 24851 integer MIPS (Dhrystone) per CPU


/var/lib/boinc-client/slots/7# cat init_data.xml |grep fpops
<rsc_fpops_est>180000000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>180000000000000000.000000</rsc_fpops_bound>
    <p_fpops>4661209424.869622</p_fpops>


On this host, tasks finish without any problem... I don't understand...
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1274
Credit: 8,480,242
RAC: 2,028
Message 33472 - Posted: 22 Dec 2017, 17:24:49 UTC - in response to Message 33456.  
Last modified: 22 Dec 2017, 17:30:09 UTC

>exceeded elapsed time limit 5960.74 (180000000.00G/30197.62G)

Your failed tasks had their maximum time limit calculated with a p_fpops of 30197620000000.000000,
whereas p_fpops is now 4492264896.653359.

It seems your p_fpops has been corrected, but the failed tasks may have arrived on your machine before that correction.
Could you check with a newly arrived task?
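
To make the arithmetic explicit: the two numbers in parentheses in the log line read as a bound divided by a speed, both in units of 1e9 (G), and their quotient is the reported limit. Here is a small Python sketch under that reading. The regular expression and function name are my own and not part of BOINC.

```python
import re

LOG = "exceeded elapsed time limit 5960.74 (180000000.00G/30197.62G)"

# Pattern written for this one message format; hypothetical, not a BOINC API.
PATTERN = re.compile(
    r"exceeded elapsed time limit (?P<limit>[\d.]+) "
    r"\((?P<bound>[\d.]+)G/(?P<flops>[\d.]+)G\)"
)


def parse_limit(line):
    """Return (limit_s, fpops_bound, flops) parsed from an abort message."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an elapsed-time-limit message")
    return (
        float(m["limit"]),
        float(m["bound"]) * 1e9,  # "G" suffix: multiply by 1e9
        float(m["flops"]) * 1e9,
    )


limit_s, bound, flops = parse_limit(LOG)
print(f"bound / speed   = {bound / flops:.2f} s (log says {limit_s})")

# Same bound, divided by the p_fpops value now shown in init_data.xml.
corrected = 4492264896.653359
print(f"with p_fpops    = {bound / corrected:.0f} s")
print(f"speed overrated = {flops / corrected:.0f}x")
```

On these numbers the speed used for the limit is several thousand times the benchmarked value, which would explain why a task that needs hours is cut off after 1h39m.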
USTL-FIL (Lille Fr)

Joined: 11 Dec 09
Posts: 27
Credit: 236,761,681
RAC: 1,344
Message 33473 - Posted: 22 Dec 2017, 18:01:32 UTC - in response to Message 33472.  

Oh yeah!!
Why did BOINC make such a mistake and come up with a p_fpops of 30197620000000.000000? It's crazy!
What can I do to correct the problem on my other hosts? Reset the project?

I can't test new tasks now because all my hosts go offline tonight for the holidays, but I will check for new failed tasks when I am back.
Thank you very much for solving this mysterious problem!
Merry Christmas


