1) Message boards : Number crunching : Very long job (Message 43605)
Posted 14 Nov 2020 by Matthias Lehmkuhl
Post:
Looks like I have a very long runner too:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=283384624
So far I see a CPU time of 7d 18h at 74.6%, with an estimated 2d 09h still to go and a deadline of 15.11.2020 05:58 CET shown in the client.
Progress is growing slowly and CPU usage for the task is at 100%.

Is it possible to extend the deadline for the result, which is now 16.11.2020, 4:58:17 UTC?
Or should I cancel the result once the server deadline has passed?
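
As a rough cross-check of the figures above, here is a minimal sketch of a naive linear extrapolation. It assumes progress keeps growing at the same average rate as so far, which may not hold for this task, and the function name `remaining_time` is only illustrative. The client's own estimate is presumably computed differently, so the two numbers need not agree.

```python
from datetime import timedelta


def remaining_time(cpu_time: timedelta, fraction_done: float) -> timedelta:
    """Naive linear extrapolation of the time still to go.

    Assumption: progress grows at a constant rate, so
    total = cpu_time / fraction_done.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    total = cpu_time / fraction_done
    return total - cpu_time


# Figures from the post: 7d 18h of CPU time at 74.6% progress.
cpu_time = timedelta(days=7, hours=18)
left = remaining_time(cpu_time, 0.746)
print(f"about {left.total_seconds() / 3600:.1f} hours still to go")
```

With these figures the linear estimate comes out at roughly 63 hours, somewhat longer than the 2d 09h the client shows.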
2) Message boards : ATLAS application : Very slow task download (Message 35725)
Posted 1 Jul 2018 by Matthias Lehmkuhl
Post:
Count me in. The download sometimes gets stuck at very different progress percentages (mostly at 80 to 90% of the download).
LHC@home LeiNDmPLhtsnyYickojUe11pABFKDmABFKDmLNIMDmABFKDmsJuyan_EVNT.14296450._001903.pool.root.1 1,054 314106,06 K 00:22:22 2,53 Kbps Downloading
Yesterday I saw that the download ran at normal speed after restarting the BOINC client.
My problem is that other projects also do not download on that Windows machine (BOINC 7.10.2) as long as the download of the ATLAS task has not finished.
I will try to increase the buffer.
3) Message boards : Number crunching : can't open server log file (Message 26867)
Posted 11 Oct 2014 by Matthias Lehmkuhl
Post:
count me in

334 LHC@home 1.0 11.10.2014 15:07:47 [error] Error reported by file upload server: can't open log file '../log_boinc05/file_upload_handler.log' (errno: 9)
4) Message boards : Number crunching : Tasks exceeding disk limit (Message 26786)
Posted 3 Oct 2014 by Matthias Lehmkuhl
Post:
some more results
1328 LHC@home 1.0 03.10.2014 14:17:43 Aborting task w-b3_-12000_job.HLLHC_b3_-12000.0732__3__s__62.31_60.32__13_15__5__14.1177_1_sixvf_boinc558_3: exceeded disk limit: 197.24MB > 190.73MB
1392 LHC@home 1.0 03.10.2014 14:47:45 Aborting task w-b3_30000_job.HLLHC_b3_30000.0732__3__s__62.31_60.32__15_17__5__47.6472_1_sixvf_boinc1916_4: exceeded disk limit: 192.40MB > 190.73MB
1427 LHC@home 1.0 03.10.2014 14:52:45 Aborting task w-b3_24000_job.HLLHC_b3_24000.0732__4__s__62.31_60.32__11_13__5__75.8825_1_sixvf_boinc2141_2: exceeded disk limit: 213.06MB > 190.73MB
1471 LHC@home 1.0 03.10.2014 15:22:49 Aborting task w-b3_0_job.HLLHC_b3_0.0732__13__s__62.31_60.32__11_13__5__10.5883_1_sixvf_boinc4388_0: exceeded disk limit: 193.56MB > 190.73MB
1507 LHC@home 1.0 03.10.2014 15:52:52 Aborting task w-b3_4000_job.HLLHC_b3_4000.0732__8__s__62.31_60.32__13_15__5__67.059_1_sixvf_boinc3137_3: exceeded disk limit: 202.94MB > 190.73MB
1508 LHC@home 1.0 03.10.2014 15:52:52 Aborting task w-b3_-24000_job.HLLHC_b3_-24000.0732__10__s__62.31_60.32__11_13__5__15.8824_1_sixvf_boinc3676_3: exceeded disk limit: 191.12MB > 190.73MB
1546 LHC@home 1.0 03.10.2014 16:27:55 Aborting task w-b3_-22000_job.HLLHC_b3_-22000.0732__10__s__62.31_60.32__13_15__5__63.5296_1_sixvf_boinc3746_2: exceeded disk limit: 216.95MB > 190.73MB

example WU
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=21252862

Some results actually finished successfully.
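
To see by how much each task went over, the log lines above can be parsed with a short script. This is only a sketch: the regular expression is written against the message format shown in this post, and the names `PATTERN` and `parse_disk_limit_aborts` are made up for illustration, not part of BOINC.

```python
import re

# Matches the "Aborting task ...: exceeded disk limit: X MB > Y MB" messages
# as they appear in the log excerpt in this post.
PATTERN = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_disk_limit_aborts(log_text):
    """Return (task, used_mb, limit_mb) for every disk-limit abort line."""
    return [
        (m["task"], float(m["used"]), float(m["limit"]))
        for m in PATTERN.finditer(log_text)
    ]


# Two lines copied from the log above.
sample = """\
1328 LHC@home 1.0 03.10.2014 14:17:43 Aborting task w-b3_-12000_job.HLLHC_b3_-12000.0732__3__s__62.31_60.32__13_15__5__14.1177_1_sixvf_boinc558_3: exceeded disk limit: 197.24MB > 190.73MB
1508 LHC@home 1.0 03.10.2014 15:52:52 Aborting task w-b3_-24000_job.HLLHC_b3_-24000.0732__10__s__62.31_60.32__11_13__5__15.8824_1_sixvf_boinc3676_3: exceeded disk limit: 191.12MB > 190.73MB
"""

results = parse_disk_limit_aborts(sample)
for task, used, limit in results:
    print(f"{task}: {used - limit:.2f} MB over the {limit:.2f} MB limit")
```

Note that 190.73MB is what 200,000,000 bytes comes to when divided by 1024*1024, so the limit may have been set as 200 MB in decimal bytes; that is an inference from the number, not something confirmed in this thread.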
5) Message boards : Number crunching : Tasks exceeding disk limit (Message 26768)
Posted 2 Oct 2014 by Matthias Lehmkuhl
Post:
Count me in:
397 LHC@home 1.0 02.10.2014 21:51:31 Aborting task w-b3_-14000_job.HLLHC_b3_-14000.0732__2__s__62.31_60.32__13_15__5__7.05884_1_sixvf_boinc304_3: exceeded disk limit: 209.22MB > 190.73MB

Looks like we have some bad WUs:
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=21130329
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=21124227
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=21129470

All WUs have more than one result with
196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED
and it's OS independent.

On Windows, the "BOINC Windows Runtime Debugger" is also started.
6) Message boards : Number crunching : Invalid tasks (Message 26338)
Posted 9 Apr 2014 by Matthias Lehmkuhl
Post:
I have 2 more invalid results: Linux SixTrack v446.05 (pni) against 2 Windows systems running SixTrack v446.03 (pni)
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=16419099
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=16425599

and 2 more with "validation inconclusive", Linux against Windows
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=16583683
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=16592011
For both, the next result has not been sent yet.
7) Message boards : Number crunching : Computation Error (Message 24366)
Posted 14 Jul 2012 by Matthias Lehmkuhl
Post:
Confirmed: the results with start date 13 that I got have finished correctly.
I didn't need to reset the project.

http://lhcathomeclassic.cern.ch/sixtrack/result.php?resultid=4176438
8) Message boards : Number crunching : Computation Error (Message 24336)
Posted 13 Jul 2012 by Matthias Lehmkuhl
Post:
I also got some results with
<message>
Maximum elapsed time exceeded
</message>
Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7729280C

Engaging BOINC Windows Runtime Debugger...

http://lhcathomeclassic.cern.ch/sixtrack/result.php?resultid=4162849
http://lhcathomeclassic.cern.ch/sixtrack/result.php?resultid=4166176
http://lhcathomeclassic.cern.ch/sixtrack/result.php?resultid=4166151
9) Message boards : Number crunching : Thousands of hours of work for nothing ? (Message 24303)
Posted 12 Jul 2012 by Matthias Lehmkuhl
Post:
Same on my side:
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=1773160
Mine is the one shown as "Completed, marked as invalid" (SixTrack v443.07).
The result was resent 2 times with SixTrack v444.01.
Looks like the two SixTrack versions produce different results, so the validator can't compare them.
One more sample:
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=1773180


I also found some that were canceled:
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=1776436
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=1813172

It would be good to check this. (After the weekend) ;-)
10) Message boards : Number crunching : Failed Data Unit Downloads (Message 24227)
Posted 10 Jul 2012 by Matthias Lehmkuhl
Post:
Count me in; set to NNW (no new work) until it is fixed.

stderr:
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>w1jul_niebb0d__29__s__64.28_59.31__9.4_9.6__6__85_1_sixvf_boinc50548.zip</file_name>
<error_code>-224</error_code>
<error_message>file not found</error_message>
</file_xfer_error>

</message>
]]>
11) Message boards : Number crunching : A lot of mismatching results! (Message 22974)
Posted 9 Sep 2011 by Matthias Lehmkuhl
Post:
Both valid results went to the same computer; mine is the invalid one, also running Win7 x64.
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=5730
Both valid results were calculated on computer 9915117 (Win7 x64).

12) Message boards : Number crunching : A lot of mismatching results! (Message 22936)
Posted 7 Sep 2011 by Matthias Lehmkuhl
Post:
I also got one "Completed, validation inconclusive":
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=12883
Linux EM64T with wingman Win7 64bit
13) Message boards : Number crunching : Validator stopped? (Message 22412)
Posted 30 Jun 2010 by Matthias Lehmkuhl
Post:
I also have one:

wuid=3552164

Edit: just noticed, it's from 2009
14) Message boards : Number crunching : LHC is conflicting with other projects (Message 22095)
Posted 18 Mar 2010 by Matthias Lehmkuhl
Post:
The new version is 6.10.40 for Windows.
15) Message boards : Number crunching : Computation error (Message 22080)
Posted 15 Mar 2010 by Matthias Lehmkuhl
Post:
I'm sorry about graphics problem; I have given priority to
getting a Linux executable and getting these very very long
beam to beam jobs working. I should also look at how
you could restart. I guess BOINC kills the work unit after
this problem, but actually I have checkpoint/restart files
which mean it could just be restarted...


Hi bigmac,
There will be no checkpointing/restart of the result, because the graphics program is part of the computing program.
When one of the two crashes, the result is lost.
This also happens when the BOINC screensaver with LHC is activated.
The other projects with a screensaver built a second program that only shows the screensaver, based on the data from the calculating program. If the screensaver does not work, the calculating program continues without error or crash.

Please consider removing or deactivating the graphics part of the LHC program; then the problem is resolved.
I don't use the BOINC screensaver, but I love the LHC graphics.
16) Message boards : Number crunching : LHC is conflicting with other projects (Message 22079)
Posted 15 Mar 2010 by Matthias Lehmkuhl
Post:
Hello, dear friends!
Problem.
Once LHC task(s) is (are) downloaded, the tasks from one of the other BOINC projects begin running in high priority and blocking LHC. That other project is always the one with the shortest deadline. But I was shocked when a host that usually runs CPDN only picked up LHC tasks and the CPDN tasks began running in high priority. This is absurd!
Details.
1. No matter:
- how big are LHC tasks (for 20 minutes or 20 hours);
- what is Resource share between projects;
- which value is set for Additional work buffer or how many tasks other projects have.
2. It is important which version of BOINC is installed:
- this effect appears when BOINC manager is 6.6 or higher;
- there are no problems when manager is 6.2 or lower.
P.S. This situation has never been seen before...


This is a new feature in BOINC 6.6, maybe also in 6.5.
It is no longer the result with the earliest deadline that is calculated first; in BOINC 6.6 the result with the most time pressure is calculated first.
I have seen this on my BOINC too; after a short time everything goes back to normal.
17) Message boards : Number crunching : Maximum elapsed time exceeded (Message 22078)
Posted 15 Mar 2010 by Matthias Lehmkuhl
Post:
I hope so; I doubled the time limit. Eric.
(Note also that if interrupted I have a checkpoint/restart
so that we do not lose everything, but restart from every thousand turns, i.e. every two minutes roughly.)


I could finish both results; CPU time was less than one day:
80000 and 57000 seconds.

And yes, here checkpointing works fine.
18) Message boards : Number crunching : Computation error (Message 22051)
Posted 12 Mar 2010 by Matthias Lehmkuhl
Post:
OK we think we have a solution and it should go online once we have tested it.

Sadly it involves stripping out the graphics but this is a stop gap as the scientists want to get their work done and bigmac is on holidays as of today.

Once he is back graphics will become something for him to investigate.

I will keep you updated.


Hi Nesan,
Any new status?
So far there is no new application.
19) Message boards : Number crunching : Maximum elapsed time exceeded (Message 22014)
Posted 8 Mar 2010 by Matthias Lehmkuhl
Post:
I also got two of these long runners, if they run to 100%:
first:
2 hours runtime at 5.8%
resultid=18385236

second:
2 hours runtime at 4.6%
resultid=18384520

Any chance that the results will finish in time?
20) Message boards : Number crunching : Server thinks I recevied a WU (Message 18336)
Posted 22 Oct 2007 by Matthias Lehmkuhl
Post:
I've got two results too: the server tells me I have them, but I can't find them on my machine. There is no entry in BoincLogX either.
resultid=9004185
resultid=8945482

Matthias