1) Message boards : Number crunching : GPU advertised for LHC, but they don't do it? (Message 40226)
Posted 20 Oct 2019 by PDW
Post:
See here...

https://boinc.berkeley.edu/forum_thread.php?id=13171
2) Message boards : Sixtrack Application : EXIT_DISK_LIMIT_EXCEEDED (Message 40045)
Posted 29 Sep 2019 by PDW
Post:
This is happening again, for example...

https://lhcathome.cern.ch/lhcathome/result.php?resultid=246803675
3) Message boards : ATLAS application : Download failures (Message 40033)
Posted 27 Sep 2019 by PDW
Post:
As well as these "Not started by deadline - canceled"
4) Message boards : ATLAS application : ATLAS native version 2.70 (Message 40031)
Posted 27 Sep 2019 by PDW
Post:
Atlas native used to work for me a few months ago but doesn't now.
I have updated the default.local file, which now includes the last line (CVMFS_SEND_INFO_HEADER=yes) that wasn't there before.

$ more default.local
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch,cernvm-prod.cern.ch,sft.cern.ch,alice.cern.ch
CVMFS_QUOTA_LIMIT=4096
CVMFS_CACHE_BASE=/scratch/cvmfs
CVMFS_HTTP_PROXY=DIRECT
CVMFS_SEND_INFO_HEADER=yes

Have wiped the cache, reloaded, etc.

$ sudo cvmfs_config probe
Probing /cvmfs/atlas.cern.ch... OK
Probing /cvmfs/atlas-condb.cern.ch... OK
Probing /cvmfs/grid.cern.ch... OK
Probing /cvmfs/cernvm-prod.cern.ch... OK
Probing /cvmfs/sft.cern.ch... OK
Probing /cvmfs/alice.cern.ch... OK

$ singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/images/singularity/x86_64-slc6.img hostname
ERROR : Image path /cvmfs/atlas.cern.ch/repo/images/singularity/x86_64-slc6.img doesn't exist: No such file or directory
ABORT : Retval = 255

$ ls /cvmfs
alice.cern.ch atlas-condb.cern.ch grid.cern.ch
atlas.cern.ch cernvm-prod.cern.ch sft.cern.ch

$ ls /cvmfs/atlas.cern.ch/repo/
ATLASLocalRootBase benchmarks conditions containers dev sw tools

There is no images directory !

Is this an easy fix and how do I get it to work please ?
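
The checks above could be scripted roughly as follows. This is only a sketch: the function names (`parse_cvmfs_config`, `missing_repositories`) are made up for illustration and are not part of CVMFS or the ATLAS tooling, and the config location `/etc/cvmfs/default.local` is an assumption since the post does not give it. It reads `CVMFS_REPOSITORIES` from a `default.local`-style file, reports which repositories are not visible under the mount root, and tests whether the Singularity image path from the failing command exists.

```python
import os


def parse_cvmfs_config(text):
    """Parse KEY=VALUE lines (as in default.local) into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_repositories(settings, mount_root="/cvmfs"):
    """Return configured repositories with no directory under mount_root."""
    repos = [r for r in settings.get("CVMFS_REPOSITORIES", "").split(",") if r]
    return [r for r in repos if not os.path.isdir(os.path.join(mount_root, r))]


if __name__ == "__main__":
    config_path = "/etc/cvmfs/default.local"  # assumed location
    # Image path copied from the failing singularity command in the post.
    image = "/cvmfs/atlas.cern.ch/repo/images/singularity/x86_64-slc6.img"
    if os.path.exists(config_path):
        with open(config_path) as handle:
            settings = parse_cvmfs_config(handle.read())
        print("missing repositories:", missing_repositories(settings))
    print("image present:", os.path.exists(image))
```

On a machine in the state described above, this would be expected to list no missing repositories and still report the image as absent, which matches the probe succeeding while singularity fails.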
5) Message boards : ATLAS application : Download failures (Message 38665)
Posted 29 Apr 2019 by PDW
Post:
Seeing a lot of these download errors today.

As a side note, it says the max number of errors is 3, but it always seems to download a 4th attempt, which also fails.
Shouldn't it stop and bug out when it hits 3 errors ?

Example here: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=111108234
6) Message boards : Number crunching : most unpolite host of the day (Message 36382)
Posted 12 Aug 2018 by PDW
Post:
As noted on this thread on the ATLAS boards, the validation logic has been tightened so that credit is not allocated for these short unsuccessful tasks.

I'm not sure why such huge credits were being given to these tasks (credit logic has always been a mystery to me), in the past I don't think it was so large. Please post any examples of bad tasks getting credits which finished after 1 August when the validation was changed.

Hi David,

So this host https://lhcathome.cern.ch/lhcathome/results.php?hostid=10522402&offset=0&show_names=0&state=4&appid= has valid tasks (scroll past the first page or two) that were returned on 2nd August. Approximately 30 minutes of run time for approximately 60 credits, i.e. 120 credits per hour; CPU time was a lot less.

Are these valid tasks receiving the correct credit ?

Another couple of tasks sent 4th August and returned 7th August got validated for 325+ credits...
https://lhcathome.cern.ch/lhcathome/result.php?resultid=203599212
https://lhcathome.cern.ch/lhcathome/result.php?resultid=203606471
So the validation is catching most, but some still get through.

Shame he hasn't been able to get tasks to work properly.
7) Message boards : Number crunching : most unpolite host of the day (Message 36246)
Posted 3 Aug 2018 by PDW
Post:
Edit: Here is another work unit that I just did that's listed as invalid: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=99773436 Now look at the other computer that also crunched it. It's also invalid, and it also has about the same time on it. This can't be an issue with just my host...

That host doesn't have a normal valid Atlas task in its history.
It does have some that were invalid due to things like:

2018-07-30 23:02:02 (6056): Guest Log: mv: cannot stat `metadata-*.xml': No such file or directory
2018-07-30 23:02:02 (6056): Guest Log: ERROR: Missing metadata.xml

I haven't run Atlas for quite some time, so I am probably well behind on what it does now. I am going to step away and let those more familiar with it try to figure out what is wrong.

Good luck !
8) Message boards : Number crunching : most unpolite host of the day (Message 36244)
Posted 3 Aug 2018 by PDW
Post:
No, if the other is working fine as a service install it won't be that.

Atlas uses Vbox. Have you run any other projects that use it ?
9) Message boards : Number crunching : most unpolite host of the day (Message 36240)
Posted 3 Aug 2018 by PDW
Post:
Is Boinc a Service Install on the other one that works ?
10) Message boards : Number crunching : most unpolite host of the day (Message 36228)
Posted 3 Aug 2018 by PDW
Post:
As noted on this thread on the ATLAS boards, the validation logic has been tightened so that credit is not allocated for these short unsuccessful tasks.

I'm not sure why such huge credits were being given to these tasks (credit logic has always been a mystery to me), in the past I don't think it was so large. Please post any examples of bad tasks getting credits which finished after 1 August when the validation was changed.

Hi David,

So this host https://lhcathome.cern.ch/lhcathome/results.php?hostid=10522402&offset=0&show_names=0&state=4&appid= has valid tasks (scroll past the first page or two) that were returned on 2nd August. Approximately 30 minutes of run time for approximately 60 credits, i.e. 120 credits per hour; CPU time was a lot less.

Are these valid tasks receiving the correct credit ?
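
The rate quoted above is simple arithmetic. A minimal sketch, using a hypothetical helper named `credits_per_hour` and the approximate figures from the post (about 60 credits for about 30 minutes of run time):

```python
def credits_per_hour(credits, seconds):
    """Credit granted per hour for a task that took `seconds` to run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return credits * 3600.0 / seconds


# Approximate figures from the post: ~60 credits for ~30 minutes of run time.
print(credits_per_hour(60, 30 * 60))  # 120.0
```

Passing CPU seconds instead of run-time seconds gives the per-CPU-hour rate, which would be higher here since the CPU time was a lot less than the run time.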
11) Message boards : Number crunching : most unpolite host of the day (Message 36217)
Posted 2 Aug 2018 by PDW
Post:
I understand what you are saying, but if you look at other users who don't hide their computers and check their Atlas results, you will quickly realise that your valid results do not look normal.

Don't bother looking at my computers: they are hidden, and I haven't run any tasks for a while anyway. I just came back to see what state the project was in and whether it was worth running tasks again.
12) Message boards : Number crunching : most unpolite host of the day (Message 36215)
Posted 2 Aug 2018 by PDW
Post:
So this one isn't yours then ?
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10522402

It only has 136 valid, 3 invalid and 77 error results.
That host is also returning valid results with low run times, although it has done at least one that I would expect to be a genuinely valid result...
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10522402
Its run time was 10,495.46 seconds and CPU time of 72,408.16 seconds for a credit of 295.61.
At least its credit return is much smaller for the short runs.
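
To put the figures for that longer task side by side, here is a rough sketch that derives credit per run hour, credit per CPU hour, and the ratio of CPU time to run time from the numbers quoted above. The `TaskResult` class and its property names are made up for illustration; they are not anything the project or BOINC provides.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    run_seconds: float  # wall-clock run time
    cpu_seconds: float  # CPU time summed over all cores
    credit: float       # credit granted for the task

    @property
    def credit_per_run_hour(self):
        return self.credit * 3600.0 / self.run_seconds

    @property
    def credit_per_cpu_hour(self):
        return self.credit * 3600.0 / self.cpu_seconds

    @property
    def cpu_to_run_ratio(self):
        # Roughly how many cores were kept busy on average.
        return self.cpu_seconds / self.run_seconds


# Figures quoted in the post for the one long-running valid task.
task = TaskResult(run_seconds=10495.46, cpu_seconds=72408.16, credit=295.61)
print(f"{task.credit_per_run_hour:.1f} credits per run hour")
print(f"{task.credit_per_cpu_hour:.1f} credits per CPU hour")
print(f"{task.cpu_to_run_ratio:.1f} CPU seconds per run second")
```

For this task that works out to roughly 101 credits per run hour but under 15 credits per CPU hour, with CPU time close to seven times the run time.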

For the record, I am not accusing you of cheating; I do not know either way.
My concern is that the project is happily marking results as valid with a high credit score when I don't believe they should be valid.

I will wait for an Admin to respond, though they may not appear until working hours.

It is a project oversight problem; I don't know whether:
1) They don't care
2) They don't know how to check
3) They don't have time to check

I have no idea what science your 'valid' results might be contributing. If an Admin comes back and says they are okay, I won't argue any further, but I don't see why everyone else is taking so many hours for so little credit.
13) Message boards : Number crunching : most unpolite host of the day (Message 36212)
Posted 2 Aug 2018 by PDW
Post:
So that host https://lhcathome.cern.ch/lhcathome/results.php?hostid=10522400&offset=0&show_names=0&state=4&appid= at the moment has 785 VALID tasks that each used about 3 minutes of CPU time for a credit of about 300.

Would an Admin like to confirm that they are valid results please ?
I see the VMs have 8 CPUs assigned, but the run time is barely enough to spin up the VM, let alone do any valid work.
14) Message boards : Number crunching : most unpolite host of the day (Message 36209)
Posted 2 Aug 2018 by PDW
Post:
I know how to, but I'd rather not expose all of my machines. But I can post examples.

Marked valid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203431393
Marked invalid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203456578

How long has Atlas been running valid tasks with less than 3 minutes CPU time for 293.50 credits ?

I thought they took many hours for about twice that much credit !


