1) Message boards : Number crunching : Tasks "completing" at random percantages (Message 37019)
Posted 14 Oct 2018 by Profile MechaToaster
Post:
Looking through the stderr output of a few of your LHCb and Theory tasks, I would say you pretty much have the settings tweaked properly. The ideal scenario is that they run from start to finish uninterrupted which means no suspensions due to CPU busy, no preemptions by other projects, no shutdown for OS updates, etc.

Several of your tasks showed 1 start and 1 stop. If you can run them all that way you'll have a near 100% success rate. A few showed 2 starts which is very good compared to some other users' tasks I see that are restarting more than 20 times. Obviously a restart is not a guarantee of a failed task but the more restarts the higher the likelihood the task will fail.

From the length of time between the stops and restarts I would guess maybe your tasks are being preempted by other projects. In that case you might consider boosting the "switch between tasks every __ minutes" to just over 1085 which is 5 minutes greater than the max LHCb/Theory task length of 18 hours.
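For reference, the 1085 figure is just the 18 hour maximum converted to minutes plus a 5 minute margin. A minimal sketch of that arithmetic follows. The function names are made up for illustration, and the `cpu_sched_period_minutes` tag is, as far as I know, the name BOINC uses for this preference in `global_prefs_override.xml`; treat that as an assumption and check your own file before editing it.

```python
# Longest LHCb/Theory task length quoted above, in hours.
MAX_TASK_HOURS = 18
# Safety margin in minutes, as suggested above.
MARGIN_MINUTES = 5


def switch_interval_minutes(max_task_hours, margin_minutes):
    """Return a task-switch interval slightly longer than the longest task."""
    return max_task_hours * 60 + margin_minutes


def override_fragment(minutes):
    """Format the value as an XML element (tag name is an assumption, see above)."""
    return f"<cpu_sched_period_minutes>{minutes:.6f}</cpu_sched_period_minutes>"


interval = switch_interval_minutes(MAX_TASK_HOURS, MARGIN_MINUTES)
print(interval)                     # 18 * 60 + 5
print(override_fragment(interval))
```

The same value can simply be typed into the "switch between tasks every __ minutes" box in BOINC Manager; the XML fragment is only there to show where the setting might end up on disk.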


what do you mean by my tasks are being preempted by other projects? i am only running LHC@home. i ran another thing on my phone before but ive always stuck to just LHC@home; i have one PC and only 16gb of memory so i cannot do a lot of things at once.
2) Message boards : Number crunching : Tasks "completing" at random percantages (Message 36934)
Posted 1 Oct 2018 by Profile MechaToaster
Post:
thanks guys, good to know. didnt wanna have to tweak a million settings to get things working.
3) Message boards : Number crunching : Tasks "completing" at random percantages (Message 36931)
Posted 1 Oct 2018 by Profile MechaToaster
Post:
i apologize if this is the wrong place to ask or if this is a common thing; ive been away from these projects for a year and i forgot a lot of what is normal and not.
im having this problem(?) of some tasks, theory/ATLAS/LHCb, going fine for anywhere from 10% done to 50% to 75% then disappearing from the BOINC manager task window. when i check my tasks on this website, it shows them as completed and successfully validated with no errors(that i can see).
is this normal? the most recent one happened just now; a theory job that stopped after maybe an hour or so of work. i was actually looking at the task window while eating breakfast and it just disappeared; didnt pause or say ready to submit or anything, just vanished. the task is:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=207090669
https://lhcathome.cern.ch/lhcathome/result.php?resultid=207090946 (a different one than described above, but same scenario)

whether or not an error occurred here, it does seem to be different from other errors ive had. im on a new PC than what i was on before and ive been getting a lot more errors on this one than i did on my old one. the other tasks that fail always show an error message or appear on the website under "error" or "invalid". i thought it might be related to clockspeeds/unstable overclocks; i remember back on my old pc(overclocked FX-8350), i was told its better to run at stock as an unstable overclock could potentially return incorrect values and basically waste everyones time.
i tried several settings on my new PC(AMD ryzen 5 1600x): stock clockspeed/voltage with the CPUs built in boosting, a small OC(tested for stability), and full stock settings with no boost at all, but the issue persists. i have AMD virtualization enabled, ive checked memory for errors and everything seems fine? ive not run into many hardware issues on this build so i guess im a bit confused as to where i should start with troubleshooting this.
one last thing im curious about: does clock speed matter while a task is in progress? the power plan on windows 10 and ryzen PCs is a little weird and sometimes my clockspeed will fluctuate, going from maybe 3.6ghz down to 3.4ghz. should i take steps to prevent that from happening?
thanks and sorry for the wall of text.
4) Message boards : Theory Application : Theory not utilizing all cores (Message 36752)
Posted 18 Sep 2018 by Profile MechaToaster
Post:
It's not what I'm looking for. I don't want 8 tasks, I want only 1 task that uses 8 cores!
...

You're right. The 8-core Theory VM is only using a max of 4 cores. I tested the 8-core VM and saw this:
08/31/18 22:37:36 Allocating auto shares for slot type 0: Cpus: auto, Memory: auto, Swap: auto, Disk: auto
slot type 0: Cpus: 1.000000, Memory: 375, Swap: 25.00%, Disk: 25.00%
slot type 0: Cpus: 1.000000, Memory: 375, Swap: 25.00%, Disk: 25.00%
slot type 0: Cpus: 1.000000, Memory: 375, Swap: 25.00%, Disk: 25.00%
slot type 0: Cpus: 1.000000, Memory: 375, Swap: 25.00%, Disk: 25.00%
08/31/18 22:37:36 slot1: New machine resource allocated
08/31/18 22:37:36 Setting up slot pairings
08/31/18 22:37:36 slot2: New machine resource allocated
08/31/18 22:37:36 Setting up slot pairings
08/31/18 22:37:36 slot3: New machine resource allocated
08/31/18 22:37:36 Setting up slot pairings
08/31/18 22:37:36 slot4: New machine resource allocated
08/31/18 22:37:36 Setting up slot pairings
08/31/18 22:37:36 CronJobList: Adding job 'multicore'

Maybe only physical cores are counted (4 in my case). This would have to be confirmed by someone with more physical cores.
Anyway, the most efficient way to run Theory when you have enough RAM (you have) is the single core VM.
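To repeat this check on another machine, one option is to count the "New machine resource allocated" lines in the guest log. The snippet below is only a sketch under that assumption: the sample text is copied from the log excerpt above, and the helper name `allocated_slots` is made up for illustration.

```python
import re

# Four lines copied from the log excerpt above.
LOG = """\
08/31/18 22:37:36 slot1: New machine resource allocated
08/31/18 22:37:36 slot2: New machine resource allocated
08/31/18 22:37:36 slot3: New machine resource allocated
08/31/18 22:37:36 slot4: New machine resource allocated
"""

# Matches the slot name in lines such as "slot1: New machine resource allocated".
SLOT_RE = re.compile(r"\b(slot\d+): New machine resource allocated")


def allocated_slots(log_text):
    """Return the slot names the log reports as allocated, in order of appearance."""
    return SLOT_RE.findall(log_text)


slots = allocated_slots(LOG)
print(len(slots), slots)
```

If the count stays at 4 on a host with more than 4 physical cores, that would argue against the "only physical cores are counted" guess.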

im having a similar issue on a ryzen 5 1600x(6 core 12 threads). i had lhc@home configured to use 4 cores per task, i thought it would count threads as cores, but it will not let me run more than 1 theory task on 4 cores at a time, despite having enough RAM available. if i try to run another theory task, it stops and the status updates to "postponed: b".
5) Message boards : LHCb Application : LHCb/other tasks failing after putting computer into hibernation state? (Message 36532)
Posted 22 Aug 2018 by Profile MechaToaster
Post:

You just got the typical error for the VB tasks that just happen once in a while (depending on how many you are running)

Guest Log: [ERROR] Condor exited after 111913s without running a job.

The main thing is you can check the VB Manager and make sure they are saved and suspended and then switch to other tasks (same if you have to reboot for any reason)

Thanks for the quick response; any suggestion on the sequence of steps ?
1) first suspend the VM, then suspend the WU in BOINC Manager
2) first suspend the WU, then suspend the VM
3) doesn’t matter, just suspend both WU and respective VM in short timeframe
TIA

perhaps a dumb question, but how does one check if tasks are "saved"? how do you suspend the VM separately from the WU?
ive been away from LHC@home or any other distributed computing projects for several months now; i forgot a lot of things about this stuff.
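One possible way to check the state outside the VirtualBox Manager window is the `VBoxManage` command line tool: `VBoxManage showvminfo <vm> --machinereadable` prints `key="value"` lines, one of which is `VMState`. The sketch below parses that output. The VM name `boinc_example` and the helper names are hypothetical, and it assumes `VBoxManage` is on the PATH and run as the same user that BOINC uses for its VMs.

```python
import subprocess


def parse_vm_state(machine_readable):
    """Extract the VMState value from 'showvminfo --machinereadable' output."""
    for line in machine_readable.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "VMState":
            return value.strip().strip('"')
    return None


def vm_state(vm_name, vboxmanage="VBoxManage"):
    """Ask VirtualBox for the current state of one VM (not called in this sketch)."""
    result = subprocess.run(
        [vboxmanage, "showvminfo", vm_name, "--machinereadable"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vm_state(result.stdout)


# Hypothetical sample of the relevant lines, for illustration only.
SAMPLE = 'name="boinc_example"\nVMState="saved"\n'
print(parse_vm_state(SAMPLE))
```

`VBoxManage list vms` shows the VM names to pass in, and `VBoxManage controlvm <vm> savestate` saves a running VM by hand. Whether that is safe to do while BOINC is still managing the task is not something this sketch settles.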
6) Message boards : LHCb Application : LHCb/other tasks failing after putting computer into hibernation state? (Message 32722)
Posted 9 Oct 2017 by Profile MechaToaster
Post:
... edit: it seems all of the tasks i attempt are failing almost immediately. i should probably stop trying to run these for now yeah?

well, right now no other tasks than ATLAS seem to be available anyway, at least from what can be seen from the Project Status Page:
https://lhcathome.cern.ch/lhcathome/server_status.php

oh weird, when i checked it was all up and running. would this explain my issue?
7) Message boards : LHCb Application : LHCb/other tasks failing after putting computer into hibernation state? (Message 32719)
Posted 9 Oct 2017 by Profile MechaToaster
Post:
after a 22 day uptime and relatively no problem running LHCb tasks or others, with just a few errors here and there, today i found that upon waking from hibernation and starting 3 LHCb and 1 CMS task, they all promptly failed. i had one LHCb task in progress from the previous night, over 50% complete but that failed as well when i resumed it, along with the 3 others i had just begun.
the "Exit status" error varied for each task, but the log in all of them contained

"2017-10-09 10:36:00 (3196): Guest Log: 10/09/17 10:26:04 HibernationSupportedStates invalid '' in ad from hibernation plugin /usr/libexec/condor/condor_power_state"

i dont know much of what the log stuff means or if that is at all relevant. the only settings i changed from last night(besides putting my machine into hibernation for the night) until this morning was CPU time, which i raised from 50% to 60%. thanks for any help.

edit: it seems all of the tasks i attempt are failing almost immediately. i should probably stop trying to run these for now yeah?
8) Message boards : ATLAS application : All ATLAS and CMS tasks "aborted by project" - why so? (Message 32616)
Posted 3 Oct 2017 by Profile MechaToaster
Post:
... what do you mean by "bad project configuration"? what should i be looking to fix/change, if im understanding you correctly?

Project configuration is done by the admins, not by clients/users, so you can't do anything about it.

guess i misunderstood your earlier posts. there is nothing wrong on my end then? you had said "this is not the case here" in response to my first post, so i was a bit confused.
9) Message boards : ATLAS application : All ATLAS and CMS tasks "aborted by project" - why so? (Message 32605)
Posted 3 Oct 2017 by Profile MechaToaster
Post:
...
is this normal behaviour then? nothing wrong on my end?

Cancelling task already in progress is not normal BOINC behaviour, except when done by the project administrator for reasons like bad created batches/workunits.
This is not the case here. If there are truly already running tasks aborted by the project, this would mean bad project configuration.


so far only one atlas task in progress has been aborted by project, but it was the only atlas task ive run since this began. ive always been in the middle of other tasks when receiving atlas jobs and by the time i finish the other tasks, the atlas tasks get aborted by project before i can start them.
what do you mean by "bad project configuration"? what should i be looking to fix/change, if im understanding you correctly?
10) Message boards : ATLAS application : All ATLAS and CMS tasks "aborted by project" - why so? (Message 32594)
Posted 3 Oct 2017 by Profile MechaToaster
Post:
this has started to happen to all of my atlas tasks(not cms though) lately, just a few days maybe. they abort in about a day, maybe less. one of them was also aborted while it was in progress. i see that they have been completed by other machines like Crystal Pellet pointed out.
is this normal behaviour then? nothing wrong on my end?
11) Message boards : Sixtrack Application : SIXTRACKTEST (Message 32507)
Posted 23 Sep 2017 by Profile MechaToaster
Post:
ah ok, thanks for the clarification and info.
12) Message boards : Sixtrack Application : SIXTRACKTEST (Message 32505)
Posted 23 Sep 2017 by Profile MechaToaster
Post:
The 32bit arm exe (i.e. arm v6 and v7) has not been pushed out yet...

got 4 sixtracktest tasks on my armv7 device and they all finished without an error and validated ok.


did you have to do anything special to get them running on your device? is it an android device?
im currently having trouble getting any tasks at all on my android device, armv7 cpu. i read the latest notice about sixtrack tasks only being for the "main" operating systems for now, but is there anything that will run on android?
i signed up for einstein@home on my device and dont have any trouble getting tasks and completing them, but nothing for lhc@home.
i have set my preferences to accept all tasks and to accept other work if certain tasks are not available. ive tried updating, resetting but still nothing. it communicates successfully but does not give me any tasks.

is this normal or am i doing something wrong?
13) Message boards : ATLAS application : Low CPU usage on ATLAS and SixTrack tasks (Message 32483)
Posted 21 Sep 2017 by Profile MechaToaster
Post:

Depending on the bandwidth of your Internet connection, the task will spend some 10 to 30 minutes using only 1 core, and then will go to using your full 4 cores.


i think this was the problem i was seeing with ATLAS tasks, i did not realize they did that. the sixtrack stuff seems to have solved itself; the only thing different i did was not run the BOINC manager in admin mode. no idea, but glad its working now. thanks everyone for the replies.
14) Message boards : ATLAS application : Low CPU usage on ATLAS and SixTrack tasks (Message 32449)
Posted 17 Sep 2017 by Profile MechaToaster
Post:
im using virtualbox 5.1.26, the version that comes bundled together with the boinc manager. im on the latest boinc client; i was updating them around the time of making this post last night when i got a notice that a new update was available.
i appreciate the replies about making things run more efficiently, 4 cpus 2 tasks, but it does not address my actual question on why the cpu usage is so low? running on all 8 cores, before last week it was always using 70%+ of my cpu and now i cannot get it above 20-30%ish.
15) Message boards : ATLAS application : Low CPU usage on ATLAS and SixTrack tasks (Message 32442)
Posted 16 Sep 2017 by Profile MechaToaster
Post:
hello. lately ive noticed that some settings in the BOINC manager application dont seem to be working as intended, specifically the "% of CPU Time" setting under the Computing tab in the Options menu.
I usually set it at 75% or 85%(which if i understand correctly, higher = more computing), along with use of all 8 cpu cores, but regardless of what i set it at, lately all my tasks(ATLAS simulation 1.01 and SixTrack 46.30(sse2)) use very low amounts of resources. generally it doesnt go above 30% CPU usage.

i have tried changing the settings, using web based settings, power plan on high performance, updating/restarting the client and many other things. i also read the thread about sixtrack low cpu usage being caused by a process involving csrss.exe and conhost.exe both having high cpu usage, but those programs are sitting at 0% usage for me. this is also occurring on ATLAS tasks anyway.

i have AMD virtualization enabled in my bios, and im running an overclocked 8 core AMD FX-8350 cpu, 16gbs ram; insufficient resources shouldnt be the problem. ive tried running the tasks on a fresh reboot with no other applications open to see if it was a resource related thing, but it still does not utilize my CPU. according to the application "GPU-Z", my graphics card(AMD Radeon RX 480 4GB) is not being utilized much either, if at all. GPU load fluctuates from 0% to 10% every 10 seconds or so.

all of my settings are set so that computing, both cpu and gpu, is never suspended unless i click suspend myself. sorry for the wall of text, im just kinda outa ideas on whats going wrong here. it was working fine last week; i have not installed any weird software or updated/changed drivers or things like that since then.


