1) Message boards : Sixtrack Application : Tasks available / tasks not available (Message 41449)
Posted 3 Feb 2020 by AuxRx
Post:
I've been getting work consistently (YMMV; there doesn't seem to be enough work server-side to build a buffer).
2) Message boards : Number crunching : VBox 6.1.2 announced - Ext.Pack giving troubles (solved) (Message 41443)
Posted 2 Feb 2020 by AuxRx
Post:
I tested this version on Windows with a running Theory- and ATLAS-task including several suspends (LAIM off), resumes, Theory snapshots, BOINC-restart and BOINC start after reboot. All without issues.


I am not so lucky. I repeatedly encounter stuck ATLAS tasks that run overtime without ever completing or crashing. Some VMs are completely unresponsive, but most are still active and show a kworker issue. What all of these tasks have in common is that their log shows a "timesync vgsvcTimeSyncWorker: Radical guest time change" error. They had all been suspended previously for various reasons, so my assumption is that resuming doesn't work any more reliably than with previous VBox versions.

Therefore resuming should still be avoided, if at all possible. (BOINC 7.12.1 and VBox 6.1.2)

Edit: VBox is also littered with several inactive (crashed and completed) VMs.
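
For anyone who wants to check their own VM logs for the time-sync message quoted above, here is a minimal Python sketch. It only looks for the quoted message text; the sample lines are made-up placeholders, not real log output, and where your log file lives depends on your setup.

```python
from typing import Iterable, List

# Message text as quoted in the post above.
MARKER = "Radical guest time change"


def find_time_jumps(lines: Iterable[str]) -> List[int]:
    """Return the 1-based numbers of all lines containing the marker."""
    return [n for n, line in enumerate(lines, start=1) if MARKER in line]


# Placeholder lines for illustration only; replace with the lines of your log.
sample = [
    "some unrelated log line",
    'timesync vgsvcTimeSyncWorker: Radical guest time change',
    "another unrelated log line",
]
print(find_time_jumps(sample))
```

To use it on a real file, pass an open file object instead of the sample list.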
3) Questions and Answers : Windows : LHC@home: Notice from server - VirtualBox is not installed (Message 41394)
Posted 27 Jan 2020 by AuxRx
Post:
Have you tried turning it off and on again? No, really, try shutting down and restarting. Then check client_state.xml, which is generated with each BOINC client shutdown(!). Under <host_info>, mine lists virtualbox_version.
https://boinc.berkeley.edu/wiki/client_state
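
If you would rather not read the file by eye, the check can be sketched in Python. This assumes client_state.xml parses as well-formed XML and that virtualbox_version sits directly under <host_info>, as it does in mine; the sample document below is a stripped-down stand-in, not a real client_state.xml.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def virtualbox_version(xml_text: str) -> Optional[str]:
    """Return the text of <host_info>/<virtualbox_version>, or None if absent."""
    root = ET.fromstring(xml_text)
    node = root.find(".//host_info/virtualbox_version")
    if node is None or not node.text:
        return None
    return node.text.strip()


# Stand-in document for illustration; a real file contains far more elements.
sample = (
    "<client_state><host_info>"
    "<virtualbox_version>6.1.2</virtualbox_version>"
    "</host_info></client_state>"
)
print(virtualbox_version(sample))
```

A result of None would suggest that BOINC did not detect VirtualBox when it last wrote the file.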
4) Message boards : Number crunching : Project balance (Message 41388)
Posted 27 Jan 2020 by AuxRx
Post:
As far as I know, it's not possible to tweak the balance beyond what an app_config would provide (i.e. equally and permanently assigning cores).
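
As a rough sketch of what such an app_config could contain, the Python snippet below builds one with the standard library. The element names (app, name, max_concurrent, app_version, app_name, avg_ncpus) are the ones BOINC documents for app_config.xml; the app name "ATLAS" and the numbers are assumptions for illustration, so check the project's application list before using them.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int, ncpus: int) -> str:
    """Build a minimal app_config document as a string."""
    root = ET.Element("app_config")

    # Limit how many tasks of this app run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # Fix the number of cores each task of this app is given.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)

    return ET.tostring(root, encoding="unicode")


# "ATLAS", 1 and 2 are illustrative values, not a recommendation.
print(build_app_config("ATLAS", max_concurrent=1, ncpus=2))
```

The resulting text would go into app_config.xml in the project's directory, followed by "Read config files" in the BOINC client.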
5) Message boards : Number crunching : Why do I still offer computing? (Message 41381)
Posted 27 Jan 2020 by AuxRx
Post:
Seems you had particularly bad luck when the server outage hit last week, see https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5280&postid=41331#41331

Did you make any changes then that should possibly be reversed?
6) Questions and Answers : Windows : Only errors with all my tasks (Message 41364)
Posted 26 Jan 2020 by AuxRx
Post:
Sorry to be pedantic, but did you already go over and verify Yeti's Checklist?
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161

It's really thorough and should catch most issues.

Are you using any customized preferences e.g. app_config?

BTW, concerning your previous question: yes, once ATLAS is running, tasks will finish at random percentages short of 100%. The way ATLAS is set up at the moment, each task contains 200 events to be simulated. Once those are completed, the task stops. You can also check what is going on inside the VM, as well as the current progress, with the VBox Extensions. Yeti's Checklist will teach you how to do so. But we need to get ATLAS running first!
7) Message boards : Sixtrack Application : Tasks available / tasks not available (Message 41304)
Posted 19 Jan 2020 by AuxRx
Post:
There are no new tasks queued at the moment. The few tasks that are distributed are re-sends, issued because another volunteer didn't complete them in time. So there is no specific indication of a problem on your side. It's just a slow week for work unit creation.
8) Message boards : Sixtrack Application : Server Status (Message 40987)
Posted 17 Dec 2019 by AuxRx
Post:
Nevermind, I can't read.
9) Questions and Answers : Sixtrack : No more tasks? (Message 40986)
Posted 17 Dec 2019 by AuxRx
Post:
Nevermind, I can't read.
10) Message boards : Sixtrack Application : Server Status (Message 40982)
Posted 17 Dec 2019 by AuxRx
Post:
No tasks are available for SixTrack?
11) Message boards : ATLAS application : Atlas Simulation 1.01 (Vbox64) will not finish (Message 38136)
Posted 7 Mar 2019 by AuxRx
Post:
Moved time between projects to 2hrs.

Not enough. You need to complete the task BEFORE switching. Depending on your configuration (I ran single threaded tasks for better efficiency), you need to allow for 8+ hours. But that only completes one task and sooner or later a task will fail due to switching.

Oh....all projects are equal %.

Again, not possible. You can limit the effects, but ultimately you need to stick to "ATLAS only" for best results. Originally I didn't want to limit myself to ATLAS, but I am glad I did. Pursuing ATLAS led me down a rabbit hole that made me switch to Linux, set up a whole slew of improvements, and learn how these projects are set up. I definitely recommend it.

Other observations:
We, as volunteers, cannot change the way this project runs. There's a bigger scientific goal that takes precedence.
Yeti's guide is the place to start and the best source of information.
This forum contains tons of advice, although it can be hard to find.
12) Message boards : ATLAS application : Atlas Simulation 1.01 (Vbox64) will not finish (Message 38104)
Posted 5 Mar 2019 by AuxRx
Post:
My go-to reply in these cases is that ATLAS cannot run alongside other tasks. The tasks generally just don't like being paused or interrupted. I have never found a solution. You can mitigate the issue by increasing the time between switching tasks, but eventually some tasks will fail.

My best guess at what happens is this: ATLAS task starts ... is interrupted or suspended ... task can't restart, but doesn't fail either ... time elapses ... server deadline is not met, hence "Cancelled by server".

Sorry, but definitely let us know if you find a solution!
13) Message boards : Number crunching : Checklist Version 3 for Atlas@Home (and other VM-based Projects) on your PC (Message 37804)
Posted 21 Jan 2019 by AuxRx
Post:
Please open a new topic to further discuss. Let's keep this thread tidy.

I haven't run a VM in a long time, but it should be straightforward. You are encountering common issues that are described at length and have been solved many times.
ATLAS tasks cannot be restarted reliably. You'll have to run only ATLAS and set "switch between tasks" ludicrously high (~24 hours).

https://lhcathome.cern.ch/lhcathome/result.php?resultid=214262091

2019-01-18 16:59:13 (2898): Stopping VM.
2019-01-18 16:59:25 (2898): Successfully stopped VM.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=214264076

2019-01-20 15:50:39 (21966): Guest Log: ccopioepdi etdh e twheeb awpepb atpop  /tvoa r//vwawrw/ww
2019-01-20 15:50:39 (21966): Guest Log: ccopioepdi etdh e twheeb awpepb atpop  /tvoa r//vwawrw/www
2019-01-20 15:50:40 (21966): Guest Log: TThish ivsm  vdmo edso enso tn onte ende etdo  tsoe tuspe thutpt hpt tppr opxryox
2019-01-20 15:50:40 (21966): Guest Log: TThish ivsm  vdmo edso enso tn onte ende etdo  tsoe tuspe thutpt hpt tppr opxryoxy
2019-01-20 15:50:40 (21966): Guest Log: AATHENAT_HPERONCA__PNRUOMCB_ENR=UM2BE
2019-01-20 15:50:40 (21966): Guest Log: AATHENAT_HPERONCA__PNRUOMCB_ENR=UM2BER=2

https://lhcathome.cern.ch/lhcathome/result.php?resultid=214359243

2019-01-21 15:23:42 (1269): Stopping VM.
2019-01-21 15:23:55 (1269): Successfully stopped VM.

also looks like triple work?
2019-01-21 13:53:39 (1269): Guest Log: Copying input files into RunAtlas.
2019-01-21 13:53:39 (1269): Guest Log: Copying input files into RunAtlas.
2019-01-21 13:53:39 (1269): Guest Log: Copying input files into RunAtlas.
2019-01-21 13:53:50 (1269): Guest Log: Copied input files into RunAtlas.
2019-01-21 13:53:50 (1269): Guest Log: Copied input files into RunAtlas.
2019-01-21 13:53:50 (1269): Guest Log: Copied input files into RunAtlas.
2019-01-21 14:01:32 (1269): Guest Log: copied the webapp to /var/www
2019-01-21 14:01:32 (1269): Guest Log: copied the webapp to /var/www
2019-01-21 14:01:32 (1269): Guest Log: copied the webapp to /var/www
2019-01-21 14:01:32 (1269): Guest Log: This vm does not need to setup http proxy
2019-01-21 14:01:33 (1269): Guest Log: ATHENA_PROC_NUMBER=2
2019-01-21 14:01:33 (1269): Guest Log: This vm does not need to setup http proxy
2019-01-21 14:01:33 (1269): Guest Log: This vm does not need to setup http proxy
2019-01-21 14:01:33 (1269): Guest Log: ATHENA_PROC_NUMBER=2
2019-01-21 14:01:33 (1269): Guest Log: ATHENA_PROC_NUMBER=2
2019-01-21 14:01:34 (1269): Guest Log: Starting ATLAS job. (PandaID=4214632869 taskID=16698164)
2019-01-21 14:01:34 (1269): Guest Log: Starting ATLAS job. (PandaID=4214632869 taskID=16698164)
2019-01-21 14:01:34 (1269): Guest Log: Starting ATLAS job. (PandaID=4214632869 taskID=16698164)
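
A quick way to confirm this kind of repetition is to count how often each "Guest Log:" message occurs. Below is a minimal Python sketch; the sample lines are copied from the excerpt above, and the helper name is just something I made up.

```python
from collections import Counter
from typing import Iterable


def guest_log_counts(lines: Iterable[str]) -> Counter:
    """Count identical messages following the 'Guest Log: ' prefix."""
    counts: Counter = Counter()
    for line in lines:
        _, sep, message = line.partition("Guest Log: ")
        if sep:  # skip lines that are not guest log entries
            counts[message.strip()] += 1
    return counts


sample = [
    "2019-01-21 13:53:39 (1269): Guest Log: Copying input files into RunAtlas.",
    "2019-01-21 13:53:39 (1269): Guest Log: Copying input files into RunAtlas.",
    "2019-01-21 13:53:39 (1269): Guest Log: Copying input files into RunAtlas.",
    "2019-01-21 15:23:42 (1269): Stopping VM.",
]
for message, count in guest_log_counts(sample).items():
    print(count, message)
```

A count of 3 for every message would match the "triple work" impression; note that the sketch ignores timestamps, so messages that legitimately repeat later in a log would be counted together.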

https://lhcathome.cern.ch/lhcathome/result.php?resultid=214356357

2019-01-21 11:15:31 (5955): VM state change detected. (old = 'running', new = 'paused')
2019-01-21 11:15:45 (5955): Error in resume VM for VM: -2135228414
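
The negative number in that last line is easier to look up once it is converted to the usual unsigned 32-bit hex form. A minimal Python sketch, assuming the value is a signed 32-bit result code:

```python
def to_hex32(code: int) -> str:
    """Render a signed 32-bit result code as unsigned hexadecimal."""
    return f"0x{code & 0xFFFFFFFF:08X}"


print(to_hex32(-2135228414))  # 0x80BB0002
```

As far as I know, 0x80BB0002 is what the VirtualBox API lists as VBOX_E_INVALID_VM_STATE, which would fit a resume attempted on a VM that is not in a resumable state. Treat that mapping as my reading, not something the log itself states.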
14) Questions and Answers : Getting started : Unable to agree Terms of Use (Message 37654)
Posted 19 Dec 2018 by AuxRx
Post:
The usual: try a different browser or browser settings. Worked for me.
15) Message boards : ATLAS application : Atlas runs very slowly after 94% (Message 36993)
Posted 10 Oct 2018 by AuxRx
Post:
I think maybe they recently changed their policy regarding validating tasks that fail which is a good thing because it lets users know they are doing something wrong and motivates them to correct it.


Definitely not. Most of my work consists of re-sends because some clients have hundreds of failed tasks (most likely no VBox), and no one cares on either end. The volunteers just keep downloading new work.

If it was a problem with the tasks you would see several other volunteers reporting it too.


There are issues with recent tasks, hence this thread, "runs very slowly after xx%". I'm seeing the same behaviour, though I'm not as worried.

When the server died on Oct 3rd, I had several tasks run 10+ hours. ATLAS depends on the underlying infrastructure, and there still seem to be issues.
16) Message boards : ATLAS application : Shorter work than expected? (Message 36807)
Posted 21 Sep 2018 by AuxRx
Post:
I think this is the most appropriate thread for the issues I have newly encountered.

1. I ran native_mt but recently updated vbox. I have since received at least one vbox task instead of native.

2. At the same time, native_mt tasks were being reported as completed in vastly different times than the task logs show, as in 600 seconds credited instead of the actual 20k seconds. The task logs show the correct time, but My Tasks doesn't, and the credit isn't correct either.

Did anyone else have similar issues? What fixed it for you?
17) Message boards : ATLAS application : Non-zero return code from EVNTtoHITS (65) (Error code 65) (Message 36067)
Posted 26 Jul 2018 by AuxRx
Post:
I've found that changing the number of cores allocated to any task or project introduces this behaviour. I recently had this very issue, because I was messing with an app_config.xml. "Never touch a running system" really does seem to hold some truth.

Here are my amateur remarks:
The problem is particularly rampant when BOINC still holds some tasks that were requested before the changes were introduced.
Especially affected are tasks that are running or suspended at the time of transition, but not exclusively.
Reboots or deleting single tasks does not help. Tasks will run in parallel for a while, but sooner or later it's down to 1 task again.

Basically, set "No new tasks" and run through your buffer of previously downloaded tasks (so as not to give up on your WUs). It might be fixed at that point. Still, to err on the safe side, reset all projects. Reboot for good measure. Then allow new tasks.

I did some version of the above, and it's working now ... or for now, at least.
18) Message boards : ATLAS application : Request for new Default RAM Setting (Message 36057)
Posted 26 Jul 2018 by AuxRx
Post:
Look at how many slots are running ATLAS: https://lhcathome.cern.ch/lhcathome/atlas_job.php

Can you tell me why project scientists might have bigger issues than a few misconfigured volunteer systems? Making this forum a toxic place does not help. I've tried. It's better to move on to another project. It's a numbers game, after all.
19) Message boards : ATLAS application : Non-zero return code from EVNTtoHITS (65) (Error code 65) (Message 35999)
Posted 22 Jul 2018 by AuxRx
Post:
What have you tried so far?

Creating the app_config.xml as shown above, reading the config files and rebooting the system should clear up any issues. ATLAS generally doesn't like interruptions (stopping and starting), so the safe bet is to set a fixed usage limit instead of a flexible "when computer is in use" limit. It might be easier (at least for troubleshooting) to crunch one project and only one subproject, e.g. LHC ATLAS.
20) Message boards : Sixtrack Application : AVX Sixtrack version (Message 35963)
Posted 20 Jul 2018 by AuxRx
Post:
Any chance we can re-open this issue? I've been stuck with SSE2 instead of AVX for too long. At this point I doubt this is going to remedy itself. The algorithm that is supposed to re-evaluate my system is not doing its job. It might even be the culprit, since I was getting AVX initially.


AVX has returned with the latest wave of SixTrack tasks! Yay, thank you!

Although I might have done something to shake up the ol' rusty bits?! I was setting up an app_config.xml for another project, which escalated into creating an app_config.xml for LHC's ATLAS. I didn't even define SixTrack as an <app>. (I thought I wouldn't receive SixTrack without a definition, but I do ...)

In connection with my ATLAS issues, I reset both "MAX # jobs" and "MAX # CPUs" in the website preferences to "No Limit".

After reading the config files and updating my projects in the BOINC client, new SixTrack tasks had been released and AVX started pouring in.

Not sure what did it though.

