41) Message boards : Sixtrack Application : MacOS executable (Message 28766)
Posted 2 Feb 2017 by ritterm
Post:
kyrsjo wrote:
FYI: The macOS results (from SixTrack version 4.5.30) are physically valid since none of the currently running inputs use the removed RIPP feature, however they may differ slightly from the Linux and Windows version (from SixTrack 4.5.17) as in version 4.5.25 we fixed a small bug in how the input files were read. This means that when it compares results from MacOS with Windows or Linux, it fails... We're really sorry for this mess...

Thanks for the feedback, Kyrre! If MacOS results are actually valid from a scientific perspective, what's the plan to deal with the "Validation inconclusive" results?
42) Message boards : Sixtrack Application : 260.000 WUs to send, but no handed out (Message 28761)
Posted 1 Feb 2017 by ritterm
Post:
Yeti wrote:
I believe there is a limit to the number of tasks in progress -- something around 10 tasks per core/thread, perhaps.

The limit is only your own queue settings.

Are you sure about that? I continue to get "This computer has reached a limit on tasks in progress" messages on all of my hosts and I have no "Max # of jobs" limit set. Now that I look more closely, it seems to be 14 tasks/core. Am I missing another setting somewhere?
43) Message boards : Sixtrack Application : 260.000 WUs to send, but no handed out (Message 28757)
Posted 1 Feb 2017 by ritterm
Post:
It's been days and still no work units. Meanwhile, the stats page still says there's 180,000+...

Sorry to hear you're having problems. Fortunately for me, my four hosts have generally been getting enough work today to keep them busy; however, each one's cache has rarely been full. Work seems to come in batches with long periods of "no tasks available" messages in between.

I believe there is a limit to the number of tasks in progress -- something around 10 tasks per core/thread, perhaps. Hopefully the admins can identify any issues and get the work flowing more smoothly again.
44) Message boards : Sixtrack Application : MacOS executable (Message 28747)
Posted 31 Jan 2017 by ritterm
Post:
Sorry the MacOS executable is withdrawn temporarily.

There are hosts out there still returning results using the v453 apple-darwin app (e.g., 10417292, 10415063, & 10411359). I know it might be a lesser problem right now, but aren't the in-progress tasks for that app going to be canceled?
45) Message boards : Number crunching : specific number of several applications at the same time on one machine (Message 28742)
Posted 30 Jan 2017 by ritterm
Post:
thanks!

Good luck!
46) Message boards : Sixtrack Application : 260.000 WUs to send, but no handed out (Message 28739)
Posted 30 Jan 2017 by ritterm
Post:
The number of unsent SixTrack WUs is increasing, but I can't get any. Why?

EDIT: Did someone do something? Now I can get them (as of 19.18 UTC).

Just adding this in case it helps... While two of my hosts seem to get tasks, eventually, two others still get the "No tasks are available for SixTrack" messages.
47) Message boards : Number crunching : specific number of several applications at the same time on one machine (Message 28716)
Posted 29 Jan 2017 by ritterm
Post:
Or do I have to manually set the allowed application, update the BOINC client, wait until I get some WUs from this experiment, then move on to the next application, and start all over again when the tasks have finished?

It can be difficult to manage a mix of different sub-projects on a single host and it can take a lot of manual intervention as Crystal Pellet suggests. But, you're on the right track.

Maybe I misunderstand what you're saying, but you don't have to wait to be out of work on one sub-project before you pick up work for another. You could, for example, build up a cache of Theory tasks and then change your preferences to only accept CMS tasks. Once you've built up some of those, then change to only accept LHCb. You may have to adjust your cache setting (from, say, one day to two and then three), but you should, eventually, get enough of all three to have work for a couple of days or more. Then repeat the process.

For my interests and host resources, I've found it best to run one of each of the RAM-intensive sub-projects (CMS, LHCb, and ATLAS) on one host and have different preferences set for each host (using the default, home, work, and school venues).
48) Message boards : Number crunching : specific number of several applications at the same time on one machine (Message 28713)
Posted 29 Jan 2017 by ritterm
Post:
...but this doesn't work. I get 4 LHCb tasks most of the time.

Have you restarted your BOINC client or forced the client to re-read your config files since the time you created your app_config?
49) Message boards : Sixtrack Application : MacOS executable (Message 28701)
Posted 28 Jan 2017 by ritterm
Post:
The current 4530 MacOS is withdrawn.

I think I have recent inconclusive results where my wingman's task was sent out after this post (e.g., Workunit 56122830).
50) Message boards : Sixtrack Application : Inconclusive results (Message 28680)
Posted 26 Jan 2017 by ritterm
Post:
Sorry if I missed it, but I'm just wondering if something can be, or is being, done about this. I'm still building up inconclusive results with apple-darwin wingmen, and some WUs are three days old without a third task being sent out.
51) Message boards : Sixtrack Application : Inconclusive results (Message 28646)
Posted 24 Jan 2017 by ritterm
Post:
Almost all of my inconclusives were paired with an Apple Darwin application which finished the task in a few seconds. The third task has not been sent yet (unsent).

+1
52) Message boards : Number crunching : General Work Shortage? (Message 28504)
Posted 15 Jan 2017 by ritterm
Post:
My hosts have been returning quite a few errors recently with the "...Condor exited after XXXs without running a job..." message. I'm used to seeing these occasionally, but not so many as in the last couple of days. This is happening mostly on Theory and CMS tasks, but I've had a few on LHCb, too.
53) Message boards : LHCb Application : Condor exited after 608s without running a job (Message 28470)
Posted 13 Jan 2017 by ritterm
Post:
I've had several LHCb tasks fail recently due to lack of work (e.g., Task 111867891). I've also had some fail due to Condor connection errors (e.g., Task 111867740).

I don't see any issues like this posted on the boards recently and my other hosts running CMS and Theory tasks don't seem to have any problems. Not necessarily a problem for me, but wanted to point it out in case it's indicative of a more significant problem.
54) Message boards : Number crunching : "Giving up catch-up attempt.." (Message 28343)
Posted 4 Jan 2017 by ritterm
Post:
Are you suspending and trying to restart the tasks or doing reboots?

No. My BOINC hosts pretty much run 24/7.

I see it is Linux, so I don't know how or if you have to do any updates.

I install updates pretty much as soon as they become available. I'm running VM 5.1.10 and haven't upgraded, but maybe I'll try that.

It's a shame that I don't seem to be able to run 4 VMs concurrently and efficiently on this host. However, it is a home-built system that has acted quirky on some projects. I may have to resign myself to running only two VMs at a time... :-(
55) Message boards : Number crunching : "Giving up catch-up attempt.." (Message 28307)
Posted 2 Jan 2017 by ritterm
Post:
I'm wondering if these messages are an indication that the host is overloaded with VM tasks. I've found that (1) the number of "giving up" messages is far fewer when running only 3 VM tasks and non-existent when running only 2 and (2) there's a big difference between the running and CPU times of completed tasks (often 10K-12K seconds). The mix of tasks doesn't seem to matter (although I didn't try running 3-4 Theory tasks).
56) Message boards : Number crunching : "Giving up catch-up attempt.." (Message 28299)
Posted 30 Dec 2016 by ritterm
Post:
My 8-core, 16GB RAM host is running 2 LHCb and 2 CMS tasks right now (alongside no other BOINC tasks requiring significant RAM) and I'm seeing a lot of "Giving up catch-up attempt.." log entries in each one. In some cases there are only a few at a time, but others show the messages coming repeatedly for several hours. So far, each task seems to be moving along, with log entries indicating jobs starting and finishing.

Are these messages indicative of anything that I should be worried about? Are they specific to LHCb or CMS? I'm not seeing them on my other 16GB RAM host that's running 2 ATLAS and 2 Theory jobs.
57) Message boards : LHCb Application : Info about the LHCb experiment (Message 28298)
Posted 30 Dec 2016 by ritterm
Post:
There was a discussion elsewhere about the balancing of project, so watch that space

Thanks, TB. I found the thread "balancing within project?" and will keep an eye on it.
58) Message boards : LHCb Application : Info about the LHCb experiment (Message 28296)
Posted 30 Dec 2016 by ritterm
Post:
The 3 queues for the VM-applications mostly have between 80 and 100 BOINC-tasks Unsent and are automatically refilled.
But when there are too many requests for 1 queue at the same time, the queue of the next application will be polled.

I see, thank you.

So, as long as LHCb remains a "test app", I guess I'll have to change preferences periodically and load up on CMS tasks. Not ideal, but manageable.
59) Message boards : LHCb Application : Info about the LHCb experiment (Message 28294)
Posted 30 Dec 2016 by ritterm
Post:
It looks like you proved, what I was saying :lol:

Indeed! I just wanted to make sure I understood what you were saying. Interestingly, I just picked up a CMS task on this host. So, maybe the LHCb queue is running low.
60) Message boards : LHCb Application : Info about the LHCb experiment (Message 28290)
Posted 29 Dec 2016 by ritterm
Post:
Crystal Pellet wrote:
When requesting new work, first look whether 'running test applications' is enabled in your project preferences,
then whether there are workunits in the Beta-queue available (LHCb's mostly are). When not: random other applications from your preferences.

So, CP, are you saying that test applications have highest priority? Is that why one of my hosts with the settings below has been getting only LHCb tasks the past several days?

Run test applications? [checked]
Run only the selected applications
SixTrack: yes
sixtracktest: yes
CMS Simulation: yes
LHCb Simulation: yes
Theory Simulation: no
ATLAS Simulation: no
ALICE Simulation: no
Benchmark Application: no



