1) Message boards : Number crunching : Long WU's (Message 24635)
Posted 18 Aug 2012 by Zapped Sparky
Post:
Richard, I've been running AP (the anonymous platform, i.e. my own app_info.xml) since the various executables came out (SSE, SSE2, etc.).

For two reasons:

1: To stop BOINC from downloading a different executable each time I run out of tasks, i.e. to prevent it from dropping back to something lower.

2: To stop BOINC from downloading the same executable again and again whenever I run out of tasks and then get some more after a while.

So far I'm running BOINC 7.0.28 with the anonymous platform and no tasks have reported zero run time.
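In case it helps anyone wanting to set up the same thing, below is a rough sketch of the sort of app_info.xml that pins a particular executable; it goes in the project folder under BOINC's data directory. The app name, executable file name and version number shown are placeholders only, so check the exact names in your own project folder first:

<app_info>
    <app>
        <name>sixtrack</name>                          <!-- placeholder: the project's application name -->
    </app>
    <file_info>
        <name>sixtrack_530.10_windows_intelx86</name>  <!-- placeholder: the exact executable file in your project folder -->
        <executable/>
    </file_info>
    <app_version>
        <app_name>sixtrack</app_name>
        <version_num>53010</version_num>               <!-- placeholder: match the application version being issued -->
        <file_ref>
            <file_name>sixtrack_530.10_windows_intelx86</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>

With something like that in place the client treats the listed executable as its own (anonymous platform), so it stops re-fetching the application every time the cache refills; the trade-off is that you have to swap the file in yourself whenever the project releases a new version.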
2) Message boards : Number crunching : How is the SSE3 thing coming along? (Message 24319)
Posted 12 Jul 2012 by Zapped Sparky
Post:
Good news! I "pumped" the project and got two new SixTrack 444.01 (SSE3) tasks. They show long remaining times (35+ hrs), but that has already come down from 43+ as BOINC's estimate of the work improves.

Running Windows 7 (x64), Intel i7-2600, 16 GB RAM.

Tom

Just got my first SSE3 tasks myself; however, the estimate is off in the other direction: 1 hr 20 min to completion, though it looks like they'll actually take about 7 hrs to complete. Still glad to be crunching though :)
3) Message boards : News : Server/Executable problems (Message 24281)
Posted 11 Jul 2012 by Zapped Sparky
Post:
A couple of tasks have downloaded successfully; I've been crunching one for just over five hours, so far so good :)
4) Message boards : News : Server/Executable problems (Message 24243)
Posted 11 Jul 2012 by Zapped Sparky
Post:
I'm getting download failures as well.

11/07/2012 00:49:44 | LHC@home 1.0 | Sending scheduler request: Requested by user.
11/07/2012 00:49:44 | LHC@home 1.0 | Requesting new tasks for CPU
11/07/2012 00:49:44 | LHC@home 1.0 | [sched_op] CPU work request: 902798.71 seconds; 0.00 devices
11/07/2012 00:49:46 | LHC@home 1.0 | Scheduler request completed: got 2 new tasks
11/07/2012 00:49:46 | LHC@home 1.0 | [sched_op] Server version 700
11/07/2012 00:49:46 | LHC@home 1.0 | Project requested delay of 11 seconds
11/07/2012 00:49:46 | LHC@home 1.0 | [sched_op] estimated total CPU task duration: 182764 seconds
11/07/2012 00:49:46 | LHC@home 1.0 | [sched_op] Deferring communication for 11 sec
11/07/2012 00:49:46 | LHC@home 1.0 | [sched_op] Reason: requested by project
11/07/2012 00:49:48 | LHC@home 1.0 | Started download of sixtrack_windows_gen
11/07/2012 00:49:49 | LHC@home 1.0 | Giving up on download of sixtrack_windows_gen: permanent HTTP error
11/07/2012 00:49:49 | LHC@home 1.0 | Started download of w1jul_niebb1d__22__s__64.28_59.31__10.4_10.6__6__72_1_sixvf_boinc38520.zip
11/07/2012 00:49:50 | LHC@home 1.0 | Giving up on download of w1jul_niebb1d__22__s__64.28_59.31__10.4_10.6__6__72_1_sixvf_boinc38520.zip: permanent HTTP error
11/07/2012 00:49:50 | LHC@home 1.0 | Started download of w1jul_niebb0d__2__s__64.28_59.31__8.8_9__6__32_1_sixvf_boinc2168.zip
11/07/2012 00:49:51 | LHC@home 1.0 | Giving up on download of w1jul_niebb0d__2__s__64.28_59.31__8.8_9__6__32_1_sixvf_boinc2168.zip: permanent HTTP error
11/07/2012 00:49:54 | LHC@home 1.0 | [sched_op] Deferring communication for 1 min 49 sec
11/07/2012 00:49:54 | LHC@home 1.0 | [sched_op] Reason: Unrecoverable error for task w1jul_niebb1d__22__s__64.28_59.31__10.4_10.6__6__72_1_sixvf_boinc38520_7 (WU download error: couldn't get input files:<file_xfer_error>  <file_name>w1jul_niebb1d__22__s__64.28_59.31__10.4_10.6__6__72_1_sixvf_boinc38520.zip</file_name>  <error_code>-224</error_code>  <error_message>permanent HTTP error</error_message></file_xfer_error>)
11/07/2012 00:49:54 | LHC@home 1.0 | [sched_op] Deferring communication for 3 min 53 sec
11/07/2012 00:49:54 | LHC@home 1.0 | [sched_op] Reason: Unrecoverable error for task w1jul_niebb0d__2__s__64.28_59.31__8.8_9__6__32_1_sixvf_boinc2168_9 (WU download error: couldn't get input files:<file_xfer_error>  <file_name>w1jul_niebb0d__2__s__64.28_59.31__8.8_9__6__32_1_sixvf_boinc2168.zip</file_name>  <error_code>-224</error_code>  <error_message>permanent HTTP error</error_message></file_xfer_error>)
5) Message boards : LHC@home Science : Full Mooon (Message 23971)
Posted 11 Jun 2012 by Zapped Sparky
Post:
The headline is attention-grabbing :) But I don't see how the amount of sunlight hitting the Moon could have any effect on it.
6) Message boards : Number crunching : Two diffrent executables ? (Message 23964)
Posted 9 Jun 2012 by Zapped Sparky
Post:
Thanks.

I also noticed that every time my cache runs dry, the executables are downloaded again when I get the next batch of WUs.

Is there something wrong with the naming of the files and/or the versioning?

Previous files were called sixtrack_530.10_windows_x86_64.exe or sixtrack_530.10_windows_x86_64 or sixtrack_530.10_windows_intelx86, depending on the version of the application and the architecture ...

I'm getting the same: sixtrack.exe is downloaded again with the new tasks once the tasks in my cache have finished. Looking in the sixtrack folder, "sixtrack.exe" was there while tasks were being crunched, but it gets deleted once the tasks are completed. "sixtrack_530.10_windows_intelx86", however, is still there.
7) Message boards : News : Sixtrack server migration today (Message 23957)
Posted 7 Jun 2012 by Zapped Sparky
Post:
Thanks for the news, no problems on my end as I have, apparently, crunched a couple of tasks today :)
8) Message boards : Cafe LHC : Merry Christmas and a happy new year! (Message 23787)
Posted 21 Dec 2011 by Zapped Sparky
Post:
To the scientists behind Sixtrack, the helpers on the forums and all of the crunchers!
9) Message boards : LHC@home Science : Communication to LHC@home (SixTrack) volunteers (Message 23760)
Posted 2 Dec 2011 by Zapped Sparky
Post:

Thanks for the news, Massimo Giovannozzi, and for the project information.
10) Message boards : Number crunching : no more work? (Message 23713)
Posted 19 Nov 2011 by Zapped Sparky
Post:
I upped the number of days buffered to two and hadn't seen an outage longer than two days. Now it's been 5. Just wondering.


A bigger buffer won't help you get more work at this project; in fact, a large buffer can even decrease the amount of work you receive. SixTrack allows you to have only 1 task per core at a time. If you set your buffer too high, tasks will sit around in your cache and increase your task turnaround time, and if your turnaround time is too high the server won't issue resends to your computer, because resends are issued only to machines with short turnaround times.

In addition to what Keith said, there are other reasons why you can go for 5 days with no work when it's obvious work has been issued to other computers. Every time your computer asks SixTrack for work and doesn't receive any, it doubles the delay before the next request, up to a maximum of 24 hrs. While your computer is waiting 10 hours (for example) before requesting more work, a huge batch of 90-second tasks can be gobbled up, crunched and returned by other computers. I've seen my 8-core machine run off 150 short tasks in less than 20 minutes: if it happens to request work within a few minutes of a batch of shorties being loaded into the queue, AND it's hungry for SixTrack tasks because it hasn't had any for a while, it just keeps requesting work and getting it without increasing the delay between requests. If a couple of thousand machines in the pool all line up that way, it's like a shark feeding frenzy. It's a matter of lucky timing. As always, if you have certain skills (you know how boinccmd.exe works, can write batch files and can program the Windows Task Scheduler), you can shift the odds in your favor, drastically in your favor, and be one of the sharks who just happens to be in the right place at the right time, all the time ;-)

I only crunch LHC@home 1.0 on alternate weekends, and work has been a little dry lately. That said, last night I allowed SixTrack to fetch more work, and even though the status page said none were available it got a couple of tasks; good timing, I guess. One second the feeder can be full and the next empty, so it's very much the luck of the draw.
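For anyone curious about the boinccmd route Keith mentions, here's a minimal sketch of the sort of batch file you could run from the Windows Task Scheduler. Treat it as an illustration only: the BOINC install path, the GUI RPC password and the project URL are assumptions, so substitute the values from your own setup.

@echo off
rem Sketch only: ask the local BOINC client to contact SixTrack for more work.
rem The install path, RPC password and project URL below are assumptions - adjust for your machine.
cd /d "C:\Program Files\BOINC"
boinccmd --passwd YOUR_RPC_PASSWORD --project http://lhcathomeclassic.cern.ch/sixtrack/ update

Scheduled every 15 minutes or so, that keeps the client asking for work instead of sitting out its back-off, which is roughly the trick being described; whether it's sporting is another question ;-)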
11) Message boards : Cafe LHC : Faster than light? (Message 23706)
Posted 18 Nov 2011 by Zapped Sparky
Post:
CERN have posted an update on the FTL neutrinos.
12) Message boards : Number crunching : SixTrack and LHC@home status (Message 23681)
Posted 11 Nov 2011 by Zapped Sparky
Post:
Thanks for the update, Eric, even though a lot of it went over my head :) Hope you're having a good holiday!
13) Message boards : Cafe LHC : CERN / LHC on BBC TV (Message 23656)
Posted 5 Nov 2011 by Zapped Sparky
Post:
Thanks, sounds interesting (I hadn't heard of the program myself :)). I've set my box to flip over to it when it's next broadcast.
14) Message boards : Number crunching : Resource share (Message 23654)
Posted 5 Nov 2011 by Zapped Sparky
Post:
Hello Gundersen, and welcome to the forums! Yes, you are correct, but resource share isn't a level of importance; it controls how much processing time each project gets, and it can be set to whatever you want. So if you want Einstein to crunch for one quarter of the time and LHC for the other three quarters, you can set it that way: each project's share of the time is its resource share divided by the sum of all your projects' shares, so (with just these two projects) 75 / (75 + 25) = three quarters for LHC and 25 / 100 = one quarter for Einstein.
Click here for your Sixtrack preferences page. Click Edit, set it to 75, and then click Update preferences.
Click here for your Einstein preferences page. Click Edit, set it to 25, and then click Update preferences.
Afterwards, in BOINC Manager, click Update for both projects and the resource shares will change to reflect what you set on the website.
15) Message boards : Number crunching : Is the board working? (Message 23588)
Posted 26 Oct 2011 by Zapped Sparky
Post:
No problems with the forums that I can see. No problems with tasks either, and plenty of them to get through as well. Everything appears shipshape (I've probably cursed it now :)).
16) Message boards : Number crunching : no more work? (Message 23577)
Posted 22 Oct 2011 by Zapped Sparky
Post:
Grab 'em while you can! Tasks ready to send as of 20 Oct 2011 16:52:16 UTC: 42,621

I see your 42K and raise you 102K, the pot is now 144K.

I've noticed no one complaining in the last two days, so I had to check.

So for now there is NO "no more work".

Hehehe, I can't raise as much but it's continuing to grow:
Tasks ready to send 147,627
Tasks in progress 28,657
Plenty for everyone; in the meantime I'll join the tumbleweed in a wander about :D
17) Message boards : Number crunching : no more work? (Message 23569)
Posted 20 Oct 2011 by Zapped Sparky
Post:
Grab 'em while you can! Tasks ready to send as of 20 Oct 2011 16:52:16 UTC: 42,621
18) Message boards : Number crunching : Long delays in jobs (Message 23558)
Posted 19 Oct 2011 by Zapped Sparky
Post:
Well, no, that quorum I linked to has nothing to do with the fast reliable hosts experiment. I highlighted it to draw attention to what appears to be a BOINC server bug. That quorum has been completed but one of its members has been left in an inconsistent state. The task deemed to be invalid has been left as 'pending' and I imagine this might prevent the quorum from being deleted at the appropriate time. In the past there were plenty of examples of pending tasks left cluttering up the database long after the main body of tasks had been deleted. I don't want to see that happening again.


@Gary Roberts: apologies, I missed that the task was still pending. I just saw the two completed and validated results and thought the workunit was done. Thanks for the explanation; before the server move I had a task pending for about a year :) I now see what you're on about re: the server bug and not wanting pendings left sticking around.

The criteria for a fast reliable host might need a bit of tweaking if this and this occur with any frequency. In these two examples, a short deadline resend was issued when a primary task failed. In both those cases, the first resend timed out and that triggered a second short deadline resend. It's quite possible that the first resend might get returned late and so complete the quorum. If that happens, I trust the second resends will still be awarded credit if they get completed before their own shortened deadlines. I was interested to note that the computer used for the first resend in the first of the examples given had a turnaround time of close to 4 days - hardly what you would call 'fast and reliable' :-).


Looks like the resends are complete and all who completed got validated and credited, except for one; there's nothing in its stderr to indicate a problem.

[EDIT] @VALDIS: credits for me are just an indicator of my computers' progress. If they dropped I'd know something was up (e.g. errored tasks) and would look further into it. [/EDIT]
19) Message boards : Number crunching : Long delays in jobs (Message 23549)
Posted 18 Oct 2011 by Zapped Sparky
Post:
@Gary Roberts, from Igor's message of 9th Oct:

With the help of Keith, I have now implemented the reliable/trusted host settings correctly.
I believe the execution turnaround will improve. Will monitor and see.

Thank you much!


Judging from that quorum of yours, and from a couple of tasks sent my way, it looks like the reliable/trusted host update kicked in (or finally got around to the tasks) on the 15th.
wu 279308
wu 279207
In both cases I'm the third host. I can't call it confirmation (it's only a few tasks), but it looks like it's starting to work as planned.
20) Message boards : Number crunching : Long delays in jobs (Message 23527)
Posted 16 Oct 2011 by Zapped Sparky
Post:
Looking for a WU with "validation inconclusive" I found this host.

Looks like it became suddenly unstable.

Within the next hour the host most likely crashed all ~450 WUs immediately after download (maximum daily WU quota per CPU = 80, so a 6-core CPU can get > 460).

Does anybody know of a way to identify this kind of rapid WU transfer automatically and stop it earlier (server side)?

No, nothing automatic except the quota system, which is designed to do exactly that. The only way to reduce this situation is to lower the quota for everybody, but then when a batch of short-running tasks is issued it would be easy to hit the daily quota in minutes and not earn much credit for the rest of the day; hence the current quota of 80.

If a host is producing errors, its quota is automatically lowered towards 1; once it produces valid results again, it can earn its quota back up to the maximum set by the project.

The only way to stop this kind of thing is for users to stop hiding their hosts; then someone could send them a PM saying, "Hey, check your host." Maybe they would see that they aren't earning much credit either and look into it.

PS: the latest check shows all tasks have been issued. Now let's see how long it takes to return those 24,000 still processing.

Unfortunately, people on SETI have tried PMing owners of unhidden computers to alert them to problems their machines may be having, and I can honestly say that practically 99% don't even get a response. Reducing hosts that produce nothing but errors to one task per day (or even blacklisting them so they get no tasks at all) may help. I know it's not in the spirit of BOINC, but it seems like something that should be taken seriously, with the necessary steps taken to reduce errors.

