1) Message boards : Number crunching : Actual runtime far exceeds calculated runtime (Message 26098)
Posted 5 Dec 2013 by Nuadormrac
Post:
Actually, I think what can affect people, in part, is that there doesn't seem to be an update to the WU correction factor. For instance, on the current batch I notice that it estimates a 5 hour 44 minute 45 second completion time. However, on my i7 quad core (3610QM) proc here, it takes more like 9 hours.

Now most projects tend to adjust the WU correction factor, so that as a task takes longer, the projected length of time is extended, and shortened as it shrinks. Not 1 single task has thus far taken less time, but when it gets reported, the number of estimated seconds on the tasks doesn't increase by even 1. I've watched it just as it went to first upload a task: no adjustments made, at least in core client 7.0.28 for LHC here.
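To put numbers on the kind of adjustment I mean, here is a rough sketch. The update rule (jump up at once when a task runs long, ease down slowly when it runs short) is my simplified assumption of how a correction factor behaves, and `update_dcf` is a made-up name for illustration, not the client's actual code. The figures are the ones from this post.

```python
def update_dcf(dcf, raw_estimate_s, actual_s):
    """Return a new correction factor after one task is reported.

    Assumed rule (simplified, not taken from the client source):
    raise the factor immediately if the task ran long, otherwise
    move 10% of the way toward the observed ratio.
    """
    ratio = actual_s / raw_estimate_s
    if ratio > dcf:
        return ratio
    return 0.9 * dcf + 0.1 * ratio


# Figures from the post: estimated 5h 44m 45s, actual about 9 hours.
raw_estimate_s = 5 * 3600 + 44 * 60 + 45
actual_s = 9 * 3600

dcf = update_dcf(1.0, raw_estimate_s, actual_s)
next_estimate_s = raw_estimate_s * dcf

print(f"correction factor after one task: {dcf:.3f}")
print(f"next estimate: {next_estimate_s / 3600:.2f} hours")
```

With an update like this, one reported task would be enough to pull the next estimate up to roughly the 9 hours actually seen, which is the behaviour I'm not observing here.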

Now in my case, I'm not missing deadlines like the OP, but it means having to do some micro-management on task suspends and whatnot should climate models download (they've also been rarely available), when run alongside a third project, etc., as the estimates don't update. I think if they'd look at this one factor, so that the estimate is allowed to vary based upon previously returned work (that function doesn't seem very functional atm for this project's current app and version), that would remedy any such issues people might be seeing. Of course having a dual threaded quad core helps, but it also estimates feeding 8 cores, which if one runs POEM OpenCL, could also use one using 4 cores to a POEM app info file.

All in all, that's probably why he isn't seeing an issue on other projects: those projects are likely updating the work duration factor based upon the last turn-ins. I don't know what he's seeing, but I'm seeing no such update taking place with this project, meaning that no matter how many units one crunches, and how much longer than the estimate they take, the estimate never seems to change, at all...

Unknown also is how many projects he might be running at once. Bad estimates could be an issue if he's trying to run all 8 projects at once (he mentioned 8; not sure how many he has active and not set to NNT while crunching here, though). I imagine what he might have seen, if he's running them all on one comp, and unless he has one of these quad cores with 8 threads (I at least haven't seen the 6 core i7 Extreme in a laptop, which he mentioned using), is that the thing commits itself: it tries LHC and doesn't get work, so it then gets work for other projects. Then some LHC work comes, but it already had work, even if his queue is relatively smallish, if it got work for 8 projects already, and is trying a 9th, and doesn't have many "cores" to crunch it on... But I really haven't checked what his comps are, and have no idea how many projects he's running concurrently, to say whether this could be an issue for him or not. Some projects like primaboinca (not sure if he runs that) also seem prone to grab a lot of work and keep running on all cores for a while, and then balance out with project debt over the long haul, vs running just a few of their own tasks alongside other projects, which could also add to something, if one's seeing an issue with bad estimates.
2) Message boards : Number crunching : Failed to download (Message 24354)
Posted 14 Jul 2012 by Nuadormrac
Post:
But this is the odd thing: the bad tasks seem to be sent to the older client, though I'm not sure how you send the tasks. Some projects (Einstein for instance) tend to send tasks similar to those one had crunched before (so they can re-use some data files and not have to download everything again). I think Rosetta did something similar, though with the way tasks have seemed to disappear on this project, and due to the infrequency of work available, I'm not sure such a scheme would help minimize downloads....

But a thought, if for some reason the old client keeps getting sent the download failures or WU rejects (as it tries to meet the 10 quorum), and the new client isn't given that honor :o :lol:

Could be coincidence, or people might try re-attaching after a detach, directory delete, and downloading everything new again. But I can't say for certain this could alleviate anyone's download problems, as 2 clients, isolated by themselves, is rather anecdotal.

Oh, and on one task, I noticed that I ended up validating my own result, as the servers sent me the same task twice. Now I know this comp has a quad core, which is also dual threaded, and given hyper-threading is seen as 8 CPUs. But that was odd seeing my computer come up twice on the WU, and odder yet seeing the run time slightly differed each time (by about a half a minute or whatever), though it validated. Never imagined I'd establish a quorum with myself :o
3) Message boards : Number crunching : Failed to download (Message 24338)
Posted 13 Jul 2012 by Nuadormrac
Post:
I'm wondering if this is all just coincidence, or if there's some difference in which tasks the servers are giving out and where. My old laptop, which I'm wrapping up a CPDN task on, has this selfsame constant download error problem now. Last week they didn't all show up this way, after an earlier stretch where it cropped up a bit:

http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=9957942

The new laptop doesn't show any of these errors though it's connecting as a new box requesting all new tasks:

http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=9973530

The operating system is the same, though I didn't do a fresh install of Win7 Pro on the new laptop, so it's still running Win7 Home Premium (I have a key for Pro, just not sure I want to do the whole format, etc. now)... The procs are different, Core 2 Duo vs i7 quad core, but we're looking at a download error, and on all those tasks it isn't just my box getting it. They're exceeding the quota for total returned tasks or total errors from pretty much everyone... Is this coincidence, or are newer boxes not getting these bad WUs? It's enough to make me scratch my head. Maybe a remove, delete project directory, and reattach will work? Has anyone tried? I'm guessing there are still bad tasks left in the queue that the project is sending out to people, though, hmm....

The second PC of mine was added yesterday though, vs the other over a year ago, double hmm...
4) Message boards : Number crunching : No Tasks ??? (Message 24218)
Posted 10 Jul 2012 by Nuadormrac
Post:
After seeming to resolve itself last weekend, this problem started up again yesterday, and I have been unable to get work since, through today. This isn't just 7.0.28: seeing as I've got a new laptop scheduled to arrive in 2 days (I've only ever built desktops, not laptops), I haven't bothered to update software on a system that's getting replaced/tossed out. This is occurring on 6.12.15 on Windows 7 Pro x64 also....

14,000+ available in queue, no work sent. There's also not much else in my queue, as I've been cleaning it out to wind stuff down. They say tomorrow, but FedEx hasn't shown it yet, so I'm planning on 1 or possibly 2 days left to crunch (I'll check the tracking later), but it doesn't take that many hours to crunch these, so....
5) Message boards : Number crunching : Server status page shows 0 tasks ready but work is issued (Message 23891)
Posted 19 Feb 2012 by Nuadormrac
Post:
This could be what I'm noticing then, but in reverse. The server status page is showing over 1,000 tasks, but the client isn't getting. I was wondering if it might have been just tasks for another platform available or something, maybe not...
6) Message boards : Number crunching : Wot no jobs? (Message 23890)
Posted 19 Feb 2012 by Nuadormrac
Post:
Odd thing is, that the server status page shows over 1,000 tasks available, but the client is getting no task available errors. I just detached and re-attached on the new URL (or what I think is the new url). It's not giving the update the project url error though anymore, just no tasks available.

I'm guessing there's just none for Windows 7 64-bit? Or maybe it doesn't want to do the update with a LHC 2 task in progress. Not sure, hmm...
7) Message boards : Number crunching : Please note: this project rarely has work (Message 22731)
Posted 16 Apr 2011 by Nuadormrac
Post:
I am connected to LHC@home since august 31 2010 and got my first WU september 01 to crunch. It's a 00:22:54 SixTrack 4209.00 task. Still waiting in line september 2 at 23:00 UTC.
Hope it will work just fine !


In a way, the newcomers have had to get used to a more difficult time getting a single unit than those of us who were around for many years had to. And yet I remember when I was the newcomer (when account creation opened back up), and work seemed a bit less abundant than what it had been. But that was nothing like the years since. Hydrogen@home and Pirates would be about on par now...

But yeah, just keep it connected if you want to get something. Something should come, eventually.
8) Message boards : Number crunching : Computation error (Message 22318)
Posted 17 May 2010 by Nuadormrac
Post:
It's not just a question of not clicking on the show graphics. If you use BOINC screen saver and it happens to have a LHC unit running it will still kill the work unit. Makes no difference if it's an ATI or NVidia Card.

To be honest if you take out the screen saver permanently, I and many others won't be doing crunching for LHC. The cool screensaver was what got my attention to start with.


So, let me see; if it was a choice between removing the screen saver which crashes the WU and makes it not even run, or leaving it in and having the WU and possibly BOINC go altogether, you'd choose the latter?

It'd be far better for them to turn off the screensaver than leave it in and have this. And as others have pointed out, with the status quo they can't risk having the BOINC screensavers on for other projects, lest the screen saver for this project alone kills things.

It'd be far better for LHC to remove it than have things as they are now, both for the tasks and for the users who want to use the screensaver for other projects.
9) Message boards : Number crunching : Slow this train down (Message 20249)
Posted 12 Sep 2008 by Nuadormrac
Post:
Actually it's good to see a bunch of work. Yesterday was somewhat dismal: though it was said to be open, it was constantly out of work, coupled with server crashes and probs.

It's good the work is coming at this pace for however long it can. And there are more than enough crunchers attached to this project, all hungering for tasks.
10) Message boards : Number crunching : Very short Intel computation vs. normal AMD (Message 20170)
Posted 10 Sep 2008 by Nuadormrac
Post:
The procs do flip flop some in terms of performance. In the Pentium days, arguably Intel had the upper hand, until the Athlons. AMD then pulled the upper hand through the Athlon classics (K7) and the A64, as the Pentium 4s were hell under-performing P3s even. Then Curusoe came out. In a competitive market, this should be expected.

There's also errata on any given CPU or core stepping, of which the FDIV design flaw is probably one of the more noted (early Pentium or P5 cores). This can also be in one stepping of a core and not another (a la 2 different core steppings of a Pentium II). If an erratum comes up, it could result in a miscalculation or erroring out, where even another core stepping of the same CPU doesn't exhibit the same behaviour.
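For anyone who hasn't seen it, the FDIV flaw is easy to show with the well-known pair of operands below. This is only a sketch: the function names are my own, and it assumes the interpreter's float division is done by the CPU's floating point hardware. On a correct CPU the leftover is essentially zero, while the flawed early Pentiums were reported to give 256.

```python
def fdiv_residual(x=4195835.0, y=3145727.0):
    """Divide and multiply back; whatever is left over is division error."""
    return x - (x / y) * y


def looks_flawed(residual, tolerance=1.0):
    """Ordinary rounding error is tiny, so anything above 1 is suspicious."""
    return abs(residual) > tolerance


residual = fdiv_residual()
print(f"residual: {residual!r}")
print("division looks flawed" if looks_flawed(residual) else "division looks fine")
```

The point for crunching is that two hosts can run the same task and disagree on a result without either the app or the WU being at fault.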
11) Message boards : Number crunching : Actual LHC data to crunch? (Message 20102)
Posted 10 Sep 2008 by Nuadormrac
Post:
Well, for a school the BW wouldn't necessarily be a problem. When I lived in the UNM dorms, they had like 3 T3 connections running in parallel, and the other college had an OC-3 line to the outside world. True, a high school probably has less BW, though also fewer users; but they would have broadband of some kind, perhaps a T1 or better.

This said, a proprietary Linux where the machine can't be used for anything else might pose a problem for a high school teacher, as it would be hard to get dedicated purpose machines past the budgeting committee, whatever the school's BW might be.
12) Message boards : Number crunching : BOINC 5.10.x?? (Message 20080)
Posted 10 Sep 2008 by Nuadormrac
Post:
It is possible to run 2 different versions of BOINC, though it requires manually starting up and exiting each to get some time on both.

This said, recommendation does not mean requirement. And depending on the bug (I'm not going to pretend to have experienced this), one could try getting a WU, and just micromanage BOINC a little to force it to run LHC when one's there for a WU, and see if it goes. If it does, great; if it doesn't, then address it. If it works, one can activate their other projects and let it run, confident one hasn't run into a problem. Just check it periodically to make sure all remains OK. At the first sign of trouble, address it; the advice was given to alert people to the possibility.

The latest news items

06.09.2008 12:30 BST -


Hi,
Further to yesterday's news item, we are recommending that you download/use the 5.10.X version of the BOINC client for your operating system. We hope to have this sorted as soon as possible but with LHC turn on in 4 days everyone is busy preparing for that.

Later days,
Neasan

05.09.2008 10:40 BST -

Hi,
Alex has been doing some tweaking in the background this week so some issues have been resolved. We are looking into the others and fixing them as we find solutions. Also the SixTrack application and the newest version of the BOINC client (6.2.X) don't seem to like each other. The developers are looking into this problem and it will be fixed as soon as possible.

Later days,
Neasan


are not specifically saying whether EVERYONE would or would not have a problem. They're a recommendation based on various things that have been noted. Those without the problem in already crunched WUs might have just been lucky. But it doesn't follow that it will not work for anyone (it wasn't described as wholly incompatible in the note).

Just be aware, and sort it among the projects as each person sees best. Some might find it more prudent to downgrade, others might not have upgraded, some might be seeing no problem at all, until they do.... Still others might need to temporarily run 2 different BOINCs and swap between them for their various projects, or skip one or more projects for a time.
13) Message boards : LHC@home Science : 64 Bit proccessing (Message 20079)
Posted 10 Sep 2008 by Nuadormrac
Post:
This said, BOINC is 32-bit typically and in any case the client files are.


There is a 64 bit version of BOINC for Linux, Mac and Windows. There are several projects that have 64 bit applications. Some of those are much faster than the 32 bit version, see ABC@home for example.




Yeah, but not typical, as the 32-bit version of BOINC is most often downloaded, and most projects don't offer a 64-bit version of the project's app.

Allowing the proc to calculate things in 64-bit mode WOULD be significantly faster where type double and long doubles (80 bit floats) are used for certain fundamental reasons. But to get into this, one would have to get into the mechanics of how an x86 proc handles floating point, and specifically how it is handled in 32-bit mode.
14) Message boards : LHC@home Science : Cant get it working in Vista x64 (Message 20072)
Posted 9 Sep 2008 by Nuadormrac
Post:
No WUs yet... I'm expecting some tomorrow based on the recent news items. It's been 3 days since the date on the news suggesting (in 4).
15) Message boards : LHC@home Science : 64 Bit proccessing (Message 20071)
Posted 9 Sep 2008 by Nuadormrac
Post:
Previously, I ran 64-bit Vista from its beta 2 through its RC days, and also a 64-bit Linux. My observations are that it was faster up to the point one ran out of memory. This is to say (mind you the apps were 32-bit), things appeared snappier and more responsive.

This said, BOINC is 32-bit typically and in any case the client files are. From a scientific standpoint, if they had a 64-bit client, this could result in more precision being allowable in the calculations without a performance toll on the CPU. This said, type double (64-bit floats) is probably already being used; only thing is it takes a little more for a 32-bit proc to calculate 64-bit floats than a 64-bit proc.
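To illustrate what the width of the float type buys in precision, here is a small sketch that stores the same value as a 32-bit float and as a 64-bit double and reads it back. It says nothing about speed, and `roundtrip` is just a helper name I made up for this example.

```python
import struct


def roundtrip(value, fmt):
    """Pack a Python float into the given binary format and unpack it again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


value = 0.1
as_float32 = roundtrip(value, "f")   # 32-bit single precision
as_float64 = roundtrip(value, "d")   # 64-bit double precision

print(f"32-bit float : {as_float32!r}")
print(f"64-bit double: {as_float64!r}")
print(f"error from 32-bit storage: {abs(as_float32 - value):.3e}")
```

The double comes back unchanged, while the single picks up an error around the 8th or 9th decimal place, which is the sort of thing that matters when a result gets fed through many steps of a calculation.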

Memory is an issue however, depending on how much you have. Under Linux I didn't run out so much, but on Vista, 1 GB RAM could be eaten awful fast. Once it swaps, the performance drops, and the 64-bit Vista did use a ton of RAM. I never looked at Vista-32 to compare. As for what you're running, my time running slamd64 on this box would probably be more representative; and the memory footprint on that was lower than win64.
16) Message boards : Number crunching : I get no "workload" (Message 20067)
Posted 9 Sep 2008 by Nuadormrac
Post:
This isn't uncommon on LHC. WUs come in groups and are not always available. However there is indication that WUs will be available in a day or so, so just stay attached.
17) Message boards : Number crunching : The new look bugs (Message 20066)
Posted 9 Sep 2008 by Nuadormrac
Post:
Rather cosmetic, but on the display side of things one third of the viewing pane is taken up by the menu on all pages, which squeezes every column. On this post screen, 1/4 of the browser's visible real estate is taken up by white (grey) space, which means only 2 words of what I post are visible, and the rest I can't see, edit or access while making a post.

On my account page, text beyond each field's visible pane becomes inaccessible. Not everyone has enough browser real estate for all these fixed-size columns, or for all the white space being imposed. It has essentially taken the site from cumbersome at best to non-functional at worst. Made worse that my old monitor went bad, so I had to borrow one that forces me to 1024x768 screen res (my old monitor handled 1280x1024), and being an LCD it has the whole optimal (native) res thing on the display side. Now throw in that I routinely have 5+ programs open at a time and can NOT have 1 browser window take up the entire screen and still access everything I need to multi-task through at once.... There are too many columns all imposing their own footprint on browser window real estate. I can't even see all I'm typing under this setup.

BTW, this is as displayed in IE 7

Edit: at the bottom of this post, I get more than 1 screen of white space before my sig; on other forums I'm not seeing this.
18) Message boards : Number crunching : Discepency in Credits? (Message 14155)
Posted 23 Jun 2006 by Nuadormrac
Post:
Keep in mind however, that since the database crash, my computer's copy of the credit has been increasing, but the value in the user table has not. AKA one has been going up, and the other is the same.

Also, things can be orphaned in 1 direction, aka if it is able to link the tables in 1 direction, but for whatever reason not the other. Yes, it is reasonable the same foreign key is used, and logically it should be able to go both ways. This of course is assuming that everything is working correctly.

What could be going on, is that a result is being reported by a given host, and so there's the record of the host being made through the connection with the dBase itself. However, if in the process of trying to update the user, the dBase is getting the same sorta error we are getting, such as

Warning: mysql_fetch_object(): supplied argument is not a valid MySQL result resource in /shift/lxfsrk4101/data01/projects/lhcathome/html/user/results.php on line 41

Warning: mysql_free_result(): supplied argument is not a valid MySQL result resource in /shift/lxfsrk4101/data01/projects/lhcathome/html/user/results.php on line 45


then it might be unable to proceed any further, so stops.

One other thing to keep in mind. The results table needs to be updated, along with some other tables potentially. The user table doesn't need an update on some fields, unless an account is created, the cross-project ID is changed (aka after being created, BOINC contacts and supplies a different, earlier one), etc. It's the credits that need a reciprocal update. Computers is another one of those things. Which of course brings up the possible question of whether one of the validators is running into the same "not valid argument" we are, and if so, is simply unable to complete an operation.

I point this out because there might or might not even be an attempt to update the user table at this point; in which case we could be looking at old data, rather than current (aka some things being presented to us have simply been cached). Not entirely positive, as I haven't looked at LHC's dBase setup itself; though as another sign of this whole dBase crash, LHC hasn't exported XML files for stat sites (like BOINCstats) to pick up in a while either.

http://www.boincstats.com/

LHC@Home 2006-06-21 18:45:06 GMT
2 days 03:59:00 old


All in all, it doesn't seem implausible that some parts of the dBase are still functioning (else we wouldn't even be getting to the forums, or be able to look at anything), and other parts have crashed. Nor does it seem implausible that if we are getting errors such as the above, any internal operations that require those same tables or present similar arguments might be getting the same error we get on our screen. And from there...?

Could some of these figures such as user credit be cached somewhere anyhow, or does it have to be a recent copy? At the very least I have noticed in the past a lag time between when user credits are incremented, and when this increment shows up on their team page (which of course couldn't complete successfully after the teams disappeared).
19) Message boards : Number crunching : Discepency in Credits? (Message 14142)
Posted 23 Jun 2006 by Nuadormrac
Post:
My host tends to have more. Last time this happened was also due to the last time we had server probs, which was evened up when Churlle fixed it the last time.

Best I can figure, something like this is going on. There's evidence of something seriously wrong in the dBase (for instance, click on certain sections like pending credit, teams, whatever, and get some error message wrt the operation taking place).

So, the host reports a WU back; without a database association to say which account owns the host, the credit gets applied to the host, but not the owning account. Ditto on anything getting applied to team accounts, as they aren't even coming up anymore. If the dBase can't find out what computer belongs where, all these orphaned records/computers can end up gaining credit, without it applying elsewhere.

I'm not 100% positive how they have it set up, but there might be a host_computer table in the dBase which is separate from the user table? If that's the case, the dBase would have to be able to link these tables together to make the necessary applications of credit; and performing lookups on some tables (aka pending credit and teams) seems to be what invokes the error with a 0.00 given for a result.

Or perhaps I should modify that. The computers aren't completely orphaned. They still do show up under our account as being our computers. It just seems an operation done to them credit-wise isn't reciprocally being done to the user account that owns them during such a dBase crash...
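A toy version of what I'm describing, to make it concrete. The table and column names here are guesses for illustration only (I haven't seen the project's schema), and the `user_update_works` flag just simulates the second step failing the way the web pages are failing for us.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database with one user owning one host."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, total_credit REAL)")
    db.execute(
        "CREATE TABLE host (id INTEGER PRIMARY KEY, userid INTEGER, total_credit REAL)"
    )
    db.execute("INSERT INTO user VALUES (1, 100.0)")
    db.execute("INSERT INTO host VALUES (10, 1, 100.0)")
    return db


def grant_credit(db, host_id, credit, user_update_works=True):
    """Credit the host first, then the owning user (hypothetical order)."""
    db.execute(
        "UPDATE host SET total_credit = total_credit + ? WHERE id = ?",
        (credit, host_id),
    )
    if not user_update_works:
        return  # simulated failure: the reciprocal update never happens
    db.execute(
        "UPDATE user SET total_credit = total_credit + ? "
        "WHERE id = (SELECT userid FROM host WHERE id = ?)",
        (credit, host_id),
    )


def totals(db):
    """Return (host credit, user credit) for the single host and user."""
    host = db.execute("SELECT total_credit FROM host WHERE id = 10").fetchone()[0]
    user = db.execute("SELECT total_credit FROM user WHERE id = 1").fetchone()[0]
    return host, user


db = build_db()
grant_credit(db, 10, 5.0)                            # healthy: both go up
grant_credit(db, 10, 5.0, user_update_works=False)   # broken: only the host goes up
print(totals(db))
```

After the second call the host is ahead of the account by exactly the credit from the task that was reported while the user side was failing, which matches the kind of gap I'm seeing.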
20) Message boards : Number crunching : Where all the Teams Go? :( (Message 14134)
Posted 23 Jun 2006 by Nuadormrac
Post:
thx for the update




©2020 CERN