Message boards :
ATLAS application :
queue is empty
Author | Message |
---|---|
Send message Joined: 4 Sep 22 Posts: 92 Credit: 16,008,656 RAC: 12,040 |
Furthermore, it helps if one's contribution to the effort is appreciated by the people who run the projects. The only way I can see for such appreciation to be seen on an ongoing basis is through the credit system. Cash would be a show of appreciation. Then again there are those who ran Collatz for credit and didn't believe there was a point to the maths. "It takes all sorts to make a world" as my gran used to say. Well, since it's difficult to transfer cash in a TCP packet.... And I really don't care what the credit wh*res are doing. I run only those projects that are of personal interest to me, and I'm not doing any of this for any bragging rights. If I wanted that, I would have kitted myself out with a Threadripper 5995WX on something like an ASUS Pro WS WRX80E packing 128GB of RAM and a pair of video cards each with 24 GB of memory and something like a Radeon RX 7900 XTX GPU. But even with a system like that, I probably still wouldn't have even half of what some of those RAC wh*res have in their computing arsenal. As I said, a steady graph of the RAC in the boinc manager statistics graphs shows everything is probably working according to plan -- while a sudden decline in the RAC can indicate some problem(s) in just one quick glance. That is the only reason I need to see the RAC at all, and the only reason I would like the credit per task to remain at a constant value per CPU hour -- something which, at present, is certainly not the case with Atlas tasks. |
Send message Joined: 12 Aug 06 Posts: 429 Credit: 10,591,167 RAC: 702 |
Well, since it's difficult to transfer cash in a TCP packet.... Gridcoin is a nice idea, if it paid anything like the electric cost. And it would be easy enough to allow anyone over a certain RAC to apply for an account, give a PayPal address, and get regular payments. But apparently a huge place like CERN can't afford it. probably still wouldn't have even half of what some of those RAC wh*res have in their computing arsenal. ROFL! And you can write hoars, that's lots of frost :-) As for looking for errors, usually a computation error pops up in yellow on my list. Or I see the CPU% bar drop down and turn white. Or something runs a longer time than expected. I also look at the MSI Afterburner graph on every machine each morning, using remote desktop, to make sure nothing is overheating, the CPU isn't throttling the GPU, and all the chips are running flat out. RAC wouldn't work for me, I keep switching projects, and run projects with a non-constant supply of work. I do hate acronyms: "1907: Given royal approval by Edward VII becoming the Royal Automobile Club", no wonder they're so expensive, probably still paying tax to the king. |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,238 RAC: 21,881 |
The credit system that it AFAIK uses is a bit weird. https://boinc.berkeley.edu/trac/wiki/CreditNew This time, the credit seems to erode much faster than it has so far. Yesterday evening, I got 517 points for a given task, this morning it was 204 points. Of course: same host, same number of cores, and even exactly the same runtime. If it goes on like this, it will soon end up with zero points :-) |
Send message Joined: 4 Sep 22 Posts: 92 Credit: 16,008,656 RAC: 12,040 |
As for looking for errors, usually a computation error pops up in yellow on my list. Or I see the CPU% bar drop down and turn white. Or something runs a longer time than expected. I also look at the MSI afterburner graph on every machine each morning, using remote desktop, to make sure nothing is overheating, the CPU isn't throttling the GPU, and all the chips are running flat out. OK, so you do a detailed check first thing every morning. I check the statistics graphs. I don't know what you're using where those yellow flags happen -- VirtualBox perhaps? That is my last line of "defence" if I find a problematic task in the boinc manager. I don't go to that depth unless I see something that suggests a problem somewhere. That, as I said, often begins with a glance at the statistics graphs. Then, I check the tasks for any project where the RAC has been dropping. I do not switch projects. I run those which are doing things of interest to me -- LHC, Einstein, Cosmology and Rosetta. I have settled on those, and they are not likely to change, at least not in the near future. Oh, if I had one of those 64-core Threadrippers I mentioned earlier (yeah, I wish!!) then I would also add in a project dealing with climate change. But that isn't going to be happening any time soon -- the CPU alone costs around $10K here in Canada. |
Send message Joined: 12 Aug 06 Posts: 429 Credit: 10,591,167 RAC: 702 |
OK, so you do a detailed check first thing every morning. I check the statistics graphs. That doesn't show things up accurately. You're just looking at how well you've done over the last month. And it can't work with Rosetta, as their work is on and off. So your RAC there will jump about randomly, and it will affect your RAC here when your computers go and do some Rosetta instead of LHC. I don't know what you're using where those yellow flags happen -- VirtualBox perhaps? That is my last line of "defence" if I find a problematic task in the boinc manager. Yellow is the colour I chose to show tasks saying "computation error". Lots of projects do that, usually the project's fault, but it can be a dodgy or overheating CPU/GPU. I don't go to that depth unless I see something that suggests a problem somewhere. That, as I said, often begins with a glance at the statistics graphs. Then, I check the tasks for any project where the RAC has been dropping. I get bored on one, or miss doing another. I want 100 machines. Too much to get done! Cosmology is causing me a problem. They only work with VB 5. Uspex needs VB 6 or later. LHC will run on anything. I have settled on those, and they are not likely to change, at least not in the near future. I've got two 24 core Ryzens, that will do for now. Money is being saved up for a new house. The only climate change project I know of is CPDN. Their tasks are so rare you might as well join it and get them when they are available. You're best joining with Windows and Linux as one subproject is on each (I've got Boinc running on Linux in a VB inside Windows on the two fastest machines, so I can get tasks from either of their subprojects). |
Send message Joined: 4 Sep 22 Posts: 92 Credit: 16,008,656 RAC: 12,040 |
@Mr P Hucker I only allow 2 concurrent Rosetta tasks, so it doesn't affect the other projects all that much. I also guess I didn't fully explain my rationale in checking the graphs as a first indication of problems. Of course, there are constant fluctuations in the RAC from day to day -- it is the very large drops in RAC that catch my attention. These almost always suggest something that needs further checking. For example, Cosmology tasks sometimes become "stuck"; the VM becomes unmanageable, so boinc postpones further calculations for a full day, 86400 seconds. During that time, boinc will run any Cosmology tasks that are in queue, but will not request any new ones. In my experience, it also appears that it does not report completed tasks. Given the short run times of these tasks, the queue is quickly emptied of Cosmology tasks. When this happens, the Cosmology RAC will quickly decline by a significant amount. Of course, this issue is also immediately apparent if I just look at the tasks list, but this is not the only problem I've encountered. Recently, I got quite a few Theory tasks that were all running for just over 9 days, with quite some time showing for the estimated time of completion. With no Atlas tasks available at the time to take up the slack, the LHC RAC was dropping dramatically. In boinc, all looked to be OK -- CPU time was increasing along with run time, checkpoints appeared to be happening regularly, but the time to completion just kept creeping up. Now it was clearly time to check VirtualBox, where I saw the VM was still running, but the guest CPU usage was 0. Time to abort them all, which I did. Did I leave them too long before going to VB? Probably. However, Alt-F2 has never worked here. To gain direct VB access, I had to hack the boinc account to make it a login account, and run VirtualBox in a kdesu shell in my personal account: kdesu -u boinc VirtualBox. 
Once there, moving around is not all that easy -- so I tend to use it as a method of last resort. The only 24-core Ryzen CPUs I am aware of are a couple of ThreadRippers. I'm happy for you that you can afford not just one, but two, CPUs that combined must have set you back nearly US$3500 :D |
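How much a stalled day shows up in the statistics graphs can be estimated with a little arithmetic. This is only a minimal sketch, assuming RAC behaves as an exponentially weighted average with a one-week half-life and that no credit at all is granted during the gap; the function name and the starting RAC of 1000 are made up for illustration.

```python
HALF_LIFE_DAYS = 7.0  # assumption: RAC half-life of one week


def decayed_rac(rac: float, idle_days: float) -> float:
    """Return the RAC left after `idle_days` during which no credit is granted."""
    return rac * 0.5 ** (idle_days / HALF_LIFE_DAYS)


if __name__ == "__main__":
    # Hypothetical starting RAC of 1000 for a single project.
    for days in (1, 3, 7):
        print(f"{days} idle day(s): RAC ~ {decayed_rac(1000.0, days):.1f}")
```

Under these assumptions a single idle day costs a little over 9% of a project's RAC and a full week halves it, which is why a postponed queue stands out on the graph within a day or two.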
Send message Joined: 12 Aug 06 Posts: 429 Credit: 10,591,167 RAC: 702 |
For example, Cosmology tasks sometimes become "stuck"; the VM becomes unmanageable, so boinc postpones further calculations for a full day, 86400 seconds. You're using VB 7. Cosmology hates anything newer than 5. LHC works fine on 5. Only Uspex requires 6 or later. Or you can use the legacy Cosmology tasks, but they're not very fast. During that time, boinc will run any Cosmology tasks that are in queue, but will not request any new ones. In my experience, it also appears that it does not report completed tasks. Given the short run times of these tasks, the queue is quickly emptied of Cosmology tasks. When this happens, the Cosmology RAC will quickly decline by a significant amount. Of course, this issue is also immediately apparent if I just look at the tasks list, but this is not the only problem I've encountered. If you use Boinctasks, there's a % CPU usage column. I can see immediately something isn't processing. The only 24-core Ryzen CPUs I am aware of are a couple of ThreadRippers. I'm happy for you that you can afford not just one, but two, CPUs that combined must have set you back nearly US$3500 :D I have a Ryzen 9 3900X and a Ryzen 9 3900XT. When I say core I mean thread. Yes I know technically they're not cores, but they behave as such. Hell, Boinc calls them CPUs! |
Send message Joined: 4 Sep 22 Posts: 92 Credit: 16,008,656 RAC: 12,040 |
The problem with Cosmology tasks is so rare it's not worth my effort to change VB versions; ver 7 is what comes with the distro, so I use it. The biggest problem is that one-day delay in restarting the postponed tasks. I haven't found a way to change that. I could probably ask in the forums on the Cosmology or Boinc websites, but it's just as easy to abort the problem task and let things carry on. There is no boinctasks anywhere in the packages available from my distro, nor any package by that name. Besides, if it is a command-line util, I prefer to use a gui when one is available. AFAIK, for quite some time now, each CPU core has its own math unit, which is shared between the threads the core includes. I guess I'm using the most restrictive possible definition of "core" -- if it includes a math unit, it is worthy of being called a CPU :D Recently, I've seen the odd report of upcoming processors that may feature one math unit per thread. If that comes about, then your definition of a CPU and mine will coincide :D |
Send message Joined: 12 Aug 06 Posts: 429 Credit: 10,591,167 RAC: 702 |
The problem with Cosmology tasks is so rare it's not worth my effort to change VB versions. Odd, with me and others complaining in the forum, they break very often. Every single day I had to clear out jammed ones. Problem went away when I went to v5. Maybe only the Windows VB has this problem? ver 7 is what comes with the distro, so I use it. The biggest problem is that one-day delay in restarting the postponed tasks. Yes, that was my problem. Once there are loads of them, the queue runs out and I have to nudge them. So if a computer is unattended too long, no Cosmology gets done. V5 just works. Oracle broke it after that. When creating a new version of something, always make it compatible with older things. I haven't found a way to change that. I could probably ask in the forums on the Cosmology or Boinc websites, but it's just as easy to abort the problem task and let things carry on. Restarting Boinc resets the 1 day timer. There must be a way to make it shorter, but I think I asked and it's hard coded. There is no boinctasks anywhere in the packages available from my distro, nor any package by that name. Besides, if it is a command-line util, I prefer to use a gui when one is available. Google is your friend. https://efmer.com/boinctasks/boinctasks-flavours/ - it's a GUI. Don't expect everything to be in the holy repository. AFAIK, for quite some time now, each CPU core has its own math unit, which is shared between the threads the core includes. I guess I'm using the most restrictive possible definition of "core" -- if it includes a math unit, it is worthy of being called a CPU :D If you call a core a CPU, what do you call the whole CPU? You can often use all the threads, since the maths unit isn't used 100% of the time, there are memory accesses etc going on as well, so I treat each thread as a core, so does Boinc. Recently, I've seen the odd report of upcoming processors that may feature one math unit per thread. If that comes about, then your definition of a CPU and mine will coincide :D If they have a mathSSS unit per thread, what will they continue to share? |
Send message Joined: 4 Sep 22 Posts: 92 Credit: 16,008,656 RAC: 12,040 |
The problem with Cosmology tasks is so rare it's not worth my effort to change VB versions. Odd, with me and others complaining in the forum, they break very often. Every single day I had to clear out jammed ones. Problem went away when I went to v5. Maybe only the Windows VB has this problem? Or, the problem simply isn't so prevalent with the Linux version. However, I don't know about other distros than opensuse. I haven't found a way to change that. I could probably ask in the forums on the Cosmology or Boinc websites, but it's just as easy to abort the problem task and let things carry on. Restarting Boinc resets the 1 day timer. There must be a way to make it shorter, but I think I asked and it's hard coded. If I restart boinc-client, the stalled task always restarts OK, and things return to normal. I have never had one stall a second time. However, does that not open up the possibility that some LHC tasks will then fail due to a compilation error? At least I have seen a few fail after a restart. All in all, it seems to me to be best just to abort the errant Cosmology task and carry on. There is no boinctasks anywhere in the packages available from my distro, nor any package by that name. Besides, if it is a command-line util, I prefer to use a gui when one is available. Google is your friend. Normally I don't go looking for things I have not heard of ;) Anyway, boincmgr does all that I need to see right now, but thanks for the info about boinctasks. It definitely looks interesting; I'll be sure to keep it in mind if I want to start looking deeper than boincmgr allows. AFAIK, for quite some time now, each CPU core has its own math unit, which is shared between the threads the core includes. I guess I'm using the most restrictive possible definition of "core" -- if it includes a math unit, it is worthy of being called a CPU :D If you call a core a CPU, what do you call the whole CPU? From Intel (see ref 2 at https://en.wikipedia.org/wiki/Central_processing_unit#cite_note-intel-pcm-2): A thread is a logical, or virtual, CPU. A core is a (possibly multithreaded) CPU. The big chip that holds them all is a multi-core processor. "one math unit per thread" I have no idea what such a processor would be like. Maybe (big guess on my part) they are working on multi-threaded FPUs? |
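The naming argument in this exchange comes down to which number is being counted. A small sketch, offered only as an illustration: Python's os.cpu_count() reports logical CPUs (hardware threads), which is the figure called "cores" earlier in the thread. The helper logical_cpus is a made-up name, and the 12-core, 2-threads-per-core figures are those of a Ryzen 9 3900X.

```python
import os


def logical_cpus(physical_cores: int, threads_per_core: int) -> int:
    """Return the number of logical CPUs (hardware threads) a processor exposes."""
    return physical_cores * threads_per_core


if __name__ == "__main__":
    # Ryzen 9 3900X: 12 physical cores, 2 threads per core.
    print("3900X logical CPUs:", logical_cpus(12, 2))
    # Logical CPUs the operating system reports on this machine
    # (os.cpu_count() may return None if the count cannot be determined).
    print("this machine:", os.cpu_count())
```

By the Intel wording quoted above, the first number printed counts threads, not cores, so a "24 core Ryzen" in the looser sense is a 12-core part.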
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
New tasks have dropped to zero for the past one or two hours. |
Send message Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
lol I see nothing.....nothing |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
I responded to all this, and somebody silently deleted it, fuck LHC. I'm off to do work on a respectable project with a programmer without OCD and a modicum of common sense. "queue is empty": this is the title of this thread. Mr. Hucker, there are rules and you have to respect them! |
Send message Joined: 30 Dec 13 Posts: 1 Credit: 451,563 RAC: 0 |
The biggest problem is that one-day delay in restarting the postponed tasks. I haven't found a way to change that. I could probably ask in the forums on the Cosmology or Boinc websites, but it's just as easy to abort the problem task and let things carry on. Error messages can show a task has been postponed for a number of reasons. In my experience from crunching QuChemPedIA, the way to restart postponed tasks is to shut down Boinc, open VirtualBox and make certain all VMs have closed (that can take a few seconds), then restart Boinc. I have only just started crunching LHC, but that same procedure has worked for me here when I was crunching Theory tasks. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
queue is empty. Is back again, thank you. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
got only three Rescheduler. queue is empty. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
New Atlas vdi have 670 MByte instead of 1.5 GByte. Thank you for the investigation. Magic, you can now start tasks. |
Send message Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
New Atlas vdi have 670 MByte instead of 1.5 GByte. |
Send message Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
Well that should take 90 minutes but it is 3pm and I have nothing better to do here right now. |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 1,266 |
New Atlas vdi have 670 MByte instead of 1.5 GByte. It's not the vdi file with that size. That is much bigger. About 4 GB unzipped. The 670MB file is the pool.root file coming with every new task. |
©2024 CERN