Message boards : Number crunching : VM Applications Errors


Jim1348

Joined: 15 Nov 14
Posts: 568
Credit: 17,938,647
RAC: 21,077
Message 43160 - Posted: 1 Aug 2020, 12:20:28 UTC - in response to Message 43158.  

I remain totally baffled as to why Linux and Windows users both need to use VBox.

I have the following suggestions :
1. Either LHC should opt-out of FB sprints, OR, preferably, try to ensure it has SixTrack available for the next one
2. As has been mentioned above, LHC need to provide minimum specs and up-to-date guidance on how to run each app
3. Pay partial credits for failed work if it is not the fault of the volunteer

As a (more or less) innocent bystander, I can offer the following observations:
(1) The reason CMS requires VBox has been discussed many times by various people, including myself.
(2) This is a complicated project (set of projects actually, each different).
(3) I have been doing it, including native ATLAS and Theory since they first became available.
(4) Each time I set up a new machine, I have to learn what the latest procedure is, since it often changes.
(5) This is not for beginners.
(6) In fact, it may not be for home users at all, depending on your interest level.
(7) I have no idea what "FB sprints" are, and don't want to find out, but I expect this is not for them, whoever they are.

Good luck. It can be a lot of fun, but it is an acquired taste. (And computezrmle is the real expert on the subject).
ID: 43160
Greger

Joined: 9 Jan 15
Posts: 149
Credit: 431,596,822
RAC: 0
Message 43164 - Posted: 1 Aug 2020, 22:20:56 UTC

Personally, I am against this "Sprint", specifically this three-day event, because it causes the project more trouble than it gains. I posted about it at Cosmology, and other projects I contribute to, such as TN-Grid, were hit hard by this short influx of hosts. I see several negative aspects; in my view it has become brute force, with the same effect on web servers and BOINC servers as a DoS attack. It is no longer a way for project admins to test their network and BOINC servers. It has become a place for volunteers who are not content to contribute over the long term and who use the power of a large crowd to attack a project in order to prove something that has already been proven. We have seen it several times over several years, and it keeps happening to the same projects without any benefit: websites and BOINC servers stop responding, and a flow of tasks disappears and is dropped a few days later. This is a pain for project admins, who spend a lot of time generating new work units, purging data from the database and changing feeder/scheduler parameters to limit the flow to hosts. They raise the minimum quorum or increase the size of the work units, as TN-Grid does after each event, and it has become a habit each year.

Users who are not aware of this event post on the site, forum or chat, and the default answer has been "It's Formula Sprint". Most of the time they reply with something like "oh ok, I'll try another project", or say they will drop it and do something else with their computer.
If you want to be part of the "Sprint", it would probably be best to go for the SixTrack application only, but I would encourage users NOT to take part in this manner. I know most users like to be in control, moving from one project to another and playing with their computers, since they have put a lot of time and money into them. You don't need to turn it into a race, hunting the highest credit to reach the top of the scoreboard. Be wise and stay with the project so that your computers work more efficiently at LHC. You can do it; you just need to take the time.
You could do better, and the gain would be the experience of learning new things. My credit score on the stats sites, for the project or in team rankings, gives me nothing in the end, but the knowledge I have gained around LHC has great value to me. I learned several Linux distributions and applications such as VirtualBox, CernVM-FS, run, Singularity and Squid. The basic Bash I picked up helps me today with simple things at home. Such applications and proxy networking were completely unknown to me. There is no way I would have taken the time to learn them, or found help using them, without LHC and the community here pushing me.

Every project forum has a huge need for support, and for your BOINC experience and project knowledge. Help them and build a healthy community around the project. I will certainly try to help others if I can.

Two people in this thread give great support, sharing the knowledge they have built up on the forum almost daily. They are doing great work, and that is how experience moves a project forward: by informing users so that their contributions improve, and by keeping users who in the end help the project in other ways.

Example 1: 'Jim1348' is the one who got me to try the native application, with a great post listing the commands to install it. It was a great push to take that step.
Example 2: 'computezrmle' posted a config for a Squid proxy. Nobody had asked for it, but he put a lot of time, effort and networking experience into it, and setting up a connection to an additional cache proxy would not have been this easy without his work. The support he gave by PM, pinpointing network issues and making suggestions, saved me a lot of time and improved my contribution to the project. He is certainly the one to listen to when there are issues with tasks. His position is well earned.

Without them I would not be here running native tasks for the project, or have set up Squid to improve my network. Contributions like this have a huge effect on users and that makk

Or contribute by testing applications, or improve them if you can. Or go to GitHub and report issues, pointing out annoying things on the BOINC site or bad settings on BOINC servers.
Or just be passive: enjoy the experience and your addiction to BOINC, chill out and share pictures of your rigs. Anything active is good for the project, or you can stay passive, enjoy your time as a volunteer and build up experience. If the project reaches out to users, listen and help them, and if they have requirements, try to be part of the improvement.

I am in the same situation as you other volunteers. I have computers in every corner of the house, taking up space and generating excessive heat in every room 24/7. Some rooms hit 40 Celsius inside when it is 30 outside, as in today's heat, and I pay high bills to keep them alive. There is good reading on the forums and on Discord from people who really are addicted to BOINC, who have been at it for many years since BOINC started and have made a great journey. Many of them take the time to post daily in the forum to help others, or to pass information to project admins. I look up to people like MAGIC Quantum Mechanic, who has been here since 2004 and has never complained, even though he has a limited data allowance and a slow connection, yet damn well contributes to the project whenever he can. Not only that, he has made the journey through Atlas-dev and vLHC.
There are always opportunities for improvement, and with perseverance and commitment it will get better one day. It is important to express your wishes constructively. If you can, it is good to take part in testing and share information to help solve the problems.

My suggestion is to pick a project that fits you and stick to it until you find something better. It is like when Covid-19 hit us: it was great to see dedicated volunteers move to those projects when they were needed.

Which way you take is up to you, and both camps will be here. I would encourage users to take a healthy journey: dig into projects and find a way to contribute, test or debug issues if you can. Set up your computer so that it handles tasks one after another, and monitor them. As soon as you find an issue, get on it right away, instead of pointing 100 computers straight at the project and hoping for the best. It is nearly impossible to help users with that many computers.

For LHC there are several network-related errors that may not be visible to users or project admins in stderr or the other logs in the VM. These issues could lie in the network at the host's end or at the server's.

Yes, you need an additional application, VirtualBox, to handle all the subprojects here except SixTrack. But for those who use Linux, as several here do, the native apps are a dream once you get them running. They are built for production and made to run for long sessions. BOINC is tiny compared with the network around CERN, and that could explain why they do not have, or take, the time to support every single user who would like to "try" a few tasks. A user hits an error and leaves. Even a simple task like enabling virtualization in the BIOS could make some users drop out, because moving to another project is simpler than doing it. That is probably what will happen for some, but not all.

There are some issues I experience that I cannot deal with myself; they need help from the project admins or the CERN devs, so in the meantime I will wait it out. Most people are on vacation and no SixTrack work will be generated, and VirtualBox is easy to set up now, almost a one-click thing. It is what it is, and I will do what I can for the project.

Enjoy your vacation.
ID: 43164
Profile MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1000
Credit: 45,921,636
RAC: 2,471
Message 43167 - Posted: 1 Aug 2020, 23:38:58 UTC

Thanks Gunde, much appreciated.
ID: 43167
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44422 - Posted: 1 Mar 2021, 19:13:59 UTC
Last modified: 1 Mar 2021, 19:14:38 UTC

I'm getting a ton of errors atm, I had a quick look at the task details, but it doesn't mean much to me.
I've got no errors in Rosetta atm, for what it's worth.

Can anyone help?
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44422
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1821
Credit: 123,484,329
RAC: 81,860
Message 44424 - Posted: 1 Mar 2021, 19:57:32 UTC - in response to Message 44422.  

Typical snippet from your logs:
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)

This must be enabled to run VBox tasks.
Either it was never enabled, or it has been disabled, e.g. during a BIOS/OS upgrade.
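
If you have a pile of failed tasks and want to check them all for this, here is a rough Python sketch. The marker string VERR_SVM_DISABLED is copied from the log line above and "SVM Mode" is the BIOS label reported later in this thread; the function name and the hint text are made up for illustration, and other error markers would have to be added by hand as you meet them.

```python
# Hypothetical helper: flag VirtualBox "virtualization disabled" errors
# in the stderr text of a task. Only VERR_SVM_DISABLED is taken from the
# log above; extend the table with other markers you actually see.
KNOWN_MARKERS = {
    "VERR_SVM_DISABLED": "AMD-V (often labelled 'SVM Mode' in the BIOS) is disabled",
}


def find_virtualization_errors(stderr_text):
    """Return (marker, hint) pairs for every known marker found in the text."""
    return [
        (marker, hint)
        for marker, hint in KNOWN_MARKERS.items()
        if marker in stderr_text
    ]


if __name__ == "__main__":
    sample = (
        "VBoxManage.exe: error: AMD-V is disabled in the BIOS "
        "(or by the host OS) (VERR_SVM_DISABLED)"
    )
    for marker, hint in find_virtualization_errors(sample):
        print(f"{marker}: {hint}")
```

Paste the stderr output of a task into `sample` (or read it from a file) to try it.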
ID: 44424
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44426 - Posted: 2 Mar 2021, 18:52:54 UTC - in response to Message 44424.  
Last modified: 2 Mar 2021, 18:55:46 UTC

Thanks for your reply.

Never heard of it, what is it?
I don't recall seeing that in the bios, and I don't see it mentioned in the manual...
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44426
Profile Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 424
Credit: 117,245,001
RAC: 47,421
Message 44428 - Posted: 2 Mar 2021, 20:13:42 UTC - in response to Message 44426.  

Perhaps you will find the information you need in this checklist


Supporting BOINC, a great concept !
ID: 44428
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44429 - Posted: 3 Mar 2021, 18:42:03 UTC

Thanks for your reply. That's a very large post of yours you linked; is there a particular part I can get away with reading? I don't want to spend hours on this.
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44429
Profile Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 424
Credit: 117,245,001
RAC: 47,421
Message 44430 - Posted: 3 Mar 2021, 22:40:21 UTC - in response to Message 44429.  

Thanks for your reply. That's a very large post of yours you linked; is there a particular part I can get away with reading? I don't want to spend hours on this.
Hm, if you really want to crunch Atlas, Theory or CMS, you need to go through the list point by point, as I already mentioned there:

Please check this list and be sure to check every detail, step by step; all of them are important.
...



Supporting BOINC, a great concept !
ID: 44430
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44461 - Posted: 8 Mar 2021, 18:10:32 UTC
Last modified: 8 Mar 2021, 18:11:01 UTC

In the short to medium term I would've just quit running LHC if I needed to go through all that.

Anyway, some team mates managed to help me - https://forums.anandtech.com/threads/weekly-dc-stats-28feb2021.2591268/post-40454305
Turns out AMD-V was disabled in the BIOS, where it is called SVM mode, hence I didn't spot it in the manual or the BIOS.
Since enabling it, I've had no errors.
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44461
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44462 - Posted: 8 Mar 2021, 22:15:03 UTC

But I now have a WU stuck at 100% and 1.5 days elapsed time!
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44462
etcarine

Joined: 13 Feb 21
Posts: 1
Credit: 26,310
RAC: 270
Message 44466 - Posted: 9 Mar 2021, 21:07:08 UTC

LHC@home ran for 10,161 units of work, but now says "No work available to process". I have no idea what to do.
Are there actually no work units? If there are, why aren't they available?
ID: 44466
Harri Liljeroos
Joined: 28 Sep 04
Posts: 549
Credit: 29,973,249
RAC: 12,715
Message 44467 - Posted: 9 Mar 2021, 21:45:30 UTC - in response to Message 44466.  

LHC@home ran for 10,161 units of work, but now says "No work available to process". I have no idea what to do.
Are there actually no work units? If there are, why aren't they available?

Your computers are hidden so no one can view what kind of work you have been running. But if it has been sixtrack tasks (not using a Virtual Machine) then the message you get is true, no new work units are available for that subproject, only a few re-sends of tasks that have failed on someone else's computer.
ID: 44467
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44498 - Posted: 16 Mar 2021, 7:27:43 UTC
Last modified: 16 Mar 2021, 7:31:32 UTC

Well this takes the biscuit, I'd set BOINC to 85% computing time to leave some CPU power for GPU folding.
Last night and this morning I found my system had ground to a crawl and LHC VM was taking 100% CPU time! So I restricted BOINC to 50% and now LHC is 'only' taking ~70%.
What with BOINC's messed up way of trying to balance credit, and now LHC hogging resources I'm done with it :(

But why was it taking 100% when I'd set 85%!? It should have left any spare threads for Rosetta, not tried to run another 8 with LHC VM. (that's with Atlas WUs)
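
For reference, BOINC has two separate percentage settings that are easy to mix up: "Use at most N % of the CPUs" limits how many threads are used, while "Use at most N % of CPU time" throttles the running tasks. The sketch below only covers the arithmetic of the first one and is not a diagnosis of what happened here. It assumes the client rounds down and never goes below one thread, and the 12 logical CPUs are the thread count of a Ryzen 3600; the function name is made up.

```python
import math


def usable_threads(logical_cpus, max_cpus_pct):
    """Threads available under "Use at most N % of the CPUs".

    Assumption: the client rounds down and always keeps at least one thread.
    """
    return max(1, math.floor(logical_cpus * max_cpus_pct / 100))


if __name__ == "__main__":
    # 12 logical CPUs, as on a 6-core/12-thread Ryzen 3600.
    for pct in (100, 85, 50):
        print(f"{pct:>3} % of 12 threads -> {usable_threads(12, pct)}")
```

Under these assumptions a multi-threaded VM task counts against that thread budget, which is one reason mixing it with other projects needs some care.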
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44498
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44499 - Posted: 16 Mar 2021, 18:29:34 UTC - in response to Message 44498.  

Just thought of a better answer, rather than not run LHC altogether (I want to run it!), I've disabled the Atlas app. Let's see how the other apps behave....
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44499
Profile anarchic teapot

Joined: 15 Feb 06
Posts: 67
Credit: 436,665
RAC: 63
Message 44781 - Posted: 21 Apr 2021, 21:30:41 UTC - in response to Message 44499.  

It's CMS that's not working for me. So far, everything else seems fine. Perhaps the short break I took from LHC saved our relationship.
sQuonk
Plague of Mice
Intel Core i3-9100 CPU@3.60 GHz, but it's doing its bit just the same.
ID: 44781
Profile Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 424
Credit: 117,245,001
RAC: 47,421
Message 44823 - Posted: 26 Apr 2021, 12:44:15 UTC

LHC@Home is not a plug and play project like other BOINC projects are.

You can easily run LHC@Home like a plug and play project: if you run Sixtrack only
You can easily run LHC@Home like a plug and play project: if you run one of Atlas / Theory / CMS exclusively and if you keep this setting: "Use at most 100 % of CPU time" (VMs don't cope well with this kind of throttling)

If you want to run all kinds of applications LHC@Home offers, you will have to micro-manage your client; BOINC will not always be able to give you what you want for your client.
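
One form this micro-managing can take is an app_config.xml file in the project's folder inside the BOINC data directory, which caps how many tasks of each app run at once. This is only a sketch of that idea, not a recommended configuration: the app names "ATLAS" and "Theory" and the limits are placeholders, so check the real short names in your client before using anything like it.

```python
import xml.etree.ElementTree as ET


def build_app_config(limits):
    """Build app_config.xml text from a {app_name: max_concurrent} mapping."""
    root = ET.Element("app_config")
    for name, max_concurrent in limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder app names and limits, for illustration only.
    print(build_app_config({"ATLAS": 1, "Theory": 2}))
```

After saving the output as app_config.xml, the client has to re-read its config files (or be restarted) before the limits apply.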


Supporting BOINC, a great concept !
ID: 44823
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 57
Credit: 3,928,339
RAC: 0
Message 44898 - Posted: 6 May 2021, 10:17:25 UTC
Last modified: 6 May 2021, 10:22:11 UTC

I don't recall seeing that on the front page.

Anyway, disabling Atlas seems to have done the trick, no problems caused by LHC running now, but I currently have 14 errored WUs, 13 are for CMS sim, no idea why (exit codes mean nothing to me). Common ones are :-
207 (0x000000CF) EXIT_NO_SUB_TASKS
194 (0x000000C2) EXIT_ABORTED_BY_CLIENT
And a single 1 (0x00000001) Unknown error code
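
To tally these across a task list, a small Python sketch like the following could help. The two names are copied from the codes listed above; the helper names are made up, and any code not in the table is simply reported as unknown rather than guessed at.

```python
from collections import Counter

# Names copied from the errors listed above; extend as needed.
EXIT_CODE_NAMES = {
    194: "EXIT_ABORTED_BY_CLIENT",
    207: "EXIT_NO_SUB_TASKS",
}


def describe_exit_code(code):
    """Format an exit code the way the task pages show it."""
    name = EXIT_CODE_NAMES.get(code, "Unknown error code")
    return f"{code} (0x{code & 0xFFFFFFFF:08X}) {name}"


def tally_exit_codes(codes):
    """Count how often each exit code occurs, most common first."""
    return Counter(codes).most_common()


if __name__ == "__main__":
    # Made-up sample list, just to show the output format.
    for code, count in tally_exit_codes([207, 207, 194, 1]):
        print(f"{count} x {describe_exit_code(code)}")
```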

Also the estimated times for some Theory sim WUs are 8-9 days! Lol. But they aren't actually taking that long.
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44898
tullio

Joined: 19 Feb 08
Posts: 689
Credit: 4,074,537
RAC: 277
Message 44899 - Posted: 6 May 2021, 13:55:29 UTC

I can run SixTrack, Atlas and Theory tasks on my Windows 10 PC with 12 GB RAM with no problem, always using the latest version of VirtualBox. But CMS tasks all fail after about ten thousand seconds of Condor. God knows why. I have a 30 Mbit/s connection to my Internet provider and a WiFi connection from modem to PC which reaches 250 Mbit/s.
Tullio
ID: 44899
Erich56

Joined: 18 Dec 15
Posts: 1451
Credit: 35,492,508
RAC: 43,212
Message 44900 - Posted: 6 May 2021, 16:13:22 UTC - in response to Message 44899.  

@tullio: I had something like this last weekend - see the CMS thread here. And I had no idea what was the reason. 1 1/2 days later, everything ran okay.
ID: 44900

©2021 CERN