Message boards : Number crunching : VM Applications Errors

Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,433,416
RAC: 3,056
Message 43100 - Posted: 24 Jul 2020, 8:43:44 UTC - in response to Message 43097.  

Perhaps someone can look at their Stderr outputs to see if they can diagnose what I am doing wrong?
In your 2 results: Probing /cvmfs/sft.cern.ch... Failed!
It's a network issue. Maybe proxy or firewall related.
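For anyone who wants to check their own stderr output for the same symptom, here is a minimal Python sketch. It assumes the log uses the `Probing <path>... Failed!` wording quoted above; the function name `failed_probes` is made up for illustration and is not part of any LHC@home tool.

```python
import re

# Matches lines such as "Probing /cvmfs/sft.cern.ch... Failed!".
# The wording is taken from the stderr excerpt quoted above; other
# task versions may phrase it differently (assumption).
PROBE_RE = re.compile(r"Probing\s+(\S+?)\.\.\.\s*(\w+)")

def failed_probes(stderr_text):
    """Return the repository paths whose probe line reports 'Failed'."""
    failed = []
    for line in stderr_text.splitlines():
        match = PROBE_RE.search(line)
        if match and match.group(2).lower() == "failed":
            failed.append(match.group(1))
    return failed
```

If the returned list is not empty, the VM could not reach those repositories, which points at the network (proxy, firewall) rather than at VirtualBox itself.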
ID: 43100
Phoenix
Joined: 14 Mar 11
Posts: 9
Credit: 1,613,568
RAC: 0
Message 43126 - Posted: 29 Jul 2020, 22:10:26 UTC

I have made a couple of attempts to run jobs using VirtualBox.
All attempts run for a short time and then error out.
I have tried CMS, Theory and ATLAS; they all error out after a short time, at 1.5% or so.
I need some help.
ID: 43126
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,192,791
RAC: 103,819
Message 43127 - Posted: 29 Jul 2020, 22:27:10 UTC - in response to Message 43126.  

You have to enable VT-x in the BIOS of your Intel computer.
This setting activates hardware-assisted virtualization.
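On a Linux host you can get a first hint before rebooting into the BIOS setup. This is only a sketch under assumptions: it expects text in the format of `/proc/cpuinfo`, where the `vmx` flag stands for Intel VT-x and `svm` for AMD-V. Depending on the kernel version, the flag may still be listed when the feature is switched off in the BIOS, so a hit is necessary but not sufficient. The function name is illustrative only.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in cpuinfo-style text.

    "vmx" is Intel VT-x, "svm" is AMD-V.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the "flags" lines carry the CPU feature list.
        if sep and key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found
```

To use it, read `/proc/cpuinfo` into a string and pass it in. An empty result means the CPU does not advertise either feature to the operating system.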
ID: 43127
Phoenix
Joined: 14 Mar 11
Posts: 9
Credit: 1,613,568
RAC: 0
Message 43132 - Posted: 30 Jul 2020, 8:02:20 UTC - in response to Message 43127.  

Thank you for your help.
I wanted to do some work rather than just wait for more SixTrack tasks.
ID: 43132
Cruncher Pete

Joined: 12 Oct 07
Posts: 9
Credit: 4,115,333
RAC: 0
Message 43146 - Posted: 31 Jul 2020, 6:12:23 UTC

I am another old user who gave up on LHC because I could not set it up to run with VBox some time ago. I am disappointed that, although this problem has existed for about three years, as seen on your message boards, it has not been rectified.

We are in a sprint in the FB Challenge, yet I cannot download any tasks other than CMS, which errors out in less than a minute. I have tried all the remedies I am aware of to fix this problem at my end, and since you are not issuing work like SixTrack that does not require VBox, I doubt I will ever return. Effectively, you have lost an old-time cruncher who supported you for years, but I can see that my 12 computers will achieve more running other projects than yours. Sorry, that needed to be said, for you did nothing to fix it and only allow work for CMS, which errors out. Surely, with the capabilities of your IT staff, you can do better than this.
ID: 43146
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 43147 - Posted: 31 Jul 2020, 6:24:43 UTC - in response to Message 43146.  

I am another old user who gave up on LHC because I could not set it up to run with VBox some time ago. I am disappointed that, although this problem has existed for about three years, as seen on your message boards, it has not been rectified.

Don't use VBox version 6.x. Use 5.2.x. It has worked on every machine I have ever used it on, either Windows or Linux.
https://www.virtualbox.org/wiki/Download_Old_Builds_5_2
ID: 43147
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,040,449
RAC: 136,875
Message 43148 - Posted: 31 Jul 2020, 8:14:46 UTC - in response to Message 43146.  

I post this comment as a normal volunteer, not as a moderator.

It has been known for years that LHC's VBox tasks, especially CMS and ATLAS, require lots of RAM and put immense pressure on disk I/O as well as network traffic.
Hence, it makes no sense to connect a computer like this without resource planning:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10660427
CPUs: 1056 (!!)
RAM: 24 GB (only!!)

It requested more than 440 CMS tasks within 1.5h.

The RAM would allow up to 12 tasks to run concurrently, leaving none for the OS or the disk cache.
12 tasks would copy 12*2.4GB = 28.8GB of vdi files from the project directory to the slots directories.
12 tasks would create >20000 internet requests and ~2.5GB of downloads to finish the task setup.
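The same planning arithmetic can be written down as a small Python sketch. The 2 GB of RAM per task is an assumption inferred from "24 GB allows up to 12 tasks", the 2.4 GB image size is the figure quoted above, and the function name and the `reserve_gb` parameter are hypothetical additions for illustration.

```python
def plan_vm_load(ram_gb, ram_per_task_gb=2.0, vdi_gb=2.4, reserve_gb=0.0):
    """Estimate how many VM tasks fit in RAM and the vdi copy volume they cause.

    ram_per_task_gb is inferred from the post (24 GB -> 12 tasks);
    reserve_gb is RAM kept back for the OS and disk cache.
    """
    usable_gb = max(ram_gb - reserve_gb, 0)
    max_tasks = int(usable_gb // ram_per_task_gb)
    # Each task copies its own vdi image into a slot directory.
    vdi_copy_gb = round(max_tasks * vdi_gb, 1)
    return max_tasks, vdi_copy_gb

print(plan_vm_load(24))  # (12, 28.8)
```

Keeping some RAM back changes the picture: `plan_vm_load(24, reserve_gb=4)` gives 10 tasks and 24.0 GB of image copies. Either number is far below what a host reporting 1056 CPUs would request by default.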



A result of this overload can be seen here:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10660532
https://lhcathome.cern.ch/lhcathome/result.php?resultid=280153378
2020-07-31 16:58:04 (5352): VM Heartbeat file specified, but missing.
2020-07-31 16:58:04 (5352): VM Heartbeat file specified, but missing file system status. (errno = '2')

In addition there's a misconfiguration between Windows and VirtualBox which is not caused by LHC:
00:00:00.438450          ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={5047460a-265d-4538-b23e-ddba5fb84976} aComponent={MachineWrap} aText={The object functionality is limited}, preserve=false aResultDetail=0




Another computer has a misconfigured VT-x:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10660554
VBoxManage.exe: error: VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED)
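The three log excerpts quoted in this post can be turned into a small lookup for checking your own task output. This is a minimal sketch, not an official tool: the signature strings are copied from the excerpts above, the diagnoses paraphrase this post, and the table is illustrative rather than exhaustive.

```python
# Substring -> diagnosis, taken from the log excerpts quoted in this post.
SIGNATURES = {
    "VM Heartbeat file specified, but missing":
        "host overloaded (RAM, disk I/O or network)",
    "E_ACCESSDENIED":
        "misconfiguration between Windows and VirtualBox",
    "VERR_VMX_MSR_ALL_VMX_DISABLED":
        "VT-x disabled in the BIOS",
}

def diagnose(log_text):
    """Return the diagnoses whose signature appears in the log, in table order."""
    return [hint for needle, hint in SIGNATURES.items() if needle in log_text]
```

An empty list only means none of these three signatures matched; it does not mean the task was healthy.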



Volunteers running hundreds of cores and being here for more than a decade should not blame the project for homemade failures.
The main objective of LHC@home is to help the scientists rather than to satisfy any "sprint in FB Challenge".
ID: 43148
crashtech

Joined: 10 May 17
Posts: 4
Credit: 16,398,359
RAC: 88,051
Message 43154 - Posted: 1 Aug 2020, 5:09:20 UTC - in response to Message 43148.  

The main objective of LHC@home is to help the scientists...

True, and if you believe that, then a helpful post would better serve that end, instead of scolding volunteers for having what you subjectively consider to be improper motivation. I've been having problems with two of my Linux hosts, which error out even when tasked with only two Theory tasks at a time across an 8C/16T CPU with 32GB RAM. So it's not always an overutilization problem. Also keep in mind that although you may have some sort of problem with contest participation, many of us who find fun and satisfaction with such things also return from time to time to fulfill personal long-term goals with projects like this, which certainly helps advance the science. I can't speak for the other volunteers, but I spend hundreds of dollars a month donating hardware and electricity to various projects. Again, I would think that if one really believes that helping the science is paramount, scolding participants cannot be an optimal way to achieve that goal.
ID: 43154
Cruncher Pete

Joined: 12 Oct 07
Posts: 9
Credit: 4,115,333
RAC: 0
Message 43155 - Posted: 1 Aug 2020, 6:23:51 UTC - in response to Message 43154.  

Thank you, Crashtech, for your input. I could not have said it better myself. Indeed, I was not even going to reply to the so-called Volunteer Moderator, Volunteer Developer and Volunteer Tester replying as a VOLUNTEER. Because of my medical condition I am in front of my computer running BOINC as a volunteer with no special skills for at least 12 to 14 hours a day, and have been doing so since 1999, with 12 to 15 machines dedicated to science. Yes, I make a lot of mistakes, but I learn from them. I take offense at being used as an example of a person causing overload. All I wanted to do was get some work, and all I got was hundreds of CMS tasks that crashed within seconds. I am not an IT tech, but I have sufficient experience to meet BOINC project requirements, and I have never had any problem with any other project. It hurts to read that it is my fault, when the major problem seems to be the reliance on VMs, and they have done nothing to rectify it after years of it being reported to them on the boards.

As constructive criticism, I can say that I have learned a lot from his remarks, and I can only ask that, instead of being frustrated by so many complaints, you re-evaluate the project's needs. If you only wish to run CMS and none of the other sub-projects, then say so in the News. I would also suggest that, since this project by his own words requires powerful computers with lots of hard drive space, lots of RAM and plenty of bandwidth, you publish your requirements on the front page: tell us the minimum hardware we should have, and tell us that your project requires IT knowledge and that normal everyday volunteers, the push-button users, should not consider running it. Otherwise, LHC is wasting our time and money. I recently purchased a 64-core/128-thread computer that cost over $6,000. It seems it was a waste of money, for I cannot set up and run LHC on Win10 with 32 GB of RAM and a 1 TB hard drive on a fast cable broadband connection over Ethernet. I installed the latest BOINC and updated VirtualBox to V6.1. I set my preferences to receive all sub-projects, yet I only got CMS work that errored out almost immediately. I am sure other projects will appreciate this machine and my dedication to science.
ID: 43155
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,040,449
RAC: 136,875
Message 43156 - Posted: 1 Aug 2020, 6:31:45 UTC - in response to Message 43154.  

I've been having problems with two of my Linux hosts, which error out even when tasked with only two Theory tasks at a time across an 8C/16T CPU with 32GB RAM.

Since your computer list doesn't show any 8C/16T CPU, direct links to the failed tasks would be helpful.
ID: 43156
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,192,791
RAC: 103,819
Message 43157 - Posted: 1 Aug 2020, 8:17:03 UTC - in response to Message 43155.  
Last modified: 1 Aug 2020, 8:18:48 UTC

It hurts to read that it is my fault, when the major problem seems to be the reliance on VMs, and they have done nothing to rectify it after years of it being reported to them on the boards.

No, it isn't your fault.
In the LHC@home stats you can see many of CERN's Tier-1 institutes from around the world.
They are also running BOINC, but on Linux.
So this way of running work is not limited to us volunteers.
https://lhcathome.cern.ch/lhcathome/top_users.php
ID: 43157
davidBAM

Joined: 21 Nov 18
Posts: 1
Credit: 11,198,807
RAC: 0
Message 43158 - Posted: 1 Aug 2020, 10:43:31 UTC
Last modified: 1 Aug 2020, 10:46:02 UTC

a) This isn't the only project which has problems when the Formula Boinc Circus comes to town.
b) This isn't the only project.

Everyone tries to do their best to advance the science and even competitive crunchers like myself have their place in the grand scheme of things.

If LHC don't want to be considered for FB Sprints, then I believe a simple email to Seb would accomplish that. If they do want to be considered for Sprints, they will naturally get a load of volunteer crunchers who don't have in-depth familiarity with how best to run the project on their hardware.

I have actually run LHC before and I do have an Honours Degree in Computer Science. In spite of both of those facts, it has taken me every waking hour since the challenge started to get even half of my machines crunching this project. They are all Ubuntu 20.04 with a minimum of 1 GB RAM per thread.

I remain totally baffled as to why both Linux and Windows users need to use VBox.

I have the following suggestions :
1. Either LHC should opt-out of FB sprints, OR, preferably, try to ensure it has SixTrack available for the next one
2. As has been mentioned above, LHC need to provide minimum specs and up-to-date guidance on how to run each app
3. Pay partial credits for failed work if it is not the fault of the volunteer
ID: 43158
Henry Nebrensky

Joined: 13 Jul 05
Posts: 165
Credit: 14,925,288
RAC: 34
Message 43159 - Posted: 1 Aug 2020, 12:17:53 UTC - in response to Message 43158.  

a) This isn't the only project which has problems when the Formula Boinc Circus comes to town.
b) This isn't the only project.
...
I have the following suggestions :
1. Either LHC should opt-out of FB sprints, OR, preferably, try to ensure it has SixTrack available for the next one

Surely it's the "sprint organiser's" responsibility to ensure they choose appropriate projects with suitable work available. Sixtrack job availability has been known to be intermittent for decades.

Touch wood, my recent CMS tasks are all running and I'm getting tasks for all sub-projects selected in my preferences. Not sure why you think you're stuck with purely CMS tasks. What happened when you unticked CMS?
ID: 43159
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 43160 - Posted: 1 Aug 2020, 12:20:28 UTC - in response to Message 43158.  

I remain totally baffled as to why both Linux and Windows users need to use VBox.

I have the following suggestions :
1. Either LHC should opt-out of FB sprints, OR, preferably, try to ensure it has SixTrack available for the next one
2. As has been mentioned above, LHC need to provide minimum specs and up-to-date guidance on how to run each app
3. Pay partial credits for failed work if it is not the fault of the volunteer

As a (more or less) innocent bystander, I can offer the following observations:
(1) The reason CMS requires VBox has been discussed many times by various people, including myself.
(2) This is a complicated project (set of projects actually, each different).
(3) I have been doing it, including native ATLAS and Theory since they first became available.
(4) Each time I set up a new machine, I have to learn what the latest procedure is, since it often changes.
(5) This is not for beginners.
(6) In fact, it may not be for home users at all, depending on your interest level.
(7) I have no idea what "FB sprints" are, and don't want to find out, but I expect this is not for them, whoever they are.

Good luck. It can be a lot of fun, but it is an acquired taste. (And computezrmle is the real expert on the subject).
ID: 43160
Greger

Joined: 9 Jan 15
Posts: 151
Credit: 431,596,822
RAC: 0
Message 43164 - Posted: 1 Aug 2020, 22:20:56 UTC

Personally, I am against this "Sprint", specifically this "3-day event"; it causes more trouble than the project gains from it. I posted about it at Cosmology, and other projects I contribute to, such as TN-Grid, were hit hard by this short "flow" of hosts. I see several negative aspects: in my view it has become brute force, with the same effect as a DoS attack on the web and BOINC servers. It is no longer used as a way for project admins to test their network and BOINC servers. It has become a place for volunteers who are not satisfied with contributing long term, and who use the power of a large crowd to attack a project to prove something that has already been proven. We have seen it several times over several years, and it happens over and over again to the same projects without any benefit: websites and BOINC servers that stop responding to users, and a task flow that disappears and is dropped a few days later. This is a pain for project admins to deal with. They spend a lot of time regenerating work units, purging data from the database and changing feeder/scheduler parameters to limit the flow to hosts, and they increase the minimum quorum or the size of the work units, as TN-Grid does after each event; it has become a habit every year.

Users who post on the site, forum or chat are often not aware of this event, and the default answer has been "It's Formula Sprint". Most of the time they then post things like "oh ok, I'll try another project", or say that they will drop it and do something else with their computer.
If you want to be part of the "Sprint", it would probably be best to go for the SixTrack application only, but I would like to encourage users NOT to take part in this manner. I know most users like to take control, moving from one project to another and playing with their computers, as they put a lot of time and money into them. You don't need to make it a race, hunting the highest credit to get to the top of the scoreboard. Be wise and stay with the project so that you can make your computers work more efficiently at LHC. You can do it; you just need to take the time.
You could do better, and what you gain would be a better experience with new things. My credit score on the project's stats sites or team rankings gives me nothing in the end, but the knowledge around LHC has great value to me. I learned several Linux distributions and applications like VirtualBox, CernVM-FS, run, Singularity and Squid. Basic Bash skills help me today with simple things at home. Such applications, and networking through a proxy, were completely unknown to me. There is no way I would have taken the time to use them, or got help with them, without LHC and the community here pushing me.

There is a huge need for support on every project forum; they need your BOINC experience and your knowledge of the project. Help them and build up a healthy community around it. I will certainly try to help others if I can.

Two people in this thread provide great support, and they share the knowledge they have built up on the forum almost daily. They are doing great work, and that is how experience moves the project forward: by informing users so they can contribute better, and by keeping users who in the end help the project in other ways.

Example 1: 'Jim1348' is the guy who made me try the native application, as he wrote a great post with the commands to install it. It was a great push to take that step.
Example 2: 'computezrmle' posted a Squid proxy config without being asked, and put great time and effort into it. With his networking experience behind it, setting up a connection to an additional cache proxy was far easier than it would have been without his work. The support he gave by PM to pin down network issues, along with his suggestions, saved me much time and improved my contribution to the project. He is certainly the guy to listen to when there are issues with tasks. His position is well earned.

Otherwise I would not be here doing native tasks for the project, or have set up Squid to improve my network. Contributions like this have a huge effect on users and that makk

Or contribute by testing applications, or improve them if you can. Or go to GitHub and point out issues, annoying things on the BOINC site, or bad settings on the BOINC servers.
Or just be passive: enjoy the experience you have and your addiction to BOINC, chill out and share pictures of your rigs. Anything active is good for the project, or you can stay passive, enjoy your time as a volunteer and build up experience. If the project reaches out to users, listen and help them, and if they have some requirement, try to be part of improving it.

I am in the same situation as you other volunteers. I have computers in every corner of the house, taking up space and generating excessive heat in every room 24/7. It hits 40 Celsius inside in places while it is 30 outside, as in today's heat, and I pay high bills to keep them alive. There is good reading on the forums and on Discord from people who are really addicted to BOINC, who have spent many years on it since BOINC started and have made a great journey. Many of them take the time to post daily in the forums to help others, or to post information that helps the project admins. I look up to people like MAGIC Quantum Mechanic, who has been here since 2004 and has never complained, even though he has limited data and a slow connection, yet still contributes to the project whenever he can. Not only that, he has made a journey through ATLAS-dev and vLHC.
With perseverance and commitment there are always opportunities for improvement, and it will get better one day. It is important to convey wishes for this in a good way. If you can, it is good to take part in testing and to share information so that problems get solved.

My suggestion is to pick a project that fits you and stick to it until you find something better. It is like when Covid-19 hit us: it was great to see dedicated volunteers move to those projects when it was needed.

Which way you go is up to you, and both kinds of volunteer will be here. I would encourage users to take a healthy journey: dig into projects and find a way to contribute, test or debug issues if you can. Set up your computer so that it handles one task after another, and monitor it. As soon as you find an issue, get on it right away, instead of pointing 100 computers directly at the project and hoping for the best. It is nearly impossible to help users with that many computers.

For LHC there are several errors that are network related and that may not be visible to users or project admins in stderr or the other logs in the VM. These issues could be in the network on the host side or the server side.

Yes, you need an additional application, VirtualBox, to handle all sub-projects here except SixTrack. But for those who use Linux, as several here do, the native application is a dream once you get it running. It is built for production purposes and made to run for long sessions. BOINC is tiny compared to the network around CERN, and that could explain why they do not have, or take, the time to support every single user who would like to "try" a few tasks. A user experiences an error and leaves. A simple task such as enabling virtualization in the BIOS could make some users drop it, as it is simpler to move to another project than to do it. That is probably what will happen for some, but not all.

There are some issues I experience that I cannot deal with without help from the project admins, or until the CERN devs handle them, so in the meantime I will wait it out. Most people are on vacation and no SixTrack work will be made. VirtualBox is easy to set up now, almost a one-click thing. It is what it is, and I will do what I can for the project.

Enjoy your vacation.
ID: 43164
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1114
Credit: 49,504,188
RAC: 3,842
Message 43167 - Posted: 1 Aug 2020, 23:38:58 UTC

Thanks Gunde, much appreciated.
ID: 43167
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 58
Credit: 4,010,807
RAC: 28
Message 44422 - Posted: 1 Mar 2021, 19:13:59 UTC
Last modified: 1 Mar 2021, 19:14:38 UTC

I'm getting a ton of errors atm, I had a quick look at the task details, but it doesn't mean much to me.
I've got no errors in Rosetta atm, for what it's worth.

Can anyone help?
Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RX580 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64bit
ID: 44422
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,040,449
RAC: 136,875
Message 44424 - Posted: 1 Mar 2021, 19:57:32 UTC - in response to Message 44422.  

Typical snippet from your logs:
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)

This must be enabled to run VBox tasks.
Either it has never been enabled, or it was disabled, e.g. during a BIOS/OS upgrade.
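When there is "a ton of errors", it can help to tally the VirtualBox error codes across all failed tasks to see whether they share one cause. A minimal Python sketch, assuming the codes appear in parentheses as in the snippet above (for example `(VERR_SVM_DISABLED)`); the function name is made up for illustration.

```python
import re
from collections import Counter

# VirtualBox status codes as they appear in the quoted log line,
# e.g. "(VERR_SVM_DISABLED)".
VERR_RE = re.compile(r"\((VERR_[A-Z0-9_]+)\)")

def tally_vbox_errors(log_texts):
    """Count each VERR_* code over an iterable of stderr texts."""
    counts = Counter()
    for text in log_texts:
        counts.update(VERR_RE.findall(text))
    return counts
```

If every failed task reports the same code, one host setting is the likely culprit rather than the individual tasks.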
ID: 44424
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 58
Credit: 4,010,807
RAC: 28
Message 44426 - Posted: 2 Mar 2021, 18:52:54 UTC - in response to Message 44424.  
Last modified: 2 Mar 2021, 18:55:46 UTC

Thanks for your reply.

Never heard of it, what is it?
I don't recall seeing that in the bios, and I don't see it mentioned in the manual...
ID: 44426
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 44428 - Posted: 2 Mar 2021, 20:13:42 UTC - in response to Message 44426.  

Perhaps you will find the information you need in this checklist


Supporting BOINC, a great concept !
ID: 44428


©2024 CERN