Atlas-native App finished in Seconds
Author | Message |
---|---|
Joined: 2 May 07 Posts: 2190 Credit: 173,346,434 RAC: 57,160 |
This computer is finishing ATLAS native app tasks in a few seconds and still getting Cobblestones. It also has a lot of crashed tasks: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10451475&offset=0&show_names=0&state=4&appid= |
Joined: 2 May 07 Posts: 2190 Credit: 173,346,434 RAC: 57,160 |
I tested a task using 7 CPUs and it ran successfully: Run time 29 min 54 sec. CPU time 2 hours 6 min 9 sec. Validation state: Valid. Credit: 58.22 |
Joined: 5 Apr 15 Posts: 18 Credit: 5,910,849 RAC: 0 |
Hi all, I'm running into trouble with every kind of LHC task these days. Initially I ran 5-core applications (Theory, ATLAS, and when available CMS, LHCb, etc.) without any issues for about a year, I guess, since they became available. Then, about 2-3 months ago, I started getting "VM unmanageable" errors on ATLAS. Restarting BOINC after a reboot solved the issue, but I had to reboot my system constantly, which is not very handy. I uninstalled VirtualBox (version 6 at that time) and reinstalled it to try to fix the "VM unmanageable" error, but to no avail. I then stopped all LHC WUs and searched for any settings that could be causing the issue. Just out of curiosity, I then enabled 7-core applications, thinking "maybe 5-core WUs carry a flaw that 7-core WUs don't have". To my horror, all of the tasks errored out within 30 seconds of starting with the message "WU aborted", without any further information. Checking the standard log, I can't even see where the error originates. I then scaled back to 6-core WUs: still the same issue, "WU aborted" after 30 seconds at most. I scaled back to 5 cores. Now I even have the same issue on both of them... FYI, there's nothing wrong with my processor or memory: I tried running World Community Grid and it runs just fine on 9 cores (out of 12; 2 more cores are assigned to GPUGrid). I then decided to take the drastic step of removing VirtualBox and BOINC and reinstalling them, keeping the application settings for now. I did, but unfortunately I get the same errors again on LHC: "WU aborted" after 30 seconds. Is there an easy way to enable logging? (I see a lot of generic BOINC logging options, but nothing specific to WUs.) Should I remove BOINC and VirtualBox completely, including existing settings, and start all over again? I would appreciate your help, as it's a pity to waste so much time (I've been struggling for nearly 2 months now)! :-) |
Joined: 9 Jan 15 Posts: 151 Credit: 431,596,822 RAC: 0 |
When I checked the log from the last task today, I found this: VBoxManage.exe: error: VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED). Check in your BIOS whether VT-x is enabled. |
Joined: 5 Apr 15 Posts: 18 Credit: 5,910,849 RAC: 0 |
Hi Gunde, it was in fact a combination of factors. I upgraded the BIOS to the latest version to patch an Intel security bug, but that wreaked havoc on the BIOS settings, effectively resetting VT-x. In parallel, something was wrong with the interaction between BOINC 7.14.2 and VirtualBox 6.0.4, which got messed up when I installed other software (silly me...). On top of that, BOINC seems to have a good memory: it retained the fact that VT-x was disabled in the .xml file in the BOINC_DATA directory (see the details in Yeti's checklist). So in the end, after flashing back the last working version of the BIOS, enabling VT-x again, removing the software I had installed, removing BOINC and VirtualBox 6, reinstalling BOINC 7.14.2 and VirtualBox 5.2.26, and modifying BOINC's memory in the xml file, it finally works like a charm again! The only pity is that I lost nearly an entire month troubleshooting the darn thing... Which makes me pose the question: does Atlas/LHC really need VMs to run? Why can WorldComGrid, ClimatePrediction or Rosetta run without it? (Just asking out of pure ignorance, apologies for that!) Have a nice weekend, all! B.E. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"Which makes me pose the question: does Atlas/LHC really need VMs to run?" The apps need to run on a platform with specific capabilities that are readily available on Linux as standard features or easy add-ons. I'm not saying those capabilities couldn't be built into or added onto Windows or OS X, but the experts believe it's less work to virtualize the environment and accept the performance hit. I can't say I disagree with them. "Why can WorldComGrid, ClimatePrediction or Rosetta run without it?" They feel that, given their needs, they can get statistically valid results from apps skillfully designed to compile and run on multiple platforms. |
Joined: 14 Jan 10 Posts: 1375 Credit: 9,162,469 RAC: 5,174 |
"Which makes me pose the question: does Atlas/LHC really need VMs to run?" BOINC is an important, but not a major, partner for running ATLAS jobs. Most partners are scientific institutions running hosts with Scientific Linux, so the scientists develop their applications only for Linux and not for other operating systems. |
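
For anyone hitting the same "WU aborted" symptom, here is a minimal sketch of scanning a task's log text for the VT-x error Gunde quoted above. It assumes you have already copied the task's log output into a string. The function name and the sample log are hypothetical; only the VBoxManage line and its error code come from this thread.

```python
# Error code quoted in the thread; VBoxManage printed it when VT-x was disabled in the BIOS.
VTX_DISABLED_CODE = "VERR_VMX_MSR_ALL_VMX_DISABLED"


def find_vtx_disabled_lines(log_text):
    """Return the log lines that mention the VT-x disabled error code."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if VTX_DISABLED_CODE in line
    ]


# Hypothetical log excerpt: only the VBoxManage line is taken from the thread.
sample_log = """\
Starting VM (hypothetical line)
VBoxManage.exe: error: VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED)
VM failed to start (hypothetical line)
"""

hits = find_vtx_disabled_lines(sample_log)
if hits:
    print("VT-x appears to be disabled; check the BIOS setting:")
    for hit in hits:
        print("  " + hit)
```

A plain substring match is enough here because the error code is a fixed token. It only tells you that VT-x was reported as disabled, not why, so a BIOS update that reset the setting (as happened above) still has to be checked by hand.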
©2024 CERN