Message boards : Number crunching : Only 1 of 40 simulation jobs finished and gave results 1

dr_mabuse
Joined: 30 Dec 05
Posts: 57
Credit: 816,635
RAC: 29
Message 28146 - Posted: 19 Dec 2016, 10:59:20 UTC

Over the last few days I ran a lot of simulation jobs. Nearly all of them ended with a computation error and earned no credit. The other problem is the vast amount of hard disk space they occupied.
I run about 14 different BOINC projects and have never seen such bad performance.
There should be a complete rework of the underlying software.
Below I include a list of my results.
I hope it helps to optimize the performance.
Greetings from Germany
Dr.Mabuse
Task | Work unit | Computer | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Credit | Application
109917065 | 52589857 | 9953030 | 19 Dec 2016, 9:53:37 UTC | 19 Dec 2016, 10:45:59 UTC | Aborted by user | 0.00 | 0.00 | --- | CMS Simulation v47.60 (vbox64)
109917226 | 52589980 | 9953030 | 19 Dec 2016, 9:53:29 UTC | 27 Dec 2016, 9:53:29 UTC | In progress | --- | --- | --- | LHCb Simulation v0.11 (vbox64)
109917234 | 52589988 | 9953030 | 19 Dec 2016, 9:53:15 UTC | 19 Jan 2017, 9:53:15 UTC | In progress | --- | --- | --- | Theory Simulation v262.50 (vbox64)
109917113 | 52589884 | 9953030 | 19 Dec 2016, 9:53:01 UTC | 19 Dec 2016, 10:42:03 UTC | Error while computing | 708.75 | 73.31 | --- | CMS Simulation v47.60 (vbox64)
109917211 | 52589965 | 9953030 | 19 Dec 2016, 9:52:49 UTC | 19 Dec 2016, 10:38:05 UTC | Error while computing | 792.72 | 72.83 | --- | Theory Simulation v262.50 (vbox64)
109917214 | 52589968 | 9953030 | 19 Dec 2016, 9:52:36 UTC | 19 Dec 2016, 10:25:36 UTC | Error while computing | 798.04 | 67.97 | --- | Theory Simulation v262.50 (vbox64)
109917175 | 52589929 | 9953030 | 19 Dec 2016, 9:52:28 UTC | 19 Dec 2016, 10:25:36 UTC | Error while computing | 828.09 | 69.17 | --- | Theory Simulation v262.50 (vbox64)
109917032 | 52589824 | 9953030 | 19 Dec 2016, 9:52:16 UTC | 19 Dec 2016, 10:11:06 UTC | Error while computing | 732.64 | 69.05 | --- | CMS Simulation v47.60 (vbox64)
109915523 | 52588369 | 9953030 | 19 Dec 2016, 3:45:39 UTC | 19 Dec 2016, 5:29:35 UTC | Error while computing | 801.12 | 66.63 | --- | Theory Simulation v262.50 (vbox64)
109915427 | 52588273 | 9953030 | 19 Dec 2016, 3:45:27 UTC | 19 Dec 2016, 5:24:27 UTC | Error while computing | 844.88 | 158.83 | --- | LHCb Simulation v0.11 (vbox64)
109915470 | 52588316 | 9953030 | 19 Dec 2016, 3:45:16 UTC | 19 Dec 2016, 4:52:16 UTC | Error while computing | 951.59 | 109.06 | --- | LHCb Simulation v0.11 (vbox64)
109915419 | 52588265 | 9953030 | 19 Dec 2016, 3:45:05 UTC | 19 Dec 2016, 5:09:15 UTC | Error while computing | 808.01 | 78.97 | --- | CMS Simulation v47.60 (vbox64)
109915421 | 52588267 | 9953030 | 19 Dec 2016, 3:44:53 UTC | 19 Dec 2016, 10:09:21 UTC | Error while computing | 734.49 | 72.22 | --- | CMS Simulation v47.60 (vbox64)
109915432 | 52588278 | 9953030 | 19 Dec 2016, 3:44:42 UTC | 19 Dec 2016, 4:25:23 UTC | Error while computing | 1,080.71 | 124.88 | --- | LHCb Simulation v0.11 (vbox64)
109915379 | 52588225 | 9953030 | 19 Dec 2016, 3:44:30 UTC | 19 Dec 2016, 9:52:03 UTC | Aborted by user | 0.00 | 0.00 | --- | LHCb Simulation v0.11 (vbox64)
109913717 | 52586654 | 9953030 | 18 Dec 2016, 22:15:35 UTC | 18 Dec 2016, 22:33:02 UTC | Error while computing | 912.07 | 188.17 | --- | LHCb Simulation v0.11 (vbox64)
109913645 | 52586583 | 9953030 | 18 Dec 2016, 22:15:22 UTC | 18 Dec 2016, 23:03:26 UTC | Error while computing | 713.83 | 67.11 | --- | CMS Simulation v47.60 (vbox64)
109913551 | 52586489 | 9953030 | 18 Dec 2016, 22:15:10 UTC | 18 Dec 2016, 22:50:04 UTC | Error while computing | 758.72 | 63.27 | --- | CMS Simulation v47.60 (vbox64)
109912493 | 52585492 | 9953030 | 18 Dec 2016, 18:09:36 UTC | 18 Dec 2016, 21:18:26 UTC | Error while computing | 706.29 | 66.92 | --- | Theory Simulation v262.50 (vbox64)
109912364 | 52585363 | 9953030 | 18 Dec 2016, 18:09:25 UTC | 18 Dec 2016, 21:11:16 UTC | Error while computing | 813.77 | 67.83 | --- | Theory Simulation v262.50 (vbox64)
109912420 | 52585419 | 9953030 | 18 Dec 2016, 18:09:13 UTC | 18 Dec 2016, 21:05:37 UTC | Error while computing | 814.36 | 109.94 | --- | LHCb Simulation v0.11 (vbox64)
109912378 | 52585377 | 9953030 | 18 Dec 2016, 17:56:02 UTC | 18 Dec 2016, 18:08:31 UTC | Aborted by user | 0.00 | 0.00 | --- | CMS Simulation v47.60 (vbox64)
109912180 | 52585179 | 9953030 | 18 Dec 2016, 17:55:50 UTC | 18 Dec 2016, 18:08:31 UTC | Aborted by user | 0.00 | 0.00 | --- | CMS Simulation v47.60 (vbox64)
109911223 | 52584266 | 9953030 | 18 Dec 2016, 15:18:16 UTC | 18 Dec 2016, 16:08:16 UTC | Error while computing | 749.21 | 67.23 | --- | CMS Simulation v47.60 (vbox64)
109911092 | 52584161 | 9953030 | 18 Dec 2016, 15:18:03 UTC | 18 Dec 2016, 15:53:18 UTC | Error while computing | 1,139.38 | 209.02 | --- | LHCb Simulation v0.11 (vbox64)
109911202 | 52584246 | 9953030 | 18 Dec 2016, 15:17:55 UTC | 18 Dec 2016, 17:55:39 UTC | Aborted by user | 0.00 | 0.00 | --- | CMS Simulation v47.60 (vbox64)
109911217 | 52584260 | 9953030 | 18 Dec 2016, 15:17:30 UTC | 18 Dec 2016, 15:34:09 UTC | Error while computing | 848.30 | 72.52 | --- | Theory Simulation v262.50 (vbox64)
109910981 | 52584059 | 9953030 | 18 Dec 2016, 14:34:23 UTC | 18 Dec 2016, 14:50:58 UTC | Error while computing | 842.88 | 65.05 | --- | Theory Simulation v262.50 (vbox64)
109910782 | 52583907 | 9953030 | 18 Dec 2016, 14:34:09 UTC | 18 Dec 2016, 15:17:43 UTC | Aborted by user | 0.00 | 0.00 | --- | Theory Simulation v262.50 (vbox64)
109910911 | 52583998 | 9953030 | 18 Dec 2016, 14:34:02 UTC | 18 Dec 2016, 15:09:25 UTC | Error while computing | 729.95 | 76.63 | --- | Theory Simulation v262.50 (vbox64)
109903591 | 52577589 | 9953030 | 17 Dec 2016, 13:58:55 UTC | 18 Dec 2016, 10:07:10 UTC | Completed and validated | 62,618.35 | 43,222.59 | 571.65 | CMS Simulation v47.60 (vbox64)
109903569 | 52577567 | 9953030 | 17 Dec 2016, 13:58:47 UTC | 17 Dec 2016, 16:54:01 UTC | Error while computing | 1,332.21 | 295.38 | --- | LHCb Simulation v0.11 (vbox64)
109903531 | 52577529 | 9953030 | 17 Dec 2016, 13:58:35 UTC | 18 Dec 2016, 14:34:02 UTC | Aborted by user | 0.00 | 0.00 | --- | Theory Simulation v262.50 (vbox64)
109903301 | 52577304 | 9953030 | 17 Dec 2016, 13:10:16 UTC | 17 Dec 2016, 13:45:06 UTC | Error while computing | 861.43 | 106.45 | --- | LHCb Simulation v0.11 (vbox64)
109903131 | 52577137 | 9953030 | 17 Dec 2016, 13:10:03 UTC | 17 Dec 2016, 14:02:05 UTC | Error while computing
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1992
Credit: 143,596,632
RAC: 93,663
Message 28148 - Posted: 19 Dec 2016, 13:27:27 UTC - in response to Message 28146.  

From your logfiles and computer details:
Required extension pack not installed, remote desktop not enabled.
Guest Log: BIOS: VirtualBox 5.0.28
Setting CPU throttle for VM. (94%)
VM Heartbeat file specified, but missing.

Memory: 8060.5 MB

I would recommend that you:
- upgrade to the most recent VirtualBox (5.1.10)
- install the VirtualBox Extension Pack
- set the CPU throttle to 100 %
- follow this task to see what causes the heartbeat error
- run only Theory WUs until they run stably (lowest RAM requirements)
- run only one WU at a time until that runs stably
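
For anyone who wants to check the first recommendation without opening the VirtualBox GUI, here is a rough Python sketch. It assumes the `VBoxManage` command-line tool that ships with VirtualBox is on the PATH (on Windows it usually is not, so you may have to add the VirtualBox install folder first). The function names and the sample revision suffix `r12345` are made up for illustration; only `VBoxManage --version` itself is a real command.

```python
import re
import shutil
import subprocess

# Version recommended in the post above; adjust as newer releases appear.
RECOMMENDED = (5, 1, 10)


def parse_vbox_version(text):
    """Turn output such as '5.0.28r12345' into a tuple like (5, 0, 28)."""
    # search() rather than match(), in case warnings precede the version.
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def needs_upgrade(installed, recommended=RECOMMENDED):
    """True if the installed version tuple is older than the recommended one."""
    return installed < recommended


def installed_vbox_version():
    """Ask VBoxManage for its version; return None if it is not on the PATH."""
    exe = shutil.which("VBoxManage")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    )
    return parse_vbox_version(result.stdout)


if __name__ == "__main__":
    version = installed_vbox_version()
    if version is None:
        print("VBoxManage not found on PATH")
    else:
        print("installed:", version, "upgrade recommended:", needs_upgrade(version))
```

With the version from the guest log above (5.0.28), `needs_upgrade` reports that an upgrade is due. To see whether the Extension Pack is installed, `VBoxManage list extpacks` prints the installed packs.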
dr_mabuse
Joined: 30 Dec 05
Posts: 57
Credit: 816,635
RAC: 29
Message 28162 - Posted: 20 Dec 2016, 14:55:53 UTC - in response to Message 28148.  

Thanks for your answer.
Today I downloaded the new version of VMware and installed it and also the Extension pack 5.1.10. The installation completed successfully.
Then I downloaded some new tasks of LHC@home. The first task claimed active but didn't start.
The virtual machine BOINC_VM was not started. I tried to start it from the console. I saw a DOS window but got the error message:
A new session could not be opened for the virtual machine BOINC_VM.

Could not open the medium 'C:\ProgramData\BOINC\slots\0\cernvm.vmdk'.

VD: error VERR_FILE_NOT_FOUND opening image file 'C:\ProgramData\BOINC\slots\0\cernvm.vmdk' (VERR_FILE_NOT_FOUND).

Result code: E_FAIL (0x80004005)
Component: MediumWrap
Interface: IMedium {4afe423b-43e0-e9d0-82e8-ceb307940dda}

So what does that mean in plain words? I am no expert on VMs and I don't have the time to read a manual of more than 350 pages.

In the meantime the task Theory_29516_1482239153.068194_0 started and took about 12% of my CPU time.

Another severe problem is the excessive disk space needed. See here:

20.12.2016 15:48:03 | LHC@home 1.0 | Message from server: Theory Simulation needs 5037.39MB more disk space. You currently have 2592.00 MB available and it needs 7629.39 MB.
20.12.2016 15:48:03 | LHC@home 1.0 | Message from server: CMS Simulation needs 5037.39MB more disk space. You currently have 2592.00 MB available and it needs 7629.39 MB.
20.12.2016 15:48:03 | LHC@home 1.0 | Message from server: LHCb Simulation needs 5037.39MB more disk space. You currently have 2592.00 MB available and it needs 7629.39 MB.

What are your suggestions?
Thanks for your help
Dr.Mabuse
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1992
Credit: 143,596,632
RAC: 93,663
Message 28164 - Posted: 20 Dec 2016, 16:36:46 UTC - in response to Message 28162.  

Today I downloaded the new version of VMware and installed it and also the Extension pack 5.1.10. The installation completed successfully.

Good.

Then I downloaded some new tasks of LHC@home.

Only Theory, I hope, as the other VM subprojects need far more resources.

The first task claimed active but didn't start.

How long did you wait?
Be patient. Your computer has to copy a couple of files from the project directory to the slots directory.
The cernvm.vmdk can be between 800 MB (Theory) and 1.7 GB (CMS) in size.
Do you have enough free disk space?

I tried to start it from the console.

Not good. This will not work, as the VM is controlled by BOINC (vboxwrapper) and the cernvm.vmdk is deleted from the slots directory after an error occurs or the WU is finished.

Another severe problem is the excessive disk space needed. See here:

20.12.2016 15:48:03 | LHC@home 1.0 | Message from server: Theory Simulation needs 5037.39MB more disk space. You currently have 2592.00 MB available and it needs 7629.39 MB.
20.12.2016 15:48:03 | LHC@home 1.0 | Message from server: CMS Simulation needs 5037.39MB more disk space. You currently have 2592.00 MB available and it needs 7629.39 MB.
20.12.2016 15:48:03 | LHC@home 1.0 | Message from server: LHCb Simulation needs 5037.39MB more disk space. You currently have 2592.00 MB available and it needs 7629.39 MB.

You need enough free disk space (see above for why).
It looks like you activated all three subprojects on the CERN webpage.
I recommend that you uncheck CMS and LHCb and start with Theory only until your computer runs it stably.
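
The numbers in the server message are simple arithmetic: needed minus available gives the shortfall. A small Python sketch of that check follows. The helper names are hypothetical, and it assumes 1 MB = 1024 * 1024 bytes. Note that the "available" figure BOINC reports is limited by the disk usage settings in the BOINC preferences, so it can be lower than the physical free space this script measures.

```python
import shutil


def shortfall_mb(needed_mb, available_mb):
    """Extra space required, in MB; 0.0 if there is already enough."""
    return max(0.0, round(needed_mb - available_mb, 2))


def free_mb(path="."):
    """Physical free space on the volume holding `path`, in MB.

    Assumes 1 MB = 1024 * 1024 bytes.
    """
    return shutil.disk_usage(path).free / 1024**2


if __name__ == "__main__":
    # Figures taken from the server message quoted above:
    # 7629.39 MB needed, 2592.00 MB available.
    print("shortfall:", shortfall_mb(7629.39, 2592.00), "MB")
    print(f"free on this volume: {free_mb():.2f} MB")
```

If the physical free space is much larger than what BOINC says is available, the limit is probably in the BOINC disk preferences rather than on the drive itself.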
dr_mabuse
Joined: 30 Dec 05
Posts: 57
Credit: 816,635
RAC: 29
Message 28248 - Posted: 26 Dec 2016, 10:31:10 UTC - in response to Message 28164.  

Thanks. I got one simulation task to finish successfully:
Theory_19795_1482692034.820099_0
Work unit: 52948799
Created: 25 Dec 2016, 18:53:56 UTC
Sent: 25 Dec 2016, 19:52:02 UTC
Report deadline: 25 Jan 2017, 19:52:02 UTC
Received: 26 Dec 2016, 8:53:33 UTC
Server state: Completed
Run time: 12 hours 46 min 48 sec
CPU time: 5 hours 52 min 42 sec
Validate state: Valid
Credit: 320.98

Thanks for the help. I'll try some others.
Merry Christmas
Dr.Mabuse
William C Wilson
Joined: 11 Sep 08
Posts: 25
Credit: 384,225
RAC: 0
Message 28249 - Posted: 26 Dec 2016, 16:48:29 UTC

Need advice:
As far as I know, I do not have VMware running on my machine. I made an account, but I am at a loss as to what to download and install.

Could somebody please indicate which of the VMware products to download and install? I want to run those extra programs. Downloading to Brazil is slow, but I do not want to leave my computer idle or spend its time on projects that are not important to me.

Thank you

Bill in Brazil
William C Wilson
São Paulo Brazil
Jesse Viviano

Joined: 12 Feb 14
Posts: 72
Credit: 2,257,450
RAC: 3,831
Message 28251 - Posted: 26 Dec 2016, 16:56:24 UTC - in response to Message 28249.  
Last modified: 26 Dec 2016, 16:56:31 UTC

This project uses VirtualBox and not VMware. Download VirtualBox from https://www.virtualbox.org/.
William C Wilson
Joined: 11 Sep 08
Posts: 25
Credit: 384,225
RAC: 0
Message 28252 - Posted: 26 Dec 2016, 18:07:19 UTC - in response to Message 28251.  

Thank you for your QUICK response. I had figured it was not VMware and re-installed it again. Last time it was bombing out at around 11 minutes. Somewhere I saw advice to run only one core at a time, so I am trying that now. If it bombs again, I will re-install from your recommendation and try again.

I really, really appreciate your hint. I will post here again about what happens. I hope it will finish normally now. Thank you very much.
William C Wilson
São Paulo Brazil

©2022 CERN