Message boards : ATLAS application : ATLAS vbox version 2.00
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1439
Credit: 74,001,548
RAC: 119,041
Message 40150 - Posted: 14 Oct 2019, 17:53:11 UTC - in response to Message 40148.  

From your logfile:
2019-10-14 18:06:37 (11056): Guest Log:     "exeErrorDiag": "Non-zero return code from EVNTtoHITS (65); Logfile error in log.EVNTtoHITS: \"Segmentation fault: Event counter: 19; Run: 284500; Evt: 3543749; Current algorithm: ISF_Kernel_FullG4; Current Function: unknown\"",

Segmentation fault
This is something David Cameron should investigate or forward to the developers.
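
For anyone who wants to pull the details out of such a line automatically, here is a minimal Python sketch. The function name and the regular expression are assumptions derived only from the single log line quoted above; they are not part of any ATLAS or BOINC tooling, and other error messages may well be formatted differently.

```python
import re

# The log line quoted above, kept verbatim (including the escaped quotes).
LINE = (
    r'2019-10-14 18:06:37 (11056): Guest Log:     "exeErrorDiag": '
    r'"Non-zero return code from EVNTtoHITS (65); Logfile error in log.EVNTtoHITS: '
    r'\"Segmentation fault: Event counter: 19; Run: 284500; Evt: 3543749; '
    r'Current algorithm: ISF_Kernel_FullG4; Current Function: unknown\"",'
)

# Hypothetical pattern: matches only messages shaped like the one above.
EXE_ERROR = re.compile(
    r'"exeErrorDiag":\s*"Non-zero return code from (?P<transform>\w+) \((?P<code>\d+)\)'
    r'.*?Event counter: (?P<event_counter>\d+); Run: (?P<run>\d+); Evt: (?P<evt>\d+); '
    r'Current algorithm: (?P<algorithm>[^;]+);'
)


def parse_exe_error_diag(line):
    """Return the crash details as a dict, or None if the line does not match."""
    match = EXE_ERROR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("code", "event_counter", "run", "evt"):
        fields[key] = int(fields[key])
    return fields


fields = parse_exe_error_diag(LINE)
print(fields)
```

Collecting the run and event numbers this way could make it easier to see whether the same event crashes on several hosts before reporting it.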
ID: 40150 · Report as offensive     Reply Quote
tullio

Joined: 19 Feb 08
Posts: 602
Credit: 3,745,066
RAC: 27
Message 40153 - Posted: 15 Oct 2019, 7:31:11 UTC

3 out of 4 produced HITS files. Another is running.
Tullio
ID: 40153 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 942
Credit: 6,295,475
RAC: 1,091
Message 40159 - Posted: 15 Oct 2019, 17:55:03 UTC

Why do I get a task running the previous ATLAS version instead of version 2.00?
It also re-downloaded the old ATLASM_2017_03_01.vdi.

Task: KNmMDmt2UevnsSi4apGgGQJmABFKDmABFKDmgPVNDmABFKDmL0ckcm_0 ----- 1.01 ATLAS Simulation (vbox64_mt_mcore_atlas)
ID: 40159 · Report as offensive     Reply Quote
Gunde

Joined: 9 Jan 15
Posts: 84
Credit: 333,747,286
RAC: 293,266
Message 40160 - Posted: 15 Oct 2019, 17:56:14 UTC
Last modified: 15 Oct 2019, 18:17:10 UTC

I got a few ATLAS Simulation v2.00 (vbox64_mt_mcore_atlas) x86_64-pc-linux-gnu tasks, but the server fell back to v1.01 on the last downloaded task.
Was there any issue with v2.00? Like others, I have a few long runners, up to 4 days now. Two of them stalled with a kernel panic and two with a reset adapter.

When I check the task list, it appears that many vbox v1.01 tasks ended as invalid/error on upload:

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>AqhLDmNm2dvnsSi4apGgGQJmABFKDmABFKDmRR5ZDmABFKDm4Ju39n_0_r644774494_ATLAS_result</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>


Was the data purged on the server, or is there an issue on the servers with receiving the result file?
The issue started yesterday and I see several more from today.
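
To count how many tasks are affected, the failure block can be extracted from the task output with a few lines of Python. This is only a sketch: the function name is made up, and it assumes every failure is reported as a `<file_xfer_error>` fragment shaped like the one pasted above. The surrounding text is not well-formed XML on its own (it starts with a stray closing tag), so only the fragment itself is handed to the XML parser.

```python
import re
import xml.etree.ElementTree as ET

# The snippet pasted above, kept verbatim.
SAMPLE = """</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>AqhLDmNm2dvnsSi4apGgGQJmABFKDmABFKDmRR5ZDmABFKDm4Ju39n_0_r644774494_ATLAS_result</file_name>
  <error_code>-161 (not found)</error_code>
</file_xfer_error>
</message>"""


def parse_upload_failures(text):
    """Return a list of dicts, one per <file_xfer_error> fragment in text."""
    failures = []
    fragments = re.findall(r"<file_xfer_error>.*?</file_xfer_error>", text, flags=re.DOTALL)
    for fragment in fragments:
        node = ET.fromstring(fragment)
        raw_code = (node.findtext("error_code") or "").strip()
        # Split "-161 (not found)" into the number and the description.
        match = re.match(r"(-?\d+)\s*(?:\((.*)\))?", raw_code)
        failures.append({
            "file_name": (node.findtext("file_name") or "").strip(),
            "error_code": int(match.group(1)) if match else None,
            "description": match.group(2) if match else None,
        })
    return failures


failures = parse_upload_failures(SAMPLE)
print(failures)
```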
ID: 40160 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1439
Credit: 74,001,548
RAC: 119,041
Message 40161 - Posted: 15 Oct 2019, 18:37:19 UTC - in response to Message 40159.  

... previous ATLAS version instead of version 2.00

It might be that BOINC has not been restarted on at least one of the servers since v2.00 was released.
ID: 40161 · Report as offensive     Reply Quote
Gunde

Joined: 9 Jan 15
Posts: 84
Credit: 333,747,286
RAC: 293,266
Message 40162 - Posted: 15 Oct 2019, 19:08:13 UTC

I have checked a few of my hosts and it looks like there is an issue getting work inside the vbox VMs. With the old vbox 1.01 application I could reach top, and I would see that only one Athena.py process was running.

I could not reach top from the console on the new 2.00 CentOS 7 image. Each session is stuck at the login prompt, and according to the system monitor the VMs stay at low usage. I would like to see the CPU and RAM usage and which processes are running, but that is not possible.
When I turn on the native application (2.72), the tasks fire up fine and start running after getting their data.
ID: 40162 · Report as offensive     Reply Quote
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 314
Credit: 10,283,970
RAC: 6,248
Message 40163 - Posted: 15 Oct 2019, 19:32:22 UTC - in response to Message 40161.  

I have now deprecated the 1.01 versions so only 2.00 should be sent out now.
ID: 40163 · Report as offensive     Reply Quote
[VENETO] boboviz

Joined: 7 May 08
Posts: 39
Credit: 218,356
RAC: 0
Message 40168 - Posted: 16 Oct 2019, 12:52:48 UTC

I'm returning to SixTrack while waiting for a more stable ATLAS app.
ID: 40168 · Report as offensive     Reply Quote
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 597
Credit: 371,400,451
RAC: 28,865
Message 40170 - Posted: 16 Oct 2019, 17:09:48 UTC

I got a few completed, which pushed my success rate for ATLAS tasks to 34%.
ID: 40170 · Report as offensive     Reply Quote
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 314
Credit: 10,283,970
RAC: 6,248
Message 40182 - Posted: 17 Oct 2019, 12:41:31 UTC - in response to Message 40162.  

I have checked a few of my hosts and it looks like there is an issue getting work inside the vbox VMs. With the old vbox 1.01 application I could reach top, and I would see that only one Athena.py process was running.

I could not reach top from the console on the new 2.00 CentOS 7 image. Each session is stuck at the login prompt, and according to the system monitor the VMs stay at low usage. I would like to see the CPU and RAM usage and which processes are running, but that is not possible.
When I turn on the native application (2.72), the tasks fire up fine and start running after getting their data.


With thanks (again) to computezrmle, we will now have a better "top" display on console 3 which avoids the annoying scrolling effects. This will become active for new WUs within a few hours.
ID: 40182 · Report as offensive     Reply Quote
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 22,460,783
RAC: 769
Message 40183 - Posted: 17 Oct 2019, 13:48:58 UTC - in response to Message 40182.  

Version 2 works fine with VirtualBox on my machines, but it would be nice if the VM console showed something at Alt-F2.

That's the best place to check that the WU is working.
ID: 40183 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 919
Credit: 33,710,713
RAC: 3,738
Message 40184 - Posted: 17 Oct 2019, 14:20:23 UTC - in response to Message 40183.  
Last modified: 17 Oct 2019, 14:27:47 UTC

This is the VirtualBox version in use on one of your PCs:
2019-10-17 07:40:37 (6520): Guest Log: BIOS: VirtualBox 5.1.26
Please upgrade to 6.0.x (with ExtPack).
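
The same check can be scripted against a saved stderr.txt. A minimal Python sketch, assuming the version always appears in a "Guest Log: BIOS: VirtualBox x.y.z" line like the one above; the function name is made up, and comparing against 6.0 only mirrors the suggestion in this post, not an official project requirement.

```python
import re

# Assumed format of the line that reports the VirtualBox version.
BIOS_LINE = re.compile(r"Guest Log: BIOS: VirtualBox (\d+)\.(\d+)\.(\d+)")


def virtualbox_version(stderr_text):
    """Return the VirtualBox version as a tuple of ints, or None if not found."""
    match = BIOS_LINE.search(stderr_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


sample = "2019-10-17 07:40:37 (6520): Guest Log: BIOS: VirtualBox 5.1.26"
version = virtualbox_version(sample)
# Tuples compare element by element, so (5, 1, 26) sorts before (6, 0, 0).
print(version, "older than 6.0:", version < (6, 0, 0))
```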
ID: 40184 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1439
Credit: 74,001,548
RAC: 119,041
Message 40185 - Posted: 17 Oct 2019, 15:10:38 UTC

The currently active monitoring script doesn't output anything at ALT-F2 until at least 1 event has finished.
As the current ATLAS batch sometimes needs up to 30-40 minutes per event, it may look like ALT-F2 has crashed.

Be patient.
I already sent David a suggestion for improved monitoring, but there are a few (hopefully minor) issues to solve before it can go live.
ID: 40185 · Report as offensive     Reply Quote
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 22,460,783
RAC: 769
Message 40186 - Posted: 17 Oct 2019, 18:03:56 UTC - in response to Message 40185.  

The currently active monitoring script doesn't output anything at ALT-F2 until at least 1 event has finished.
As the current ATLAS batch sometimes needs up to 30-40 minutes per event, it may look like ALT-F2 has crashed.

Be patient.
I already sent David a suggestion for improved monitoring, but there are a few (hopefully minor) issues to solve before it can go live.


I don't think it starts working after 1 event has finished, unless a single event takes the whole WU run time.

Normally I use this function only if the WU run time rises abnormally.

At the start of a WU I only look at the CPU utilisation % in BoincTasks for the first hour.

Perhaps it's a VirtualBox version problem; I think I have version 5.2.XX on all machines,
the version that comes with BOINC.
ID: 40186 · Report as offensive     Reply Quote
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 406
Credit: 96,116,916
RAC: 37
Message 40194 - Posted: 18 Oct 2019, 11:05:40 UTC - in response to Message 40184.  

Please upgrade to 6.0.x (with ExtPack).

Maybe I missed it, but I never saw a word from the project team that VirtualBox version 6.x is okay to use with ATLAS. Until then I will stay with 5.x.


Supporting BOINC, a great concept !
ID: 40194 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 919
Credit: 33,710,713
RAC: 3,738
Message 40195 - Posted: 18 Oct 2019, 11:32:00 UTC

Yes Yeti,
I have had one PC on 6.0.12 for a few days.
5.2.8 is the default on the BOINC web page. Support for 5.2.x runs until July 2020.
So I will also wait with the complete upgrade.
ID: 40195 · Report as offensive     Reply Quote
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 597
Credit: 371,400,451
RAC: 28,865
Message 40198 - Posted: 18 Oct 2019, 16:13:34 UTC

I don't think the project team lets us know; it's normally one of us that tries it.

I still use 5.1.x as it is more reliable for me. I will try out 6.0.x; it works well for the other subproject if I cap the maximum number of cores to less than 100%. I have to do some testing to find the breaking point on my other computer, as I have always let them run at 100%.
ID: 40198 · Report as offensive     Reply Quote
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 406
Credit: 96,116,916
RAC: 37
Message 40199 - Posted: 18 Oct 2019, 16:19:35 UTC - in response to Message 40198.  

I don't think the project team lets us know; it's normally one of us that tries it.

Hm, not really a good idea. In the past, the ATLAS wrapper had some peculiarities, so it was not a good idea to switch to a new major release of VirtualBox without the okay from the project team. Sometimes they had to make special preparations for the wrapper ...


Supporting BOINC, a great concept !
ID: 40199 · Report as offensive     Reply Quote
Jonathan

Joined: 25 Sep 17
Posts: 52
Credit: 1,626,567
RAC: 33
Message 40201 - Posted: 18 Oct 2019, 22:04:06 UTC - in response to Message 40199.  
Last modified: 18 Oct 2019, 22:04:22 UTC

I'm running VirtualBox 6.0.12 and I think I have a stuck work unit. I don't see any output on Alt+F2 or Alt+F3 when I check. This task has been running for a day or so, and I should at least get the top info. I have given it a few minutes to update the screens, too. Windows Task Manager shows about 11% CPU usage, so the VM is doing something.
Is there anything else I can check to see whether it is stuck?

From Properties window in Boinc

Application: ATLAS Simulation 2.00 (vbox64_mt_mcore_atlas)
Name: YzILDmjJ6evnsSi4apGgGQJmABFKDmABFKDmwrETDmABFKDmuXz3im
State: Running
Received: 10/17/2019 3:58:54 PM
Report deadline: 10/24/2019 3:58:53 PM
Resources: 8 CPUs
Estimated computation size: 43,200 GFLOPs
CPU time: 1d 14:09:14
CPU time since checkpoint: 00:01:31
Elapsed time: 1d 00:49:30
Estimated time remaining: 00:01:32
Fraction done: 99.896%
Virtual memory size: 149.73 MB
Working set size: 9.96 GB
Directory: slots/0
Process ID: 16176
Progress rate: 3.960% per hour
Executable: vboxwrapper_26198ab7_windows_x86_64.exe
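
One number that can be worked out from these properties is the average CPU efficiency: CPU time divided by elapsed time times the number of CPUs. The Python sketch below does that arithmetic for the values above. The helper names are made up, and reading a low value as a sign of a stuck task is only an assumption, not a project rule.

```python
import re


def parse_boinc_duration(text):
    """Convert a duration such as '1d 14:09:14' or '00:01:31' to seconds."""
    match = re.fullmatch(r"(?:(\d+)d\s+)?(\d+):(\d{2}):(\d{2})", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    # The day part is optional, so treat a missing group as zero.
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def cpu_efficiency(cpu_time, elapsed_time, n_cpus):
    """Fraction of the allocated CPU time that was actually used."""
    used = parse_boinc_duration(cpu_time)
    available = parse_boinc_duration(elapsed_time) * n_cpus
    return used / available


# Values taken from the properties listed above.
efficiency = cpu_efficiency("1d 14:09:14", "1d 00:49:30", 8)
print(f"average CPU efficiency: {efficiency:.1%}")
```

For this task the result is roughly 19% of the 8 allocated cores over the whole run.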
ID: 40201 · Report as offensive     Reply Quote
Jonathan

Joined: 25 Sep 17
Posts: 52
Credit: 1,626,567
RAC: 33
Message 40203 - Posted: 19 Oct 2019, 5:46:40 UTC - in response to Message 40201.  

I took a look at stderr.txt and it appears I am missing the line showing the ATLAS job starting, which all my successful, completed tasks have.
After "2019-10-17 16:03:29 (16176): Guest Log: ATHENA_PROC_NUMBER=8"
I don't see something like this: "Guest Log: *** Starting ATLAS job."
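
This check can also be done with a small script instead of by eye. A minimal Python sketch, assuming the two marker strings are exactly as quoted above (the start marker is "something like this", so it may differ between versions). The function name and the second sample line, including its timestamp, are made up for illustration.

```python
def atlas_job_started(stderr_text,
                      setup_marker="Guest Log: ATHENA_PROC_NUMBER=",
                      start_marker="Guest Log: *** Starting ATLAS job"):
    """Return True if the start marker appears after the setup marker."""
    setup_pos = stderr_text.find(setup_marker)
    if setup_pos == -1:
        # The VM has not even reached the setup line yet.
        return False
    return stderr_text.find(start_marker, setup_pos) != -1


# First line is quoted from the stuck task; the second is an invented example
# of what a healthy task might log next.
stuck = "2019-10-17 16:03:29 (16176): Guest Log: ATHENA_PROC_NUMBER=8\n"
healthy = stuck + "2019-10-17 16:03:30 (16176): Guest Log: *** Starting ATLAS job.\n"

print("stuck task started:", atlas_job_started(stuck))
print("healthy task started:", atlas_job_started(healthy))
```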

I set no new tasks, aborted all other work units and left the stuck task running.

Task link
https://lhcathome.cern.ch/lhcathome/result.php?resultid=249135043

Work unit link
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=125127389
ID: 40203 · Report as offensive     Reply Quote


©2020 CERN