Message boards : LHCb Application : Low CPU usage

hoarfrost

Joined: 2 Jan 06
Posts: 5
Credit: 1,008,285
RAC: 0
Message 28063 - Posted: 5 Dec 2016, 21:53:14 UTC

Hello!

My computer is now processing SixTrack and LHCb tasks. The LHCb tasks are not using any CPU and many of them fail.
What am I doing wrong?

Link to my computer.

Thank you!
ID: 28063
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 564
Credit: 351,946,477
RAC: 183,746
Message 28064 - Posted: 5 Dec 2016, 22:24:03 UTC - in response to Message 28063.  

Nothing. These tasks are just in testing at the moment, so it's common for them to do very little.
ID: 28064
hoarfrost

Joined: 2 Jan 06
Posts: 5
Credit: 1,008,285
RAC: 0
Message 28071 - Posted: 6 Dec 2016, 19:41:15 UTC - in response to Message 28064.  

Now I see that the LHCb tasks are working and have completed ~28% in ~10 hours, but CPU consumption is ~0% for all tasks. Very strange behavior! :)
ID: 28071
hoarfrost

Joined: 2 Jan 06
Posts: 5
Credit: 1,008,285
RAC: 0
Message 28074 - Posted: 7 Dec 2016, 5:53:09 UTC - in response to Message 28071.  

Now the tasks consume CPU time, but they are still failing. :)
ID: 28074
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 564
Credit: 351,946,477
RAC: 183,746
Message 28076 - Posted: 7 Dec 2016, 7:00:21 UTC

LHCb has at least a 50% failure rate for me too. You can opt out of work for that project in your preferences if you don't want the tasks.
ID: 28076
hoarfrost

Joined: 2 Jan 06
Posts: 5
Credit: 1,008,285
RAC: 0
Message 28085 - Posted: 8 Dec 2016, 8:14:55 UTC - in response to Message 28076.  

LHCb has at least a 50% failure rate for me too. You can opt out of work for that project in your preferences if you don't want the tasks.

No, no! It's not a problem. Really helping science is more interesting! :)
ID: 28085
captainjack

Joined: 21 Jun 10
Posts: 25
Credit: 3,038,551
RAC: 4,163
Message 28087 - Posted: 8 Dec 2016, 21:07:27 UTC

Getting nothing but these error messages.

2016-12-08 15:01:50 (22444): Guest Log: [INFO] Job finished in slot1 with unknown exit code.


And no CPU usage.

Turning these off until I hear that they are working again.
ID: 28087
Cinzia

Joined: 3 Mar 16
Posts: 5
Credit: 157,749
RAC: 0
Message 28096 - Posted: 12 Dec 2016, 6:22:36 UTC

Hi all,

First of all, thanks for your contribution.
We had some issues with the LHCb tasks over the last few days, but they are now working again.
When you see a task terminate with no CPU usage or very little CPU usage, it means we have no simulation jobs waiting to be executed. We are working on improving this part.

Cheers
Cinzia
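
A rough way to spot the "no jobs available" case described above is to compare a task's CPU time with its elapsed (wall-clock) time. The sketch below is only an illustration: the function names and the 5% threshold are assumptions made for this example, not values defined by the project.

```python
def cpu_efficiency(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Return CPU time as a fraction of wall-clock time (0.0 if nothing elapsed)."""
    if elapsed_seconds <= 0:
        return 0.0
    return cpu_seconds / elapsed_seconds


def looks_idle(cpu_seconds: float, elapsed_seconds: float,
               threshold: float = 0.05) -> bool:
    """Flag a task whose CPU time is a tiny share of its run time.

    The 5% default threshold is an arbitrary assumption for illustration.
    """
    return cpu_efficiency(cpu_seconds, elapsed_seconds) < threshold


# Hypothetical numbers: 10 s of CPU in a 600 s run is flagged,
# 550 s of CPU in the same run is not.
print(looks_idle(10, 600))
print(looks_idle(550, 600))
```

The CPU time and elapsed time for each task are the values shown on the task's result page.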
ID: 28096
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 564
Credit: 351,946,477
RAC: 183,746
Message 28103 - Posted: 13 Dec 2016, 18:48:05 UTC

Can you make them quit if there is no work? Then we could get a task from another part of the project and do some work on Theory, etc.
ID: 28103
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 323
Credit: 237,904
RAC: 376
Message 28104 - Posted: 13 Dec 2016, 19:46:21 UTC - in response to Message 28103.  

Yes, we hope to add that functionality within the next few days.
ID: 28104
Sunny129

Joined: 12 Dec 05
Posts: 31
Credit: 9,709,398
RAC: 0
Message 28105 - Posted: 13 Dec 2016, 19:59:22 UTC - in response to Message 28076.  

LHCb has at least a 50% failure rate for me too.

I have to wonder why this is the case for you when, for me, the failure rate is 100%. I mean if most (virtually all) LHCb errors are due to the fact that there are no simulation jobs waiting to be executed at that particular moment in time (and the problem therefore truly is a server-side issue, not a host issue), then everyone should be experiencing almost identical failure rates regardless of the sizes of their DC farms...and yet that's not the case.
ID: 28105
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 564
Credit: 351,946,477
RAC: 183,746
Message 28107 - Posted: 13 Dec 2016, 21:45:25 UTC - in response to Message 28105.  
Last modified: 13 Dec 2016, 21:47:30 UTC

The last few days have been good for LHCb; my failure rate wasn't so bad.

I agree with you that it's strange the failure rates are so different. Maybe there are different problems for different people?

Looking over a few of yours, they are the same as mine: 10 min of waiting for the Condor ping, then a failure. Your 4970K isn't doing too badly failure-wise.
ID: 28107
William C Wilson

Joined: 11 Sep 08
Posts: 25
Credit: 384,225
RAC: 0
Message 28277 - Posted: 28 Dec 2016, 14:49:31 UTC

I have been running LHCb simulations for only the last few days. I was getting failures after about 4 to 5 minutes, all related to VirtualBox, which was not installed properly. I installed Oracle VM VirtualBox 5.1.12 and its extension. By trial and error, I now bring the computer up, start the VM manually, and start "execution". After that I manually start BOINC.

For the first test I ran only 1 core (i4790) on Windows 10 Pro (Insider Preview Build 14986), and it went to completion. Next I ran 6 cores of simulation, with strange results: CPU usage was always relatively low (less than 78%), with memory pushing 92% of 24 GB.

When I run 7 SETI or SixTrack applications, it always pushes the CPU to 98% at 4.3 GHz.

Disk usage was very high for LHCb, caused by the VM process. It never approached the system maximum of 480 mega BYTES/second, but the rate was extremely high, around 300 Mb/sec. After 10 minutes, all 6 instances failed with computational errors at once.

I am now running 1 instance of the LHCb simulation with 6 instances of SETI@home, and the CPU reaches a total of 90% at only 3.93 GHz. Disk usage is around 3 to 8% of capacity. If I pause LHCb and add a 7th SETI instance, the CPU goes to 98%, the throttle limit I set for BOINC.

I will try 4 instances of LHCb next, with 3 SETI instances. I do not want it to seem that something is wrong with the LHCb application or the data being crunched, but there seem to be some limits caused by the VM. It is just strange that it does not "grab" all the CPU cycles it can.

Any comments or suggestions will be appreciated. I do not want to screw up the processing, but I want to take maximum advantage of my machine when it is on. Thanks. Bill in Brazil
William C Wilson
São Paulo Brazil
ID: 28277
tullio

Joined: 19 Feb 08
Posts: 594
Credit: 3,699,506
RAC: 4,201
Message 28288 - Posted: 29 Dec 2016, 0:13:01 UTC

LHCb is using 110% of my 2 core Opteron 1210 on Leap Linux 42.2. But is it multicore? I am using the "top" command.
Tullio
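
For reading a figure like the 110% above: in its default mode on Linux, `top` reports a process's %CPU relative to a single core, so a value over 100% means the process used more than one core's worth of CPU time during the sampling interval. A minimal sketch of the conversion to a share of the whole machine, assuming that default mode (the function name is made up for this example):

```python
def share_of_machine(top_percent: float, cores: int) -> float:
    """Convert a per-core %CPU figure from top into a percentage of total capacity."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return top_percent / cores


# 110% on a 2-core CPU is 55% of the machine's total CPU capacity.
print(share_of_machine(110.0, 2))  # 55.0
```

A reading above 100% only shows that more than one thread was busy at times; on its own it does not say whether the application inside the VM is multicore.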
ID: 28288
tullio

Joined: 19 Feb 08
Posts: 594
Credit: 3,699,506
RAC: 4,201
Message 28873 - Posted: 14 Feb 2017, 17:56:19 UTC
Last modified: 14 Feb 2017, 17:56:53 UTC

My Windows 10 CPU tasks are not using any CPU while those on my Linux box take about 9% CPU.
Tullio
ID: 28873
tullio

Joined: 19 Feb 08
Posts: 594
Credit: 3,699,506
RAC: 4,201
Message 28884 - Posted: 15 Feb 2017, 8:08:32 UTC

All LHCb tasks fail with a computation error on both the Windows 10 PC and the Linux box.
Tullio
ID: 28884
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 564
Credit: 351,946,477
RAC: 183,746
Message 29385 - Posted: 18 Mar 2017, 12:34:20 UTC

What is the status of this? At the moment the logs look like this:

2017-03-18 09:30:21 (2620): Guest Log: [INFO] New Job Starting in slot1
2017-03-18 09:30:21 (2620): Guest Log: [INFO] Condor JobID: 14906.19 in slot1
2017-03-18 09:30:21 (2620): Guest Log: [INFO] Starting pilot in slot1
2017-03-18 09:49:12 (2620): Guest Log: [INFO] Job finished in slot1 with .
2017-03-18 09:50:32 (2620): Guest Log: [INFO] New Job Starting in slot1
2017-03-18 09:50:32 (2620): Guest Log: [INFO] Condor JobID: 14908.30 in slot1
2017-03-18 09:50:42 (2620): Guest Log: [INFO] Starting pilot in slot1
2017-03-18 10:09:33 (2620): Guest Log: [INFO] Job finished in slot1 with .

Overall, the CPU usage is low.
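
To see how long each job in a log like the one above actually ran, the timestamps can be paired up. This is a minimal sketch that assumes the line format is exactly as quoted in this post; the regular expression and function name are written for this example and are not part of BOINC or the vboxwrapper.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2017-03-18 09:30:21 (2620): Guest Log: [INFO] New Job Starting in slot1
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): "
    r"Guest Log: \[INFO\] (?P<msg>.*)$"
)


def job_durations(lines):
    """Return seconds between each 'New Job Starting' and the next 'Job finished'."""
    durations = []
    started = None
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not an [INFO] guest log line
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        msg = match.group("msg")
        if msg.startswith("New Job Starting"):
            started = ts
        elif msg.startswith("Job finished") and started is not None:
            durations.append(int((ts - started).total_seconds()))
            started = None
    return durations


sample = [
    "2017-03-18 09:30:21 (2620): Guest Log: [INFO] New Job Starting in slot1",
    "2017-03-18 09:30:21 (2620): Guest Log: [INFO] Condor JobID: 14906.19 in slot1",
    "2017-03-18 09:30:21 (2620): Guest Log: [INFO] Starting pilot in slot1",
    "2017-03-18 09:49:12 (2620): Guest Log: [INFO] Job finished in slot1 with .",
    "2017-03-18 09:50:32 (2620): Guest Log: [INFO] New Job Starting in slot1",
    "2017-03-18 09:50:32 (2620): Guest Log: [INFO] Condor JobID: 14908.30 in slot1",
    "2017-03-18 09:50:42 (2620): Guest Log: [INFO] Starting pilot in slot1",
    "2017-03-18 10:09:33 (2620): Guest Log: [INFO] Job finished in slot1 with .",
]

print(job_durations(sample))  # [1131, 1141]
```

Both jobs in the quoted log lasted about 19 minutes of wall-clock time.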
ID: 29385
tullio

Joined: 19 Feb 08
Posts: 594
Credit: 3,699,506
RAC: 4,201
Message 29407 - Posted: 19 Mar 2017, 8:36:16 UTC

Two tasks failed on my Windows 10 PC with EXIT_INIT_FAILURE. In the Task Manager, CPU usage was close to zero.
Tullio
ID: 29407
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 29537 - Posted: 22 Mar 2017, 20:45:31 UTC - in response to Message 29407.  
Last modified: 22 Mar 2017, 21:24:35 UTC

To fix the issue you have on your Windows 10 machine, where the CPU time is too short:
I experienced the same trouble with my host.
--------------------------------------------------------------------------------
Can you try opening VirtualBox and deleting all the faulty VMs listed in its left panel? These are the ones that appear with a red circle (meaning VirtualBox cannot reach the VM). Be careful to delete only the ones with a red circle, not the blue ones.
These VMs prevent the next LHCb work units from running correctly.
--------------------------------------------------------------------------------
If nothing appears in the left panel (this sometimes happens), go to your slot directories (C:\ProgramData\BOINC\slots). Enter each slot where a VM is running, right-click the file ending in .vbox to open it with VirtualBox, and repeat the steps above.
For instance, for me it was: C:\ProgramData\BOINC\slots\4\boinc_1ceae01e8526c2b2\boinc_1ceae01e8526c2b2.vbox (right-click and open with VirtualBox, then right-click and delete the faulty VMs).
Once you have cleaned up all your slots using this method, the next WUs should normally run correctly.
This trouble seems to occur rather often. When you notice that the CPU time is too low compared to the elapsed time, it's a good habit to do this cleanup, to avoid wasting electricity with no scientific results.
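
Rather than browsing each slot by hand, the .vbox files can be listed with a short script. This is only a sketch: it assumes the default Windows data directory quoted above, and the function name is made up for this example. It only lists files and does not open, change or delete anything.

```python
from pathlib import Path


def find_vbox_files(slots_dir):
    """Return every *.vbox file found anywhere below the given slots directory."""
    root = Path(slots_dir)
    if not root.is_dir():
        return []  # directory missing, e.g. BOINC data stored elsewhere
    return sorted(root.rglob("*.vbox"))


if __name__ == "__main__":
    # Assumed default location; adjust if your BOINC data directory differs.
    for path in find_vbox_files(r"C:\ProgramData\BOINC\slots"):
        print(path)
```

Each path printed can then be opened with VirtualBox as described above.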
ID: 29537
tullio

Joined: 19 Feb 08
Posts: 594
Credit: 3,699,506
RAC: 4,201
Message 29593 - Posted: 24 Mar 2017, 16:05:06 UTC - in response to Message 29537.  
Last modified: 24 Mar 2017, 16:27:07 UTC

My VirtualBox Manager shows no VMs. I am running SETI@Home Beta on this CPU and GPU since all LHC tasks fail. In an LHCb stderr.txt I noticed an Authentication failure. It seems that the server does not recognize me as a legitimate user.
On my other Linux box, where I have installed 64-bit SuSE Leap 42.2, two native Atlas tasks are running multicore, one at a time, with absolutely no problem with 4100 MB of RAM. The CPU is an AMD E-450, which I had mistakenly considered a 32-bit CPU. I have enabled AMD-V in its BIOS.
Tullio
On the Windows PC I am running Einstein@home tasks, both CPU and GPU, to take advantage of its GTX 1050 GPU board with its Pascal processor.
VirtualBox is 5.1.18 on all PCs.
ID: 29593


©2019 CERN