Message boards : Number crunching : Decreasing subprojects within the past months

Erich56

Joined: 18 Dec 15
Posts: 1322
Credit: 24,370,335
RAC: 10,129
Message 37474 - Posted: 2 Dec 2018, 7:18:55 UTC

In recent months, most of the time only ATLAS tasks have been available and/or working well.

CMS is probably dead.
Same seems to be the case with LHCb.
Sixtrack sends tasks only once in a while, and within 2 or 3 days they are used up.

This situation is really bad for people whose machines are NOT able to crunch ATLAS, mostly for lack of sufficient RAM. For example, of the 5 machines I use for crunching LHC tasks, 3 are old notebooks with 4GB or 3GB of RAM and no possibility of adding memory.
This means that whenever only ATLAS is available, I can use only 2 PCs out of 5. And I bet that I am not the only one with this kind of problem.

In a way it's sad to see how things are going downhill with LHC :-(
maeax

Joined: 2 May 07
Posts: 1071
Credit: 36,371,556
RAC: 5,031
Message 37475 - Posted: 2 Dec 2018, 7:28:04 UTC

The LHC is now shutting down for a longer period.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 991
Credit: 6,426,879
RAC: 482
Message 37478 - Posted: 2 Dec 2018, 9:03:36 UTC - in response to Message 37474.  

You forgot to mention the Theory application. Even when the LHC is under construction, Theory tasks are almost always available.
A 1-core task needs only 730MB of RAM, rising to 1030MB for a 4-core task. On a 32-bit OS a task can even run with 256MB; I think the default setup uses 384MB.
Yeah, I know it sometimes happens that a Theory VM is not fed with jobs, especially when there are Sixtrack tasks in the pipeline,
but then you have the Sixtrack tasks.
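The memory figures quoted above can be turned into a rough check of how many cores a low-RAM machine could give to a Theory task. This is only a sketch: the linear interpolation between the 1-core and 4-core figures is an assumption (only those two values are quoted), and the function names are made up for illustration.

```python
# Figures quoted in the post above (MB of RAM per Theory task).
ONE_CORE_MB = 730
FOUR_CORE_MB = 1030


def estimated_theory_ram_mb(cores):
    """Estimate RAM for a Theory task with 1 to 4 cores.

    ASSUMPTION: memory grows linearly between the two quoted figures.
    """
    if not 1 <= cores <= 4:
        raise ValueError("only 1 to 4 cores are covered by the quoted figures")
    per_extra_core = (FOUR_CORE_MB - ONE_CORE_MB) / 3
    return ONE_CORE_MB + per_extra_core * (cores - 1)


def max_cores_for(available_mb):
    """Largest core count (0 to 4) whose estimate fits in available_mb."""
    fitting = [c for c in range(1, 5) if estimated_theory_ram_mb(c) <= available_mb]
    return max(fitting, default=0)


if __name__ == "__main__":
    for cores in range(1, 5):
        print(cores, "core(s):", estimated_theory_ram_mb(cores), "MB")
```

The value passed to `max_cores_for` should be the RAM that is actually free for the task, not the total installed in the machine.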
Erich56

Joined: 18 Dec 15
Posts: 1322
Credit: 24,370,335
RAC: 10,129
Message 37484 - Posted: 2 Dec 2018, 17:18:03 UTC - in response to Message 37478.  

You forgot to mention the Theory application.
Yes, Crystal, you're right: I forgot to mention Theory. As for its memory usage, it is perfect for machines with low RAM.
However, as mentioned by me and also by others in the Theory thread, this subproject has had a lot of problems during the past weeks. Either there are no tasks (as at the very moment I am writing this post) or there are tasks and no jobs, ...
Almost every other day, these problems come up in one way or another. Theory, unfortunately, has become rather unreliable :-(
No idea why the LHC people are not able to solve these problems for good.
Erich56

Joined: 18 Dec 15
Posts: 1322
Credit: 24,370,335
RAC: 10,129
Message 37485 - Posted: 2 Dec 2018, 17:58:01 UTC - in response to Message 37484.  

... Either there are no tasks (like exactly at the moment I am writing this posting) or there are tasks and no jobs, ...
A few minutes after I wrote this post, new Theory tasks became available. However, after some 18 minutes, they errored out with 207 (0x000000CF) EXIT_NO_SUB_TASKS.
By now, all this is becoming rather frustrating :-(((
No idea what's going on at LHC@home.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 991
Credit: 6,426,879
RAC: 482
Message 37486 - Posted: 2 Dec 2018, 18:11:41 UTC
Last modified: 2 Dec 2018, 18:13:59 UTC

Lucky me ;)
10:02:17 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:02:38 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:02:55 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:03:34 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:51:12 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:01:02 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:05:05 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:05:59 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:46:01 +0100 2018-12-02 [INFO] New Job Starting in slot1
17:15:33 +0100 2018-12-02 [INFO] New Job Starting in slot1
17:24:07 +0100 2018-12-02 [INFO] New Job Starting in slot1
18:45:52 +0100 2018-12-02 [INFO] New Job Starting in slot1
At least emptying the task bucket when no sub-jobs are available is working.
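How irregularly jobs arrived can be read off the log excerpt above with a short script. This is a minimal sketch, assuming the lines keep the layout shown (time, UTC offset, date, then the message); the function names are made up for illustration and are not part of any LHC@home tool.

```python
from datetime import datetime

# The twelve log lines from the post above.
LOG = """\
10:02:17 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:02:38 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:02:55 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:03:34 +0100 2018-12-02 [INFO] New Job Starting in slot1
10:51:12 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:01:02 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:05:05 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:05:59 +0100 2018-12-02 [INFO] New Job Starting in slot1
12:46:01 +0100 2018-12-02 [INFO] New Job Starting in slot1
17:15:33 +0100 2018-12-02 [INFO] New Job Starting in slot1
17:24:07 +0100 2018-12-02 [INFO] New Job Starting in slot1
18:45:52 +0100 2018-12-02 [INFO] New Job Starting in slot1
"""


def job_start_times(log_text):
    """Return a datetime for every 'New Job Starting' line."""
    times = []
    for line in log_text.splitlines():
        if "New Job Starting" not in line:
            continue
        # The first three fields are: time, UTC offset, date.
        stamp = " ".join(line.split()[:3])
        times.append(datetime.strptime(stamp, "%H:%M:%S %z %Y-%m-%d"))
    return times


def gaps_in_minutes(times):
    """Minutes between each job start and the next one."""
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    starts = job_start_times(LOG)
    gaps = gaps_in_minutes(starts)
    print(len(starts), "job starts; longest gap:", round(max(gaps), 1), "minutes")
```

On these twelve lines, the longest gap is the one between 12:46:01 and 17:15:33, roughly four and a half hours without a new job.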
Erich56

Joined: 18 Dec 15
Posts: 1322
Credit: 24,370,335
RAC: 10,129
Message 37487 - Posted: 2 Dec 2018, 18:19:26 UTC - in response to Message 37486.  

At least emptying the task bucket when no sub-jobs are available is working.
I cannot confirm that. A short time ago, there were 8 tasks shown; now it's 65. So the bucket is filling up, instead of the other way round. And all downloaded tasks are still erroring out with the "no subtasks" notice :-(
Erich56

Joined: 18 Dec 15
Posts: 1322
Credit: 24,370,335
RAC: 10,129
Message 37491 - Posted: 3 Dec 2018, 7:06:40 UTC - in response to Message 37487.  

At least emptying the task bucket when no sub-jobs are available is working.
I cannot confirm that. A short time ago, there were 8 tasks shown; now it's 65. So the bucket is filling up, instead of the other way round. And all downloaded tasks are still erroring out with the "no subtasks" notice :-(
And it went on like this all night long. Quite a number of tasks were downloaded on my various machines; 1 or 2 succeeded, all the others failed.

So, Theory should be stopped until the LHC people finally find out what the problem is and repair it. The way it's been running for several weeks makes no sense at all. It's just a waste.

(Sorry if in the eyes of some colleagues I am again not polite enough; but the situation is rather annoying!)
maeax

Joined: 2 May 07
Posts: 1071
Credit: 36,371,556
RAC: 5,031
Message 37493 - Posted: 3 Dec 2018, 7:56:25 UTC - in response to Message 37491.  

So, Theory should be stopped until the LHC people finally find out what the problem is and repair it. The way it's been running for several weeks makes no sense at all. It's just a waste.

(Sorry if in the eyes of some colleagues I am again not polite enough; but the situation is rather annoying!)

Erich,
this is the simple answer. Have you ever worked in IT?
If it is the infrastructure, then it is not the PROJECT!!
So, give the team the time to find the solution.
It does not get better by posting a status update every day.



©2021 CERN