Message boards : Theory Application : New version 300.00
maeax

Joined: 2 May 07
Posts: 928
Credit: 33,756,172
RAC: 2,611
Message 40478 - Posted: 16 Nov 2019, 13:27:11 UTC

It's not possible to ping sft.cern.ch. Seems to be down or unreachable.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1443
Credit: 76,491,739
RAC: 100,189
Message 40481 - Posted: 16 Nov 2019, 18:26:09 UTC - in response to Message 40478.  

> It's not possible to ping sft.cern.ch. Seems to be down or unreachable.

It's not possible because it's not a server name.
Hence it doesn't have a DNS entry.

Instead it's the name of a CVMFS repository.
Status of the stratum 1 servers can be checked here:
http://cernvm-monitor.cern.ch/cvmfs-monitor/sft.cern.ch/

Be aware that stratum 1 servers should not (and really must not) be used directly by LHC@home volunteers.
Instead it's recommended to use their openhtc.io counterparts.
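
A minimal Python sketch of the distinction, assuming all you want to know is whether a name resolves in DNS at all. The helper name `has_dns_entry` is hypothetical, and the resolver is passed in as a parameter so the check can be tried without network access:

```python
import socket


def has_dns_entry(name, resolver=socket.getaddrinfo):
    """Return True if `name` resolves to at least one address.

    `resolver` defaults to the standard library lookup; it is injectable
    only so the function can be exercised offline (an assumption of this
    sketch, not something from the thread).
    """
    try:
        return len(resolver(name, None)) > 0
    except socket.gaierror:
        # The lookup failed: no usable DNS entry for this name.
        return False
```

If the repository name has no DNS entry, as described above, `has_dns_entry("sft.cern.ch")` would be expected to return False, while the host names of the actual stratum 1 or openhtc.io servers would resolve.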
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,027,000
RAC: 0
Message 40483 - Posted: 17 Nov 2019, 8:39:37 UTC - in response to Message 40395.  
Last modified: 17 Nov 2019, 8:40:03 UTC

Jobs will last on average 2 hours rather than 12.
Very long task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=251991795
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 951
Credit: 6,323,990
RAC: 1,942
Message 40486 - Posted: 17 Nov 2019, 11:07:51 UTC - in response to Message 40483.  
Last modified: 17 Nov 2019, 11:10:20 UTC

> Jobs will last on average 2 hours rather than 12.
> Very long task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=251991795
The longest known Theory task of batch 2279 lasted 376 hours and 55 minutes, the second longest 236.5 hours ;)
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,027,000
RAC: 0
Message 40488 - Posted: 17 Nov 2019, 12:55:57 UTC - in response to Message 40486.  

> The longest known Theory task of batch 2279 lasted 376 hours and 55 minutes, the second longest 236.5 hours ;)
Not so long then.
Anyway that host avg time is about 1.6 hours, so 18 hours is quite a bit. :)

What are the host's average time and the Theory app version for that task of almost 377 hours?
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 951
Credit: 6,323,990
RAC: 1,942
Message 40489 - Posted: 17 Nov 2019, 14:13:18 UTC - in response to Message 40488.  

> What are the host's average time and the Theory app version for that task of almost 377 hours?
That info comes from MC Production -> http://mcplots-dev.cern.ch/production.php?view=revision&rev=2279
There is no host info available. As you can see, the two jobs I mentioned are real extreme outliers.
When you click on the graph, you'll see the number of jobs per 5-minute interval.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 951
Credit: 6,323,990
RAC: 1,942
Message 40493 - Posted: 18 Nov 2019, 8:58:56 UTC - in response to Message 40429.  

Testing Theory vbox32:
This task finished within a few seconds.
2019-11-18 09:31:49 (6420): Guest Log: [INFO] ===> [runRivet] Mon Nov 18 09:31:38 CET 2019 [boinc pp jets 7000 25,-,760 - herwig7 7.1.3 default 100000 186]
2019-11-18 09:32:33 (6420): Guest Log: [INFO] Preparing output.
2019-11-18 09:32:33 (6420): Guest Log: [INFO] Job Finished

Maybe the base memory of 320MB for Theory vbox32 is too low for herwig7.
Pythia jobs are running fine. I increased the memory requirement to 384MB while waiting for future herwig jobs.
For BOINC you are reserving 700,000,000 bytes of memory by setting this in rsc_memory_bound. Maybe you could bring this in line with the memory that is really needed.
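
To put the two figures on the same scale, here is a small Python sketch of the unit conversion. It assumes the VirtualBox "MB" values are binary megabytes (MiB, 1024 * 1024 bytes), which is an assumption of this sketch; the helper names are made up for illustration:

```python
MIB = 1024 * 1024  # bytes per MiB (assumed meaning of "MB" for VM base memory)


def bytes_to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


def mib_to_bytes(n_mib):
    """Convert MiB to a byte count."""
    return n_mib * MIB


reserved = 700_000_000  # the rsc_memory_bound value quoted above
print(f"rsc_memory_bound: {bytes_to_mib(reserved):.1f} MiB")
for base in (320, 384):
    print(f"{base} MiB base memory = {mib_to_bytes(base):,} bytes")
```

Under that assumption the reservation is about 668 MiB, while a 384 MiB VM corresponds to 402,653,184 bytes, so the reserved amount is well above the VM's base memory.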

Could you also have a look at my remarks mentioned here?
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,027,000
RAC: 0
Message 40496 - Posted: 18 Nov 2019, 9:09:38 UTC - in response to Message 40489.  
Last modified: 18 Nov 2019, 9:13:58 UTC

Oh, I didn't notice that the graph is clickable.

So many bins with njobs=1 suggest to me that there are either many uncommon jobs or long runtimes from a few slow hosts, so the 5-minute bins are too narrow.
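
A quick Python sketch of the binning effect, using made-up runtimes (not data from the MC Production page). The function name and the 60-minute alternative bin width are illustrative assumptions only:

```python
from collections import Counter


def histogram(runtimes_min, bin_width_min):
    """Count jobs per bin; each key is the start of a bin in minutes."""
    counts = Counter(
        (int(t) // bin_width_min) * bin_width_min for t in runtimes_min
    )
    return dict(sorted(counts.items()))


# Hypothetical runtimes in minutes; 22615 min equals 376 h 55 min.
sample = [62, 64, 118, 121, 950, 1310, 22615]

narrow = histogram(sample, 5)   # 5-minute bins: the tail is all single-job bins
wide = histogram(sample, 60)    # 60-minute bins: short jobs merge into fewer bins
print(narrow)
print(wide)
```

With narrow bins every long-running job lands in its own bin, which is what a long tail of njobs=1 entries looks like; wider bins only help where runtimes are close together.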
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 951
Credit: 6,323,990
RAC: 1,942
Message 40505 - Posted: 18 Nov 2019, 15:28:04 UTC - in response to Message 40493.  

> Maybe the base memory of 320MB for Theory vbox32 is too low for herwig7.
> Pythia jobs are running fine. I increased the memory requirement to 384MB while waiting for future herwig jobs.
Herwig++ is running OK with 384MB base memory. The lowest free memory seen was 6MB, with only 3MB of swap used.
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 939
Credit: 40,262,835
RAC: 20,032
Message 40511 - Posted: 19 Nov 2019, 3:33:58 UTC
Last modified: 19 Nov 2019, 3:49:17 UTC

Yeah, we don't get much RAM on a Windows x86 system no matter how much you plug in (<4GB).

I finally got to start my x86 Theory 300.02 task (I had just run 3 SixTrack tasks that took almost 100 hours each).

This Theory task is now at 1% after 1 hour so far... It says remaining time is 4 days 4 min 30 sec.
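
For comparison, a naive linear extrapolation can be sketched in Python. This assumes progress is proportional to elapsed time, which is not necessarily how the BOINC client computes its own estimate; `remaining_hours` is a hypothetical helper:

```python
def remaining_hours(fraction_done, elapsed_hours):
    """Linear estimate of the time left, given the fraction completed so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1 - fraction_done) / fraction_done


left = remaining_hours(0.01, 1.0)  # 1% done after 1 hour
days, hours = divmod(round(left), 24)
print(f"about {days} days {hours} hours left")
```

At 1% after 1 hour this gives roughly 99 hours, a little over 4 days, which is in the same range as the estimate shown by the client.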
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 951
Credit: 6,323,990
RAC: 1,942
Message 40515 - Posted: 19 Nov 2019, 8:23:48 UTC - in response to Message 40429.  

> Remarks:
> - Console ALT-F2 only shows: Running job output should appear here - no events shown, although I see with ALT-F3 that agile-runmc and rivetvm.exe are running
I got Console ALT-F2 (vbox32) working so that it shows the progress of event processing.
It should really be fixed in some other way, but this is what I did as a workaround:

Suspended a task with LAIM ("Leave applications in memory") off.
Discarded the saved state with VirtualBox Manager.
Started the VM outside of BOINC with VirtualBox Manager.
Waited until pythia, herwig, etc. had started in 'top'.
Switched to ALT-F2, and it worked.
Saved the machine state and resumed the task in BOINC.
Console ALT-F2 now also works via the Remote Display Port.
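
The steps above were done in the VirtualBox Manager GUI. As a hedged sketch only, the VirtualBox-side steps might map onto the `VBoxManage` command line as follows. The VM name is a placeholder, the helper name is hypothetical, and the commands are only printed, because the waiting and the ALT-F2 switch in between are manual:

```python
def console_fix_commands(vm_name):
    """Build VBoxManage command lines for the VirtualBox-side steps.

    `vm_name` is a placeholder; use the name or UUID that VirtualBox
    shows for the suspended BOINC task's VM.
    """
    return [
        ["VBoxManage", "discardstate", vm_name],              # discard the saved state
        ["VBoxManage", "startvm", vm_name, "--type", "gui"],  # start the VM outside BOINC
        # Manual step here: wait for the generator to appear in 'top',
        # then switch the console to ALT-F2.
        ["VBoxManage", "controlvm", vm_name, "savestate"],    # save the machine state
    ]


for cmd in console_fix_commands("my-theory-vm"):
    print(" ".join(cmd))
```

Suspending and resuming the task still happen in BOINC itself, before and after these commands.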

btw: I increased the Base Memory to 512MB. I also had a Pythia8 task finish within a few seconds without doing real work: https://lhcathome.cern.ch/lhcathome/result.php?resultid=252193424
Erich56

Joined: 18 Dec 15
Posts: 1283
Credit: 23,078,312
RAC: 2,804
Message 40583 - Posted: 22 Nov 2019, 7:46:20 UTC

Although more than 600 unsent tasks are shown on the Server Status page, my host received only 1 task, and since then it keeps saying "no tasks available for Theory simulation".
Why is that?
maeax

Joined: 2 May 07
Posts: 928
Credit: 33,756,172
RAC: 2,611
Message 40584 - Posted: 22 Nov 2019, 7:50:01 UTC - in response to Message 40583.  

Laurence is searching for the cause of this faux pas.
If you have an idea...
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 951
Credit: 6,323,990
RAC: 1,942
Message 40591 - Posted: 22 Nov 2019, 14:07:57 UTC - in response to Message 40583.  

> Although more than 600 unsent tasks are shown on the Server Status page, my host received only 1 task, and since then it keeps saying "no tasks available for Theory simulation".
> Why is that?
You probably have a limit of 1 for # CPUs. Set 'No Limit' and you will get as many tasks as you have cores, or as many as your # of jobs setting allows if that is lower.
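
The rule of thumb stated above can be written down as a small Python sketch. It models only the rule as described in this post, not the actual scheduler logic; the function and parameter names are hypothetical, and `None` stands for 'No Limit':

```python
def expected_tasks(cores, max_jobs=None, max_cpus=None):
    """Number of tasks a host would get under the rule of thumb above.

    None means 'No Limit' for the corresponding preference.
    """
    usable = cores if max_cpus is None else min(cores, max_cpus)
    return usable if max_jobs is None else min(usable, max_jobs)


print(expected_tasks(8))               # No Limit / No Limit on an 8-core host
print(expected_tasks(8, max_jobs=4))   # jobs limit lower than the core count
print(expected_tasks(8, max_cpus=1))   # CPU limit of 1
```

Under this model a CPU limit of 1 caps the host at a single task regardless of the jobs setting.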
Erich56

Joined: 18 Dec 15
Posts: 1283
Credit: 23,078,312
RAC: 2,804
Message 40594 - Posted: 22 Nov 2019, 16:33:22 UTC - in response to Message 40591.  

> You probably have a limit of 1 for # CPUs. Set 'No Limit' and you will get as many tasks as you have cores, or as many as your # of jobs setting allows if that is lower.
Yes, I did set the limit to "1 CPU", since the new version of Theory now has single-core tasks (as opposed to the multicore tasks we had until recently).
But I set this limit a few days ago, and I could still download more than 1 task (as long as my setting for # of tasks was/is >1).

But thanks anyway for the hint, I'll try it.
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,027,000
RAC: 0
Message 40775 - Posted: 3 Dec 2019, 23:41:56 UTC

I have 3 sherpa jobs (not native).

1) CPU time 74:10:23, Elapsed time 72:59:24
2) CPU time 72:45:38, Elapsed time 71:43:31
3) CPU time 05:12:12, Elapsed time 42:06:48

Should I abort all of them?
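
One way to compare the three jobs is the ratio of CPU time to elapsed time. A minimal Python sketch, using the figures listed above (the helper names are made up for illustration):

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' string (hours may exceed 24) to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def cpu_efficiency(cpu, elapsed):
    """CPU time divided by elapsed (wall-clock) time."""
    return to_seconds(cpu) / to_seconds(elapsed)


jobs = [
    ("74:10:23", "72:59:24"),
    ("72:45:38", "71:43:31"),
    ("05:12:12", "42:06:48"),
]
for number, (cpu, elapsed) in enumerate(jobs, start=1):
    print(number, f"{cpu_efficiency(cpu, elapsed):.2f}")
```

Jobs 1 and 2 come out at roughly 1.0, meaning they have kept a core busy the whole time, while job 3 is at about 0.12, meaning it has been mostly idle. The ratio alone does not say whether a job will finish successfully.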
Erich56

Joined: 18 Dec 15
Posts: 1283
Credit: 23,078,312
RAC: 2,804
Message 40776 - Posted: 4 Dec 2019, 5:47:51 UTC - in response to Message 40775.  

I had a similar situation recently; in the end they somehow failed, and I got no validation.
So I would abort them.
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,027,000
RAC: 0
Message 40779 - Posted: 4 Dec 2019, 8:56:31 UTC - in response to Message 40776.  

Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 251
Credit: 11,223,587
RAC: 6
Message 40780 - Posted: 4 Dec 2019, 8:58:15 UTC - in response to Message 40775.  
Last modified: 4 Dec 2019, 9:06:52 UTC

Have a look in "Show graphics" or "Console".
Graphics then logs. The top line will show what flavour of simulation you have and the number of events to be processed. Some will only process a couple of thousand events instead of the 100,000 that Pythias do.
Console then ALT-F2 will show current events and, every so often, an estimate of time remaining. If that estimate is a reasonable number and it is going down then the task is probably healthy. If that number is increasing or is something daft, like 2,000 days, then it is unlikely to finish successfully.
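
That rule of thumb could be sketched in Python roughly as follows, assuming you note down the time-remaining estimates by hand from the console. The 30-day cutoff for a "reasonable" estimate and the function name are arbitrary assumptions of this sketch:

```python
def looks_healthy(estimates_hours, max_reasonable_hours=24 * 30):
    """Apply the rule of thumb to a series of time-remaining estimates.

    Returns True if the latest estimate is sane and lower than the first,
    False if it is absurdly large or not decreasing, and None if there
    are too few readings to judge.
    """
    if len(estimates_hours) < 2:
        return None
    first, last = estimates_hours[0], estimates_hours[-1]
    if last > max_reasonable_hours:
        return False  # something daft, e.g. 2,000 days
    return last < first


print(looks_healthy([10, 8, 5]))              # estimate going down
print(looks_healthy([10, 12, 2000 * 24]))     # estimate exploding
```

This is only a heuristic; a task whose estimate drifts upward for a while may still finish.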

If you are in doubt, post the first line and the last 10 or so.

Cross-posted, so too late. I should be working, so I can't look at error logs just yet.
Aurum
Joined: 12 Jun 18
Posts: 88
Credit: 35,825,464
RAC: 8,387
Message 40831 - Posted: 7 Dec 2019, 12:56:23 UTC - in response to Message 40594.  

> Set 'No Limit' and you will get as many tasks as you have cores, or as many as your # of jobs setting allows if that is lower.
None of my rigs have been supplied with more than ten Theory WUs, and all specify No Limit/No Limit in Prefs.


©2020 CERN