Message boards : Sixtrack Application : SIXTRACKTEST

Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 232
Credit: 10,614,512
RAC: 3,678
Message 39822 - Posted: 4 Sep 2019, 12:11:41 UTC - in response to Message 39653.  

I have 2 sixtracktests that I returned on 15th Aug. My wingman aborted them just as they were about to time out on 29th Aug. With lots of sixtracktests in the queue, and the resends going to the back of that queue, they are still waiting to be resent.
If there are similar delays in returns of split 10^7 jobs, it may take a looong time to get the full job results returned.
CPDN, which runs for months, uses a "Trickle" system to upload intermediate data. Could something similar be implemented here for these so that the full job could be run to completion but the intermediate results could be used if necessary if the job is lost or to pinpoint any failure?
I generally run 24/7 so my longest job at 4.6 days wasn't a problem for me.
ID: 39822
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 840
Credit: 36,698,441
RAC: 36,064
Message 39834 - Posted: 5 Sep 2019, 21:59:12 UTC

(I haven't had time to check this entire thread)

It looks like my x86 host does not like running the Test v502.05 (sse2) tasks but can run SixTrack v502.05. It never got any of the v502.05 (avx) tasks, so I have no idea whether those would run on x86. I have switched that host back to regular SixTrack and Theory VB tasks.
ID: 39834
computezrmle
Joined: 15 Jun 08
Posts: 1105
Credit: 52,486,961
RAC: 137,219
Message 39835 - Posted: 6 Sep 2019, 5:44:15 UTC - in response to Message 39834.  

... and never got any of the v502.05 (avx) so no idea if that would run ...

At least this can be explained.

This is the only host on your list that is running x86_32 instead of x86_64:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10447575

Its CPU doesn't have AVX extensions:
http://www.cpu-world.com/CPUs/K10/AMD-Phenom%20II%20X3%20720%20-%20HDX720WFK3DGI%20(HDX720WFGIBOX).html
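On a Linux host, one way to check this locally is to look for `avx` in the `flags` line of `/proc/cpuinfo`. The snippet below is only a rough sketch under that assumption (x86 Linux, Linux-style cpuinfo text); the function names `cpu_flags` and `has_avx` are made up for illustration and are not part of BOINC or SixTrack.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags : ..." line lists the features of CPU 0.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the 'avx' feature flag is present (hypothetical helper)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not a Linux host, or the file is unreadable: report no flags.
        text = ""
    print("AVX supported:", has_avx(text))
```

On Windows or macOS the flags would have to come from another source, so treat this as a Linux-only check.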
ID: 39835
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 29 Feb 16
Posts: 141
Credit: 1,722,470
RAC: 5,239
Message 39839 - Posted: 6 Sep 2019, 9:00:57 UTC - in response to Message 39822.  

Hi Ray,
thanks for the suggestion. We are looking into something of this kind, i.e. we split the job into sub-steps, and we collect the partial results together with the checkpoint-restart files.
The machinery in SixTrack is almost ready; for BOINC the process will be completely transparent. We are now in the process of modifying the software on the side of the scientists. It will come in a few months, with a major code rewrite. Slowly but moving :)
Happy crunching
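
To illustrate the general idea of running a long job in sub-steps with a checkpoint after each one, here is a minimal sketch. It is not SixTrack's or BOINC's actual machinery: the function `run_in_substeps`, the JSON checkpoint layout and the `track` callback are all assumptions made up for this example.

```python
import json
import os


def run_in_substeps(total_turns, turns_per_step, checkpoint_path, track):
    """Run `track` over total_turns in chunks, checkpointing after each chunk.

    `track(start_turn, n_turns)` is a hypothetical callback that does the
    work for one sub-step and returns a JSON-serialisable partial result.
    """
    state = {"turns_done": 0, "partial_results": []}
    # Resume from an earlier checkpoint if one exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    while state["turns_done"] < total_turns:
        n = min(turns_per_step, total_turns - state["turns_done"])
        state["partial_results"].append(track(state["turns_done"], n))
        state["turns_done"] += n
        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state["partial_results"]
```

If the job is lost part-way through, the partial results stored in the last checkpoint are still usable, and a restart continues from the recorded turn count rather than from zero.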
ID: 39839
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 232
Credit: 10,614,512
RAC: 3,678
Message 39868 - Posted: 8 Sep 2019, 17:15:18 UTC
Last modified: 8 Sep 2019, 17:21:44 UTC

Those 2 x 10^7 tasks from 15 Aug have at last been resent, and I've got a couple of 10^6 _2 tasks myself, so we must have gotten to the end of the queue.
ID: 39868
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 718
Credit: 5,983,729
RAC: 2,299
Message 39870 - Posted: 8 Sep 2019, 17:48:01 UTC - in response to Message 39868.  

Those 2 x 10^7 tasks from 15 Aug have at last been resent .....
I've got a *_4 resend of a 10^7 task whose initial replication of 2 was sent out one month ago, on the 8th of August.

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=120081701 Another 4 days of work. :)
ID: 39870
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 718
Credit: 5,983,729
RAC: 2,299
Message 39911 - Posted: 12 Sep 2019, 16:47:49 UTC - in response to Message 39870.  

I've got a *_4 resend of a 10^7 task whose initial replication of 2 was sent out one month ago, on the 8th of August.

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=120081701 Another 4 days of work. :)
Finally finished ;)
ID: 39911
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 851
Credit: 1,612,284
RAC: 304
Message 39914 - Posted: 13 Sep 2019, 8:45:16 UTC - in response to Message 32365.  

Thanks, excellent feedback. The memory requirement could be reduced
by tracking fewer particles in the task. Memory management in SixTrack
is now dynamic. I think there must be something wrong with our memory
estimate. In the first place we should avoid using all your memory and a
potential deadlock... We shall see. Eric.
ID: 39914
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 840
Credit: 36,698,441
RAC: 36,064
Message 39921 - Posted: 14 Sep 2019, 0:24:44 UTC - in response to Message 39914.  

Thanks, excellent feedback. The memory requirement could be reduced
by tracking fewer particles in the task. Memory management in SixTrack
is now dynamic. I think there must be something wrong with our memory
estimate. In the first place we should avoid using all your memory and a
potential deadlock... We shall see. Eric.

No problems with mine, Eric.
I've run about 1200 of them so far and have another 444 to run; they have been easy to finish well before the deadline.
I have them running X8 with 16 GB and 24 GB RAM, X4 with 12 GB RAM, and there are even no problems running X3 with 8 GB RAM on the old 3-core.

No errors

(good to see you here too Eric)

-Samson
ID: 39921
Erich56
Joined: 18 Dec 15
Posts: 1119
Credit: 20,659,315
RAC: 36,977
Message 39923 - Posted: 14 Sep 2019, 6:15:41 UTC - in response to Message 32505.  

I just downloaded a Sixtracktest task (for the first time); it runs perfectly, and memory use is a constant 40 MB.
ID: 39923


©2019 CERN