Message boards : News : Status, 19th August, 2012.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 843
Credit: 1,578,195
RAC: 0
Message 24652 - Posted: 19 Aug 2012, 14:25:12 UTC

All is running rather well: over 100,000 tasks queued, and over 56,000 running. I have a bit more work prepared, but badly need to do some analysis. After some flak, we have been receiving many messages of support and also a lot of help in identifying the problem with the Mac executable.

Igor has identified and corrected the problem with credits and is still cleaning up and making repairs.
(This was my fault: I tried to run 10**7-turn jobs taking 80 hours.
However, I can report that 99% of them have completed successfully,
and the others are still active.)

The Mac executable issue may even be solved, but we still need to keep watching over the next few days.

There may be a problem with deadlines... we shall see.

I am waiting for PC support to install my NVIDIA TESLA, memory and upgraded power supply, and Linux. I am ready to install the software next and try Tomography. There is some interest in ABP especially for existing MPI applications. We shall see.

I have STILL NOT finished the SixDesk doc or prepared the tutorial.

I take this opportunity to outline the LXTRACK system; I hope IT support can fill in the details and set it up.

The justification is that AFS limitations and problems have made life very difficult.
I have used my desk-side pcslux99 (thanks to Frank, who donated it) as a prototype to run several hundred thousand jobs over the last few weeks.
Sadly I do not have the LSF commands like bjobs and bsub, as it is an old 32-bit machine, and I do NOT want to become a sysadmin again. It has almost 200GB of disk space, of which I am using only 12%, though that is increasing. Under this setup I have virtually no problems and do everything with the SixDesk scripts called from master scripts in acrontab entries.

LXTRACK should be a "standard" lxplus virtual machine, i.e. with LSF, CASTOR, SVN, AFS, etc., BUT with at least a terabyte of non-AFS disk space (/data, say). Only users in the AFS PTS group boinc_users should be allowed to log in.
(We could even create the /data/$LOGNAME directory for them.) How can we manage this space? Given the small number of cooperative users, a monitoring script is probably adequate.
Processes should NOT be killed for exceeding CPU or real time limits.
Later, ideally, we could create non-AFS buffers for communication with BOINC.
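
As a rough illustration of the kind of monitoring script mentioned above, here is a minimal Python sketch. It assumes one directory per user directly under the data area (as in /data/$LOGNAME) and a single per-user soft limit; the function names and the limit are hypothetical, not part of SixDesk or any existing CERN tool.

```python
import os


def dir_usage_bytes(path):
    """Sum the sizes of regular files below path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # File vanished or is unreadable; ignore it in this sketch.
                pass
    return total


def usage_report(data_root, soft_limit_bytes):
    """Map each user directory under data_root to (bytes_used, over_limit).

    Assumption: every directory directly under data_root belongs to one user.
    """
    report = {}
    for entry in sorted(os.listdir(data_root)):
        user_dir = os.path.join(data_root, entry)
        if os.path.isdir(user_dir):
            used = dir_usage_bytes(user_dir)
            report[entry] = (used, used > soft_limit_bytes)
    return report
```

Run periodically (for example from an acrontab entry), something like usage_report("/data", 50 * 10**9) could feed a short mail to anyone over the limit. The script only reports and never deletes anything, which seems to fit a small group of cooperative users.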

Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 466
Credit: 140,686,110
RAC: 226,708
Message 24653 - Posted: 19 Aug 2012, 17:19:02 UTC

We could have a fund raiser for hardware?
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 843
Credit: 1,578,195
RAC: 0
Message 24654 - Posted: 19 Aug 2012, 20:04:47 UTC - in response to Message 24653.  

Didn't mean to post that bit... I don't think this is a
financial issue. We already have over 50,000 active
CPUs (thanks to you all). :-)
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 713
Credit: 21,026,455
RAC: 23,161
Message 24658 - Posted: 19 Aug 2012, 22:39:04 UTC - in response to Message 24652.  


I am waiting for PC support to install my NVIDIA TESLA, memory and upgraded power supply, and Linux. I am ready to install the software next and try Tomography. There is some interest in ABP especially for existing MPI applications. We shall see.



Funny you would mention that, Eric.

About 12 hours ago (4 am here) I ordered 8 GB of RAM, a 750-watt PSU, and a superclocked GeForce GTX 660 Ti to upgrade one of my quad-cores!

1344 CUDA cores... can't wait to test that!

-Samson

Volunteer Mad Scientist For Life



©2018 CERN