News archive

Deadline change for ATLAS jobs
Due to the tight deadline of the ATLAS tasks, we have changed the deadline of ATLAS jobs from 2 weeks to 1 week. An ATLAS job takes about 3-4 CPU hours to finish on a moderate CPU (2.5 GFLOPS). 16 Aug 2017, 8:42:03 UTC · Comment


New ATLAS app version released for Linux hosts
We released a new version of the ATLAS app today, 2.41 for the x86_64-pc-linux-gnu platform.
The new features of this version include:
1. It requires the host OS to be either Scientific Linux 6 or CentOS 7.
2. It requires CVMFS and Singularity instead of VirtualBox to run the ATLAS jobs.
3. It is more efficient, because it avoids using VirtualBox.
Currently, this version is set as a beta version.

For people who want to try it out, we provide a script to install everything, including CVMFS and Singularity, here:


Try it if you are interested!
10 Aug 2017, 11:04:25 UTC · Comment


CMS Weekend problem
Warning: The WMAgent which controls CMS jobs appears to have a failed component very recently. Queue seems to be exhausted. Please set No New Tasks or change to a backup app while I try to raise someone at CERN to fix it. This could be a problem given that this is expected to be the heaviest weekend of the year for holiday travel in Europe... 5 Aug 2017, 4:31:20 UTC · Comment


Optimising distribution of SixTrack tasks
Dear Volunteers,

we are trying to improve the distribution of SixTrack tasks. If your host could process more tasks but does not receive any during a project update, could you let us know and send us your client logging report? Please continue the thread "SixTrack Tasks NOT being distributed" opened by Eric here:
http://lhcathome.cern.ch/sixtrack/forum_thread.php?id=4324
so that we can collect all the issues in only one place. In this way, we could try to better tune parameters controlling the distribution of tasks on the server side.

At the same time, we apologize for the loss of credits following the accidental deletion of lines in the main DB - please see message:
http://lhcathome.cern.ch/sixtrack/forum_thread.php?id=4362&postid=31563#31563
As you can see, task distribution has been progressing regularly since the beginning of the week.

Thanks in advance for your precious cooperation,
Alessio and Riccardo, for the SixTrack team
27 Jul 2017, 12:44:17 UTC · Comment


Aborted Work Units
After deleting many really old results from 2013 until March 2017 (was meant to
be December 2016) it seems many Tasks have been aborted. A full analysis
and report will be posted. No action required by volunteers. Eric.
24 Jul 2017, 12:26:54 UTC · Comment


CMS Jobs working again
It's been a few hours now since the Data Bridge appears to have been fixed and jobs are staging out normally. You can resume running CMS tasks at your will. 18 Jul 2017, 17:16:35 UTC · Comment


CMS@Home -- please set No New Tasks and perhaps temporarily run another project
There is a problem staging-out CMS@Home jobs to the Data Bridge. Until we find the cause, please set your CMS crunchers to No New Tasks, or temporarily move them to another app or project.
Sorry for the trouble; unfortunately, it's beyond my capability to resolve.
16 Jul 2017, 21:37:26 UTC · Comment


No RESULTS accepted from Linux Kernel 4.8.*
As an emergency measure and over the weekend, I have set
max_results_day to -1 for all hosts running Linux (Ubuntu?)
Kernel 4.8.*. SixTrack is consistently crashing with an IFORT
run-time formatted I/O error. This will avoid wasting your valuable
contributions. Eric.
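
For illustration only, here is a minimal sketch of which kernel release strings a pattern like 4.8.* would cover. The helper name is_affected_kernel is hypothetical, and this is not the server-side logic used by the project; it only shows how a release string (such as the one returned by Python's platform.release()) could be matched.

```python
import re

# Matches release strings that begin with "4.8" followed by ".", "-",
# or the end of the string, e.g. "4.8.0-58-generic" but not "4.80.1".
_KERNEL_4_8 = re.compile(r"^4\.8(\.|-|$)")


def is_affected_kernel(release: str) -> bool:
    """Return True if the release string belongs to the 4.8 kernel series."""
    return bool(_KERNEL_4_8.match(release))
```

On a Linux host, calling is_affected_kernel(platform.release()) after importing platform would tell you whether your own machine falls in that series.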
14 Jul 2017, 13:50:37 UTC · Comment


IMPORTANT, pull back on SixTrack Inconclusive Results
Please see Message 31102 on SixTrack Application,
Inconclusive Results, keyword IMPORTANT. Eric.
26 Jun 2017, 18:08:03 UTC · Comment


CMS application job queue is being run down.
We want to update the WMAgent job controller, so I've stopped the next batch (I hope). We should run out of jobs in 10-12 hours, so set any machine running CMS tasks to No New Tasks as soon as practicable. Should be up again tomorrow. 26 Jun 2017, 15:59:30 UTC · Comment


SixTrack Inconclusive Results
Please see the SixTrack Application threads for an important update,
Message 31064, Keyword BANNED
26 Jun 2017, 6:24:06 UTC · Comment


sixtrack_validator
There will be a (very) short interruption while I
install a new sixtrack_validator. Should fix null/empty
fort.10 and the nasty "outlier" problem.
See SixTrack Application, sixtrack_validator for more news and details.
24 Jun 2017, 9:21:04 UTC · Comment


SixTrack Tasks distribution issues
Please see Message boards:SixTrack application, thread
"SixTrack Tasks NOT being distributed". This is to have one place for
all relevant messages. This thread is for SixTrack only.
My first post reports my personal status.
20 Jun 2017, 12:43:01 UTC · Comment


Network and server problems Sunday night
We had a network problem in the computer centre at CERN last night, leading to a number of issues for our servers. BOINC servers should be back in business now.

Normally tasks should be correctly uploaded again on the next attempt. If you see any issues, please try an update or reset of the project.

Sorry for the trouble, and happy crunching!
19 Jun 2017, 7:19:47 UTC · Comment


SixTrack News - May 2017
The SixTrack team would like to thank all the teams who took part in the 2017 pentathlon hosted by SETI.Germany:
https://www.seti-germany.de/boinc_pentathlon/
where LHC@Home was chosen for the swimming discipline. The pentathlon gave us the possibility of carrying out a vast simulation campaign, with lots of new results generated that we are now analysing. While the LHC experiments send volunteers tasks that analyse data collected by the LHC detectors or run Monte Carlo codes for data generation, SixTrack work units probe the dynamics of LHC beams; hence, your computers are running a live model of the LHC in order to explore its potential without actually using real LHC machine time, which is precious to physics.

Your contribution to our analyses is essential. For instance, we reached ~2.5 MWUs processed in total, with a peak slightly above 400 kWUs processed at the same time, and >50 TFLOPS, during the entire two weeks of the pentathlon. The pentathlon was also the occasion to verify recent improvements to our software infrastructure. After this valuable experience, we are now concentrating our energies on updating the executables with brand new functionality, extending the range of studies and of supported systems. This implies an even greater dependence on your valuable support.

Thanks a lot to all people involved! We count on your help and commitment to science and to LHC@home to pursue the new challenges of beam dynamics which lie ahead.
26 May 2017, 14:43:25 UTC · Comment


LHCb application is in production
We are very happy to announce that the LHCb application is out of beta and is now in production mode on LHC@home. Thank you all for your precious contribution.

We are grateful to have you all as part of our project.

Please refer to the LHCb application forum for any problem or feedback.

Thanks a lot
Cinzia
27 Apr 2017, 7:50:51 UTC · Comment


New file server
We have added a new file server for download/upload to scale better with higher load. If there should be errors with download or upload of tasks, please report on the MBs.

Thanks for contributing to LHC@home!
26 Apr 2017, 14:00:48 UTC · Comment


ATLAS application now in production
The ATLAS application is now in production here on LHC@home, after a period of testing. This marks another milestone for the LHC@home consolidation, and we would like to warmly thank all of you who have contributed to help and tests for the migration!

Please refer to Yeti's checklist for the ATLAS application and the ATLAS application forum if you need help.
22 Mar 2017, 15:45:58 UTC · Comment


Network interruptions 15th of March
Due to a network upgrade in the CERN computer centre, connections to LHC@home servers will intermittently time out tomorrow Wednesday morning between 4 and 7am UTC.

BOINC clients will retry later as usual, so this should be mostly transparent.
14 Mar 2017, 9:58:26 UTC · Comment


vLHCathome project fully migrated
The former vLHCathome project has now been migrated here and the old vLHCathome project site has been redirected.

The credit has also been migrated as discussed in this thread.

If your BOINC client complains about a wrong project URL, please re-attach to this project, LHC@home.

Thanks again to all who contributed to vLHCathome and to those who contribute here!

-- The team
2 Mar 2017, 8:35:49 UTC · Comment


Draining the CMS job queue
Because of an upgrade to the WMAgent server, we need to drain the CMS job queue. So, I'm not submitting any more batches at present and we should start running out over the weekend. If you see that you are not getting any CMS jobs (not tasks...) please set No New Jobs or stop BOINC.
I expect that the intervention will take place Monday morning, and hopefully we'll have new jobs again later that day.
17 Feb 2017, 10:57:18 UTC · Comment


Good news for the CMS@Home application
This afternoon we demonstrated the final link in the chain of producing Monte Carlo data for CMS using this project (and the -dev project too, of course), namely the transfer of result files from the temporary Data Bridge storage to a CMS Tier 2 site's storage element (SE). To summarise, the steps are:

o Creating a configuration script defining the process(es) to be simulated
o Submitting a batch of jobs of duration and result-file size suitable for running by volunteers
o Having those jobs picked up by volunteers running BOINC and the CMS@Home application, and the result files returned to the Data Bridge
o Running "merge" jobs on a small cluster at CERN to collect the smaller files into larger files (~2.2 GB) -- this step has to be done at CERN as most volunteers will not have the bandwidth (or data plan!) to handle the data volumes required. This step also serves to a large extent as the verification step required to satisfy CMS of the result files' integrity.
o Transferring the merged files into the Grid environment where they are then readily available to CMS researchers around the world

Thanks, everybody. From here on it gets more political, but we've been garnering support as the project progressed. We now need to move into a more "production" environment and convince central powers-that-be to take over the responsibility of submitting suitable workflows and collecting the results. You will still see some changes in the future, especially as we bring some of the more-advanced features across here from the -dev project.
27 Jan 2017, 20:59:36 UTC · Comment


MacOS executable OSX 10.10.5 Yosemite
Well, I have finally got some work on my Mac with our new MacOS executable
built on OS X 10.10.5 Yosemite.
Please report to me (eric.mcintosh@cern.ch),
or to the Topic Sixtrack Application, MacOS executable thread,
if you get some work and there are problems. Eric.
19 Jan 2017, 10:41:43 UTC · Comment


VM applications broken by the Windows 10 update KB3206632
The Windows 10 update KB3206632 introduces an issue that affects virtualization-based security (VBS) and hence may break VM applications. The issue is fixed in the update KB3213522. If you are running Windows 10, please ensure that you have applied the KB3213522 update.

Thanks to everyone who contributed to the threads on this issue.

Refs:
Missing heartbeat file errors
Microsoft KB3206632 from 16/12/15
8 Jan 2017, 20:51:44 UTC · Comment


Season's Greetings
A very Merry Christmas and a Happy New Year to all the LHC@home supporters.
(I shall send some news about our plans for 2017 in the next few days.)
Eric.
25 Dec 2016, 8:33:19 UTC · Comment


VM applications
Following the Theory simulations added 1 week ago, we have now also deployed the CMS and LHCb applications from the Virtual LHC@home project here on the consolidated, original LHC@home.

Please note that in order to run VM applications in addition to the classic BOINC application Sixtrack, you need to have a 64bit machine with VirtualBox installed and virtualisation extensions (VT-x) enabled. The details are explained on the join us and faq pages on the LHC@home web site.
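
As a rough aid on Linux hosts, the sketch below looks for the CPU flags that advertise hardware virtualisation: vmx for Intel VT-x and svm for AMD-V. The function name cpu_virtualization_flags is hypothetical. The flags only indicate CPU capability, so the extensions may still be disabled in the firmware settings, and VirtualBox itself remains the authoritative check.

```python
def cpu_virtualization_flags(cpuinfo_text: str) -> set:
    """Return the subset of {'vmx', 'svm'} found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # Each entry looks like "flags : fpu vme ... vmx ..."
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found
```

On Linux the input can be obtained by reading the file /proc/cpuinfo and passing its contents to the function; an empty result suggests the CPU does not advertise these extensions.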

By default, only the Sixtrack application is enabled in your BOINC project preferences. If you have VirtualBox installed and wish to try VM applications as well, you need to enable other applications in your LHC@home project preferences.

Please note that if you run an older PC with Windows XP or similar, it is recommended to stay with the default; Sixtrack only.

Thanks for your contributions to LHC@home!

--The team
21 Nov 2016, 10:05:13 UTC · Comment


LHC@home consolidation
As part of the consolidation of LHC@home, we have set up a new server web front end using SSL for this project. The new URL is:

https://lhcathome.cern.ch/lhcathome

Please feel free to connect to the new site at your convenience. (BOINC 7.2 clients and later support SSL.)

The old LHC@home classic site will continue operation as long as required. Currently there are no new Sixtrack tasks in the queue, but soon more applications and work will be available from this project.
6 Oct 2016, 10:56:02 UTC · Comment


LHC@Home - SixTrack Project News
The members of the SixTrack project from LHC@Home would like to thank all the volunteers who made their CPUs available to us! Your contribution is precious, as in our studies we need to scan a rather large parameter space in order to find the best working points for our machines, and this would be hard to do without the computing power you all offer to us!

Since 2012 we have been performing measurements with beam dedicated to probing what we call the “dynamic aperture” (DA). This is the region in phase space where particles can move without experiencing a large increase of the amplitude of their motion. For large machines like the LHC this is an essential parameter for ensuring beam stability and allowing long data taking at the giant LHC detectors. The measurements will be benchmarked against numerical simulations, and this is the point where you play an important role! Currently we are finalising a first simulation campaign and we are in the process of writing up the results in a final document. As a next step we are going to analyse the second half of the measured data, for which a new tracking campaign will be needed. So, stay tuned!

Magnets are the main components of an accelerator, and non-linearities in their fields have a direct impact on the beam dynamics. The studies we are carrying out with your help are focussed not only on the current operation of the LHC but also on its upgrade, i.e. the High Luminosity LHC (HL-LHC). The design of the new components of the machine is at its final steps, and it is essential to make sure that the quality of the magnetic fields of the newly built components allows the highly demanding goals of the project to be reached. Two aspects are mostly relevant:

    specifications for field quality of the new magnets. The criterion to assess whether the magnets’ field quality is acceptable is based on the computation of the DA, which should be larger than a pre-defined lower bound. The various magnet classes are included in the simulations one by one, the impact on DA is evaluated, and the expected field quality is varied until the acceptance criterion of the DA is met.


    dynamic aperture under various optics conditions, analysis of the non-linear correction system, and optics optimisation are essential steps to determine the field quality goals for the magnet designers, as well as to evaluate and optimise the beam performance.


The studies involve accelerator physicists from both CERN and SLAC.




Long story made short, the tracking simulations we perform require significant computer resources, and BOINC is very helpful in carrying out the studies. Thanks a lot for your help!
The SixTrack team




Latest papers:

R. de Maria, M. Giovannozzi, E. McIntosh (CERN), Y. Cai, Y. Nosochkov, M-H. Wang (SLAC), DYNAMIC APERTURE STUDIES FOR THE LHC HIGH LUMINOSITY LATTICE, Presented at IPAC 2015.
Y. Nosochkov, Y. Cai, M-H. Wang (SLAC), S. Fartoukh, M. Giovannozzi, R. de Maria, E. McIntosh (CERN), SPECIFICATION OF FIELD QUALITY IN THE INTERACTION REGION MAGNETS OF THE HIGH LUMINOSITY LHC BASED ON DYNAMIC APERTURE, Presented at IPAC 2014

Latest talks:

Y. Nosochkov, Dynamic Aperture and Field Quality, DOE review of LARP, FNAL, USA, July 2016
Y. Nosochkov, Field Quality and Dynamic Aperture Optimization, LARP HiLumi LHC collaboration meeting, SLAC, USA, May 2016
M. Giovannozzi, Field quality update and recent tracking results, HiLumi LHC LARP annual meeting, CERN, October 2015
Y. Nosochkov, Dynamic Aperture for the Operational Scenario Before Collision, LARP HiLumi LHC collaboration meeting, FNAL, USA, May 2015
26 Jul 2016, 8:37:55 UTC · Comment


Disk Space Exceeded
I am sorry we have submitted some "bad" WUs.
They are using too much disk space.
Please delete any WUs with names like
wjt-18-L1-trc......
wjt-15-L1-trc.......
Apologies.
16 Mar 2016, 6:15:12 UTC · Comment


Server daemons temporarily stopped
Due to a problem with an underlying disk server, the BOINC daemons are temporarily shut down until the disk volume is back. 27 Feb 2016, 12:36:01 UTC · Comment


Short server interruption 9-Feb.
Our LHC@home servers will be down for a short while from 8UTC 9-Feb. due to a disk server intervention. (Intervention postponed 1 week.) 2 Feb 2016, 8:27:29 UTC · Comment


BOINC Server up
The server is back, for the moment at least.
Clearing backlog of results. Eric.
7 Dec 2015, 7:55:32 UTC · Comment


Server down.
The BOINC server has been stopped temporarily because of
file system problems at CERN. Hopefully to be restarted tomorrow
Monday. Eric.
6 Dec 2015, 9:56:18 UTC · Comment


Work/result buffering problem at CERN
We have had a BOINC CERN side buffer problem over the weekend.
It is being investigated and hopefully soon corrected. Eric.
16 Nov 2015, 9:49:17 UTC · Comment


Another short service interruption
The LHC@home servers will be down for a short while from 6:30 UTC Tuesday 10th November for a database update. 9 Nov 2015, 7:51:18 UTC · Comment


Service interruption tomorrow morning
LHC@home servers will be down for about 1 hour tomorrow morning from 6am UTC, due to an intervention on the database server. 8 Sep 2015, 9:07:23 UTC · Comment


Server interruption 12 UTC
The BOINC server will be down for maintenance for about 30 minutes from 12:00 UTC today.

BOINC clients will back off and return results later once the server is up as usual.

Many thanks for your contributions to LHC@home!
24 Aug 2015, 6:36:43 UTC · Comment


Brief Interruption, Thursday 18th June, 2015
There will be a hopefully brief interruption to the service tomorrow
Thursday at 10:30 CST to provide separate NFS servers for SixTrack
and ATLAS. The WWW pages should still be accessible and a further
message will be posted when the operation is complete. Eric and Nils.
17 Jun 2015, 16:22:48 UTC · Comment


Project down due to a server issue
Due to a problem with an NFS server backend at CERN, the Sixtrack and ATLAS BOINC projects are down. A fix is underway. 11 Jun 2015, 9:42:59 UTC · Comment


HostID 10137504 user aqvario
HostID 10137504 owner aqvario.
I set the max_results_day to -1; locking the stable door
after the horse has bolted. For some reason I cannot read the
messages I read this morning on this topic. Thanks for the
help and the Google translation. Eric.
6 Jun 2015, 14:12:35 UTC · Comment


Quorum of 5, wzero and Pentathlon
I am currently running a set of very important tests to try and
find the cause of a few numerical differences between different platforms
and executables. I could/would not do this usually but because of your efforts
during the Pentathlon I have a unique opportunity. Also keeps up the
workload and gives you all an opportunity to get credits.
These tests are wzero with a quorum of 5. Thanks. Eric.
17 May 2015, 14:32:09 UTC · Comment


DISK LIMIT EXCEEDED
Please note that this may occur if you are also subscribed
to the LHC experiment projects ATLAS or CMS using vLHCathome.
A workaround is to delete the remaining files yourself.
16 May 2015, 19:03:40 UTC · Comment


New news on the BOINC Pentathlon
Please look at the NEWS 15th May, 2015 for latest update
involving the BOINC Pentathlon. Eric.
15 May 2015, 20:36:56 UTC · Comment


News 15th May, 2015
As many of you know LHC@home has been selected to host
the Sprint event of the BOINC Pentathlon organised by
Seti.Germany. Information can be found at
http://www.seti-germany.de/boinc_pentathlon/22_en_Welcome.html
The event starts at midnight and will last for three days.

This is rather exciting for us and will be a real test of
our BOINC server setup at CERN. Although this is the weekend
following Ascension my colleagues are making a big effort to
submit lots of work, and I am seeing a new record number of active WUs
every time I look. The latest number was over 270,000 and the Sprint
has not yet officially started.

We have done our best to be ready without making any last minute changes
and while this should be fun I must confess to being rather worried
about our infrastructure. We shall see.

We still have our problems, for a year now.

I am having great difficulties building new executables since Windows XP
was deprecated and I am now trying to switch to gfortran on Cygwin.
It would seem to be appropriate to use the free compiler on our
volunteer project.

We are seeing too many null/empty result files. While an empty result can
be valid if the initial conditions for tracking are invalid, I am hoping
to treat these results as invalid. These errors are making it extremely
difficult for me to track down the few real validated but wrong results.
I have seen at least one case where a segment violation occurred, a clear
error, but an empty result was returned. The problem does not seem to
be OS or hardware or case dependent.

I am also working on cleaning the database of ancient WUs. We had not
properly deprecated old versions of executables until very recently.

I am currently using boinctest/sixtracktest to try a SixTrack which will return the full results giving more functionality and also allowing a case to be automatically handled as a series of subcases.

Then we must finally get back MacOS executables, AVX support, etc

Still an enormous amount of production is being carried out successfully
thanks to your support.

I shall say no more until we see how it goes for the next three days. Eric.
15 May 2015, 20:34:20 UTC · Comment


Short stoppage for a disk intervention
The Sixtrack server will be down for a while this afternoon for a disk intervention. Clients will be able to upload results again soon. 30 Apr 2015, 12:27:31 UTC · Comment


Upgrade of the look and feel of the SixTrack website
The http://lhcathomeclassic.cern.ch/sixtrack/ website has been brought up to date with a new look and feel, which is consistent with the other LHC@Home projects. It maintains all the links and the functionality of the previous one. 23 Apr 2015, 12:20:31 UTC · Comment


Status Result Differences 29th March, 2015
Please have a look at my latest post to:
Number Crunching/Host messing up tons of results. Eric.
29 Mar 2015, 16:30:47 UTC · Comment


Server Intervention 10-Feb-2015
There will be a short server interruption on Tuesday 10-Feb-2015 from 14:00-15:00 CET for a hardware upgrade.


Update: The upgrade finished at 15:00 and the service is back up.
9 Feb 2015, 10:17:46 UTC · Comment


Uploads failing
Apologies; disk full problem. Cleaning up and hoping to
return to normal shortly. Thanks for all the messages. Eric.
29 Jan 2015, 16:27:52 UTC · Comment


News, December, 2014.
Well not much news really. The project is ticking over
and we have processed a tremendous amount of work in 2014.

Right now we are trying to move the project to a new CERN IT
infrastructure so there may be a few hiccups in January
(CERN is closed for two weeks, but systems are up and running).

We are still using executables from May and I still don't have
a valid MacOS executable :-( , no heartbeat so something is really
wrong. Haven't found an explanation for the "no permission/cannot access"
problems on Windows but the overall error rate is about 1.5% which
seems to be "normal". We have also had problems with the w- WUs
which produced a lot of output, now under control. However running
with a smaller number of pairs to reduce volume of output seems
to give problems with validation. Working on this.

A New Year, so I shall try and make a big effort to get moving forward
as we have been pretty well stuck for 9 months; after ten years I am
a bit disappointed at the lack of progress. However, as usual, we must
maintain the service as top priority.

I have also noted increased interest from the experiments in using volunteer
computing and this may impact lhcathomeclassic......

Anyway, LHC is heading steadily to restart in the Spring, and we shall
continue studying the High Luminosity upgrade. Many thanks for your
patience and understanding and continued valued support.

A Very Happy New Year. Eric.
31 Dec 2014, 11:03:50 UTC · Comment


Season's Greetings
I wish you a very Merry Christmas and
a Happ[y|ier] New Year. Thanks for all
your support (news to follow). Eric.
24 Dec 2014, 15:11:54 UTC · Comment


Heavy I/O on Windows WUs
It seems WUs with names beginning w-.... are creating a bit
too much I/O for Windows. Under investigation, but the results
are good and are required. Thanks. Eric.
31 Oct 2014, 19:58:01 UTC · Comment


17:00 CET, 15th October, Service back to "normal".
I believe we have finally resolved various issues as
of about 16:00 today. Apologies for the downtime. Eric.
14 Oct 2014, 15:47:00 UTC · Comment


CERN AFS problems
We seem to be having intermittent? problems with our local
file system. Server running but.....will fix soonest.
10 Oct 2014, 15:54:03 UTC · Comment


Service back; 5th October
I think we are back in business. Lots of work coming, I hope,
once we sort out the disk space issue. Sorry for all the hassle
and thank you for your continued support.
5 Oct 2014, 13:01:11 UTC · Comment


Re-enabled daemons
I have painfully cancelled all w-b3 WUs. According to doc they
stay in the database but are marked as "not needed".
I have also disabled further WUs of this type until we sort it out.
Hope to have saved some 65,000 valid WUs. We shall see tomorrow.
Please post to this thread if further problems (I have restarted as root...).
It will probably take some time to get back to normal.
Report will follow in due course.
4 Oct 2014, 17:58:49 UTC · Comment


Service disabled
I have managed to stem the flood and disable the service.
Apologies and will inform as soon as we are started again.
4 Oct 2014, 8:42:20 UTC · Comment


Disk Limit increased
I am unable to stop submission.
I have upped the limit on disk space to 500MB.
I can't do anything about active WUs but I hope the new limit
will suffice for new WUs. More news tomorrow.
4 Oct 2014, 0:02:30 UTC · Comment


Disk Limit exceeded w-b3
Drastic action being taken to delete the download WUs.
This may crash the server....
Apologies for the wasted CPU.
3 Oct 2014, 22:35:08 UTC · Comment


Power Supply Ripple
As requested, and for your information, Miriam has described her
recent studies as follows:
A principal component of the planned upgrade to a high luminosity LHC (HL-LHC) is the replacement of the high field quadrupole magnets - the so-called "inner triplet".
The long term beam stability can be significantly reduced by magnetic field errors, misalignment of the magnets and by irregularities in the power supply (ripple). The recent batch of fifteen or so studies, involving over one and a half million cases or Work Units each of one million turns (for a stable beam), are aimed at determining the maximum allowable tolerances for the power supply ripple assuming the known field and alignment errors.
22 Jul 2014, 15:40:43 UTC · Comment


More on DOWNLOAD
After running through the w- WUs I am now running
a few test jobs as I think the WUs may have been OK.
I cannot reproduce the problem (of course!) at CERN on my
Windows 7 system. Eric.
22 Jul 2014, 15:38:21 UTC · Comment


Download Errors located.
ERR_DOWNLOAD problem located and there should be no more once this
batch of dud WUs has been cleared. May be Monday before
I can do anything else. Eric.
19 Jul 2014, 8:32:33 UTC · Comment


DOWNLOAD ERRORS
Just noticed the error rate has doubled to about 6% in
the last 24 hours. Seems to be ERR_RESULT_DOWNLOAD, which I
have confirmed by checking MBs right now. Any help/detailed
info welcome while I notify CERN support.
(Another Friday afternoon problem!) Eric.
18 Jul 2014, 14:24:19 UTC · Comment


Three Problems, 22nd May.
Settling down a bit; I am seeing around 2% WU failures.

Problem 1: EXIT_TIME_LIMIT_EXCEEDED. Tried to minimise this
and will hopefully implement "outliers" to avoid it in future.

Problem 2: Can't Create Process and I will look for help on this.
Probably connected with our build but we shall see.

Problem 3: Found 545 invalid results involving 124 hosts.
One invalid result was duplicated! but I am not going to run
everything 3 times. Can live with this. The top 12 culprits gave
77 45 26 25 22 21 19 16 14 11 10 9 invalid results each.
(I thought we stopped using hosts with this many errors......)
Seems to be hardware, overclocking, cosmic rays?????
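
To put the Problem 3 figures in proportion, the short calculation below works out how much of the total the top 12 hosts account for. It uses only the numbers quoted above; the variable names are illustrative.

```python
# Invalid results reported for the 12 worst hosts.
top_culprits = [77, 45, 26, 25, 22, 21, 19, 16, 14, 11, 10, 9]
total_invalid = 545   # invalid results found in total
total_hosts = 124     # hosts involved in total

top_sum = sum(top_culprits)          # 295 invalid results from the top 12
top_share = top_sum / total_invalid  # about 0.54 of all invalid results
# Average for the remaining 112 hosts: 250 / 112, a little over 2 each.
rest_mean = (total_invalid - top_sum) / (total_hosts - len(top_culprits))
```

So roughly half of the invalid results came from about a tenth of the hosts involved, while the remaining hosts averaged only a couple each.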

Getting a lot of production done successfully. Eric.
22 May 2014, 16:15:39 UTC · Comment


Status, 19th May, 2014
Getting a lot of work done, but out of 400,000 WUs over the last seven days
still have about 8000 errors (2% and decreasing I think). The main problem
is EXIT_TIME_LIMIT_EXCEEDED but also "Can't create process". A side effect is
a mess up with credits. I have increased the fpops bound to help, I hope, and
today "reset credit statistics". Please be patient about credits and I shall see
what happens and if we can compensate somehow.
Unfortunately today I discovered a result difference, only one, but I need to
do more checking. I see no invalid results so the former Linux/Windows
discrepancy is largely resolved. My priority is the integrity of the results
and I may have to spend some days pinning down the result difference,
checking various ifort versions, and doing more checks and tests.
We have a macOS executable under test.
Thank you for your patience, understanding and support. Eric.
(P.S. Getting correct identical results on any PC from a Pentium 3
to the latest, with a multitude of versions of Linux, Windows and macOS
is not easy! I can publish only when the LHC@home service is > 99%.
Afterwards GPU, Android, and 10 million turns)
19 May 2014, 20:07:05 UTC · Comment


CreateProcess problems
I am seeing about 1% CreateProcess problems mainly on Windows 7.
Most often Access Denied (in various languages :-).
Also some Access violation, page out of date or similar.
Found some BOINC mails about this. Under investigation.
Seems to be host dependent.
(More work coming soon.) Eric.
16 May 2014, 10:20:54 UTC · Comment


LHC@home is back
The service was restarted today and WUs should start
coming in, building up gradually. Thanks to all. Eric.
14 May 2014, 14:56:03 UTC · Comment


First production tests, 11th May, 2014
Trying 590 WUs tonight. If all OK will restart full
production tomorrow 12th May. Eric.
11 May 2014, 19:10:14 UTC · Comment


Status, 10th May
Please see MBs, Number Crunching, Status 10th May, Version 451.07 10 May 2014, 9:26:51 UTC · Comment


WU Submission SUSPENDED 19th April, 2014
In order to avoid any further errors and waste of your valuable
resources I have temporarily stopped WU submission. There are only
a few thousand WUs active and when they are cleared I hope we will have
new Windows executables. Sadly the Windows executables are now giving
wrong results in many cases. I looked at using Homogeneous Redundancy
but I would still get wrong results. I thought of removing the Windows
executables but they are over 80% of our capacity. In this way I hope in
a few days after users and support return from vacation we can safely
introduce new Windows executables after tests using the BOINC test
facility. Sorry about that but I would rather get it fixed properly as we
have lots of new work coming.

Thank you for your patience and support. Eric.
19 Apr 2014, 11:05:24 UTC · Comment


Status, March 2014
First, in reply to a recent query about 2014 workload, thanks to Massimo:
"The majority of the 2014 studies will be devoted to LHC upgrade and the rest to understand the nominal
machine. I do not expect any increase in workload when approaching the LHC re-start in 2015, on the
other hand, we will all be locked up in the control room and the resources for performing the
simulations will be reduced."

Second, we have been experiencing major problems with our
Windows executables for several months now.
There are "small" result differences between Windows and Linux.
After extensive testing I believe they are due to the Windows
ifort compiler. This will be verified and fixed as soon as I
return to CERN next week. In addition new builds of SixTrack
for Windows, which now include a call boinc_unzip, are failing
on Windows in at least two ways; there is a problem parsing the
hardware description (/proc/cpuinfo on Linux) and secondly we
get "cannot Create Process" errors. So, we shall first try and
build without the hopefully responsible call, and fix the result
differences. We can then resume development of the case splitting
to smaller WUs and the return of all results.

It is great that your support continues and, when required, we have
lots of capacity. Saw a new record of over 140,000 WUs in
process a couple of weeks ago. Eric.
16 Mar 2014, 8:43:45 UTC · Comment


Status, 24th January, 2014
Hope this will answer some of your messages.

We still have some 34,000 WUs NOT being taken. We have apparently
almost 6000 in progress.

We introduced SixTrack Version 4.5.03 on Wednesday 22nd
January after extensive testing on boinctest and at CERN.
Unluckily Yuri flooded us with work at the same time
and AFS blew up leading to a huge backlog of over 16,000
results to be downloaded.

1. Results Validation; seems to be OK. I summarise that,
counting from 0-59 we do NOT CHECK Words 51, 59? and 60
in fort.10.

The validator log shows many many "cannot open" supposedly
existing results for comparison. They were probably lost
somehow.

2. Assimilation; the log shows
"Herror too many total results" !!!
There are about 2000 (1979) unique messages and cases/WUs.
I suspect we may need to clean the database and remove results
(with clients losing credit I am afraid, but they will probably never
get credit for these anyway).
I could delete them from upload but that would probably be worse.

3. Scheduler log: there are about 2.4 million messages of which
there are 1.64M unrecognised messages, multiple messages per WU.
This is perhaps significant!
Previously these messages existed only for Macs as far as I can see.
Here is one case:
2014-01-22 17:24:41.1073 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_prop
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: platform_vendor
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: Advanced Micro Devices, Inc.
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: /platform_vendor
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_info
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: name
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: /name
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: vendor
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: GenuineIntel
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /vendor
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: vendor_id
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: 4098
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /vendor_id
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: available
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: 1
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /available
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: half_fp_config
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: 0
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /half_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: single_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: 191
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: /single_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: double_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: 63
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: /double_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: endian_little
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: 1
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: /endian_little
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: execution_capabilities
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: 3
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: /execution_capabilities
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: extensions
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_kh
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: /extensions
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: global_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: 17029206016
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: /global_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: local_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: 32768
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: /local_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: max_clock_frequency
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: 3500
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: /max_clock_frequency
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: max_compute_units
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: 8
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: /max_compute_units
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: opencl_platform_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: OpenCL 1.2 AMD-APP (1348.5)
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_platform_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: opencl_device_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: OpenCL 1.2 AMD-APP (1348.5)
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_device_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: opencl_driver_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: 1348.5 (sse2,avx)
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_driver_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_cpu_info
2014-01-22 17:24:41.1156 [PID=51877] HOST::parse(): unrecognized: /opencl_cpu_prop
2014-01-22 17:24:41.3583 [PID=51877] Request: [USER#221474] [HOST#10137513] [IP 69.35.195.242] client 7.2.33
2014-01-22 17:24:41.3880 [PID=51877] Sending reply to [HOST#10137513]: 0 results, delay req 6.00
2014-01-22 17:24:41.3880 [PID=51877] Scheduler ran 0.035 seconds

I am not an expert but it seems to me it might explain work not being taken.......
(but never saw this with boinctest!).
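
To put a number on how many scheduler log lines are of this "unrecognized" kind, a small script along the following lines could be used. This is only a sketch: the pattern is taken from the excerpt above, and the function name summarise is an illustrative choice, not part of BOINC.

```python
import re
from collections import Counter

# Matches lines of the form seen in the excerpt above, e.g.
# 2014-01-22 17:24:41.1073 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_prop
UNRECOGNIZED = re.compile(r"\[PID=(\d+)\]\s+HOST::parse\(\): unrecognized: ")


def summarise(lines):
    """Return (total non-blank lines, unrecognized lines, count per PID)."""
    total = 0
    unrecognized = 0
    per_pid = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        match = UNRECOGNIZED.search(line)
        if match:
            unrecognized += 1
            per_pid[match.group(1)] += 1
    return total, unrecognized, per_pid


# A few lines copied from the excerpt, used here as sample input.
sample = [
    "2014-01-22 17:24:41.1073 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_prop",
    "2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: platform_vendor",
    "2014-01-22 17:24:41.3880 [PID=51877] Sending reply to [HOST#10137513]: 0 results, delay req 6.00",
    "2014-01-22 17:24:41.3880 [PID=51877] Scheduler ran 0.035 seconds",
]

total, unrecognized, per_pid = summarise(sample)
print(total, unrecognized, dict(per_pid))
```

For a real log the same function can be fed an open file object, since it only iterates over lines.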

Other issue; one client reports "Cannot Create Process" on Windows 7.
May or may not be significant.

Are executables "signed" OK?

So all a bit complicated but hope to sort it (very) soon.
Eric.
24 Jan 2014, 12:34:09 UTC · Comment


Hiccup, today 23rd January
Apologies for an interruption to service.
Working on it. More news when corrected.
Eric.
23 Jan 2014, 8:45:25 UTC · Comment


Publications Update
The WWW page
http://lhcathome.web.cern.ch/sixtrack/sixtrack-and-numerical-simulations
has been updated by Massimo with new recent publications concerning LHC@home.
19 Nov 2013, 15:52:20 UTC · Comment


News Status and Plans 19th November, 2013
Please see the MB Number Crunching for an update. Eric. 19 Nov 2013, 8:09:05 UTC · Comment


Problem October 23rd Fixed
The permissions on the directory for the logs were wrong.
Corrected and results being uploaded. A fuller report and
a new Status and Plans will be issued soonest.
24 Oct 2013, 8:16:11 UTC · Comment


Problems 23rd October, 2013
Sorry for the upload problems. Hope somebody here will
fix this soon. (I thought we had a new record number
of WUs in progress! :-) Eric.
23 Oct 2013, 17:24:26 UTC · Comment


Status, 13th September, 2013
Still fighting to produce a good set of Linux executables.
Lots of work for Windows systems!
Created some notes on Numerical reproducibility
[url=http://cern.ch/mcintosh]CV and Notes on Floating-Point[/url].
13 Sep 2013, 6:16:34 UTC · Comment


Status 6th September
New thread as feedback is in several others.
I have resolved server out of space for the short term and
we will implement a proper fix soonest.

Issue remains with Linux executables I think. I have checked and
informed my colleagues. The ".exe" suffix is confusing but the pni
executables look OK (crash on my test machine without pni of
course, but OK on my modern one). We do not have a MAC executable
yet.

Now things have settled down we pursue an analysis of the problem(s).
I do not want to go back because we urgently need the new physics in
this version.

Thanks for your patience and understanding. Getting lots of results
anyway. Eric.

6 Sep 2013, 12:18:44 UTC · Comment


New SixTrack
SixTrack CERN Version 4463 is now in production. 4 Sep 2013, 7:39:01 UTC · Comment


Testing
Just running "last" tests. Hope to have new SixTrack tomorrow. 2 Sep 2013, 19:03:12 UTC · Comment


Short Failing Work Units
We are trying to use the test option of BOINC SixTrack project.
The very short WUs are failing. We have a fix and shall try again
soon. More production to follow. Thanks for your patience.
Eric.
1 Sep 2013, 6:00:46 UTC · Comment


Status and Plans, 30th August, 2013
Please see Message Boards: Number Crunching: Status and Plans 20th August, 2013
(Sorry about date!). Eric.
30 Aug 2013, 12:56:30 UTC · Comment


May, 2013 update.
Server down (temporarily I hope). Trying to fix the "unzip" problem.
See my recent posts to Number Crunching: Status and Plans May 25th,
and Results Discrepancies for more info. Eric.
25 May 2013, 11:04:18 UTC · Comment


More work coming now.
We have introduced a new SixTrack Version 4446 and I am resuming
production on an intensity scan as well as running more tests; usual
mixture of short/long run times. We are also trying to return more
results files to help identify problems. Thanks for your help as usual.
Eric.
8 May 2013, 17:56:27 UTC · Comment


Dynamic Aperture Tune Scan
Hello everybody,
after a few technical problems in the last few days, we are now ready to submit a first Tune Scan for the Dynamic Aperture study we are performing at CERN.
These simulations will give us a first hint on how the High Luminosity upgrade for the LHC will work, and in particular the effect of the Beam-Beam interaction will be analysed.
This will be only the first batch of simulations, because various scenarios are possible for this upgrade, and we need to investigate each of them thoroughly to decide which one best fits our requirements... so keep your machines ready to crunch!!
15 Mar 2013, 9:35:53 UTC · Comment


Interruption for server update
There will be a short server interruption today for a software update. New jobs should come later once we have checked the software chain.

The update is now done. Thanks for your contributions and have a nice day!
10 Mar 2013, 9:07:24 UTC · Comment


Forum restrictions
Due to spam activity, all forums apart from Questions & Answers: Getting Started now require some BOINC credit to allow posting. If you are a complete newcomer, please check existing Questions & Answers first.

The team.
15 Feb 2013, 12:58:25 UTC · Comment


Pause
There will be a pause for a week or two.
See the News (no ) "More work" thread for more info.
8 Feb 2013, 15:10:03 UTC · Comment


More work
Can't keep up but more work coming now. 3 Feb 2013, 4:33:59 UTC · Comment


Production 2013
Great; as you will have seen running flat out on intensity scans, one million turns max.
Over 100,000 tasks running! CERN side infrastructure is creaking at the seams.
Will run down in a week or two to introduce a new SixTrack version (with suitable
warning).
31 Jan 2013, 11:07:55 UTC · Comment


First tests 2013
Trying to run a few thousand cases from Scientific Linux 6 (SLC6)
here at CERN. Eric.
11 Jan 2013, 12:09:25 UTC · Comment


A Happy New Year
Thanks for all the support in 2012 (and before). Further delay due to a Power Cut,
a broken PC and the CERN annual closure for two weeks. Once again more detailed
information when I have recovered. So a Happy New Year and I am hoping for
an even better 2013.
6 Jan 2013, 11:39:19 UTC · Comment


Problems/Status 28th November, 2012 and PAUSE
Discovered some problems with result replication! and run out of
disk space at CERN. There will be a pause, for a few days at least,
while I investigate and resolve. (Will post details soonest to the
MB Number Crunching.) Eric
28 Nov 2012, 17:17:55 UTC · Comment


Status, Thursday 15th November
Hiccup; mea culpa. On vacation and travelling since Tuesday
and ran out of disk space in BOINC buffer at CERN :-(
I think all is OK again now after corrective actions and more work
is on the way. Sorry about that. Eric.
15 Nov 2012, 7:44:40 UTC · Comment


Status and Plans, Sunday 4th November
First service continues to run well; the first intensity scan is nearing completion with well over a million results in 15 studies successfully returned. Just a couple of hundred thousand more!
(Sadly no one study is complete but a couple are very close and I shall start post-processing and analysis soon. I am still reflecting on the thread "Number crunching; WU not being sent to another user".
This is not easy, trying to get studies complete, but keeping the system busy. I am the "feeder" and since in the end I need all the studies I am rather prioritising keeping WUs available.)

Just checked and we have over 80,000, yes eighty thousand WUs active and this is a new (recent) record.

Draft documentation of the User side is now available thanks to my colleague R. Demaria. If you are interested
[url=http://sixtrack-ng.web.cern.ch/sixtrack-ng/]SixDesk Doc[/url]
and I hope you can access it (otherwise I shall put a copy to LHC@home).

Right now I hope to try new executables with new physics on our test server and I might shortly appeal for some volunteers to help (and also to run a few more 10 million turn jobs). I do NOT want to risk the production service while it is running so smoothly.

Otherwise (At Last!) I shall start writing my paper on how to get identical results on ANY IEEE 754 hardware with ANY standard compiler
at ANY level of Optimisation. Thanks to all. Eric.
4 Nov 2012, 15:08:44 UTC · Comment


Status and Plans, Saturday 29th September, 2012
All running very smoothly indeed. Just a problem with deadline scheduling which I hope we can discuss and resolve on Monday, especially with some feedback from the BOINC meeting in London.
Also some hiccups on the CERN AFS infrastructure.
I am now hoping to prioritise the writing of my paper on numeric results reproducibility but I am continuing to run work for the next weeks as described in my new thread "Work Unit Description"
in the Message Board "Number Crunching".
I am also pondering how to best handle "very long"
jobs bearing in mind your feedback.
And of course I shall try and keep you informed.

Thank you for your continued support. Eric.

29 Sep 2012, 11:20:02 UTC · Comment


Status, Sunday 9th September, 2012.
All running well still. One user reports "Maximum Elapsed Time Exceeded" though
on several, all? of his, WUs.
Still checking for MacOS results but no
further complaints at the moment.

I present some basic info.

There have been several changes to URLs and Servers outwith my control. The correct site is http://lhcathomeclassic.cern.ch/sixtrack/
This can indeed be found easily from LHC@home and then The Sixtrack Project (rather than Test4Theory). The current server is boinc05@cern.ch.

I define "normal" WUs as 10**5/100,000 turns but remember all particles may be lost after an arbitrary number of turns, sometimes, even just a few turns at large amplitudes.
Long WUs are 10**6 or one million turns and very Long WUs
10**7 or 10 million turns, and who knows maybe one day 10**8 turns.
That depends on how the floating-point error accumulates and at which point the loss/increase of energy and loss of symplecticity invalidate the results. It will be exciting to find out.

For Functionality, Reliability and Performance.
While waiting for the LXTRACK user node and the second server for test and backup (I assume they will finally get approved!):

Functionality; adequate for the moment. It would be good to have a priority system, three levels.
1. Run first, after other Level 1.
2. Normal; queue after Level 1 and before Level 3.
3. Run only if No Level 1/2 tasks queued.
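
The three levels above can be sketched with an ordinary priority queue. This is only an illustration of the ordering rule, not of anything in BOINC or the SixDesk scripts; the class name WorkQueue and its methods are hypothetical.

```python
import heapq
import itertools


class WorkQueue:
    """Three-level queue: Level 1 first, then Level 2, then Level 3.

    Within a level, tasks run in submission order (first in, first out).
    """

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserving submission order

    def submit(self, name, level=2):
        if level not in (1, 2, 3):
            raise ValueError("level must be 1, 2 or 3")
        heapq.heappush(self._heap, (level, next(self._counter), name))

    def next_task(self):
        """Return the next task name, or None if nothing is queued."""
        if not self._heap:
            return None
        _level, _seq, name = heapq.heappop(self._heap)
        return name


queue = WorkQueue()
queue.submit("background", level=3)
queue.submit("normal-a")            # Level 2 by default
queue.submit("urgent-a", level=1)
queue.submit("normal-b")
queue.submit("urgent-b", level=1)

order = [queue.next_task() for _ in range(5)]
print(order)
# ['urgent-a', 'urgent-b', 'normal-a', 'normal-b', 'background']
```

A Level 3 task only reaches the front when no Level 1 or Level 2 task is queued, which is the behaviour described in point 3.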

I am thinking in terms of running 10**7 jobs as a series of 10**6 jobs. This requires returning and submitting more data, the fort.6 output and the checkpoint/restart files as a minimum. This would be very good additional functionality in itself.
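
As a rough sketch of the bookkeeping involved, a 10**7 turn job can be cut into consecutive 10**6 turn segments as below. The function split_turns is hypothetical and only computes the turn ranges; how each segment would pick up the checkpoint/restart files of the previous one is not shown and would depend on SixTrack itself.

```python
def split_turns(total_turns, segment_turns):
    """Split a long job into consecutive (start, end) turn ranges.

    Each range covers turns start..end-1; the last range may be shorter
    if total_turns is not a multiple of segment_turns.
    """
    if total_turns <= 0 or segment_turns <= 0:
        raise ValueError("turn counts must be positive")
    segments = []
    start = 0
    while start < total_turns:
        end = min(start + segment_turns, total_turns)
        segments.append((start, end))
        start = end
    return segments


segments = split_turns(10**7, 10**6)
print(len(segments))   # 10
print(segments[0])     # (0, 1000000)
print(segments[-1])    # (9000000, 10000000)
```

Every segment after the first would need the checkpoint returned by its predecessor before it could be submitted, so the segments form a chain rather than independent WUs.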

Reliability; pretty good but needs the backup server, LXTRACK, and less reliance on CERN AFS.
Should provide a quick test (1 or 2 minutes) to verify the node produces correct results without running the whole WU. This would not obviate result validation but would avoid wasting resources.
I could also provide a longer test on the WWW with canonical results that any volunteer could run if he suspects he has over-clocked or is getting results rejected.

Performance; pretty good now with SSE2, SSSE3, PNI or whatever.
Should implement GPU option. Should measure the cost of the numeric portability.
(Incidentally Intel are hosting a Webinar on this topic on Wednesday, but I guess it will address only Intel H/W.)

9 Sep 2012, 15:57:05 UTC · Comment


Status, 2nd September, 2012
Well all seems to be running rather well as seen from the
CERN side. So I present the topics for review on Tuesday.
1. IT report on LXTRACK proposal (to greatly improve facilities for the
physicists including more disk space and much improved reliability).
2. Proposal for a second "test" server (to test very long jobs, to try returning
the full results, without affecting the current service).
3. Project Status and open issues from the MBs:
a) More buffered work (user request).
b) Access to boinc01! Apparently some attempts to contact this obsolete service.
Could be WWW pointers or what.
c) HTTP problems, one user? (I need to send byte count and MD5 checksum.)
d) MacOS executable. Open issue; works for some people.
e) Deadline scheduling. Seems that work is deleted because volunteers fear their
contribution will be wasted. But is this true? I have 99.999% results OK but how many
WUs were not credited............
f) GPU enabled SixTrack
4. A.O.B. including Date and time for a small party and the invitation list
to celebrate recent progress and the many helpful comments and suggestions.
2 Sep 2012, 15:34:14 UTC · Comment


Status, 26th August, 2012
MacOS executable is working, for some at least.
I have queued 500,000 jobs, intensity scan,
while I clear the decks. Many thanks for all the
suggestions and comments on (very) long jobs.
26 Aug 2012, 13:47:17 UTC · Comment


Very long jobs
I am now going to submit just a few hundred very
long 10**7 turn jobs to complete two studies.
I think this will be OK now; we shall see.
22 Aug 2012, 15:48:40 UTC · Comment


Credits
Please see the Message Board Number Crunching, Thread Credits for some
hopefully good news from Igor.
20 Aug 2012, 19:15:57 UTC · Comment


Status, 19th August, 2012.
All is running rather well; over 100,000 tasks queued, and over 56,000 running. I have a bit more work prepared, but badly need to do some analysis. After some flak, we have been receiving many messages of support and also a lot of help in identifying the problem with the MAC executable.

Igor has identified and corrected the problem with Credits and is still cleaning up and trying to repair.
(This was my fault; trying to run 10**7 turn jobs taking 80 hours.
However I can report that 99% of them have completed successfully,
and others are still active.)

The Mac executable issue may even be solved, but we need to watch for the next days still.

There may be a problem with Deadlines....we shall see.

I am waiting for PC support to install my NVIDIA TESLA, memory and upgraded power supply, and Linux. I am ready to install the software next and try Tomography. There is some interest in ABP especially for existing MPI applications. We shall see.

I have STILL NOT finished the SixDesk doc or prepared the tutorial.

I take this opportunity to outline the LXTRACK system: I hope IT support could fill in the details and do it.

The justification is that AFS limitations and problems have made life very difficult.
I have used my desk side pcslux99 (thanks to Frank who donated it) as a prototype to run several hundred thousand jobs over the last few weeks.
Sadly I do not have the LSF commands like bjobs and bsub, as it is an old 32-bit machine, and I am NOT wanting to become a sysadmin again. It has almost 200GB of disk space of which I am using only 12% but increasing. Under this setup I have virtually no problems and do everything with the SixDesk scripts called from master scripts in acrontab entries.

LXTRACK should be a "standard" lxplus Virtual machine i.e. with LSF and CASTOR and SVN and AFS etc etc. BUT with at least a Terabyte of disk space NON AFS, /data, say. Only users in the AFS PTS Group boinc_users should be allowed to login.
(We could even create the /data/$LOGNAME directory for them.) How can we manage this space? Given the small number of cooperative users a script to monitor is probably adequate.
Processes should NOT be killed for exceeding CPU or real time limits.
Later, ideally, we could possibly create non_AFS buffers for communication with BOINC.
19 Aug 2012, 14:25:12 UTC · Comment


MacOS Executable
(Re-)activated MacOS executable built on MacBook PRO.
Will be watching closely for errors. Eric and Igor.
16 Aug 2012, 15:13:42 UTC · Comment


Status 12th August
All is running rather well from CERN side and I have initiated an intensity scan to run while I work a bit on the GPU. I have a real time deadline and I
must try this over the next two weeks. In spite of a couple of issues
with the CERN infrastructure I have still managed to queue over 90,000 Work Units as part of an Intensity scan (different bunch sizes and charge).

We are getting flak about credits or points. One obscene message I tried to hide, but the user said he got only
200 points for 80 hours when he expected at least 1000, and another user 62.70 points for 110 hours. So we lost a couple of volunteers, but we are also getting support with over 40,000 active Work Units.

There is also an issue with the real time deadline for my 10 million turn jobs.

I hope to fix the MAC executable next week with my colleague.
12 Aug 2012, 11:51:47 UTC · Comment


Status, 12th August
Please see the NEWS Message Board. 12 Aug 2012, 11:43:55 UTC · Comment


Status/Plans, 7th August 2012
First, many thanks for your continued support. From my/CERN side all has been running rather well and I am submerged by results.
I now need to take some time to analyse them. In particular to decide between the two methods of computing the beam - beam effect.
Then I shall probably submit several studies to do an intensity scan where I study the beam - beam effect depending on the size, and hence charge, of the accelerated bunch of particles.

At the same time, I must finish the documentation of the "user"
infrastructure so that my colleagues may easily use BOINC as they return from vacation. In addition I want to set up a dedicated "user" system "lxtrack" in order to provide disk space here and to try and keep up with the results as they are returned.

I have to look at the Deadline problem for 10**7 turn jobs.
I set a bound of 30 days for any WU....need to discuss with Igor if that is NOT what you see at home. Of course we really want a low bound to get results back quickly, but I also want to use older slower systems. We shall have to work out some sort of compromise. My attempt at 10**7 turns was probably a bit over the top, but I was keen to try it.

We hope/expect to produce a valid MAC executable this week. I also need to add some new "physics", new elements, to Sixtrack as provided by a colleague. (Also need to add modifications for "Collimation" but they are not relevant to BOINC.)
The next version should also support SSE4.1.

I was very pleasantly surprised to win an NVIDIA TESLA C2075.
The catch is that I have to use it and program it with OpenACC. There will doubtless be some hiccups installing the board and the necessary
(PGI) software. I shall in fact try my "Tomography" application which already runs in parallel using HPF or openMP. If that works I shall seriously consider a multi-threaded Sixtrack (using GPUs or not) by tracking many more particles in each Work Unit. Non-trivial but rather exciting. I am just at the ideas stage here, but.....it would of course use multiple threads on a multiple core PC as well. A dream?

Finally, I have to take time to publish my work on floating-point portability and reproducibility. I believe I might be the only person who gets identical bit for bit 0 ULP different results after many Gigaflops with 5 different Fortran compilers at different levels of optimisation.
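
For readers unfamiliar with the term, the distance in ULPs (units in the last place) between two double-precision results can be measured as sketched below. This is a generic illustration using the IEEE 754 bit layout, not the checking code used for SixTrack; the function names are illustrative and finite inputs are assumed.

```python
import struct


def ordered_bits(x):
    """Map a finite double to an integer that increases with the float value."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))
    # Doubles are sign-magnitude: for negative values, negate the magnitude
    # so that the integer ordering matches the floating-point ordering.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def ulp_distance(a, b):
    """Number of representable doubles separating a and b (0 means equal)."""
    return abs(ordered_bits(a) - ordered_bits(b))


print(ulp_distance(1.0, 1.0))             # 0
print(ulp_distance(1.0, 1.0 + 2.0**-52))  # 1
print(ulp_distance(0.1 + 0.2, 0.3))       # 1
```

Note that +0.0 and -0.0 come out at distance 0 here although their bit patterns differ, so a strict bit-for-bit comparison should compare the packed bytes directly.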



7 Aug 2012, 15:18:31 UTC · Comment


MAC executable
STOP PRESS: Trying a new prototype executable for MACs.
Built with ifort defaults on a macBook Pro (using sse3 I guess).
Eric and Igor.
16 Jul 2012, 14:29:09 UTC · Comment


Server status
My colleague has cleaned database and I think that is the end of http errors etc etc.
I have submitted new work and I am always getting results anyway. There is still a whole
bag of worms around sse2 sse3 ssse3 pni and whatever, not helped by Intel's ifort
refusal to run optimised code on non-Intel hardware.

Igor has much improved version distribution and some people are getting "PNI"
versions. The important thing is that SSE2 upwards is much faster than the generic
version. Don't want to waste resources. All versions are completely numerically
portable (I hope so) but when panic is over I shall be looking at all rejected results
as I believe they are due to hardware failures (over-clocked?).

If all goes well I shall try and issue an update to whatever happened to lhc@home
this weekend.

In the meantime someone has changed the WWW pages, or whatever and I don't even know if you
can read this. All my bookmarks failed and usual start page NOT available.

Eric (from his new super MAC notebook pro, bought at great personal expense,
but have never had the time to set up. I am going to try and install BOINC now.)
11 Jul 2012, 17:58:09 UTC · Comment


Server/Executable problems
An exciting day; a new particle and maybe even the Higgs boson itself.

We have been busy preparing new executables for BOINC, including a MAC
executable.

Sadly we have run out of disk space and there are likely to be some hiccups
for the next few hours, hopefully not longer. We have three new executables for
both Windows and Linux: run anywhere, use SSE2, use SSE3. The run anywhere is
slow but every little helps. The executable for MAC requires at least SSE3 I
believe and the exact requirements are not well understood as I write.
I am currently running tests on as many types of hardware as I can.

The disk full situation can cause havoc and certainly explains why you have
not been able to get more work for the last hours.
More news as soon as we make some progress.

Thanks for your continued support which will help to make an even better
LHC for 2015. Eric.
4 Jul 2012, 12:05:36 UTC · Comment


Sixtrack server migration today
Dear volunteers,

The Sixtrack BOINC project has been migrated to a new server today. If you should encounter any difficulties with the setup, please detach from the project and attach again.

BOINC and Sixtrack should be fully operational again from 2PM CET. (12:00 UTC)


Best regards, the BOINC service team.
5 Jun 2012, 10:55:57 UTC · Comment




News is available as an RSS feed