Message boards : Number crunching : Is the user base/project participants growing a bit too large, for our server?

Nuadormrac

Joined: 26 Sep 05
Posts: 85
Credit: 421,130
RAC: 0
Message 12925 - Posted: 4 Mar 2006, 0:31:01 UTC

When trying to upload some results, I got a "project is down" message in BOINC:

3/3/2006 5:23:06 PM|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
3/3/2006 5:23:06 PM|LHC@home|Reason: Requested by user
3/3/2006 5:23:06 PM|LHC@home|Reporting 2 results
3/3/2006 5:23:16 PM|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
3/3/2006 5:23:16 PM|LHC@home|Message from server: Server can't open database
3/3/2006 5:23:16 PM|LHC@home|Project is down


Looking further, I got this from the front page:

Server Status

Database overload - please hold connections


I wonder if we're beginning to outgrow the project server here, or if some scripts are being run that are loading it down?
ID: 12925

[B^S] Molzahn
Joined: 21 Jan 06
Posts: 46
Credit: 174,756
RAC: 0
Message 12926 - Posted: 4 Mar 2006, 2:20:25 UTC
Last modified: 4 Mar 2006, 2:37:43 UTC

Hey Son Goku,

By no means am I an authority on servers or the LHC server/network.

I had the same message and I would assume it is related to adding new work today or some form of (minor) failure or maintenance. From what I have seen, LHC has been adding new work over the day in small increments, and the concurrent connections haven't reached a high number.

Anyhow if you look at the LHC stats the users have increased by about 5k since late December of last year... So your assertion is indeed plausible. These are just my two cents.
users graph

On a side note, there are pictures of the LHC BOINC computer farm here; it looks pretty big to me. I would assume their connection and computing power could handle things as they are for the moment. Not sure if this is all for distributed computing but the title of the folder is BOINC workshop

Hopefully someone can answer your question with more insight; maybe I was helpful :)
Mike

blog pictures
ID: 12926

senatoralex85
Joined: 17 Sep 05
Posts: 60
Credit: 4,221
RAC: 0
Message 12927 - Posted: 4 Mar 2006, 5:41:53 UTC - in response to Message 12926.  


Interesting. According to the graph, it looks like the project was closed to new users between November and January of last year, because no users signed up. I never knew it was closed... Did anyone else know this?

ID: 12927

[B@H] Ray
Joined: 13 Jul 05
Posts: 82
Credit: 6,336
RAC: 0
Message 12928 - Posted: 4 Mar 2006, 5:54:58 UTC - in response to Message 12927.  
Last modified: 4 Mar 2006, 6:01:52 UTC


Interesting. According to the graph, it looks like the project was closed to new users between November and January of last year, because no users signed up. I never knew it was closed... Did anyone else know this?


Yes, I started with BOINC in June, but LHC was closed to new members. In mid July they opened it up again. Before July they did not have enough work for more members. After I joined in July there was steady work for a few months, but it was on and off after that. Now when they get another study, it goes by much faster with more people crunching.

EDIT: May I have one row of computers from this one

Pizza@Home - Rays Place - Rays place Forums
ID: 12928

Gaspode the UnDressed
Joined: 1 Sep 04
Posts: 506
Credit: 118,619
RAC: 0
Message 12930 - Posted: 4 Mar 2006, 12:55:11 UTC - in response to Message 12925.  


Server Status

Database overload - please hold connections


I wonder if we're beginning to outgrow the project server here, or if some scripts are being run that are loading it down?


There's a new batch of work available. Over the space of a couple of hours every machine will now try and fill its queue, and this will cause a spike in the demands on the project servers. Once the queues start to fill the demand will fall back again. If we had a steady stream of work the servers could cope with a much higher population. With the constant peaks and troughs we will get this from time to time.
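To put rough numbers on the peaks-and-troughs point, here is a minimal Python sketch comparing the average request rate when every host refills its queue within a couple of hours against the rate when the same requests are spread over a week. The host count and requests per host are made-up illustrative values, not LHC@home figures, and `average_request_rate` is a hypothetical helper.

```python
def average_request_rate(hosts, requests_per_host, window_minutes):
    """Average scheduler requests per minute if all hosts finish within the window."""
    return hosts * requests_per_host / window_minutes


# Assumed, illustrative values only (not measured LHC@home data).
HOSTS = 10_000
REQUESTS_PER_HOST = 4

# Everyone refills within two hours versus the same load spread over a week.
burst = average_request_rate(HOSTS, REQUESTS_PER_HOST, window_minutes=2 * 60)
steady = average_request_rate(HOSTS, REQUESTS_PER_HOST, window_minutes=7 * 24 * 60)

print(f"burst:  {burst:.1f} requests/minute")
print(f"steady: {steady:.1f} requests/minute")
print(f"ratio:  {burst / steady:.0f}x")
```

Whatever host count is plugged in, the ratio depends only on the two window lengths (one week against two hours here), which is consistent with the idea that a steady stream of work would let the same servers carry a much larger population.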


Gaspode the UnDressed
http://www.littlevale.co.uk
ID: 12930

Morgan the Gold
Joined: 18 Sep 04
Posts: 38
Credit: 173,867
RAC: 0
Message 12931 - Posted: 4 Mar 2006, 13:26:14 UTC

Server Status

Up, 59829 workunits to crunch
57975 workunits in progress
41 concurrent connections


the 'in progress' is up now
the servers just have fits at the onset of new work
ID: 12931

Toby
Joined: 1 Sep 04
Posts: 137
Credit: 1,697,025
RAC: 447
Message 12942 - Posted: 6 Mar 2006, 4:08:27 UTC - in response to Message 12927.  

Interesting. According to the graph, it looks like the project was closed to new users between November and January of last year, because no users signed up. I never knew it was closed... Did anyone else know this?



I don't know if they were closed or not; however, that graph uses the same data source as mine does (see here). Both are based on the XML dumps generated every day by the projects and (here is the important part) these XML dumps only export users who have more than 0 credit. So even if people were signing up during December, there was no work, so none of the new users could get credit and be listed in the XML.
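A small Python sketch of that export rule may make the effect on the graph easier to see. It is only an illustration: the element names, the `export_users_xml` and `count_exported` helpers, and the sample users are all assumptions, not the real export code or the real dump layout.

```python
import xml.etree.ElementTree as ET


def export_users_xml(registered):
    """Build a toy XML dump that lists only users with more than 0 credit."""
    root = ET.Element("users")
    for user in registered:
        if user["total_credit"] > 0:  # zero-credit accounts are skipped
            node = ET.SubElement(root, "user")
            ET.SubElement(node, "name").text = user["name"]
            ET.SubElement(node, "total_credit").text = str(user["total_credit"])
    return ET.tostring(root, encoding="unicode")


def count_exported(xml_text):
    """Count the users a stats site would see in the dump."""
    return len(ET.fromstring(xml_text).findall("user"))


# Hypothetical accounts: the newcomer signed up while there was no work.
registered = [
    {"name": "veteran", "total_credit": 421130.0},
    {"name": "regular", "total_credit": 4221.0},
    {"name": "newcomer", "total_credit": 0.0},
]

dump = export_users_xml(registered)
print(len(registered), "registered,", count_exported(dump), "visible in the dump")
```

A stats site that only reads the dump would count two users here, so a flat line on the graph cannot by itself distinguish "closed to new users" from "open, but no work to earn credit on".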

- A member of The Knights Who Say NI!
My BOINC stats site
ID: 12942

[B@H] Ray
Joined: 13 Jul 05
Posts: 82
Credit: 6,336
RAC: 0
Message 12943 - Posted: 6 Mar 2006, 5:19:48 UTC - in response to Message 12942.  
Last modified: 6 Mar 2006, 5:21:23 UTC

Interesting. According to the graph, it looks like the project was closed to new users between November and January of last year, because no users signed up. I never knew it was closed... Did anyone else know this?



I don't know if they were closed or not; however, that graph uses the same data source as mine does (see here). Both are based on the XML dumps generated every day by the projects and (here is the important part) these XML dumps only export users who have more than 0 credit. So even if people were signing up during December, there was no work, so none of the new users could get credit and be listed in the XML.

Yes, they were closed to new members. I don't have the date that it closed, but it reopened to new members on 13 July 2005. See the Old News for details.


Pizza@Home - Rays Place - Rays place Forums
ID: 12943

River~~
Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 12946 - Posted: 6 Mar 2006, 8:32:39 UTC - in response to Message 12943.  
Last modified: 6 Mar 2006, 8:52:56 UTC

...
Yes, they were closed to new members. I don't have the date that it closed, but it reopened to new members on 13 July 2005...


and looking at your 'Joined' date Ray, I guess you were hovering around the closed door just waiting for it to swing open. It's good to see I wasn't the only one with such enthusiasm ;-)


Edit: looking back at the old news, the door was only open for five days that time; they closed it again at 8000 users, then re-opened to allow just another 2000 users on. I am not sure when (if?) the door was left permanently open.

Unlike some projects (think Rosetta) this project has never needed/wanted to go for the maximum number of crunchers. By keeping the number down to the level where work is turned round in about a week they get the performance their designers can usefully use and can afford to save on database resources. This is a case of matching the user base to the workload, so of course Rosetta and LHC will have different optimal user bases.

At present there is a historically low workload, and the bulk of the processing has already been done. I'd guess the only reason the door is still open is to counter the drift away as some crunchers leave the project now that the work is sporadic. Either that or it has been left as it is on the principle of "don't fix what is already working", in which case as soon as the database or servers show some strain based on numbers of active crunchers we'd expect to see the door closed again.

I agree with earlier postings: what we saw recently was not overload based on the number of registered users, but the rush for the door when the work came back on. The backoff algorithms written into the client deal with this quite adequately, provided people do not try to get an unfair advantage by using the "update now" button to circumvent the cutoff.
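For readers unfamiliar with the idea, the following is a minimal sketch of a generic exponential backoff with random jitter, written in Python. It is not the actual BOINC client code; the function name `backoff_delay`, the one-minute base, and the four-hour cap are all assumptions chosen for illustration.

```python
import random


def backoff_delay(failures, base=60.0, cap=4 * 3600.0, rng=random):
    """Seconds to wait after `failures` consecutive failed requests.

    The ceiling doubles with each failure up to `cap`; the actual delay is
    drawn at random from the upper half of that range so that many clients
    failing at the same moment do not all retry at the same moment.
    """
    ceiling = min(cap, base * (2 ** failures))
    return rng.uniform(ceiling / 2, ceiling)


# Example: delays after successive failures (values vary from run to run).
for failures in range(6):
    print(failures, round(backoff_delay(failures)), "seconds")
```

The design point is that each failed request pushes the next attempt further out and randomises it, which spreads a rush of clients over time. Forcing an immediate retry by hand bypasses exactly that spreading.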
ID: 12946

anarchic teapot
Joined: 15 Feb 06
Posts: 67
Credit: 460,896
RAC: 0
Message 12949 - Posted: 6 Mar 2006, 14:54:37 UTC - in response to Message 12925.  

I wonder if we're beginning to outgrow the project server here, or if some scripts are being run that are loading it down?


I've just joined. It's probably all my fault.

/me feels very sorry for the poor b'stard who had to heave all those ATX-tower PCs into place.

sQuonk
Plague of Mice
Intel Core i3-9100 CPU@3.60 GHz, but it's doing its bit just the same.
ID: 12949

[B@H] Ray
Joined: 13 Jul 05
Posts: 82
Credit: 6,336
RAC: 0
Message 12950 - Posted: 6 Mar 2006, 15:34:23 UTC

River~~
I was just looking to see what would be good to join when they opened it up in July, looks like I found it open the same day as you. Really I heard that it was opening over at Boinc Synergy and joined.

That opening and closing left a lot of people waiting at that time, but we had work almost all the time until around October. Looks like they could close it to new members again, with the databases being used to their max now. I think this is the only production project that has limited the number of users. Maybe we will see more that have to do it in the future.
Ray

Pizza@Home - Rays Place - Rays place Forums
ID: 12950

River~~
Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 12951 - Posted: 6 Mar 2006, 19:32:01 UTC - in response to Message 12950.  

... I think this is the only production project that has limited the number of users...


That is right. It is also, I think, the production project with the smallest total intended work. We are already well beyond the end of the original LHC@home proposal. CERN hedged their bets on that one - there was a back-up plan in case the BOINC project did not work out, and they could have done the essentials of the LHC design in-house, but for more money and less precise predictions.

Even with the latest extensions to the LHC project, in terms of total numbers of cpu ops LHC remains the smallest of all the projects so far.

Most other projects are fairly open-ended: SETI or Einstein will keep on collecting new data for processing, and Rosetta wants to get to SETI size before they even think about how big they eventually want to be.

LHC is the first of the medium scale DC projects - small enough that the original project would have been possible in house, but large enough to be greatly improved by the extra resources from the net. As DC becomes more respectable, we can expect to see growth in both directions - more enormous projects but also many many more small "limited edition" projects. Or that is my guess, anyway.

River~~

ID: 12951

[B^S] Molzahn
Joined: 21 Jan 06
Posts: 46
Credit: 174,756
RAC: 0
Message 12953 - Posted: 6 Mar 2006, 19:58:51 UTC - in response to Message 12951.  
Last modified: 6 Mar 2006, 20:00:38 UTC

Hey River~~,

I have a quick question for you about the limited nature of LHC@Home.

I have been wondering if the project will end come 2007 once the LHC is complete and operational.

There is a discussion from August of last year about the end of LHC, and someone made the assertion that LHC wouldn't stop LHC@Home because they have free distributed computing.

But from your post and what I gather from this Wiki quote:

"This accelerator will generate vast quantities of computer data, which CERN will stream to laboratories around the world for distributed processing. In April 2005, a trial successfully streamed 600MB per second to seven different sites across the world. If all the data generated by the LHC is to be analyzed, then scientists must achieve triple this before 2007."

CERN will probably shut down the BOINC project and rely on supercomputers and networks at universities, laboratories, and other collaborative participants to analyze data. (Not BOINC)

Would you agree with this assumption?

Or would you think that they would rely on BOINC's free DC to help in some form?

Sorry, I know this has been discussed, but I just can't imagine that the project will continue (it seems you agree), although past posts seem to think it will.

Wondering what your opinion (or anyone else's) is,

Mike

blog pictures
ID: 12953

FalconFly
Joined: 2 Sep 04
Posts: 121
Credit: 592,214
RAC: 0
Message 12954 - Posted: 6 Mar 2006, 20:18:42 UTC - in response to Message 12953.  

Hm, I'm not sure if they have to pay for their Access to other Institution's Computing assets (I think so, however, at least they'll be in competition with other Projects for the precious Slot times if they take other Institution's shared resources).

I believe that they have to use them for lightning fast storage, afterwards (considering the power we as a community demonstrated ) we might be a faster, more reliable and more hassle-free alternative.

I mean, who needs expensive Supercomputers when we can do better by all means?
Scientific Network : 45000 MHz - 77824 MB - 1970 GB
ID: 12954

River~~
Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 12966 - Posted: 8 Mar 2006, 18:58:17 UTC - in response to Message 12953.  

Hey River~~,

I have a quick question for you about the limited nature of LHC@Home.
...
CERN will probably shut down the BOINC project and rely on supercomputers and networks at universities, laboratories, and other collaborative participants to analyze data. (Not BOINC)

Would you agree with this assumption?
...


CERN don't analyse data, they allocate spaces on the machines for experimental groups which are usually international collaborations of universities.

The design work on the machines is not done by the CERN IT department either, but by the beam physics teams. They buy in IT resources from the CERN IT department (essentially a transfer of internal budget).

There is in principle nothing to stop any of the experimental groups that use the LHC from buying in access to the LHC@home project - what they'd be buying would be the experience Chrulle and colleagues have got from running this project.

There is also in principle nothing to stop any collaboration from starting its own DC project to do the data analysis, whether based on BOINC or not.

My best guess is that they will not do either, but will stick to the tried and tested supercomputers that they already have. The beam physicists had a fall-back plan in case the BOINC effort flopped, and they had a good lead time as the computing was done well ahead of the engineering. From the perspective of an experimental team, if BOINC flopped and there was a significant delay, that could lose that team the scientific precedence. My guess (and it is only a guess) is that they will not want to take the risk.

A few years on, when DC is even more established, and when results (including the LHC@home results) are more widely known by funding bodies, then I think we will start to see it happen. The finding of gravity waves by the Einstein BOINC project, or some similar breakthrough on any other pure science project, would also make it more likely to happen.

You asked for my opinion -- that is all this posting is. I do not have any inside knowledge and I will be delighted if I am proved wrong by some group of experimentalists taking over the LHC network for their crunching.

However, even if the experimentalists never use LHC@home, I do not see it ever being switched off. Wound down, yes, wound up, no.

There will always be the desire by the beam physics folks to keep the resource available as a backup. Then, if perhaps two or three magnets have a catastrophic failure they will have the computer power to work out how to re-deploy what is left. If the experimentalists come along with some new unexpected demand (this happens!) then the more computer power on tap the more likely it is that the beam physicists can deliver. I therefore predict that LHC@home may get to the stage where it is months between having work available - 6 months maybe with no work, but then a burst will come along. By then surely at last all the remaining participants will be running other projects but keeping this in reserve.

River~~

ID: 12966

Kaal
Joined: 7 Nov 05
Posts: 19
Credit: 248,179
RAC: 0
Message 12969 - Posted: 9 Mar 2006, 0:48:01 UTC
Last modified: 9 Mar 2006, 0:48:54 UTC

Consider also that CERN and the EU have been dumping funds into the Datagrid (http://public.eu-egee.org/) for the immediate future. It's planned to be up to speed by the time the LHC comes online. If it fails in its expected throughput, there is the off chance that LHC@home will be poked to see if it can pick up the slack (after all, if the application is grid aware, it might be preferable to have the last few results a little slower in reporting) while the next experiment is using 'the grid'...
I concur with River~~ that LHC@home won't be canned, as it is too valuable a potential resource.
What if a new experiment needs to be put in place? The LHC can't be shut down for years while tests are performed. Models will have to be run, and within budget constraints, to be sure.
No, LHC@Home won't disappear but its use may be small and/or intermittent.

In terms of BOINC and other projects, right now the highest profile seems to be CPDN, and with the press coverage that is generating it may only be a couple of years before DC/BOINC is considered to be a standard option for utility computing.

There's gonna be work for quite some time to come between the various projects and it's entirely possible that the LHC or any successor will supply that work.
ID: 12969

Chrulle
Joined: 27 Jul 04
Posts: 182
Credit: 1,880
RAC: 0
Message 12970 - Posted: 9 Mar 2006, 12:51:01 UTC
Last modified: 14 Mar 2006, 16:12:47 UTC

Since you all seem interested here is a little progress report and a look at the future of lhc@home.

Servers:
We have received our new LHC@home servers: two new dual 3.0 GHz machines. Unfortunately there is a problem with the disks. The disk IDs are reordered during the installation process, so that after a reboot the system will try to boot from the wrong disk. This can be fixed manually, but it is a problem for the automatic system at CERN. Whenever a security upgrade is needed, or some hardware has failed and been replaced, the system does a complete reinstall of the server's system disk. It would then require manual interaction to get it to run again. We are hoping to have this problem solved soon.

Sixtrack:
Sixtrack will continue to run with many small studies. Eric MacIntosh is looking into porting a new version of sixtrack to BOINC. This new version will have new physics modelling and beam collimation.


New Apps:
Garfield, a drift chamber simulation, has been ported to BOINC and tested. At the moment it is running on the 100 recycled farm servers we were given by the computer centre. We are now waiting for an official request for resources from ATLAS and other experiments. If we have enough requests we will look into porting Garfield to Windows and moving it to LHC@home.
Pythia and ATLFast have also been ported to BOINC.
Pythia is an event generator used by all the experiments at CERN. The idea was to use it in LHC@home to create a database of good noise[1]. ATLFast is a fast and reduced version of the full ATLAS simulation software. This could be used to do preliminary event reconstruction on LHC@home once the LHC is running.

Africa@home:
The Malariacontrol project has been spun off from CERN. The servers have been relocated to the University of Geneva and to STI in Basel for the beta phase. To learn more about the project go to Africa@home. We are now looking for other projects to run within the Africa@home project.

The future:
Sixtrack will continue to run for the time being and we hope to launch more applications. LHC@home will continue also after the LHC starts running, whether it will run sixtrack, one of the new apps or a mixture has not been decided yet.
We would like to have some summer students work on LHC@home related projects, so if you are a CS student and interested, have a look at the student programme. On a more personal note, I will leave CERN at the end of the month. I do not yet know who will take over the day-to-day server administration, but I expect Ben will look in on the message boards from time to time.

[1] Good noise is important for the physicists. They have to pick out the one interesting interaction from a bunch of other events and maybe even outside sources like cosmic radiation. They therefore need to tune their simulations using generated noise.
Chrulle
Research Assistant & Ex-LHC@home developer
Niels Bohr Institute
ID: 12970

lpoorman
Joined: 24 Jul 05
Posts: 18
Credit: 4,236
RAC: 0
Message 12971 - Posted: 9 Mar 2006, 14:37:16 UTC - in response to Message 12970.  

Hey Chrulle,

Your "little" progress report is very much appreciated! I think it is tremendous that people such as Jukka Klem and yourself take the time to keep us abreast of some of the things that are going on in connection with LHC@Home.

The ATLFast sounds, at least to me, like a very exciting project and I can only hope that the project leaders do decide to farm some of it out to the LHC@Home users.

Lastly, thank you very, very much for being such an active contributor to the bulletin boards and for supplying us with so much valuable information. I wish you all the best in your new endeavors, wherever and whatever they may be.
ID: 12971

Ben Segal
Volunteer moderator
Project administrator
Joined: 1 Sep 04
Posts: 140
Credit: 2,579
RAC: 0
Message 12972 - Posted: 9 Mar 2006, 16:27:27 UTC

I'd like to confirm from "inside the house" that LHC@home is definitely not going away.

The reason for the rather intermittent submission of work these days is that the accelerator design guys are taking their time looking at returned results and figuring out what new cases need computing. They continue to inform us that they will be submitting large amounts of work in the near future.

The great value of the LHC@home installation is widely recognized at CERN. Setting it up has been a major investment of effort and learning for us, and apart from representing a huge CPU potential, it has also led to several beneficial side effects such as the work done last year on achieving identical results on all volunteer platforms. As Chrulle has written here, other groups at CERN are now looking seriously at BOINC applications for their work.

As far as the question "Is the user base/project participants growing a bit too large, for our server?" goes, we are now assured of a very solid and adequate hardware platform for LHC@home in the CERN Computer Centre. As several of you have pointed out, the current participants are numerous and effective enough to provide all the capacity we can use, even though we are indeed still the smallest of all BOINC production projects to date. We hope to be able to provide work for all our faithful followers over the next months - at least enough to keep you faithful after all the help you have given us so far.

Thanking you all again,

Ben Segal / LHC@home Coordinator
ID: 12972

[B^S] Molzahn
Joined: 21 Jan 06
Posts: 46
Credit: 174,756
RAC: 0
Message 12973 - Posted: 9 Mar 2006, 18:38:23 UTC - in response to Message 12972.  
Last modified: 9 Mar 2006, 18:54:43 UTC

Chrulle & Ben Segal,

Thank you both for the update.

Most importantly, thank you for your hard work and the many hours you have dedicated to a project that (I am sure) all of us "crunchers" are happy to work on.

-Mike

Post Script: River~~, thanks for responding. I was glad to read your thoughts (and everyone else's).
:)

blog pictures
ID: 12973


©2024 CERN