Thread 'Tasks of batch 12577096 have 200 Events'

Author	Message
Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,686 RAC: 1,547	Message 33066 - Posted: 15 Nov 2017, 12:06:50 UTC Tasks of batch mc16_13TeV 140_CVetoBVeto.simul (12577096) have 200 Events.

maeax Joined: 2 May 07 Posts: 2285 Credit: 178,823,324 RAC: 773	Message 33068 - Posted: 15 Nov 2017, 20:11:33 UTC Doing more than 25 events in one task is the old discussion about making this selectable in the preferences. The native app has no problem doing more than 25 events. I have had tasks with 100 events (also on Windows).

David Cameron Project administrator Project developer Project scientist Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0	Message 33069 - Posted: 16 Nov 2017, 8:25:44 UTC - in response to Message 33066. "Tasks of batch mc16_13TeV 140_CVetoBVeto.simul (12577096) have 200 Events." The previous task 12515739 had 50 events and the WUs were finishing very quickly, so the efficiency was not very good. We therefore asked for more events in the new tasks in order to have longer WUs. This means the overall amount of data to download is lower, but you have to upload 200 MB at the end of each WU.

maeax Joined: 2 May 07 Posts: 2285 Credit: 178,823,324 RAC: 773	Message 33074 - Posted: 17 Nov 2017, 15:08:19 UTC Task mc16_13TeV 140_CVetoBVeto.simul (12577096): 11007/37300. David, is simulation 12577096 going well for most volunteers? On my computers (always the same configuration) the Cobblestones are increasing now.

David Cameron Project administrator Project developer Project scientist Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0	Message 33139 - Posted: 26 Nov 2017, 11:54:22 UTC Last modified: 26 Nov 2017, 11:54:44 UTC I've added info on the number of events per WU to the task info on http://atlasathome.cern.ch/

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,686 RAC: 1,547	Message 33140 - Posted: 26 Nov 2017, 13:30:40 UTC - in response to Message 33139. "I've added info on the number of events per WU to the task info on http://atlasathome.cern.ch/" Thanks David!

AuxRx Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0	Message 33247 - Posted: 8 Dec 2017, 14:43:06 UTC - in response to Message 33139. Yesterday I noticed that I was missing the progress bars. Since the website is being overhauled, could this information be integrated into atlas_job.php so that all information is available on one page? At the very least we need a sticky with all project-related links so new users can find this "hidden" information.

csbyseti Joined: 6 Jul 17 Posts: 22 Credit: 29,430,354 RAC: 0	Message 33297 - Posted: 13 Dec 2017, 8:53:42 UTC It would be fine with me if the Atlas tasks were as big as possible. The reason is that Atlas (or LHC) is an SSD killer. On this machine, a Ryzen 1700 with 32 GB RAM and an 850 Evo 250 GB used only for Boinc, I have 0.7 TBW on the system disk and 20.5 TBW on the Boinc disk. It was built in the first week of August 2017 and has not been running LHC work units all the time. So the warranty value of 80 TBW will be reached in 24 months or earlier. The small work units run for only 1 hour (3 cores) and often produce a big 200 MB download. With 5 tasks running at the same time, more than 1 GB of downloads is written in 1 hour. Bigger WUs would greatly reduce the amount of download data written.
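The wear estimate in the post above can be checked with a few lines of arithmetic. This is only a rough sketch: the build date of 1 August 2017 is an assumption standing in for "first week of August 2017", and the write rate is assumed to stay constant, although the post says the machine was not running LHC work units all the time.

```python
from datetime import date

# Figures quoted in the post above.
TBW_WRITTEN = 20.5      # TB written to the Boinc SSD so far
TBW_WARRANTY = 80.0     # warranty limit in TB written
POSTED = date(2017, 12, 13)

# Assumption: "first week of August 2017" is taken as 1 August 2017.
BUILT = date(2017, 8, 1)


def days_until_warranty_limit(written_tb, limit_tb, built, today):
    """Return (TB per day, total days from build until the limit is hit),
    assuming the average write rate so far continues unchanged."""
    days_running = (today - built).days
    rate = written_tb / days_running
    return rate, limit_tb / rate


rate_tb_per_day, total_days = days_until_warranty_limit(
    TBW_WRITTEN, TBW_WARRANTY, BUILT, POSTED
)

# Write load from result files alone: 5 tasks x 200 MB per hour,
# as described in the post, assumed to run around the clock.
download_tb_per_day = 5 * 200 * 24 / 1_000_000  # MB -> TB (decimal units)

print(f"average write rate: {rate_tb_per_day * 1000:.0f} GB/day")
print(f"warranty limit reached about {total_days / 30.44:.1f} months after build")
print(f"200 MB files alone: {download_tb_per_day * 1000:.0f} GB/day")
```

Under these assumptions the limit would be reached in clearly less than 24 months, which matches the "or earlier" in the post. The 200 MB files account for only part of the measured write rate.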

AuxRx Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0	Message 33308 - Posted: 13 Dec 2017, 12:00:06 UTC - in response to Message 33297. I am concerned that longer WUs (and some 200-event WUs already fall into this 8h+ category) will increase the risk of failing WUs. As long as WUs cannot be stopped and continued reliably and disconnects kill a WU (I am disconnected every 24h), I would ask not to increase the run time. In its current form, failing a WU can cost a third of my system's daily run time. To mitigate the TBW issue I would recommend using HDD space where available and increasing the checkpoint interval. With each event taking around 4 minutes to compute, save points can be spread out even further. AFAIK the underlying cause is not the 200 MB downloads but compute data exceeding 4 GB per VM. Maybe increasing the core count per task could further lighten the load on storage. Please correct me if I am misinterpreting the numbers.

Erich56 Joined: 18 Dec 15 Posts: 1957 Credit: 158,795,515 RAC: 53,181	Message 33310 - Posted: 13 Dec 2017, 12:53:38 UTC - in response to Message 33308. "The reason is that Atlas (or LHC) is a SSD killer." "In mitigation of the TBW issue I would recommend using HDD space where available" That's why all the PCs on which I run various LHC projects have an HDD (besides the actual system SSD), in one case even an external USB-3 HDD, in order to avoid this problem. The other day, for example, I noticed very late that, due to a server problem, CMS tasks only ran for about 10-12 minutes and then finished unsuccessfully, each time building a VDI image of about 3 GB. If you run several such tasks concurrently, and this goes on for 20 or 30 hours, you can imagine what this means in terms of TBW.
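To put a rough number on the failure scenario described above, the following sketch multiplies out the quoted figures. The count of 4 concurrent tasks is an assumption standing in for "several", and the model assumes each failed attempt writes the full image of about 3 GB and is immediately followed by the next attempt.

```python
def failed_task_writes_gb(concurrent_tasks, image_gb, minutes_per_attempt, hours):
    """Estimate GB written when every task attempt creates a fresh VM image
    and then fails after `minutes_per_attempt` minutes."""
    attempts_per_slot = (hours * 60) / minutes_per_attempt
    return concurrent_tasks * attempts_per_slot * image_gb


# Figures from the post: ~3 GB image, 10-12 minute runs, 20-30 hours.
# Assumption: 4 concurrent tasks stands in for "several".
low = failed_task_writes_gb(4, 3, 12, 20)
high = failed_task_writes_gb(4, 3, 10, 30)
print(f"roughly {low / 1000:.1f} to {high / 1000:.1f} TB written")
```

With other task counts the result scales linearly, so the function can be called again with the number of tasks actually running on a given host.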