Tasks of batch 12577096 have 200 Events
Joined: 14 Jan 10 | Posts: 1417 | Credit: 9,441,837 | RAC: 794
Tasks of batch mc16_13TeV 140_CVetoBVeto.simul (12577096) have 200 Events.
Joined: 2 May 07 | Posts: 2242 | Credit: 173,902,375 | RAC: 2,454
Doing more than 25 events in one task is the old discussion about making this selectable in the preferences. For the native app it is no problem to do more than 25 events; I have tasks with 100 events (also on Windows).
Joined: 13 May 14 | Posts: 387 | Credit: 15,314,184 | RAC: 0
Tasks of batch mc16_13TeV 140_CVetoBVeto.simul (12577096) have 200 Events. The previous task 12515739 had 50 events and the WUs were finishing very quickly, so the efficiency was not so good. We therefore asked for more events in the new tasks in order to have longer WUs. This means the overall data to download is lower, but you have to upload 200 MB at the end of each WU.
Joined: 2 May 07 | Posts: 2242 | Credit: 173,902,375 | RAC: 2,454
Task mc16_13TeV 140_CVetoBVeto.simul (12577096): 11007/37300. David, is simulation 12577096 going well for most volunteers? For my computers (always the same configuration) the cobblestones are growing now.
Joined: 13 May 14 | Posts: 387 | Credit: 15,314,184 | RAC: 0
I've added info on the number of events per WU to the task info on http://atlasathome.cern.ch/
Joined: 14 Jan 10 | Posts: 1417 | Credit: 9,441,837 | RAC: 794
I've added info on the number of events per WU to the task info on http://atlasathome.cern.ch/ Thanks David!
Joined: 16 Sep 17 | Posts: 100 | Credit: 1,618,469 | RAC: 0
I noticed yesterday that I would miss the progress bars. Since the website is being overhauled, could this information be integrated into atlas_job.php so as to make all information available on one page? At the very least we need a sticky with all project-related links so new users can find this "hidden" information.
Joined: 6 Jul 17 | Posts: 22 | Credit: 29,430,354 | RAC: 0
It would be fine for me if the ATLAS tasks were as big as possible. The reason is that ATLAS (or LHC) is an SSD killer. On this machine, a Ryzen 1700 with 32 GB RAM and an 850 Evo 250 GB used only for BOINC, I got 0.7 TBW on the system disk and 20.5 TBW on the BOINC disk. It was built in the first week of August 2017 and has not been running LHC work units all the time. So the warranty value of 80 TBW will be reached in 24 months or earlier. The small work units run only 1 hour (3 cores) and often produce a big 200 MB download. With 5 tasks running at the same time, more than 1 GB of downloads is written in 1 hour. Bigger WUs would reduce the amount of downloaded data written very much.
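To make the wear-rate arithmetic above explicit, here is a small Python sketch of the same projection. The per-hour download figure and the TBW numbers are taken from the post; the elapsed-months value is only an assumption for illustration, since the post does not state its own date.

    # Rough projection of SSD wear from the figures quoted above (a sketch, not a measurement).
    tasks_running = 5            # concurrent ATLAS tasks (from the post)
    download_gb_per_task = 0.2   # ~200 MB written per task per hour, as quoted in the post
    gb_per_hour = tasks_running * download_gb_per_task
    print(f"downloads written: ~{gb_per_hour:.1f} GB/hour")

    written_tb = 20.5            # TBW reported on the BOINC disk so far (from the post)
    warranty_tb = 80.0           # warranty limit of the 850 Evo 250 GB (from the post)
    months_elapsed = 6.0         # ASSUMPTION: months since the August 2017 build
    tb_per_month = written_tb / months_elapsed
    print(f"observed wear: ~{tb_per_month:.1f} TB/month")
    print(f"warranty TBW reached after ~{warranty_tb / tb_per_month:.0f} months")

With these assumed numbers the projection lands at roughly two years, consistent with the "24 months or earlier" estimate in the post.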
Joined: 16 Sep 17 | Posts: 100 | Credit: 1,618,469 | RAC: 0
I am concerned that longer WUs (and some 200-event WUs already fall into this 8h+ category) will increase the risk of failing WUs. As long as WUs cannot be stopped and continued reliably and disconnects kill a WU (I am disconnected every 24 h), I would ask not to increase the run time. In its current form, a failing WU can cost a third of my system's daily run time. To mitigate the TBW issue I would recommend using HDD space where available and increasing the checkpoint interval. With each event taking around 4 minutes to compute, save points can be spread out even further. AFAIK it is not the 200 MB downloads but compute data exceeding 4 GB per VM that is the underlying cause. Maybe increasing the core count per task could further lighten the load on storage. Please correct me if I am misinterpreting the numbers.
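As a concrete sketch of the "increase checkpoint time" suggestion: BOINC's computing preferences include a minimum interval between checkpoints to disk (disk_interval). Assuming one wants to set it locally, a global_prefs_override.xml in the BOINC data directory could look like the snippet below (read in via the manager's "Read local prefs file"); the 600-second value is only an illustrative assumption, not a project recommendation.

    <global_preferences>
       <!-- ask tasks to checkpoint to disk at most every 600 seconds (illustrative value) -->
       <disk_interval>600</disk_interval>
    </global_preferences>

How much this actually reduces writes for the VirtualBox ATLAS app is uncertain, since a large share of the I/O happens inside the VM rather than through BOINC's own checkpointing.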
Joined: 18 Dec 15 | Posts: 1811 | Credit: 118,366,941 | RAC: 25,525
The reason is that ATLAS (or LHC) is an SSD killer. In mitigation of the TBW issue I would recommend using HDD space where available. That's why on all PCs on which I am running the various LHC projects I have an HDD (besides the actual system SSD), in one case even an external USB 3 HDD, in order to avoid this problem. The other day, for example, I noticed very late that, due to a server problem, CMS tasks only ran for about 10-12 minutes and then finished unsuccessfully, each time building a vdi image of about 3 GB. If you run several such tasks concurrently and it goes on like this for 20 or 30 hours, you can imagine what this means in terms of TBW.
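For this HDD approach, one minimal sketch on Linux is to point the BOINC client's data directory at the spinning disk when starting the client manually; the paths are illustrative, and installations that start BOINC through a distribution service would instead need the service's data directory moved or bind-mounted onto the HDD.

    # start the BOINC client with its data directory on an HDD (illustrative paths)
    mkdir -p /mnt/hdd/boinc-data
    boinc --dir /mnt/hdd/boinc-data --daemon

On Windows the equivalent is simply choosing a folder on the HDD as the BOINC data directory during installation.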