Thread 'Huge input file!'

Author	Message
Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,455 RAC: 1,541	Message 32637 - Posted: 5 Oct 2017, 9:53:07 UTC Last modified: 5 Oct 2017, 17:23:25 UTC For a task from the batch mc16_13TeV AZNLOCTEQ6L1_Ztautau.simul (12236583), I got a huge input file: 757.812.338 bytes. The job needs normal run time for a dual-core VM. The download is no problem for my bandwidth and I have no data limit (LHC@home 05 Oct 11:11:17 Started download of jf_f6dcceffbec93fe20e79c5f21f22f9ee; LHC@home 05 Oct 11:12:37 Finished download of jf_f6dcceffbec93fe20e79c5f21f22f9ee), but it could be an issue for others when they have to download such files for many ATLAS tasks. Edit: The run time will be longer, because the job has more than 50 events, probably 100 events. Edit2: >100 events; guessing how many... Edit3: >200 events; guessing how many...
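As a rough check of what this file size means for slower connections, the two log timestamps give the effective download rate. The sketch below uses only the numbers from the post; the helper `download_seconds` and the 10 Mbit/s example link are illustrative assumptions, not anything the project publishes.

```python
from datetime import datetime

# Figures taken from the post above.
size_bytes = 757_812_338
start = datetime.strptime("11:11:17", "%H:%M:%S")
end = datetime.strptime("11:12:37", "%H:%M:%S")

elapsed_s = (end - start).total_seconds()
rate_mb_s = size_bytes / elapsed_s / 1e6   # decimal megabytes per second
rate_mbit_s = rate_mb_s * 8                # megabits per second


def download_seconds(nbytes, link_mbit_s):
    """Hypothetical helper: ideal transfer time on a link of the given speed."""
    return nbytes * 8 / (link_mbit_s * 1e6)


print(f"elapsed: {elapsed_s:.0f} s, rate: {rate_mb_s:.2f} MB/s ({rate_mbit_s:.1f} Mbit/s)")
print(f"same file on a 10 Mbit/s link: about {download_seconds(size_bytes, 10) / 60:.1f} min")
```

On a link that is an order of magnitude slower, a single input file of this size already takes minutes rather than seconds, which is the concern raised for users running many ATLAS tasks.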

gyllic Joined: 9 Dec 14 Posts: 202 Credit: 2,659,192 RAC: 9	Message 32639 - Posted: 5 Oct 2017, 19:57:41 UTC - in response to Message 32637. Look here for more information (for https you have to accept the certificate): https://bigpanda.cern.ch/task/?jeditaskid=12236583 https://bigpanda.cern.ch/jobsss/?jeditaskid=12236583&mode=nodrop&display_limit=100&

HerveUAE Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0	Message 32641 - Posted: 6 Oct 2017, 5:41:00 UTC I got 2 of those: https://lhcathome.cern.ch/lhcathome/result.php?resultid=158597383 https://lhcathome.cern.ch/lhcathome/result.php?resultid=158522054 They have been running for more than 2 days now, and each has processed more than 600 events so far. We are the product of random evolution.

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,455 RAC: 1,541	Message 32644 - Posted: 6 Oct 2017, 8:09:13 UTC - in response to Message 32641. "They have been running for more than 2 days now, and each has processed more than 600 events so far." From the links gyllic mentioned (thanks for those links), I conclude that there are 1000 events in one job: used 6000 / finished 6 = 1000 events. How many cores does your ATLAS VM have? How do you know yours are over 600 after 2 days of running? In my console I only see the last events of yesterday. Today's events are sorted in front of those and can't be shown. At midnight my total event count was 256 (2 cores together), but I don't know how far they are now.

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,455 RAC: 1,541	Message 32646 - Posted: 6 Oct 2017, 9:23:57 UTC - in response to Message 32644. Last modified: 6 Oct 2017, 9:35:51 UTC "How do you know yours are over 600 after 2 days of running?" I found a way myself: lock the screen output with the Lock key for about 1 minute, then release and lock again in quick succession. If you're lucky, you'll see the most recent events. 327 events done out of 1000; athena's run time is 818 minutes.
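The progress figures above allow a rough extrapolation of the total run time. This is only a sketch: it assumes the per-event time stays constant and that the 818 minutes cover exactly the 327 events done so far, neither of which the post confirms.

```python
# Figures taken from the post above.
events_done = 327
events_total = 1000
minutes_so_far = 818

# Assumption: every event takes about as long as the average so far.
minutes_per_event = minutes_so_far / events_done
total_minutes = minutes_per_event * events_total
remaining_minutes = minutes_per_event * (events_total - events_done)

print(f"{minutes_per_event:.2f} min/event")
print(f"estimated total: {total_minutes:.0f} min ({total_minutes / 60:.1f} h)")
print(f"estimated remaining: {remaining_minutes:.0f} min ({remaining_minutes / 60:.1f} h)")
```

Under those assumptions a 1000-event job needs well over a day of athena run time, compared with the much shorter jobs of 50 to 100 events mentioned at the start of the thread.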

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,455 RAC: 1,541	Message 32649 - Posted: 6 Oct 2017, 12:28:23 UTC - in response to Message 32646. Last modified: 6 Oct 2017, 12:35:39 UTC In the end it was all for nothing. BOINC aborted the task because too little disk space was reserved. LHC@home 06 Oct 14:08:45 Aborting task h8qMDmrtLJrnSu7Ccp2YYBZmABFKDmABFKDmxSLKDmABFKDmN0WMJn_2: exceeded disk limit: 5726.07MB > 5722.05MB The rsc_disk_bound of 6000000000 bytes is too low for these types of tasks, especially when a user sometimes has to suspend the task and a snapshot is therefore written into the slot directory. https://lhcathome.cern.ch/lhcathome/result.php?resultid=158726637
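The two figures in this post line up if the "MB" in the abort message is read as 1024 × 1024 bytes. The sketch below checks that arithmetic; the interpretation of the unit is an assumption inferred from the numbers, not taken from BOINC documentation.

```python
# Figures taken from the post above.
rsc_disk_bound_bytes = 6_000_000_000
reported_usage_mb = 5726.07

# Assumption: the client reports "MB" as mebibytes (2**20 bytes).
limit_mib = rsc_disk_bound_bytes / 2**20
overshoot_mib = reported_usage_mb - limit_mib

print(f"limit: {limit_mib:.2f} MiB")          # matches the 5722.05MB in the log line
print(f"overshoot: {overshoot_mib:.2f} MiB")  # how far the slot directory exceeded it
```

The task was aborted for exceeding the bound by only about 4 MiB, which suggests how little headroom the limit leaves once a VM snapshot is written.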

Erich56 Joined: 18 Dec 15 Posts: 1957 Credit: 158,790,509 RAC: 53,379	Message 32650 - Posted: 6 Oct 2017, 12:49:40 UTC - in response to Message 32649. "In the end it was all for nothing. BOINC aborted the task because too little disk space was reserved." This is really annoying :-( "The rsc_disk_bound of 6000000000 bytes is too low for these types of tasks" Just out of curiosity: where did you see this figure? I checked your STDERR text; either I overlooked it, or it's not written there and you know it from somewhere else.

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1530 Credit: 10,029,455 RAC: 1,541	Message 32651 - Posted: 6 Oct 2017, 13:23:45 UTC - in response to Message 32650. "The rsc_disk_bound of 6000000000 bytes is too low for these types of tasks" "Just out of curiosity: where did you see this figure? I checked your STDERR text; either I overlooked it, or it's not written there and you know it from somewhere else." You can find that in the workunit info part of client_state.xml or in sched_reply_lhcathome.cern.ch_lhcathome.xml, which is created when a requested task is sent to you.
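For anyone who wants to read the value without opening the file by hand, a minimal sketch follows. It assumes that client_state.xml parses as XML and that each `<workunit>` element carries `<name>` and `<rsc_disk_bound>` children; the embedded `SAMPLE` and the workunit name `example_wu` are made up for illustration and are not copied from a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for client_state.xml.
SAMPLE = """<client_state>
  <workunit>
    <name>example_wu</name>
    <rsc_disk_bound>6000000000.000000</rsc_disk_bound>
  </workunit>
</client_state>"""


def disk_bounds(xml_text):
    """Return {workunit name: rsc_disk_bound in bytes} for every workunit found."""
    root = ET.fromstring(xml_text)
    bounds = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name")
        bound = wu.findtext("rsc_disk_bound")
        if name is not None and bound is not None:
            bounds[name.strip()] = float(bound)
    return bounds


for name, bound in disk_bounds(SAMPLE).items():
    print(f"{name}: {bound:.0f} bytes = {bound / 2**20:.2f} MiB")
```

To try it on a real installation, replace `SAMPLE` with the contents of client_state.xml from the BOINC data directory. If the real file does not parse cleanly, a plain text search for `rsc_disk_bound` gives the same information.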

Harri Liljeroos Joined: 28 Sep 04 Posts: 795 Credit: 64,524,208 RAC: 31,480	Message 32654 - Posted: 6 Oct 2017, 14:05:07 UTC What do these tasks show as 'Task size' if you right-click the task in BOINC and view Properties? My normal tasks show 43200 GFLOPs.

Erich56 Joined: 18 Dec 15 Posts: 1957 Credit: 158,790,509 RAC: 53,379	Message 32656 - Posted: 6 Oct 2017, 16:32:05 UTC - in response to Message 32651. "You can find that in the workunit info part of client_state.xml or in sched_reply_lhcathome.cern.ch_lhcathome.xml, which is created when a requested task is sent to you." Thanks for the information. So this is a fixed value determined by the server.

HerveUAE Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0	Message 32662 - Posted: 7 Oct 2017, 4:37:28 UTC - in response to Message 32644. Last modified: 7 Oct 2017, 4:41:33 UTC "How many cores does your ATLAS VM have? How do you know yours are over 600 after 2 days of running?" The lines showing the progress of the events appear to be sorted, so after one day one can only see the events that were calculated shortly before midnight. In the window I saw an event number above 300, and since I have 2 cores on the task, I concluded that at least 600 events had been calculated. Both tasks are still running, for more than 3 days now, with athena.py CPU time at 4000 minutes. We are the product of random evolution.

HerveUAE Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0	Message 32743 - Posted: 10 Oct 2017, 2:12:22 UTC - in response to Message 32662. "Both tasks are still running, for more than 3 days now, with athena.py CPU time at 4000 minutes." Both tasks failed due to lack of disk space as well, but the failure occurred exactly at the time of resuming the task. If you do not suspend it, it seems that the task will continue forever without failing. We are the product of random evolution.