ATLAS tasks are failing; server status also shows 0 tasks ready to send
Author | Message |
---|---|
Joined: 28 Sep 04 Posts: 728 Credit: 48,863,699 RAC: 21,444 |
My two latest tasks failed with this error: WARNING Transform now exiting early with exit code 15 (No events to process: 4400 (skipEvents) >= 4400 (inputEvents of EVNT) The server status page shows that the ready-to-send queue is empty. The same goes for SixTrack. So it is time to select another subproject. |
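The warning compares two counters: the task is told to skip as many events as the input file contains, so nothing is left to process. A minimal Python sketch of reading those two numbers out of the message, assuming the text format is exactly as quoted in the post (the function name `parse_no_events_warning` is made up for illustration):

```python
import re

# Matches the "No events to process" text as quoted in the post above.
NO_EVENTS = re.compile(
    r"No events to process: (\d+) \(skipEvents\) >= (\d+) \(inputEvents of EVNT\)"
)

def parse_no_events_warning(line):
    """Return (skip_events, input_events) if the line carries the warning, else None."""
    match = NO_EVENTS.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

warning = (
    "WARNING Transform now exiting early with exit code 15 "
    "(No events to process: 4400 (skipEvents) >= 4400 (inputEvents of EVNT)"
)
counters = parse_no_events_warning(warning)
if counters is not None:
    skip, total = counters
    # skip >= total means there is no event left for this task to simulate.
    print(f"skipEvents={skip}, inputEvents={total}, events left={max(0, total - skip)}")
```

When both numbers are equal, as here, the task itself has nothing to do, which points at the server-side batch rather than at the volunteer's host.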
Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
Indeed, we ran out of tasks over the weekend. I have asked for more to be submitted. |
Joined: 17 Sep 04 Posts: 105 Credit: 32,824,853 RAC: 705 |
Seems like a lot of validate errors. Regards, Bob P. |
Joined: 28 Sep 04 Posts: 728 Credit: 48,863,699 RAC: 21,444 |
Server status page shows new Atlas tasks are available. |
Joined: 15 Jun 08 Posts: 2530 Credit: 253,722,201 RAC: 51,175 |
Rather short for a WU with 198 MB initial download, isn't it? Usual walltimes on this host are >5 h. https://lhcathome.cern.ch/lhcathome/result.php?resultid=153867426 |
Joined: 2 May 07 Posts: 2242 Credit: 173,899,841 RAC: 2,807 |
Server shows ZERO tasks to send. |
Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
After a failed attempt to raise the virtual machine's memory via the VirtualBox Manager, which resulted in an extremely long computing time and a final failure, I have crunched three Atlas tasks on my Linux laptop with an AMD E-450 CPU. They all finished in time and were validated. They run alongside a climateprediction.net task with a very extended deadline (one year); they stop momentarily while waiting for memory, then resume crunching. I am glad to be able to run at least some LHC@home tasks, while my SUN WS on Linux and my Windows 10 PC are crunching Einstein@home and SETI@home tasks on both CPU and GPU. Tullio |
Joined: 15 Jun 08 Posts: 2530 Credit: 253,722,201 RAC: 51,175 |
Unfortunately, your ATLAS WUs don't deliver valid scientific results. They are only rewarded for CPU time, which is why they are marked as "valid" in the task list. You may look into your error logs, e.g. https://lhcathome.cern.ch/lhcathome/result.php?resultid=153646199. There you find: 2017-08-23 23:19:09 (30418): Setting Memory Size for VM. (4200MB) Setting #CPUs to 2 is not recommended on this host, especially if any other project runs concurrently. You may run ATLAS on a 1-core setting and suspend all other BOINC tasks while it is executing. Setting the RAM size to only 4200 MB is also not recommended, as it leads to the following errors: 2017-08-24 00:07:21 (30418): Guest Log: PyJobTransforms.transform.execute 2017-08-24 00:03:19,485 CRITICAL Transform executor raised TransformValidationException: Non-zero return code from EVNTtoHITS (65); Logfile error in log.EVNTtoHITS: "AthMpEvtLoopMgr FATAL makePool failed for AthMpEvtLoopMgr.SharedEvtQueueProvider" Even on a 1-core setting, a RAM size of 4200 MB may not be enough for the current batch. Two more things are suspicious: - the very short runtimes of your WUs - no result file named HITS.* is mentioned in the log |
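The two checks described in this post (look for the quoted error messages, and look for a HITS.* result file) can be automated roughly as follows. This is only a sketch: the marker strings are copied from the log excerpt above, the helper name `check_atlas_stderr` is hypothetical, and a real stderr.txt may word things differently in other ATLAS batches.

```python
import re

# Error fragments copied from the log excerpt quoted in the post above.
FAILURE_MARKERS = (
    "Non-zero return code from EVNTtoHITS",
    "makePool failed for AthMpEvtLoopMgr.SharedEvtQueueProvider",
)

# A result file is expected to show up as HITS.<something>; the \b keeps
# "EVNTtoHITS" from being mistaken for a result file name.
HITS_FILE = re.compile(r"\bHITS\.\S+")

def check_atlas_stderr(text):
    """Summarise whether a task log looks like a real success or a rewarded failure."""
    failures = [marker for marker in FAILURE_MARKERS if marker in text]
    return {
        "failures": failures,
        "hits_file_mentioned": HITS_FILE.search(text) is not None,
    }

sample = (
    "2017-08-24 00:07:21 (30418): Guest Log: PyJobTransforms.transform.execute "
    "2017-08-24 00:03:19,485 CRITICAL Transform executor raised "
    "TransformValidationException: Non-zero return code from EVNTtoHITS (65); "
    'Logfile error in log.EVNTtoHITS: "AthMpEvtLoopMgr FATAL makePool failed '
    'for AthMpEvtLoopMgr.SharedEvtQueueProvider"'
)
print(check_atlas_stderr(sample))
```

A task that is marked "valid" but reports failures and no HITS file is the case described above: credit for CPU time, but no usable result.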
Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
OK, I've limited the number of cores to 1 and I shall watch what happens. But I have only 8 GB RAM on that PC (the mobo allows no more), and I cannot starve other projects. Tullio |
Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
One core task started, looks OK, using 3400 MB. Tullio |
Joined: 15 Jun 08 Posts: 2530 Credit: 253,722,201 RAC: 51,175 |
@ tullio Compared to other hosts and the current ATLAS batch, I would expect completion times between 12 h and 15 h for your E-450. Since you wrote your last post, it has reported a couple of ATLAS WUs with completion times of less than 1 h. Although all of them are rewarded, they still don't deliver what you probably expect. If you would like to spend the time on a test, you may raise the RAM setting for a 1-core ATLAS VM to 5000 MB and start only this VM. No other BOINC app should run or even be left in RAM during the test. The critical phase is shortly after the start, when the VM extracts the EVNT.* file. You may check the stderr.txt in the slots dir of the running VM for the messages in my previous post. If this test ends successfully, you may repeat it with another project (no vbox project) running concurrently and/or with a slightly reduced RAM setting for your VM until the errors occur again. If the test doesn't end successfully, your E-450 is too weak to run ATLAS (and probably also CMS and LHCb). Then you may repeat the test with Theory Simulation at its default RAM setting. I hope you succeed. |
Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
In the error log of the last failed task I found a phrase about an extension pack not installed. Mea culpa, mea culpa, mea maxima culpa. I installed it and now I shall watch the next task when the laptop finishes two SETI@home tasks I downloaded to keep it crunching something. Thanks anyway for your suggestion, I get very little help from anyone else. Tullio |
Joined: 27 Sep 08 Posts: 846 Credit: 691,127,115 RAC: 110,482 |
Tullio, Yeti's checklist is good if you have issues: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161#29359 This one failed with too much disk usage; you can raise the disk limit for BOINC to 8 GB in the settings. https://lhcathome.cern.ch/lhcathome/result.php?resultid=153909057 |
Joined: 15 Jun 08 Posts: 2530 Credit: 253,722,201 RAC: 51,175 |
"tulio, yetis check list is good if you have issues:" Besides that, the processing time and the error log look very good. It could have been a success. Nonetheless, I would allocate more RAM than the configured 3400 MB. |
Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
Toby, I have 14.30 GB available to BOINC on the HP Laptop with a 1 TB hybrid disk, by Seagate if I remember. But after LHC "consolidation" all LHC tasks fail on all PCs, two Linux and one Windows, save SixTrack. I am testing Atlas@home on the slowest machine, the HP laptop with AMD E-450 CPU while the two other PCs run SETI@home and Einstein@home CPU and GPU tasks with no problem. The Windows 10 PC, updated every month by Microsoft, has 22 GB RAM and 1 TB disk. The SUN WS, my oldest machine, has 1 TB disk, 8 GB RAM and a GTX 750 Ti GPU board. The Windows PC has a GTX 1050 Ti GPU board, with Pascal microprocessor and 640 GPU cores. Maybe CERN should start thinking about GPUs, now that they have taken a role in SKA array computing, according to CERN Courier. Tullio |
Joined: 27 Sep 08 Posts: 846 Credit: 691,127,115 RAC: 110,482 |
I think the 3400 MB is what is visible in the BOINC UI? When you create the app_config.xml, it overrides the memory used in the VM to 5000 MB(?). The UI, however, isn't updated, hence the difference. |
Joined: 27 Sep 08 Posts: 846 Credit: 691,127,115 RAC: 110,482 |
There was some talk of a SixTrack app for GPU, but I think they will target AVX first as it's simpler to do. I feel it will be a long time before a GPU version appears; it's not so simple to run Fortran on a GPU. The other projects will never use a GPU, as the VM is there to make it easy for the scientists, not easy for us ;). If it wasn't easy for them, these projects wouldn't exist. As for the disk limit, I just set mine to 250 GB; it never uses that much, and I'm sure I would notice before it actually did. |
Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I've written the app_config.xml file as suggested which should bring the memory to 5000 MB. But Atlas tasks still start at 3400 MB. This on the SUN WS, a Linux box. Tullio |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
"I've written the app_config.xml file as suggested which should bring the memory to 5000 MB. But Atlas tasks still start at 3400 MB. This on the SUN WS, a Linux box." That is OK; the app_config.xml just sets the maximum amount of memory that can be used. Setting it to 5000 MB fixed the problem for me, even though the Atlas tasks still show as 3400 MB. |
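The thread never shows the app_config.xml itself, so the following Python sketch only illustrates what such a file might look like. The element names follow BOINC's general app_config.xml layout, and `--memory_size_mb` is passed on the command line to the VirtualBox wrapper; the app name `ATLAS` and the plan class `vbox64_mt_mcore_atlas` are assumptions that should be checked against the entries in the host's own client_state.xml before use.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, plan_class, memory_mb, ncpus=1):
    """Build an app_config.xml string that requests a VM memory size.

    app_name and plan_class are placeholders here; they must match what
    the project actually sends to the host.
    """
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)
    # Command-line option handed to the VirtualBox wrapper for this app version.
    ET.SubElement(version, "cmdline").text = f"--memory_size_mb {memory_mb}"
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")

# Assumed names for illustration only.
config_text = build_app_config("ATLAS", "vbox64_mt_mcore_atlas", 5000)
print(config_text)
```

The resulting text would go into a file named app_config.xml in the project's directory under the BOINC data directory, and only tasks started after the client has re-read its config files pick it up.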
Joined: 15 Jun 08 Posts: 2530 Credit: 253,722,201 RAC: 51,175 |
tullio wrote: I've written the app_config.xml file as suggested which should bring the memory to 5000 MB. But Atlas tasks still start at 3400 MB. This on the SUN WS, a Linux box. Do you have a message like the following in your BOINC client's logfile? Do 31 Aug 2017 08:49:44 CEST | LHC@home | Found app_config.xml If not, reload your config files (e.g. via client menu -> options) or restart the client/computer. The changed settings don't affect WUs that are already running, i.e. reside in a "slots" dir. You may also post your app_config.xml here. |
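A quick way to run this check on a saved copy of the client's event log is sketched below in Python. It assumes the pipe-separated layout of the sample line above (timestamp | project | message); the function name `app_config_was_read` is made up for illustration.

```python
def app_config_was_read(log_text, project="LHC@home"):
    """Return True if the log contains a 'Found app_config.xml' line for the project."""
    for line in log_text.splitlines():
        # Expected layout: <timestamp> | <project> | <message>
        parts = [part.strip() for part in line.split("|")]
        if len(parts) >= 3 and parts[1] == project and "Found app_config.xml" in parts[2]:
            return True
    return False

sample_log = "Do 31 Aug 2017 08:49:44 CEST | LHC@home | Found app_config.xml"
print(app_config_was_read(sample_log))
```

If this returns False for the current log, the client has not picked up the file yet, and reloading the config files or restarting the client is the next step.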