Message boards : ATLAS application : FATAL Conditions database connection COOLOFL_TRT/OFLP200 cannot be opened

Klaus

Joined: 27 Aug 15
Posts: 27
Credit: 10,061,286
RAC: 4,436
Message 41903 - Posted: 13 Mar 2020, 16:15:34 UTC

12 of my last 24 WUs did not produce hits files.
Stderr: Guest Log: "exeErrorDiag": "Non-zero return code from EVNTtoHITS (33); Logfile error in log.EVNTtoHITS: \"IOVDbSvc FATAL Conditions database connection COOLOFL_TRT/OFLP200 cannot be opened - STOP\"",
ID: 41903
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,777,751
RAC: 128,475
Message 41904 - Posted: 14 Mar 2020, 0:54:30 UTC

This is an extract from a comment by David Cameron
explaining why you get credit without a hits file:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4178&postid=29560#29560
.....
What the ATLAS WUs do is simulate how those particles in each event interact with the detector, which consists of many extremely complex components. The description of the detector is partly in the ATLAS simulation software but partly in remote database services from which data is read over the network at the start of each WU. The output of the simulation is in the hits file, which is a description of where each particle "hits" (i.e. interacts with) the detector.

Therefore a truly successful WU must have a valid hits file produced. However, you can still get credit even if no hits file is present, because we don't want people to suffer from problems in ATLAS software or infrastructure.
....
ID: 41904
Klaus

Joined: 27 Aug 15
Posts: 27
Credit: 10,061,286
RAC: 4,436
Message 41906 - Posted: 14 Mar 2020, 6:10:08 UTC

Dear maeax, thanks for your explanation.
ID: 41906
Klaus

Joined: 27 Aug 15
Posts: 27
Credit: 10,061,286
RAC: 4,436
Message 41917 - Posted: 16 Mar 2020, 8:06:37 UTC

Credit is not the problem, but I think there is another problem.
Since the afternoon of 11 March, 28 of my last 56 WUs have not produced hits files, and
EACH WU that starts while the result of the previous WU is still uploading is terminated after 20 ... 30 minutes with no hits file, as described in my earlier post.
How do you arrange that I get every second task without a collision, although you do not know which task I will run next?
ID: 41917
maeax

Joined: 2 May 07
Posts: 2090
Credit: 158,777,751
RAC: 128,475
Message 41918 - Posted: 16 Mar 2020, 8:16:44 UTC - in response to Message 41917.  

2020-03-16 04:14:19 (10568): Guest Log: *** Error codes and diagnostics ***
2020-03-16 04:14:19 (10568): Guest Log: "exeErrorCode": 65,
2020-03-16 04:14:19 (10568): Guest Log: "exeErrorDiag": "Non-zero return code from EVNTtoHITS (33); Logfile error in log.EVNTtoHITS: \"DetectorStore FATAL in sysInitialize(): standard std::exception is caught\"",
Something is wrong.
Have you checked with Yeti's Checklist?
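
For anyone who wants to scan their own stderr output for these fields, here is a minimal Python sketch. It assumes the lines look like the Guest Log lines quoted above (a JSON-style `"key": value,` fragment after `Guest Log:`). The function name `extract_error_fields` is made up for illustration and is not part of BOINC or the ATLAS software.

```python
import json
import re

# Assumed line shape, taken from the log quoted above:
#   <timestamp> (<pid>): Guest Log: "exeErrorDiag": "....",
FIELD_RE = re.compile(r'Guest Log:\s*"(exeErrorCode|exeErrorDiag)":\s*(.+?),?\s*$')


def extract_error_fields(stderr_text):
    """Collect exeErrorCode / exeErrorDiag values from stderr text (hypothetical helper)."""
    fields = {}
    for line in stderr_text.splitlines():
        match = FIELD_RE.search(line)
        if not match:
            continue
        key, raw = match.groups()
        try:
            # The value is a JSON number or a JSON string with escaped quotes.
            fields[key] = json.loads(raw)
        except ValueError:
            # Fall back to the raw text if the fragment is not valid JSON.
            fields[key] = raw
    return fields


if __name__ == "__main__":
    sample = r'''2020-03-16 04:14:19 (10568): Guest Log: "exeErrorCode": 65,
2020-03-16 04:14:19 (10568): Guest Log: "exeErrorDiag": "Non-zero return code from EVNTtoHITS (33); Logfile error in log.EVNTtoHITS: \"DetectorStore FATAL in sysInitialize(): standard std::exception is caught\"",'''
    for name, value in extract_error_fields(sample).items():
        print(name, "=", value)
```

This only pulls out the two fields; it does not say anything about the cause of the failure.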
ID: 41918
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2401
Credit: 225,395,547
RAC: 123,740
Message 41919 - Posted: 16 Mar 2020, 8:31:36 UTC - in response to Message 41917.  

You could try gracefully shutting down BOINC and rebooting your computer.

These snippets are from one of the successful tasks:
2020-03-16 04:20:24 (10804): Setting CPU throttle for VM. (75%)
.
.
.
2020-03-16 06:59:10 (10804): Setting CPU throttle for VM. (100%)
.
.
.
2020-03-16 07:23:29 (10804): VM state change detected. (old = 'Running', new = 'Paused')
2020-03-16 07:23:39 (10804): VM state change detected. (old = 'Paused', new = 'Running')
2020-03-16 07:26:59 (10804): VM state change detected. (old = 'Running', new = 'Paused')
2020-03-16 07:27:09 (10804): VM state change detected. (old = 'Paused', new = 'Running')
2020-03-16 07:35:30 (10804): VM state change detected. (old = 'Running', new = 'Paused')
2020-03-16 07:35:40 (10804): VM state change detected. (old = 'Paused', new = 'Running')

A CPU throttle of 75% seems to be too low.
Constantly switching between running and paused may cause problems.
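
To see how often and for how long a VM was paused, the state-change lines can be counted with a few lines of Python. This is only a rough sketch: it assumes the stderr lines have exactly the format shown in the snippets above, and the function name `paused_intervals` is hypothetical.

```python
import re
from datetime import datetime

# Assumed line shape, copied from the snippets above:
#   2020-03-16 07:23:29 (10804): VM state change detected. (old = 'Running', new = 'Paused')
STATE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): VM state change detected\. "
    r"\(old = '(\w+)', new = '(\w+)'\)"
)


def paused_intervals(log_text):
    """Return (pause_start, pause_end) datetime pairs found in the log text."""
    intervals = []
    paused_at = None
    for line in log_text.splitlines():
        match = STATE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        new_state = match.group(3)
        if new_state == "Paused":
            paused_at = stamp
        elif new_state == "Running" and paused_at is not None:
            intervals.append((paused_at, stamp))
            paused_at = None
    return intervals


if __name__ == "__main__":
    sample = """\
2020-03-16 07:23:29 (10804): VM state change detected. (old = 'Running', new = 'Paused')
2020-03-16 07:23:39 (10804): VM state change detected. (old = 'Paused', new = 'Running')
2020-03-16 07:26:59 (10804): VM state change detected. (old = 'Running', new = 'Paused')
2020-03-16 07:27:09 (10804): VM state change detected. (old = 'Paused', new = 'Running')
"""
    pauses = paused_intervals(sample)
    total = sum((end - start).total_seconds() for start, end in pauses)
    print(len(pauses), "pauses,", total, "seconds paused in total")
```

Comparing these numbers between a successful and a failed task may help to judge whether the pausing is related to the failures.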

Have you installed any updates recently?
Windows, VirtualBox ...
In the case of VirtualBox, you could check whether an older version solves the problem.
ID: 41919
Klaus

Joined: 27 Aug 15
Posts: 27
Credit: 10,061,286
RAC: 4,436
Message 41922 - Posted: 16 Mar 2020, 9:05:40 UTC

Win10 update KB4551762 was installed at noon on 11 March, with a graceful shutdown of BOINC and a reboot.
I checked with Yeti's Checklist at noon on 13 Mar 2020, although every second task runs with no problem.
In about 2 hours, when the currently running task has finished, I will reboot.
My computer is in use, and not only for LHC@home tasks. The problem occurs at a CPU throttle of 60 ... 100%.
My past experience is that a CPU throttle of 60% is too low.
The switching between running and paused is caused by my own tasks and did not cause problems in the past.
ID: 41922



©2024 CERN