Grafana Errors
Author | Message |
---|---|
Joined: 15 Jun 08 Posts: 2180 Credit: 185,400,461 RAC: 186,338 |
This morning at 02:00 UTC the CMS Grafana job monitoring dropped from 140 running tasks to 0 running tasks. I suspect Grafana isn't getting the correct numbers, since my computers are crunching as usual. |
Joined: 29 Aug 05 Posts: 941 Credit: 6,159,077 RAC: 1,098 |
"This morning at 02:00 UTC the CMS Grafana job monitoring dropped from 140 running tasks to 0 running tasks." Yes, there appears to have been a glitch in the monitoring, and it affected all of CMS. If you open the "Site" filter and choose "All", you can see the monitor for all of CMS. Something like this... (screenshot of the monitor for all of CMS) |
Joined: 15 Jun 08 Posts: 2180 Credit: 185,400,461 RAC: 186,338 |
Grafana has shown 0 running jobs since this morning. Some backend services may need help. |
Joined: 29 Aug 05 Posts: 941 Credit: 6,159,077 RAC: 1,098 |
|
Joined: 29 Aug 05 Posts: 941 Credit: 6,159,077 RAC: 1,098 |
|
Joined: 15 Jun 08 Posts: 2180 Credit: 185,400,461 RAC: 186,338 |
What a surprise - last night the CMS failure rate dropped significantly. It's now close to zero. https://monit-grafana.cern.ch/d/o3dI49GMz/cms-job-monitoring-12m?viewPanel=81&orgId=11&from=1607061600000&to=1607148000000&var-group_by=CMS_JobType&var-Tier=All&var-CMS_WMTool=All&var-CMS_SubmissionTool=All&var-CMS_CampaignType=All&var-Site=T3_CH_Volunteer&var-Type=All&var-CMS_JobType=All&var-CMSPrimaryDataTier=All&var-adhoc=data.RecordTime Any idea what happened? |
Joined: 29 Aug 05 Posts: 941 Credit: 6,159,077 RAC: 1,098 |
|
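For anyone who wants to read the dashboard link posted in this thread without opening it, here is a minimal Python sketch that decodes its query string. It assumes the usual Grafana URL conventions: `from` and `to` are Unix timestamps in milliseconds, and `var-<name>` parameters set dashboard filters such as "Site". The helper names `describe_link` and `with_site` are made up for this example and are not part of any CERN or Grafana tool.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Dashboard link copied from the thread.
DASHBOARD_URL = (
    "https://monit-grafana.cern.ch/d/o3dI49GMz/cms-job-monitoring-12m"
    "?viewPanel=81&orgId=11&from=1607061600000&to=1607148000000"
    "&var-group_by=CMS_JobType&var-Tier=All&var-CMS_WMTool=All"
    "&var-CMS_SubmissionTool=All&var-CMS_CampaignType=All"
    "&var-Site=T3_CH_Volunteer&var-Type=All&var-CMS_JobType=All"
    "&var-CMSPrimaryDataTier=All&var-adhoc=data.RecordTime"
)


def describe_link(url):
    """Return the time window (UTC) and site filter encoded in the link."""
    params = parse_qs(urlsplit(url).query)

    def to_utc(epoch_ms):
        # Assumption: `from` and `to` are Unix timestamps in milliseconds.
        return datetime.fromtimestamp(int(epoch_ms) / 1000, tz=timezone.utc)

    return {
        "from": to_utc(params["from"][0]),
        "to": to_utc(params["to"][0]),
        "site": params["var-Site"][0],
    }


def with_site(url, site):
    """Return a copy of the link with the `var-Site` parameter replaced."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    params["var-Site"] = [site]
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))


if __name__ == "__main__":
    info = describe_link(DASHBOARD_URL)
    print(info["from"].isoformat(), "to", info["to"].isoformat(), "site:", info["site"])
    # Same view, but with the "Site" filter set to "All" as suggested above.
    print(with_site(DASHBOARD_URL, "All"))
```

Under those assumptions the link covers 2020-12-04 06:00 UTC to 2020-12-05 06:00 UTC with the site filter set to `T3_CH_Volunteer`. Calling `with_site(DASHBOARD_URL, "All")` gives the equivalent link for all of CMS, which is the manual "Site" to "All" step described earlier in the thread.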
©2023 CERN