Message boards : CMS Application : Grafana Errors
Author | Message |
---|---|
Joined: 15 Jun 08 Posts: 2536 Credit: 254,247,455 RAC: 61,674 |
This morning at 02:00 UTC, the CMS Grafana job monitoring dropped from 140 running tasks to 0. I suspect Grafana isn't receiving the correct numbers, since my computers are crunching as usual. |
Joined: 29 Aug 05 Posts: 1060 Credit: 7,737,455 RAC: 886 |
"This morning 02:00 UTC the CMS Grafana job monitoring dropped from 140 running tasks to 0 running tasks." Yes, there appears to have been a glitch in the monitoring; it affected all of CMS. If you open the "Site" filter and choose "All", you can see the monitor for all of CMS. Something like this... |
Joined: 15 Jun 08 Posts: 2536 Credit: 254,247,455 RAC: 61,674 |
Grafana has shown 0 running jobs since this morning. Some backend services may need attention. |
Joined: 15 Jun 08 Posts: 2536 Credit: 254,247,455 RAC: 61,674 |
What a surprise - last night the CMS failure rate dropped significantly. It's now close to zero. https://monit-grafana.cern.ch/d/o3dI49GMz/cms-job-monitoring-12m?viewPanel=81&orgId=11&from=1607061600000&to=1607148000000&var-group_by=CMS_JobType&var-Tier=All&var-CMS_WMTool=All&var-CMS_SubmissionTool=All&var-CMS_CampaignType=All&var-Site=T3_CH_Volunteer&var-Type=All&var-CMS_JobType=All&var-CMSPrimaryDataTier=All&var-adhoc=data.RecordTime Any idea what happened? |
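For anyone trying to line up the Grafana link in this thread with their own task logs: the `from` and `to` query parameters in such dashboard URLs are Unix timestamps in milliseconds. Here is a minimal sketch that decodes them into UTC times and reads the `var-Site` filter. The function name `grafana_window` is made up for illustration, and it assumes absolute timestamps (Grafana also accepts relative values such as `now-6h`, which this does not handle).

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

# Dashboard link copied verbatim from the thread.
URL = (
    "https://monit-grafana.cern.ch/d/o3dI49GMz/cms-job-monitoring-12m"
    "?viewPanel=81&orgId=11&from=1607061600000&to=1607148000000"
    "&var-group_by=CMS_JobType&var-Tier=All&var-CMS_WMTool=All"
    "&var-CMS_SubmissionTool=All&var-CMS_CampaignType=All"
    "&var-Site=T3_CH_Volunteer&var-Type=All&var-CMS_JobType=All"
    "&var-CMSPrimaryDataTier=All&var-adhoc=data.RecordTime"
)


def grafana_window(url):
    """Return (start, end, site) for a Grafana dashboard URL.

    Hypothetical helper: assumes 'from' and 'to' are epoch milliseconds.
    'site' is None when the URL carries no var-Site parameter.
    """
    query = parse_qs(urlparse(url).query)

    def to_utc(ms):
        # Convert epoch milliseconds to a timezone-aware UTC datetime.
        return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)

    start = to_utc(query["from"][0])
    end = to_utc(query["to"][0])
    site = query.get("var-Site", [None])[0]
    return start, end, site


start, end, site = grafana_window(URL)
print(site, start.isoformat(), end.isoformat())
```

For the link above, this gives a 24-hour window from 2020-12-04 06:00 UTC to 2020-12-05 06:00 UTC, filtered to site `T3_CH_Volunteer`.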