Servers froze as OpenStack lost free space

I run this website, along with plenty of other mostly personal services, in Kubernetes on top of OpenStack Pike, deployed in Docker using the Kolla project, version 5.0.1.

The VMs run in QEMU/KVM with local qcow2 files as disks, plus persistent volumes backed by the host's LVM and mounted into Kubernetes via Cinder.

Networking is Linux bridge instead of Open vSwitch, since simplicity is the ultimate sophistication: a plain L2 bridge is a simpler construct than an OVS-based setup.

This morning my Kubernetes services stopped working: only one VM was accessible, while the rest were not even responding to ICMP ping requests. There was no free space left on my OpenStack host, so I removed some old logs to free it up.
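
For reference, this is roughly how I found and reclaimed the space; the kolla_logs volume path matches my deployment, and the exact file is just an example, yours may differ:

df -h /
# on an all-in-one Kolla host, the log volume is usually the main offender
du -sh /var/lib/docker/volumes/kolla_logs/_data/*
# truncate rather than delete, so running services keep their open file handles
truncate -s 0 /var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-server.log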

I went on to check dmesg, the network bridges, and iptables, and dug through the logs, but could not spot anything useful that would indicate why the rest of my VMs had stopped responding to a simple ICMP ping.
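
The checks amounted to something like the following (nothing in the output stood out):

dmesg | tail -n 50    # kernel-level errors
brctl show            # Linux bridges and their attached tap interfaces
iptables -L -n -v     # look for unexpected DROP rules
ip netns list         # Neutron runs dnsmasq and routers in their own namespaces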

Looking at the VMs themselves, I noticed that most of them (all except the one that was still responding) were in a "paused" state in libvirt, while OpenStack was reporting them as "ACTIVE":

[root@kolla ~]# openstack server list
+--------------------------------------+-------------+---------+--------------------------------+--------------------------------------+------------+
| ID                                   | Name        | Status  | Networks                       | Image                                | Flavor     |
+--------------------------------------+-------------+---------+--------------------------------+--------------------------------------+------------+
| 39cb3435-7092-4417-9d50-b4e2882067e1 | k8s-worker1 | ACTIVE  | private1=10.0.0.7, 10.0.2.111  | CentOS-7-x86_64-GenericCloud-1801-01 | k8s.medium |
...

[root@kolla ~]# docker exec -ti nova_libvirt virsh list --all
 Id    Name                           State
----------------------------------------------------
 19    instance-00000063              paused
...

Pausing the instance through the OpenStack API first, so that Nova's state matches libvirt's, and then unpausing it recovered the VM:

[root@kolla ~]# openstack server pause k8s-worker1
[root@kolla ~]# openstack server list
+--------------------------------------+-------------+---------+--------------------------------+--------------------------------------+------------+
| ID                                   | Name        | Status  | Networks                       | Image                                | Flavor     |
+--------------------------------------+-------------+---------+--------------------------------+--------------------------------------+------------+
| 39cb3435-7092-4417-9d50-b4e2882067e1 | k8s-worker1 | PAUSED  | private1=10.0.0.7, 10.0.2.111  | CentOS-7-x86_64-GenericCloud-1801-01 | k8s.medium |
...

[root@kolla ~]# openstack server unpause k8s-worker1
[root@kolla ~]# docker exec -ti nova_libvirt virsh list --all
 Id    Name                           State
----------------------------------------------------
 19    instance-00000063              running
...

You can also just resume all the paused VMs directly through virsh, and that does the trick:
docker exec -ti nova_libvirt sh -c 'virsh list --state-paused --name | grep -v "^$" | while read d; do virsh resume "$d"; done'

Then the Kubernetes API came back up, along with all the services. Nothing was restarted: the VMs and the Kubernetes containers simply continued executing instructions as if nothing bad had happened.

This leaves me with a single question: why did OpenStack pause most of the QEMU VMs, yet spare one?

I have not found anything that could help me answer this question.
Below are some relevant excerpts from the logs.
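
In hindsight, one more thing worth checking while a VM is still frozen is the pause reason that libvirt records; if QEMU had suspended the guest on a disk write error, it should show up there (instance name taken from the listing above):

docker exec -ti nova_libvirt virsh domstate instance-00000063 --reason

A reason mentioning an I/O error would point at the disk, while a "user" reason would point at an explicit pause via the API.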

OpenStack logs while there was no space left on the device

  • dnsmasq indirectly points to a space problem

/var/lib/docker/volumes/kolla_logs/_data/neutron/dnsmasq.log:

Feb 23 10:08:33 dnsmasq-dhcp[62]: DHCPREQUEST(ns-a1f7e93e-53) 10.0.0.9 fa:16:3e:b1:b1:41 
Feb 23 10:08:33 dnsmasq-dhcp[62]: DHCPACK(ns-a1f7e93e-53) 10.0.0.9 fa:16:3e:b1:b1:41 host-10-0-0-9
Feb 23 10:08:33 dnsmasq-dhcp[62]: failed to write /var/lib/neutron/dhcp/f2780ea0-8d83-4434-9d0f-914392d1c3b1/leases: No space left on device (retry in 60s)
Feb 23 10:09:33 dnsmasq-dhcp[62]: failed to write /var/lib/neutron/dhcp/f2780ea0-8d83-4434-9d0f-914392d1c3b1/leases: No space left on device (retry in 60s)
Feb 23 10:10:33 dnsmasq-dhcp[62]: failed to write /var/lib/neutron/dhcp/f2780ea0-8d83-4434-9d0f-914392d1c3b1/leases: No space left on device (retry in 60s)
  • Neutron DHCP agent is unable to sync with RabbitMQ

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-dhcp-agent.log:

2018-02-23 07:00:30.077 7 ERROR oslo.messaging._drivers.impl_rabbit [req-c358db4f-8778-4bda-a8e7-0fde7006044a - - - - -] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:02:13.644 7 ERROR oslo.messaging._drivers.impl_rabbit [req-c358db4f-8778-4bda-a8e7-0fde7006044a - - - - -] [0d0d55bb-3bb2-45a4-9518-abac812c03eb] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
2018-02-23 07:02:14.656 7 INFO oslo.messaging._drivers.impl_rabbit [req-c358db4f-8778-4bda-a8e7-0fde7006044a - - - - -] [0d0d55bb-3bb2-45a4-9518-abac812c03eb] Reconnected to AMQP server on 10.10.10.254:5672 via [amqp] client with port 60590.
2018-02-23 07:02:14.658 7 ERROR oslo.messaging._drivers.impl_rabbit [req-c358db4f-8778-4bda-a8e7-0fde7006044a - - - - -] The broker has blocked the connection: connection blocked, see broker logs
  • Neutron L3 agent is unable to sync with RabbitMQ

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-l3-agent.log:

2018-02-23 07:00:59.302 50 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:02:57.353 50 ERROR oslo.messaging._drivers.impl_rabbit [-] [db4ec34e-af64-4c47-9efb-d1d9d3fff6b8] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
2018-02-23 07:02:58.368 50 INFO oslo.messaging._drivers.impl_rabbit [-] [db4ec34e-af64-4c47-9efb-d1d9d3fff6b8] Reconnected to AMQP server on 10.10.10.254:5672 via [amqp] client with port 37956.
2018-02-23 07:02:58.376 50 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:04:58.370 50 ERROR oslo.messaging._drivers.impl_rabbit [-] [db4ec34e-af64-4c47-9efb-d1d9d3fff6b8] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
  • Neutron Linuxbridge agent is unable to sync with RabbitMQ

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-linuxbridge-agent.log:

2018-02-23 03:16:04.824 7 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 03:16:14.264 7 INFO oslo.messaging._drivers.impl_rabbit [-] The broker has unblocked the connection
2018-02-23 07:00:34.884 7 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:02:08.526 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [1f1717dc-b2e2-4083-9ad9-bb1231dbdd1f] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
2018-02-23 07:02:09.544 7 INFO oslo.messaging._drivers.impl_rabbit [-] [1f1717dc-b2e2-4083-9ad9-bb1231dbdd1f] Reconnected to AMQP server on 10.10.10.254:5672 via [amqp] client with port 59826.
2018-02-23 07:02:09.550 7 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:04:09.545 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [1f1717dc-b2e2-4083-9ad9-bb1231dbdd1f] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
  • Neutron metadata agent is unable to sync with RabbitMQ

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-metadata-agent.log:

2018-02-23 03:16:10.725 9 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 03:16:14.261 9 INFO oslo.messaging._drivers.impl_rabbit [-] The broker has unblocked the connection
2018-02-23 07:00:41.702 9 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:02:22.523 9 ERROR oslo.messaging._drivers.impl_rabbit [-] [4ac967af-2f64-48e7-8c16-955020ff6382] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
2018-02-23 07:02:23.541 9 INFO oslo.messaging._drivers.impl_rabbit [-] [4ac967af-2f64-48e7-8c16-955020ff6382] Reconnected to AMQP server on 10.10.10.254:5672 via [amqp] client with port 33402.
2018-02-23 07:02:23.544 9 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:04:23.542 9 ERROR oslo.messaging._drivers.impl_rabbit [-] [4ac967af-2f64-48e7-8c16-955020ff6382] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
  • Neutron server

Once the space was freed, neutron-server stopped reporting dead agents every couple of minutes.

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-server.log:

2018-02-23 07:01:34.644 22 INFO neutron.wsgi [req-e99e8e91-124a-4e4a-a5a6-ece7826eaee9 REDACTED REDACTED - default default] 10.10.10.254 "GET /v2.0
/security-groups?id=bd618343-6a83-4bd3-b7e7-189a359f225e HTTP/1.1" status: 200  len: 14800 time: 0.0697820
2018-02-23 07:01:35.542 33 WARNING neutron.db.agents_db [req-8545aa04-7a04-4b00-bd6c-9dd6e70eb2bf - - - - -] Agent healthcheck: found 3 dead agents out of 4:
                Type       Last heartbeat host
  Linux bridge agent  2018-02-23 07:00:04 REDACTED.REDACTED.com
          DHCP agent  2018-02-23 07:00:00 REDACTED.REDACTED.com
      Metadata agent  2018-02-23 07:00:11 REDACTED.REDACTED.com
2018-02-23 07:01:35.568 33 WARNING neutron.db.agentschedulers_db [req-0d14945f-4c53-4d47-9f20-69e5794489fd - - - - -] No DHCP agents available, skipping rescheduling
2018-02-23 07:01:36.692 24 INFO neutron.wsgi [req-8e9b8d52-458a-4500-955c-3eee1f1b85b7 REDACTED REDACTED - default default] 10.10.10.254 "GET /v2.0
/ports?device_id=690ea67a-e624-411c-bd74-33975d5191de HTTP/1.1" status: 200  len: 1073 time: 0.0551870
...
...
2018-02-23 07:54:00.686 33 WARNING neutron.db.agents_db [req-8545aa04-7a04-4b00-bd6c-9dd6e70eb2bf - - - - -] Agent healthcheck: found 4 dead agents out of 4:
                Type       Last heartbeat host
  Linux bridge agent  2018-02-23 07:00:04 REDACTED.REDACTED.com
          DHCP agent  2018-02-23 07:00:00 REDACTED.REDACTED.com
      Metadata agent  2018-02-23 07:00:11 REDACTED.REDACTED.com
            L3 agent  2018-02-23 07:00:29 REDACTED.REDACTED.com
...
2018-02-23 08:43:20.818 33 WARNING neutron.db.agentschedulers_db [req-0d14945f-4c53-4d47-9f20-69e5794489fd - - - - -] No DHCP agents available, skipping rescheduling
...
2018-02-23 12:11:10.341 33 WARNING neutron.db.agentschedulers_db [req-0d14945f-4c53-4d47-9f20-69e5794489fd - - - - -] No DHCP agents available, skipping rescheduling
2018-02-23 12:11:10.478 33 WARNING neutron.db.agents_db [req-8545aa04-7a04-4b00-bd6c-9dd6e70eb2bf - - - - -] Agent healthcheck: found 4 dead agents out of 4:
                Type       Last heartbeat host
  Linux bridge agent  2018-02-23 07:00:04 REDACTED.REDACTED.com
          DHCP agent  2018-02-23 07:00:00 REDACTED.REDACTED.com
      Metadata agent  2018-02-23 07:00:11 REDACTED.REDACTED.com
            L3 agent  2018-02-23 07:00:29 REDACTED.REDACTED.com
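
The same liveness information can be queried on demand instead of waiting for the healthcheck warnings to scroll by; the Alive column of the standard agent listing shows which agents are dead:

[root@kolla ~]# openstack network agent list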
  • Cinder Volume is unable to sync with RabbitMQ

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-volume.log:

2018-02-23 07:01:19.341 37 ERROR oslo.messaging._drivers.impl_rabbit [req-ffaf7f8a-33d3-473b-9c95-d20205ef700f - - - - -] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:03:11.745 37 ERROR oslo.messaging._drivers.impl_rabbit [req-ffaf7f8a-33d3-473b-9c95-d20205ef700f - - - - -] [973c0678-277d-40bf-95a7-c06788993574] AMQP server on 10.10.10.254:5672 is unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
2018-02-23 07:03:12.762 37 INFO oslo.messaging._drivers.impl_rabbit [req-ffaf7f8a-33d3-473b-9c95-d20205ef700f - - - - -] [973c0678-277d-40bf-95a7-c06788993574] Reconnected to AMQP server on 10.10.10.254:5672 via [amqp] client with port 39846.
2018-02-23 07:03:12.765 37 ERROR oslo.messaging._drivers.impl_rabbit [req-ffaf7f8a-33d3-473b-9c95-d20205ef700f - - - - -] The broker has blocked the connection: connection blocked, see broker logs
  • Keystone Apache admin

/var/lib/docker/volumes/kolla_logs/_data/keystone/keystone-apache-admin-error.log:

2018-02-23 08:40:45.801655 2018-02-23 08:40:45.800 27 INFO keystone.common.wsgi [req-10c88223-8297-4ec8-bd59-c4fd551c9dfc REDACTED REDACTED - default default] GET http://REDACTED.local:35357/v3/auth/tokens
2018-02-23 08:40:45.807948 Traceback (most recent call last):
2018-02-23 08:40:45.807988   File "/usr/lib64/python2.7/logging/__init__.py", line 875, in emit
2018-02-23 08:40:45.824165     self.flush()
2018-02-23 08:40:45.824216   File "/usr/lib64/python2.7/logging/__init__.py", line 835, in flush
2018-02-23 08:40:45.824252     self.stream.flush()
2018-02-23 08:40:45.824283 IOError: [Errno 28] No space left on device
2018-02-23 08:40:45.824296 Logged from file wsgi.py, line 238
2018-02-23 08:40:45.824706 2018-02-23 08:40:45.807 27 WARNING keystone.common.wsgi [req-10c88223-8297-4ec8-bd59-c4fd551c9dfc REDACTED REDACTED - default default] Failed to validate token: TokenNotFound: Failed to validate token
  • Keystone Apache public

/var/lib/docker/volumes/kolla_logs/_data/keystone/keystone-apache-public-error.log:

2018-02-23 08:37:53.604125 2018-02-23 08:37:53.603 21 INFO keystone.common.wsgi [req-c22a6cdb-3a94-4816-a1fd-1bc810ed5f7d - - - - -] POST http://REDACTED.local:5000/v3/auth/tokens
2018-02-23 08:40:45.838461 Traceback (most recent call last):
2018-02-23 08:40:45.838507   File "/usr/lib64/python2.7/logging/__init__.py", line 875, in emit
2018-02-23 08:40:45.839054     self.flush()
2018-02-23 08:40:45.839077   File "/usr/lib64/python2.7/logging/__init__.py", line 835, in flush
2018-02-23 08:40:45.839107     self.stream.flush()
2018-02-23 08:40:45.839131 IOError: [Errno 28] No space left on device
2018-02-23 08:40:45.839141 Logged from file wsgi.py, line 222
  • MariaDB

Once the space was freed, MariaDB stopped reprinting the "No space left on device" messages.

/var/lib/docker/volumes/kolla_logs/_data/mariadb/mariadb.log:

180130  7:40:59 [Note] WSREP: Synchronized with group, ready for connections
180223  8:40:22 [Warning] mysqld: Disk is full writing './mysql-bin.000102' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
180223  8:40:22 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
180223  8:50:22 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
...
180223 12:10:22 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
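
The binary log that MariaDB was trying to extend is itself a good place to reclaim space: once the server is writable again, old binlogs can be purged from SQL. A sketch, with the file name taken from the log above (make sure no replication consumer still needs the purged files):

docker exec -ti mariadb mysql -u root -p
# then, at the MariaDB prompt:
PURGE BINARY LOGS TO 'mysql-bin.000102';
SET GLOBAL expire_logs_days = 7;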
  • Nova API

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:

2018-02-23 08:40:08.848 23 INFO nova.osapi_compute.wsgi.server [req-9c3368d0-6567-460e-9e3c-857e038ac19a REDACTED REDACTED - default default] 10.10
.10.254 "GET /v2.1/REDACTED/servers/detail?name=%5Ek8s-worker4%24&status=ACTIVE HTTP/1.1" status: 200 len: 2358 time: 0.3392260
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters [req-ff01721b-b052-48c8-b5d2-0f6e9557dd11 REDACTED REDACTED - default default] DBAP
IError exception wrapped from (pymysql.err.InternalError) (14, u'Can\'t change size of file (Errcode: 28 "No space left on device")') [SQL: u'SELECT anon_1.instances_created_at AS anon_1_instance
s_created_at, anon_1.instances_updated_at AS anon_1_instances_updated_at, anon_1.instances_deleted_at AS anon_1_instances_deleted_at, anon_1.instances_deleted AS anon_1_instances_deleted, anon_1.
instances_id AS anon_1_instances_id, .......... [parameters: {u'binary_1': 'nova-compute', u'param_1': 1000, u'display_name_1': u'^k8s-master$', u'deleted_4': 0, u'deleted_3': 0, u'deleted_2': 0, u'deleted_1': 0, u'vm_state_2': 'active', u'vm_state_1': 'soft-delete', u'project_id_1': u'REDACTED', u'deleted_5': 0}]: InternalError: (14, u'Can\'t change size of file (Errcode: 28 "No space left on device")')
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     context)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     cursor.execute(statement, parameters)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     result = self._query(query)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     conn.query(q)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 856, in query
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1057, in _read_query_result
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     result.read()
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1340, in read
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     first_packet = self.connection._read_packet()
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1014, in _read_packet
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     packet.check_error()
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     err.raise_mysql_exception(self._data)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters     raise errorclass(errno, errval)
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters InternalError: (14, u'Can\'t change size of file (Errcode: 28 "No space left on device")')
2018-02-23 08:40:13.643 27 ERROR oslo_db.sqlalchemy.exc_filters 
2018-02-23 08:40:13.712 27 ERROR nova.api.openstack.extensions [req-ff01721b-b052-48c8-b5d2-0f6e9557dd11 REDACTED REDACTED - default default] Unexpected exception in API method: DBError: (pymysql.err.InternalError) (14, u'Can\'t change size of file (Errcode: 28 "No space left on device")') [SQL: u'SELECT anon_1.instances_created_at AS anon_1_instances_created_at, anon_1.instances_updated_at AS anon_1_instances_updated_at, anon_1.instances_deleted_at AS anon_1_instances_deleted_at, anon_1.instances_deleted AS anon_1_instances_deleted, anon_1.instances_id AS anon_1_instances_id, anon_1.instances_user_id AS .......... anon_1.instances_uuid AND anon_1.instances_deleted = %(deleted_4)s LEFT OUTER JOIN services AS services_1 ON services_1.host = anon_1.instances_host AND services_1.`binary` = %(binary_1)s AND anon_1.instances_deleted = %(deleted_5)s ...
...
2018-02-23 12:11:35.275 26 INFO nova.api.openstack.compute.server_external_events [req-e18eaa79-4efb-4812-8e59-ddd3d18e694b REDACTED REDACTED - default default] Creating event network-vif-plugged:af3c4043-e98d-4000-8d20-2c1eababa1c6 for instance c31f735a-906c-4abe-a172-aa798ece37b7 on REDACTED.REDACTED.com
  • Nova Compute

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-compute.log:

2018-02-23 07:00:32.529 7 ERROR oslo.messaging._drivers.impl_rabbit [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:00:36.197 7 ERROR oslo.messaging._drivers.impl_rabbit [-] The broker has blocked the connection: connection blocked, see broker logs
2018-02-23 07:02:10.163 7 ERROR oslo.messaging._drivers.impl_rabbit [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] [05465b3e-0561-4db3-ba32-b3f1e1da094c] AMQP server on 10.10.10.254:5672 is
 unreachable: timed out. Trying again in 1 seconds. Client port: None: timeout: timed out
2018-02-23 07:02:11.189 7 INFO oslo.messaging._drivers.impl_rabbit [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] [05465b3e-0561-4db3-ba32-b3f1e1da094c] Reconnected to AMQP server on 10.10.
10.254:5672 via [amqp] client with port 60104.
2018-02-23 07:02:11.191 7 ERROR oslo.messaging._drivers.impl_rabbit [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] The broker has blocked the connection: connection blocked, see broker logs
  • Nova Conductor

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-conductor.log:

2018-02-23 08:46:00.658 25 ERROR oslo_db.api [-] DB exceeded retry limit.: DBDeadlock: (pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'UPDATE services SET updated_at=%(updated_at)s, report_count=%(report_count)s, last_seen_up=%(last_seen_up)s WHERE services.id = %(services_id)s'] [parameters: {'last_seen_up': datetime.datetime(2018, 2, 23, 8, 45, 9, 669109), 'services_id': 5, 'updated_at': datetime.datetime(2018, 2, 23, 8, 45, 9, 670006), 'report_count': 1678797}]
2018-02-23 08:46:00.658 25 ERROR oslo_db.api Traceback (most recent call last):
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 138, in wrapper
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     return f(*args, **kwargs)
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 250, in wrapped
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     return f(context, *args, **kwargs)
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self.gen.next()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 1029, in _transaction_scope
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     yield resource
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/contextlib.py", line 24, in __exit__
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self.gen.next()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 641, in _session
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self.session.rollback()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self.force_reraise()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     six.reraise(self.type_, self.value, self.tb)
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 638, in _session
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self._end_session_transaction(self.session)
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py", line 666, in _end_session_transaction
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     session.commit()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 906, in commit
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self.transaction.commit()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 461, in commit
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self._prepare_impl()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 441, in _prepare_impl
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self.session.flush()
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2171, in flush
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     self._flush(objects)
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2291, in _flush
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     transaction.rollback(_capture_exception=True)
2018-02-23 08:46:00.658 25 ERROR oslo_db.api   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2018-02-23 08:46:00.658 25 ERROR oslo_db.api     compat.reraise(exc_type, exc_value, exc_tb)
...
2018-02-23 12:11:22.326 26 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 12650.08 sec
  • RabbitMQ

Once the space was freed, RabbitMQ stopped emitting the warning reports.

/var/lib/docker/volumes/kolla_logs/_data/rabbitmq/rabbit.log:

=INFO REPORT==== 23-Feb-2018::07:00:29 ===
Disk free space insufficient. Free bytes:49999872 Limit:50000000

=WARNING REPORT==== 23-Feb-2018::07:00:29 ===
disk resource limit alarm set on node rabbit@kolla.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************

...

=WARNING REPORT==== 23-Feb-2018::07:03:08 ===
closing AMQP connection <0.6553.0> (10.10.10.254:58700 -> 10.10.10.254:5672 - neutron-linuxbridge-agent:7:1f1717dc-b2e2-4083-9ad9-bb1231dbdd1f):
client unexpectedly closed TCP connection

...

=WARNING REPORT==== 23-Feb-2018::07:03:13 ===
closing AMQP connection <0.6825.0> (10.10.10.254:59648 -> 10.10.10.254:5672 - neutron-dhcp-agent:7:0d0d55bb-3bb2-45a4-9518-abac812c03eb):
client unexpectedly closed TCP connection

=WARNING REPORT==== 23-Feb-2018::07:03:22 ===
closing AMQP connection <0.6961.0> (10.10.10.254:60844 -> 10.10.10.254:5672 - neutron-metadata-agent:9:4ac967af-2f64-48e7-8c16-955020ff6382):
client unexpectedly closed TCP connection

...

=WARNING REPORT==== 23-Feb-2018::07:03:57 ===
closing AMQP connection <0.16289.0> (10.10.10.254:52656 -> 10.10.10.254:5672 - neutron-l3-agent:50:db4ec34e-af64-4c47-9efb-d1d9d3fff6b8):
client unexpectedly closed TCP connection

=WARNING REPORT==== 23-Feb-2018::07:04:10 ===
closing AMQP connection <0.26678.515> (10.10.10.254:55166 -> 10.10.10.254:5672 - nova-compute:7:05465b3e-0561-4db3-ba32-b3f1e1da094c):
client unexpectedly closed TCP connection

...

=WARNING REPORT==== 23-Feb-2018::07:04:11 ===
closing AMQP connection <0.4300.0> (10.10.10.254:35482 -> 10.10.10.254:5672 - cinder-volume:37:973c0678-277d-40bf-95a7-c06788993574):
client unexpectedly closed TCP connection
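
The 50000000-byte threshold in the first report is RabbitMQ's disk_free_limit: once free disk drops below it, the broker blocks all publishers, which is exactly what every service above kept logging. It can be inspected and, if needed, raised at runtime (container name as in my Kolla deployment; a limit set this way is transient and resets on broker restart):

[root@kolla ~]# docker exec -ti rabbitmq rabbitmqctl status | grep disk_free
[root@kolla ~]# docker exec -ti rabbitmq rabbitmqctl set_disk_free_limit 1GB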

OpenStack logs after the occupied space was freed

  • dnsmasq is back up again

/var/lib/docker/volumes/kolla_logs/_data/neutron/dnsmasq.log:

Feb 23 10:20:33 dnsmasq-dhcp[62]: failed to write /var/lib/neutron/dhcp/f2780ea0-8d83-4434-9d0f-914392d1c3b1/leases: No space left on device (retry in 60s)
...
Feb 23 12:11:35 dnsmasq[110]: started, version 2.76 cachesize 150
Feb 23 12:11:35 dnsmasq[110]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Feb 23 12:11:35 dnsmasq-dhcp[110]: DHCP, static leases only on 10.0.0.0, lease time 1d
Feb 23 12:11:35 dnsmasq-dhcp[110]: DHCP, sockets bound exclusively to interface ns-a1f7e93e-53
...
  • Neutron DHCP agent is revived

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-dhcp-agent.log:

2018-02-23 12:11:22.852 7 INFO neutron.agent.dhcp.agent [req-c358db4f-8778-4bda-a8e7-0fde7006044a - - - - -] Agent has just been revived. Scheduling full sync
2018-02-23 12:11:22.853 7 WARNING oslo.service.loopingcall [req-c358db4f-8778-4bda-a8e7-0fde7006044a - - - - -] Function 'neutron.agent.dhcp.agent.DhcpAgentWithStateReport._report_state' run outlasted interval by 18622.78 sec
2018-02-23 12:11:23.649 7 INFO neutron.agent.dhcp.agent [-] Synchronizing state
2018-02-23 12:11:23.937 7 INFO neutron.agent.dhcp.agent [req-5541106a-7cd2-4df8-a4b0-0e64a670b739 - - - - -] All active networks have been fetched through RPC.
2018-02-23 12:11:23.938 7 INFO neutron.agent.dhcp.agent [-] Starting network 3e7aa9d9-35f7-4a6e-b551-b4bfafdafa4a dhcp configuration
2018-02-23 12:11:23.939 7 INFO neutron.agent.dhcp.agent [-] Finished network 3e7aa9d9-35f7-4a6e-b551-b4bfafdafa4a dhcp configuration
2018-02-23 12:11:23.939 7 INFO neutron.agent.dhcp.agent [-] Starting network f2780ea0-8d83-4434-9d0f-914392d1c3b1 dhcp configuration
2018-02-23 12:11:35.335 7 INFO neutron.agent.dhcp.agent [-] Finished network f2780ea0-8d83-4434-9d0f-914392d1c3b1 dhcp configuration
2018-02-23 12:11:35.335 7 INFO neutron.agent.dhcp.agent [req-5541106a-7cd2-4df8-a4b0-0e64a670b739 - - - - -] Synchronizing state complete
  • Neutron Linuxbridge agent is revived

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-linuxbridge-agent.log:

2018-02-23 12:11:22.913 7 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Linux bridge agent Agent has just been revived. Doing a full sync.
2018-02-23 12:11:22.914 7 WARNING oslo.service.loopingcall [-] Function 'neutron.plugins.ml2.drivers.agent._common_agent.CommonAgentLoop._report_state' run outlasted interval by 18618.03 sec
2018-02-23 12:11:24.055 7 INFO neutron.plugins.ml2.drivers.agent._common_agent [req-1fcd6e39-e732-4a32-8fe8-d15857fc0545 - - - - -] Linux bridge agent Agent out of sync with plugin!
2018-02-23 12:11:24.056 7 INFO neutron.agent.securitygroups_rpc [req-1fcd6e39-e732-4a32-8fe8-d15857fc0545 - - - - -] Preparing filters for devices set(['tapaf3c4043-e9', 'tap6dfd8529-2b', 'tapc7a
e4a2c-df', 'tapc72efcf4-2a', 'tap8ccb3f52-dd', 'tapa580c1fb-92', 'tap221bc6dd-d7', 'tapa1f7e93e-53', 'tapd9120e80-78'])
2018-02-23 12:11:28.332 7 INFO neutron.agent.securitygroups_rpc [req-1fcd6e39-e732-4a32-8fe8-d15857fc0545 - - - - -] Security group member updated [u'bd618343-6a83-4bd3-b7e7-189a359f225e']
... followed by the messages like: security group member updated, Port tapXXXXXX updated, Refresh firewall rules, ...
  • Neutron metadata agent is back

It is worth noting that the 18569.27 sec visible in the log, which is roughly 5 hours, matches the outage duration.

/var/lib/docker/volumes/kolla_logs/_data/neutron/neutron-metadata-agent.log:

2018-02-23 12:10:40.969 9 WARNING oslo.service.loopingcall [-] Function 'neutron.agent.metadata.agent.UnixDomainMetadataProxy._report_state' run outlasted interval by 18569.27 sec
2018-02-23 20:39:27.941 25 INFO eventlet.wsgi.server [-] 10.0.0.9,<local> "GET /openstack/2012-08-10/meta_data.json HTTP/1.1" status: 200  len: 1865 time: 6.5291049
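
The GET in the last line is a VM fetching its metadata again; the same URL can be requested from inside any instance to confirm the whole chain (proxy, namespace, Nova API) is back:

curl http://169.254.169.254/openstack/2012-08-10/meta_data.json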
  • Cinder Backup

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:

2018-02-23 12:11:22.464 6 WARNING oslo.service.loopingcall [req-03d05e53-cbbd-479d-9772-b0cf2eb5e34b - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 12641.23 sec
  • Cinder Scheduler

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-scheduler.log:

2018-02-23 12:11:22.439 7 WARNING oslo.service.loopingcall [req-c601094c-c815-46fe-871e-1e655017158d - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 12642.57 sec
2018-02-23 20:54:51.361 7 INFO cinder.message.api [req-33c2207d-ef3b-44a3-8c05-6e7dcbe0ea2d - - - - -] Deleted 0 expired messages.
  • Cinder Volume

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-volume.log:

2018-02-23 12:10:44.050 37 WARNING oslo.service.loopingcall [req-ffaf7f8a-33d3-473b-9c95-d20205ef700f - - - - -] Function 'cinder.service.Service.periodic_tasks' run outlasted interval by 18504.73 sec
2018-02-23 12:11:22.382 37 WARNING oslo.service.loopingcall [req-4b7f1ceb-a594-4730-944c-f92fc47d1d58 - - - - -] Function 'cinder.service.Service.report_state' run outlasted interval by 12646.82 sec
  • Heat Engine

/var/lib/docker/volumes/kolla_logs/_data/heat/heat-engine.log:

2018-02-23 12:11:22.531 22 WARNING oslo.service.loopingcall [req-ef7e9d2e-0898-4bd7-a67b-f0f912274a27 - - - - -] Function 'heat.engine.service.EngineService.service_manage_report' run outlasted interval by 12585.08 sec
2018-02-23 12:11:22.498 24 WARNING oslo.service.loopingcall [req-ead8a370-6ac2-491c-b9e0-6582816933bf - - - - -] Function 'heat.engine.service.EngineService.service_manage_report' run outlasted interval by 12585.22 sec
2018-02-23 12:11:22.565 25 WARNING oslo.service.loopingcall [req-49902f60-d356-440a-9ea3-6755239fe43f - - - - -] Function 'heat.engine.service.EngineService.service_manage_report' run outlasted interval by 12585.06 sec
2018-02-23 12:11:22.590 23 WARNING oslo.service.loopingcall [req-2ccc34d6-8dd8-4adf-8b5d-f97b059f18d7 - - - - -] Function 'heat.engine.service.EngineService.service_manage_report' run outlasted interval by 12584.89 sec
2018-02-23 12:11:22.615 26 WARNING oslo.service.loopingcall [req-4cecfca2-3883-4cde-8f19-17d2d29a4005 - - - - -] Function 'heat.engine.service.EngineService.service_manage_report' run outlasted interval by 12584.89 sec
  • Nova API

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-api.log:

2018-02-23 12:11:35.281 26 INFO nova.osapi_compute.wsgi.server [req-e18eaa79-4efb-4812-8e59-ddd3d18e694b REDACTED REDACTED - default default] 10.10.10.254 "POST /v2.1/REDACTED/os-server-external-events HTTP/1.1" status: 200 len: 582 time: 0.2315381
  • Nova Compute

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-compute.log:

2018-02-23 12:10:40.968 7 INFO oslo.messaging._drivers.impl_rabbit [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] The broker has unblocked the connection
2018-02-23 12:10:40.968 7 INFO oslo.messaging._drivers.impl_rabbit [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] The broker has unblocked the connection
2018-02-23 12:10:41.092 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 7966a546-48a7-49ef-831a-35e8bf471038] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (3). Updating power_state in the DB to match the hypervisor.
2018-02-23 12:11:22.713 7 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 18636.52 sec
2018-02-23 12:11:22.940 7 WARNING nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 7966a546-48a7-49ef-831a-35e8bf471038] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:22.940 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 39cb3435-7092-4417-9d50-b4e2882067e1] VM Paused (Lifecycle Event)
2018-02-23 12:11:23.024 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 39cb3435-7092-4417-9d50-b4e2882067e1] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (3). Updating power_state in the DB to match the hypervisor.
2018-02-23 12:11:23.439 7 WARNING nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 39cb3435-7092-4417-9d50-b4e2882067e1] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:23.439 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] VM Paused (Lifecycle Event)
2018-02-23 12:11:23.633 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (3). Updating power_state in the DB to match the hypervisor.
2018-02-23 12:11:23.737 7 WARNING nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:23.738 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: f89b6c01-1d0c-47f6-82ce-0c8eab33dcff] VM Paused (Lifecycle Event)
2018-02-23 12:11:23.854 7 INFO nova.compute.resource_tracker [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] Final resource view: name=REDACTED.REDACTED.com phys_ram=32665MB used_ram=18944MB phys_disk=98GB used_disk=120GB total_vcpus=8 used_vcpus=12 pci_stats=[]
2018-02-23 12:11:23.861 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: f89b6c01-1d0c-47f6-82ce-0c8eab33dcff] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (3). Updating power_state in the DB to match the hypervisor.
2018-02-23 12:11:23.965 7 WARNING nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: f89b6c01-1d0c-47f6-82ce-0c8eab33dcff] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:23.966 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 690ea67a-e624-411c-bd74-33975d5191de] VM Paused (Lifecycle Event)
2018-02-23 12:11:24.049 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 690ea67a-e624-411c-bd74-33975d5191de] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (3). Updating power_state in the DB to match the hypervisor.
2018-02-23 12:11:24.073 7 INFO nova.virt.libvirt.imagecache [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] image d7f5c6de-ae46-40ef-992b-7c100274caf0 at (/var/lib/nova/instances/_base/e9d2aa7635d0db5e7fdeb18f68bad367189ccfb1): checking
2018-02-23 12:11:24.396 7 WARNING nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 690ea67a-e624-411c-bd74-33975d5191de] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:27.219 7 INFO nova.virt.libvirt.imagecache [req-0fc0daf5-2b0e-4275-8629-639637fbb24b - - - - -] Active base files: /var/lib/nova/instances/_base/e9d2aa7635d0db5e7fdeb18f68bad367189ccfb1
2018-02-23 12:11:35.236 7 WARNING nova.compute.manager [-] [instance: f89b6c01-1d0c-47f6-82ce-0c8eab33dcff] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:35.273 7 WARNING nova.compute.manager [-] [instance: 7966a546-48a7-49ef-831a-35e8bf471038] Instance is paused unexpectedly. Ignore.
2018-02-23 12:11:35.286 7 WARNING nova.compute.manager [req-e18eaa79-4efb-4812-8e59-ddd3d18e694b REDACTED REDACTED - default default] [instance: c31f735a-906c-4abe-a172-aa798ece37b7] Received unexpected event network-vif-plugged-af3c4043-e98d-4000-8d20-2c1eababa1c6 for instance
2018-02-23 12:11:35.293 7 WARNING nova.compute.manager [-] [instance: 39cb3435-7092-4417-9d50-b4e2882067e1] Instance is paused unexpectedly. Ignore.
...
...
2018-02-23 20:37:59.401 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] VM Resumed (Lifecycle Event)
2018-02-23 20:37:59.517 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] During sync_power_state the instance has 
a pending task (unpausing). Skip.
2018-02-23 20:37:59.518 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] VM Resumed (Lifecycle Event)
2018-02-23 20:37:59.610 7 INFO nova.compute.manager [req-4afefd07-bccb-48cd-ba06-42707516547a - - - - -] [instance: 5b57e825-1e26-4993-9cb5-06f83c538974] During sync_power_state the instance has 
a pending task (unpausing). Skip.
...
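
For reading these messages: the power_state values come from Nova's power_state module, where 1 means RUNNING and 3 means PAUSED, so each pair of lines is Nova noticing that an instance marked as running in its database is actually paused on the hypervisor and syncing the database to match.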
  • Nova Conductor

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-conductor.log:

2018-02-23 12:11:22.963 25 INFO nova.servicegroup.drivers.db [-] Recovered from being unable to report status.
2018-02-23 12:11:22.963 25 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 60.69 sec
2018-02-23 12:11:23.056 24 INFO nova.servicegroup.drivers.db [-] Recovered from being unable to report status.
2018-02-23 12:11:23.057 24 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 60.78 sec
2018-02-23 12:11:23.375 27 INFO nova.servicegroup.drivers.db [-] Recovered from being unable to report status.
2018-02-23 12:11:23.375 27 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 60.10 sec
2018-02-23 12:11:23.450 23 INFO nova.servicegroup.drivers.db [-] Recovered from being unable to report status.
2018-02-23 12:11:23.450 23 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 60.18 sec
  • Nova Consoleauth

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-consoleauth.log:

2018-02-23 12:11:22.361 7 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 12647.37 sec
  • Nova Scheduler

/var/lib/docker/volumes/kolla_logs/_data/nova/nova-scheduler.log:

2018-02-23 12:11:22.406 7 WARNING oslo.service.loopingcall [-] Function 'nova.servicegroup.drivers.db.DbDriver._report_state' run outlasted interval by 12645.99 sec