Article ID: 131566, created on Oct 6, 2017, last review on Oct 6, 2017

Applies to:

  • Operations Automation

Symptoms

After a Virtuozzo node outage, a large number of VEs (more than 100) were relocated to a single node, and some of them failed to start. The following error appears in /var/log/pa/vps.log on the OACI IM node:

    2017-08-31 14:27:14,343 (bb598b68-b195-4439-8ff4-e7ec92e81ad5) INFO  GenericVm2VfTask [Shared executor thread #11 @1 @INTERACTIVE] - done_with_message(-2147482544, PRL_ERR_TRY_AGAIN)
    2017-08-31 14:27:14,344 (bb598b68-b195-4439-8ff4-e7ec92e81ad5) WARN  GenericVm2VfTask [Shared executor thread #11 @1 @INTERACTIVE] - VM2VF operation [START] (reqId=113193) finished with rc=-2147482544 
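
The affected start requests can be picked out of the log programmatically. The following is a minimal Python sketch, assuming the log lines follow exactly the format of the excerpt above. The names `find_try_again_failures` and `FailedOp` are hypothetical and not part of any product API, and the ID in parentheses is treated as an opaque context ID because the excerpt does not say what it identifies.

```python
import re
from typing import Iterable, List, NamedTuple

# Return code that accompanies PRL_ERR_TRY_AGAIN in the log excerpt.
PRL_ERR_TRY_AGAIN_RC = -2147482544

# Matches only the "finished with rc=..." line format shown in the excerpt;
# any other log line format is ignored (assumption).
FINISHED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\((?P<ctx>[0-9a-f-]{36})\) .*"
    r"VM2VF operation \[(?P<op>\w+)\] \(reqId=(?P<req>\d+)\) "
    r"finished with rc=(?P<rc>-?\d+)"
)


class FailedOp(NamedTuple):
    timestamp: str
    context_id: str  # ID in parentheses; its meaning is not stated in the log
    operation: str
    req_id: int


def find_try_again_failures(lines: Iterable[str]) -> List[FailedOp]:
    """Return the operations that finished with the PRL_ERR_TRY_AGAIN rc."""
    failures = []
    for line in lines:
        match = FINISHED_RE.search(line.strip())
        if match and int(match.group("rc")) == PRL_ERR_TRY_AGAIN_RC:
            failures.append(
                FailedOp(
                    timestamp=match.group("ts"),
                    context_id=match.group("ctx"),
                    operation=match.group("op"),
                    req_id=int(match.group("req")),
                )
            )
    return failures
```

To use it, pass an open file object for /var/log/pa/vps.log to `find_try_again_failures` and keep the entries whose `operation` is `START`.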

Cause

When many VEs are started simultaneously on a Virtuozzo node, they hit a limit on parallel operations. Some of the start operations fail with the PRL_ERR_TRY_AGAIN error, and the start is not attempted again. This behavior is planned to be improved in the scope of internal request CCU-17188.

Resolution

To prevent this, distribute VEs more evenly across the cluster, so that in case of a failover the VEs from a failed node are relocated across many nodes instead of a large portion of them landing on a single node.

VEs that failed to start in this way must be started manually.
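
When many VEs need to be started by hand, staggering the starts avoids hitting the same parallel-operations limit again. The following is a rough Python sketch of that idea, not a product feature. `start_in_batches` and the `start_ve` callable are hypothetical: `start_ve` stands for whatever mechanism is used to start one VE, and it is assumed to return 0 on success and the rc from the log (-2147482544) when the node asks to try again. The batch size, attempt count, and pause are arbitrary illustration values.

```python
import time
from typing import Callable, Iterable, List

# Return code that accompanies PRL_ERR_TRY_AGAIN in vps.log.
PRL_ERR_TRY_AGAIN_RC = -2147482544


def start_in_batches(
    ve_ids: Iterable[str],
    start_ve: Callable[[str], int],  # hypothetical: starts one VE, returns rc
    batch_size: int = 10,
    max_attempts: int = 3,
    pause_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start VEs in small batches and retry the ones told to try again.

    Returns the IDs that were not started: those that failed with another
    rc, followed by those still getting PRL_ERR_TRY_AGAIN after the last
    attempt.
    """
    pending = list(ve_ids)
    not_started: List[str] = []
    for attempt in range(max_attempts):
        retry: List[str] = []
        for i in range(0, len(pending), batch_size):
            if i > 0:
                sleep(pause_s)  # give the node time between batches
            for ve_id in pending[i:i + batch_size]:
                rc = start_ve(ve_id)
                if rc == PRL_ERR_TRY_AGAIN_RC:
                    retry.append(ve_id)  # node was busy, try again later
                elif rc != 0:
                    not_started.append(ve_id)  # other error, do not retry
        pending = retry
        if not pending:
            break
        if attempt < max_attempts - 1:
            sleep(pause_s)  # back off before the next attempt
    return not_started + pending
```

Any IDs returned by the function still need individual attention, for example by checking vps.log for the rc of their last start operation.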
