Article ID: 130878, created on May 25, 2017, last review on Jul 26, 2017

Applies to:

  • Operations Automation 7.1

Symptoms

Some control panel applications, such as "Cloud Infrastructure", or the entire Provider's Control Panel become unavailable. A login attempt to the management node's UI via http://mn_ip:8080 produces an error:

java.lang.NullPointerException: null

Various types of Operations Automation tasks fail with the following error:

WFLYEJB0442: Unexpected Error

A common symptom for all such tasks is that, in core.log, each of them ends with the following error message:

[task:159017197:17829 p:-default-threadpool;-w:-Idle:490 pau]: c.p.p.tracer exit by exception: com.parallels.pa.service.host.ejb.HCLSenderBean.sendHCLjava.lang.OutOfMemoryError: unable to create new native thread

Note the method that could not create a thread: sendHCL.

In console.log OutOfMemory errors occur frequently:

SEVERE [org.glassfish.jersey.server.ServerRuntime$Responder] (pa-rest task-192) An exception was not mapped due to exception mapper failure. The HTTP 500 response will be returned.: com.google.common.util.concurrent.ExecutionError: com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: unable to create new native thread

Nevertheless, memory statistics show no shortage of resources.
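To gauge how often the error appears, the log files can be scanned for the message text quoted above. The following is a minimal sketch, not an official tool: the function names are hypothetical, and the location of core.log and console.log on the management node is left to the caller.

```python
# Message text copied from the log excerpts in this article.
OOM_MARKER = "java.lang.OutOfMemoryError: unable to create new native thread"


def count_native_thread_oom(lines):
    """Return how many of the given lines contain the native-thread OOM message."""
    return sum(1 for line in lines if OOM_MARKER in line)


def count_from_file(path):
    """Count matching lines in one log file; the path is supplied by the caller."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as handle:
        return count_native_thread_oom(handle)
```

Calling count_from_file() on core.log and on console.log gives a rough count per file. A growing count while free memory stays available matches the symptoms described here.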

Cause

A thread leak in the sendHCL method. Threads started in this method stay open, and at some point the pau process reaches the system limit on the allowed number of threads.
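One way to watch the leak is to compare a process's thread count with its process limit, both of which Linux exposes under /proc. The sketch below assumes a Linux management node and a PID for the pau process obtained separately; the function names are hypothetical. Note that the nproc limit applies to all processes and threads of the user, so comparing it against a single process is only an approximation.

```python
def parse_thread_count(status_text):
    """Extract the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no 'Threads:' field found")


def parse_max_processes(limits_text):
    """Extract the soft 'Max processes' limit from the text of /proc/<pid>/limits.

    Returns None when the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            soft = line.split()[2]  # columns: name (two words), soft, hard, units
            return None if soft == "unlimited" else int(soft)
    raise ValueError("no 'Max processes' line found")


def thread_usage(pid):
    """Return (current thread count, soft process limit) for the given PID."""
    with open(f"/proc/{pid}/status") as handle:
        threads = parse_thread_count(handle.read())
    with open(f"/proc/{pid}/limits") as handle:
        limit = parse_max_processes(handle.read())
    return threads, limit
```

If the thread count returned by thread_usage() keeps climbing between restarts and approaches the limit, that is consistent with the leak described above.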

The tasks "Get traffic usage" and "Collect resources usage statistics from web clusters" contribute the most, since by default they run frequently and send a lot of requests.

This issue has been passed to the Engineering team for further investigation as POA-111472: "Outage of several WildFly applications, sendHCL java.lang.OutOfMemoryError".

Resolution

The issue can be worked around by performing the following steps:

  1. Increase the thread limit to 16192 for the user jboss in the file /etc/security/limits.conf:

    jboss soft nproc 16192

    A restart of OA services is required to apply the changes.

  2. Make the "Get traffic usage info" and "Collect resources usage statistics from web clusters" tasks less frequent (once an hour).

  3. Restart OA services per KB during the usual maintenance window to reset the thread count.
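Before restarting, it may help to confirm that the entry from step 1 is actually present in the file. The sketch below parses limits.conf-style text, one "domain type item value" entry per line; the function name find_limit is hypothetical. It does not read files under /etc/security/limits.d/, which pam_limits also processes and which may set a different nproc value.

```python
def find_limit(conf_text, domain, item, limit_type="soft"):
    """Return the last matching value from limits.conf-style text, or None.

    A type of '-' in the file sets both soft and hard limits, so it also matches.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        fields = line.split()
        if len(fields) != 4:
            continue  # blank or malformed line
        dom, typ, itm, val = fields
        if dom == domain and itm == item and typ in (limit_type, "-"):
            value = val
    return value


def configured_nproc(path="/etc/security/limits.conf", user="jboss"):
    """Read the file from step 1 and return the soft nproc value for the user."""
    with open(path) as handle:
        return find_limit(handle.read(), user, "nproc")
```

With the line from step 1 in place, configured_nproc() should return the string "16192". The limit seen by the running process changes only after the OA services are restarted.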

Please contact your technical manager to clarify the status of POA-111472.
