Symptoms
Various Parallels Operations Automation (POA) background tasks fail with CORBA errors:
- CORBA/TIMEOUT
- CORBA/TRANSIENT
- CORBA/COMM_FAILURE
Examples of more detailed diagnostics are provided below:
Destination host 'lsh.provider.com' (#32), IP '10.39.93.101' :
The remote server is temporarily down. Please, make sure that destination host is accessible from POA management node and POA agent is running there.
Details: system exception, ID 'IDL:omg.org/CORBA/TRANSIENT:1.0'
OMG minor code (2), described as 'No usable profile in IOR.', completed = NO
or:
Destination host 'lsh.provider.com' (#32), IP '10.39.93.101' :
Communication failure with remote server. Please, make sure that destination host is accessible from POA management node and POA agent is running there.
Details: system exception, ID 'IDL:omg.org/CORBA/COMM_FAILURE:1.0'
TAO exception, minor code = 1 (failed to recv request response; ETIMEOUT), completed = MAYBE
or:
Request has been timed out, details:
system exception, ID 'IDL:omg.org/CORBA/TIMEOUT:1.0'
TAO exception, minor code = 79 (timeout during recv; low 7 bits of errno: 121 No error), completed = MAYBE
or:
Operation execution is aborted by timeout. Please increase timeout for this operation and try again.
Details: system exception, ID 'IDL:omg.org/CORBA/TIMEOUT:1.0' TAO exception, minor code = 3e (timeout during recv; low 7 bits of errno: 62 Timer expired), completed = MAYBE
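As a triage aid, the exception name, minor code, and completion status can be pulled out of messages like the ones above with a short script. The following is a minimal sketch and not part of POA. The names parse_corba_error and errno_from_minor are hypothetical. Treating the TAO minor code as a hexadecimal value whose low 7 bits hold the errno is an assumption, based only on the two TIMEOUT examples above (79 and 121, 3e and 62).

```python
import re
from typing import NamedTuple, Optional

# Patterns modelled on the sample messages in this article.
EXC_RE = re.compile(r"IDL:omg\.org/CORBA/(\w+):1\.0")
MINOR_RE = re.compile(r"minor code\s*(?:=\s*([0-9a-fA-F]+)|\((\d+)\))")
COMPLETED_RE = re.compile(r"completed\s*=\s*(\w+)")


class CorbaError(NamedTuple):
    exception: str              # e.g. "TIMEOUT", "TRANSIENT", "COMM_FAILURE"
    minor_code: Optional[str]   # kept as text, exactly as printed in the message
    completed: Optional[str]    # "NO", "MAYBE", ...


def parse_corba_error(text: str) -> Optional[CorbaError]:
    """Return the CORBA details found in a task error message, or None."""
    exc = EXC_RE.search(text)
    if exc is None:
        return None
    minor = MINOR_RE.search(text)
    completed = COMPLETED_RE.search(text)
    return CorbaError(
        exception=exc.group(1),
        minor_code=(minor.group(1) or minor.group(2)) if minor else None,
        completed=completed.group(1) if completed else None,
    )


def errno_from_minor(minor_code: str) -> int:
    """Assumption: the minor code is printed in hex; keep its low 7 bits."""
    return int(minor_code, 16) & 0x7F
```

Calling parse_corba_error on the last sample message yields the exception TIMEOUT, the minor code 3e, and the completion status MAYBE. A status of MAYBE means the request may have been partly processed on the managed host before the failure.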
Cause
The tasks fail as a result of communication problems between the POA Management Node (core server) and managed hosts.
POA uses CORBA (Common Object Request Broker Architecture) technology to communicate with managed hosts through the POA Agent. To perform the required provisioning operations, the POA Management Node sends CORBA requests to the POA Agent, which must be installed on every host registered in and managed by POA.
In the event of problems sending requests to or receiving responses from the POA Agent, whatever the cause, background tasks may fail in POA with CORBA errors.
The most likely causes of communication problems between the POA Management Node and the POA Agent working on managed hosts are:
- The managed host is down
- The POA Agent has stopped on the managed host or is stuck
- The POA Management Node cannot establish a connection to the POA Agent on the managed host due to problems with routing or a firewall
- The provisioning system (MPS or WPE) is stuck
In the case of APS tasks, CORBA timeouts may be caused by communication problems between an APS application's provisioning scripts and an external system, such as Microsoft Office 365, Hyper-V, Open-Xchange, LiveOffice, and many others.
In this case, the POA Management Node may successfully send a request to the so-called Provisioning Gateway Host, which in turn sends the request to the external system and may time out waiting for a response.
Resolution
Make sure the POA Management Node can establish a connection to the POA Agent working on the managed host, send a request to the POA Agent, and receive a response from it.
Troubleshooting steps are provided below:
Step 1. First, simply restart the failed task. If the connectivity problem between the POA Management Node and the POA Agent has already been resolved, the task will be processed. If the task fails with a CORBA error again after restarting, follow the next steps.
Step 2. Find the hostname and/or IP address of the problem host. This is mentioned in the failed task in the POA Task Manager as the destination host.
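When the task output is available as text, the destination host can also be extracted programmatically. This is a minimal sketch that assumes the message follows the "Destination host ... IP ..." layout shown in the Symptoms section. The function name parse_destination is hypothetical, and reading the number after "#" as a host ID is an assumption.

```python
import re
from typing import Optional, Tuple

# Layout taken from the sample messages:
#   Destination host 'lsh.provider.com' (#32), IP '10.39.93.101' :
DEST_RE = re.compile(r"Destination host '([^']+)' \(#(\d+)\), IP '([^']+)'")


def parse_destination(text: str) -> Optional[Tuple[str, int, str]]:
    """Return (hostname, number after '#', IP address), or None if absent."""
    match = DEST_RE.search(text)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), match.group(3)
```

For the sample message, this returns the hostname lsh.provider.com, the number 32, and the IP address 10.39.93.101.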
Step 3. Make sure the problem host is online:
- Try to ping it from the POA Management Node
- Try to open an SSH or RDP connection to the host
- Make sure the host does not experience any performance issues (if the POA task is one that takes a long time to run, its execution time may go beyond the timeout limit on a slow server)
- Check the host physically
- Try to log in to the host's local console or use KVM
- Make sure there is enough disk space on the problem host
If the managed server is down, powered off, or stuck, reboot it. Then, once the managed server is online, restart the failed task in POA.
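The disk space check from the list above can be scripted and run on the problem host itself. The sketch below uses only the Python standard library. The function names and the 10% free-space threshold are illustrative assumptions, not values documented for POA.

```python
import shutil


def free_fraction(path: str) -> float:
    """Fraction of the filesystem containing `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a zero size; treat them as full.
        return 0.0
    return usage.free / usage.total


def has_enough_space(path: str, min_free_fraction: float = 0.10) -> bool:
    """True if at least `min_free_fraction` of the filesystem is free.

    The 0.10 default is an arbitrary illustrative threshold.
    """
    return free_fraction(path) >= min_free_fraction
```

For example, has_enough_space("/") checks the root filesystem on a Linux host. On a Windows host, pass a drive path such as "C:\\" instead.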
Step 4. Make sure the POA Agent is up and running on the problem host. Use the instructions from Odin Knowledgebase article #114184: How to check POA Agent status.
Restart the failed task in POA.
Step 5. Restart the POA Agent on the managed host, using the instructions from Odin Knowledgebase article #4642: How to restart POA system services: UI, Management Node, Agents.
Restart the failed task in POA.
Important: The managed host will be unavailable for provisioning tasks during a POA Agent restart. This means the host will be unavailable for new service provisioning and for tasks to modify/reconfigure/remove existing customers' services. Customer services such as websites or mailboxes will not be affected by a POA Agent restart.
Step 6. If the failed task provisions a Windows service, consider restarting the corresponding Provisioning Engine:
- MPS (Microsoft Provisioning System) - Use the instructions in Knowledgebase article #1851: How to restart MPS Provisioning Engine properly.
- WPE (Windows Provisioning Engine) - Restart IIS on the WPE host using the iisreset utility.
Also make sure all the ports required for communication are open in accordance with the Firewall guide (Windows-related section): http://download.pa.parallels.com/poa/5.5/doc/index.htm?fileName=41301.htm
Step 7. Make sure that TCP ports 8352-8500 are open for connections between the POA Management Node and all managed hosts in the POA-managed environment.
Use the telnet utility on the POA Management Node to check that the necessary ports are open (i.e., to check if the POA Management Node can connect to the POA Agent):
telnet MANAGED_HOST 8352
Replace MANAGED_HOST in the command above with the actual hostname or IP address of the problem host from the failed POA task.
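If telnet is not installed on the POA Management Node, the same TCP connection test can be approximated in Python. This is a minimal sketch. The function name port_is_open and the 5-second default timeout are assumptions. A successful result only shows that a TCP connection was accepted on that port, not that the POA Agent is processing requests correctly.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Try to open a TCP connection to host:port, as `telnet host port` would."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it on the POA Management Node as port_is_open("MANAGED_HOST", 8352), with MANAGED_HOST replaced in the same way as in the telnet command. A result of False suggests a routing or firewall problem, or a POA Agent that is not listening.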
Restart the failed task in POA.
Use tcpdump on Linux, or Network Monitor or Wireshark on Windows, on the POA Management Node and/or the managed host to find out which port(s) the POA Management Node is trying to connect to while a POA task is running, then open the necessary ports.
For more details on the ports required for proper communication between the POA Management Node and the POA Agent, refer to the PA Firewall Configuration Guide > POA Management Node.
Step 8. If an APS task is failing, check that the Provisioning Gateway Host (PGH) can connect to the external system. Check the logs of the external system to determine whether it received a request from the POA PGH and sent a response within the timeout period (usually 1 hour).
Restart the failed task in POA.
Step 9. Restart the POA service on the Management Node. This will cause brief downtime for the POA Control Panel service, but no customer services will be affected because they are not hosted on the Management Node.
Use the instructions from Odin Knowledgebase article #4642: How to restart POA system services: UI, Management Node, Agents.
Restart the failed task in POA.