Table of Contents
- 1.1 Configure system-wide settings
- 1.2 Configure system security and staff members
- 1.3 Configure firewall and network
- 1.4 Configure management node full server backups and store them remotely
- 3.1 Monitoring the system
- 3.2 Monitor task queue to ensure stable provisioning
- 3.3 Monitor limits and capacities regularly
- 3.4 Do FULL Vacuum every 3 months
- 4.1 OA system server OS
- 4.2 OA minor updates
- 4.3 OA major upgrades - general recommendations
- 4.4 Validate customizations after every update
- 4.5 Virtuozzo hardware nodes
- 4.6 APS packages
The main Odin Automation maintenance guide is a hub of useful articles and proven best practices for general system configuration, services maintenance, daily operations and so on. It will help you make sure that the Odin Automation system operates properly. It is crucial that you monitor the stability of all components, check their status regularly, track changes in progress, know what to check in case of any problem, and react on time.
1. General aspects of system configuration
1.1 Configure system-wide settings
There is a set of global system parameters (also known as System Properties) that apply to the whole OA installation and affect the general workflow. They can be configured in the OA Provider Control Panel at Top > System > Settings > System Properties > General. Please review the following article to configure the most important parameters:
KB #113948 Configure system-wide settings
1.2 Configure system security and staff members
- Make sure that you have a staff member who is familiar with the Odin platform and acquainted with Odin training materials (possibly hire a dedicated engineer). Strong knowledge of the Odin platform allows the provider's technical team to resolve issues using the Odin public KB. Issues that cannot be resolved by a trained staff member can be submitted to Odin Technical Support with a thorough issue description and basic troubleshooting already performed. This minimizes the time needed for support investigation and allows quicker resolution.
- Set up granular roles and privileges for staff members in OA and BA. For example, people from finance should not make changes that are supposed to be made by the accounting team. Granular privileges minimize the risk of unauthorized actions on the platform, which in turn minimizes downtime and financial loss for the provider. The privileges concept is fully described in the following guide: Understanding OA Operations Privileges Management Concept. Please get acquainted with the important chapters of the provider's guide about managing accounts:
For further details please refer to the following article:
KB #113949 Configure system security
1.3 Configure firewall and network
General recommendations for adapting network security to Odin Automation needs can be found in the firewall configurations guide. However, the backnet part of the OA network is designed for internal use only and is more secure by its purpose. Therefore there is no need for intensive security measures in the backnet part of the network. Nevertheless, IPv6 should be disabled for security reasons on the Odin Automation management node, UI and branding nodes. Please see additional information in Hardware requirements. Some aspects of network configuration are listed below:
Some security practices may even harm Odin Automation stability. To name a few: blocking web server verbs in backnet (PUT, POST, DELETE), forcibly closing long connections, blocking "keep alives". In some cases this can lead to provisioning failures, because some operations are long-running by nature.
OACI specifics: in the standard Cloud Infrastructure configuration, IM nodes communicate with the OA Operations Management Node via a single IP address. To maximize the Instance Manager throughput and increase its reliability, set up a load balancer to distribute incoming traffic across all IM Nodes. You can use any load-balancing software of your choice to load-balance traffic to Instance Manager.
Make sure that all required ports are open on Virtuozzo hardware nodes.
Place Odin Automation name servers in different subnets.
- Review the section of the guide about OACI: Network Groups
Odin does not recommend using traffic shaping solutions in Odin Automation networks. Some types of Odin Automation services have strong requirements on network bandwidth, which must be met to guarantee high performance of services such as the Odin Automation control panel. Traffic shaping limits bandwidth and delays some or all datagrams, which could result in performance degradation.
For details about the firewall configuration please refer to the following article:
KB #113950 Configure firewall
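The recommendation above to make sure that all required ports are open can be spot-checked with a short script. The following is a minimal Python sketch, not an Odin tool: it only tests whether a TCP connection can be established from the machine where it runs. The function names are illustrative, and the actual list of required ports must be taken from the firewall configuration guide.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def closed_ports(host, ports, timeout=3.0):
    """Return the subset of ports on host that could not be reached."""
    return [port for port in ports if not is_port_open(host, port, timeout)]
```

For example, closed_ports("vz-node.example.com", [22, 443]) would return the ports that could not be reached; the host name and port numbers here are placeholders only. A port reported as closed may be blocked by a firewall or may simply have no service listening on it, so the result should be read as a starting point for investigation.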
1.4 Configure management node full server backups and store them remotely
Full server backups stored separately from the management node allow you to restore platform functionality in case of a disaster or hardware failure. Odin provides a solution to back up all necessary configuration:
We recommend having 2 types of backups: a full backup of the OA management node server and a separate database dump. This principle brings flexibility in restore operations: both a single database record and the whole server can be restored in case of disaster recovery. Additionally, please check the information in the following guide: Business Automation backup mechanism.
For hosted services we also recommend maintaining a backup solution. There is a significant number of industry-standard backup solutions, so it is hard to recommend a particular one. The key principle is to have a full, regularly maintained backup of each slave node that hosts end-customers' data. The ability to restore data granularly is very useful when a small object (relative to the entire server) has to be restored, for example a separate web site, a mailbox and so on. Please refer to the following articles for details about backup configuration:
KB #113963 Configure system backup
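Whether backups are "regularly maintained" can be verified automatically by checking the age of the newest file in the backup location. The sketch below is an illustration only and makes several assumptions: backups are written as regular files into a single directory, file modification time reflects when the backup was taken, and the function names and the 24-hour default are hypothetical values to adapt to your own schedule.

```python
import os
import time


def newest_backup_age_hours(directory, now=None):
    """Return the age in hours of the newest regular file, or None if empty."""
    now = time.time() if now is None else now
    mtimes = [
        entry.stat().st_mtime
        for entry in os.scandir(directory)
        if entry.is_file()
    ]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def is_backup_stale(directory, max_age_hours=24, now=None):
    """Return True if there is no backup or the newest one is too old."""
    age = newest_backup_age_hours(directory, now)
    return age is None or age > max_age_hours
```

A check like this only confirms that a recent file exists. It does not prove that the backup can be restored, so periodic test restores are still needed.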
2. Services Maintenance
2.1 Linux web hosting
NG (next generation) web hosting is the recommended solution for Linux Web Hosting. Odin recommends running a clustered environment for web hosting. Please see the Deployment guide. A high-speed Internet connection with multiple access channels is strongly recommended for cluster members and the load balancer.
KB #114326 Linux Shared Hosting NG: General Information, Best Practices and Troubleshooting
KB #115790 Website Hosting: General Info, Best Practices and Troubleshooting
2.2 APS applications provisioning and maintenance
By providing a common platform for distributing applications, APS simplifies your relationship with ISVs and reduces the risk and effort of marketing and selling their applications. When an ISV packages its applications using a standard that is familiar and convenient for you to resell, you save costs and gain scale by simply integrating the new services using your common APS platform. The article below provides you with extensive information about using APS on your Odin Automation installation:
KB #115664 APS: General Info, Best Practices and Troubleshooting
2.3 Office 365
Keep the application up to date. Microsoft continuously improves its Office 365 service offerings: new offers are published, while older ones are discontinued. Odin provides the Office 365 APS package to enable providers to provision Office 365 subscriptions automatically. Keeping the application updated makes new offers available to customers and brings fixes for recently found issues. Besides keeping the application up-to-date, the most important recommendations are:
- Follow the service configuration guide. The most important part is the configuration of service templates and plans. Following Odin recommendations minimizes service downtime and negative impact on end-users.
- Use only APS screens to manage Office 365 users and licenses. This practice helps to avoid misconfiguration when resource numbers change.
- Avoid using the "localdb" engine as the datastore for the Office 365 endpoint server. Use a standard Microsoft SQL data file instead.
For more details please refer to the article:
KB #115127 Office 365 Maintenance Guide
3. Regular activities and monitoring
3.1 Monitoring the system
To simplify daily operations and minimize platform downtime, Odin recommends that providers set up a monitoring solution. We recommend automating monitoring of the following services:
- Provider control panel
- Disk space on the most important servers (OA management node, OA core database server (if the database is installed separately from the OA management node), BA application server, BA database server)
- Failed orders in Billing and failed tasks in Operations. It is possible to set up notifications about all failed orders by changing the order flow
- Odin Automation for Cloud Infrastructure availability
- Section "Monitoring the System" of the Provider's Guide gives an idea of what should be monitored and how.
- Exchange Services: you can check the statuses of Exchange mailboxes via Top » Services » E-mail » Mailboxes. Normally mailboxes should be in the Ready status. Mailboxes that stay in transitional statuses like Deleting for a long period of time, as well as those marked as Failed, require administrative attention. For such mailboxes, it makes sense to search for failed tasks by subscription ID. In some cases Subscription ID can be empty; the required tasks can then be found with a filter like ADSCTask_S00xxxxxxx in the field Queue name, where xxxxxxx is the subscription ID.
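Disk space monitoring from the list above is easy to automate. The following is a minimal Python sketch, assuming it runs locally on each server to be monitored; the function names and the 85% default threshold are hypothetical and should be adjusted to your own environment. It is not a replacement for a full monitoring solution.

```python
import shutil


def disk_usage_percent(path):
    """Return used space as a percentage of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(paths, threshold_percent=85.0):
    """Return {path: percent_used} for paths at or above the threshold."""
    result = {}
    for path in paths:
        percent = disk_usage_percent(path)
        if percent >= threshold_percent:
            result[path] = round(percent, 1)
    return result
```

The returned dictionary can be passed to whatever alerting channel is already in use (e-mail, monitoring agent and so on). An empty result means that no checked path has reached the threshold.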
Zabbix is one of the most popular monitoring solutions available. Zabbix monitoring module can be used to monitor all Odin services as well as the underlying infrastructure such as network, servers, operating systems and support services. Examples of monitored items and metrics:
Network Level: ICMP Ping, Bandwidth, Network Providers, BGP Peers
OS Level: CPU, Memory, Disks, Network Interfaces, System Services, Performance, Network, Security, Backups
Odin Application Level/APS: Odin Services, HTTP, HTTPS, Application Performance
If you are a Premier Support customer, you are eligible to set up a Zabbix monitoring system that will be maintained by Odin Technical Support. Please contact your TAM for further clarification.
The articles below will also help you to monitor the different services used on your installation:
KB #113987 How to check overall status of system services
KB #113988 Monitor DNS services to ensure websites' availability
KB #113991 Monitor Virtuozzo servers - legacy VPS module
KB #113992 How to check statuses of mail services
KB #113994 How to check status of Microsoft Hosted services
KB #113996 How to check status of wireless services (Blackberry/Good Mobile)
KB #113997 Monitor Microsoft Office 365 services
KB #113999 How to gather system statistics
3.2 Monitor task queue to ensure stable provisioning
Task Manager is a useful tool which allows tracking almost everything happening in your Odin Service Automation environment. Most of the Odin Automation daily operations (subscription provisioning, mailbox creation, website creation, resource usage collection and so on) are performed by background tasks that can be seen in Task Manager. You may see the following task statuses in Task Manager:
Unprocessed - the task awaits its turn to be executed (marked white).
Rescheduled - the task failed during its execution and is enqueued again (marked yellow). Periodic tasks always have the rescheduled status.
Running - the task is running at the moment (marked green).
Cancelled - the task was manually recalled from execution (marked black with a white cross).
Restarted - the task was interrupted (marked yellow). This task status was introduced in the 7.0 release.
Failed - it is impossible to execute the task for some reason (marked red with a white exclamation point).
Normally all tasks should stay Unprocessed, Rescheduled or Running. The Scheduled Tasks queue should be kept as short as possible. Failed tasks can be re-run or cancelled. Cancelled tasks can be executed again in the future.
Keeping an eye on the task execution flow helps you to react quickly to every incident.
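The rule that tasks should normally be Unprocessed, Rescheduled or Running can be turned into a simple triage step. The sketch below is an illustration under stated assumptions: the task list is supplied as (task ID, status) pairs obtained by some means not shown here (no Odin API is used or implied), and the function name is hypothetical. It counts tasks per status and lists those outside the normal set.

```python
from collections import Counter

# Statuses that the guide describes as normal for a healthy queue.
NORMAL_STATUSES = {"Unprocessed", "Rescheduled", "Running"}


def summarize_tasks(tasks):
    """Return (counts per status, IDs of tasks outside the normal statuses).

    tasks is an iterable of (task_id, status) pairs; its origin is an
    assumption of this sketch.
    """
    tasks = list(tasks)
    counts = Counter(status for _, status in tasks)
    attention = [
        task_id for task_id, status in tasks
        if status not in NORMAL_STATUSES
    ]
    return counts, attention
```

The list of task IDs needing attention is only a pointer for manual review. Each failed task still has to be investigated individually before it is re-run or cancelled.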
Reasons for failure can differ and should be investigated separately in every case. Odin strongly recommends not simply cancelling tasks in case of failure, because this can cause database inconsistencies in some cases. Task cancellation is an exceptional measure, and we recommend consulting an Odin Technical Support representative beforehand.
Generally, a failed task indicates that there is an issue in the system that prevents some operation from completing. The exact impact and the solution to apply strongly depend on the task type. Please refer to the following articles for more details:
KB #2246 How to deal with failed tasks
KB #113980 Monitor task queue to ensure that provisioning is stable
There are 2 main types of tasks:
Scheduled tasks - one-time tasks that stand in a queue for execution.
Periodic tasks - executed on a regular basis defined by the task Period setting (once per hour, per week, etc.); they generate a number of scheduled tasks to perform the necessary actions according to individual settings. Examples of such tasks: Cleanup login history, Get traffic usage info, Monitor VPS Migration state.
The schedule of periodic tasks can be edited via Top » Operations » Tasks » Periodic » » Edit in the OA Provider Control Panel.
The key recommendations are as follows:
- Review the task output. If the task output is self-explanatory (for example, it says that some host in your environment is not available), then fix the problem (e.g. reboot or turn on the problem host) and try to re-run the task.
- If the output is not clear, search for an available solution in the public Odin knowledge base, starting with KB #2246 How to deal with failed tasks.
- If no solution is found, report an incident to the Odin technical support team.
- Do not cancel periodic tasks without a serious need, as it may lead to unwanted results. For example, cancelling the task Get traffic usage info will prevent you from billing your customers for traffic usage correctly.
3.3 Monitor limits and capacities regularly
Timely monitoring of limits and capacities helps to prevent overload of hardware nodes and to provide quality services to your customers without interruptions. Here are some hints on what to monitor and how. Regularly make sure that the items listed in the article below are not approaching their limits (some of the features may or may not exist depending on your configuration):
KB #113986 Monitor limits and capacities regularly
3.4 Do FULL Vacuum every 3 months
Odin Automation is a highly loaded platform that performs hundreds of database queries per second. We recommend performing database 'vacuuming' every 3 months to keep performance at a high level. The procedure requires maintenance of the entire system: provisioning operations, new service ordering, the user interface, background task processing and resource usage collection are stopped during the procedure. All these operations resume as soon as Odin Automation is set back online. Afterwards, Odin recommends checking that there are no failed orders or tasks. If any are found, the first solution is simply to resubmit or re-run them.
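Because the vacuum procedure needs a maintenance window, it helps to know in advance when the next one is due. The sketch below shows only the date arithmetic for the 3-month interval; it does not run the vacuum itself. The function names are hypothetical, and the date of the last vacuum is assumed to be recorded by the operator.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return d shifted by a number of months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp to the last day of the target month (e.g. 30 Nov -> 29 Feb).
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_vacuum_due(last_vacuum, interval_months=3):
    """Return the date on which the next full vacuum is due."""
    return add_months(last_vacuum, interval_months)


def is_vacuum_overdue(last_vacuum, today, interval_months=3):
    """Return True if the due date has been reached or passed."""
    return today >= next_vacuum_due(last_vacuum, interval_months)
```

A check like is_vacuum_overdue can be added to an existing monitoring job so that the maintenance window is planned and announced to customers ahead of time.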
4. Keep system up-to-date
4.1 OA system server OS
Security patches, bug fixes, new functionality and so on are delivered by OS vendors as part of updates. Odin highly recommends keeping operating systems up to date:
- Microsoft Windows Updates supported by Odin Service Automation
- Supported APS Application Packages Versions in Odin Service Automation
- Full list of Supported Third-Party Products in Odin Service Automation
- Supported Microsoft Exchange Server Versions
- How to install security updates (Apache, BIND, and OpenSSL) on OA-managed servers
4.2 OA minor updates
Like every software vendor, Odin delivers new software features and bug fixes through regular software updates. The most critical issues are covered by hotfixes. The provider can install all necessary Odin updates on their own as described in the article Operations Automation (formerly POA) updates installer, or request assistance from Odin Technical Support.
NOTE: If you have any customizations or custom/specific configuration of your Odin Automation installation, please make sure to provide the list of all customizations/configurations to Odin Technical Support before installing updates. If you install all updates on your own, it is critical to test everything in a Lab environment first and to have a procedure in place that allows you to recover all customizations after the update.
4.3 OA major upgrades - general recommendations
The Odin Automation major upgrade and its stages are described below:
- Go through the Odin Automation Upgrade Procedure Overview to get an understanding of the upgrade stages.
- Perform the Gap analysis procedure based on the Odin Automation Premium Release Notes and known issues - please check with the Pooled Technical Associate (email@example.com) or your named Technical Account Manager.
How you can help Odin to perform the Odin Automation upgrade smoothly:
a. When requested, please provide the Odin upgrade team with access to all required nodes of your OA environment in advance, so that the team can upload the OA distro files and run the precheck scripts.
b. It is very important, at the earliest preparation stage, to provide the Odin Support representative who is working with you on the upgrade with the full list of customizations or specific configurations made on your Odin Automation platform.
c. Fix all issues detected by Odin precheck script(s) before the upgrade. Please note that Odin will not start the upgrade until all issues revealed by the precheck script(s) are fixed.
d. Make sure you have a Lab environment with a configuration very close to that of production in terms of configured service templates, service plans and customizations (ideally the same hardware and every OS component/update).
NOTE: If you do not have a Lab environment to test the upgrade first, Odin cannot be held accountable for estimating the maintenance window required for the platform upgrade.
e. If you are a Premier Support customer, there is a special option for your upgrade testing – please contact your Technical Account Manager to learn more.
f. Make sure you have defined testing plans for the OA environment that correlate with your business needs.
g. Make sure that the testing is started right after Lab or Production upgrade.
h. Please cooperate closely with Odin upgrade coordinator (Pooled Technical Associate or Technical Account Manager) to comply with planned upgrade’s timeline and to avoid any interruption of the normal business operations right after the upgrade.
i. All issues found during Lab environment upgrade, which are considered as being risky for the Production environment as well, must be addressed right after Lab upgrade and before production upgrade.
4.4 Validate customizations after every update
If your environment has customizations, perform detailed testing on Lab after every change to the environment. For example, if an order flow customization is in place, the recommendation is to place appropriate orders (for the customized chains) to ensure that the system operates as expected. Other popular customizations are:
- Online Store
- Customized Skins
- Additional tabs and buttons
- CSD modules
Before upgrading any module, service or making any change in platform or backend - in case you are not sure what may happen - verify with support representative or online resource (Documentation/KB). Make it a rule. All changes should be tested on Lab environment first.
Lab/Staging/Dev installation provides ability to test new service offerings, APS applications, all changes on the platform and verify consistency of the system after installation of each update or hotfix.
Tested and verified scenarios allow you to launch new products with minimized risk of technical issues and to perform full regression testing. A Lab installation also allows you to implement change management as part of the ITIL framework, including for major upgrades and updates.
It is highly recommended to keep the lab environment equal to production in terms of hardware, operating system versions, installed software packages, customizations, service offerings, reseller configurations, brand settings, roles and privileges and so on. The closer the lab is to the production environment, the more valid the lab test results are for production.
If you do not have a lab environment, but plan to deploy it, please contact your Account Manager, TAM or PTA team (firstname.lastname@example.org) to clarify further action plan regarding the deployment.
4.5 Virtuozzo hardware nodes
Virtuozzo software serves as a basis for Odin services - containers and virtual machines based on Virtuozzo can be used for almost all services in scope of Odin Service Automation platform. It is important to keep Virtuozzo up-to-date and to install fresh updates in a timely fashion. Please refer to Virtuozzo website to track new updates. How to keep Virtuozzo up-to-date? Please refer to the Virtuozzo user guide for detailed info: Keeping Your Parallels Virtuozzo Containers System Up-to-Date. Additional information can be found in Virtuozzo knowledge base.
If you face any Virtuozzo-related issue, please submit a ticket to Virtuozzo support. If you think that the issue lies in the interaction between Odin and Virtuozzo products, feel free to submit a ticket to Odin Technical Support for clarification.
For details about Virtuozzo please refer to the article:
KB #114000 Keep Virtuozzo hardware nodes up-to-date.
4.6 APS packages
Obtain the latest versions of APS packages from official APS Standard site. Install the latest versions available by following the procedure provided in Managing Application Versions guide. Note that some applications require additional steps that are described in app-specific guides (usually they are included in APS package contents). For more details please read this article:
KB #114008 Keep APS Applications up-to-date
5. Requesting Support
Odin Service Automation customers' contracts include unlimited 24x7 e-mail and phone support. General information about how to get support is provided in the article KB #122973 How to get support for Odin products?
For details about e-mail support please visit Odin Support Home page.
To get assistance by phone please use Odin Phone Support Hotline.
To prioritize an urgent ticket, please check the Escalation path for Odin Automation Premium.
Information about support ticket severity can be found in the article KB #125810 Ticket Severity.