
N-Tier Guide System Definition

From Cloud Computing Wiki - Kaavo



Important Note on Using this Guide

Most users will not need to refer to this guide to use the IMOD application. Kaavo IMOD provides UI wizards for configuring the deployment and runtime management information for an entire deployment; the XML System Definition file is generated behind the scenes automatically and is used internally by the IMOD engine. Please use the online UI wizard to configure your deployment. This guide is intended for advanced users who want to tweak the file manually for advanced features, or who are building their own UI for the IMOD engine using Kaavo Web Services and need to manipulate the System Definition file directly. Users looking for a deeper understanding of the concepts, and of how Kaavo IMOD can fully automate the deployment and runtime management of any custom application or workload, will also find this guide useful.

Kaavo’s App-centric approach

Kaavo has developed the world's first app-centric approach to managing infrastructure and middleware on the cloud. A complete n-tier system is defined in an XML file, called the System Definition file. This definition is then deployed into Kaavo's IMOD application, which orchestrates the execution of the overall system. The System Definition file has two components: a deployment-time section and a run-time section. The deployment section contains all the information needed to provision resources, deploy and configure the middleware, and deploy and configure the application components to bring the application online. The run-time section contains the workflows to be executed in response to defined events, so that application service levels are met at run-time automatically, without human intervention. See the figure below for the structure of the System Definition file. Watch this 30-minute video for a tutorial on how to create a Kaavo System Definition from scratch.

Figure 1: Structure of the System Definition file used by IMOD

The end result is a single-click approach to starting a complete, complex system consisting of multiple tiers (e.g. a database tier, an application tier, a web tier), including the firewall rules between the tiers. Once the system is deployed, you have auto-pilot functionality for streamlining and automating production support for your application. For details on the structure of the System Definition file, please also refer to the latest XSD for the System Definition file.

Designing an N-tier System Definition

You define your n-tier system via an n-tier System Definition file. The task of designing the System Definition file can broadly be divided into three steps:

   1. Define your system artifacts - tiers, server types and servers
   2. Define the tasks to be done to configure the servers - commands and actions
   3. Define the events and their corresponding handlers to perform custom workflows in response to events

Finally, you need to wire the actions into the proper points in the activity graph of vital events (e.g. startup and shutdown) and custom events (e.g. scale-up, scale-down and recovery) in the lifecycle of the n-tier system.

Structural Aspects

First, let us discuss the structural aspects of the n-tier System Definition file, related to the system artifacts and the vital-events.

N-Tier System

An n-tier system is composed of one or more tiers. Each tier is composed of groups of servers with a specific role, called servertypes. Each servertype has servers, which may be provided by multiple cloud providers. Currently we support the Amazon EC2 provider; we plan to include support for other providers later as well.

This structure maps naturally to the notion of an n-tier system, where there are multiple tiers and each tier has multiple servers performing specific responsibilities.

Tier

A tier has one or more "serverTypes" (described below). A tier also has a unique name, which is used to refer to it in the n-tier System Definition file. A tier has the notions of "displayIndex" and "order". "displayIndex" is used to render the graphical view of the n-tier system in the graphical user interface, showing the tiers in a particular stacking order. "order" determines the order in which the tier will be instantiated and set up by Kaavo's n-tier engine during execution. This helps decide which tier comes up first, so that a dependent tier can assume the existence of another tier when its own turn comes. With the addition of the facility to define custom events, we have changed the way scaling and recovery of the system are handled: scaling and recovery can now be defined per server-type by defining events and handling them with corresponding handlers (more on this later). The scalable attribute on the tier element is now deprecated.

ServerTypes and Servers

We group servers with a particular responsibility into a servertype; a servertype has a role attribute which is unique within the scope of a tier. Servers are sub-classed by the provider of the server, e.g. awsServer for Amazon EC2.

The system, tier, and servers have a lifetime and they have some vital events in the course of their lifetime. These vital events are startup, shutdown, scale-up and scale-down.

Services

The services section contains the supported services for the deployment, e.g. load-balancer services from Amazon or Rackspace, and DNS services. In this section we can add support for any service required for your application or deployment. Please let us know if you want us to support a specific service.
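As a rough illustration only, a services section might look like the sketch below. The source does not show a concrete services example, so the element and attribute names here are hypothetical and should be checked against the current XSD.

```xml
<!-- Hypothetical sketch of a services section; element and attribute
     names are illustrative, not taken from the actual IMOD schema. -->
<services>
  <loadBalancer provider="aws" name="web-lb">
    <!-- Target expression syntax as documented in the Actions section -->
    <target>[tier=web_tier][serverrole=default]</target>
  </loadBalancer>
</services>
```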

Runtime Aspects – Kaavo’s N-Tier Orchestration Engine

Kaavo’s n-tier orchestration engine, which is part of the IMOD application, takes the n-tier System Definition file, and orchestrates it according to all the instructions in it.

The engine responds to events and executes a workflow to handle those events. These events can come, for example, from user interaction on the n-tier dashboard to start or stop a system. The events can also come from a monitoring system to trigger scale-up/scale-down workflows.

For example, when the engine encounters the startup event on a server defined in the input n-tier System Definition file, it invokes the startup sequence on the described server and guides it through a set of predefined states until the server is started, is accessible, and is optionally configured by executing post-startup commands. A post-startup command can be any arbitrary shell script to be run on the server just started, or a cloud-provider-specific command (e.g. in the case of AWS: associating an Elastic IP, attaching an EBS volume, from a snapshot or otherwise, or downloading a set of files from S3 to that instance).

A custom event can now be invoked manually from the UI by pressing the events button. If the handler for the event is a scaling activity or a recovery activity, the corresponding actions are performed by the n-tier engine and the system behaves accordingly (more on this later).

Actions

Actions are units of activity. Each action has a unique name, which is used to refer to it from points in the activity graph, and is performed on each server in a target set. To refer to a set of servers, we use an expression such as [tier=aTierName][serverrole=aServerTypeRole], which returns the set of servers in the named tier with the server type of the named role.

Under the new implementation of the n-tier engine, where tiers and server-types can be scaled by a user-defined number of servers, we have extended the expression syntax: the target servers used for scaling can be referred to with [tier=aTierName][serverrole=aServerTypeRole][scalingserver]. We have also introduced a recovery event in the current implementation, for the case where a server has died and needs to be replaced by a new server. To refer to the server being recovered, use the expression [tier=aTierName][serverrole=aServerTypeRole][recoveringserver].
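A minimal sketch of how these three target expressions might appear on action definitions (the tier name, role, and action names are illustrative; the action bodies are elided):

```xml
<!-- All servers with role "default" in tier "app_tier" -->
<action name="configure-all" execute="true"
        target="[tier=app_tier][serverrole=default]"> ... </action>

<!-- Only the servers being added by a scale-up event -->
<action name="configure-new" execute="true"
        target="[tier=app_tier][serverrole=default][scalingserver]"> ... </action>

<!-- Only the replacement server in a recovery event -->
<action name="configure-replacement" execute="true"
        target="[tier=app_tier][serverrole=default][recoveringserver]"> ... </action>
```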

If you want to delete the generated file on the server after executing an action, set the delete-on-exit flag on the action. This is helpful when the file contains sensitive information, such as a username or password, which shouldn't be left on the server. See the example below.

<action name="start-apache" execute="true" delete-on-exit="true" target="[tier=web_tier][serverrole=default]">
       <scriptName>apache.sh</scriptName>
       <scriptPath>/root</scriptPath>
       <scriptTemplate type="inline">
       <![CDATA[
               #!/bin/sh
               /etc/init.d/httpd start
               .............
               .............
       ]]>
       </scriptTemplate>
</action>

Execution of actions is centrally logged for audit and debugging. For more info please refer to Debugging Custom Actions on Servers

By default, IMOD actions are executed without a pseudo-terminal. Some images may be configured to require a (pseudo-)terminal for using sudo. If you set the use-pty flag on an action, the action is executed with a pseudo-terminal.

<action name="start-apache" execute="true" use-pty="true" target="[tier=web_tier][serverrole=default]">
       <scriptName>apache.sh</scriptName>
       <scriptPath>/root</scriptPath>
       <scriptTemplate type="inline">
       <![CDATA[
               #!/bin/sh
               sudo mkdir /test
               .............
               .............
       ]]>
       </scriptTemplate>
</action>

Figure 2: Action execution log appearing in the central log.

Users don't have to write actions manually; there is a GUI for creating actions, as shown in the figure below:

Figure 3: Defining actions using the GUI.

Different types of actions

Actions are like re-usable procedures. They are used for generating executable scripts and configuration files from Velocity templates, copying them to target servers, and optionally executing them (in the case of executable scripts). There are two types of actions:

Executable

Executable actions generate an executable script and copy the script to a specified location on the specified set of servers. Given below is an example of an executable action.

<action name="start-apache" execute="true" target="[tier=web_tier][serverrole=default]">
       <scriptName>apache.sh</scriptName>
       <scriptPath>/root</scriptPath>
       <scriptTemplate type="inline">
       <![CDATA[
               /etc/init.d/httpd start
               ]]>
       </scriptTemplate>

</action>

The 'execute' attribute specifies whether the action is executable. In this example, execute="true" specifies that the action is executable. The target attribute specifies the set of servers on which the script will be executed; the expression [tier=web_tier][serverrole=default] returns the set of servers in the named tier (web_tier) with the named server type (default). The scriptPath element specifies the directory the generated script will be copied to; scriptName specifies the file name the script will be saved as. The scriptTemplate element contains the body of the action. The body of an action is a Velocity template, although in this trivial example it is static text. Actions can be invoked inside event handlers using the following syntax:

<command type="action" name="action-name" />

When this action, or any other executable action, is executed, the following three steps are performed for each target server.

  1. The Velocity template is rendered using the Velocity template engine. In this example the template is static text, so the generated content is the same as the template (i.e. /etc/init.d/httpd start).
  2. A file with the above content is created at the path /root/apache.sh on the target server.
  3. The script /root/apache.sh is executed on the target server.

Given below is another example with a non-trivial velocity template:

<action name="grant-mysql-phpcolab" description="Grant mysql client node" execute="true" target="[tier=db_tier][serverrole=ndbd]">      
      <scriptName>GrantPhpDbAccess.sh</scriptName>  
      <scriptPath>/root</scriptPath>  
      <scriptTemplate type="inline">
        
        <![CDATA[
#!/bin/sh
#foreach ($clientNode in $SqlClientNodes)
mysql -uroot -p${mysqladminpassword} -e "GRANT ALL PRIVILEGES ON ${appdb}.* TO '${appdbuser}'@'${clientNode.publicIP}' 
IDENTIFIED BY  '${appdbpassword}'"
#end	  
        ]]> 

      </scriptTemplate>  
      <parameters> 
        <parameter name="mysqladminpassword" type="literal" value="password"/>  
        <parameter name="appdb" type="literal" value="php_collab"/>  
        <parameter name="appdbuser" type="literal" value="phpcollab"/>  
        <parameter name="appdbpassword" type="literal" value="admin"/>  
        <parameter name="SqlClientNodes" type="serverref" value="[tier=app_tier][serverrole=default]"/> 
      </parameters>      
</action>

The action body now contains dynamic content. This action accepts five parameters, named mysqladminpassword, appdb, etc., with default values specified using the value attribute. The first four parameters (type="literal") are initialized to the strings given by the value attribute. The last one (SqlClientNodes) is initialized to the collection of server objects returned by the expression [tier=app_tier][serverrole=default]. When this action is executed, in step 1 above, references to these parameters are resolved while rendering the template.

Each server object has a number of properties (e.g. publicIP) that can be accessed in the template. The foreach loop iterates over the $SqlClientNodes collection with $clientNode as the loop variable. In each iteration of the loop, ${clientNode.publicIP} fetches the public IP of the server object. This action generates the following script:

#!/bin/sh
mysql -uroot -ppassword -e "GRANT ALL PRIVILEGES ON php_collab.* TO 'phpcollab'@'publicIP1' IDENTIFIED BY 'admin'"
mysql -uroot -ppassword -e "GRANT ALL PRIVILEGES ON php_collab.* TO 'phpcollab'@'publicIP2' IDENTIFIED BY 'admin'"
...
...
mysql -uroot -ppassword -e "GRANT ALL PRIVILEGES ON php_collab.* TO 'phpcollab'@'publicIPn' IDENTIFIED BY 'admin'"

where publicIP1, publicIP2, …, publicIPn represent the public IPs of all the servers in the SqlClientNodes collection, i.e. the public IPs of all the servers in the tier app_tier with server type default.

The parameters defined above are user-defined parameters. In addition to these, three implicit parameters can be referred to in any template. For example, the parameter CurrentTarget is accessible to all templates by default and represents the individual server object on which the action is executing.
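For instance, a template can embed the public IP of the server it is rendered for via CurrentTarget. A sketch (the action and file names are illustrative; the publicIP property is documented above):

```xml
<action name="write-own-ip" execute="true" target="[tier=web_tier][serverrole=default]">
      <scriptName>record-ip.sh</scriptName>
      <scriptPath>/root</scriptPath>
      <scriptTemplate type="inline">
      <![CDATA[
#!/bin/sh
# ${CurrentTarget.publicIP} is resolved per server while rendering the template
echo "${CurrentTarget.publicIP}" > /etc/my-public-ip
      ]]>
      </scriptTemplate>
</action>
```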

Non-Executable

Non-executable actions are similar to executable actions. The only difference is that they generate non-executable files in steps 1 and 2; step 3 is skipped for this type of action. For these actions the "execute" attribute is set to "false".
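A sketch of a non-executable action that generates a configuration file and copies it to the target servers without executing it (the action name, file contents, and parameter values are illustrative):

```xml
<action name="create-app-conf" execute="false" target="[tier=app_tier][serverrole=default]">
      <scriptName>app.conf</scriptName>
      <scriptPath>/etc/myapp</scriptPath>
      <scriptTemplate type="inline">
      <![CDATA[
# Generated configuration; the file is copied to the server but never executed
db_host=${dbHost}
      ]]>
      </scriptTemplate>
      <parameters>
        <parameter name="dbHost" type="literal" value="db.example.com"/>
      </parameters>
</action>
```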

Executing Actions on Specific Server

The syntax for executing actions on a specific server is [tier=app][serverrole=jira][index=2]. For example, if there are 5 jira servers in the app tier, specifying index=2 will execute the action on the second one. To execute an action on all servers in the group, omit the index: [tier=app][serverrole=jira] will execute the action on all 5 jira servers in the app tier.
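For example, an action that targets only the second jira server might look like the following sketch (action and script names are illustrative):

```xml
<action name="restart-second-jira" execute="true" target="[tier=app][serverrole=jira][index=2]">
      <scriptName>restart-jira.sh</scriptName>
      <scriptPath>/root</scriptPath>
      <scriptTemplate type="inline">
      <![CDATA[
#!/bin/sh
# Runs only on the server at index 2 of the jira group
service jira restart
      ]]>
      </scriptTemplate>
</action>
```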

Parameterized Templates

An action is, at its core, a Velocity template (http://velocity.apache.org/engine/releases/velocity-1.5/user-guide.html) that has access to some runtime system artifacts (the servers mentioned in the action target) and some user-defined parameters, and generates a script or configuration file that is transferred to each target server and optionally executed. This way, we can easily create a configuration file for a newly launched server and upload it to the server. We can likewise create a custom shell script, upload it to the newly launched server, and execute it there.

We provide best-practice templates for some common scripts (e.g. MySQL cluster configuration scripts, JBoss cluster scripts). However, the user can also write inline templates for the script/configuration in the action definition. Templates receive the parameters defined in the action definition as part of their context. A parameter can be of type "literal", meaning its string value is used as-is in the Velocity context, or of type "serverref", which is evaluated against the System Definition to yield a set of servers. For each of these servers we can access their properties, such as PrivateDNS and PublicDNS.

In the current implementation, actions can be viewed as global functions in a standard programming language: they are defined with parameters and invoked at runtime. In our case, the action prototype is defined in the actions section, and when it is invoked by a command (see the Commands section below), the target of the action and its parameters can be redefined. In addition to the user-defined parameters, certain parameters are passed implicitly to all templates, namely CurrentTarget (the node on which the script is being created), AllTargets (the set of all nodes that are targets of the action) and OtherTargets (the set of all nodes except CurrentTarget).

List of Server Attributes Available to Velocity Templates

Please also refer to the common server attribute names. The following is the list of server attributes available for each integrated cloud provider for dynamically generating scripts, code, or configuration files in the context of the dynamic cloud infrastructure for the application or workload.


Common Server Attribute Names

To allow reuse of actions, we have implemented common names for server attributes which are supported by all integrated cloud providers but in some cases have slightly different provider-specific names. If you want to reuse actions using dynamic server attributes across providers, we recommend using the common attribute names.

| Common Name  | Amazon EC2    | Rackspace Cloud | IBM Cloud    | Eucalyptus Cloud | Terremark vCloud Express | Cloud.com         | OpenStack | VCD        |
|--------------|---------------|-----------------|--------------|------------------|--------------------------|-------------------|-----------|------------|
| publicIP     | publicIP      | publicIP        | publicIP     | publicIP         | publicIP                 | publicIP          | publicIP  | publicIP   |
| privateIP    | privateIP     | privateIP       | N/A          | privateIP        | privateIP                | privateIP         | privateIP | privateIP  |
| instanceId   | awsInstanceId | serverId        | instanceId   | eucaInstanceId   | serverId                 | serverId          | serverId  | vmId       |
| imageId      | amiImage      | imageId         | imageId      | emiImage         | templateId               | templateId        | imageId   | templateId |
| instanceType | instanceType  | flavorId        | instanceType | instanceType     | vpu, memory              | serviceOfferingId | flavorId  | N/A*       |

(*vappTemplate covers instance type)
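Using the common names, a single template can run unchanged across providers. A sketch (the action name is illustrative; it assumes the common attribute names, such as instanceId, are exposed as server-object properties in templates the same way publicIP is):

```xml
<action name="report-identity" execute="true" target="[tier=web_tier][serverrole=default]">
      <scriptName>report-identity.sh</scriptName>
      <scriptPath>/root</scriptPath>
      <scriptTemplate type="inline">
      <![CDATA[
#!/bin/sh
# Common attribute names resolve to the provider-specific values at render time
echo "instance ${CurrentTarget.instanceId} at ${CurrentTarget.publicIP}"
      ]]>
      </scriptTemplate>
</action>
```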

Commands

Commands provide the context for executing actions. Using a programming analogy, actions are methods/procedures, whereas commands are specific calls with arguments. Actions are invoked during the execution of a "command", which you will find in the post-startup definitions of the artifact definitions or in event-handler sections. The post-startup section is executed after all the servers have been launched: once the generic startup activities of the respective artifact are complete, these commands are taken up for execution and performed on each of the target servers. Commands of type action have a global context, i.e. they can be executed from any point in the post-startup of any artifact, even if the target of the action is a different set of artifacts. For example, we have an action to create MySQL databases on the database tier; the targets of this action are the servers in the database tier, yet the action can be invoked from the post-startup block of the web tier. The post-scale-up and pre-scale-down sections are now deprecated, as scale-up and scale-down are now definable in the custom events section and have a generic command-action mapping.

Commands are like function invocations: we can override the target of the action being invoked and the parameters used to create its script/configuration. The action thus redefined will act on the new set of target servers with the changed script/configuration rather than the one defined in the global actions pool.
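The source does not show the exact override syntax, so the following is only a hypothetical sketch of a command that re-invokes an action with a redefined target and parameter; the nested elements inside <command> are assumptions and should be checked against the current XSD:

```xml
<!-- Hypothetical: re-invoke grant-mysql-phpcolab against a different tier
     with an overridden parameter value. -->
<command type="action" name="grant-mysql-phpcolab" target="[tier=reporting_tier][serverrole=default]">
      <parameter name="appdb" type="literal" value="reporting_db"/>
</command>
```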

Some commands are relevant only in the context of a server (specifically an awsServer); these commands associate an Elastic IP address with the enclosing server definition, attach an EBS volume to the server (created from an EBS snapshot or predefined), or set up the server for S3 operations. The post-startup section in <awsServer> can have commands of type "script", "ec2", or "s3" only.

Commands are executed in the order in which they appear in the post-startup or event-handler sections.

Events

Since the release of IMOD 1.4.5 we have introduced the ability to define custom events and describe their corresponding handling mechanisms. Events are elements in the System Definition that define a specific message to the n-tier engine about a state of the user's system, semantically relevant only to that system, and tell the engine how to react to that message. The message can be sent to the engine manually through the IMOD application-centric dashboard UI, or it may be triggered by an external monitoring system watching the servers of the user's n-tier system. In either case, the message sent to the n-tier engine carries the event name, the relevant system, the identity of the IMOD customer who owns the system, and any other information that may be required. These message elements define the context of the event, which the n-tier engine reacts to.

The behavior of the engine on receipt of such a message is defined by the handler of the event. The handler can be of four types: scaleup, scaledown, recovery or simple. Thus the event maps to the course of activities required by the user for their individual system. The scaleup, scaledown and recovery handlers are guided by corresponding specialized workflows. All handlers have a sequence of commands (invocations of actions defined in the global pool of actions) which are part of the corresponding activities. For the scaleup handler (where new servers are added to the servertype defined in the scope of the event) and the recovery handler (where dead servers are replaced by similar servers in the servertype defined in the scope of the event), the commands are executed as post-actions of the workflow. For the scale-down handler (where servers are terminated in the pool of the servertype defined in the scope of the event), the commands are executed as pre-actions of the workflow. For the simple handler there is no workflow and the commands are simply executed sequentially.

In release 1.5 we introduced an enhanced format for configuring custom events. Using the new format we can configure scale-up, scale-down, recovery and custom events; all of the old event formats (scale-up, scale-down, recovery, and simple events) are deprecated.

Subversion Integration

Users can commit actions and templates to Subversion. To use this feature, first configure the Subversion credentials and URLs for storing actions and templates, as shown in the figure below. Each account can have one URL for actions and one URL for templates.

SVN Config

When actions are committed to SVN, a special tag called svn is added to the action. The tag contains the revision number of the action. In the example below the action has revision 1202.

<action name="install-java" description="install jdk" execute="true" target="[tier=system1_tier]">
     <scriptName>install-java.sh</scriptName>  
     <scriptPath>/tmp</scriptPath>  
     <scriptTemplate type="inline"><![CDATA[JAVA_BIN=`which java`; if [ -z "$JAVA_BIN" ]; then echo "Java binary not present, safe to continue"; else rm -fv $JAVA_BIN;fi
JAVAC_BIN=`which javac`; if [ -z "$JAVAC_BIN" ]; then echo "Javac binary not present, safe to continue"; else rm -fv $JAVAC_BIN;fi
mkdir -p /usr/local/java
cd /usr/local/java
wget $download_file_url
chmod a+x $file_name
yes | ./$file_name
JAVA_DIR=`ls -l |grep ^d | awk '{print $NF}'`
echo "export JAVA_HOME=/usr/local/java/$JAVA_DIR" > /etc/profile.d/jdk.sh
echo 'export PATH=$PATH:$JAVA_HOME/bin' >> /etc/profile.d/jdk.sh
rm -f $file_name]]></scriptTemplate>  
     <parameters>
       <parameter name="download_file_url" type="literal" value=""/>  
       <parameter name="file_name" type="literal" value="jdk-6u34-linux-x64.bin"/>
     </parameters>  
     <svn revision="1202"/>
   </action>

We added a new button called SVN Update, which is used for updating actions from SVN.

SVN Update

To commit actions, go to the actions tab and click the icon below the SVN Commit heading. This brings up a dialog where you must specify a comment. The figures below show how actions are committed to Subversion.

Commit Action
Commit Action

To commit templates, go to the definition tab and click the publish to catalog button. In the dialog that comes up, check the svn commit checkbox and specify a comment to commit the template to Subversion in addition to publishing it to the catalog. If you leave the checkbox unchecked, the template will be published to the catalog but not committed to Subversion.


Commit Template

Deployment Cost Reporting

One use case we have been hearing about from customers is the ability to track the total infrastructure costs of applications running on the cloud. To support this use case, in addition to metering cloud usage, we allow users to track the total cost of each of their deployments in US dollars. For cloud providers that expose pricing information via an API, we use that information to calculate the total infrastructure cost. In addition, users can manually enter cost information, in US dollars, for each server role in the System Definition XML. For example, the rate per hour is defined as 0.1 dollars in the following server definition.

<serverType role="OpenStack-Pricing-Test_role" min="1" max="1" ratePerHour="0.1">

Note: Pricing information manually entered by users in the System Definition takes precedence over pricing information provided by the cloud provider via API in cost computations. Users can enter server rate-per-hour information for all supported clouds in Kaavo IMOD. This feature is also useful for software companies looking to distribute their software on various cloud providers who need metering and billing functionality to charge customers for their software in addition to the infrastructure charges from cloud providers.

Tracking of Bandwidth and Storage Costs

In addition to tracking server costs, we also allow users to track bandwidth and storage costs for the application. To get an approximate picture of total costs, users can enter bandwidth and storage costs as a percentage of the total server/compute cost. The two tags can be added to any System Definition just before the closing ntier tag; see the example below.

   <bandwidthCostPercent>5</bandwidthCostPercent> 
   <storageCostPercent>15</storageCostPercent>
 </ntier> 

A bandwidth cost percent of 5 in this example means that the total bandwidth costs are expected to be 5 percent of the total server costs. A storage cost percentage of 15 means that the total storage costs are expected to be 15 percent of the total server costs for the application. In this example, if the total server costs for the application are 100 dollars, the total cost will be 120 dollars = 100 + 5 (5% of server/compute cost) + 15 (15% of server/compute cost).

Users can enter a bandwidth cost percent and a storage cost percent for all supported clouds in Kaavo IMOD. For applications with very large storage or bandwidth costs compared to the compute cost, the value of the bandwidth or storage cost percent can exceed 100; e.g. entering 600 for the bandwidth cost implies that the bandwidth costs for the deployment are expected to be six times the total compute/server cost.


Tracking Costs for Individual Departments or Business Units

Enterprise customers, especially IT departments offering shared services, want the ability to track cloud resource usage not only by application but also by department. Kaavo IMOD allows this by associating a department id with each deployment. A CSV report can be downloaded to get the details of cloud resource usage by application/deployment and by department.

Add Department
Associate Department
Select the Department to Associate
Generate and Download Report
Report Format

Example Cases

Now we go through the possible cases while defining the system.

Defining the ntier section

   <ntier>
       <workflow>
               <startup timeout="1500"/>
               <shutdown timeout="900"/>
       </workflow>

The ntier element is primarily a container for tiers. It has a generic startup and shutdown sequence that is performed by the n-tier engine on the corresponding message from the user; its main activity is to start or stop the contained tiers.

The generic startup and shutdown activity sequences have a timeout to prevent the corresponding activities from running too long; you can customize it with the timeout attribute. Each of the artifacts (ntier, tier and server) has its own timeouts, which you may want to set according to your own needs. The default timeout is 1800 seconds.

Startup Tag

The startup sequence has a provision for defining user-defined actions to be invoked after the generic startup activities are done. For this you add a post-startup element under the startup definition. The post-startup block can contain an ordered sequence of commands referring to actions defined in the actions section (described later).

       <startup timeout="1500">
           <post-startup>
               <command type="action" name="notify-me"/>
           </post-startup>
       </startup>

These user-defined actions will be executed after the n-tier generic startup sequence is complete.

Defining the tier section

Example:

<tier displayindex="1">
 <name>web_tier</name>
  <workflow>
     <startup timeout="300" order="3">
        <post-startup>
           <command type="action" name="create-apache-jira-conf"/>
           <command type="action" name="start-apache"/>
        </post-startup>
     </startup>
     <shutdown/>
   </workflow>

You can put several tiers inside the ntier definition. Each tier has a displayIndex attribute which determines the order in which the tiers are displayed in the UI. The tier has a name which must be unique among all tiers in this descriptor; this helps us locate the servers contained in the tier later on, in actions. A tier, like the n-tier, has a generic startup and shutdown sequence with customizable timeouts. A tier can also specify the order in which its startup runs relative to the other tiers in the descriptor: based on the value of the order attribute of the startup sequence, the n-tier startup sequence starts the tier. The main responsibility of the tier startup and shutdown sequences is to start up or shut down the servers contained in the tier. The tier configuration also has the capability to execute user-defined actions in the post-startup block of the startup definition. This is similar to the post-startup of the n-tier, but it is executed after the generic activities of the tier startup are done.

Defining the server types and servers

<tier displayindex="2">
  <name>app_tier</name>
   <workflow>
   …
   </workflow>
   <serverTypes>
      <serverType role="default" min="2" max="4">
         <awsServer>
               <name>app node</name>
               <awsaccount>147404622121</awsaccount>
               <securityGroup>Kaavo-Demo</securityGroup>
               <keypair>demo.kaavo.us</keypair>
               <machineIdentifier>ami-87c522ee</machineIdentifier>
               <instanceType>m1.small</instanceType>
               <parameters></parameters>
               <startupCount>2</startupCount>
               <workflow>
                     <startup timeout="800">
                     <post-startup>
                     <command type="script" name="myscript.sh">
                     <![CDATA[
                          #!/bin/sh
                          rm -f /var/www/html/php-colab/includes/settings.php
                          cd /var/www/html/php-colab/includes/
                          wget http://php-collab.s3.amazonaws.com/settings.php
                          chmod 755 /var/www/html/php-colab/includes/settings.php
                      ]]>
                      </command>
                     </post-startup>
                     </startup>
                   <shutdown />
                </workflow>
             </awsServer>
      </serverType>
      </serverTypes>
      …
</tier>
<tier displayindex="3">
 <name>db_tier</name>
   <workflow>
      ...
   </workflow>
   <serverTypes>
      <serverType role="manager" min="1" max="1">
         <awsServer>
               <name>mysql-manager-node</name>
               <awsaccount>YOUR AWS ACCOUNT ID</awsaccount>
               <securityGroup>YOUR AWS SECURITY GROUP</securityGroup>
               <keypair>YOUR AWS SERVER LAUNCH PVT KEYNAME</keypair>
               <machineIdentifier>ami-005eba69</machineIdentifier>
               <region>us-east-1</region>
               <availabilityZone>us-east-1a</availabilityZone>
                <!-- Note: when picking the EU region, please make sure to pick the appropriate AMI for the EU region -->
               <instanceType>m1.small</instanceType>
               <parameters></parameters>
               <startupCount>1</startupCount>
               <workflow>
                  <startup timeout="300"/>
                  <shutdown/>
               </workflow>
         </awsServer>
       </serverType>
        <serverType role="ndbd" min="2" max="4">
         <awsServer>
               <name>mysql-ndb-node</name>
               <awsaccount>YOUR AWS ACCOUNT ID</awsaccount>
               <securityGroup>YOUR AWS SECURITY GROUP</securityGroup>
               <keypair>YOUR AWS SERVER LAUNCH PVT KEYNAME</keypair>
               <machineIdentifier>ami-005eba69</machineIdentifier>
               <region>us-east-1</region>
               <availabilityZone>us-east-1a</availabilityZone>
               <instanceType>m1.small</instanceType>
               <parameters></parameters>
               <startupCount>2</startupCount>
                       <workflow>
                               <startup timeout="300"/>
                               <shutdown/>
                       </workflow>
               </awsServer>
       </serverType>
         ...
       </serverTypes>
</tier>

Server Types

Server types are, as discussed earlier, a grouping of servers in a tier based on roles played by the servers. They are identified by the role name in the context of the containing tier. Currently we have two other required attributes, min and max, which are respectively the minimum and maximum count of a type of server in the containing tier. Scale-up is limited by the max count of servers in this type, whereas scale-down is limited by the min count.

One of the major challenges in porting applications across cloud providers is that, at present, there are significant differences in service levels, server types, etc. among cloud providers. Starting with version 4.4 of Kaavo IMOD, users can define multiple cloud providers for each server role, so in case the primary cloud provider is down it is easy to switch providers and bring up the servers for that role on an alternate cloud provider. For example, if there is a server role App-Server, the user can define how to manage the servers in the role App-Server on different providers.

This gives users full control over the behavior of their multi-tier distributed applications on different clouds. It is especially useful when the primary provider is unavailable and the system must be brought up on an alternate provider.

The following is a sample XML structure:

<serverType role="App-Server" min="1" max="2" defaultProvider="rackspaceServer">
  <awsServer>
   .
   .
   .
  </awsServer>
  <rackspaceServer>
   .
   .
   .
  </rackspaceServer>
  <ibmServer>
   .
   .
   .
  </ibmServer>
  .
  .
  .
</serverType>

Servers

Servers are the atomic artifacts in the n-tier system definition and represent the actual machine instances that comprise the n-tier system. Currently we support AWS EC2 and Rackspace server instances: the awsServer element has all the elements required to launch an EC2 instance, and the rackspaceServer element has all the information required to launch a Rackspace server.

The sub-elements of the awsServer element are fairly self-explanatory for anyone conversant with EC2 parlance. Though EC2 allows multiple security groups when launching an instance, the n-tier engine accepts only one security group. The security group is modified to authorize access for the IMOD maintenance security group, so that the n-tier engine can configure, monitor, and manage the servers in the system. There is also a startupCount element, which defines the number of servers that will be launched at the startup of this server definition. Rackspace servers follow a similar structure, and it is easy to define a system with servers running on multiple providers for robustness. Following EC2's region support, we have implemented the choice of the region where the AWS server will be launched by the n-tier engine. This is an optional element in the definition file; if the user does not specify a region, the server will be launched in the US-EAST-1 region.

Similar to its n-tier and tier counterparts, the server also has the generic startup and shutdown sequences with customizable timeouts. Like the other artifacts, the awsServer also has a post-startup section where you can associate actions that you define globally. These will be executed after the generic server startup activity has completed.

See below for a reference to the server tag structure.

AWS Server

<awsServer>
  <name>lb</name>
  <awsaccount>147404622121</awsaccount>
  <securityGroup>Kaavo-Demo</securityGroup>
  <keypair>demo.kaavo.us</keypair>
  <machineIdentifier>ami-70c42319</machineIdentifier>
  <instanceType>m1.small</instanceType>
  <parameters></parameters>
  <startupCount>2</startupCount>
  <workflow>
       <startup timeout="1200"/>
       <shutdown/>
  </workflow>
  <awsStoragePool/>
  <awsAddressPool/>
  <kernelImage>aki-20c12649</kernelImage>
  <ramDisk>ari-21c12648</ramDisk>
</awsServer>


Multi-Region Support

AMI IDs are different in different regions. To make it easy to support multiple regions, we support the following format for specifying AMI IDs for various regions:

<machineIdentifier>us-east-1:ami-0349a76a,us-west-1:ami-03t9a763</machineIdentifier>

AWS Spot Instance Support

To launch servers as spot instances, just enter the spot price in US dollars per server-hour in the optional spot price tag. The spot price tag can be added within the AWS server type definition in the System Definition XML. In the following example the spot price is defined as $0.04/hr. If the spot price tag is left blank or missing from the server definition, the instance(s) will be launched as on-demand instance(s).

  <spotPrice>0.04</spotPrice>

In all System Definitions created using the UI wizard after August 20, 2012, a blank spot price tag is automatically added to the server definition. Users can enter a spot price of their choice if they want to launch spot instances for the server type; otherwise they can just leave this tag blank.
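For reference, here is a hedged sketch of where the spotPrice tag can sit inside an awsServer definition. The surrounding values are illustrative placeholders taken from the earlier awsServer example, and the exact position among sibling elements is an assumption; the source only states that the tag goes within the server definition.

```xml
<awsServer>
  <name>app node</name>
  ...
  <instanceType>m1.small</instanceType>
  <!-- bid of $0.04 per server-hour; leave blank to launch on-demand -->
  <spotPrice>0.04</spotPrice>
  <startupCount>2</startupCount>
  ...
</awsServer>
```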

AWS VPC Support

To launch a server inside an AWS VPC, add the following tag, containing the target subnet ID, to the server definition:

  <vpcSubnetId></vpcSubnetId>

To assign a VPC Elastic IP address, also add the following tag to the awsAddressPool tag. Note that vpcIpId is the identifier of the VPC Elastic IP address:

  <awsAddress>
     <vpcIpId></vpcIpId>
  </awsAddress>

AWS VPC configuration for onsite deployment: If an onsite or private Kaavo IMOD deployment is part of a network connected to an AWS VPC, IMOD should use the private IPs within the VPC, instead of the public IPs, to communicate with the servers in the VPC. This can be configured by selecting a check box under the configuration section for the super admin. Without this configuration, IMOD will launch servers within the AWS VPC but won't be able to connect to them for configuration.
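Putting the VPC tags together, a hedged sketch of a VPC-enabled awsServer definition follows. The subnet and Elastic IP identifiers are illustrative placeholders, and the exact placement among sibling elements is an assumption.

```xml
<awsServer>
  ...
  <!-- target VPC subnet for the instance (placeholder ID) -->
  <vpcSubnetId>subnet-0a1b2c3d</vpcSubnetId>
  <awsAddressPool>
    <awsAddress>
      <!-- identifier of the VPC Elastic IP address (placeholder ID) -->
      <vpcIpId>eipalloc-0a1b2c3d</vpcIpId>
    </awsAddress>
  </awsAddressPool>
  ...
</awsServer>
```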

AWS Termination Protection Support

Add the following tag to enable termination protection on an AWS server:

   <terminationProtection>true</terminationProtection>

AWS EBS Optimized Support

Add the following tag to enable EBS optimization on an AWS server:

   <ebsOptimized>true</ebsOptimized>

Rackspace Server

<rackspaceServer>
  <name>jira rackspace node</name>
  <rackspaceAccount>Rackspace-Cloud-username</rackspaceAccount>   
  <imageIdentifier>8</imageIdentifier>
  <flavorIdentifier>1</flavorIdentifier>
  <parameters/>
  <startupCount>1</startupCount>
  <workflow>
       <startup timeout="3000">
       <post-startup>
            <command type="action" name="setup-monitoring"/>
            <command type="action" name="install-jdk"/>
       </post-startup>
       </startup>
       <shutdown/>
  </workflow>
</rackspaceServer>


For Rackspace servers, make sure to give each server type a unique name per Rackspace account to avoid name conflicts on the Rackspace side. Use the following table to specify <imageIdentifier> for a Rackspace server.

Image Name			Rackspace Image ID
Debian 5.0 (lenny)		4
Fedora 10 (Cambridge)		5
CentOS 5.3			7
Ubuntu 9.04 (jaunty)		8
Arch 2009.02    		9
Ubuntu 8.04.2 LTS (hardy)	10
Ubuntu 8.10 (intrepid)		11
Red Hat EL 5.3			12
Fedora 11 (Leonidas)		13
Red Hat EL 5.4			14
Fedora 12 (Constantine)	        17
Ubuntu 9.10 (karmic)		14362
CentOS 5.4			187811

Also use the following table to specify <flavorIdentifier> for Rackspace

Flavor Name	Flavor ID
256MB  server	1
512MB  server	2
1GB    server	3
2GB    server	4
4GB    server	5
8GB    server	6
15.5GB server	7

Eucalyptus Server

<eucalyptusServer>
           <name>apache node</name>  
           <eucalyptusaccount></eucalyptusaccount>  
           <securityGroup></securityGroup>  
           <keypair></keypair>  
           <machineIdentifier></machineIdentifier>  
           <eucaAddressPool>
              <eucaAddress>
                 <ip><!-- your IP--></ip>
              </eucaAddress>
              <eucaAddress>
                 <ip><!-- your another IP--></ip>
              </eucaAddress>
           </eucaAddressPool>
           <eucaStoragePool>
             <eucaSnapshot>
               <id>snap-B9553ED9</id>
               <size>1</size>
               <mount-point>/mnt/data</mount-point>
               <device>/dev/sdp</device>
             </eucaSnapshot>
         </eucaStoragePool>
           <kernelImage></kernelImage>  
           <ramDisk></ramDisk>
           <instanceType></instanceType>  
           <parameters/>  
           <startupCount>1</startupCount>  
           <availabilityZone></availabilityZone>  
           <workflow>
             <startup timeout="3000">
               <post-startup>
                 <command type="script" name="install_zabbix.sh"> <![CDATA[
                              apt-get update
                              apt-get install -f -y --force-yes perl
                              apt-get install -f -y --force-yes zabbix-agent
                              sed -i 's/Server=localhost/Server=monitor2.kaavo.org/' /etc/zabbix/zabbix_agentd.conf
                              /etc/init.d/zabbix-agent restart
                          ]]> </command>
               </post-startup>
             </startup>  
             <shutdown/>
           </workflow>  
</eucalyptusServer>

Similar to IBM servers, Eucalyptus servers may have addressPool and storagePool tags. Please refer to the documentation of these tags in the IBM server section below.

IBM Server

<ibmServer>
  <name></name>  
  <ibmAccount></ibmAccount>  
  <keypair></keypair>  
  <imageIdentifier></imageIdentifier>  
  <instanceType></instanceType>  
  <location>1</location>  
  <addressPool>
     <address>
        <id>12093</id>
     </address>
     <address>
        <id>12099</id>
     </address>
  </addressPool>  
  <storagePool>
     <storage>
        <id>2178</id>
        <mount-point>/data</mount-point>
     </storage>
     <storage>
        <id>2196</id>
        <mount-point>/data</mount-point>
     </storage>
  </storagePool>  
  <parameters/>  
  <startupCount>2</startupCount>  
  <workflow>
    <startup timeout="9000">
      <post-startup>
        <command type="action" name="install-jdk"/>
      </post-startup>
    </startup>  
    <shutdown/>
   </workflow>
</ibmServer>

The tags addressPool and storagePool are used to associate IP addresses and storage volumes with servers during system startup. In the above example, the startupCount is 2, so there are two servers in the group: the first server gets the first IP address and storage volume, and the second server gets the second IP address and storage volume. Individual IP addresses in the pool are specified by address tags nested within the addressPool tag; each address tag has an id child tag specifying the id of the corresponding IP address. Individual storage volumes are specified by storage tags. In addition to the id tag there is the mount-point tag, which specifies the mount point for the volume. If the mount-point tag is blank, the corresponding storage tag is ignored. The configuration UI leaves the mount-point field blank, so please go to the XML and edit the mount-point manually for volume assignment to work. If the address pool or the storage pool has fewer entries than the server group size (startupCount), the unmatched servers get no IP address or storage volume. Similarly, if the pool sizes are greater than the startupCount, the extra IP addresses and storage volumes are ignored. The address pool and storage pool may have different sizes. Currently, this feature is available only during system startup, not during scaleup/scaledown.

Some IBM images take custom parameters. When an image is selected in the server configuration dialog, IMOD automatically creates input boxes for those parameters by dynamically querying the parameter list. However, for some images this parameter list may not be available and no input boxes will appear. The work-around in those cases is to manually provide the parameter information using the <parameters> element.

<parameters>
    <parameter name="WASAdminUser" type="literal" value="testuser"/>
    ...
    <parameter name="WASConfigureIHS" type="literal" value="true"/>
    ...
    <parameter name="WASAugmentList" type="literal" value="all"/>
</parameters>

Terremark Server

<terremarkServer>
  <name>jira-node</name>
  <terremarkAccount/>
  <password/>
  <templateIdentifier/>
  <cpu/>
  <memory/>
  <openPorts>
    <openPort protocol="TCP" fromPort="8080" toPort="8080"/>
  </openPorts>
  <parameters/>
  <startupCount>1</startupCount>
     <workflow>
       <startup timeout="3000">
       <post-startup>
          <command type="action" name="install-jre"/>
       </post-startup>
       </startup>
       <shutdown/>
     </workflow>
</terremarkServer>

Cloud.com Server

<cloudstackServer>
 <name>test532</name>  
 <cloudstackAccount>My-InfiCloud</cloudstackAccount>  
 <templateId>465</templateId>  
 <zoneId>1</zoneId>  
 <serviceOfferingId>13</serviceOfferingId>  
   <cstAddressPool>
     <cstAddress>
         <ip></ip>
     </cstAddress>
   </cstAddressPool>  
    <parameters/>  
    <startupCount>1</startupCount>  
    <workflow>
     <startup timeout="3600">
       <post-startup>
       </post-startup>
     </startup>  
     <shutdown/>
     </workflow>
</cloudstackServer>

Physical Server

A physical server is treated as a cloud provider, so you can add keys to the physical server account on the profile page. Workload can be managed as a single system across hybrid deployments consisting of physical servers, private clouds, and public clouds.

<physicalServer>
  <name>jira node</name>  
  <physicalAccount>my-physical-name</physicalAccount>
  <privateKey>my-private-key-name</privateKey>
  <sshUser>root</sshUser>  
  <host>555.106.199.555</host> 
  <parameters/>  
    <workflow>
      <startup timeout="3000">
       <post-startup>
        <command type="action" name="install-jre"/>
       </post-startup>
      </startup>  
      <shutdown/>
    </workflow>
  <sshPort>replace-this-with-your-port-number</sshPort>
</physicalServer>

In case the physical server uses a password instead of key pair authentication, you can replace the privateKey tag with a password tag, e.g.:

<password>mypassword</password>

vCloud Director Server

    <serverTypes> 
       <serverType role="testvcd_role" min="1" max="2"> 
         <vcdServer> 
           <name>my-server</name>  
           <vcdAccount>my-vcd-account</vcdAccount>  
           <vdcName>Kaavo</vdcName>  
           <templateIdentifier>Template_CentOS6</templateIdentifier>  
           <parameters/>  
           <startupCount>1</startupCount>  
           <workflow> 
             <startup timeout="1200"> 
               <post-startup/> 
             </startup>  
             <shutdown/> 
           </workflow> 
         </vcdServer> 
       </serverType> 
    </serverTypes>

OpenStack Server

<serverType role="default" min="2" max="3" logMode="verbose"> 
         <openstackServer> 
           <name>jira-node</name>  
           <openstackAccount>my-openstack-acc</openstackAccount>  
           <imageIdentifier></imageIdentifier>  
           <flavorIdentifier></flavorIdentifier>  
           <keypair>default</keypair>  
           <ostAddressPool> 
             <ostAddress> 
               <ip></ip> 
             </ostAddress>  
           </ostAddressPool>  
           <parameters/>  
           <startupCount>2</startupCount>  
           <workflow> 
             <startup timeout="3000"> 
               <post-startup> 
                 <command type="action" name="install-jdk"/> 
               </post-startup> 
             </startup>  
             <shutdown/> 
           </workflow> 
         </openstackServer> 
       </serverType>

Deployment Timeout Handling

In version 1.8, we added the ability to handle timeout conditions during deployment and server configuration by adding an onTimeout attribute to the server startup workflow tag: <startup timeout="300" onTimeout="Continue"/>

<serverType role="loadbalancer" min="1" max="8">
    ...............................
    ...............................
    ...............................
    <startupCount>2</startupCount>
    <workflow>
              <startup timeout="1200"  onTimeout="Continue"></startup>
              <shutdown/>
    </workflow>
    ...............................
    ...............................
</serverType>

Deployment Error Handling

In version 2.5, we added the ability to handle server error conditions during deployment and server configuration by adding an onError attribute to the server startup workflow tag: <startup timeout="300" onError="Continue"/>

If no onError attribute is specified, the engine follows the default behavior, i.e. the tier and the system deployment go into an error state and users have to manually abort the system and retry the deployment after addressing the issue responsible for the error. See the following example of how to configure this within the serverType tag:

<serverType role="loadbalancer" min="1" max="8">
   ...............................
   ...............................
   ...............................
   <startupCount>2</startupCount>
   <workflow>
             <startup timeout="1200"  onError="Continue"></startup>
             <shutdown/>
   </workflow>
   ...............................
   ...............................
</serverType>

AWS EBS Boot Instance

Starting with version 1.9, we added support for instances that use EBS for booting. EBS boot instances can be included in the n-tier system by adding optional information to the awsServer tag. Setting terminateOnStopInstance to false ensures that the server is stopped, not terminated.

<serverType role="default" min="1" max="2" >
 <awsServer>
 ..................
 ..................
 <securityGroup></securityGroup>
 <keypair></keypair>
 <rootDeviceType>ebs</rootDeviceType>
 <terminateOnStopInstance>false</terminateOnStopInstance>
 <machineIdentifier>ami-df6489b6</machineIdentifier>
 ..................
 ..................
 </awsServer>
</serverType>

Monitoring Custom Images (images not provided by Kaavo)

Starting with version 1.9, we added automatic installation of monitoring agents on servers running Linux. This allows users to use any custom Linux image without being limited to Kaavo-provided Linux images; all flavors of Linux are supported. By default, any time a Linux-based server is launched from IMOD N-Tier, the IMOD N-Tier engine checks whether the monitoring agent is installed on the server and installs it if it is not. If users don't want the monitoring agent installed on the server, they can disable this default behavior by adding the optional flag agentSetup="manual" to the serverType:

<serverType role="default" min="1" max="4" agentSetup="manual">

Debugging Custom Actions on Servers

IMOD logs the exit codes of actions executed on the servers over SSH in the application-centric system logs, which are accessible from the LOG tab for deployed systems. For debugging, users can add the verbose flag to also display up to the last 10 lines of output generated by the execution of actions on the servers in the application-centric logs. Detailed messages are logged on the servers themselves, in the same directory where the actions were executed, in log files named after the corresponding actions (<action-name>.log). To disable both the exit codes and the messages from actions, use the value "none" for logMode.

<serverType role="default" min="1" max="4" logMode="verbose">
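Conversely, to suppress both the exit codes and the action messages described above, set logMode to "none":

```xml
<serverType role="default" min="1" max="4" logMode="none">
```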

Connecting to Servers as a non-root User

Some images don't allow users to connect as the root user over SSH. A non-root username for the IMOD engine to use when connecting to the server can be configured by setting the sshUser tag in the server definition:

<serverType role="default" min="1" max="2" >
 ............................
 ............................
 <sshUser>ubuntu</sshUser>                     
 ............................
 ............................
</serverType>

Manage Servers without SSH Connectivity

Sometimes users only want to launch servers without connecting to them over SSH, or have firewall rules that prevent IMOD from connecting to certain servers over SSH. Users can bring up multi-server systems using IMOD without the IMOD engine waiting for SSH connectivity by adding the optional checkSSH="false" flag to the startup tag in the server definition:

<serverType role="default" min="1" max="2">
 <awsServer>
  ............................
  ............................
  <startupCount>1</startupCount>  
   <workflow>
    <startup timeout="300" checkSSH="false">
  ............................
  ............................
</serverType>

Actions

Actions have a unique name, a target (written as an expression which resolves to a set of servers at execution time), and a Boolean flag, execute, which indicates whether the action evaluates to an executable script (true) or to a server configuration file (false). The script or configuration file is created dynamically, is saved with the name scriptName under the server path scriptPath, and contains the contents produced by substituting the values of the parameters into the template scriptTemplate. Parameters are elements with a name (referenced in the scriptTemplate) and a value. The type of a parameter is either a literal or a server expression that resolves to a set of servers.

<action name="grant-mysql-phpcolab" execute="true" target="[tier=db_tier][serverrole=sql]">
   <scriptName>GrantPhpDbAccess.sh</scriptName>
   <scriptPath>/root</scriptPath>
   <scriptTemplate type="inline">
   <![CDATA[
       #!/bin/sh
       #foreach ($clientNode in $SqlClientNodes)
       /usr/local/mysql/bin/mysql -uroot -p${mysqladminpassword} -e "GRANT ALL PRIVILEGES ON ${appdb}.* TO '${appdbuser}'@'${clientNode.PrivateDNS}' IDENTIFIED BY '${appdbpassword}'"
       #end
   ]]>
   </scriptTemplate>
   <parameters>
       <parameter name="mysqladminpassword" type="literal" value="passwd" />
       <parameter name="appdb" type="literal" value="php_collab" />
       <parameter name="appdbuser" type="literal" value="phpcollab" />
       <parameter name="appdbpassword" type="literal" value="apppasswd" />
       <parameter name="SqlClientNodes" type="serverref" value="[tier=app_tier][serverrole=default]"/>
   </parameters>
</action>

Events

An NTier, a Tier, or a Server Group (or each of the individual servers in a group) is controlled by sending events to it. An event has a specified target. Events can be fired in different ways:

  1. Manually – users may fire events using the user interface.
  2. By the event scheduler.
  3. By making a web service call (http://wiki.kaavo.com/index.php/Kaavo_Web_Services).
  4. By setting up a trigger in the monitoring system (http://wiki.kaavo.com/index.php/N-Tier_Guide_System_Definition#Configure_Events).

There are two kinds of events:

Standard Events

Standard events are predefined events; they are defined whenever a system is defined in IMOD. There are two kinds of standard events: start and stop. Any of the NTier, a Tier, or a Server (Group) can be the target of these events, so for each of them we have a start event and a stop event: for example, start ntier, start tier1, start tier2, start server1, start server2, stop tier1, etc. Every event has a handler, which is a block of code executed in response to the occurrence of the event. For standard events, the event handlers are predefined. Nevertheless, the default behavior of these event handlers can be customized by implementing predefined hooks for them: the start events have post-startup hooks, and the stop events have pre-shutdown hooks. The structure of these event handlers is illustrated below in pseudocode.

Proc Start-server-handler
 Begin
      Execute start-server-default-logic
      Execute post-startup hook
 END

Proc Stop-server-handler
 Begin
      Execute pre-shutdown hook
      Execute stop-server-default-logic       
 END

The default logic for start server (referred to as start-server-default-logic above) is to provision a server in the back-end cloud. The default logic for stop server is to de-provision the server. The post-startup and pre-shutdown hooks are blank by default. So, when we specify a standard event, only these hooks have to be specified.
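By analogy with the post-startup block shown earlier in this guide, a pre-shutdown hook for a stop event could be sketched as follows. This assumes the pre-shutdown block mirrors the post-startup structure, and the action name backup-data is a hypothetical action assumed to be defined in the actions section:

```xml
<shutdown timeout="300">
    <pre-shutdown>
        <!-- hypothetical action, assumed to be defined in the actions section -->
        <command type="action" name="backup-data"/>
    </pre-shutdown>
</shutdown>
```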

Users don't have to manually specify hooks. They can use the GUI for it as shown below:

Figure 4: Define hook for start server event.

Custom Events

In addition to the standard events that are available by default, users may define their own events. Given below is an example of a custom event. An event is identified by a name, and the handler of the event is defined along with the event. So, if the event below is defined in a system, whenever an event named "system-scaleup" occurs the specified handler is executed.

 
 <event name="system-scaleup" description="Jira system scaleup" type="custom">
      <handler timeout="6000">
        <pre-process>
          <startServers>
            <serverType role="[tier=jira_tier][serverrole=default]" count="1" addToEvent="true"/>
          </startServers>
        </pre-process>
        <process>
          <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][scaleupserver]"/>
          <command type="action" name="configure-apache-balancer"/>
          <command type="action" name="reset-apache"/>
        </process>
      </handler>
 </event>

  

The event handler above starts a new server and configures it by executing a sequence of commands on it. This type of event handler may be used for scaling up an existing system with more instances. Similarly, events can be defined for recovering a dead server. The elements startServers, recoverServers, and stopServers are available only in the event handlers of custom events.

 
 <event name="server-recovery" description="Jira server recovery" type="custom">
      <handler timeout="6000">
        <pre-process>
          <recoverServers>
            <server server-to-recover="[context=event][param=instanceid]"/>
          </recoverServers>
        </pre-process>
        <process>
          <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][recoveringserver]"/>
          <command type="action" name="configure-apache-balancer"/>
          <command type="action" name="reset-apache"/>
        </process>
      </handler>
 </event>
 

Similar to actions and standard events, users don't have to specify custom events manually.

Figure 5: Define Custom Events using GUI

Event Format Examples

Custom event examples

<event name="custom-scaleup" description="Jira server custom scaleup" type="custom">
   <handler  timeout="1200">
    <pre-process>
         <!-- can have a sequence of actions here; they will be executed sequentially before start/stop/recoverServers -->
         <startServers>					
               <serverType role="[tier=jira_tier][serverrole=default]" count="1" addToEvent="true"/>
         </startServers>	
       </pre-process>
     <process>
         <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][scaleupserver]"/>
     </process>
   </handler>
 </event>

<event name="custom-recovery" description="Jira server died" type="custom">
   <handler timeout="1200">
       <pre-process>
               <!-- can have a sequence of actions here; they will be executed sequentially before start/stop/recoverServers -->
          <recoverServers>
              <server server-to-recover="[context=event][param=instanceid]"/>
          </recoverServers>
       </pre-process>
    <process>
       <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][recoveringserver]"/>
    </process>
   </handler>
</event>

<event name="custom-scaledown" description="Jira server custom scaledown" type="custom">
    <handler timeout="1200">
          <pre-process>
              <!-- can have a sequence of actions here; they will be executed sequentially before start/stop/recoverServers.
For example, <command type="action" name="xyz" target="[tier=jira_tier][serverrole=default][scaledownserver]"/> -->
              <stopServers>
                 <serverType role="[tier=jira_tier][serverrole=default]" count="1"/>
              </stopServers>
           </pre-process>
                <!-- no process section because we don't need it -->
     </handler>
 </event>

"custom-scaleup" is an event. The attribute type="custom" indicates that it follows the enhanced format. This event can act on different server types in different tiers in different ways; hence, the <event> tag does not have a scope attribute. The <handler> tag does not have a type attribute: the nature of the handler is completely defined by its body. It contains a couple of new tags, namely <pre-process>, <process>, <startServers>, etc. The <pre-process> tag is used for grouping certain operations. <startServers> (or <stopServers>) with nested <serverType> tags represents the operation of starting (or stopping) a number of new servers (specified by the count attribute) belonging to a certain role (specified by the role attribute). The <process> tag contains a sequence of actions to be executed after the operations specified by <pre-process> are executed. Hence, the handler for "custom-scaleup" performs the following:

  1. Starts one server of role “default” in tier “jira_tier”, and stops one server of the same role.
  2. Executes the action “start-jira” on the newly started server (scaleupserver).
  3. Adds the new server to the configured event, as the attribute addToEvent is set to “true”.
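Putting the pieces described above together, a minimal “custom-scaleup” event in the enhanced format might look like the following sketch. This mirrors the structure of the “custom-scaledown” example; the placement of the addToEvent attribute on <startServers> is an assumption, so consult a UI-generated System Definition for the exact schema:

 <event name="custom-scaleup" description="Jira server custom scaleup" type="custom">
     <handler timeout="1200">
         <pre-process>
             <!-- addToEvent placement is assumed; verify against a generated definition -->
             <startServers addToEvent="true">
                 <serverType role="[tier=jira_tier][serverrole=default]" count="1"/>
             </startServers>
         </pre-process>
         <process>
             <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][scaleupserver]"/>
         </process>
     </handler>
 </event>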

Note, by contrast, that a recovery event is automatically associated with the recovered instance, and stopped instances are automatically removed from the events they belong to.

“custom-recovery” is an event whose handler performs auto-recovery of dead servers. It uses the new tag <recoverServers>; <recoverServers> with a nested <server> tag represents the operation of recovering the server specified by the server-to-recover attribute. Hence, the handler for “custom-recovery” performs the following:

  1. Recovers the dead server.
  2. Executes the action “start-jira” on the recovered server.

“custom-scaledown” is an event whose handler performs scaledown of servers. It uses the new tag <stopServers> for stopping servers. To refer to the servers that are going to be scaled down, the selector [scaledownserver] is used, e.g. target="[tier=jira_tier][serverrole=default][scaledownserver]".

The enhanced format does not yet support event-context-based target expressions such as "[context=event][param=instanceid]" in actions/commands. Within the same event, the same server role must not appear in both the startServers and stopServers blocks.

Old Format Event

Before the event specification syntax described above was introduced, IMOD supported an older event syntax. That format has been deprecated and replaced by the enhanced event specification format. Please use the new format, which allows you to define any of the following custom event types, plus more, using a single standard format.

The old syntax supports four types of handlers, described in the following example:

 <events>
    <event name="jira-service-died" scope="[tier=jira_tier][serverrole=default]">
        <handler type="simple">
            <command type="action" name="restart-jira" target="[context=event][param=instanceid]"/>
        </handler>
    </event>

    <event name="jira-server-died" scope="[tier=jira_tier][serverrole=default]">
       <handler type="recovery" timeout="600" server-to-recover="[context=event][param=instanceid]">
           <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][recoveringserver]"/>
       </handler>
    </event>

    <event name="jira-tier-overloaded" scope="[tier=jira_tier][serverrole=default]">
        <handler type="scaleup"  timeout="600" scalecount="1">
            <command type="action" name="start-jira" target="[tier=jira_tier][serverrole=default][scalingserver]"/>
        </handler>
    </event>

    <event name="jira-tier-underused" scope="[tier=jira_tier][serverrole=default]">
        <handler type="scaledown"  timeout="600" scalecount="1">
        </handler>
    </event>
 </events>

“jira-service-died” is an event defined in the scope of the tier (jira_tier) and the server type (default), with a handler of type simple; this means the commands defined in the handler will be executed in sequence. The action referenced here, “restart-jira”, has its target overridden by a server expression that takes its reference from the event context. The expression [context=event][param=instanceid] looks for a parameter named instanceid in the event context and, because it is the target of an action, resolves to a set of servers from the scope of the given event. Such an expression expects a comma-separated string of AWS instance ids and picks the respective servers from the scope of this event.

“jira-server-died” is an event defined in the same scope, with a handler of type recovery; this means a workflow will replace the server identified by the expression in the handler's server-to-recover attribute and, as post-actions of the workflow, will execute the sequence of commands defined in the handler.

“jira-tier-overloaded” is an event defined in the same scope, with a handler of type scaleup; this means a workflow will increase the count of servers in this scope and configure them using the commands defined in the post-startup section of the scope's serverType. The count increases by the value of the handler's scalecount attribute. Also, as part of the post-scaleup actions, the commands defined in the handler section are executed sequentially.

“jira-tier-underused” is an event defined in the same scope, with a handler of type scaledown; this means a workflow will decrease the count of servers in this scope by the value of the handler's scalecount attribute. Also, as part of the pre-scaledown actions, the commands defined in the handler section are executed; here no commands are defined.

Note that you may give any name to your custom events, as long as the name is unique within the system definition. Different systems can have the same event names.

Services

Rackspace Load Balancer

The following is an example of how Rackspace Load Balancer information is captured in the System Definition File. For scale-up, scale-down, start-up, shutdown, and recovery cases, Kaavo IMOD automatically adds or removes servers from the load balancer. Users don't need to write any scripting to do the plumbing required for registering servers with, and removing them from, the load balancer.

 <services>
   <rackspaceloadbalancer name="webbalancer"> 
     <rackspaceAccount>kaavosupport</rackspaceAccount>  
     <region>ORD</region>  
     <lbnodes>[tier=simplerack_tier][serverrole=simplerack_role]</lbnodes> 
     <listenerconfigs> 
       <httpconfig stickysession="true">
         <loadbalancer port="80" />
         <instance port="80" />
       </httpconfig> 
     </listenerconfigs>
     <healthcheck pingprotocol="http" pinginterval="30" pingtimeout="20" healthy-threshold="3" pingpath="/" status-regex="200" body-regex=".*"/> 
   </rackspaceloadbalancer> 
 </services>

Amazon Load Balancer

The following is an example of how Amazon Load Balancer information is captured in the System Definition File. For scale-up, scale-down, start-up, shutdown, and recovery cases, Kaavo IMOD automatically adds or removes servers from the load balancer. Users don't need to write any scripting to do the plumbing required for registering servers with, and removing them from, the load balancer. Note: make sure to specify the availability zones in the System Definition for the servers you are adding as lbnodes.

<services>
  <awsloadbalancer name="webbalancer">
     <awsaccount>896321137534</awsaccount>
     <region>us-east-1</region>
     <lbnodes>[tier=Amazon_Loadbalancer][serverrole=default]</lbnodes>  
     <listenerconfigs>
       <httpconfig>
         <loadbalancer port="80" secure="false"/>  
         <instance port="80" secure="false"/>  
         <policy type="lb"> 
         <cookieduration>600</cookieduration> 
         </policy> 
       </httpconfig> 
     </listenerconfigs>  
     <healthcheck pingprotocol="http" pingport="80" pingpath="/" pinginterval="30" pingtimeout="3" unhealthy-threshold="2" healthy-threshold="2"/> 
   </awsloadbalancer> 
 </services>

The following AWS Load Balancer service attributes are available to the Velocity template engine for automation using actions, e.g. for automatically adding the newly created load balancer service to the DNS.

LoadBalancer within AWS VPC

To use an AWS LoadBalancer within a VPC, add an XML element with the subnet information under the lbnodes element in the LoadBalancer definition in the System Definition file. The subnets tag should be added between the lbnodes and listenerconfigs XML tags. A LoadBalancer can support servers in multiple subnets; just add a comma-separated list of the subnets. See the example below:

<services>
  <awsloadbalancer name="webbalancer">
     <awsaccount>896321137534</awsaccount>
     <region>us-east-1</region>
     <lbnodes>[tier=Amazon_Loadbalancer][serverrole=default]</lbnodes>  
     <subnets>subnet-dd97fcb6,subnet-aa97fcb6</subnets>
     <listenerconfigs>
       <httpconfig>
         <loadbalancer port="80" secure="false"/>  
         <instance port="80" secure="false"/>  
         <policy type="lb"> 
         <cookieduration>600</cookieduration> 
         </policy> 
       </httpconfig> 
     </listenerconfigs>  
     <healthcheck pingprotocol="http" pingport="80" pingpath="/" pinginterval="30" pingtimeout="3" unhealthy-threshold="2" healthy-threshold="2"/> 
   </awsloadbalancer> 
 </services>

Amazon Specific Commands

Apart from the script-like actions, we also provide cloud-provider-specific commands that can be executed in the post-startup sections. For example, here are the commands supported for the Amazon cloud:

Associate an elastic IP address to a running instance

Example:

       <post-startup>
           <command type="ec2" name="associate-ip">
               174.129.251.111
           </command>
               ...
        </post-startup>

Note: Since an elastic IP can be associated with only one instance at a time, ensure that the above command is defined in a server definition that results in only one server instance at runtime, to avoid unpredictable consequences.

Attach an EBS Volume to a running instance and mount it as a device on some path

Example:

           <post-startup>
               <command type="ec2" name="attach-ebs-vol">
                   [volume-id=vol-3de40054][device-name=/dev/sdh][mount-path=/mnt/apache]
               </command>
               ...
           </post-startup>

Note: Since a volume can be attached to only one instance, ensure that the above command is defined in a server definition that results in only one server instance at runtime, to avoid unpredictable consequences. Also, ensure that the volume is formatted, as we do not do the formatting yet. Don't put any space characters between the brackets.

Attach an EBS Volume to a running instance after creating the volume from a snapshot

Example:

       <post-startup>
           <command type="ec2" name="attach-ebs-vol">
               [snapshot-id=snap-c7f012ae][volume-size=5]
               [device-name=/dev/sdh][mount-path=/mnt/mysql]
           </command>
           ...
       </post-startup>

Note: This command ensures that a volume is created from the specified snapshot in the same zone as the server instance, and mounted at the device and path mentioned.

Configure the instance to be able to communicate with S3 buckets belonging to the AWS account that launched the instance

Example:

       <post-startup>
           <command type="s3" name="setup-s3config"/>
           ...
       </post-startup>

Clean up the S3 configuration on the instance to disable any further communication with S3 from this instance

Example:

       <post-startup>
           <command type="s3" name="teardown-s3config"/>
           ...
       </post-startup>

Normal shell script executions on the instance

Example:

       <post-startup>
           <command type="script" name="cloudondemandconfig.sh">
              <![CDATA[
                  echo "copying from s3"
                  s3cmd -f get s3://your_bucket/your_file  /somepath/your_file_on_the_server
              ]]>
           </command>
           ...
       </post-startup>

Creating Custom Events

Along with auto-deployment of an n-tier system, IMOD also automates the management of deployed n-tier systems by providing a framework for defining custom events and mapping events to actions and workflows in the system definition file. This functionality enables fully automated lifecycle management of deployed applications: IMOD automatically executes pre-defined actions in response to registered events. Think of this as an auto-pilot for managing your application's service levels; IMOD automatically takes corrective actions, without requiring any human intervention, to ensure service levels are met.

Leveraging this functionality requires two main steps: first, define actions and map them to events; second, configure the event conditions for triggering the events.

Define Actions and Map Events

In the System Definition file you can define custom actions and map them to events. Events are generated by the Monitoring System, or can be generated manually from the n-tier Dashboard. To automate the response to events, we need to map the events to actions. See the example system shown in Figure 6.
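As a concrete illustration, an event-to-action mapping reuses the enhanced event format described earlier: the event's handler simply invokes a custom action by name. The event name, tier name, and action name below are hypothetical, and the sketch assumes a handler may contain only a <process> section:

 <event name="app-service-died" description="Restart the app service" type="custom">
     <handler timeout="600">
         <process>
             <!-- "restart-app" is a hypothetical custom action defined elsewhere in this system definition -->
             <command type="action" name="restart-app" target="[tier=app_tier][serverrole=default]"/>
         </process>
     </handler>
 </event>

When the monitoring system fires “app-service-died”, IMOD executes the mapped “restart-app” action on the servers selected by the target expression.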

Figure 6: Event-to-Action Mapping

Configure Events

You have to configure events so that the Monitoring System can trigger them under the specified conditions. In IMOD you can do this from the Monitoring dashboard: click the Configure System Event button to open the Create Event dialog box shown in Figure 7.

Since release 1.5, IMOD supports persistent events. Statically configured events are persistent in nature because they are persisted along with the system definition, and they can be defined on systems that have not been started; dynamically configured events cease to exist when the system is stopped. When a system is undeployed, even statically configured events are deleted. Statically configured events are activated when the system is started and are initially associated with all the servers in the selected server roles. Dynamically configured events, on the other hand, are initially associated with the selected server instances only. In both cases, the server association may change as servers come and go.

Perform the following operations to define an event:

• Choose the System.
• Check the Servers/Server roles whose metric values will be used in deciding the bounding condition. Since release 1.5 you can select server roles as well as specific servers.
• Select the name of the event you want to configure from the Event Name combo box. Whenever you choose a system, the Event Name combo box is populated with all the events you have defined in the system definition file.
• Write your comments for the event; these will be used by the alert mechanism. Sending alerts for triggered events is optional: click the check box to send alerts and specify the severity and priority of the alerts. If an alert is configured, you receive an email any time the trigger occurs. There is also a separate button called 'Create Alert', used to define alerts on single servers. The only action taken by such an alert is sending email, and it is currently supported for AWS servers only.


Figure 7: Create Event Dialog Box

• Select the Event Type. Events may be aggregated or non-aggregated. If you choose an aggregated event, the event is triggered based on the aggregated metric values of all the selected servers; otherwise, the event is triggered based on the individual metric values of each selected server. Generally, scale-up and scale-down events are aggregated events, whereas instance recovery and service recovery are non-aggregated events. You can safeguard your scaling mechanism by properly setting the max and min values of the serverType tag in your system definition file: IMOD will not scale up beyond the max value (maximum number of servers for that tier) or scale down below the min value (minimum number of servers for that tier).
• Select or deselect the Dynamic? checkbox to indicate whether you want to configure the event statically or dynamically. Statically configured events can be defined when the system is not in a running state; in this case, server roles are selected in the second step above. Dynamically configured events can be defined when the system is running and has running servers; in this case, running servers are selected in the second step above.
• Choose the metric for deciding the bounding condition. IMOD supports CPU, Memory, I/O, Disk-Space, Swap Memory, and Number of Requests. The metric 'Ping to Server (TCP)' is provided to support instance recovery. If you need any other metrics, please let us know and we will add them.
• Choose the bounding expression for metrics other than 'Ping to Server (TCP)'. The current release of IMOD supports the following expressions:
  o Average value for period of T times < N
  o Average value for period of T times > N
  o Average value for period of T times = N
  o Average value for period of T times NOT N
• Choose the value of N if required.
• Choose the value of T if required.
• Click the Submit button.

After defining the event, you will see it in the Monitoring dashboard under the Alerts/Events tab. Just click a system name under the Systems and Standalone Instances tab; it will list all the events defined for that system. To delete an event, click the corresponding delete icon.

Figure 8: List of Custom Events for a System

Event configuration Example

Refer to the sample php-collab system definition template provided as an example in IMOD. php-collab is a PHP-based collaboration application, and the System Definition deploys the application in the following 3-tier setup:

• A web-tier with two Apache load balancers configured with round-robin DNS.
• An app-tier with the php-collab application deployed on the Apache web server. Initially it consists of two app nodes.
• A db-tier configured with a MySQL cluster consisting of a manager node, a SQL node, and two NDB nodes.

The System Definition file has three pre-configured actions as an example: one action for recovering from a server failure in the app-tier, a second for scaling up the app-tier, and a third for scaling down the app-tier.

To test-drive the event-to-action functionality, deploy and run the php-collab system and configure the events on the monitoring page for the events in the System Definition file. You can simulate a server failure by killing one of the servers in the app-tier; e.g. for an EC2 server, you can kill the server from the ElasticFox Firefox plug-in or the command-line EC2 tools. IMOD will discover the server failure and automatically recover the system by launching a server and configuring it in the app-tier, executing the recovery action. If you shut down the system using IMOD, it will be treated as a planned shutdown and no recovery event will be triggered.

In addition to auto-triggering of events from the monitoring system, you can also fire events manually from the n-tier dashboard by clicking the Event button, selecting the appropriate event from the drop-down list, and firing it by clicking the Fire button. Some events may require inputs; e.g. recovery, depending on how it is defined in the system definition file, may need an instance id (server id). This information can be passed by clicking the Add button and adding name-value pairs to any event when firing it manually.


Figure 9: Manually Firing Events

The following figures show example screenshots of how the system UI looks during scale-up and scale-down events.

Figure 10: php-collab system


Figure 11: php-collab during scaling up


Figure 12: php-collab during scaling down

Service Monitoring

Starting with version 2.6, we have added service monitoring to IMOD. Service monitoring can be enabled by adding the monitor tag. Currently we support monitoring of the MySQL database and the Apache HTTP service. Custom event triggers can be configured for the monitored services so that the auto-pilot automatically executes the corrective action.

MySQL Monitoring

<serverType role="db-server" min="1" max="1"> 
   ...
           <startupCount>1</startupCount>  
           <monitor>mysql[user=your-db-username,password=your-db-password,location=/usr/local/mysql/bin]</monitor>  
           <workflow> 
           </workflow>  
           ......................
         </awsServer> 
       </serverType>

Apache Monitoring

<serverType role="db-server" min="1" max="1"> 
   ...
           <startupCount>1</startupCount>  
           <monitor>apache[user=apache]</monitor>  
           <workflow> 
           </workflow>  
           .................
         </awsServer> 
       </serverType>

Monitoring Multiple Services

<serverType role="db-server" min="1" max="1"> 
   ...
     <startupCount>1</startupCount>  
     <monitor>mysql[user=your-db-username,password=your-db-password,location=/usr/local/mysql/bin]:apache[user=apache]</monitor>  
           <workflow> 
           </workflow>  
           
         </awsServer> 
       </serverType>

[If you have a standard installation of MySQL, you don't need to provide the location value; if you are running MySQL without a password (not recommended), then you don't need to provide the password either.]
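For example, with a standard MySQL installation and a password-protected account, the monitor line shown above reduces to the following sketch (the user and password values are placeholders):

 <monitor>mysql[user=your-db-username,password=your-db-password]</monitor>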

Retrieved from "http://wiki.kaavo.com/index.php/N-Tier_Guide_System_Definition"

This page was last modified on 5 March 2014, at 13:10. Content is Copyright 2013 Kaavo. All rights reserved.

