Reward Models

After describing the model, we can obtain different measures of interest using rate or impulse rewards. These rewards are generic enough to permit the definition of a wide range of measures of interest. (For references on reward models and some of the techniques implemented, see [deso89, deso-perfsurv, trans-finalg98].)

Rate rewards are associated with the states in the model. If a rate reward $r_i$ is associated with state $i$, then the system gains reward $r_i$ per time unit spent in state $i$. Impulse rewards are associated with state transitions. If a reward $\rho_{ij}$ is associated with the transition from state $i$ to $j$, then the system gains a reward $\rho_{ij}$ each time it makes a transition from $i$ to $j$. Impulse rewards may be used as a counting mechanism: an impulse reward can be associated with the triggering of an event, or with the reception of a message.
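The two definitions above can be combined into a single expression for the reward accumulated over an interval. The following is a sketch only: the symbols $CR(t)$, $\tau_i(t)$ and $N_{ij}(t)$ are notation introduced here for illustration and do not come from the tool. $\tau_i(t)$ denotes the total time spent in state $i$ during $(0,t)$, and $N_{ij}(t)$ the number of transitions from $i$ to $j$ in the same interval.

```latex
\[
  CR(t) \;=\; \sum_{i} r_i \, \tau_i(t) \;+\; \sum_{i,j} \rho_{ij} \, N_{ij}(t)
\]
```

Dividing $CR(t)$ by $t$ gives the reward averaged over the interval, which is the quantity used in the examples that follow.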

Below, we illustrate the use of rewards with the M/M/1/k model. Return to the ``Model Specification Module'', and open our model (using TGIF). We specify the rewards using the Rewards attribute:

Rewards=
  rate_reward = <name of reward>
  condition =
  value =     ;
The user specifies the name of the reward, which is used as its identifier. The condition identifies the subset of states associated with the reward, and value gives the reward earned in those states. If the condition is not satisfied, the system is outside the chosen subset of states and a reward of 0 is given. Suppose that we are interested in calculating the utilization of the queue in the Server_Queue object. We can define a reward variable utilization that expresses this measure of interest:
Rewards=
  rate_reward = utilization
  condition = (Queue > 0)
  value = 1.0;
The expected value of the queue size can be specified as a rate reward as follows:
Rewards=
  rate_reward = q_size
  condition= (TRUE)
  value= Queue;
Note that the total accumulated reward in $(0,t)$ averaged over $t$ is the expected queue size in the interval.
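To make the two rate rewards concrete, the following Python sketch evaluates them for an M/M/1/k queue in the long-run limit, where the time-averaged reward equals the sum of the steady-state probabilities weighted by the reward values. It is not part of the tool: the function names and the parameter values (arrival rate `lam`, service rate `mu`, capacity `k`) are assumptions chosen for illustration, and utilization is taken here to mean the fraction of time the queue is non-empty.

```python
def mm1k_steady_state(lam, mu, k):
    """Steady-state distribution of an M/M/1/k queue over states 0..k."""
    rho = lam / mu
    weights = [rho ** n for n in range(k + 1)]  # unnormalized pi_n = rho^n
    total = sum(weights)
    return [w / total for w in weights]


def expected_rate_reward(pi, condition, value):
    """Long-run reward per unit time: sum of pi[n] * value(n) over the
    states n where condition(n) holds (reward 0 elsewhere)."""
    return sum(p * value(n) for n, p in enumerate(pi) if condition(n))


# Hypothetical parameters, chosen only for this example.
lam, mu, k = 1.0, 2.0, 3
pi = mm1k_steady_state(lam, mu, k)

# Reward 1.0 whenever the queue is non-empty.
utilization = expected_rate_reward(pi, lambda q: q > 0, lambda q: 1.0)

# Reward equal to the queue length in every state.
q_size = expected_rate_reward(pi, lambda q: True, lambda q: float(q))

print(utilization, q_size)
```

The `condition` and `value` arguments play the same roles as the condition and value fields of the Rewards attribute.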

Suppose now we are interested in counting the number of customers served in $(0,t)$. An impulse reward can be associated with the event Packet_Service and defined to have a value of 1 when this event triggers. We have:

Rewards=
  impulse_reward = served
  event= Packet_Service,1
  value= 1;
Clearly, the accumulated impulse reward in $(0,t)$, averaged over $t$, is the number of customers served per unit time.
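The long-run value of this impulse reward can be sketched in Python as well. In steady state, the expected impulse reward per unit time is the sum over states of the state probability, times the rate at which the event fires in that state, times the impulse value. As before, this is an illustration under assumed names and parameter values (`lam`, `mu`, `k`), not the computation performed by the tool.

```python
def mm1k_steady_state(lam, mu, k):
    """Steady-state distribution of an M/M/1/k queue over states 0..k."""
    rho = lam / mu
    weights = [rho ** n for n in range(k + 1)]  # unnormalized pi_n = rho^n
    total = sum(weights)
    return [w / total for w in weights]


def expected_impulse_rate(pi, event_rate, impulse):
    """Long-run impulse reward per unit time:
    sum over states n of pi[n] * event_rate(n) * impulse."""
    return sum(p * event_rate(n) * impulse for n, p in enumerate(pi))


# Hypothetical parameters, chosen only for this example.
lam, mu, k = 1.0, 2.0, 3
pi = mm1k_steady_state(lam, mu, k)

# The service event fires at rate mu only when the queue is non-empty,
# and each firing earns an impulse reward of 1.
served_per_unit_time = expected_impulse_rate(
    pi, lambda q: mu if q > 0 else 0.0, 1.0
)

print(served_per_unit_time)
```

As a sanity check, in steady state the rate of served customers must equal the rate of accepted arrivals, `lam * (1 - pi[k])`.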

The figure below shows the model with the object attributes specified.

Figure: The model with Rewards
\includegraphics[width=4in]{figuras/MM1kcompletemodel.eps}

Guilherme Dutra Gonzaga Jaime 2010-10-27