The special reward rate_reward_sum

To monitor and control a set of rewards jointly, there is a special reward, rate_reward_sum, that keeps the sum of the cumulative reward (c.r.) values of a specified set of rewards.

An important use of this feature is to model a shared buffer with more than one reward, one for each class of traffic. Each fluid class is specified as its own reward, and the total shared buffer is defined as a bound on the rate_reward_sum that monitors all classes. As with a single reward, the c.r. value of the rate_reward_sum can be bounded by a range, and the simulator maintains the correct ratio among the c.r. values of the individual rewards.

To illustrate this feature, we are going to analyze the following example:

  rate_reward_sum = buffer
  bounds = 0, B
  rewards = fluid1 + fluid2;

  rate_reward = fluid1
  cr_bounds = 0, B
  condition = (FALSE)
  value = 0;

  rate_reward = fluid2
  cr_bounds = 0, B
  condition = (FALSE)
  value = 0;

\includegraphics[width=3in]{figuras/reward_sum.eps}

Suppose that fluid1 and fluid2 represent two classes of traffic that share the same buffer of size $B$. Each fluid is bounded according to the buffer size, that is, to the range $[0,B]$. The buffer occupancy is given by a rate_reward_sum called buffer, and the behavior of the fluids depends on it. If fluid1 and fluid2 are both growing and buffer reaches the value $B$, the c.r. value of each fluid is frozen, because the buffer occupancy cannot exceed $B$.
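The freezing behavior described above can be sketched in Python. This is only an illustration under stated assumptions, not Tangram-II code: the function name `advance`, the constant positive rates and the fixed time step are all hypothetical, and only the upper bound $B$ is handled.

```python
def advance(levels, rates, bound, dt):
    """Advance every fluid by dt; freeze all of them once their sum hits bound.

    Assumption (not from the manual): rates are constant and non-negative,
    so the only event of interest is the sum reaching the upper bound.
    """
    total = sum(levels)
    net = sum(rates)
    if net > 0 and total + net * dt >= bound:
        # Move only up to the instant the shared buffer becomes full.
        t_hit = (bound - total) / net
        return [x + r * t_hit for x, r in zip(levels, rates)]
    return [x + r * dt for x, r in zip(levels, rates)]


# Hypothetical numbers: fluid1 grows at rate 1, fluid2 at rate 3, B = 8.
levels = [0.0, 0.0]
for _ in range(5):
    levels = advance(levels, [1.0, 3.0], bound=8.0, dt=1.0)
# levels is now [2.0, 6.0]: both fluids stopped when their sum reached B.
```

Once the sum equals the bound, further calls return the same levels, which mirrors the frozen c.r. values in the text.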

This is a very simple example, but the simulator also handles many other cases. This complete treatment makes the feature robust enough to represent all situations in which the global bound can affect the rewards.

A short example of a more complex case involves three rewards $a$, $b$ and $c$, where $a$ and $b$ are growing and $c$ is decreasing. If the rate_reward_sum reaches the bound $B$, the amount of c.r. released by the decreasing $c$ is distributed proportionally between the increasing $a$ and $b$. At this point, none of the rewards is frozen: $c$ continues to decrease at the same rate, while $a$ and $b$ undergo a slope change that limits their growth to the distributed rate, as shown in the figure below.

Figure: Illustration of bound affecting rewards.
\includegraphics[width=3in]{figuras/reward_sum2.eps}
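The slope change in the three-reward case can be sketched as follows. This is a minimal illustration, not Tangram-II's implementation: the function name `rates_at_upper_bound` is hypothetical, and "proportionally" is assumed here to mean in proportion to each growing reward's original rate.

```python
def rates_at_upper_bound(rates):
    """Return the effective rates while the sum sits at its upper bound.

    Assumption (not from the manual): the rate released by the decreasing
    rewards is shared among the increasing ones in proportion to their
    original rates, so the effective rates add up to zero.
    """
    inflow = sum(r for r in rates if r > 0)
    release = -sum(r for r in rates if r < 0)
    if inflow <= release:
        # The sum is not growing, so the bound does not constrain anything.
        return list(rates)
    scale = release / inflow
    # Decreasing rewards keep their rate; increasing ones are scaled down.
    return [r * scale if r > 0 else r for r in rates]


# Hypothetical rates: a = 2, b = 6 (growing), c = -4 (decreasing).
limited = rates_at_upper_bound([2.0, 6.0, -4.0])
# limited == [1.0, 3.0, -4.0]: a and b share the rate 4 released by c.
```

With no decreasing reward the released rate is zero, so the same rule gives rate 0 to every growing reward, which is the frozen case of the two-fluid example.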

NOTE: A reward whose value is summed in one rate_reward_sum must not appear in any other rate_reward_sum. If it does, Tangram-II reports the error ``Duplicated reward reference xxx. The rewards reference must be mutually exclusive.''
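The restriction in the note amounts to requiring that the reward sets of all rate_reward_sum definitions be pairwise disjoint. A minimal sketch of such a check is shown below. The function `check_mutually_exclusive` and the dictionary layout are hypothetical and do not reflect how Tangram-II performs the check internally; only the message text is taken from the note.

```python
def check_mutually_exclusive(reward_sums):
    """Raise ValueError if a reward is referenced by more than one sum.

    reward_sums maps each rate_reward_sum name to the list of rewards it
    adds up (a hypothetical in-memory layout, for illustration only).
    """
    owner = {}
    for sum_name, rewards in reward_sums.items():
        for reward in rewards:
            if reward in owner:
                raise ValueError(
                    f"Duplicated reward reference {reward}. "
                    "The rewards reference must be mutually exclusive."
                )
            owner[reward] = sum_name


# Accepted: each reward belongs to exactly one sum.
check_mutually_exclusive({"buffer": ["fluid1", "fluid2"], "other": ["fluid3"]})
```

Passing a model in which, say, fluid1 is listed under two different sums would raise the error instead.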

Guilherme Dutra Gonzaga Jaime 2010-10-27