Dynamic Refresh:

Optimal Inventory Policy for Compute Services in Enterprise IT

Executive Summary

Current server refresh policies fail to fully monetize the transformative economic characteristics of newer technologies, thereby unnecessarily inflating total compute service costs by 10-25% annually.

Current fixed-time refresh (FTR) policies rely on predetermined, fixed periods of use before servers are eligible for replacement. This constrains the egress rate at which existing servers are removed from a pool, which in turn limits the ingress rate at which new, higher-performance servers are added. The result is unnecessarily high server inventory levels, which drive even larger, far more expensive commitments in space, power, and software.

To address this problem, we propose dynamic refresh (DR), an inventory policy that governs compositional change to server pools by determining optimal ingress rates based on lowest total pool cost.  Servers are treated as consumable inventory, not fixed assets, with no age-based rules to constrain server egress.  DR is calculated using a piecewise-smooth dynamical system with jump events, which provides five desirable qualities compared to FTR.

  • First, DR administers performance as a medium of exchange (i.e., currency) to acquire other compute resources, as the rate of server refresh is governed by the financial value of performance.
  • Second, DR reflects changes to currency value, as performance gains are worth more the higher compute resource costs are relative to a server’s cost.
  • Third, DR is responsive to the arrival of higher performance gains, as more performance currency motivates higher server ingress rates.
  • Fourth, DR governs compositional change to a pool as a sequence of combinatorial transactions, as non-server resource acquisition may be funded using a mix of performance currency and hard cash.
  • Finally, DR identifies the least-total-cost sequence among the many feasible sequences that exist.

Introduction

Inventory management is a foundational building block for cost-effective compute services, realized by:

  • High rates of resource utilization
  • Minimization of resource costs

Server inventory levels drive total pool costs, yet servers are unique among compute resources for their high rates of per-unit improvement and relatively low cost.  Minimizing server inventory levels in order to consume fewer high-cost resource units therefore depends on how compositional change to server pools is executed over a series of capacity fulfillment cycles.

The inventory policy selected to govern compositional change ultimately determines inventory and cost-basis outcomes. How different policies impact costs can be illustrated by a sinusoid wave, with frequency representing capacity fulfillment cycle duration and amplitude representing inventory level.
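
As a rough numeric rendering of that analogy (the period, swing, and baseline values below are illustrative assumptions, not data from the tables that follow):

    import math

    def inventory_level(t_months, cycle_months=12.0, swing=15.0, baseline=100.0):
        """The sinusoid analogy: one wave per capacity fulfillment cycle, with the
        wave's amplitude standing in for the swing in inventory level."""
        return baseline + swing * math.sin(2 * math.pi * t_months / cycle_months)

    # Shorter cycles change pool composition more often; a smaller swing lowers the peak.
    print(max(inventory_level(t, cycle_months=12.0) for t in range(48)))                # 115.0
    print(max(inventory_level(t, cycle_months=4.0, swing=8.0) for t in range(48)))      # 108.0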

There are two basic strategies to control pool inventory levels and therefore cost-basis:

  • Increase capacity fulfillment cycle frequency
  • Increase server turnover to capture more performance currency

Although both methods are complementary, they are distinct.  Increasing capacity fulfillment cycle frequency is an adaptation of just-in-time (JIT) and Lean principles to reduce unused resources per fulfillment cycle. Specifically, JIT is a policy to receive orders for new servers into the datacenter only when needed. This increases resource utilization and compositional change frequency, which can dampen inventory peaks. However, the typical benefits of JIT are diminished by FTR policy but enhanced by DR policy.

Whether or not JIT is used, it is the compositional change policy that actually determines what a server order specification should be, and therefore the order specifications for other compute resources. Because FTR mandates a predetermined use time for servers, the amount of performance currency captured does not change: server turnover remains fixed at lower levels and is unaffected by JIT.

DR policy is not similarly constrained in turning over server inventory. Server orders are governed by how performance currency is captured and then used to minimize other compute resources. Server use duration and turnover are dynamically adjusted to optimize this outcome, or more succinctly:

DR is an inventory policy designed to minimize total cost of compositional change to server pools for a series of capacity fulfillment cycles, whereas FTR is not.

 

The rest of this whitepaper is organized into the following sections. Motivation describes market conditions, FTR deficiencies, and the cost savings potential of DR. Compute Inventory Economics identifies inventory transaction types, the objectives of each type, and how performance currency is optimized. Piecewise Inventory Model introduces how DR is executed. Piecewise Inventory Model with Jump Events shows how DR accounts for discrete events and dynamically adjusts server refresh for those events.

MOTIVATION

Once workloads were decoupled from servers, workload portability and server fungibility were created, and the value of better chip technology has been up for grabs ever since. This “disaggregation” has disrupted existing business models and unleashed new competition. Suppliers of all stripes seek to co-opt this value from enterprise IT through newly fashioned converged infrastructure offerings (i.e., re-aggregation), whether in the form of hardware and software bundled as an on-site product, a public cloud service like IaaS, or some variation of the two (i.e., hybrid cloud).

FTR policies hobble enterprises’ ability to compete and capture the value of better chip technology for themselves. Whether new servers offer 10% or 100% better performance, FTR is equally unresponsive, restricting opportunities to harvest more value from space, power, and software spend. Software alone now exceeds server cost by 2-30 times, and fees are often correlated to server performance. Intuitively, it is simply less expensive to refresh servers more often, and at faster rates, than to pay up for more software resources.

DR bridges the gap from intuition to execution with a new methodology that identifies specific pool ingress and egress transactions without predetermining a fixed-time of server use (i.e., model input). Instead, server use times are results (i.e., model output) that derive from, among thousands of feasible transactional combinations, the lowest total cost solution.

To illustrate the savings potential, DR is compared to FTR and summarized in Table I and Table II. Key assumptions include:

  • 10% annualized speed improvement that arrives at three discrete times over four years (i.e., 10% after 1.5 years, 20% after 2.5 years, and 10% after 4 years)
  • 10% annual service demand growth
  • Software costs that approximate 3x and 5x server costs, indicated as cost ratio (K)
  • Capacity update intervals of four months, executed in a JIT manner

Table I and II show the cost savings of DR compared to 3-year and 4-year FTR, and illustrate a common pattern with DR: performance currency value increases as the cost ratio (K) increases, and when that value is optimally captured, significant savings are realized.

Table I: Cost savings of DR compared to 3-year FTR
Table II: Cost savings of DR compared to 4-year FTR

COMPUTE INVENTORY ECONOMICS

Compute inventory economics are governed by add transactions and exchange transactions.  Add transactions increase pool inventory and consume more datacenter space, power, and software resources. All are paid for in hard cash, which requires formal authorization such as a purchase order to execute. In contrast, exchange transactions involve a trade, which acts as a conduit for performance currency capture and consumption.  Ordered servers necessary to execute the trade are paid for in hard cash, whereas other resource costs may increase, decrease, or remain the same depending on trade execution.

Value realized from trade execution derives from server use duration. Longer server use periods generally yield higher net performance gains when servers are finally refreshed, and they spread server costs over longer periods of time, which lowers annualized server cost. However, longer use periods also produce higher server inventory levels, which increases the cost of the other resources the pool must commit to. Optimizing value (i.e., performance currency) is therefore a matter of maximizing performance gain while minimizing server inventory levels, weighted by their respective costs, across a series of transactions.
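
The sketch below renders these two transaction types and the trade-off just described in code. It is a minimal illustration only: the class names, the $10,000 server price, and the assumption that other-resource commitments scale linearly with the server footprint are assumptions made here, not part of the DR specification.

    from dataclasses import dataclass

    @dataclass
    class AddTransaction:
        """Grow the pool: new servers plus incremental space, power, and software."""
        servers_added: int
        server_cost: float       # hard cash per server
        cost_ratio_k: float      # other resource cost as a multiple of server cost

        def cash_outlay(self) -> float:
            # Every added server also commits roughly (K x server cost) of other resources.
            return self.servers_added * self.server_cost * (1.0 + self.cost_ratio_k)

    @dataclass
    class ExchangeTransaction:
        """Trade existing servers for new, faster ones; the conduit for performance currency."""
        servers_removed: int
        servers_added: int
        perf_gain: float         # fractional speed-up of new servers vs. those removed
        server_cost: float
        cost_ratio_k: float

        def cash_outlay(self) -> float:
            # Only the ordered servers are paid in hard cash; other resource costs rise,
            # fall, or stay flat with the net change in footprint.
            footprint_delta = self.servers_added - self.servers_removed
            return (self.servers_added * self.server_cost
                    + footprint_delta * self.cost_ratio_k * self.server_cost)

        def capacity_gained(self) -> float:
            # Net capacity change, in units of existing-server capacity.
            return self.servers_added * (1.0 + self.perf_gain) - self.servers_removed

    # Two ways to absorb ~12.5 units of demand growth at K = 5 with 25%-faster servers:
    add = AddTransaction(servers_added=10, server_cost=10_000, cost_ratio_k=5.0)
    swap = ExchangeTransaction(servers_removed=50, servers_added=50, perf_gain=0.25,
                               server_cost=10_000, cost_ratio_k=5.0)
    print(add.cash_outlay())                            # 600,000: mostly space/power/software
    print(swap.cash_outlay(), swap.capacity_gained())   # 500,000 for +12.5, footprint unchanged

In this simplified accounting, the exchange route wins at high K by avoiding the K-scaled commitments even though five times as many servers are purchased; at low K the extra server spend would dominate and adding would be cheaper.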

For example, a public IaaS provider may be motivated to refresh servers sooner, at faster rates, to both avoid adding a new datacenter and increase revenue per unit of datacenter space. Performance currency is obviously worth more if revenue-to-cost is 10:1 versus 2:1.  The same holds true for enterprise IT: the higher the cost ratio (K), the more financial incentive exists to speed up server refresh in order to avoid additional space, power, and software costs.  Both examples illustrate that server refresh serves three important functions:

  • (Existing) Removal of non-functional servers
  • (Existing) Uptake of new server features
  • (New) Optimization of performance currency capture and consumption to drive enterprise value

Unfortunately, few enterprise IT organizations optimize exchange transactions for enterprise value, putting them at a competitive disadvantage to the more advanced inventory management practices of large public clouds.

To illustrate the value potential of exchange transactions, Figure I shows a breakeven function (continuous form) which plots performance improvement rate (P) against cost ratio (K). The area below the time-based plot lines indicates it would cost less to add new servers to a pool without exchange, whereas the area above the plot lines indicates it would cost less to exchange some unknown quantity of new servers for some unknown quantity of existing servers. Optimal refresh times and rates therefore lie somewhere above the plot lines.
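
Figure I's exact breakeven function is not reproduced here. The sketch below uses one deliberately simplified, assumed breakeven condition (consistent with the simplified transaction sketch earlier) purely to show the qualitative shape: as the cost ratio K rises, the performance improvement P required to justify exchange falls.

    def breakeven_p(k: float) -> float:
        """Smallest performance gain P at which exchanging beats adding (assumed model).

        Assumed cost per unit of new capacity:
          add:      (1 + K) x server cost per (1 + P) units  (server plus other resources)
          exchange: 1 x server cost per P units              (footprint unchanged)
        Setting the two equal and solving gives P = 1 / K.
        """
        return 1.0 / k

    for k in (1, 2, 3, 5, 10, 30):
        print(f"K = {k:>2}x  ->  breakeven P = {breakeven_p(k):.0%}")

Points above that curve correspond to the region above Figure I's plot lines, where some exchange of existing servers for new ones costs less than pure addition.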

Since server performance improvement does not arrive in the datacenter in continuous form, and a continuous model does not accurately describe the nature of a pool’s cost structure (with the exception of electricity), DR is implemented using a piecewise dynamical system.

PIECEWISE INVENTORY MODEL

To model compositional change to a group of servers, the group is treated as a container whose state is evolved over time in a piecewise fashion. Evolution of container state can be described as a queuing process along three stages:

Stage I represents ingress, the rate at which new servers are added to a pool. Stage II represents pool operation and the use interval (i.e., duration, or wait-time). Stage III represents egress, the rate at which servers are removed from a pool. Use intervals are constrained by lower and upper utilization boundaries to define complete capacity cycles.
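
A minimal sketch of that container, assuming a cohort-based representation (class and method names here are illustrative; a fuller implementation would also track utilization boundaries and the cost terms discussed earlier):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class ServerCohort:
        """A batch of servers that entered the pool together."""
        count: int
        relative_perf: float    # performance relative to the pool's original baseline (1.0)
        age_months: int = 0

    @dataclass
    class PoolContainer:
        """A server pool evolved piecewise through ingress, operation, and egress."""
        cohorts: List[ServerCohort] = field(default_factory=list)

        def ingress(self, count: int, relative_perf: float) -> None:
            """Stage I: add new servers to the pool."""
            self.cohorts.append(ServerCohort(count, relative_perf))

        def operate(self, months: int = 1) -> None:
            """Stage II: advance every cohort's use interval (wait-time)."""
            for c in self.cohorts:
                c.age_months += months

        def egress(self, count: int) -> int:
            """Stage III: remove servers, lowest-performing cohorts first."""
            removed = 0
            for c in sorted(self.cohorts, key=lambda c: c.relative_perf):
                take = min(count - removed, c.count)
                c.count -= take
                removed += take
                if removed == count:
                    break
            self.cohorts = [c for c in self.cohorts if c.count > 0]
            return removed

        def inventory(self) -> int:
            return sum(c.count for c in self.cohorts)

        def capacity(self) -> float:
            return sum(c.count * c.relative_perf for c in self.cohorts)

    pool = PoolContainer()
    pool.ingress(100, 1.0)     # a new 100-server pool at baseline performance
    pool.operate(12)           # one year of use
    pool.ingress(10, 1.10)     # Stage I: ten servers of a 10%-faster generation
    pool.egress(9)             # Stage III: retire nine of the slowest servers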

To illustrate how the cost structure of a new, 100 server pool impacts server refresh and total costs, a three-year FTR scenario is compared to a DR scenario, with key variables still expressed in continuous/variable form:

  • Server performance improvement (P) is 10% annually in continuous form
  • Service demand rate is 10% annually in continuous form
  • Cost ratio (K) is 3x server cost and treated as a monthly variable cost
Figure II: Server inventory levels under 3-year FTR compared to DR

Although server performance and demand rates are both 10%, server inventory levels continue to increase until after the three-year wait-time specified for FTR, whereas DR curtails peak growth earlier. By constraining server removal, FTR peak inventory hits 128 servers before decreasing, whereas DR inventory peaks at 116. The FTR removal constraint also produces 27% longer inventory duration. Finally, both scenarios result in the same net cost; that is because the cost ratio (K) was modeled as a purely monthly variable cost, solely to illustrate how peak inventory levels are driven by different refresh policies.
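
The following self-contained sketch evolves a pool monthly under those continuous-form assumptions. The fixed exchange threshold used for the DR-style run is a crude heuristic standing in for DR's cost optimization, and the rounding and removal rules are assumptions, so the peaks it prints will approximate rather than reproduce the 128 and 116 figures above.

    import math

    MONTHS = 72                 # six-year horizon, monthly steps
    START = 100                 # new 100-server pool
    DEMAND_GROWTH = 0.10        # 10% annually, continuous form
    PERF_GROWTH = 0.10          # 10% annually, continuous form

    def peak_inventory(min_use_months, exchange_gain=None):
        """Evolve the pool month by month and return its peak server count.

        min_use_months: servers may not leave the pool before this age (the FTR rule).
        exchange_gain:  if set, retire any eligible server once the current generation
                        is at least this much faster; a heuristic stand-in for DR's
                        cost optimization, not DR itself.
        """
        fleet = [(1.0, 0) for _ in range(START)]    # (relative perf, age in months)
        peak = len(fleet)
        for m in range(1, MONTHS + 1):
            demand = START * math.exp(DEMAND_GROWTH * m / 12.0)   # required capacity
            gen_perf = math.exp(PERF_GROWTH * m / 12.0)           # newest generation speed
            fleet = [(p, age + 1) for p, age in fleet]
            # Egress: apply the policy's removal rule.
            if exchange_gain is None:
                fleet = [(p, age) for p, age in fleet if age < min_use_months]
            else:
                fleet = [(p, age) for p, age in fleet
                         if age < min_use_months or gen_perf / p < 1.0 + exchange_gain]
            # Ingress: add current-generation servers until demand is met again.
            capacity = sum(p for p, _ in fleet)
            while capacity < demand:
                fleet.append((gen_perf, 0))
                capacity += gen_perf
            peak = max(peak, len(fleet))
        return peak

    print("3-year FTR peak:", peak_inventory(36))
    print("DR-style peak:  ", peak_inventory(0, exchange_gain=0.15))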

In practice, required commitment levels in data center equipment, space, power, and software fees are determined by peak inventory levels, are incurred as semi-fixed and fixed costs, and are often higher than 3x server cost. Therefore, the net costs for these examples would not be equal in practice: FTR would cost more than DR, and the cost discrepancy would rapidly increase as the performance rate (P) and/or cost ratio (K) increased.
To better account for these real-world dynamics, the container model needs to incorporate discrete events.

PIECEWISE INVENTORY MODEL with JUMP EVENTS

Capturing better performing servers depends on product release cycles, which arrive at uneven but reasonably predictable times. To account for these “jumps” in performance at discrete times, the model injects “jump events” into the piecewise evolution of capacity cycles as shown in Figure III.

Figure III: Jump events injected into the piecewise evolution of capacity cycles

As capacity fulfillment cycles are evolved over time, the container’s state is adjusted to reflect discrete events. Evolution of capacity fulfillment cycles continues, post jump event, with the updated state. The same injection technique applies to a range of discrete events, such as changes in server configuration, demand, prices, and license terms. Jump events are easily identified and provide a concise way to describe discrete event parameters in order to simulate and test how those events impact future inventory levels and required cost commitments in other compute resources.
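
A minimal sketch of the injection technique, assuming a simple event record and dispatch (the names and fields below are illustrative):

    from dataclasses import dataclass

    @dataclass
    class JumpEvent:
        """A discrete change injected into the piecewise evolution."""
        month: int
        kind: str         # e.g. "perf_jump", "demand_jump", "price_change"
        value: float      # fractional change

    @dataclass
    class ContainerState:
        """The slice of container state these example events touch."""
        generation_perf: float = 1.0   # speed of the best available server generation
        demand: float = 100.0          # required capacity
        cost_ratio_k: float = 5.0      # other-resource cost relative to server cost

    def apply_event(state: ContainerState, event: JumpEvent) -> None:
        """Adjust container state at a discrete time; piecewise evolution then
        resumes with the updated state."""
        if event.kind == "perf_jump":
            state.generation_perf *= 1.0 + event.value
        elif event.kind == "demand_jump":
            state.demand *= 1.0 + event.value
        elif event.kind == "price_change":
            state.cost_ratio_k *= 1.0 + event.value   # shifts performance currency value

    # The discrete speed-improvement schedule used in the scenarios below.
    events = [JumpEvent(12, "perf_jump", 0.10),
              JumpEvent(30, "perf_jump", 0.20),
              JumpEvent(48, "perf_jump", 0.10)]

    state = ContainerState()
    for month in range(1, 61):
        for event in (e for e in events if e.month == month):
            apply_event(state, event)
        # ... evolve the capacity fulfillment cycle for this month with `state` ...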

To illustrate, consider the growing size of today’s software stack. Large enterprises already incur significant costs for perpetual license and support fees, and most continue to increase stack size with investments in development and operations automation (DevOps). Whether proprietary or open source, software-related fees continue to grow, and most are correlated to rack, server, processor, or core counts.

At first glance, software fees appear to follow a typical mixed “fixed and variable” cost structure. In practice, however, most software agreements strongly incent enterprise customers to continue paying support fees, even when licenses are not used, in order to retain entitlement or support rights. There is nothing inherently bad about such agreements; FTR policies simply lead to over-provisioning of these resources because FTR is biased toward server costs and depreciation schedules rather than performance gain or other compute resource costs.

It is therefore common for FTR to yield only marginal cost savings in software licensing fees over the short to mid term. This can be illustrated using the DR model for a new, 100 server pool with the following assumptions (a quick software cost check follows the list):

  • Server configuration: 2x8
  • Three year FTR
  • Server performance improvement (P) arrives as discrete speed improvement events as follows:
    • 10% at year 1
    • 20% at year 2.5
    • 10% at year 4
  • Service demand growth rate is 10% annually
  • Cost ratio (K) is 5x for software based on perpetual licenses and maintenance fees
    • $3000 per core with 25% maintenance fee
    • $1500 per processor with 25% maintenance fee
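
A quick check on the software figures above (the roughly $10,000 server price used to relate the total back to the cost ratio K is an assumption; it is not stated in the text):

    CORES_PER_SERVER = 2 * 8        # 2x8 configuration: two processors, eight cores each
    PROCESSORS_PER_SERVER = 2
    PER_CORE_LICENSE = 3_000        # perpetual license, per core
    PER_PROC_LICENSE = 1_500        # perpetual license, per processor
    MAINTENANCE_RATE = 0.25         # annual maintenance as a share of license cost

    license_per_server = (CORES_PER_SERVER * PER_CORE_LICENSE
                          + PROCESSORS_PER_SERVER * PER_PROC_LICENSE)    # 48,000 + 3,000
    maintenance_per_server = MAINTENANCE_RATE * license_per_server       # 12,750 per year

    pool_size = 100
    print(f"License per server:     ${license_per_server:,}")                  # $51,000
    print(f"Maintenance per server: ${maintenance_per_server:,.0f} per year")  # $12,750
    print(f"Pool of {pool_size}: ${pool_size * license_per_server:,} in licenses, "
          f"${pool_size * maintenance_per_server:,.0f} per year in maintenance")

    # Against an assumed server price on the order of $10,000, the perpetual
    # license alone is roughly five times the server cost, which is broadly
    # consistent with the cost ratio (K) of 5x listed above.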

Scenarios A and B are based on three-year FTR. After three years, Scenario A shows the expected sinusoid wave as existing servers are exchanged for new ones. Scenario B restrains economically unnecessary exchanges that increase total costs, producing 20% lower ingress and 24% lower egress rates than Scenario A. Scenario A is marginally more expensive than Scenario B because more servers were refreshed, which led to no subsequent cost avoidance or reduction in other compute resources from years three (3) to five (5).

Whether organizations faithfully adhere to a three year FTR policy, or drag out refresh over time, FTR often provides little immediate financial reward.

Scenario A   Scenario B

In contrast, Scenario C uses DR and produces 12% lower peak inventory, a 60% higher ingress rate, and an 88% higher egress rate. Total annualized costs for Scenario C are 10% lower than those of either Scenario A or B.

Scenario C

If all three scenarios were based on four year FTR, the pattern would hold the same and the total cost advantage of DR over FTR would expand to 17% annually. Table I and II in the Motivation section further enumerate financial savings with extrapolations for different server inventory levels above 100.

CONCLUSIONS

Dynamic Refresh was introduced and compared to Fixed-Time Refresh. FTR was shown to be less responsive in the near and mid term to the arrival of higher performing servers due to wait rules that constrain server egress, which in turn limits server ingress. This causes higher peak inventory levels, which force higher quantities of other compute resources, many of which significantly exceed the cost of a server.

In contrast, DR was shown to be more responsive to the arrival of higher performing servers, to convert better chip technology into higher rates of cost avoidance or reduction, and to realize lower total compute costs than FTR policies. DR is applied individually to server pools, which incorporates their unique attributes. This rewards the increasingly specialized server and software investments made by enterprises to advance particular service capabilities, exceed public cloud service quality, and enhance stakeholder competitiveness. Pool segmentation also enables enterprises to selectively target high-cost services for DR and roll it out incrementally across the server estate as appropriate.

Inventory management is a foundational building block of cost-effective compute services and data center modernization, and the manner in which server inventory is managed and refreshed significantly impacts total compute service costs. DR enables enterprises to capture higher ROI from compute resources and sourcing strategies, from software, facilities, and power to hybrid cloud implementations, by continuously right-sizing compute resources and future commitments to reflect high capture rates of performance currency from better chip technology.