According to Gurobi, 85% of Fortune 500 companies used optimization in 2020, a significant increase from the 5% reported by Gartner in 2013. Optimization has consequently become a highly coveted tool that can give an organization a competitive edge. Because few professionals have the specialized training needed to develop optimization models, it can also strengthen the toolkit of any data scientist. Yet it is more challenging than other tools typically used in data science. For example, regression coefficients can easily be extracted from tabular data with a standard library in Python, R, or similar software. Optimization software works differently: excellent tools are available for solving optimization models once they are built, but the model developer needs strong model-construction skills. Software cannot perform this task on its own because optimization models are tailored to each specific scenario, so human insight and business acumen are essential. While different models share foundational elements, proficiency in optimization comes through dedicated practice and familiarity with the nuances of business problems.

What is Optimization?

Optimization underlies nearly all data science techniques (think regression, neural networks, clustering, etc.) because most statistical and machine learning methods use optimization to find parameter values. However, optimization is an important topic in its own right. In short, it tells us:

>“What’s the best course of action to take, given what I know and what I can predict.”

The diagram below illustrates the integration of optimization techniques within the broader domain of analytics, encompassing various data-driven processes and decision-making tasks.

  ANALYTICS ->   
      OPTIMIZATION TECHNIQUES ->  
          (a) Decision Optimization   
          (b) Resource Allocation   
          (c) Process Optimization

Analytics: The outermost layer, representing the broad domain of analytics, which involves the analysis of data to derive insights and support decision-making across various domains and industries.
Optimization: The core layer that highlights the role of optimization methods in enhancing analytics processes. These techniques are crucial for:

Decision Optimization: Applying optimization algorithms to make data-driven decisions and solve complex decision-making problems efficiently.
Resource Allocation: Optimizing the allocation of resources, such as budget, manpower, and inventory, to maximize efficiency and minimize costs.
Process Optimization: Improving operational processes and workflows by optimizing parameters and constraints to achieve desired outcomes.

Some of the questions answered by optimization are:

How should a machine process different inputs to meet client demands, adhering to deadlines and accounting for failure and error rates?
How can transplantable organs be allocated among patients while meeting medical requirements, balancing the needs and benefits of recipients, and respecting time constraints?
How can a cattle producer maximize profit while ensuring the right mix of nutrients for the cattle and respecting the amount customers are willing to pay for a pound of meat?
How should two tracts of public land be allocated to various uses to maximize their profit and sustainability given the labor needs to maintain each acre, the potential profit and environmental services provided to visitors?
How can the local police department schedule police officers to minimize the number whose days off are not consecutive?
How can an airline minimize fuel cost considering the differences in fuel cost across cities and the length in miles of each trip segment?

While optimization is a powerful tool capable of providing answers independently, we will often find it being utilized in conjunction with descriptive and predictive inquiries. For example,

Before recommending the best traffic route, I need to predict what traffic will be like.
When optimizing production schedules in a manufacturing plant, it’s crucial to forecast demand accurately beforehand.
In financial portfolio optimization, predicting future market trends and grouping various types of assets is essential before selecting the optimal investment strategy.
Before optimizing staffing levels in a retail store, it’s important to forecast customer foot traffic for specific store zones to ensure adequate coverage during peak hours.

Note: The word optimization can mean many things in daily life: an optimized user experience, an optimized morning routine, optimized sleep, optimized advertising campaigns, etc. All of these phrases convey an idea of improvement, but in strict technical terms, to optimize means “to find the best solution”. That is, the term assumes that there is a single best solution. In practice this is difficult to achieve and often impossible, so what we aim for is to get as close as possible to the optimal state. Even when we cannot reach an optimal solution, we still refer to that search process as optimization.

***
To explore the significant impact of optimization and learn about the history of Operations Research, you may want to check out my article “The Science of Winning”, available through ORMSToday.
***

A quick overview of optimization modeling

In the simplest case, an optimization problem consists of maximizing or minimizing a real function representing an OBJECTIVE by systematically choosing values for a set of variables (DECISION VARIABLES) which may be related through one or more CONSTRAINTS.
Since optimization uses mathematical models to represent reality, we sometimes simply use the term mathematical programming. Specifically, a classical optimization problem is one in which the objective and constraints have the following form:

$Z = F(x_{1}, x_{2}, \dots, x_{n})$

Subject to (constraints):
$g_{1}(x_{1}, x_{2}, \dots, x_{n}) \le b_{1}$
$g_{2}(x_{1}, x_{2}, \dots, x_{n}) \le b_{2}$
$\dots$
$g_{m}(x_{1}, x_{2}, \dots, x_{n}) \le b_{m}$
$x_{1}, x_{2}, \dots, x_{n} \ge 0$

$Z$ is the function representing the objective, which we want to minimize or maximize. Each of the $g$ functions is called a constraint, and the $m$ constraints are bounded by the $b$ constants on the right through a less-than-or-equal-to, equal-to, or greater-than-or-equal-to sign. The constraints in the last row indicate that all decision variables must take nonnegative values.

The optimization problem includes:
Objective function: The expression that defines the quantity to be maximized or minimized.
Decision variables: One or more controllable inputs.
Constraints: Restrictions that limit the settings of the decision variables.
Nonnegativity constraints: A set of constraints that requires all variables to be nonnegative.
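
As a minimal sketch of how these components map onto code, the snippet below solves a tiny product-mix problem with SciPy's `linprog`. The numbers and variable names are invented for illustration, and the use of SciPy is an assumption made only for this example. Because `linprog` minimizes, the maximization objective is negated.

```python
from scipy.optimize import linprog

# Hypothetical data, invented for illustration.
# Objective function: maximize Z = 3*x1 + 5*x2.
# linprog minimizes, so we pass the negated coefficients.
objective = [-3, -5]

# Constraints in the form g_i(x) <= b_i.
constraint_matrix = [
    [1, 0],  # x1          <= 4
    [0, 2],  # 2*x2        <= 12
    [3, 2],  # 3*x1 + 2*x2 <= 18
]
constraint_limits = [4, 12, 18]

# Nonnegativity constraints: each decision variable is >= 0, no upper bound.
bounds = [(0, None), (0, None)]

result = linprog(
    objective,
    A_ub=constraint_matrix,
    b_ub=constraint_limits,
    bounds=bounds,
    method="highs",
)

best_plan = result.x      # values chosen for the decision variables
best_value = -result.fun  # undo the sign flip to recover the maximum
print(result.success, best_plan, best_value)
```

With these invented numbers the solver should report $x_{1} = 2$, $x_{2} = 6$ and $Z = 36$. Note that the solver only does the solving; choosing the objective, variables and constraints remains the modeler's job.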

Let’s next review the step-by-step process that operations researchers use to develop and deploy optimization models. Alternatively, for a practical example, head directly to this other article.

Steps to build and deploy optimization models

Implementing an optimization model involves several steps, from problem formulation to model implementation, solution analysis, and scaling. But, just like with any other analytics solution, we start by choosing the right areas of opportunity. This step is critical and largely determines the project’s level of success. If possible, multiple stakeholders should be involved, and priorities should be established in terms of business impact and feasibility. Best practices for validating product ideas should also be used, along with simple heuristics to establish a baseline model or entry point to gain business support.
Once this is done, and optimization has been chosen as an adequate method to solve the business problem, the following steps will help build and deploy an optimization model:

  1. Problem definition.  
  2. Formulate the real model as a mathematical model.  
  3. Choose an appropriate solver and modeling environment.  
  4. Model building or development.  
  5. Solve the model.  
  6. Analyze the solution.  
  7. Model validation and sensitivity analysis.  
  8. Report results and make decisions.  
  9. Iterate, improve and scale.

Step 1. Problem definition.

Identify the Objective: Clearly define what you want to optimize (e.g., minimize costs, maximize profits).
Specify Constraints: Determine the constraints that must be satisfied (e.g., resource limitations, demand requirements).
Gather Data: Collect the necessary data required for the model (e.g., costs, capacities, demands).

Keep in mind that:
> A model is an abstraction that emphasizes certain aspects of reality to assess or understand the behavior of a system under study.

So, at this stage, it is crucial to determine if our theoretical representation of the real system will capture the system elements that will lead to improved decision making and that can be practically represented as a mathematical model. On top of this, you may want to consider some of the nuances of model building.

Step 2. Formulate the real model as a mathematical model.

As indicated before, defining a mathematical model involves three parts:

Define Decision Variables: Identify the variables that will be manipulated to achieve the objective.
Formulate the Objective Function: Express the objective in terms of the decision variables.
Develop Constraints: Write the constraints as mathematical expressions involving the decision variables.
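
As a hedged illustration of these three parts, consider a hypothetical workshop that builds tables and chairs. All names and numbers below are invented for this sketch and are not taken from a real case.

Decision variables: $x_{1}$ = number of tables built per week, $x_{2}$ = number of chairs built per week.

Objective function (weekly profit, assuming a profit of 70 per table and 50 per chair):
$\max Z = 70x_{1} + 50x_{2}$

Constraints (assuming 240 carpentry hours and 100 finishing hours are available per week):
$4x_{1} + 3x_{2} \le 240$
$2x_{1} + x_{2} \le 100$
$x_{1}, x_{2} \ge 0$

Each line maps to one part of the list above: what we control, what we want to maximize, and what limits us.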

Beyond these three parts, the modeler can choose among different types of mathematical MODELS or “math programs” to solve the chosen business problem:

in Linear Programming (LP) both the objective and the constraint functions are linear and the decision variables assume continuous or fractional values,
in Integer Programming (IP) all of the variables assume discrete or integer values and all the functions are linear,
in Nonlinear Programming (NLP) some of the constraints or the objective function are nonlinear,
in Quadratic Programming (QP) the objective has a quadratic form while the constraints are linear,
in Stochastic Programming (SP) randomness may be present in the objective, in the constraints, or in both, because some parameters are uncertain and are modeled as random variables; solution methods often work with sampled scenarios of that uncertainty.

Moreover, there are hybrids of some of these fundamental forms. In Mixed Integer Linear Programming (MILP), the equations are linear, but some decision variables take continuous values while others take integer values. About 90% of optimization models used in practice are MILP.
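
To see how the variable type changes the answer, here is a small sketch that solves one model three ways: as an LP, as an IP, and as a MILP. The data are invented for illustration, and the sketch assumes SciPy 1.9 or later, where `linprog` accepts an `integrality` argument (0 marks a continuous variable, 1 an integer one).

```python
from scipy.optimize import linprog

# Hypothetical data: maximize 5*x1 + 4*x2 (negated because linprog minimizes).
objective = [-5, -4]
constraint_matrix = [
    [6, 4],  # 6*x1 + 4*x2 <= 24
    [1, 2],  #   x1 + 2*x2 <= 6
]
constraint_limits = [24, 6]


def solve(integrality):
    """Solve the same model; only the variable types change."""
    return linprog(
        objective,
        A_ub=constraint_matrix,
        b_ub=constraint_limits,
        bounds=[(0, None), (0, None)],  # nonnegativity
        integrality=integrality,
        method="highs",
    )


lp = solve([0, 0])     # LP: both variables continuous
ip = solve([1, 1])     # IP: both variables integer
mixed = solve([0, 1])  # MILP: x1 continuous, x2 integer

for name, res in [("LP", lp), ("IP", ip), ("MILP", mixed)]:
    print(name, res.x, -res.fun)
```

With these numbers the LP optimum is 21 at $(3, 1.5)$, the IP optimum is 20 at $(4, 0)$, and the MILP optimum is about 20.67 at $(10/3, 1)$. Rounding the LP solution down to $(3, 1)$ would give only 19, which suggests why integer models are usually solved directly rather than by rounding.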

Formulating a mathematical model is an art. It deeply reflects the modeler’s mathematical toolkit and her ability to translate a real-world problem into an effective math representation.

Alternative classifications of optimization problems:

By the type of managerial issue: O.R. researchers have similarly classified optimization problems according to the type of managerial issue that arises in practice. So, Job Shop Scheduling Problems deal with finding a schedule that minimizes the total processing time for a given set of jobs and a set of machines having different processing times. In the Knapsack Problem, for a given set of items, we determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible (think of a delivery truck wanting to optimize weight and reduce shipping costs). The Traveling Salesman Problem (TSP) is another widely known problem in optimization which involves finding the shortest possible route that visits each city exactly once and returns to the starting city. As you can see, these problems are found in many types of systems and business situations.
By the number of objectives: One more variation of optimization models is given by the number of objectives. The classical optimization problem considers only one objective function, but we may have two objective functions, in which case we talk about Bicriteria Mathematical Programming (BCMP), or more than two, in which case we have Multicriteria Mathematical Programming (MCMP).
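
To make one of the named problems concrete, here is a minimal sketch of the Knapsack Problem mentioned above. It assumes the common 0/1 variant (each item is either loaded once or left out) and solves it with a standard table-filling approach. The function name and the item data are hypothetical.

```python
def knapsack(weights, values, capacity):
    """Return (best total value, sorted indices of chosen items) for 0/1 knapsack."""
    n = len(weights)
    # best[i][w] = best value using only the first i items with weight limit w
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        item_weight, item_value = weights[i - 1], values[i - 1]
        for w in range(capacity + 1):
            best[i][w] = best[i - 1][w]  # option 1: leave the item out
            if item_weight <= w:         # option 2: include it, if it fits
                with_item = best[i - 1][w - item_weight] + item_value
                best[i][w] = max(best[i][w], with_item)

    # Walk back through the table to recover which items were chosen.
    chosen, w = [], capacity
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            chosen.append(i - 1)
            w -= weights[i - 1]
    return best[n][capacity], sorted(chosen)


# Hypothetical delivery-truck example: item weights, item values, weight limit.
best_value, chosen = knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7)
print(best_value, chosen)
```

For this invented data the best load is worth 9 and uses the items at indices 1 and 2 (weights 3 and 4). The integer weights and capacity are an assumption of this sketch; the table approach relies on them.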

Other mathematical optimization approaches worth mentioning are the following:

Dynamic Programming (DP) covers problems that can be solved as a sequence of interrelated subproblems. Inventory control and portfolio management are common problems solved through dynamic programming.
Constraint Programming (CP) does not require an explicit objective function to be optimized but has a large number of constraints that must be satisfied in order to find a feasible solution. For example, scheduling problems often involve finding a feasible schedule that satisfies all of the constraints without necessarily optimizing for a specific objective.
Network Programming (NP) involves finding the most efficient way to use and allocate resources within a network or system. This can include optimizing the flow of goods, information, or services through a transportation network, or the allocation of resources in a telecommunications network, among other applications. The goal here is typically to minimize costs or maximize efficiency while meeting specific constraints and objectives.
Game Theory determines the best course of action for each player or opponent in a given situation or game in order to maximize their chances of success, while taking into account the actions of the other players. For this, the interactions and interdependencies among the players are formulated using a mathematical model to find the optimal solutions.
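
As a rough illustration of the constraint-programming idea described above, where the goal is feasibility rather than an optimum, the sketch below searches for schedules that satisfy every constraint. It uses plain exhaustive enumeration, not the propagation and search techniques of a real CP solver, and the tasks, slots and rules are invented for the example.

```python
from itertools import product

# Hypothetical scheduling data: three tasks, three time slots.
tasks = ["A", "B", "C"]
slots = [1, 2, 3]


def is_feasible(schedule):
    """Check every constraint; there is no objective function to optimize."""
    return (
        len(set(schedule.values())) == len(schedule)  # one task per slot
        and schedule["A"] < schedule["B"]             # A must run before B
        and schedule["C"] != 1                        # C cannot use slot 1
    )


def find_feasible_schedules():
    """Enumerate every assignment of tasks to slots and keep the feasible ones."""
    feasible = []
    for assignment in product(slots, repeat=len(tasks)):
        schedule = dict(zip(tasks, assignment))
        if is_feasible(schedule):
            feasible.append(schedule)
    return feasible


schedules = find_feasible_schedules()
print(schedules)
```

Here two schedules satisfy all of the rules, and either one would be an acceptable answer. Enumeration only works for toy sizes, since the number of assignments grows exponentially with the number of tasks.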

Continue to next steps…