In the world of analytical modeling, there are several solutions that use different cost modeling methodologies. To understand how easyKost works, it is important to understand the high-level choices a solution can adopt for modeling costs, and their benefits.

First, let’s introduce easyKost. It is a costing, data mining, estimation, and optimization SaaS software that determines the cost of new products or services in just a few seconds by exploiting the richness of your data, without requiring any in-depth knowledge of industrial technologies and processes. It uses a non-linear machine learning algorithm, based mainly on Random Forests, to estimate costs.

The 4 main cost modeling methodologies are:

  • Knowledge bases / Case-based reasoning: In this top-down approach, the user relies on a database of previous products or services whose attributes are similar to the one being costed. The user specifies the attributes of the new product or service, and the case-based reasoning algorithm interpolates between similar objects in the database to provide a cost estimate for the new object
  • Mechanistic bottom-up costing: The price of the part or service is built up from the physical quantity of each material, the labor content and machine time, the use of tooling… These cost items are valued at present rates, without reference to any past price or cost. Such models require understanding the actual relationships that drive the use of physical resources, such as time, tooling, and mass: they first calculate these intermediate quantities (time, tooling, mass…), then convert them into cost through a series of financial rates
  • Cost modeling by indexation: In this top-down methodology, the user starts from a past quote or price for the product or service. The user then breaks the cost down into its constituent components and links each raw material to a product index. This can also cover the cost of services, such as labor, which can be tied to related indices such as the CPI or the average labor rate
  • Statistical modeling methods: In this top-down method, a population of past data is analyzed. The modeler provides a matrix of independent variables (cost drivers) for each product or service, together with the output variables of interest. The system builds an empirical relationship between the independent and the dependent variables. A new set of cost drivers, representing a new product or service, is then introduced, and the model calculates the corresponding estimated cost
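The statistical modeling workflow can be sketched in a few lines with scikit-learn. The cost drivers (mass, machining time) and historical costs below are hypothetical illustration data, and the linear model is only a generic stand-in for "building an empirical relationship":

```python
# A minimal sketch of the statistical modeling workflow, assuming hypothetical
# cost drivers and prices (not real easyKost data).
import numpy as np
from sklearn.linear_model import LinearRegression

# Historical population: one row per past product, columns are cost drivers.
X_history = np.array([
    [1.0, 10.0],   # mass (kg), machining time (min)
    [2.0, 15.0],
    [3.0, 25.0],
    [4.0, 30.0],
])
y_cost = np.array([120.0, 180.0, 270.0, 330.0])  # historical unit costs

# Build the empirical relationship between drivers and cost.
model = LinearRegression().fit(X_history, y_cost)

# Introduce the cost drivers of a new product to get its estimated cost.
new_product = np.array([[2.5, 20.0]])
estimate = model.predict(new_product)
print(round(float(estimate[0]), 1))
```

The same fit/predict pattern applies whatever model is plugged in; only the choice of estimator changes in the sections below.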

At easyKost, we use the fourth of these methodologies to estimate the costs of products and/or services: empirical stochastic, or statistical, methods. Most companies familiar with statistical methods know “curve fitting”, which is generally shorthand for multivariate linear least squares regression. We rely instead on a very flexible algorithm called “Random Forest”. It is a non-linear algorithm, allowing the model to fit your data more closely.
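The difference between a linear fit and a Random Forest is easy to see on synthetic data (this is an illustration, not easyKost's implementation). Here the "true" cost jumps when a size driver crosses a threshold, a non-linear pattern a straight line cannot capture:

```python
# Synthetic, hypothetical example: a fixed setup cost kicks in above a size
# threshold, so cost is a step function of one driver.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))
# Non-linear ground truth: +50 setup cost once the driver exceeds 5.
y = 100.0 + 50.0 * (X[:, 0] > 5.0) + rng.normal(0, 2, size=300)

linear = LinearRegression().fit(X, y)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Probe just below and just above the threshold.
probe = np.array([[4.9], [5.1]])
print(linear.predict(probe))   # nearly the same value on both sides
print(forest.predict(probe))   # roughly 100 below, 150 above
```

The linear model smears the step into a gentle slope, while the forest's trees split at the threshold and recover the jump.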

The algorithm works as follows:

  • The historical learning database (the cost-driver matrix for the population of parts or services) and the output variable vector (cost, cycle time…) are introduced
  • The Random Forest algorithm analyzes the data by creating thousands of decision trees, each built by randomly removing part of the data, and observes each tree's error against the historical results
  • These trees are aggregated so as to minimize the prediction error
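The three steps above can be sketched with scikit-learn's `RandomForestRegressor` (the driver matrix below is synthetic, not easyKost's data). Setting `oob_score=True` mirrors "randomly removing part of the data and observing the error": each tree trains on a bootstrap sample and is scored against the historical rows it did not see.

```python
# Sketch of the three steps, under the assumption of synthetic cost drivers.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Step 1: historical learning database (cost-driver matrix) and output vector.
X_drivers = rng.uniform(0, 1, size=(500, 3))     # e.g. mass, volume, quantity
y_output = 200 * X_drivers[:, 0] + 80 * X_drivers[:, 1] ** 2 + rng.normal(0, 5, 500)

# Steps 2-3: grow many trees on random subsets and aggregate their predictions.
forest = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
forest.fit(X_drivers, y_output)

# Out-of-bag R^2: the error measured against held-out historical results.
print(round(forest.oob_score_, 3))
```

The out-of-bag score gives an honest accuracy estimate without setting aside a separate test set, since every tree already has rows it never trained on.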

The advantages of the random forest algorithm are numerous:

  • This algorithm allows us to improve accuracy by more than 50% compared to multivariate linear regression and other statistical methods
  • Covariance among the input variables is handled by the algorithm, whereas it remains a challenge for standard multivariate linear regression
  • Outliers are well managed by this algorithm since it is non-linear
  • Compared to other AI and machine learning techniques, such as fuzzy logic, Random Forest is much more efficient and computes faster
  • Random Forest can handle missing values in the source database, replacing them with the statistically most likely values
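The missing-value behavior described in the last point can be approximated in scikit-learn, whose `RandomForestRegressor` does not accept NaNs directly: gaps are first filled with a statistically likely value. The driver matrix below is hypothetical, and easyKost's own imputation strategy may differ from this simple median fill:

```python
# Hedged sketch: impute missing driver values before fitting the forest.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.ensemble import RandomForestRegressor

X_raw = np.array([
    [1.0, 10.0],
    [2.0, np.nan],   # missing machining time
    [np.nan, 25.0],  # missing mass
    [4.0, 30.0],
])
y = np.array([120.0, 180.0, 270.0, 330.0])

# Replace each NaN with a likely value for its column (here: the median).
imputer = SimpleImputer(strategy="median")
X_filled = imputer.fit_transform(X_raw)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_filled, y)
print(X_filled)  # no NaNs remain
```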