
Predictive modelling of critical variables for improving HVOF coating using gamma regression models


Thermal spray coating is a critical process in many industries, involving the application of coatings to surfaces to enhance their functionality. This paper proposes a framework for modelling and predicting critical target variables in thermal spray coating processes, based on the application of statistical design of experiments (DoE) and the modelling of the data using generalized linear models (GLMs) with a particular emphasis on gamma regression. Experimental data obtained from thermal spray coating trials are used to validate the presented approach, demonstrating that it is able to accurately model and predict critical target variables. As such, the framework has significant potential for the optimization of thermal spray coating processes, and can contribute to the development of more efficient and effective coating technologies in various industries.

1 Introduction

Thermal spraying is a surface modification process that involves the deposition of a coating material onto a substrate by heating and accelerating a feedstock material through a spray gun. The high-velocity oxygen fuel (HVOF) spraying technique, schematically depicted in Fig. 1, represents a sophisticated and intricate thermal spray process that relies on the combined kinetic and thermal energy of the sprayed particles to produce coatings with exceptional properties, which makes it a subject of great interest and ongoing research in the field of materials engineering [26].

Figure 1

Schematic depiction of the High Velocity Oxygen Fuel (HVOF) process

Numerous techniques have been employed to model and forecast the properties of coatings produced via HVOF spraying. These methods include empirical models based on regression analysis, mechanistic models that simulate the physical and chemical processes occurring during spraying [8, 25, 32], and hybrid models that integrate both approaches [34]. Linear and nonlinear regression models have been used to establish a relationship between process variables and coating properties [18, 24, 30, 33], while computational fluid dynamics (CFD) models have been utilized to simulate the gas flow, heat transfer, and particle behavior in the spray gun [12, 19]. In addition, artificial neural networks (ANNs) [4, 20, 21, 37] and genetic algorithms (GAs) [17] have been implemented to optimize process conditions and predict coating properties.

Despite the notable progress, the prediction of coating properties is still a challenging task, due to the complex interactions among the process variables, material properties, and the microstructure of the coatings. This study proposes a novel approach for modelling HVOF coatings through systematic variation of process variables that have received relatively limited attention in prior research. To achieve this, a Central Composite Design (CCD) of experiments is employed, which enables efficient exploration of a vast parameter space. The subsequent analysis focuses on developing gamma regression models derived from generalized linear models (GLMs), which are particularly well-suited for modeling data with skewed, non-negative distributions. By integrating these adjusted process variables into the models, a promising opportunity arises to identify novel associations between process conditions and coating properties.

Our approach provides insights into the intricate HVOF process, improving predictive models for key coating characteristics. The framework presented in this study supports the development of efficient coating technologies with enhanced attributes like wear resistance, corrosion protection, and oxidation resilience. These advancements have practical applications in industries such as aerospace, automotive, and manufacturing.

The structure of this paper is as follows: Sect. 2 presents a comprehensive overview of the HVOF process, including a detailed explanation of the main factors influencing coating properties and particle in-flight characteristics. Section 3 introduces the application of generalized linear models (GLMs) and maximum likelihood estimation (MLE) to model and estimate the dependence of coating properties on process conditions. This section also provides an overview of the theoretical foundations related to the asymptotic properties of MLE and hypothesis testing. The assessment of the predictive performance of the proposed GLMs is presented in Sect. 4. Section 5 examines the statistical design of experiments (DoE) and central composite design (CCD) as an approach for efficient data collection on various levels of factors in the thermal spray coating process. The findings of the proposed framework applied to experimental data are presented in Sect. 6, with a specific focus on the prediction of critical target variables. Finally, Sect. 7 concludes the paper with a discussion on the effectiveness and potential of the proposed framework for predicting critical target variables in thermal spraying processes. It highlights the contribution of the framework to the development of more efficient coating technologies.

2 Technical background

Thermal spraying is a versatile and widely used surface engineering technique that involves the deposition of coatings on the surface of a substrate to enhance its functional properties, such as wear resistance, corrosion resistance, and thermal insulation. The thermal spray coating process typically involves the application of thermal and kinetic energy to induce partial liquefaction of the coating material, thereby accelerating its projection towards the substrate surface. The amount of thermal and kinetic energy depends on the thermal spray coating technique. Various techniques, such as flame spraying, plasma spraying, arc spraying, and high-velocity oxygen fuel (HVOF) spraying can be used for coating using different types of coating material such as powder or wire. In this work we focus on the gas-fuel HVOF technology, which is described in more detail below.

HVOF thermal spraying has gained significant attention in recent years due to its ability to produce high-quality coatings with superior mechanical and chemical properties [15]. The gas-fuel HVOF process creates its thermal energy by combustion of a mixture of oxygen and fuel gas, typically propane, methane or hydrogen, in a high-pressure chamber [11]. The kinetic energy in the thermal spray process is created by its specific geometric convergent-divergent nozzle design which accelerates the gas stream to supersonic velocities. The kinetic energy is transferred to the sprayed material particles, causing them to partially melt and deform upon impact with the substrate. This results in coatings with high density and superior adhesion [7], almost independent of the thermal spray material composition (metallic, ceramic, cermet).

Figure 1 shows a schematic depiction of the HVOF process, where the distinct symmetric structure of the torch becomes evident [8]. The diagram reveals an axial axis of symmetry, i.e., a balanced and mirrored arrangement of components along a horizontal line within the HVOF system. In the combustion chamber, a defined but variable fuel gas reacts with oxygen, leading to combustion. Compressed air is used as a shroud gas and forms a cover inside the nozzle and outside in the spray plume. The powder coating material is inserted axially and is fed by using nitrogen as a carrier gas. Every characteristic of the gases and the powder, as well as the process attributes, influences combustion and consequently impacts the kinetic and thermal energy transferred to the spray particles. Influencing factors are for example:

  • Characteristics of gases (e.g., pressure, flow rate, temperature),

  • Additional process variables (e.g., the ratio of the combustion gases, powder feed rate, spraying distance),

  • Powder characteristics (e.g., size, chemical composition, density).

The influence of these factors on the thermal and kinetic energy of the spray material can be measured by using in-situ camera equipment that is able to simultaneously detect the temperature and velocity of the sprayed particles. Process input conditions not only affect the particle characteristics but also the performance of the process, as well as the quality of the coating. Relevant measures for the performance of coating processes are

  • Deposition rate,

  • Deposition efficiency.

These performance indicators are especially important when taking into account economic aspects of the thermal spray coating process. However, the coating characteristics are the most important properties, since they influence the industrial performance. The desirable properties are, e.g.,

  • Specific porosity of coating,

  • Specific hardness of coating,

  • Specific phase and chemical composition,

  • Specific thickness of coating.

To obtain the best wear resistance, corrosion resistance, or thermal insulation, a specific combination of coating properties is essential. Additionally, proper surface preparation prior to spraying is necessary to ensure maximum adhesion strength and achieve the desired coating performance characteristics. Table 1 provides an overview of the various HVOF process variables and coating characteristics discussed above.

Table 1 Examples of HVOF process variables and characteristics

The formulation of a robust mathematical relationship that links the input conditions controlling the spraying process, the dynamics of particles during flight, and the resulting coating characteristics is essential. Such a correlation not only facilitates a deeper understanding of the fundamental physical mechanisms underlying thermal spraying but also enables the optimization of deposition conditions to achieve the desired coating properties.

The accurate prediction of coating properties remains a challenge, primarily due to the complex and non-linear nature of the relationships between process attributes and coating properties. Achieving high accuracy in property prediction is often elusive, considering the multifaceted interactions at play. Hence, the demand for a reliable and precise prediction model becomes apparent, as it can serve as a catalyst for optimizing the process and elevating the overall quality of the resulting coatings.

3 Predictive modelling of HVOF coating properties

The following section is dedicated to the derivation of mathematical models that enable the prediction of coating properties for the HVOF process. For this, we propose the use of Generalized Linear Models (GLMs) along with Maximum Likelihood Estimation (MLE) as an effective approach for modelling and estimating the expected values of target variables, conditioned on the explanatory variables (i.e., the process variables) in the HVOF process. We adopt the theoretical framework and notation of [10] to develop the statistical model used in this study, which can be expressed by the following general equation:

$$ \mathbb{E}(y_{i}|\boldsymbol{x_{i}}) = \mu _{i} = g^{-1}\bigl(\boldsymbol{x_{i}}^{T} \boldsymbol{\beta}\bigr), $$

where \(\boldsymbol{\mu}= (\mu _{i})_{i=1}^{n}\) is a vector denoting the conditional mean of the response variable \(\mathbf{y}=(y_{i})_{i=1}^{n}\), i.e., the coating properties of interest. The coefficient vector \(\boldsymbol{\beta} = (\beta _{0}, \beta _{1}, \dots , \beta _{k})\) encodes the effects of the explanatory variables (\(\mathbf{x}_{1}, \dots , \mathbf{x}_{k}\)), i.e., the potentially influential process conditions with observations \(\mathbf{x}_{i} = (1, x_{1}^{i},x_{2}^{i},\dots ,x_{k}^{i})_{i=1}^{n}\). The mean vector μ is related to the linear combination \(\boldsymbol{x_{i}}^{T}\boldsymbol{\beta}\) of the process input attributes by a one-to-one mapping \(g(\cdot )\), which is often referred to as the link function [9]. In regression analysis, the assumption of additive random errors allows for the decomposition of the response \(y_{i}\) into a systematic component \(\mathbb{E}(y_{i}|\boldsymbol{x_{i}})\) and a random component \(\epsilon _{i}\), yielding the equation:

$$ y_{i} = g^{-1}\bigl(\boldsymbol{x_{i}}^{T} \boldsymbol{\beta}\bigr) + \epsilon _{i}, $$

where the measurement error \(\epsilon _{i}\) is assumed to be independent of the covariates. The primary objective of regression analysis is to use the data \((y_{i},\boldsymbol{x_{i}})_{i=1}^{n}\) to estimate the systematic component \(\mathbb{E}(y_{i}|\boldsymbol{x_{i}})\).

Based on the broad background presented in Sect. 2, the use of Bayesian generalized linear models to model coating properties appears to be a logical choice. The Bayesian framework offers distinct advantages over classical inference by accommodating prior knowledge regarding model parameters. This facilitates the incorporation of insights into the effects of process input variables, enhancing parameter estimation accuracy. Even in the absence of specific information, non-informative priors like the uniform distribution or Jeffreys’s prior can be specified [16]. Nevertheless, a classical frequentist statistical approach is adopted in this study to develop a model that characterizes the relationship between input variables and coating properties, because knowledge of suitable priors for Bayesian modeling is absent in this context and the use of non-informative priors does not offer substantial advantages over classical approaches. Future research may explore the application of Bayesian GLMs in this setting.

3.1 Generalized linear models (GLMs)

GLMs, as defined in [23], include a broad range of useful statistical models and serve as a powerful tool for data analysis in various fields such as engineering, physics, and biology. They extend the concept of linear regression to handle non-normal response variables, such as binary or count data, by introducing a link function that relates the mean of the response variable to the linear predictor. Here, the effectiveness of GLMs in analyzing data from the HVOF process is explored, aiming to establish a comprehensive model equation that effectively captures the conditional dependence of coating characteristics on process input variables.

In the context of GLMs, the response vector \(\mathbf{y} = (y_{1}, y_{2}, \ldots, y_{n})\), is modeled as a vector that follows any distribution from the exponential family, where each element \(y_{i}\) is distributed with a mean \(\mu _{i}\) and variance \(\sigma _{i}^{2}\). To model the relationship between the response variable y and the predictor variables \(\mathbf{x}_{i} = (x_{1}^{i},x_{2}^{i},\dots ,x_{k}^{i})\), a predictor vector \(\boldsymbol{\eta} = (\eta _{1}, \eta _{2}, \ldots, \eta _{n})\), with elements \(\eta _{i} = \mathbf{x}_{i}^{T} \boldsymbol{\beta}\) is introduced, which is linked to the mean vector \(\boldsymbol{\mu}= (\mu _{1}, \mu _{2}, \ldots, \mu _{n})\) via a link function g, as expressed by

$$ g(\boldsymbol{\mu}) = \boldsymbol{\eta} . $$

This formulation allows for the incorporation of multiple predictors and the estimation of their effects on the response variable. The choice of link function g depends on the distribution of the response variable y and can vary between models. For instance, when the response variable is binary (\(y_{i}=0\) or \(y_{i}=1\)), the logit link function is frequently employed, connecting the mean μ of the response variable to the logarithm of the odds. This can be expressed mathematically as:

$$ g(\boldsymbol{\mu}) = \log \biggl( \frac{\boldsymbol{\mu}}{1-\boldsymbol{\mu}} \biggr) = \boldsymbol{\eta} . $$

Similarly, if the response variable is a non-negative variable, the link function can be logarithmic, relating the mean of the response variable to the linear predictor via

$$ g(\boldsymbol{\mu}) = \log (\boldsymbol{\mu}) = \boldsymbol{\eta} . $$

After selecting the appropriate link function, the GLM implies that the linear predictor can be represented as a linear combination of the predictor variables \(\mathbf{x}_{i} = (x_{1}^{i},x_{2}^{i},\dots ,x_{k}^{i})\), i.e.,

$$ \eta _{i} = \mathbf{x}_{i}^{T} \boldsymbol{\beta} = \beta _{0} + \beta _{1}x_{1}^{i}+ \beta _{2}x_{2}^{i} + \cdots + \beta _{k}x_{k}^{i} , $$

where \(\beta _{0}\) is called the intercept, and \(\beta _{1},\beta _{2}, \dots , \beta _{k}\) are the regression coefficients. The estimation of these coefficients will be performed using the method of maximum likelihood estimation, detailed in Sect. 3.2.2. This method identifies the regression coefficients under which the observed data \((y_{i},\boldsymbol{x_{i}})_{i=1}^{n}\) are most probable, given the assumed conditional distribution of the response variable y and the selected link function g.
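The relationship between the linear predictor, the link function, and the conditional mean can be illustrated with a minimal sketch. The coefficient and covariate values below are hypothetical and chosen only for illustration, and the function names are not taken from any library.

```python
import math

def linear_predictor(x, beta):
    """Return eta = beta_0 + beta_1*x_1 + ... + beta_k*x_k for one observation."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))

def inv_log_link(eta):
    """Inverse of g(mu) = log(mu); maps any real eta to a mean mu > 0."""
    return math.exp(eta)

def inv_logit_link(eta):
    """Inverse of g(mu) = log(mu / (1 - mu)); maps eta to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and covariate values, chosen only for illustration.
beta = [0.5, 0.2, -0.1]   # (beta_0, beta_1, beta_2)
x = [1.0, 3.0]            # (x_1, x_2)

eta = linear_predictor(x, beta)   # 0.5 + 0.2*1.0 - 0.1*3.0
mu_log = inv_log_link(eta)        # mean under the log link
mu_logit = inv_logit_link(eta)    # mean under the logit link
print(eta, mu_log, mu_logit)
```

Applying the respective link function to each mean recovers the same linear predictor, which is the one-to-one property of g used throughout this section.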

3.2 Application of generalized linear models to HVOF data

In the HVOF process, the response variables of interest include the coating properties, such as roughness, porosity, layer thickness, and hardness, as well as the in-flight properties, such as particle temperature and particle velocity (see Table 1). Since these response variables are continuous and positive, it is necessary to choose an appropriate probability distribution and link function for modelling them. The gamma distribution with a log link function is a common choice for non-negative continuous data [10] and thus, we will employ it in our analysis.

3.2.1 The gamma distribution

A continuous, non-negative random variable Y is said to follow a gamma distribution with shape parameter \(a>0\) and rate parameter \(b>0\), denoted as \(Y \sim G(a,b)\), if it has the density function:

$$ f(y|a,b) = \frac{b^{a}}{\Gamma (a)} y^{a-1}\exp (-by) , \quad y > 0 . $$

The expected value and variance are given by \(\mathbb{E}(Y) = \frac{a}{b}\) and \(\mathbb{V}(Y) = \frac{a}{b^{2}}\). An illustrative comparison of gamma distributions with varying shape and rate parameters is provided in Fig. 2. Occasionally, the gamma distribution is defined via an alternative parameterization in terms of the expected value μ and the shape parameter \(\nu > 0\), obtained by setting \(a = \nu \) and \(b = \nu / \mu \). The density is then given by:

$$ f(y|\mu ,\nu ) = \frac{1}{\Gamma (\nu )} \biggl( \frac{\nu}{\mu} \biggr)^{ \nu} y^{\nu -1} \exp \biggl(-\frac{\nu}{\mu}y \biggr) , \quad y > 0 , $$

where \(\mu = \mathbb{E}(Y)\) is the parameter of interest and ν, which determines the variance via \(\mathbb{V}(Y) = \mu ^{2}/\nu \), is considered a nuisance parameter, meaning that the value of ν is not the main focus of the analysis. In other words, while ν plays a role in determining the shape of the density function, it is not the parameter that one aims to estimate or draw conclusions about.

Figure 2

Comparison of gamma distributions with varied parameters. The left panel displays the probability density function (PDF) of a gamma-distributed random variable y with a fixed shape parameter \(a=5\) and varying rate parameters b. The right panel illustrates the PDF of a gamma-distributed random variable y with a fixed rate parameter \(b=2\) and varying shape parameters a
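The two parameterizations of the gamma distribution can be related numerically. The following sketch assumes the standard identification \(a = \nu \) and \(b = \nu / \mu \) and checks, by simple numerical integration, that the density integrates to one and reproduces the stated mean and variance. The parameter values are illustrative, and the helper functions are written for this example only.

```python
import math

def gamma_pdf(y, a, b):
    """Density of G(a, b) with shape a > 0 and rate b > 0, evaluated at y > 0."""
    log_pdf = a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(y) - b * y
    return math.exp(log_pdf)

def shape_rate_from_mean(mu, nu):
    """Map the (mu, nu) parameterization to (a, b), assuming a = nu, b = nu / mu."""
    return nu, nu / mu

def trapezoid(f, lo, hi, n=20000):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

mu, nu = 2.5, 5.0                     # illustrative values, giving a = 5 and b = 2
a, b = shape_rate_from_mean(mu, nu)

lo, hi = 1e-12, 40.0                  # numerical stand-in for the interval (0, infinity)
mass = trapezoid(lambda y: gamma_pdf(y, a, b), lo, hi)
mean = trapezoid(lambda y: y * gamma_pdf(y, a, b), lo, hi)
var = trapezoid(lambda y: (y - mean) ** 2 * gamma_pdf(y, a, b), lo, hi)

print(mass, mean, var)                # compare with 1, a/b = mu, and a/b**2 = mu**2/nu
```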

3.2.2 Maximum likelihood estimation for gamma regression

The likelihood function is a fundamental concept in statistical inference that quantifies the plausibility of the observed data under a given statistical model. As a consequence of the conditional independence of \(y_{i}\) given \(\boldsymbol{x_{i}}\), it is defined as the product of the probability density function of each observation in the sample, conditioned on the parameter values. In other words, the likelihood is the joint probability of the observed data, viewed as a function of the parameters. Therefore, the product of the likelihood contributions, defined in (4), yields the likelihood for the observed data, providing a basis for inference on the unknown parameters. However, it is essential to acknowledge that the assumption of conditional independence of the response variable y may not always hold in practical applications, particularly in complex systems or processes where various factors can influence the response variables.

In the context of this study on the coating process, assuming conditional independence implies consistent equipment and process performance across observations, with variation in the response variable attributed solely to the explanatory variables considered. Nonetheless, factors like equipment stability, environmental conditions, or procedural variations could introduce dependencies among response variables, challenging this assumption. To address this concern, the experiments were executed carefully, including thorough validation of equipment functionality and checks for potential issues concerning, for example, instrument cleaning and procedural consistency. The impact of changing environmental conditions can be disregarded because the operations were conducted within a coating booth equipped with a continuous suction system. These measures were taken to support the conditional independence assumption and ensure the reliability of the experimental results.

An empirical investigation of the dataset considered in this study reveals a notable right-skewness across all examined response variables. While marginal distributional properties alone do not necessarily dictate the choice of distributional family for modeling conditional means, this observation suggests a departure from symmetric distributional patterns. Furthermore, these variables exhibit non-negativity and continuity. To account for these specific distributional characteristics, the assumption of a gamma regression framework is made (cf. Fig. 2). In addition to the assumed gamma distribution, other distributions such as the log-normal distribution or the inverse Gaussian distribution can also be considered in this setting. Since similar results are anticipated for these alternative distributions, the emphasis on the specific distribution assumption is relaxed. Therefore, the assumption made in this study is that the response variables are conditionally gamma-distributed, with the expected value depending on the explanatory variables. Future investigations are warranted to explore the implications of alternative distributional assumptions in this domain.

The gamma regression model with a logarithmic link function assumes that the response variable \(y_{i}\) for \(i = 1, \dots , n\) follows a gamma distribution with mean \(\mu _{i}\) and shape parameter \(\nu > 0\). The mean \(\mu _{i}\) is modeled as a function of the covariates \(\mathbf{x}_{i} = (x_{1}^{i},x_{2}^{i},\dots ,x_{k}^{i})\) through the inverse of the logarithmic link, i.e., \(\mu _{i} = \exp (\beta _{0}+\beta _{1}x_{1}^{i}+ \cdots + \beta _{k}x_{k}^{i}) = \exp (\eta _{i})\), where \(\eta _{i}\) is the linear predictor.

For simplicity, we consider a gamma regression model with a single covariate x. Nevertheless, it is important to emphasize that extending the model to multiple covariates is possible and adheres to the same theoretical framework outlined here. In particular, the incorporation of additional predictors would require a simple augmentation of the linear predictor \(\eta _{i}\) to account for their effects. Therefore, the model can be readily extended to encompass more intricate predictor configurations, as warranted by the research question at hand. The univariate regression model used in this work, in which a single dependent variable is considered, can be summarized in the form of the following equations:

$$ \begin{aligned} Y_{i} &\overset{ \mathrm{ind.}}{\sim} G(\mu _{i}, \nu ), \quad i=1,\dots ,n , \\ \mu _{i}(\boldsymbol{\beta}) &= \mu _{i}(\boldsymbol{\beta}|x_{i}) = \exp (\beta _{0}+\beta _{1}x_{i}) = \exp \bigl(\eta _{i}( \boldsymbol{\beta})\bigr), \\ \eta _{i}(\boldsymbol{\beta}) &= \eta _{i}(\boldsymbol{\beta}|x_{i}) = \beta _{0}+\beta _{1}x_{i} . \end{aligned} \tag{3} $$

The characterization of HVOF coating quality involves multiple aspects, prompting consideration of a multivariate modeling approach. Such an approach allows for the estimation of covariate effects and the assessment of a variance-covariance matrix, which quantifies correlations among different quality criteria and potentially enhances predictive capabilities. However, due to practical limitations such as the limited sample size and the need for additional data for variance-covariance matrix estimation, a univariate approach was chosen. Moreover, within the specific applications where the coated material is applied, only a subset of properties listed in Table 1 is relevant, thus a univariate approach was considered more appropriate.

Given the univariate model in (3) and assuming conditional independence of \(y_{i}|\boldsymbol{x_{i}}\), it is now possible to define the likelihood function.

Definition 1

The likelihood function \(L(\boldsymbol{\beta}| \mathbf{y})\) of the observed data for the model described in (3) is defined as the product of the likelihood contributions \(L_{i}(\boldsymbol{\beta}| y_{i})\), i.e.,

$$ L(\boldsymbol{\beta}| \mathbf{y}) := \prod _{i=1}^{n} L_{i}(\boldsymbol{ \beta}|y_{i}) = \prod_{i=1}^{n} \frac{1}{\Gamma (\nu )} \biggl( \frac{\nu}{\mu _{i}(\boldsymbol{\beta})} \biggr)^{\nu} y_{i}^{\nu -1} \exp \biggl(-\frac{\nu}{\mu _{i}(\boldsymbol{\beta})}y_{i} \biggr) . \tag{4} $$

The log-likelihood function \(\ell (\boldsymbol{\beta}| \mathbf{y}):= \log L(\boldsymbol{\beta}| \mathbf{y})\) is often preferred in statistical inference due to its numerical stability, computational simplifications, and theoretical properties [10]. Since the logarithm is strictly increasing, maximizing \(\ell (\boldsymbol{\beta}| \mathbf{y})\) instead of (4) yields the same estimator while turning the product into a sum.
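As a minimal sketch of how the log-likelihood of the single-covariate gamma regression model can be evaluated, the function below sums the log-densities of the individual observations. It uses the full gamma density with mean \(\mu _{i}\) and parameter ν, including the factor \(y_{i}^{\nu -1}\), which does not depend on β. The data and the value of ν are hypothetical.

```python
import math

def gamma_loglik(beta, nu, x, y):
    """Log-likelihood of the gamma regression model with log link.

    mu_i = exp(beta_0 + beta_1 * x_i); each y_i contributes the log of the
    gamma density with mean mu_i and parameter nu.
    """
    b0, b1 = beta
    total = 0.0
    for xi, yi in zip(x, y):
        mu_i = math.exp(b0 + b1 * xi)
        total += (nu * math.log(nu / mu_i) - math.lgamma(nu)
                  + (nu - 1.0) * math.log(yi)      # does not depend on beta
                  - nu * yi / mu_i)
    return total

# Hypothetical noise-free data: y_i equals its mean for beta = (0.5, 0.3).
x = [0.0, 1.0, 2.0, 3.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]

ll_at_truth = gamma_loglik((0.5, 0.3), 10.0, x, y)
ll_shifted = gamma_loglik((0.7, 0.2), 10.0, x, y)
print(ll_at_truth, ll_shifted)   # the generating coefficients give the larger value
```

Working on the log scale with `math.lgamma` avoids evaluating \(\Gamma (\nu )\) and the product of densities directly, both of which can overflow or underflow for larger samples.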

In the process of maximizing the log-likelihood function, the score function serves as an essential mathematical tool. It accurately measures the sensitivity of the log-likelihood to changes in the parameters of interest.

Definition 2

The score function \(\boldsymbol{s}(\boldsymbol{\beta}| \boldsymbol{y})\) is defined as the gradient of the log-likelihood function with respect to the model parameters, i.e.,

$$ \boldsymbol{s}(\boldsymbol{\beta}| \boldsymbol{y}) := \nabla _{\beta} \ell ( \boldsymbol{\beta}| \mathbf{y}). $$

Furthermore, the maximum likelihood estimator (MLE) \(\hat{\boldsymbol{\beta}}\) is defined as the solution of

$$ \boldsymbol{s}(\hat{\boldsymbol{\beta}}| \boldsymbol{y}) = \mathbf{0} . $$

The score function \(\boldsymbol{s}(\boldsymbol{\beta}| \boldsymbol{y})\) quantifies the rate of change of the log-likelihood function as the parameter values are varied, and provides a measure of the direction and magnitude of the parameter updates that increase the log-likelihood. It can be computed by either numerically or analytically differentiating \(\ell (\boldsymbol{\beta}| \mathbf{y})\), which in our case leads to the following result.

Proposition 1

Let \(\boldsymbol{s}(\boldsymbol{\beta}| \boldsymbol{y})\) represent the score function for a response vector y, as defined in Definition 2, and let y consist of observed values \(y_{i}\) from a random variable \(Y_{i}\) following a gamma distribution, as described in (3). Then

$$ \boldsymbol{s}(\boldsymbol{\beta} | \boldsymbol{y}) = \mathbf{X}^{T} \nu \biggl( \frac{\mathbf{y}}{\boldsymbol{\mu}(\boldsymbol{\beta})} - 1 \biggr) . $$

Here, \(\mathbf{X} = (\mathbf{1} \mathbf{x})\) represents the design matrix, consisting of the explanatory variable \(\mathbf{x}= (x_{1}, \dots , x_{n})^{T}\), where \(x_{i} \in \mathbb{R}\), \(\mathbf{y} = (y_{1}, \dots , y_{n})^{T}\) is the response vector with \(y_{i} \in \mathbb{R}^{+}\), \(\boldsymbol{\mu}(\boldsymbol{\beta}) = (\mu _{1}(\boldsymbol{\beta}), \dots , \mu _{n}(\boldsymbol{\beta}))^{T}\) is the mean vector with \(\mu _{i}(\boldsymbol{\beta}) \in \mathbb{R}^{+}\), given in (3), and \(\mathbf{1} = (1, \dots , 1)^{T}\) is a vector of ones.


Proof

This proof is adapted from [10], with suitable changes accounting for the gamma regression framework considered here. First, the partial derivatives of the individual log-likelihoods \(\log L_{i}(\boldsymbol{\beta}|y_{i})\) are given by

$$ \begin{aligned} \frac{\partial \log L_{i}(\beta _{0},\beta _{1}|y_{i})}{\partial \beta _{0}} &= \biggl( -\frac{\nu}{\mu _{i}(\boldsymbol{\beta})} + \frac{\nu}{\mu _{i}(\boldsymbol{\beta})^{2}} y_{i} \biggr) \mu _{i}( \boldsymbol{\beta}) = \biggl( \frac{\nu}{\mu _{i}(\boldsymbol{\beta})} y_{i} - \nu \biggr) \\ &= \nu \biggl( \frac{y_{i}}{\mu _{i}(\boldsymbol{\beta})} - 1 \biggr), \\ \frac{\partial \log L_{i}(\beta _{0},\beta _{1}|y_{i})}{\partial \beta _{1}} &= \biggl( -\frac{\nu}{\mu _{i}(\boldsymbol{\beta})} + \frac{\nu}{\mu _{i}(\boldsymbol{\beta})^{2}} y_{i} \biggr) \mu _{i}( \boldsymbol{\beta}) x_{i} = \biggl( \frac{\nu}{\mu _{i}(\boldsymbol{\beta})} y_{i} - \nu \biggr) x_{i} \\ &= \nu x_{i} \biggl( \frac{y_{i}}{\mu _{i}(\boldsymbol{\beta})} - 1 \biggr) . \end{aligned} $$

Together with the definitions of the vectors x, y, \(\boldsymbol{\mu}(\boldsymbol{\beta})\), and 1, as well as of the design matrix \(\mathbf{X} = (\mathbf{1} \mathbf{x})\) and the definition of the score function \(\boldsymbol{s}(\boldsymbol{\beta} |\boldsymbol{y})\), it follows that

$$ \begin{aligned} \mathbf{s}(\beta _{0},\beta _{1}|\mathbf{y}) &= \begin{pmatrix} \sum_{i=1}^{n} \nu ( \frac{y_{i}}{\mu _{i}(\boldsymbol{\beta})} -1 ) \\ \sum_{i=1}^{n} \nu x_{i} ( \frac{y_{i}}{\mu _{i}(\boldsymbol{\beta})} -1 ) \end{pmatrix} = \begin{pmatrix} \mathbf{1}^{T} \nu ( \frac{\mathbf{y}}{\boldsymbol{\mu}(\boldsymbol{\beta})} - 1 ) \\ \mathbf{x}^{T} \nu ( \frac{\mathbf{y}}{\boldsymbol{\mu}(\boldsymbol{\beta})} - 1 ) \end{pmatrix} = \mathbf{X}^{T} \nu \biggl( \frac{\mathbf{y}}{\boldsymbol{\mu}(\boldsymbol{\beta})} - 1 \biggr) , \end{aligned} $$

which completes the proof. □
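The closed form of the score function in Proposition 1 can be checked numerically. The sketch below compares the analytic expression \(\mathbf{X}^{T} \nu (\mathbf{y}/\boldsymbol{\mu}(\boldsymbol{\beta}) - 1)\), with the division taken element-wise, against a central finite-difference gradient of the log-likelihood. The data, the evaluation point β, and the value of ν are hypothetical, and only the terms of the log-likelihood that depend on β are kept.

```python
import math

def mean_vector(beta, x):
    """mu_i = exp(beta_0 + beta_1 * x_i) for every observation."""
    b0, b1 = beta
    return [math.exp(b0 + b1 * xi) for xi in x]

def loglik(beta, nu, x, y):
    """Gamma log-likelihood, keeping only the terms that depend on beta."""
    mu = mean_vector(beta, x)
    return sum(-nu * math.log(m) - nu * yi / m for m, yi in zip(mu, y))

def score(beta, nu, x, y):
    """Analytic score X^T nu (y / mu - 1) for the design matrix X = (1 x)."""
    mu = mean_vector(beta, x)
    r = [nu * (yi / m - 1.0) for m, yi in zip(mu, y)]
    return [sum(r), sum(xi * ri for xi, ri in zip(x, r))]

def numerical_gradient(f, beta, h=1e-6):
    """Central finite-difference approximation of the gradient of f at beta."""
    grad = []
    for j in range(len(beta)):
        up = list(beta)
        dn = list(beta)
        up[j] += h
        dn[j] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

# Hypothetical data and evaluation point, chosen only for illustration.
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [1.8, 1.7, 2.4, 2.3, 3.2]
beta, nu = [0.4, 0.25], 8.0

s_analytic = score(beta, nu, x, y)
s_numeric = numerical_gradient(lambda b: loglik(b, nu, x, y), beta)
print(s_analytic, s_numeric)   # the two gradients should agree closely
```

Because ν enters the score only as a multiplicative factor, doubling ν doubles both components of the score without changing where it vanishes.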

Remark 1

Note that the specific form of the design matrix X, as described here, applies only to the model under consideration. In general, the design matrix X comprises not only the original explanatory variables but also their higher-order powers and/or products. This expanded form allows for the estimation of higher-order effects or interaction effects between two or more covariates [10].

By setting the score function \(\boldsymbol{s}(\boldsymbol{\beta} |\boldsymbol{y})\) to zero, a nonlinear system of equations for \((\beta _{0},\beta _{1})\) arises that needs to be solved numerically. The numerical algorithm used in this work (cf. Sect. 3.2.3) involves the computation of the observed information matrix \(\boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y})\) (= negative Hessian matrix of the log-likelihood) or the expected information matrix \(\boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y})\) (= Fisher matrix), which is a key component of the algorithm. Note that the solution of \(\boldsymbol{s}(\boldsymbol{\beta} |\boldsymbol{y}) = 0\) is independent of ν, since ν enters the score function only as a positive multiplicative factor; the process of finding \((\beta _{0}, \beta _{1})\) is therefore not influenced by the value of ν. In general, there is no guarantee that solving \(\boldsymbol{s}(\boldsymbol{\beta} |\boldsymbol{y}) = 0\) yields a maximum. However, since the log-likelihood \(\ell (\boldsymbol{\beta}| \mathbf{y})\) for this model is a concave function, any solution of the equation is a maximum [35]. The concavity can be verified by checking that \(\boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y})\), defined in (5), is positive semi-definite. For further insights into the existence and uniqueness of the maximum likelihood estimator in generalized linear models, refer to [35].

Definition 3

The observed information matrix \(\boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y})\) is defined as the negative Hessian matrix of the log-likelihood function \(\ell (\boldsymbol{\beta} |\boldsymbol{y})\), i.e., the negative of the matrix of second derivatives with respect to the model parameters β,

$$ \boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y}) := - \frac{\partial ^{2} \ell (\beta _{0},\beta _{1}|\mathbf{y})}{\partial \boldsymbol{\beta} \partial \boldsymbol{\beta}^{T}} . $$

The expected information matrix \(\boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y})\) is defined as

$$ \boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y}) := \mathbb{E} \biggl[- \frac{\partial ^{2} \ell (\beta _{0},\beta _{1}|\mathbf{y})}{\partial \boldsymbol{\beta} \partial \boldsymbol{\beta}^{T}} \biggr] , $$

where \(\mathbb{E}[\cdot ]\) denotes the expected value.

The matrices \(\boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y})\) and \(\boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y})\) quantify the amount of information that the observed data provides about the unknown parameters of the model. For our specific gamma regression framework, they can be computed explicitly as described in the following

Proposition 2

Let y be as in Proposition 1, let \(\mathbf{W} = \operatorname{diag}( \nu y_{i} / \mu _{i}(\boldsymbol{\beta}) )_{i=1,\dots ,n}\) be a diagonal matrix with elements \(\nu y_{i}/\mu _{i}(\boldsymbol{\beta})\) and \(\tilde{\mathbf{W}} = \operatorname{diag}(\nu )\) be a diagonal matrix with elements ν. Then the observed information matrix \(\boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y})\) and the expected information matrix \(\boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y})\), defined in (5) and (6), respectively, can be expressed as

$$ \boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y}) = \mathbf{X}^{T}\mathbf{W} \mathbf{X} , \quad \textit{and} \quad \boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y}) = \mathbf{X}^{T}\tilde{\mathbf{W}} \mathbf{X} . $$


The second partial derivatives of individual log-likelihoods \(\log L_{i}(\boldsymbol{\beta} | \boldsymbol{y})\) are given by

$$ \begin{aligned} \frac{\partial ^{2} \log L_{i}(\beta _{0},\beta _{1}|\mathbf{y})}{\partial \beta _{0}^{2}} &= - \frac{\nu y_{i}}{\mu _{i}(\boldsymbol{\beta})} , \qquad \frac{\partial ^{2} \log L_{i}(\beta _{0},\beta _{1}|\mathbf{y})}{\partial \beta _{1}^{2}} = -\frac{\nu x_{i}^{2} y_{i}}{\mu _{i}(\boldsymbol{\beta})} , \\ \frac{\partial ^{2} \log L_{i}(\beta _{0},\beta _{1}|\mathbf{y})}{\partial \beta _{0} \partial \beta _{1}} &= -\frac{\nu x_{i} y_{i}}{\mu _{i}(\boldsymbol{\beta})} . \end{aligned} $$

The observed information matrix \(\boldsymbol{H(\beta |\mathbf{y})}\) is obtained through the aggregation of the second partial derivatives of individual log-likelihoods \(\log L_{i}(\boldsymbol{\beta} |\boldsymbol{y})\), i.e.,

$$ \begin{aligned} \boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y}) &= - \frac{\partial ^{2} \ell (\beta _{0},\beta _{1}|\mathbf{y})}{\partial \boldsymbol{\beta} \partial \boldsymbol{\beta}^{T}} \overset{\text{(4)}}{=} - \sum _{i=1}^{n} \frac{\partial ^{2} \log L_{i}(\beta _{0},\beta _{1}|\mathbf{y})}{\partial \boldsymbol{\beta} \partial \boldsymbol{\beta}^{T}} \overset{\text{(7)}}{=} \begin{pmatrix} \sum_{i=1}^{n} \frac{\nu y_{i}}{\mu _{i}(\boldsymbol{\beta})} & \sum_{i=1}^{n} \frac{\nu x_{i} y_{i}}{\mu _{i}(\boldsymbol{\beta})} \\ \sum_{i=1}^{n} \frac{\nu x_{i} y_{i}}{\mu _{i}(\boldsymbol{\beta})} & \sum_{i=1}^{n} \frac{\nu x_{i}^{2} y_{i}}{\mu _{i}(\boldsymbol{\beta})} \end{pmatrix} . \end{aligned} $$

Together with the definition of W we thus obtain

$$ \boldsymbol{H}(\boldsymbol{\beta} |\boldsymbol{y}) = \mathbf{X}^{T}\mathbf{W} \mathbf{X} . $$

Since \(\mathbb{E}(y_{i}) = \mu _{i}(\boldsymbol{\beta})\), the Fisher matrix \(\boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y})\) is given by

$$ \begin{aligned} \boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y}) &= \mathbb{E} \biggl[- \frac{\partial ^{2} \ell (\beta _{0},\beta _{1}|\mathbf{y})}{\partial \boldsymbol{\beta} \partial \boldsymbol{\beta}^{T}} \biggr]= \begin{pmatrix} \sum_{i=1}^{n}\frac{\nu \mathbb{E}(y_{i})}{\mu _{i}(\boldsymbol{\beta})} & \sum_{i=1}^{n} \frac{\nu x_{i} \mathbb{E}(y_{i})}{\mu _{i}(\boldsymbol{\beta})} \\ \sum_{i=1}^{n}\frac{\nu x_{i} \mathbb{E}(y_{i})}{\mu _{i}(\boldsymbol{\beta})} & \sum_{i=1}^{n}\frac{\nu x_{i}^{2} \mathbb{E}(y_{i})}{\mu _{i}(\boldsymbol{\beta})} \end{pmatrix} \\ &= \begin{pmatrix} \sum_{i=1}^{n} \frac{\nu \mu _{i}(\boldsymbol{\beta})}{\mu _{i}(\boldsymbol{\beta})} & \sum_{i=1}^{n} \frac{\nu x_{i} \mu _{i}(\boldsymbol{\beta})}{\mu _{i}(\boldsymbol{\beta})} \\ \sum_{i=1}^{n} \frac{\nu x_{i} \mu _{i}(\boldsymbol{\beta})}{\mu _{i}(\boldsymbol{\beta})} & \sum_{i=1}^{n} \frac{\nu x_{i}^{2} \mu _{i}(\boldsymbol{\beta})}{\mu _{i}(\boldsymbol{\beta})} \end{pmatrix} = \mathbf{X}^{T}\tilde{ \mathbf{W}}\mathbf{X} , \end{aligned} $$

which yields the assertion. □
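As a numerical sanity check of Proposition 2, the following Python sketch evaluates \(\boldsymbol{H}\) and \(\boldsymbol{F}\) for the two-parameter model and compares \(\boldsymbol{H}\) with a finite-difference approximation of the negative Jacobian of the score. The score formula \(\boldsymbol{s}(\boldsymbol{\beta}|\boldsymbol{y}) = \nu \sum_{i} (1, x_{i})^{T} (y_{i}/\mu _{i}(\boldsymbol{\beta}) - 1)\) used here is obtained by integrating the second derivatives in (7) once; the data, the value of ν, and all function names are illustrative assumptions and are not taken from the experiments.

```python
import math

def mu(beta, x):
    """Mean under the log link: mu_i = exp(beta0 + beta1 * x_i)."""
    return [math.exp(beta[0] + beta[1] * xi) for xi in x]

def score(beta, x, y, nu):
    """Score vector s(beta|y) = nu * sum_i (1, x_i)^T * (y_i / mu_i - 1)."""
    r = [yi / mi - 1.0 for yi, mi in zip(y, mu(beta, x))]
    return [nu * sum(r), nu * sum(xi * ri for xi, ri in zip(x, r))]

def observed_info(beta, x, y, nu):
    """H = X^T W X with W = diag(nu * y_i / mu_i)."""
    w = [nu * yi / mi for yi, mi in zip(y, mu(beta, x))]
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    return [[sum(w), h01], [h01, sum(wi * xi * xi for wi, xi in zip(w, x))]]

def expected_info(x, nu):
    """F = X^T diag(nu) X = nu * X^T X; it depends on neither beta nor y."""
    sx = sum(x)
    return [[nu * len(x), nu * sx], [nu * sx, nu * sum(xi * xi for xi in x)]]

# Hypothetical data and parameter values, for illustration only.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [1.2, 1.9, 2.4, 4.1, 6.3]
nu, beta = 4.0, [1.0, 0.8]

H = observed_info(beta, x, y, nu)
F = expected_info(x, nu)

# Central finite differences: H should equal minus the Jacobian of the score.
eps = 1e-6
H_fd = [[0.0, 0.0], [0.0, 0.0]]
for j in range(2):
    b_plus, b_minus = list(beta), list(beta)
    b_plus[j] += eps
    b_minus[j] -= eps
    s_plus, s_minus = score(b_plus, x, y, nu), score(b_minus, x, y, nu)
    for i in range(2):
        H_fd[i][j] = -(s_plus[i] - s_minus[i]) / (2 * eps)
```

In this sketch, `H` and `H_fd` agree up to discretization error, while `F` coincides with `H` only in expectation, since \(y_{i}\) is replaced by \(\mu _{i}(\boldsymbol{\beta})\).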

3.2.3 Numerical computation of the maximum likelihood estimator

Numerical algorithms are essential for computing the maximum likelihood estimator of the parameters in a statistical model, particularly when an analytical solution to the likelihood equations is unattainable [13]. Frequently, the likelihood function is an intricate, nonlinear function of the parameters that lacks a closed-form expression for its maximum, as is the case in gamma regression with a logarithmic link function.

In such cases, numerical algorithms such as the Newton-Raphson algorithm are employed to iteratively approximate the solution of the likelihood equations until convergence is reached [10]. These methods use the first and second derivatives of the log-likelihood function, i.e., the score function and the information matrix, to compute the updates to the parameter estimates.

Newton-Raphson Method [13]:

is an iterative method used to find a value of β that satisfies the equation \(\mathbf{s}(\boldsymbol{\beta} |\boldsymbol{y}) = 0\), which corresponds to the point where the log-likelihood function is maximized. The Newton-Raphson algorithm achieves this by iteratively approximating the solution of \(\mathbf{s}(\boldsymbol{\beta} |\boldsymbol{y}) = 0\) using Taylor series expansion of \(\mathbf{s}(\boldsymbol{\beta} |\boldsymbol{y})\) around the current estimate of β. Specifically, the expansion can be written as:

$$ \mathbf{s}(\boldsymbol{\beta} |\boldsymbol{y}) \approx \mathbf{s} \bigl( \boldsymbol{\beta}^{(k)}|\mathbf{y}\bigr) - \boldsymbol{H\bigl( \beta}^{(k)}| \mathbf{y}\bigr) \bigl(\boldsymbol{\beta} - \boldsymbol{ \beta}^{(k)}\bigr) , $$

where \(\boldsymbol{\beta}^{(k)}\) is the estimate of β at the k-th iteration, \(\mathbf{s}(\boldsymbol{\beta}^{(k)}|\mathbf{y})\) is the score function evaluated at \(\boldsymbol{\beta}^{(k)}\), and \(\boldsymbol{H(\beta}^{(k)}|\mathbf{y}) = - \partial \mathbf{s}( \boldsymbol{\beta}^{(k)}|\mathbf{y}) / \partial \boldsymbol{\beta}^{T} \) is the observed information matrix evaluated at \(\boldsymbol{\beta}^{(k)}\). The score function is approximated using a linear tangent line, resulting in an improved approximate solution. This involves finding the root of the tangent line in (8). Thus, the algorithm approximates the maximum likelihood estimator of β by solving the equation:

$$ \mathbf{s}\bigl(\boldsymbol{\beta}^{(k)}|\mathbf{y}\bigr) - \boldsymbol{H\bigl(\beta}^{(k)}|\mathbf{y}\bigr) \bigl(\boldsymbol{ \beta} - \boldsymbol{\beta}^{(k)}\bigr) = 0 , $$

for β, which yields:

$$ \boldsymbol{\beta}^{(k+1)} = \boldsymbol{ \beta}^{(k)} + \boldsymbol{H\bigl(\beta}^{(k)}|\mathbf{y} \bigr)^{\dagger} \mathbf{s}\bigl( \boldsymbol{\beta}^{(k)}| \mathbf{y}\bigr) . $$

The algorithm iterates until convergence is achieved, which is typically defined as the point at which the change in the estimate of β between two successive iterations falls below a certain threshold.
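The Newton-Raphson update (9) can be sketched in a few lines of Python for the two-parameter model \(\log \mu _{i} = \beta _{0} + \beta _{1} x_{i}\). This is a minimal sketch under stated assumptions, not the implementation used in this work: the design matrix is assumed to have full rank, so the explicit inverse of the \(2\times 2\) matrix replaces the pseudo-inverse; the starting value (intercept-only fit), the tolerance, and the noise-free test data are hypothetical choices. Since ν multiplies both the score and \(\boldsymbol{H}\), it cancels in the update and is omitted.

```python
import math

def newton_raphson(x, y, tol=1e-10, max_iter=100):
    """Newton-Raphson for log(mu_i) = beta0 + beta1 * x_i with gamma response.

    The shape nu scales both s and H and cancels in H^{-1} s, so it is omitted.
    """
    # Start at the intercept-only fit: beta0 = log(mean(y)), beta1 = 0.
    beta = [math.log(sum(y) / len(y)), 0.0]
    for it in range(1, max_iter + 1):
        m = [math.exp(beta[0] + beta[1] * xi) for xi in x]
        r = [yi / mi for yi, mi in zip(y, m)]                # y_i / mu_i
        s0 = sum(ri - 1.0 for ri in r)                       # score / nu
        s1 = sum(xi * (ri - 1.0) for xi, ri in zip(x, r))
        h00 = sum(r)                                         # H / nu
        h01 = sum(xi * ri for xi, ri in zip(x, r))
        h11 = sum(xi * xi * ri for xi, ri in zip(x, r))
        det = h00 * h11 - h01 * h01
        # step = H^{-1} s via the explicit inverse of the 2x2 matrix.
        step = [(h11 * s0 - h01 * s1) / det, (h00 * s1 - h01 * s0) / det]
        beta = [beta[0] + step[0], beta[1] + step[1]]
        # Stop when the change between successive iterates is below tol.
        if max(abs(step[0]), abs(step[1])) < tol:
            return beta, it
    raise RuntimeError("Newton-Raphson did not converge")

# Noise-free hypothetical responses: y_i equals mu_i for beta = (0.5, 0.3),
# so the score vanishes there and the iteration should recover these values.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]
beta_hat, n_iter = newton_raphson(x, y)
```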

Fisher Scoring Method [10]:

is a useful approach for maximum likelihood estimation that involves replacing the observed information matrix \(\boldsymbol{H(\beta}^{(k)}|\mathbf{y})\) by the expected information matrix \(\boldsymbol{F(\beta}^{(k)}|\mathbf{y})\) in the update formula (9), i.e.,

$$ \boldsymbol{\beta}^{(k+1)} = \boldsymbol{ \beta}^{(k)} + \boldsymbol{F\bigl(\beta}^{(k)}|\mathbf{y} \bigr)^{\dagger} \mathbf{s}\bigl( \boldsymbol{\beta}^{(k)}| \mathbf{y}\bigr) . $$

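For the gamma regression model with logarithmic link, Proposition 2 gives \(\boldsymbol{F}(\boldsymbol{\beta} |\boldsymbol{y}) = \mathbf{X}^{T}\tilde{\mathbf{W}}\mathbf{X} = \nu \mathbf{X}^{T}\mathbf{X}\), which depends neither on β nor on y. The expected information matrix therefore needs to be computed and inverted only once, whereas the Newton-Raphson method requires \(\boldsymbol{H}(\boldsymbol{\beta}^{(k)}|\mathbf{y})\) to be re-evaluated in every iteration. This simplifies the required computations.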
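A minimal Python sketch of the Fisher scoring update (10) for the two-parameter model \(\log \mu _{i} = \beta _{0} + \beta _{1} x_{i}\) is given below. It assumes a full-rank design matrix (so the pseudo-inverse reduces to the ordinary inverse) and uses hypothetical data, starting values, and tolerances. Because \(\boldsymbol{F} = \nu \mathbf{X}^{T}\mathbf{X}\) and the score is proportional to ν, the shape parameter cancels and the update reads \(\boldsymbol{\beta}^{(k+1)} = \boldsymbol{\beta}^{(k)} + (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}(\mathbf{y}/\boldsymbol{\mu} - \mathbf{1})\).

```python
import math

def fisher_scoring(x, y, tol=1e-10, max_iter=500):
    """Fisher scoring for log(mu_i) = beta0 + beta1 * x_i with gamma response.

    With F = nu * X^T X and s = nu * X^T (y / mu - 1), the shape nu cancels in
    F^{-1} s, and X^T X is the same in every iteration.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    det = n * sxx - sx * sx  # det(X^T X); nonzero when X has full rank
    beta = [math.log(sum(y) / n), 0.0]  # start at the intercept-only fit
    for it in range(1, max_iter + 1):
        r = [yi / math.exp(beta[0] + beta[1] * xi) - 1.0 for xi, yi in zip(x, y)]
        s0 = sum(r)                                  # X^T r, first component
        s1 = sum(xi * ri for xi, ri in zip(x, r))    # X^T r, second component
        # step = (X^T X)^{-1} X^T r via the explicit 2x2 inverse.
        step = [(sxx * s0 - sx * s1) / det, (n * s1 - sx * s0) / det]
        beta = [beta[0] + step[0], beta[1] + step[1]]
        if max(abs(step[0]), abs(step[1])) < tol:
            return beta, it
    raise RuntimeError("Fisher scoring did not converge")

# Hypothetical data, for illustration only.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [1.2, 1.9, 2.4, 4.1, 6.3]
beta_hat, n_iter = fisher_scoring(x, y)
```

At the returned estimate the score is numerically zero, which can be checked by evaluating \(\sum_{i} (y_{i}/\mu _{i} - 1)\) and \(\sum_{i} x_{i}(y_{i}/\mu _{i} - 1)\).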

3.2.4 Asymptotic properties of the maximum likelihood estimator (MLE)

Given the gamma regression model with logarithmic link function, as defined in (3), and the MLE procedure presented in the previous section, we now investigate the asymptotic properties of the MLE of the regression coefficients \(\boldsymbol{\beta} = (\beta _{0}, \dots , \beta _{k})^{T}\). Specifically, under mild regularity conditions introduced below, the MLE can be proven to be a consistent and asymptotically normal estimator, with its asymptotic covariance matrix being equivalent to the inverse of the Fisher information matrix [9].

Assumption 1

([9] Regularity Assumptions)

Let \(\hat{\boldsymbol{\beta}} \in B \subset \mathbb{R}^{p}\) denote the ML estimator for the true parameter β, p be the number of predictor variables in the model, and M the image \(\boldsymbol{\mu (\beta )}\) of β. Furthermore, the linear combination of the predictor variables η is related to the mean \(\boldsymbol{\mu (\beta )}\) of the response y by an injective link function \(g: M \rightarrow \mathbb{R}^{p}\), i.e., \(\boldsymbol{\eta} = g(\boldsymbol{\mu (\beta )})\) (compare with (3)). Additionally, there holds

  1. (i)

    B is open in \(\mathbb{R}^{p}\),

  2. (ii)

    The design matrix X has full rank, i.e., \(\operatorname{rank}(\mathbf{X})=p\),

  3. (iii)

    \(g(\cdot )\) is twice continuously differentiable on M.

Note that Assumption 1 is valid for our gamma regression model with a logarithmic link function (1), i.e., where the response variable follows a gamma distribution (3).

Definition 4

An estimator \(\hat{\boldsymbol{\beta}}\) is consistent for the true parameter vector β if, as the sample size n goes to infinity, \(\hat{\boldsymbol{\beta}}\) converges in probability to β. In other words, for any small positive number ϵ, it holds that

$$ \lim_{n \to \infty} P\bigl(\|\hat{\boldsymbol{\beta}} - \boldsymbol{\beta} \| > \epsilon \bigr) = 0 . $$

By the Law of Large Numbers, the sample mean of a sequence of i.i.d. random variables with finite mean converges in probability to the expected value. Since the log-likelihood function \(\ell (\boldsymbol{\beta}| \mathbf{y})\) in this model (3) is a sum of log-density contributions of independent gamma-distributed observations, Law of Large Numbers arguments can be used to establish convergence in probability of the MLE to the true parameter values. In the following proposition, two key properties of the MLE in the gamma regression model are stated without formal proof.

Proposition 3


  1. (i)

    In the setting of the gamma regression model (3), the MLE \(\hat{\boldsymbol{\beta}}\) is consistent for β. In particular, under the regularity conditions stated in Assumption 1, the ML estimator \(\hat{\boldsymbol{\beta}}\) converges in probability to the true regression coefficients β for increasing sample size, i.e., \(\hat{\boldsymbol{\beta}} \overset{p}{\to} \boldsymbol{\beta}\), where \(\overset{p}{\to}\) denotes convergence in probability.

  2. (ii)

    Let the assumptions of (i) hold. Then the maximum likelihood estimator (MLE) \(\hat{\boldsymbol{\beta}}\) of the gamma regression model defined in (3) is asymptotically normal, i.e., \(\sqrt{n}(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) \overset{d}{\to} \mathcal{N}(\boldsymbol{0}, \boldsymbol{F}^{\dagger}(\boldsymbol{\beta |\mathbf{y}}))\), where \(\overset{d}{\to}\) denotes convergence in distribution, and n denotes the sample size.

3.2.5 Linear hypothesis testing

By conducting hypothesis tests on the estimated regression coefficients \(\hat{\boldsymbol{\beta}}\), one can provide evidence-based justifications for the inclusion or exclusion of specific predictors, ensure the robustness and reliability of a model, and enhance the interpretability and generalizability of the findings. Testing a linear hypothesis on the coefficients of the Gamma GLM can be represented as follows:

$$ H_{0}: \boldsymbol{C}\boldsymbol{\beta} = \boldsymbol{d} , $$

where C is a known \(r\times p\) matrix of rank r, β is the \(p\times 1\) vector of regression coefficients, and d is the \(r\times 1\) vector of known constants. This matrix C is used to define the specific hypothesis being tested, and its structure depends on the research question at hand. In the context of our study, C is constructed to examine the significance of certain predictors in relation to the response variable.

Under hypothesis \(H_{0}\), the unrestricted maximum likelihood estimator \(\hat{\boldsymbol{\beta}}\) is not efficient, and therefore we need to consider restricted estimators that take into account the constraints imposed by \(H_{0}\) [10]. For this, we consider the Wald statistic w given in

Definition 5

The Wald statistic w is defined as:

$$ w = (\boldsymbol{C}\hat{\boldsymbol{\beta}}- \boldsymbol{d})^{T} \bigl[ \boldsymbol{C} \underbrace{\bigl( \boldsymbol{X}^{T}\boldsymbol{\tilde{W}}\boldsymbol{X}\bigr)^{\dagger}}_{ \boldsymbol{F^{\dagger}(\hat{\beta}|\mathbf{y})}} \boldsymbol{C}^{T} \bigr]^{-1}(\boldsymbol{C}\hat{\boldsymbol{\beta}}-\boldsymbol{d}), $$

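where X is the \(n\times p\) design matrix, and \(\tilde{\mathbf{W}}\) is the \(n\times n\) diagonal weight matrix, which for the gamma regression model considered here is given by \(\tilde{\mathbf{W}} = \operatorname{diag}(\nu )\) (cf. Proposition 2).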

Under \(H_{0}\), the Wald statistic has an asymptotic \(\chi ^{2}\)-distribution with r degrees of freedom [10], i.e.,

$$ w \stackrel{d}{\rightarrow} \chi _{r}^{2} \quad \text{as } n \rightarrow \infty . $$

We reject \(H_{0}\) at level α if \(w > \chi _{r,1-\alpha}^{2}\), where \(\chi _{r,1-\alpha}^{2}\) is the \(1-\alpha \) quantile of the \(\chi ^{2}\)-distribution with r degrees of freedom.
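For a single linear restriction (\(r=1\)), the Wald statistic and its asymptotic p-value can be computed with the Python standard library alone, since \(P(\chi _{1}^{2} > w) = \operatorname{erfc}(\sqrt{w/2})\). The sketch below illustrates this special case; the coefficient estimates and the covariance matrix are hypothetical numbers chosen for illustration, and the function name is not part of any library.

```python
import math

def wald_test_single(c, beta_hat, cov, d=0.0):
    """Wald test of H0: c^T beta = d for a single linear restriction (r = 1).

    cov is the estimated asymptotic covariance matrix of beta_hat, i.e. the
    (pseudo-)inverse of the expected information matrix, as nested lists.
    Returns the statistic w and its asymptotic chi-square(1) p-value.
    """
    p = len(beta_hat)
    diff = sum(c[i] * beta_hat[i] for i in range(p)) - d
    var = sum(c[i] * cov[i][j] * c[j] for i in range(p) for j in range(p))
    w = diff * diff / var
    # For r = 1: P(chi2_1 > w) = P(|Z| > sqrt(w)) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value

# Hypothetical estimates and covariance matrix, for illustration only.
beta_hat = [1.00, 0.80, -0.05]
cov = [[0.010, 0.000, 0.000],
       [0.000, 0.040, 0.005],
       [0.000, 0.005, 0.090]]

w1, p1 = wald_test_single([0, 1, 0], beta_hat, cov)  # H0: second coefficient is 0
w2, p2 = wald_test_single([0, 0, 1], beta_hat, cov)  # H0: third coefficient is 0

alpha = 0.05
reject_1 = p1 < alpha  # equivalent to w1 > 0.95-quantile of chi2_1 (about 3.841)
reject_2 = p2 < alpha
```

With these illustrative numbers, the first hypothesis is rejected at the 5% level and the second is not; for \(r>1\) the bracketed matrix in (11) has to be inverted and a general \(\chi _{r}^{2}\) quantile is required.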

In the specific case of predictive modelling in HVOF coating, hypothesis testing plays a vital role in determining the relevance of regression coefficients \(\boldsymbol{\beta}_{j}\), where \(\boldsymbol{\beta}_{j}\) denotes a subvector of β. Specifically, we test the null hypothesis \(H_{0}: \boldsymbol{\beta}_{j} = 0\) against the alternative hypothesis \(H_{1}: \boldsymbol{\beta}_{j} \neq 0\).

Proposition 4

Let \(\boldsymbol{\beta}_{j}\) be a subvector of β with dimension r, \(\boldsymbol{d}=\boldsymbol{0}\), and C be an \(r \times p\) matrix with 1 at the entries corresponding to the elements of \(\boldsymbol{\beta}_{j}\) and 0 elsewhere. With this choice the Wald statistic w, defined in (11), takes the form

$$ w = \hat{\boldsymbol{\beta}}_{j}^{T} \mathbf{A}_{j}^{-1} \hat{\boldsymbol{\beta}}_{j} , $$

where \(\mathbf{A}_{j}\) is the submatrix of the asymptotic covariance matrix \(\mathbf{A}=(\boldsymbol{X}^{T}\boldsymbol{\tilde{W}}\boldsymbol{X}) ^{\dagger}\) corresponding to the elements of \(\boldsymbol{\beta}_{j}\).


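Assuming that the prerequisites for \(\boldsymbol{\beta}_{j}\) and d, as specified in Proposition 4, are met, and with the matrix C taking on the following form:

$$ \boldsymbol{C} = \begin{pmatrix} \mathbf{I}_{r} & \mathbf{0}_{r \times (p-r)} \end{pmatrix} . $$

Here, \(\boldsymbol{\beta}_{j}\) represents the initial r regression coefficients, given by \((\beta _{1}, \dots , \beta _{r})^{T}\), and \(\mathbf{I}_{r}\) denotes the \(r \times r\) identity matrix. With this choice, \(\boldsymbol{C}\hat{\boldsymbol{\beta}} = \hat{\boldsymbol{\beta}}_{j}\) and \(\boldsymbol{C}\mathbf{A}\boldsymbol{C}^{T} = \mathbf{A}_{j}\). Together with Definition 5, it follows that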

$$ w \overset{\text{(11)}}{=} (\boldsymbol{C}\hat{\boldsymbol{\beta}}- \boldsymbol{d})^{T} \bigl[\boldsymbol{C} \underbrace{\bigl( \boldsymbol{X}^{T}\boldsymbol{\tilde{W}}\boldsymbol{X}\bigr)^{\dagger}}_{ \boldsymbol{F^{\dagger}(\hat{\beta}|\mathbf{y})}} \boldsymbol{C}^{T} \bigr]^{-1}(\boldsymbol{C}\hat{\boldsymbol{\beta}}-\boldsymbol{d}) = \hat{\boldsymbol{\beta}}_{j}^{T} \mathbf{A}_{j}^{-1} \hat{\boldsymbol{\beta}}_{j} , $$

which yields (12). □

In accordance with Proposition 4, the assessment of the relevance of a subvector \(\boldsymbol{\beta}_{j}\) is determined by (12). If \(\boldsymbol{\beta}_{j}\) is one-dimensional, the Wald statistic w corresponds to the application of a t-test [10]. The test statistic, denoted as \(t_{j}\), quantifies the extent to which the estimated coefficient \(\hat{\beta}_{j}\) deviates from \(H_{0}\), taking into account the corresponding standard error, i.e.,

$$ t_{j} = \frac{\hat{\beta}_{j}}{\sqrt{a_{jj}}} , $$

with \(a_{jj}\) the j-th diagonal element of \(\mathbf{A}=(\boldsymbol{X}^{T}\boldsymbol{\tilde{W}}\boldsymbol{X}) ^{\dagger}\). According to [10], \(t_{j}\) is t-distributed with \(n-p\) degrees of freedom and \(H_{0}\) is rejected at significance level α if

$$ \vert t_{j} \vert > t_{1-\alpha /2}(n-p) . $$

Alternatively, one can also perform the Likelihood-Ratio test using the likelihood ratio \(\mathcal{L}\) statistic, defined as

$$ \mathcal{L} := -2\log \bigl(L(\hat{\boldsymbol{\beta}}_{H_{0}}| \mathbf{y})/L(\hat{\boldsymbol{\beta}}|\mathbf{y}) \bigr) , $$

where \({L(\hat{\boldsymbol{\beta}}|\mathbf{y})}\) is the likelihood function for the unrestricted estimator, and \({L(\hat{\boldsymbol{\beta}}_{H_{0}}|\mathbf{y})}\) is the likelihood function for the restricted estimator obtained by maximizing the likelihood subject to \(H_{0}\). Analogous to the Wald statistic, \(\mathcal{L}\) follows an asymptotic \(\chi ^{2}\)-distribution with r degrees of freedom under the null hypothesis \(H_{0}\).
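The likelihood-ratio test can be illustrated for the two-parameter model \(\log \mu _{i} = \beta _{0} + \beta _{1} x_{i}\) and the hypothesis \(H_{0}: \beta _{1} = 0\). The Python sketch below is an illustration under stated assumptions: the shape parameter ν is treated as known (in practice it has to be estimated), the data are hypothetical, and the unrestricted fit uses a simple Fisher scoring loop. Under \(H_{0}\) the restricted estimate is the constant mean \(\bar{y}\), and all terms of the log-likelihood that do not involve \(\mu _{i}\) cancel in the ratio.

```python
import math

def fit_gamma_log(x, y, tol=1e-10, max_iter=500):
    """MLE of (beta0, beta1) for log(mu_i) = beta0 + beta1 * x_i (Fisher scoring)."""
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    det = n * sxx - sx * sx  # det(X^T X), assumed nonzero (full-rank design)
    b0, b1 = math.log(sum(y) / n), 0.0
    for _ in range(max_iter):
        r = [yi / math.exp(b0 + b1 * xi) - 1.0 for xi, yi in zip(x, y)]
        s0, s1 = sum(r), sum(xi * ri for xi, ri in zip(x, r))
        d0, d1 = (sxx * s0 - sx * s1) / det, (n * s1 - sx * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            return b0, b1
    raise RuntimeError("Fisher scoring did not converge")

def lr_statistic(x, y, nu):
    """Likelihood-ratio statistic for H0: beta1 = 0 with known shape nu."""
    n = len(y)
    mu0 = sum(y) / n  # restricted MLE: constant mean equal to the sample mean
    b0, b1 = fit_gamma_log(x, y)
    mu1 = [math.exp(b0 + b1 * xi) for xi in x]
    # Only the terms -nu * (log(mu_i) + y_i / mu_i) of the log-likelihood differ.
    t0 = sum(math.log(mu0) + yi / mu0 for yi in y)
    t1 = sum(math.log(mi) + yi / mi for yi, mi in zip(y, mu1))
    return 2.0 * nu * (t0 - t1)

# Hypothetical data and shape parameter, for illustration only.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [1.2, 1.9, 2.4, 4.1, 6.3]
nu = 4.0

lr = lr_statistic(x, y, nu)
# One restriction (r = 1): P(chi2_1 > lr) = erfc(sqrt(lr / 2)).
p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
```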

Linear hypothesis testing serves as a tool to assess the significance of estimated regression coefficients within a specified confidence level. This approach enables the determination of whether a particular predictor variable contributes meaningfully to the model’s description or if a simpler model could suffice without sacrificing essential information. In contrast, model selection criteria, described in the next subsection, aim to identify the most suitable model for predicting outcomes accurately.

Remark 2

(Note on Statistical Power and Variable Selection)

In statistical inference, it is crucial to consider two types of errors. Type I error occurs when the null hypothesis is incorrectly rejected, mistakenly identifying an effect or relationship that does not exist. This risk is quantified by the significance level α. Conversely, Type II error arises when one fails to detect a genuine effect, incorrectly retaining the null hypothesis. This does not necessarily mean there is no effect; rather, it may reflect the test’s limitations. Decisions to exclude terms from a model based solely on statistical significance should be made cautiously. While simple models are preferred for their ease of interpretation, overly strict criteria for variable selection may lead to important predictors being overlooked. Type II error risk is often denoted by β (distinct from regression parameters). The probability that a statistical test will correctly reject a false null hypothesis is known as the power of the test and is represented by \(1 - \beta \). High power increases confidence in hypothesis test outcomes, while low power raises doubts about non-significant findings. Statistical power relies on factors such as the significance level α, the sample size n, and the population effect size (ES) [5].

In the context of predictive modeling for HVOF coating, domain expertise is essential in addressing statistical power challenges. Due to the constraints of a small sample size, compounded by the laborious and expensive nature of experiments (cf. Sect. 5), the statistical power of hypothesis tests is inherently limited. Consequently, a thorough examination of regression coefficients was made in collaboration with thermal coating technicians to assess the relevance of predictors, particularly in cases where the performed test might not achieve statistical significance or the model selection criterion decides to exclude the respective effect. Further techniques for calculating and enhancing statistical power in regression analysis are explored in [6].

3.2.6 Model selection criteria

In practice, it is often necessary to compare different models and select the one which provides the best balance between model fit, reflecting the agreement with the observed data, and model complexity. Various criteria can be used for this purpose, including the Akaike Information Criterion (AIC) [2]. The AIC is based on the maximized log-likelihood function \(\ell (\boldsymbol{\beta}|\mathbf{y})\) and is defined by:

$$ \text{AIC} := -2\ell (\hat{\boldsymbol{\beta}}| \mathbf{y}) + 2p ; $$

where \(\hat{\boldsymbol{\beta}}\) is the maximum likelihood estimate of the model parameters, and p is the number of parameters in the model. The AIC penalizes models with many parameters, thus favoring models that fit the data well but are not too complex. Smaller AIC values indicate better models. As a common rule of thumb, models whose AIC values differ by less than about 2 are regarded as similarly well supported, while larger differences provide increasing evidence in favor of the model with the lower AIC. However, note that the AIC is a relative measure of model fit and should be used for comparing models within the same class. For example, the AIC cannot be used to compare a gamma regression model to a Poisson regression model, since they belong to different classes.
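The AIC can be evaluated directly from the gamma log-likelihood. The following Python sketch assumes the mean/shape parameterization of the gamma density, \(f(y) = (\nu /\mu )^{\nu} y^{\nu -1} e^{-\nu y/\mu}/\Gamma (\nu )\), which is consistent with the second derivatives in (7); the data, the shape parameter, and the coefficients of the second model are hypothetical values for illustration and are not fitted results from this study.

```python
import math

def gamma_loglik(y, mu, nu):
    """Log-likelihood of independent gamma responses with means mu and shape nu."""
    return sum(
        nu * math.log(nu) - math.lgamma(nu) + (nu - 1.0) * math.log(yi)
        - nu * math.log(mi) - nu * yi / mi
        for yi, mi in zip(y, mu)
    )

def aic(loglik, p):
    """AIC = -2 * loglik + 2 * p, with p the number of model parameters."""
    return -2.0 * loglik + 2.0 * p

# Hypothetical data and shape parameter, for illustration only.
x = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [1.2, 1.9, 2.4, 4.1, 6.3]
nu = 4.0

# Model 0: intercept only, fitted mean equals the sample mean (p = 1).
mu_0 = [sum(y) / len(y)] * len(y)
# Model 1: log-linear trend with hypothetical coefficients (p = 2).
mu_1 = [math.exp(0.99 + 0.82 * xi) for xi in x]

aic_0 = aic(gamma_loglik(y, mu_0, nu), p=1)
aic_1 = aic(gamma_loglik(y, mu_1, nu), p=2)
preferred = "model 1" if aic_1 < aic_0 else "model 0"
```

Both models are gamma regression models for the same responses, so their AIC values are comparable in the sense described above.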

The application of model selection criteria such as the AIC is valuable in predicting HVOF coating properties based on process conditions. While it is important to develop accurate prediction models to optimize coating performance and ensure the desired coating properties, it is worth noting that including too many irrelevant parameters in the model can introduce noise and adversely affect its predictive ability.

4 Assessing predictive performance of HVOF coating models

To assess the predictive performance of the HVOF regression model, the commonly employed technique of Leave-One-Out-Cross-Validation (LOOCV) is utilized. It allows for a comprehensive evaluation of the model’s generalization ability and its accuracy in forecasting coating properties. LOOCV is particularly suitable for evaluating the model’s generalization capability when only a limited number of observations is available [36]. The LOOCV approach is a computationally intensive procedure, requiring the model to be fit n times, i.e., once for each observation in the dataset. To improve computational efficiency, alternative resampling techniques such as k-fold cross-validation may be used.

The LOOCV procedure involves iteratively fitting the model using all observations except one, and then using the fitted model to predict the response for the left-out observation. This is repeated for each observation in the dataset, resulting in n predicted responses. The predicted response for the i-th observation is denoted as \(\hat{y}^{(-i)}\), where the superscript \((-i)\) indicates that the i-th observation was left out during the fitting.

The prediction error for the i-th observation is defined as the difference between the predicted response and the observed response, i.e., \(\epsilon _{i} = y_{i} - \hat{y}^{(-i)}\).

Definition 6


The LOOCV estimate of the expected out-of-sample prediction error, i.e., the expected squared difference between the model’s predictions and the true values of new, unseen observations, is defined by:

$$ CV_{(n)} := \frac{1}{n}\sum_{i=1}^{n} \epsilon _{i}^{2} = \frac{1}{n} \sum _{i=1}^{n}\bigl(y_{i} - \hat{y}^{(-i)}\bigr)^{2} ; $$

where n is the number of observations in the dataset.
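The LOOCV procedure can be sketched in Python for the two-parameter gamma regression model \(\log \mu _{i} = \beta _{0} + \beta _{1} x_{i}\). The fitting routine below is a simple Fisher scoring loop that assumes a full-rank design; the data and all function names are hypothetical and serve only to illustrate the computation of \(CV_{(n)}\).

```python
import math

def fit_gamma_log(x, y, tol=1e-10, max_iter=500):
    """MLE of (beta0, beta1) for log(mu_i) = beta0 + beta1 * x_i (Fisher scoring)."""
    n = len(x)
    sx, sxx = sum(x), sum(xi * xi for xi in x)
    det = n * sxx - sx * sx  # det(X^T X), assumed nonzero (full-rank design)
    b0, b1 = math.log(sum(y) / n), 0.0
    for _ in range(max_iter):
        r = [yi / math.exp(b0 + b1 * xi) - 1.0 for xi, yi in zip(x, y)]
        s0, s1 = sum(r), sum(xi * ri for xi, ri in zip(x, r))
        d0, d1 = (sxx * s0 - sx * s1) / det, (n * s1 - sx * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            return b0, b1
    raise RuntimeError("Fisher scoring did not converge")

def loocv_mse(x, y):
    """CV_(n): mean squared leave-one-out prediction error."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Fit on all observations except the i-th one.
        x_train, y_train = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        b0, b1 = fit_gamma_log(x_train, y_train)
        # Predict the held-out response on the original scale.
        y_pred = math.exp(b0 + b1 * x[i])
        total += (y[i] - y_pred) ** 2
    return total / n

# Hypothetical data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.7, 2.1, 3.2, 3.9, 5.8, 7.1]
cv_n = loocv_mse(x, y)
```

The model is refitted n times, which is inexpensive for a design of the size considered here.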

The LOOCV estimate of the expected out-of-sample prediction error is an approximately unbiased estimator of the true out-of-sample prediction error, since each model is fitted to \(n-1\) of the n observations, and can be used to compare the predictive performance of different models. The smaller the value of \(CV_{(n)}\), the better the predictive performance of the model. In addition to the LOOCV, we also use the \(R^{2}\) statistic, which measures the proportion of variance in the observed response that is explained by the model.

Definition 7

The \(R^{2}\) statistic is defined as:

$$ R^{2} := 1 - \frac{\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n}(y_{i} - \bar{y})^{2}} , $$

where n is the number of observations, \(y_{i}\) is the observed response for the i-th observation, \(\hat{y}_{i}\) is the predicted response for the i-th observation, and ȳ is the mean of the observed responses.

The \(R^{2}\) statistic typically takes values between 0 and 1, with higher values indicating a better fit of the model to the data; it becomes negative if the model fits the data worse than the constant prediction ȳ. However, the \(R^{2}\) statistic tends to favor models with more predictors, even if the predictors have little or no effect on the response. To address this issue, the adjusted \(R^{2}\) statistic is used, which adjusts \(R^{2}\) for the number of predictors in the model.

Definition 8

The adjusted \(R^{2}\) statistic is defined as:

$$ R_{\mathrm{adj}}^{2} := 1 - \frac{(n-1)}{n - p} \bigl(1-R^{2}\bigr) , $$

where p is the number of regression coefficients in the model (i.e., the number of columns of the design matrix X), n is the number of observations, and \(R^{2}\) is the statistic defined in (14).

The adjusted \(R^{2}\) takes into account the trade-off between model complexity and model fit, and provides a more reliable measure of the model’s predictive performance, compared to the traditional \(R^{2}\), since it also accounts for the number of predictors p.
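Both statistics follow directly from Definitions 7 and 8, as the short Python sketch below illustrates. The observed and predicted values are hypothetical numbers, and p is taken as the number of columns of the design matrix, which is an assumption about how p is counted.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot as in Definition 7."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (n - 1) / (n - p) * (1 - R^2) as in Definition 8."""
    return 1.0 - (n - 1) / (n - p) * (1.0 - r2)

# Hypothetical observed and predicted responses, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(y, y_hat)                          # SS_res = 0.10, SS_tot = 5.0
r2_adj = adjusted_r_squared(r2, n=len(y), p=2)    # penalizes the second coefficient
```

For these values \(R^{2} = 0.98\) and \(R_{\mathrm{adj}}^{2} = 0.97\); the adjusted statistic is never larger than \(R^{2}\) when \(p \geq 1\) and \(R^{2} \leq 1\).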

5 Application to HVOF coating: practical implementation

The HVOF process is influenced by a multitude of variables, making it challenging to identify the most important factors that actually impact coating properties. In this study, a selection of five factors was deliberately chosen, guided by the knowledge of thermal spray experts who identified these variables as significant determinants influencing the HVOF process. Moreover, a well-designed experiment is crucial to efficiently collect data on the effects of various factors on the process outcomes. The selection of an optimal experimental design is essential within the domain of HVOF coating, primarily attributed to the considerable costs and time-intensive nature associated with conducting experiments using coating materials. Furthermore, a carefully planned experimental design enables strategic allocation of available experiments, maximizing information and providing valuable insights within a limited experimental scope.

In industrial processes, statistical design of experiments (DoE) is considered a reliable technique for conducting experiments. A DoE allows for the systematic variation of process variables (= explanatory variables), which enables a more comprehensive understanding of their impact on the outcome. In contrast to the traditional one-factor-at-a-time approach, where interaction effects between two or more explanatory variables cannot be estimated, a DoE approach enables concise mathematical analysis of the resulting data and facilitates the identification of significant factors and their interactions. In addition, DoE allows researchers to investigate complex relationships between explanatory variables, revealing hidden insights and supporting the optimization of industrial processes.

5.1 Central composite design

The central composite design (CCD), a well-established and commonly employed experimental design in the field of industrial process optimization, is utilized in this work to acquire empirical data for the HVOF process. Compared to other designs, the CCD is particularly useful as it can be efficiently used for fitting second-order models [22], i.e., estimating the effects of factors and their interactions in a quadratic form. The design incorporates a systematic variation of pre-defined factors, employing three levels \((-1,0,1)\) for each factor. Additional star points are included to enable the inclusion of quadratic terms in the model [22].

Explanatory variables, which are given in quantitative form, are transformed into qualitative factors. The center point \(x_{0}\), represented by the level 0 for each factor, serves as a reference point and is used to assess the impact of factors on the system. The cube points correspond to the corners of the experimental region, represented by the levels \((-1,1)\). The star points are additional experimental points that are used to estimate the behavior beyond the linear response and to identify potential quadratic effects of factors. These points are positioned at the values α, which are determined for an explanatory variable x as

$$ \alpha = x_{0} \pm \delta _{x}\sqrt{k}, $$

where \(x_{0}\) is the center point, \(\delta _{x}\) the difference between \(x_{0}\) and the quantitative value that corresponds to −1, and k is the number of explanatory variables under consideration. After transformation from quantitative values into qualitative ones, the values \(x_{0}\), \(x_{0}\pm \delta _{x}\), and \(x_{0} \pm \delta _{x}\sqrt{k}\) are replaced by 0, ±1, and \(\pm \sqrt{k}\), respectively. These qualitative values are then used in the design matrix X to represent the explanatory variables (= factors). Figure 3 depicts a Central Composite Design (CCD) with \(k=2\) and \(k=3\) factors.

Figure 3
figure 3

Central Composite Design with a) \(k=2\) factors and b) \(k=3\) factors

The number of factors k, which represent the process variables under investigation (cf. Sect. 5.2.1), directly determines the number of cube and star points in the CCD. Specifically, there are \(2^{k}\) cube points and 2k star points included in the design. Additionally, the CCD consists of \(n_{c}\) center points, where \(n_{c}\) represents the total number of (potentially repeated) center points. To enhance the efficiency and accuracy of the design, a spherical CCD was employed with the choice of \(\alpha = \sqrt{k}\) (in coded units) for the star points. The spherical design allows for the estimation of effects of any factor with equal precision and reduces the risk of overemphasis on any factor. Thus, an optimal balance between precision and stability of the model parameters is obtained, which is important for obtaining reliable estimates of the factor effects and their interactions. As recommended in [22], it is essential to randomize the experimental runs to avoid the influence of uncontrolled sources of variation.
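The structure of the spherical CCD can be made concrete with a short Python sketch that generates the coded design points and a randomized run order. The function names, the random seed, and the example factor (center 100, step 10) are hypothetical and not taken from Table 2; the values \(k=5\) and \(n_{c}=7\) are the ones used in this study.

```python
import itertools
import math
import random

def central_composite_design(k, n_center):
    """Coded points of a spherical CCD: 2^k cube, 2k star, n_center center runs."""
    alpha = math.sqrt(k)  # star distance in coded units (spherical design)
    cube = [list(pt) for pt in itertools.product((-1.0, 1.0), repeat=k)]
    star = []
    for j in range(k):
        for sign in (-1.0, 1.0):
            pt = [0.0] * k
            pt[j] = sign * alpha
            star.append(pt)
    center = [[0.0] * k for _ in range(n_center)]
    return cube + star + center

def to_natural_units(coded, x0, delta):
    """Map a coded level to the quantitative setting x0 + delta * coded."""
    return x0 + delta * coded

design = central_composite_design(k=5, n_center=7)
n_runs = len(design)  # 2**5 + 2*5 + 7 = 49

# Randomized run order; the seed is fixed only to make the sketch reproducible.
run_order = list(range(n_runs))
random.Random(0).shuffle(run_order)

# Hypothetical factor with center 100 and step 10 (not taken from Table 2):
# the star points lie at 100 -/+ 10 * sqrt(5).
star_levels = [to_natural_units(s * math.sqrt(5), 100.0, 10.0) for s in (-1, 1)]
```

All cube and star points have the same Euclidean distance \(\sqrt{k}\) from the center in coded units, which is the defining property of the spherical design.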

5.2 Experimental setup

5.2.1 Identification of influencing factors

Based on a review of the literature [27, 31, 38], previous one-factor-at-a-time experiments, and expert knowledge by thermal spray coating experts of voestalpine TSM [1], five key factors, which are described in the next paragraph, were identified for systematic variation: powder feed rate (PFR), stand off distance (SOD), lambda (λ), i.e. the stoichiometric ratio of oxygen to fuel, coating velocity (CV), and total gas flow (TGF). The schematic diagram in Fig. 4 provides a comprehensive visual representation of the considered key process factors. Using these \(k=5\) factors and conducting \(n_{c}=7\) replications at the central point, a total of 49 trials were carried out, forming the CCD. The experiments were conducted using a rotational setup that included a turning lathe, allowing for the application of the thermal spray coatings (cf. Fig. 5).

Figure 4
figure 4

Illustration of the considered key factors in the HVOF coating process

Figure 5
figure 5

Photograph illustrating the experimental setup during the HVOF coating process, showing the robot, turning lathe, and coating stream in action

The selected factors play a critical role in the HVOF coating process, exerting significant influence on the quality and performance of the resultant coatings. The PFR governs the amount of coating material supplied, while the SOD regulates the spacing between the spray gun and the substrate. The stoichiometric ratio of oxygen to fuel (λ) ensures specific combustion conditions. Furthermore, the CV, determined by the combined influence of the robot traverse speed and the rotational speed of the turning lathe (cf. Fig. 4), enables precise control over the deposition process. Finally, the TGF is constituted by the summed gas flow of fuel, oxygen, and air, collectively governing the overall flow rate of the combustion gases.

Each of the five factors is accompanied by a designated set of predefined levels of variation, which are listed in Table 2. These levels were determined to cover a range of values that would effectively capture the variability and impact of these factors on the desired coating properties. The chosen levels allow for a systematic and comprehensive exploration of the parameter space.

Table 2 Levels of key factors for HVOF coating experiments depicted in Fig. 3

5.2.2 Experimental procedures

The HVOF coatings were produced using Oerlikon Metco thermal spraying equipment, namely the DJ 2700 gas-fuel HVOF system with water-cooled gun assembly. The fuel gas used for these tests was propane, with its amount and ratio defined by the two key factors TGF and Lambda. For the process preparation, steel plates of type 1.4404 were welded onto an axis mounted on a turning lathe for rotational spraying. All samples were degreased with acetone and sandblasted with alumina before thermal spraying. The powder used for the spraying process was an agglomerated sintered tungsten carbide powder (WC-10Co-4Cr) with a grain size in the range of −45 + 15 μm, supplied by Oerlikon Metco. The photograph in Fig. 5 shows the experimental setup during the HVOF coating process, with the robot, turning lathe, and coating stream in action.

Additional thermal spraying attributes like cooling, powder feed gas, pressure and number of passes, i.e., number of times the coating material was applied or sprayed onto the substrate during each experimental run, were kept constant throughout the experiments. In addition to the classic coating properties such as roughness, porosity, layer thickness, and surface hardness, the deposition rate, deposition efficiency, and in-flight particle properties such as particle velocity and particle temperature of the powder particles were measured.

Two different in-situ measurements were performed in the course of these trials: in-situ particle characterization and in-situ pyrometric temperature measurement. The particle characteristics were measured using a Spraywatch camera with the software SW4 (supplied by Oseir). The temperature of the sample surface was constantly measured using a Keller pyrometer.

The surface roughness of the sprayed samples was measured using a mobile roughness tester Hommel Etamic Waveline W5. Coating hardness was assessed on the surface using a Cisam-Ernst S.r.l E-Computest mobile hardness tester, using a spheroconical diamond at a load of 5 kg and a testing time of 2 seconds. In addition to the surface characterization, cross-sections of each sample were prepared (according to internal preparation procedure WC) to analyse the coating thickness. The coating thickness and the coating porosity were determined using image analysis software, IMS Client, applied to microscopic images captured with a Zeiss Axio Observer.Z1m. Figure 6 shows the observed variations in coating thickness, as captured in the microscopic images acquired from the IMS Client software.

Figure 6
figure 6

Microscopic images obtained from the IMS Client software, showing the observed variations in coating thickness across the cross-sectional profiles of the sprayed samples

6 Empirical results of experiments in HVOF thermal spraying

In this section, the empirical findings derived from the comprehensive analysis of the experimental data are presented, demonstrating the effectiveness and utility of the gamma regression approach in analyzing the relationships between key process variables and coating properties. The analysis and modelling were performed using the statistical software R (version 4.2.2). The gamma regression models were implemented using the glm function [28] with the Fisher scoring algorithm for estimating the regression coefficients \(\hat{\boldsymbol{\beta}}\) (= ML estimates).
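To make the estimation step concrete, the following is a minimal sketch of Fisher scoring for a gamma regression with a single predictor. It assumes a log link, which may differ from the link function used in the study, and it is not the R `glm` implementation; the function name and data are hypothetical. With the log link the Fisher weights are constant, so each scoring step reduces to an ordinary least-squares fit of the working response.

```python
import math


def fit_gamma_log_link(x, y, n_iter=25, tol=1e-10):
    """Fisher scoring for a gamma GLM with log link and one predictor.

    With the log link the Fisher weights are constant, so every step is an
    ordinary least-squares fit of the working response z on (1, x).
    """
    n = len(x)
    eta = [math.log(yi) for yi in y]  # start from eta = log(y)
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response: z = eta + (y - mu) * d(eta)/d(mu) = eta + (y - mu) / mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Unweighted least squares of z on (1, x).
        x_bar = sum(x) / n
        z_bar = sum(z) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
        new_b1 = sxz / sxx
        new_b0 = z_bar - new_b1 * x_bar
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        eta = [b0 + b1 * xi for xi in x]
        if converged:
            break
    return b0, b1


# Hypothetical noise-free data generated from mu = exp(0.5 + 0.3 * x).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]
b0, b1 = fit_gamma_log_link(x, y)
print(round(b0, 6), round(b1, 6))
```

On noise-free data the iteration recovers the generating coefficients; with real measurements the same loop would converge to the ML estimates under the assumed link.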

The properties listed in Table 1 serve as an overview and exemplify various potentially relevant properties of the HVOF process. This study focuses on analyzing 8 properties, which include in-flight characteristics (particle velocity and particle temperature), performance metrics (deposition rate and deposition efficiency), and coating attributes (thickness, roughness, hardness, and porosity). Analysis of the phase composition is typically not relevant for the coating material used. It should be emphasized, however, that the presented methodology holds equal relevance for the analysis of the other response variables in Table 1.

Table 3 provides a detailed illustrative example of the estimated regression coefficients and their corresponding standard errors of the deposition rate model and the deposition efficiency model. Additional tables containing analogous information for the remaining response variables can be found in the Appendix, specifically Tables 5, 6, and 7. These tables contain two kinds of models for each property, a full and a reduced version. The full model encompasses all predictor variables that can be estimated by utilizing the CCD methodology, while the reduced model is derived through variable selection based on criteria such as the AIC and hypothesis testing for coefficient relevance, as described in Sects. 3.2.5 and 3.2.6. The model selection procedure involved the following steps: Initially, the full model was constructed, and non-significant coefficients were iteratively eliminated in a backward direction using the AIC as the guiding metric. If the removal of a coefficient resulted in a reduction in the AIC, the significance of the respective predictor was reassessed through a hypothesis test in consultation with thermal spray technicians. This consultation aimed to assess the practical significance of the coefficients in the context of thermal coating processes. Subsequently, a decision was made regarding the justification for the non-relevance of the coefficient, leading to its exclusion from the model. This approach was adopted due to observations indicating that the statistical power for many model coefficients fell below the recommended threshold of 0.8, as suggested by Cohen [6].
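The AIC-driven part of this procedure can be sketched as a greedy backward elimination. The snippet below is only an illustration of that one step: the hypothesis test and the consultation with technicians are not modelled, and `aic_of` is a hypothetical callback standing in for refitting the gamma regression. The toy cost table and all numbers are made up.

```python
def backward_select(terms, aic_of):
    """Greedy backward elimination: drop one term at a time while AIC falls.

    terms  -- list of candidate term names in the full model
    aic_of -- callable returning the AIC of the model with the given terms
    """
    current = list(terms)
    current_aic = aic_of(current)
    while len(current) > 1:
        # Evaluate every model that omits exactly one of the current terms.
        candidates = [
            (aic_of([t for t in current if t != dropped]), dropped)
            for dropped in current
        ]
        best_aic, dropped = min(candidates)
        if best_aic >= current_aic:
            break  # no single removal lowers the AIC any further
        current.remove(dropped)
        current_aic = best_aic
    return current, current_aic


# Toy stand-in for refitting a GLM: a fixed lack-of-fit cost per omitted term
# plus the usual penalty of 2 per retained term. All numbers are made up.
cost_if_dropped = {"PFR": 30.0, "TGF": 12.0, "PFR:SOD": 0.5, "CV^2": 1.0}


def toy_aic(kept):
    misfit = sum(c for t, c in cost_if_dropped.items() if t not in kept)
    return misfit + 2 * len(kept)


selected, aic = backward_select(list(cost_if_dropped), toy_aic)
print(selected, aic)  # ['PFR', 'TGF'] 5.5
```

In practice each candidate removal flagged by the AIC would then be reviewed, as described above, before the term is actually excluded.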

Table 3 Estimated regression coefficients and standard errors (in brackets) for deposition rate and deposition efficiency models

Each row in Table 3 corresponds to a specific predictor variable, i.e., main and quadratic effects of PFR, SOD, Lambda, CV, and TGF and interaction effects between them. The associated coefficients (= ML estimates) indicate the magnitude and direction of the predictor impact on the deposition rate and deposition efficiency. The values in parentheses next to the coefficients denote the respective standard errors. Together, the regression coefficients and standard errors enable an assessment of the statistical significance of the associations between the predictor variables and the coating properties. The significance levels of the regression coefficients (0.001, 0.01, 0.05, and 0.1) are marked by asterisks in the table, where lower levels (i.e., more stars) indicate stronger statistical significance. Effects that do not exhibit significance symbols in the reduced model are considered to be of marginal relevance and have been retained only because of their potential importance based on domain expertise.
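How a coefficient and its standard error translate into a significance level can be sketched with a Wald test. The snippet below uses a standard normal reference distribution, which is an assumption: software such as R typically reports a t-based test for gamma models because the dispersion is estimated. The function name and the numerical inputs are hypothetical and not taken from Table 3.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0, normal approximation."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical coefficient and standard error, not taken from Table 3.
p = wald_p_value(estimate=0.42, std_error=0.15)
print(round(p, 4))
```

A coefficient that is 2.8 standard errors away from zero, as in this made-up case, would fall below the 0.01 level under the normal approximation.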

To evaluate the goodness-of-fit and performance of the models, the log-likelihood values play a crucial role. The full models attain higher log-likelihood values of −91.492 and 109.118, compared to −96.371 and 103.878 for the reduced models, indicating a closer fit to the observed data. The reduced models, however, exhibit lower AIC values of 214.742 and −187.756, compared to 226.984 and −174.236 for the full models. These AIC values in Table 3 indicate that the reduced model is favored over the full model in terms of the trade-off between model complexity and goodness-of-fit for both the deposition rate and the deposition efficiency: although the full model provides the better raw fit, the AIC penalizes its additional parameters.
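The interplay between log-likelihood and AIC can be checked directly from the reported numbers. The sketch below assumes the standard definition AIC = −2ℓ + 2p and solves it for the parameter count p; it also assumes that the first value of each reported pair refers to the deposition rate and the second to the deposition efficiency. Whether p includes a dispersion parameter is not stated in this section.

```python
def implied_parameter_count(log_lik, aic):
    """Solve AIC = -2 * log_lik + 2 * p for p."""
    return (aic + 2.0 * log_lik) / 2.0


# (log-likelihood, AIC) pairs as quoted in the text for Table 3.
reported = {
    ("deposition rate", "full"): (-91.492, 226.984),
    ("deposition rate", "reduced"): (-96.371, 214.742),
    ("deposition efficiency", "full"): (109.118, -174.236),
    ("deposition efficiency", "reduced"): (103.878, -187.756),
}

implied = {
    key: round(implied_parameter_count(ll, aic))
    for key, (ll, aic) in reported.items()
}
for key, p in implied.items():
    print(key, p)
```

For the deposition rate, the reduced model loses about 9.8 in −2ℓ but saves 22 in penalty (22 versus 11 implied parameters), which is why its AIC is lower despite the smaller log-likelihood.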

Consistent with these findings, the supplementary Tables 5, 6, and 7 in the appendix uniformly show similar results regarding the AIC values and log-likelihoods. Notably, these results consistently favor the reduced models, indicating their ability to achieve a better balance between model complexity and goodness-of-fit. Moreover, across all regression models, each of the five explanatory variables demonstrates significant effects, providing robust evidence for their appropriate selection. Interestingly, the squared effects of individual factors exhibit greater statistical significance compared to the interaction effects. In addition, the results indicate that only the effects of Lambda and TGF are consistently significant across all models, suggesting their shared dependence. This finding also highlights the intricate nature of the relationships involved. For instance, despite the expected correlation between deposition efficiency and deposition rate, it becomes apparent that these two properties cannot be adequately explained by the same set of parameters. This observation further emphasizes the technical challenges involved in handling and managing these interdependencies.

To ascertain whether the reduced model exhibits superior predictive performance compared to the full model, the metrics introduced in Sect. 4 are computed for each model individually. Table 4 summarises the outcomes of the gamma regression analysis for in-flight properties (velocity and temperature), performance indicators (deposition rate and deposition efficiency), and coating properties (thickness, roughness, hardness, and porosity). Once again, the outcomes of both the full and reduced models are presented, highlighting their ability to model the studied properties. In addition to the number of coefficients \(N_{p}\) and the model selection criterion AIC as in the preceding Table 3, this table also incorporates important performance metrics, namely \(R^{2}\), \(R^{2}_{\mathrm{adj}}\), and \(CV_{(n)}\), to measure the predictive quality of the regression models, as described in Sect. 4.
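As a reminder of how a leave-one-out error such as \(CV_{(n)}\) is obtained, the following sketch refits a model \(n\) times, each time predicting the single held-out observation. It assumes that \(CV_{(n)}\) is the mean squared LOOCV prediction error; the exact definition is the one given in Sect. 4. A least-squares line is used as a stand-in model instead of a gamma regression, and all names and data are hypothetical.

```python
def loocv_mse(x, y, fit, predict):
    """Leave-one-out CV: refit without point i, predict it, average squared errors."""
    n = len(y)
    squared_errors = []
    for i in range(n):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(x_train, y_train)
        squared_errors.append((y[i] - predict(model, x[i])) ** 2)
    return sum(squared_errors) / n


# Stand-in model: a straight line fitted by least squares (not a gamma GLM).
def fit_line(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def predict_line(model, x_new):
    intercept, slope = model
    return intercept + slope * x_new


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.1]  # made-up observations
cv = loocv_mse(x, y, fit_line, predict_line)
print(cv)
```

Because every prediction is made for a point the model has not seen, this error measures out-of-sample performance, unlike \(R^{2}\), which is computed on the fitted data.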

Table 4 Results of the gamma regression analysis for in-flight properties (velocity and temperature), performance indicators (deposition rate and deposition efficiency), and coating properties (thickness, roughness, hardness, and porosity) using full and reduced models
Table 5 Estimated regression coefficients and standard errors (in brackets) for particle velocity and particle temperature models
Table 6 Estimated regression coefficients and standard errors (in brackets) for coating thickness and coating roughness models
Table 7 Estimated regression coefficients and standard errors (in brackets) for surface hardness and coating porosity models

Concerning the in-flight properties, both the full and reduced models demonstrate favorable results. The full model for particle velocity exhibits a high \(R^{2}\) value of 0.94, indicating a strong fit to the observed data. However, taking into account the number of predictors \(N_{p}\) in the model, it is advisable to consider the adjusted \(R^{2}\) value of 0.89, which accounts for the model’s complexity. Conversely, the reduced model for velocity yields a slightly lower \(R^{2}\) value of 0.93, yet a higher adjusted \(R^{2}\) value of 0.92 compared to the full model. These findings, coupled with lower values of the Akaike Information Criterion AIC and reduced out-of-sample prediction error \(CV_{(n)}\), suggest that the reduced model offers superior predictive performance. Similar patterns emerge for the regression models investigating the other target variables in Table 4, consistently favoring the reduced models. According to the adjusted \(R^{2}\), all reduced models demonstrate a good fit to the data, with values exceeding 0.8. However, the model for coating porosity yields less satisfactory results. This observation may be attributed to the volatile nature of the porosity measurement technique using Image Analysis, as discussed in [3].
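The effect of the complexity adjustment can be illustrated with the textbook formula for the adjusted \(R^{2}\). This is a sketch under assumptions: the definition used in Sect. 4 may differ for GLMs, and the count of 20 predictors for the full model (5 main, 5 quadratic, and 10 interaction effects) is inferred rather than quoted.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Textbook adjusted R^2 with an intercept: 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# n = 49 trials; 20 predictors is an assumed count for a full second-order
# model in five factors (5 main + 5 quadratic + 10 interaction effects).
full = adjusted_r2(r2=0.94, n_obs=49, n_predictors=20)
print(round(full, 3))  # 0.897
```

The result is close to the reported value of 0.89; the small gap is compatible with rounding of the reported \(R^{2}\) or with a slightly different definition.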

In addition to the findings in Table 4, Fig. 7 provides a visual representation of the deposition rate predictions obtained from the full and reduced models. Each data point on the scatter plot represents an experimental trial, where the y-axis corresponds to the observed values, and the x-axis represents the LOOCV predictions (refer to Sect. 4). The color-coded points differentiate between the center points, cube points, and star points obtained from the CCD.

Figure 7
figure 7

Scatter plot showing the comparison between observed values and LOOCV predictions for deposition rate using the full and reduced models. The data points are color-coded based on the corresponding center points (blue), cube points (green), and star points (yellow) from the Central Composite Design

Upon analysis of the scatter plot, it is evident that the reduced model yields predictions that are closer to the diagonal line, indicating a higher degree of agreement between the predicted and observed values. This closer alignment implies a more accurate prediction of the deposition rate by the reduced model compared to the full model. Moreover, as expected, the prediction accuracy varies across different design points. The star points (yellow) demonstrate relatively lower predictive accuracy compared to the center points (blue) and the cube points (green), although this discrepancy is observed only in a subset of star points. This outcome underscores the challenges associated with extrapolating the model’s behavior to regions outside the training data, emphasizing the need for caution when interpreting predictions for such points.

Overall, Fig. 7 provides strong evidence supporting the superior predictive performance of the reduced model in estimating the deposition rate. The analysis of these visual results further strengthens the findings presented in Table 4, reinforcing the advantages of employing the reduced model in understanding and predicting the deposition rate more accurately. Furthermore, similar findings regarding the superior predictive performance of the reduced model are also observed for the other analyzed target variables.

7 Conclusion

This study proposed a framework for modelling and predicting critical target variables in HVOF coating processes. By utilizing DoE and GLMs, accurate estimation of model parameters was achieved through maximum likelihood estimation. The framework incorporated a careful selection of predictor variables based on their significance and contribution to the coating properties, enhancing the model’s interpretability and predictive performance. The application of this framework to experimental data from thermal spray coating experiments demonstrated its effectiveness in predicting target variables and providing insights into the relationships between factors and coating properties. The systematic variable selection process helps identify the most influential factors and eliminates irrelevant or redundant variables, simplifying the modelling process and improving the accuracy of predictions. The proposed framework has the potential to optimize thermal spray coating processes and contribute to the development of more efficient coating technologies in various industries. By developing a comprehensive understanding of the intricate interplay among process variables, material properties, and coating microstructure, manufacturers can enhance the functionality and performance of coated surfaces. This, in turn, can lead to improved product quality, extended component lifespan, and reduced maintenance costs.

In future investigations, we intend to expand our framework by introducing additional variables and exploring their interactions. This includes varying factors that were previously held constant, such as process gas pressures, across different levels to assess their impact. Furthermore, we plan to compare gamma regression models with alternative statistical models that require distinct distributional assumptions. While our previous work revealed the effectiveness of machine learning algorithms, particularly support vector machines, in predicting HVOF-related properties [29], we aim to explore a hybrid approach combining these methodologies to further enhance predictive accuracy. Given the dynamic nature of the HVOF process, where process variables often deviate from target values, sensors will be installed within the booth to monitor these variations. Using advanced modeling techniques, this sensor data together with additional quantitative features will be used to improve predictive capabilities. Further experiments will be conducted to ensure that a sufficient number of samples is available. Following satisfactory performance in predicting coating quality properties related to WC-CoCr, the framework will be extended to other coating materials and their associated characteristics.

Overall, this study provides a systematic and data-driven approach to modeling and predicting coating properties in thermal spray coating. By leveraging this framework, researchers and practitioners can advance the understanding and optimization of thermal spray processes, leading to advancements in surface technology and its applications across industries. The variable selection process improves prediction accuracy and facilitates informed decision-making in the coating optimization process, contributing to the overall improvement of coating methodologies.

Data availability

The datasets generated and/or analyzed during the current study are proprietary and not publicly available due to company confidentiality. Interested parties may request access by contacting the corresponding author; requests will be considered on a case-by-case basis, subject to company compliance and confidentiality agreements.


References

  1. Technischer Service der voestalpine Stahl GmbH. Accessed: 2023-05-23.

  2. Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19(6):716–23.

  3. Ang ASM, Berndt CC. A review of testing methods for thermal spray coatings. Int Mater Rev. 2014;59(4):179–223.

  4. Becker A, Fals HD, Roca AS, Siqueira IB, Caliari FR, da Cruz JR, Vaz RF, de Sousa MJ, Pukasiewicz AG. Artificial neural networks applied to the analysis of performance and wear resistance of binary coatings Cr3C237WC18M and WC20Cr3C27Ni. Wear. 2021;477:203797.

  5. Cohen J. Statistical power analysis. Curr Dir Psychol Sci. 1992;1(3):98–101.

  6. Cohen J. Statistical power analysis for the behavioral sciences. San Diego: Academic Press; 2013.

  7. Davis JR et al. Handbook of thermal spray technology. Materials Park: ASM International; 2004.

  8. Dongmo E, Wenzelburger M, Gadow R. Analysis and optimization of the HVOF process by combined experimental and numerical approaches. Surf Coat Technol. 2008;202(18):4470–8.

  9. Fahrmeir L, Kaufmann H. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Ann Stat. 1985;13(1):342–68.

  10. Fahrmeir L, Kneib T, Lang S, Marx B. Regression: models, methods and applications. Berlin: Springer; 2013.

  11. Fauchais PL, Heberlein JV, Boulos MI. Thermal spray fundamentals: from powder to part. Berlin: Springer; 2014.

  12. Gu S, Eastwick C, Simmons K, McCartney D. Computational fluid dynamic modeling of gas flow characteristics in a high-velocity oxy-fuel thermal spray system. J Therm Spray Technol. 2001;10:461–9.

  13. Hardin JW, Hilbe JM. Generalized linear models and extensions. Stata Press; 2007.

  14. Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. Berlin: Springer; 2009.

  15. Herman H, Sampath S, McCune R. Thermal spray: current status and future trends. Mater Res Soc Bull. 2000;25(7):17–25.

  16. Ibrahim JG, Laud PW. On Bayesian analysis of generalized linear models using Jeffreys’s prior. J Am Stat Assoc. 1991;86(416):981–6.

  17. Jalali AM, Salehi M. Fracture toughness of HVOF thermally sprayed WC-12Co coating in optimized particle temperature. Int J Adv Manuf Technol. 2017.

  18. Kuhnt S, Rehage A, Becker-Emden C, Tillmann W, Hussong B. Residual analysis in generalized function-on-scalar regression for an HVOF spraying process. Qual Reliab Eng Int. 2016;32(6):2139–50.

  19. Li M, Christofides PD. Modeling and control of high-velocity oxygen-fuel (HVOF) thermal spray: a tutorial review. J Therm Spray Technol. 2009;18:753–68.

  20. Liu M, Yu Z, Zhang Y, Wu H, Liao H, Deng S. Prediction and analysis of high velocity oxy fuel (HVOF) sprayed coating using artificial neural network. Surf Coat Technol. 2019;378:124988.

  21. Mojena MAR, Roca AS, Zamora RS, Orozco MS, Fals HC, Lima CRC. Neural network analysis for erosive wear of hard coatings deposited by thermal spray: influence of microstructure and mechanical properties. Wear. 2017;376:557–65.

  22. Montgomery D. Design and analysis of experiments. 8th ed. New York: Wiley; 2012.

  23. Nelder JA, Wedderburn RW. Generalized linear models. J R Stat Soc, Ser A, Stat Soc. 1972;135(3):370–84.

  24. Palanisamy K, Gangolu S, Antony JM. Effects of HVOF spray parameters on porosity and hardness of 316L SS coated Mg AZ80 alloy. Surf Coat Technol. 2022;448:128898.

  25. Pan J, Hu S, Yang L, Ding K, Ma B. Numerical analysis of flame and particle behavior in an HVOF thermal spray process. Mater Des. 2016;96:370–6.

  26. Prasanna N, Siddaraju C, Shetty G, Ramesh M, Reddy M. Studies on the role of HVOF coatings to combat erosion in turbine alloys. Mater Today Proc. 2018;5(1):3130–6.

  27. Pukasiewicz A, De Boer H, Sucharski G, Vaz R, Procopiak L. The influence of HVOF spraying parameters on the microstructure, residual stress and cavitation resistance of FeMnCrSi coatings. Surf Coat Technol. 2017;327:158–66.

  28. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2022.

  29. Rannetbauer W, Hambrock C, Hubmer S, Ramlau R. Enhancing predictive quality in HVOF coating technology: a comparative analysis of machine learning techniques. Proc Comput Sci. 2024;232:1377–87.

  30. Ribu DC, Rajesh R, Thirumalaikumarasamy D, Kaladgi AR, Saleel CA, Nisar KS, Shaik S, Afzal A. Experimental investigation of erosion corrosion performance and slurry erosion mechanism of HVOF sprayed WC-10Co coatings using design of experiment approach. J Mater Res Technol. 2022;18:293–314.

  31. Saaedi J, Coyle T, Arabi H, Mirdamadi S, Mostaghimi J. Effects of HVOF process parameters on the properties of Ni-Cr coatings. J Therm Spray Technol. 2010;19:521–30.

  32. Tabbara H, Gu S, McCartney D. Computational modelling of titanium particles in warm spray. Comput Fluids. 2011;44(1):358–68.

  33. Tillmann W, Kuhnt S, Baumann IT, Kalka A, Becker-Emden EC, Brinkhoff A. Statistical comparison of processing different powder feedstock in an HVOF thermal spray process. J Therm Spray Technol. 2022;31(5):1476–89.

  34. Tyagi A, Murtaza Q, Walia R. Evaluation of the residual stress of HVOF sprayed carbon coating after wear testing conditions using ANN coupled Taguchi approach. Surf Topogr Metrol Prop. 2021;9(3):035027.

  35. Wedderburn RWM. On the existence and uniqueness of the maximum likelihood estimates for certain generalized linear models. Biometrika. 1976;63(1):27–32.

  36. Wong TT. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognit. 2015;48(9):2839–46.

  37. Zhang G, Kanta AF, Li WY, Liao H, Coddet C. Characterizations of AMT-200 HVOF NiCrAlY coatings. Mater Des. 2009;30(3):622–7.

  38. Zhao L, Maurer M, Fischer F, Dicks R, Lugscheider E. Influence of spray parameters on the particle in-flight properties and the properties of HVOF coating of WC-CoCr. Wear. 2004;257(1–2):41–6.


The authors gratefully acknowledge voestalpine Stahl GmbH for their support through the research center, provision of materials, and financial contribution to this investigation.


SH and RR are funded by the Austrian Science Fund (FWF): F6805-N36 within the SFB F68 “Tomography Across the Scales”.

Author information

WR conceived and designed the study, gathered the data, performed the analysis, estimation, and modeling, and wrote the manuscript. The experiments were performed by WR and CH. CH contributed to data interpretation and provided critical revisions. SH and RR supported with theoretical knowledge and provided critical revisions. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Wolfgang Rannetbauer.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix:  Supplementary tables of estimated regression coefficients and standard errors

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Rannetbauer, W., Hubmer, S., Hambrock, C. et al. Predictive modelling of critical variables for improving HVOF coating using gamma regression models. J.Math.Industry 14, 7 (2024).
