
Selçuk Journal of Applied Mathematics, Special Issue, pp. 65-72, 2010.

On Multicollinearity in Nonlinear Regression Models

Ali Erkoç1, Müjgân Tez2, Kadri Ulaş Akay3

1,2Marmara University, Faculty of Arts and Sciences, Department of Statistics, Göztepe-Kadıköy, 34722, İstanbul, Türkiye

e-mail: 1ali66_math@hotmail.com, 2mtez@marmara.edu.tr

3İstanbul University, Faculty of Science, Department of Mathematics, 34134 Vezneciler, İstanbul, Türkiye

e-mail: kulas@istanbul.edu.tr

Presented in 2National Workshop of Konya Ere˘gli Kemal Akman College, 13-14 May 2010.

Abstract. Regression analysis includes many techniques for modeling and analyzing the relationship between a dependent variable and one or more independent variables. Linear and nonlinear regression models are widely used in many fields of applied science. One of the most frequent problems in regression analysis is multicollinearity between the explanatory variables. If there is no linear (or approximately linear) relationship between the regressors, they are said to be orthogonal. In the case of orthogonal variables, statistical inference on the model is quite reliable. In real life, however, explanatory variables that are completely unrelated to one another are rare. When the explanatory variables are not orthogonal, the least squares parameter estimation method will not converge suitably, and deviations from the true parameters will occur. For the linear model, many techniques have been developed for the multicollinearity problem (Hoerl, A.E. (1962), Hoerl, A.E. and Kennard, R.W. (1968, 1970)), but for nonlinear models there has not been any conclusive work yet. In this study, multicollinearity in nonlinear models will be analyzed and a remedy for the problem will be given.

Key words: Nonlinear models, Multicollinearity, Ridge regression, Mean square error.

2000 Mathematics Subject Classification: 62J02, 62J07.

1. Introduction

Regression analysis is widely used in many areas (data analysis, data mining, etc.). In general, engineers and scientists often make use of regression analysis to analyze and interpret data sets. The regression models used may be linear or


nonlinear. In many cases, linear regression analysis can be used to construct the model, although it is not suitable in every case. There are many known problems in science and engineering that are based on nonlinear models. For example, chemical engineers use the nonlinear model $y = \theta_1 x / (\theta_2 + x) + \varepsilon$, named the Michaelis-Menten model, to investigate the effect of explanatory variables on the response variable [7].
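As a concrete illustration of such a nonlinear model, the following minimal Python sketch (not part of the original paper) simulates data from the Michaelis-Menten model and fits it with SciPy's general-purpose nonlinear least squares routine; the data, starting values, and parameter values are all hypothetical.

```python
# A minimal sketch (illustrative only): simulate data from the Michaelis-Menten
# model y = theta1 * x / (theta2 + x) + eps and fit it by nonlinear least squares.
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(x, theta1, theta2):
    """Expectation function f(x, theta) of the Michaelis-Menten model."""
    return theta1 * x / (theta2 + x)

rng = np.random.default_rng(0)
x = np.linspace(0.1, 5.0, 30)                                      # hypothetical regressor levels
y = michaelis_menten(x, 2.0, 0.5) + rng.normal(0.0, 0.05, x.size)  # simulated response

theta_hat, theta_cov = curve_fit(michaelis_menten, x, y, p0=[1.0, 1.0])
print("estimated parameters:", theta_hat)
```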

One of the problems often encountered in multiple regression analysis, called multicollinearity, is that the explanatory variables are related to each other. Although this is an undesirable situation, it is one of the most common problems encountered in real life. For the multicollinearity problem in linear models, many techniques have been developed. Hoerl (1962), Hoerl and Kennard (1970, 1974), Hoerl, Kennard and Baldwin (1975), Marquardt (1970), and Montgomery, Peck and Vining (2001) have conducted studies in this direction. Ridge regression, proposed by Hoerl and Kennard, is one of the most widely used of these techniques. Although many studies have been conducted on the solution of the problem in linear models, multicollinearity in nonlinear models still needs to be investigated.

In this study, we aim to examine multicollinearity in nonlinear models. Before the multicollinearity problem in nonlinear models is studied, multicollinearity in linear models will be reviewed in Section 2. In Section 3, the parameter estimation process in the nonlinear model will be examined and a new approach will be proposed as an alternative to the least squares estimation method. In the last section we will give some results related to the proposed approach.

2. Overview of Linear Models

In general, the multiple linear model is given by

(1) $y = X\beta + \varepsilon$

where $y$ is an $n \times 1$ vector of observations, $X$ is an $n \times p$ matrix of the levels of the regressor variables, $\beta$ is a $p \times 1$ vector of regression coefficients, and $\varepsilon$ is an $n \times 1$ vector of random errors. In equation (1), it is assumed that the error term $\varepsilon$ has a normal distribution with zero mean and variance $\sigma^2$.

The least squares method is one of the most widely used methods to estimate the parameter $\beta$ in equation (1) [7]. Minimizing the sum of squared errors

$S(\beta) = \sum_{i=1}^{n} \varepsilon_i^2 = \varepsilon'\varepsilon = (y - X\beta)'(y - X\beta)$

gives the least squares estimator of $\beta$,

(2) $\hat{\beta} = (X'X)^{-1}X'y.$

The least squares estimator $\hat{\beta}$ given by equation (2) is the unbiased linear estimator with minimum variance.
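The following minimal sketch (not from the paper) computes the least squares estimator of equation (2) on simulated data; all names and values are illustrative.

```python
# A minimal sketch of the least squares estimator in equation (2):
# beta_hat = (X'X)^{-1} X'y, computed on simulated data.
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3
X = rng.normal(size=(n, p))               # design matrix (n x p)
beta = np.array([1.0, -2.0, 0.5])         # "true" coefficients for the simulation
y = X @ beta + rng.normal(0.0, 0.1, n)    # response with random error

# Solve the normal equations X'X b = X'y (numerically safer than inverting X'X).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print("least squares estimate:", beta_hat)
```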


2.1. Multicollinearity Problem

In multiple regression models, particularly in observational studies, a common problem that emerges is that some explanatory variables are related to each other. Let the $j$-th column of the $X$ matrix be denoted $x_j$, so that $X = [x_1, x_2, \ldots, x_p]$. We may formally define multicollinearity in terms of linear dependence of the columns of $X$. The vectors $x_1, x_2, \ldots, x_p$ are linearly dependent if there is a set of constants $t_1, t_2, \ldots, t_p$, not all zero, such that $\sum_{j=1}^{p} t_j x_j \approx 0$.

2.2. Effects of Multicollinearity

The presence of multicollinearity in the design matrix seriously affects the least squares estimation of the parameters [6]. The variances of the parameters are given by

(3) $\operatorname{Var}(\hat{\beta}_j) = \sigma^2 \dfrac{1}{1 - R_j^2}, \qquad (R_j^2 \le 1), \quad j = 1, 2, \ldots, p,$

where $\hat{\beta}_j$ is the least squares estimator of $\beta_j$. In equation (3), $\frac{1}{1 - R_j^2}$ is the $j$-th diagonal element of the $(X'X)^{-1}$ matrix. Also, $R_j^2$ is the coefficient of multiple determination from the regression of $x_j$ on the remaining $p - 1$ regressor variables. If there is strong multicollinearity between $x_j$ and any other regressors, the value of $R_j^2$ will be close to unity ($R_j^2 \to 1$), and the diagonal elements of the variance-covariance matrix will grow considerably ($\frac{1}{1 - R_j^2} \to \infty$).

The mean square error is widely used to compare different estimators of a parameter or to measure an estimator's proximity to the real parameter [1], [11]. To show how the least squares estimator changes in the presence of multicollinearity in the model, the mean square error

(4) $E(L_1^2) = E[(\hat{\beta} - \beta)'(\hat{\beta} - \beta)] = \operatorname{MSE}(\hat{\beta}) = \sum_{j=1}^{p} E(\hat{\beta}_j - \beta_j)^2 = \sum_{j=1}^{p} \operatorname{Var}(\hat{\beta}_j) = \sigma^2 \operatorname{tr}(X'X)^{-1} = \sigma^2 \sum_{j=1}^{p} \dfrac{1}{\lambda_j}, \qquad \lambda_j > 0, \quad j = 1, 2, \ldots, p,$

can be used. In equation (4), $L_1^2 = (\hat{\beta} - \beta)'(\hat{\beta} - \beta)$ is the squared distance from the estimator to the real parameter, and the values $\lambda_j$ are the eigenvalues of the $X'X$ correlation matrix. If the $X'X$ correlation matrix is ill-conditioned because of the multicollinearity in the design matrix, at least one $\lambda_j$ will be quite small, and this situation will move the estimated values away from the real parameter [6].

Examining the correlation matrix, variance inflation factors, and the eigenvalue-eigenvector analysis of the correlation matrix are widely used in the literature for diagnosing multicollinearity [1], [7]. In general, multicollinearity is due to model selection. For example, including explanatory variables that are highly correlated with each other in the model will cause an unwanted multicollinearity problem. The first improvement, in order to resolve problems caused by multicollinearity, is to review the model and, if necessary, to remove the variables that are related to each other. But this may not always be a good solution. If explanatory variables that provide a genuinely good contribution are removed, the validity of the model would be undermined. In this case, many analytical solutions have been developed to remove multicollinearity. Ridge regression, which is one of these solutions, will be reviewed next.
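The short sketch below (illustrative only, not the authors' code) computes two of the diagnostics just mentioned, the variance inflation factors $VIF_j = 1/(1 - R_j^2)$ and the eigenvalues of the correlation matrix of the regressors, for a simulated design with two nearly collinear columns.

```python
# A minimal sketch of two common multicollinearity diagnostics:
# variance inflation factors and eigenvalues of the correlation matrix.
import numpy as np

def vif(X):
    """VIF_j = 1/(1 - R_j^2), where R_j^2 comes from regressing x_j on the other columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        xj = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])       # add an intercept column
        coef, *_ = np.linalg.lstsq(A, xj, rcond=None)
        resid = xj - A @ coef
        r2 = 1.0 - (resid @ resid) / ((xj - xj.mean()) @ (xj - xj.mean()))
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(0.0, 0.05, 100)        # nearly collinear with x1
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

print("VIFs:", vif(X))
print("eigenvalues of correlation matrix:",
      np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))
```

A very large VIF (or a near-zero eigenvalue) flags the column involved in an approximate linear dependence.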

2.3. Ridge Regression

Ridge regression is an estimation method used to reduce the level of multicollinearity in the model. It was first developed by Hoerl and Kennard in 1970. The difference of ridge regression from the least squares method is that the unbiasedness condition is abandoned. According to the Gauss-Markov theorem, the least squares estimator has the smallest variance among all linear unbiased estimators. But there is no guarantee that this variance is small [7]. Ridge regression, at this stage, makes it possible to obtain parameter estimates with smaller variance by adding a slight bias to the estimation.

In general, the mean square error is

 (b∗) = (b∗− )2=  (b∗) + [(b∗) − ]2 =  (b∗) + [(b∗)]2

where b∗is any estimator of . If b∗is unbiased estimator of , then  (b∗) =  (b∗) will be supplied because of (b∗) = . The ridge estimator of  is

(5) b= (´ + )−1´

Biased estimators with smaller variance are obtained from equation (5) by adding a sufficient positive constant  to matrix.´. Here, the important point is to find   0 satisfying the condition below.

(6) [(b) − ]2  (b) −  (b) = [ (b) + 2  X =1 1 +  ]

The existence of value of   0 supplying condition (6) and consequently in-equality of  (b)   (b), is proved by Hoerl and Kennard (1970).
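A minimal sketch (illustrative) of the ridge estimator in equation (5) follows; the regressors are standardized, the data are simulated, and the values of $k$ are chosen by hand rather than by any formal selection rule.

```python
# A minimal sketch of the ridge estimator in equation (5):
# beta_R = (X'X + kI)^{-1} X'y, on standardized regressors.
import numpy as np

def ridge(X, y, k):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0.0, 0.05, n)           # strong collinearity with x1
X = np.column_stack([x1, x2])
X = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize the regressors
y = X @ np.array([1.0, 1.0]) + rng.normal(0.0, 0.5, n)

for k in [0.0, 0.1, 1.0]:                    # k = 0 reproduces least squares
    print(k, ridge(X, y, k))
```

With strongly collinear columns, the $k = 0$ (least squares) coefficients are typically unstable, while modest values of $k$ shrink them toward more stable values; tracing the coefficients over a grid of $k$ values in this way is essentially a crude ridge trace.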


Since the variance of the parameters and the bias depend on the value of $k$, the optimal choice of the parameter $k$ is very important. If the mean square error is examined, it can be seen that, as $k$ takes appropriate values, the variance decreases while the bias shows an increasing trend. Many techniques have been developed for the selection of the bias parameter $k$. The ridge trace proposed by Hoerl and Kennard (1970) is one of the methods used to determine an appropriate value of $k$ [11].

3. Estimation Process in Nonlinear Models

In general, the nonlinear model can be written as

(7) $y = f(x, \theta) + \varepsilon$

where $\theta$ is a $p \times 1$ vector of unknown parameters and $\varepsilon$ is an uncorrelated random error term such that $\varepsilon \sim N(0, \sigma^2)$. The difference between a nonlinear model and a linear model is that, in a nonlinear regression model, at least one of the derivatives of the expectation function with respect to the parameters depends on at least one of the parameters.

In nonlinear regression models, one of the methods that can be used to estimate the parameters is the least squares method. This method is based on minimization of the objective function

$S(\theta) = \sum_{i=1}^{n} [y_i - f(x_i, \theta)]^2 = \| y - f(x, \theta) \|^2.$

But this process requires solving the $p$ normal equations

$\sum_{i=1}^{n} [y_i - f(x_i, \theta)] \left[ \dfrac{\partial f(x_i, \theta)}{\partial \theta_j} \right]_{\theta = \hat{\theta}} = 0, \qquad j = 1, 2, \ldots, p.$

The normal equation system is quite difficult to solve, as $f(x, \theta)$ is a nonlinear function. At this stage, the Gauss-Newton method is used to estimate the parameters. This method is based on applying least squares iteratively to a linearized model that approximates the nonlinear model, obtained by expanding the expectation function in a Taylor series in the neighbourhood of a starting point. The first-degree Taylor series expansion of $f(x, \theta)$ about the point $\theta_0' = [\theta_{10}, \theta_{20}, \ldots, \theta_{p0}]$ is

(8) $f(x, \theta) = f(x, \theta_0) + \sum_{j=1}^{p} \left[ \dfrac{\partial f(x, \theta)}{\partial \theta_j} \right]_{\theta = \theta_0} (\theta_j - \theta_{j0}).$

If we set

$y_0 = y - f(x, \theta_0), \qquad Z_0 = \left[ \dfrac{\partial f(x, \theta)}{\partial \theta_j} \right]_{\theta = \theta_0}, \qquad \beta_0 = \theta - \theta_0,$

the nonlinear regression model given by equation (7) can be written in the linear form


(9) $y_0 = Z_0 (\theta - \theta_0) + \varepsilon.$

In the linear model above, $Z$ is the Jacobian matrix of the function $f$. From the equation $\widehat{\theta - \theta_0} = \hat{\theta}_1 - \theta_0 = (Z_0'Z_0)^{-1}Z_0'y_0$, which is the least squares estimator of $(\theta - \theta_0)$ in equation (9), the least squares estimator of the parameter $\theta$ is obtained as

(10) $\hat{\theta}_1 = \theta_0 + (Z_0'Z_0)^{-1}Z_0'y_0.$

For the second iteration, the parameter value obtained from the first iteration plays the same role as $\theta_0$. This iterative process continues until convergence, that is, until

(11) $\| \hat{\theta}_{t+1} - \hat{\theta}_t \| < \delta (\| \hat{\theta}_t \| + \gamma),$

where $\delta$ and $\gamma$ are small constants, for example $\delta = 10^{-5}$ and $\gamma = 10^{-3}$. At the $t$-th iteration, the estimator of $\theta$ can be obtained from the equation $\widehat{\theta - \theta_{t-1}} = \hat{\theta}_t - \hat{\theta}_{t-1} = (Z_{t-1}'Z_{t-1})^{-1}Z_{t-1}'y_{t-1}$.

3.1. An Alternative Approach to the Estimation Process of Nonlinear Models

The multicollinearity problem in nonlinear models has a more complex structure than in linear models in terms of identification and elimination. While the linear relationship between the columns of the design matrix $X$ is examined in the linear model, the columns of the Jacobian matrix are examined in the nonlinear model. The Jacobian matrix depends not only on the columns of the $X$ matrix but also on the parameter vector. The relationship between the column vectors of the Jacobian matrix may or may not be related to the columns of the matrix $X$ [3]. If $Z'Z$ is ill-conditioned due to multicollinearity in the Jacobian matrix, the least squares method will not provide the optimal approach for the parameters of model (7), which are to be estimated by means of the Gauss-Newton method. In equation (7), the ridge estimator for the parameter $\theta - \theta_0$ is given by

(12) $(\widehat{\theta - \theta_0})_R = \hat{\theta}_{1,R} - \theta_0 = (Z_0'Z_0 + k_0 I)^{-1}Z_0'y_0,$

and from equation (12), the ridge estimator for the parameter $\theta$ can be obtained as

(13) $\hat{\theta}_{1,R} = \theta_0 + (Z_0'Z_0 + k_0 I)^{-1}Z_0'y_0.$

If the ridge estimator is compared with the least squares estimator by using mean square error,


(14)
$\operatorname{MSE}(\hat{\theta}_R) = E[(\hat{\theta}_R - \theta)'(\hat{\theta}_R - \theta)]$
$= E[(\hat{\theta}_R - \theta + \theta_0 - \theta_0)'(\hat{\theta}_R - \theta + \theta_0 - \theta_0)]$
$= E\{[(\hat{\theta}_R - \theta_0) - (\theta - \theta_0)]'[(\hat{\theta}_R - \theta_0) - (\theta - \theta_0)]\}$
$= \operatorname{MSE}(\hat{\theta}_R - \theta_0) < \operatorname{MSE}(\hat{\theta} - \theta_0)$
$= E\{[(\hat{\theta} - \theta_0) - (\theta - \theta_0)]'[(\hat{\theta} - \theta_0) - (\theta - \theta_0)]\}$
$= E[(\hat{\theta} - \theta)'(\hat{\theta} - \theta)] = \operatorname{MSE}(\hat{\theta})$

is obtained. It is observed that the mean square error of the ridge estimator is smaller than the mean square error of the least squares estimator. Here, since $\hat{\theta}_R - \theta_0$ and $\hat{\theta} - \theta_0$ are, respectively, the ridge estimator and the least squares estimator of the parameter $\theta - \theta_0$ in the linear model given by equation (9), the existence of at least one $k > 0$ for which the inequality $\operatorname{MSE}(\hat{\theta}_R - \theta_0) < \operatorname{MSE}(\hat{\theta} - \theta_0)$ holds is proved by Hoerl and Kennard (1970). Because the ridge estimator is closer to the real parameter, it can be used as a new approach for parameter estimation in nonlinear models, too. This new approach, given here for the first iteration, can be generalized and used in case of multicollinearity at each iteration.

4. Results and Discussion

Since the parameter estimation process in nonlinear models has a more complicated structure than in the linear model, in the case of multicollinearity the convergence to the actual parameters may not be suitable. Therefore, the ridge estimator, which is used to reduce the degree of multicollinearity in linear models, was applied to the nonlinear model. Since the estimation process continues iteratively until convergence is achieved, the choice between the ridge and least squares estimators, and the diagnosis of multicollinearity, should be made at each iteration. A simulation study can be carried out to evaluate the performance of the proposed approach. Moreover, an algorithm can be developed for parameter estimation suited to the proposed approach.

References

1. BELSLEY David A. (1991), Conditioning Diagnostics: Collinearity and Weak Data in Regression, John Wiley & Sons, Inc., New York.

2. GRUBER Marvin H. J. (1998), Improving Efficiency by Shrinkage: The James-Stein and Ridge Regression Estimators, Statistics: Textbooks and Monographs, Vol. 156, CRC Press.

3. HILL R. Carter and ADKINS Lee C. (2001), Collinearity, in: BALTAGI Badi H. (Ed.), A Companion to Theoretical Econometrics, pp. 256-278.

4. HOERL A. E. and KENNARD R. W. (1970), Ridge Regression: Applications to Nonorthogonal Problems, Technometrics, Vol. 12, No. 1, pp. 69-82.


5. HOERL A. E. and KENNARD R. W. (2000), Ridge Regression: Biased Estimation for Nonorthogonal Problems (1970a), Technometrics, Vol. 42, No. 1, pp. 80-86.

6. MARQUARDT Donald W. and SNEE Ronald D. (1975), Ridge Regression in Practice, The American Statistician, Vol. 29, No. 1, pp. 3-20.

7. MONTGOMERY D. C., PECK E. A., VINING G. G. (2001), Introduction to Linear Regression Analysis, John Wiley & Sons, Inc., New York.

8. NETER John, WASSERMAN William, KUTNER Michael H. (1983), Applied Linear Regression Models, McGraw-Hill/Irwin.

9. NGO S. H., KEMÉNY S., DEÁK A. (2003), Performance of the ridge regression method as applied to complex linear and nonlinear models, Chemometrics and Intelligent Laboratory Systems, 67, pp. 69-78.

10. WEISBERG Sanford (2005), Applied Linear Regression, Third Edition, John Wiley & Sons, Inc., New York.

11. ZHANG John, IBRAHIM Mahmud (2005), A Simulation Study on SPSS Ridge Regression and Ordinary Least Squares Regression Procedures for Multicollinearity Data, Journal of Applied Statistics, Vol. 32, No. 6, pp. 571-588.
