• Sonuç bulunamadı

View of Bayesian Hierarchical Growth Model for Experimental Data on the Effectiveness of an Incentive-Based Weight Reduction Method

N/A
N/A
Protected

Academic year: 2021

Share "View of Bayesian Hierarchical Growth Model for Experimental Data on the Effectiveness of an Incentive-Based Weight Reduction Method"

Copied!
12
0
0

Yükleniyor.... (view fulltext now)

Tam metin

(1)

Bayesian Hierarchical Growth Model for Experimental Data on the Effectiveness of an

Incentive-Based Weight Reduction Method

Md Azman Shahadan1, Mujeeb Khan2, Aimillia Mohd Ramli3

1Faculty of Human Development, Universiti Pendidikan Sultan Idris 2Faculty of Human Development, Universiti Pendidikan, Malaysia 3

Kulliyyah of Islamic Revealed Knowledge and Human Science, International Islamic University, Malaysia mdazman@fpm.upsi.edu.my1

Article History: Received: 10 November 2020; Revised: 12 January 2021; Accepted: 27 January 2021; Published online: 05 April 2021

Abstract: The objective of this current research is to model the experimental data on the effectiveness of an incentive-based weight reduction method by using Bayesian hierarchical growth models. Three Bayesian hierarchical growth models are proposed, namely parametric Bayesian hierarchical growth model with correlated intercept and slope random effects model, parametric Bayesian hierarchical growth model with no correlated intercept and slope random effects model and semi-parametric Bayesian hierarchical growth model with Dirichlet process mixture prior model. The data is obtained from forty eight (48) students who had participated in an experiment on weight reduction method. The students were divided equally into two groups: single and pair groups. The experiment was carried out over the period of three months with a weight reading session for every two weeks. At the end of the study, we had six repeated measures of each student’s weight in kg and some measures of covariates and factors. Our results showed that the best model for the above data based on the Bayesian fit indexes and the models’ flexibility is the semi-parametric Bayesian hierarchical growth model with Dirichlet process mixture prior model. The results of the semi-parametric model showed that the ‘growth’ or reduction rates of the weight reduction experiment relate to the students’ gender, height in cm, experimental group (single or pair) and time in term of weeks.

Keywords: Bayesian hierarchical growth models; weight reduction program; semi-parametric growth models; Dirichlet process mixture; random effects

1. Introduction

Overweight and obesity in humans have increased globally in both developed and developing countries. Rising incidences of obesity is a major public health issue worldwide. The past two decades have witnessed an alarming increase in rates of overweight and obesity amongst humans, and studies indicate that the situation is worsening rather than improving [Centers for Disease Control and Prevention (CDC) 2004]. The impact of obesity on health and longevity is substantial. Approximately 300,000 deaths per year are caused by obesity-related complications (Allison, Fontaine, Manson, Stevens, & VanItallie, 1999). Obesity-obesity-related conditions include heart disease, stroke, hypertension, hyperlipidemia, gallbladder disease, osteoarthritis, certain types of cancer, sleep apnea, and psychological disorders (e.g., binge eating, depression). The impact of obesity on life expectancy varies according to the level of BMI. Specifically, when BMI reaches 30, the risk of death is elevated by 30%, and when BMI reaches 40, risk of death becomes 100 % higher than for a normal-weight person (Manson et al., 1995). Some psychosocial problems that are related to obesity are clinical depression, lowered self-esteem, job discrimination and other forms of social stigmatization (Kumanyika, Jeffery, Morabia, Ritenbaugh, & Antipatis, 2002). In addition, body image dissatisfaction is a strong predictor of other negative psychological outcomes, including eating disorders, and has been specifically linked to binge eating disorder (BED) in obese individuals (Schwartz & Brownell, 2004).

Obesity is linked to decreases in physical activities (Kumanyika, Jeffery, Morabia, Ritenbaugh, & Antipatis, 2002). Exercise training promotes a favorable change in body composition for weight loss (Stiegler & Cunliffe, 2006). Weight-loss maintenance requires expending approximately 2500 kcal/wk during exercise (Schoeller, Shay, & Kushner, 1997; Jakicic, Marcus, Gallagher, Napolitano, & Lang, 2003). This level of energy expenditure can be accomplished through vigorous activity (aerobics, cycling, or jogging) for approximately 30 min/day or more moderate activity for 60 to 75min/day as increased physical activity will decrease risk for future weight gain (Wadden, Vogt, Foster, & Anderson, 1998). Programs that directly increase physical activity would have larger intervention effects than those that do not increase activity (Stice, Shaw, & Marti, 2006). The only modifiable factor of overweight and obesity in this study was moderate and low level of physical activity in Malaysian postgraduate students (Said & Ismail, 2014).

The most widely studied treatment for weight loss is behavioral strategy or behavioral treatment that is also called behavioral weight loss (BWL) (Butryn, Webb, & Wadden, 2011; Wing, 2002). While BWL programs

(2)

vary in terms of the specific dietary and exercise prescriptions they provide (Wing, 2002), these generally incorporate strategies, such as self-monitoring of food intake, problem solving, cognitive restructuring, social support initiatives, education about nutrition and physical activities, and the use of stimulus control techniques and reinforcement (Wadden, Crerand, & Brock, 2005). Most studies of BWL have used a group treatment format, typically lasting from 16 and up to 26 weeks and include 10 to 20 participants (Wadden & Osei, 2002). Taken together, results from studies of group BWL suggest that individuals in such groups lose approximately 9 to 10% of their initial body weight (Wadden & Foster, 2000; Wing, 2002). This represents a clinically-significant weight loss (NHLBI Obesity Education Initiative Expert Panel, 1998). However, Alimaradi et al. (2016) showed that lack of motivation and fear of failure may have serious impact on the effectiveness of conventional weight-loss program.

There are many programs and interventions that have been developed in order to reduce body weight. Our main concern in the present study is the effectiveness of the program’s overweight reduction. One of the programs for weight loss involves intervention based on operational conditioning behavioural theory through the practice of reinforcement (Mujib, Rahamatullah, Shahadan, 2020). This research is based on an experimental study that was undertaken over a period of three months with the repeated measure design. The weight measurements were taken every two weeks after intervention in order to assess the effectiveness of the above intervention. The general method to analyse the above data is the ANOVA with repeated measures. However, this method does not take into account any unobserved heterogeneity.

2. Focus of Present Study

The aim of the present paper is to use the hierarchical Bayesian model to assess the effectiveness of the weight-loss intervention program. The hierarchical Bayesian model can incorporate repeated measures or responses (unobserved heterogeneity) via "random effects" for all measurements referring to the same individual (Ntzoufras, 2009, Lunn, et.al. 2013). Special attention is paid to handling weight reduction through time in which changes are captured by an individual's underlying reduction trajectory. The Hierarchical Bayesian growth models (in the context of the likelihood method, also known as random coefficients model, multilevel model and mixed effect model) are flexible enough to handle any study of change at the group as well as the individual levels.

To handle reduction or growth problem, we use different models: the hierarchical Bayesian random effects growth models with correlated intercept and slope random parameters, the hierarchical Bayesian random effects growth models with no correlated intercept and slope random parameters and the semi-parametric hierarchical Bayesian random effects growth models approach. The present research aims to identity the best approach to these problems for our dataset in terms of unbiased parameter estimates and their robustness.

According to Lunn, et.al. (2013), the hierarchical Bayesian model is also referred to as multilevel or mixed effects model as it combines both fixed effects with random effects. A Bayesian analysis combines information from observed data with prior information about unknown quantities in a statistical model, thereby generating a posterior distribution. Such an analysis requires a prior density to summarize our knowledge about parameter values before seeing the data. These basic concepts are important for the Bayesian approach to a random effect model or a hierarchical model. In order to estimate all parameters using the Bayesian approach with Gibbs sampling, we use the freely available software in R with rjags package (Plummer, 2019; R Core Team, 2020).

In our model, we split the source of variation into two sources: between-subject and within-subject variations. Moreover, the correlation between measurements (in case of repeated measure) of the same individual can be cooperated by introducing random effects.

We follow Broemeling’s (2016) Bayesian approach that defines the general model with random effects of intercept and slope, where for the ith subjects

Yij=β1 + β2tij+b1i + b2itij + εij (1)

with εij ~ N(0, σ

2

) and b1i and b2i have bivariate (Wishart) normal distribution with 2 by 1 mean vector 0 and 2

by 2 covariance matrix G, for j=1,2,..,ni and i=1,2,...,N.

The popular prior distribution with non-informative prior (also denoted as either flat, vague or diffuse) can be used in which distributions are often said `to let the data speak for themselves', so that inferences are unaffected

(3)

by information external to the current data (Lunn, et.al., 2009). Hence we can adopt β1~ N(0,1000), σ2 =

IG(0.001, 0.001) and σa2= IG(0.001,0.001) to express our non-informative prior. The components of G matrix

are determined by the covariance matrix of the joint normal distribution of the two random effects, b1i and b2i.

We consider this as our Model 1 (hierarchical Bayesian random effects growth models with bivariate normal distribution). Furthermore, we assume that the precision matrix of the two random effects (b1i and b2i) has a

non-informative Wishart distribution. We have assigned independent non-non-informative N(0.00,0.001) to the ‘fixed effect’ parameters and the precision τ of the measurement error εij is assigned a non-informative gamma

(0.0001,0.0001) distribution. Furthermore, in our model (Model 1), we assume there is a strong correlation between individuals and high initial measure (intercept) in weight (kg) as this will change at a faster rate over time (decrease more in weight reduction) as compared (slope) to those with low measure in weight at the initial time (baseline).

For our second model, we assume that the covariance matrix has a zero correlation between the intercept and the slope random effects. In this model, we separated two random effects (covariance constrain to be zero) with the varying components of b1i (b1i ~ N(0, σa2)) and b2i (b2i ~ N(0, σb2), where σa2 and σb2 are ~IG(0.001, 0.001)

respectively. Furthermore, by allowing two random effects to be separated, we assume that the two random effects (intercept and slope) have their own trajectories through time (week) .

The problem of parametric estimation in random effects modelling of repeated measures data is well-known (Lunn, et.al.,2013; Congdon, 2020). If the b1i and b2i were to be bivariate or separated normal distributions, these

could be considered as being too bold. Alternatively, we may assume a heavy-tailed student t distribution (Congdon, 2020; Chib, 2008) with skew normal and skew-t densities. A flexible discrete mixture distribution can also be obtained under the Dirichlet process and related semi-parametric prior (Congdon, 2020; Ohlssen et al., 2007). Following Ohlssen et al. (2007), we assume that the semi-parametrics hierarchical Bayesian random effects growth models b1i is unknown and a prior is placed over the infinite dimension space of distribution

functions. A possible prior for b1i is the Dirichlet process (DP) prior that is a random probability measure

defined on the space of distribution functions. While the Dirichlet process prior is a discrete distribution, we can still consider a Dirichlet process mixture (DPM) as continuous distribution (Santos & Loschi, 2020; Ohlssen et, al., 2007). According to Ohlssen et al, (2007) it is possible to extend the model to allow for a flexible continuous random-effects (b1i ) distribution based on a mixture of normals with a large number of components or clusters.

We have to define a mean and a variance parameter for each component of the mixture by mixing proportions defined by the pk. We follow Ohlssen et al. (2007) in our choice of the priors specification for our model (the

semi-parametric of the normal mixture model of random effects).

The random effect b1i ~N(θk, σk) with probability pk , where pk is defined by the stick-breaking algorithm, is

mentioned by Ohlssen et, al. (2007) and Lunn et, al., (2009). Moreover, according to Ohlssen et al. (2007), when we employ a stick-breaking algorithm, we must define a mean and a variance for each component of the mixture. Our model for the individual effects b1i, i=1,..,t, becomes

b1i= N(θk, σ2k) = with probability pk (2)

where pk is the stick-breaking algorithm. In order to avoid problems with empty components or identifiability,

we impose a hierarchical structure on the variance parameters σ2k , in which Ohlssen, et al. (2007) suggest the

imposition of a Gamma prior of the hierarchical structure. We follow Ohlssen et al. (2017) in the imposition of a Gamma prior (Gamma(0.03, 0.03) and also we conducted a sensitivity analysis with several alternative Gamma priors, such as Gamma (0.1, 0.5) and Gamma (0.5, 0.03) ). The precision parameters that controls the expected numbers of cluster is parameter α, which is initially fixed at 1 but can be estimated from the data (Lunn, et.al.,2013). In our model, we used uniform prior of constant α (α=1) , but sensitivity analysis can be done to various selection of α prior.

The idea of DPM model for random effects is the specification of the maximum number of cluster C, but C should be selected with the condition that all the parameter estimates are in good quality (Lunn, et, al. , 2009; Santos & Loschi, 2020). Ohlssen, et al. (2007) mentions that the specification of C can either be based on a pragmatic approach or on a comparison of the total variation metric between the truncated DPM model and the full DPM model. For the purpose of our research, the selection of C is based on the pragmatic approach. For the best specification of C in order to maximise the number of cluster, we conducted sensitivity analysis.

In this model, we still allow the random slope (b2i) to be parametric of non-informative hyperprior of IG

(0.001, 0.001)) ((b2i ~ N(0, σb 2

)), where σb 2

(4)

process can be performed using a constructive definition of stick-breaking in length 1 which has been successively broken into smaller and smaller pieces. For a detailed implementation of semi-parametric random effects of Bayesian models, please refer to Ohlssen et. al.,(2007) and Santos & Loschi, (2020).

In general, the prior distribution represents information about all the uncertain quantities of interest by treating them as random variables. Prior distributions can be divided into informative priors and non-informative priors. On the one hand, the informative priors contain information that has been obtained from previous experiments or from expert knowledge. On the other hand, the non-informative priors aim to express that there is little or no prior knowledge about the parameters. Finding the appropriate specification of priors is an important problem in the Bayesian approach. The selection of prior distribution in this research is based on pragmatic reasons. Typically, either a flat, a vague or a diffuse distribution can be used as a non-informative prior. Gelman et al. (2004) mentions that the rationale for using non-informative prior distribution is often said to be `to let the data speak for themselves', so that inferences are unaffected by information external to the current data. In other words, with little or no prior information, the alternative is to use non-informative priors. This implies that the posterior distribution ensures that the Bayesian analysis will be close to a classical likelihood analysis. Various justifications and interpretations of noninformative priors have been proposed, such as Jeffreys' priors of invariance and maximum entropy (Gelman, et. al., 2004; Congdon, 2001).

A full probability model must be defined in order to use rjags and this does not allow for improper prior distribution. For ‘fixed effects’, almost flat priors or non-informative priors are specified using the normal distribution with a large variance, e.g. 1,000,000 (N(0, 106). However, in random effects (hyperprior distributions), we adopt an inverse-gamma (ϵ, ϵ) prior for the precision of the random effects, where ϵ is a very small number such as 0.001, wherein we would have a mean of 0.001/0.001=1 and a variance of 0.001/0.0012=1000. This represents a "just proper" form of the improper inverse-gamma (0,0) prior which is formally equivalent to an assumed uniformed distribution on the log of the variance parameter σ = τ -1/2. According to Gelman et al. (2004), use of this inverse-gamma prior distribution involves the difficulty that, within the limit of ϵ ---> 0, yields an improper density and, thus, ϵ must be set at a reasonable value. Moreover, Gelman (2004) has shown that, in the case of a low value or near-zero value of the variance parameter or hyperparameter (σ), inferences become very sensitive to ϵ and the prior distribution hardly looks non-informative.

As a consequence of parametrization in jags, our hyperprior for random effects was specified as N(0, τ ). The hyperprior of τ was specified as IG(0.001,0.001), where IG is the inverse gamma density. We ran 1000 iterations or chain values as a burn-in period followed by a chain for 75,000 cycles. Convergence of the chain was checked using the Gelman-Rubin convergence statistics, implemented in the rjags package (Gelman et al., 2004; Plummer, 2019).

We also used the Dirichlet process mixture prior for random effects as a semi-parametric model. The discussion of prior specification of semi-parametric model can be found at Ohlssen, et. et., (2007) and Santos & Loschi, (2020). Furthermore, it is clear from the literature that the inverse-gamma prior has a problem, like other non-informative priors, in that it leads to improper posterior distributions (Gelman et al., 2004; Congdon, 2020). Gelman et al. (2004) suggest that in order to overcome the above problem, a sensitivity analysis has to be conducted to see how robust the parameter estimates are when different priors are used.

The psychological theory of weight loss sometimes offers guidance on the distribution of observed variables, such as the response variable and its covariates. However, it rarely offers guidance on the distribution of the unobserved heterogeneity. The choice of a particular distribution for the unobservables is usually justified by the grounds of familiarity, ease of manipulation and consideration of computational cost. Although weight loss theory may not offer any guidance as to the distribution of the unobserved heterogeneity, nevertheless, if we have conducted an exploratory analysis, such as using a graph of individual trajectories through time, we may pick up some idea of the form of the unobserved heterogeneity distribution. For a more detailed example of this, see a graph of individual trajectories through times.

Competing models are usually compared using relative fit criteria, which normally compare how well two related models fit the data (Skrondal & Rabe-Hesketh ,2004; Lunn, et al., 2013). Within the classical modelling framework, the model comparison is more straightforward as it defines a measure of fit, typically a deviance statistics, as a function of the number of free parameters in the model (Spiegelhalter et al., 2002). However, in the Bayesian framework, especially for hierarchical models, the model comparison is more complex. Model comparison should proceed together with fit statistical measures and substantive arguments (psychological

(5)

theory). This is because equivalent models cannot be distinguished empirically although they may represent different or even contradictory substantive processes. The same applies to nonequivalent models which happen to yield similar fits for the particular data set being analysed (Skrondal & Rabe-Hesketh ,2004).

Lunn, et al. (2013) mentioned that 'Bayesian Deviance' is a measure of a model’s level of complexity based on the concept of the excess of the true model over the estimated residual information. The 'Bayesian Deviance' is denoted by D(θ) and defined as

D(θ) = -2 log p(y|θ) (3)

which is explicitly a function of θ and so has a posterior distribution. We also used pD measure of fit based on a Bayesian measure of model complexity or 'effective number of parameters' as suggested by Spiegelalter et al. (2002) and Lunn, et. al. (20013). Moreover, we can combined the 'Bayesian Deviance' measure of fit with the pD measure of complexity to produce the Deviance Information Criterion (DIC). Spiegelhalter et al. (2002) shows that DIC is intended as a generalization of Akaike Information Criterion (AIC) for a broad range of statistical models. Smaller values of DIC denote better fits.

This section has provided a short discussion on the methodology that we are using to analyse our data. For analysing the data, we propose the three models of the hierarchical Bayesian random effects growth: a correlated hierarchical Bayesian random effects growth model, an independent hierarchical Bayesian random effects growth model and a semi-parametric hierarchical Bayesian random effects growth model. The various growth models have been discussed and the sign and statistical significance of their parameters are of particular interest in this research.

3. Theoretical Framework

Based on the experimental data provided to us and from previous studies on weight reduction or loss, we take into consideration 2 covariates (height of the participants in cm and time in weeks) and 3 other factors. For the height of the participants covariate, we consider a time invariant covariate, whereas for the time covariate, we consider a time varying covariate. The first factor is the gender of the participants that is coded as 0 (male) and 1 (female). The second factor is the physical activity levels of the participants, which is either active lifestyle or sedentary lifestyle. The sedentary lifestyle or non-active lifestyle is coded by 0 and the active lifestyle is coded by 1. The last factor of explanatory variable is whether the participants enter into the experiments as individuals or pairs, which is coded by 0 (individual) and 1 (pair) (Mujeeb, Khan and Shahadan (in press).The proposed explanatory variables above are based on previous studies on weight reduction program (Norlaila 2008; Kee et al. 2008; Alimaradi et al., 2016).

4. Research Method

The secondary data for this study was obtained from an experimental study conducted by Mujeeb, Khan and Shahadan (in press). Further information can be obtained from the original research. The primary motivation for conducting the experiment was to find out the effectiveness of monitory incentives intervention for reducing weight among university students. These data sets were based on a repeated experimental design.

A total of 320 students of University Pendidikan Sultan Idris (UPSI) signed up for the study. The participants were informed about incentives of RM 25 for every kilogram weight loss at the end of the study (which lasted three months). Only 120 students fulfilled the experiment’s criteria of having a Body Mass Index (BMI) exceeding 25. Some of the students were disqualified or dropped-out from the study because they were unable to attend the weight reading sessions which were held every two weeks for three months. The final sample of the study comprised about 48 students who were divided equally into two groups. In order to analyse the data in our research, we had used the listwise deletion method. Based on the listwise deletion, at the base line (time 0), 48 students were divided into two groups of individuals or in pairs. The experiment was carried out over the period of three months with a weight reading session for every two weeks. At the end of the study, we had six repeated measures of students weight in kg. The detailed description of sampling method and procedures of the experiment can be found inside our research publication (Mujeeb, Rahmattullah and Shahadan, in press).

From our perspectives of the hierarchical Bayesian growth models, the response variable is the students’ weight in kg for every two weeks over the period of three months. In determining the effectiveness of the weight reduction program, there are five factors of explanatory variables to consider. The lists of variables are the

(6)

height of the participants in cm, the time in week, the gender of the participants, the physical activity levels (active or inactive) of the participants and the group (individual or in pair) of the participants.

5. Data Exploration

Before we started the modeling process, we explored the data using an exploratory analysis. We used two plots facility in R, namely Spaghetti Plot (Wickham, 2016) and Trellis Plot (Pinheiro & Bates, 2000). Both plots can be used to analyse the pattern of repeated measures data. Figure 1 shows a Spaghetti Plot for individual participant’s weight trajectory (in kg) through time (in week). Moreover, we added a line with standard error shading using locally weighted regression (lowess) to 'smooth' over all the variability in order to give a sense of the overall or average trend. This plot is basically a scatterplot of responses to variable scores versus the time variable (in week) with a separate line for each person that connects his/her scores (weight in kg) over time. This type of plot can provide crucial information about the model that we are going to develop, especially on linearity, outliers in terms of their patterns of progress, and subject versus subject effects. Our Spaghetti Plot on Figure 1 shows that the lowess is fairly in a linear trend. Figure 1 clearly shows a few outliers cases in terms of patterns of progress. Moreover, although the plot can be said to be fairly linear, it has different individual trajectories through time. Hence, it would be very difficult to find similar trajectories among individuals. Although, the Spaghetti Plot is useful to us, its trends and patterns are obscured. For example, the trajectories commonly overlap, creating a confusing jumble of intersecting lines with no discernible patterns.

In an attempt to handle the weakness of the Spaghetti Plot, we also applied the Trellis Plot by using the lme4 package in R software (Pinheiro & Bates, 2000). Figure 2 showed the Trellis Plot for weight loss data (in kg) over time (week). From Figure 2, it appears that there are qualitative differences in growth patterns among individual participants. In fact, all the individual participants have their own weight-loss trajectory through time. Moreover, our Trellis Plot also indicates that each individual has his/her own initial value or baseline value of weight (in kg). Figure 2 also indicates that, for most of the participants, the weight measurement (in kg) decreases (loss) with time (week) and that its growth is approximately linear over this range of weeks. It appears that the intercept and, possibly, the slopes of these growth curves may differ between individuals. It is clear that the individual growth or decrease (loss) patterns are totally different between individuals. This may indicate that the estimated coefficient of variation in intercept random effects would be high. Because of the above problems, we may also propose the semi-parametric of intercept random effects growth model for our data.

(7)

Figure 2. Trellis Plot for weight in kg versus time in week for each individual participant 6. Results of Bayesian Hierarchical Growth Model

In this sub-section, we present Bayesian based estimation procedures, which are parametric model (correlated and independent model) for random intercept and slope model (growth model) and semi-parametric model (based on Dirichlet process mixture prior for random effects) as shown in Table 1. Beside our use of the Dirichlet process mixture prior, we always maintain the use of non-informative priors. Non-informative priors with the inverse gamma (ϵ, ϵ) in rjags yielded an improper posterior density when the scalar ϵ tends to zero. The analysis is executed with 75,000 observations for the simulation, with the burn in of 10,000 and a refresh of 100. From our model, the "fixed effects" represent the mean population as a linear regression with intercept β0 and

slope β's, while random effects b1i and b2i for subjects i represent each individual’s intercept and slope. Table 1

showed the best fit model of the three methods of Bayesian Hierarchical growth model based on Bayesian goodness fit statistics. The first model is a parametric Bayesian Hierarchical growth with correlated (dependent) random effects model. The results of the full Bayesian approach, including a 95% credible interval for the parameter estimates using non-informative prior for the variance parameter, are given in Table 1. The results showed that the linear growth model (only statistically significant covariates and factor) is

Yij=β0 + β1(heightij) + β2 (time in weekij) + β3 (gender ij) + b1i + b2i(time in weekij) + εij,

Where Yij is the outcome at time point i for individual j , β0 is the population estimate of the mean intercept,

β1 is the population estimate of the linear slope for high at time point i for individual j , β2 is the mean growth

rate at time point i for individual j and β3 is to capture the estimates of the mean difference in intercept between

gender (female) at time point i for individual j. b1i and b2i are random effects that allow the intercepts and

growth rates to vary across individuals and εij is a time-specific residual that expresses the difference between an

individual's fitted linear trajectory and the observed data. The exploratory variables in our models are taken into account for the heterogeneity in growth trajectories. By estimating covariance (σ12) between random effects in

Model 1, we can obtain a relationship between an individual’s weight in kg at initial values (baseline) and the individual’s reduction weight rates of change on the response variable. Moreover, an individual with a high initial weight will change at a faster rate compared to an individual with a low initial weight.

In our Model 2, we can relax the assumptions of correlated random effects. The 95% credible interval for covariance (σ12) of the two random effects is (-2.351, 0.408), which contains 0. Thus, it can be concluded that the

covariance between two random effects is not statistically significant (0). Furthermore, we may need to revise the assumptions of the two random effects that are not correlated. In addition, as mentioned by Lunn et al. (2013)

(8)

and Ntzofras (2009), the credible interval statistics should be interpreted with caution. Moreover the fit indexes, such as deviances, pD and DIC, do not indicate any changes in Model 2. Although a further analysis can be done to resolve this issue, it is not the aim of this study to do this.

From our data exploratory section, we mentioned that the use of a normal random effect distribution of intercept (b1i ~ N(0, σa2)) cannot be justified based on the Trellis Plot. Table 4 shows the poster mean estimates

and a 95% credible interval for a number of important parameters, which include fix and random effects parameters. Results on Table 4 also showed a semi-parametric model of random intercept model (normal mixture model) based on n=288 and the number of cluster (C=10). Like the above two models, the analysis is executed with 75,000 observations for the simulation, with the burn in of 10,000 and a refresh of 100. The linear growth model (only statistically significant covariates and factor) is

Yij=β0 + β1(heightij) + β2 (time in weekij) + β3 (gender ij) + β4(Groupij)+ b1i + b2i(time in weekij) + εij,

Where Yij is the outcome at time point i for individual j , β0 is the population estimate of the mean intercept,

β1 is the population estimate of the linear slope for high at time point i for individual j, β2 is the mean growth rate

at time point i for individual j, β3 is to capture of the estimates of the mean difference in intercept between

gender (female) at time point i for individual j and β4 is to capture the estimate of the mean difference in

intercept between group (pair) at time point 1 for individual j. Moreover, the random effect b1i follows a mixture

of normal models (semi-parametric). However, the b2i is a normal random effect and εij is a time-specific residual

that expreses the difference between an individual's fitted linear trajectory and the observed data. Compared to the above parametric models, the semi-parametric model (normal mixture model) seems to be flexible enough to estimate parameters. The normal mixture model of intercept random effects showed that the linear growth of weight reduction is related significantly to the participants’ height in kg, time in week, gender and group. Clearly, the parameter of active (fixed effect) has a tendency to be significant.

We conducted a sensitivity analysis in imposing a hierarchical structure on the variance parameters (σ2k)

when we employed the stick-breaking algorithm (in equation 2). The result showed that when we imposed to Gamma prior (0.1, 0.5), the fit statistics indexes (Deviance=1039.00, pD= 79.324 and DIC=1118.320) increased. When we imposed to Gamma (0.5, 0.03), the fit statistics indexes did not improve compared to the results of the original impose prior (Deviance= 1034.0, pD=78.8, DIC=1112.710). The sensitivity analysed clearly showed that the original imposing of Gamma (0.03, 0.03) would be the best prior.

By using α prior of constant, the results showed that a posterior median with 95 % credible interval of the number of clusters is 5 (2-9). We replaced the constant α prior with the uniform prior (0.3,10). The results showed that a posterior median of the number of clusters is 8 (5-10). The uncertainty around the number of clusters is slightly increased. Moreover, the fit statistics indexes of the new α prior was not improved (Deviance= 1033.6, pD=78.823, DIC=1112.400). We also conducted another sensitivity analysis by replacing α prior with a Gamma ( 2,4). However, the chain fails to converge when we use rjags package in R.

Based on the pragmatic approach, we choose cluster C=10 for the normal distribution. To evaluate our choice of C, we conducted a sensitivity analysis. Three values of C are considered: C=5, C=15 and C=48 (the total number of individuals). For the sensitivity analysis, we monitored the posterior estimate of Mean(F0) (μF0) and

Sigma2 (σ2F0) (random effects) parameters, the posterior median of the number of cluster and the Bayesian

method fit statistics indices. The above parameter estimates of the sensitivity analysis are shown in Table 6. The result showed that C=10 is the best choice for cluster C. At C=10, the maximum point in which the estimated posterior median (K) achieve the maximum height, in which the increase of C will not be changed in posterior median (K). Furthermore, the fit statistics indexes also suggested that model C=10 is the best fit model, since other model fit indexes hardly changed.

Our analysis favoured the use of the semi-parametric hierarchical Bayesian random effects growth models (a mixture of models of Random effects) with the Dirichlet process mixture prior. The semi-parametric model provides us with very flexible interpretations of clustered random effects. Moreover, by taking into account the clusters of the random effect model, the estimates of the fixed parameters and their standard error are slightly lower than that provided by the parametric normal random effects model. It is very clear that the semi-parametric models produce good parameter estimates compared with those produced by the semi-parametric normal random effects model.

(9)

Table 1. Posterior parameter estimates for Model1: Bayesian Hierarchical Growth Dependent (correlated) Random Effects Model

Normal dist. IG (0.001,0.001) Parameter Mean SD 2.5% median 97.5% Significant Intercept (β0) -38.071 28.471 -93.346 -38.213 18.058 N.Sig height (β1) 0.821 0.164 0.496 0.821 1.141 Sig week (β2) -0.227 0.047 -0.320 -0.227 -0.135 Sig gender (β3) -10.371 4.417 -19.143 -10.333 -1.820 Sig active (β4) -4.463 3.760 -11.864 -4.463 2.940 N.Sig. group (β5) 5.923 3.756 -1.489 5.929 13.296 N.Sig. Hyper parameters Intercept (σ12) 165.158 37.761 106.641 159.957 253.434 Sig covariance (σ12) -0.857 0.694 -2.351 -0.817 0.408 N.Sig ICC (ρ) 0.987 - - - - Slope (σ22) 0.080 0.020 0.048 0.077 0.127 Sig Error (σ2) 2.130 0.215 1.749 2.116 2.589 Sig

Table 2. Posterior parameter estimates for Model1: Bayesian Hierarchical Growth Independent Random Effects Model Normal dist. IG (0.001,0.001) Parameter Mean SD 2.5% median 97.5% Significant Intercept (β0) -36.384 28.716 -92.54 -36.491 20.353 N.Sig height (β1) 0.811 0.166 0.483 0.821 1.135 Sig week (β2) -0.227 0.037 -0.300 -0.227 -0.154 Sig gender (β3) -9.861 4.498 -18.819 -9.841 -1.134 Sig active (β4) -3.898 3.876 -11.542 -3.899 3.726 N.Sig group (β5) 4.647 3.735 -2.715 4.646 11.986 N.Sig Hyper parameters Intercept (σ12) 164.088 37.710 105.771 158.819 252.296 Sig ICC (ρ) 0.987 - - - - - Slope (σ2 2 ) 0.039 0.015 0.015 0.037 0.073 Sig Error (σ2 ) 2.206 0.233 1.797 2.189 2.710 Sig

Table 3. Posterior parameter estimates for Bayesian Hierarchical Growth Model using Semi-Parametric Random Effects Model

Normal Mixture with C=10 α=1 Diric. Prior Parameter Mean SD 2.5% i median 97.5% Significant Intercept (β0) -38.910 24.750 -87.120 -39.110 10.72 N.Sig height (β1) 0.867 0.134 0.606 0.867 1.138 Sig week (β2) -0.215 0.035 -0.284 -0.215 -0.145 Sig gender (β3) -6.473 3.295 -13.720 -6.189 -0.613 Sig active (β4) -5.253 2.577 -9.966 -5.435 0.486 N.Sig group (β5) 5.609 2.517 0.105 5.852 10.080 Sig Hyper parameters K 5.327 1.707 2.000 5.000 9.000 alpha (α) 1 pi1 (π1) 0.351 0.208 0.025 0.328 0.828 pi2 (π2) 0.268 0.172 0.012 0.255 0.657 pi3 (π3) 0.171 0.141 0.003 0.139 0.497

(10)

pi4 (π4) 0.098 0.107 0.005 0.059 0.378 pi5 (π5) 0.053 0.074 0.001 0.023 0.275 pi6 (π6) 0.028 0.048 0.002 0.009 0.173 pi7 (π7) 0.015 0.031 0.000 0.003 0.104 pi8 (π8) 0.008 0.019 0.000 0.001 0.059 pi9 (π9) 0.004 0.011 0.000 0.000 0.031 pi10 (π10) 0.004 0.011 0.000 0.000 0.031 Mean(F0) (μF0) Sigma2 (σ2F0) Slope (σ22) Error (σ2 ) -2.902 139.800 0.036 2.133 11.860 22.950 0.013 0.224 -28.290 100.800 0.015 1.739 -1.774 137.800 0.035 2.117 16.150 190.400 0.067 2.611 Sig Sig Sig

Table 4. Summary of the best fit for Bayesian Hierarchical Model (Growth Model)

Model Deviance pD DIC

Model 1 1033.616 127.600 1161.227

Model 2 1043.561 162.800 1206.317

Model 3 1033.470 78.801 1112.290

Table 5. Sensitivity Analysis for the Selection of Cluster C

Cluster C=5, α=1 C=10, α=1 C=15, α=1 C=48, α=1 posterior median (K) 3(1-6) 5(2-9) 5(2-9) 5(5-9) Mean(F0) (μF0) -16.12 (6.762) -2.902(11.860) -12.91(8.896) -7.03(12.76) Sigma2 (σ2F0) 137.60(22.42) 139.80(22.95) 138.3(22.73) 139.20(23.01) Fit index Deviance 1033.500 1033.470 1033.500 1033.80 pD 78.800 78.800 78.800 78.90 DIC 1112.200 1112.290 1112.60 1112.410 K= Number of cluster 7. Discussion

The literature review is concerned with the psychological factors that affect the effectiveness of weight loss intervention program (Mujeeb, Rahmattullah and Shahadan, in press) . Generally, for the Bayesian hierarchical liner growth model, the best fit model is semi-parametric normal (Mixture Normal) with the Dirichlet prior, which consists of variables of fixed effects parameters. The variables of fixed parameters include height in kg, time in weeks, gender, and groups (single or pair groups). Moreover, as far as random effects of intercepts is concerned, the 5 cluster groups consist of individuals at baseline outcomes as subjects with unobserved heterogeneity. Moreover, the slope of random effects parameter indicates that growth rates vary across individuals. To the best of our knowledge, there is no previous study on the effectiveness of intervention program of weight reduction that has employed the semi-parametric linear growth models. This paper provides new findings for the study on the effectiveness of the intervention program in weight reduction (Mujeeb, Khan and Shahadan, in press; Norlaila 2008; Kee et al. 2008; Alimaradi et al., 2016).

As far as the grouping is concerned, we had fixed the single individual as a reference group. The pair group is significantly different from the single person group. Results from our model showed that the pair group is better at weight reduction through time compared to the single individual group. This study does not seem to support findings by previous research that finds no significant difference between single and pair groups (Mujeeb, Khan and Shahadan, in press). The data analysis of the previous research employed only a simple repeated measure of ANOVA. It is clear that the current data analysis (semi-parametric linear growth model) is superior to the previous data analysis (Lunn, et al., 2013).

(11)

Our model also showed that the gender variable is a significant factor in determining weight reduction through time. The results from all our models (parametric and semi- parametric models) showed a significant difference between male and female students in weight reduction through time. Furthermore, the male students are better compared to the female students at reducing weight trajectories through time.

Our research does not support previous findings about the relationship between obesity and physical activity (Kumanyika, Jeffery, Morabia, Ritenbaugh, & Antipatis, 2002; Stiegler & Cunliffe, 2006). Our model showed that being physically active (active in physical activities as opposed to sedentary behaviour) does not significantly relate to the growth or reduction of a person’s weight. However, the results showed that the physical activity variable does seem to have the tendency to be significant in our semi-parametric Bayesian hierarchical linear growth model. Clearly, more study is needed to resolve this issue in weight reduction program.

Our results also indicated a significant positive correlation between height and weight reduction trajectories through time. The increase in one unit in the student’s height is followed by 0.867 unit of weight reduction trajectories through time. The results also indicated a significant negative correlation between time in weeks and weight reduction trajectories through time. The body weight appears to "grow" or reduce linearly with time with different intercepts and slopes (random effects) in all of our three models.

Lastly, at the technical level, especially for the semi-parametric linear growth model, a lot of issues needs to be resolved. All our models, especially the semi-parametric growth model, is based on the pragmatic approach rather than on an in-depth study. Clearly, the selection of prior distributions for the variance components, or hyperparameters, is sensitive. Further work needs to be done to ensure that the choice of prior distribution incorporates uncertainty concerning the variance parameters in the Bayesian estimation processes in order to achieve more precise results. Moreover, the selection of a prior, especially for the hyperparameter, should be guided by the nature of the data or by a psychologist who specializes in this area. Furthermore, additional effort is needed to maintain the desired estimations properties of Bayesian methods, especially in relation to α , stick-breaking algorithm and clusters (Lunn, et al., 2013, , et. et., 2007; Santos & Loschi, 2020).

8. Acknowledgement

We would like to express our gratitude and appreciation to all those who made the completion of this report possible. Special thanks to Research Management and Innovation Center (RMIC), Sultan Idiris Education University (UPSI), for granting us University Research Grant 2016-0167-106-01 to conduct this study. Our appreciation also goes to Mr. Mujeeb Khan who allowed us to use his experimental data on the effectiveness of an incentive-based weight reducing technique and made it possible for us to come up with these findings. Thanks to everybody who have directly or indirectly supported this study.

References

1. Alimoradi, M., Abdolahi, M.,Aryan, L., Vazirijavid, R. & Ajami, M. (2016). Cognitive behavioural therapy treatment of adult obesity. International journal of medical reviews. 3(1), 371-379.

2. Allison, D. B., Fontaine, K. R., Manson, J. E., Stevens, J., & VanItallie, T. B. (1999). Annual deaths attributable to obesity in the United States. JAMA, 282(16), 1530-1538.

3. Broemeling, L.D. (2016). Bayesian methods for repeated measures. NY: CRC press.

4. Butryn, M. L., Webb, V., & Wadden, T. A. (2011). Behavioral treatment of obesity. The Psychiatric clinics of North America, 34(4), 841.

5. Congdon, P. D. (2020). Bayesian hierarchical models with application using R, (2nd ed.). NY: CRC press.

6. Gelman, A., Carlin, J.B., Stern, H. S., and Rubin, D.B. (2004). Bayesian data analysis, (2nd ed.). NY: CRC press.

7. Jakicic, J. M., Marcus, B. H., Gallagher, K. I., Napolitano, M., & Lang, W. (2003). Effect of exercise duration and intensity on weight loss in overweight, sedentary women: a randomized trial. JAMA, 290(10), 1323-1330.

8. Kumanyika, S., Jeffery, R., Morabia, A., Ritenbaugh, C., & Antipatis, V. (2002a). Public Health Approaches to the Prevention of Obesity (PHAPO) Working Group of the International Obesity Task Force (IOTF). International Journal of Obesity and Related Metabolic Disorders, Journal of the International Association for the Study of Obesity, 26(3), 425-436.

9. Lunn, D., Jackson, C., Best, N., Thomas, A. & Spiegelhalter, D. (2013). The BUGS Book: A practical introduction to Bayesian analysis. NY : CRC press.

(12)

10. Manson, J. E., Willett, W. C., Stampfer, M. J., Colditz, G. A., Hunter, D. J., Hankinson, S. E., Speizer, F. E. (1995). Body weight and mortality among women. New England Journal of Medicine, 333(11), 677-685.

11. Mujib, K., Rahamatullah, K., Shahadan, M.A. (in press). The effectiveness of an incentive-based weight reducing technique among University Pendidikan Sultan Idris (UPSI) students, Malaysia, Malaysia. Malaysian journal of public health medicine.

12. Norlaila, M. T. (2008). Preliminary study on factors contributing to obesity among married women-a qualitative approach. Malaysian Journal of Nutrition, 14(2; suppl), S14.

13. Ntzoufran, I. (2009). Bayesian modelling using winBUGS. NY: Wiley

14. Ohlssen, D. Sharples, L. & Spiegelhalter, D. (2007). Flexible random-effects models using Bayesian semi-parametric models: Applications to institutional comparisons. Statistics in medicine, 26, 2088-2112.

15. Pinheiro, J.C. & Bates, D. M. (2000) Mixed-effects models in S and S-plus. NY: Springer. 16. Plummer, M. (2019). JAGS version 4.10 user manual [Computer software manual]

17. R Core Team (2020) R: A and environment for statistical computing [Computer software manual]. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www. R-project Org/. 18. Said, S. M., & Ismail, S. (2014). Prevalence and Factors associated with Overweight and Obesity

among Malaysian Post-Graduate Students in a Public University. International Journal of Public Health and Clinical Sciences, 1(1), 131-140

19. Santos, C.C., & Loschi, R.H. (2020). Semi-parametric Bayesian models for heterogeneous degradation data: An application to laser data. Reliability engineering and system safety, 202, 107038

20. Schoeller, D. A., Shay, K., & Kushner, R. F. (1997). How much physical activity is needed to minimize weight gain in previously obese women? The American journal of clinical nutrition, 66(3), 551-556 21. Schwartz, M. B., & Brownell, K. D. (2004). Obesity and body image. Body image, 1(1), 43-56.

22. Skrondal, A. & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and Structural equation models. Interdisciplinary statistics. Boca Raton: Chapman & Hall/CRC.

23. Spiegelhalter, D. J., Best, N. G., Carlin, B. P. & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, B 64, 583-639.

24. Stice, E., Shaw, H., & Marti, C. N. (2006). A meta-analytic review of obesity prevention programs for children and adolescents: the skinny on interventions that work. Psychological bulletin, 132(5), 667. 25. Stiegler, P., & Cunliffe, A. (2006). The role of diet and exercise for the maintenance of fat-free mass

and resting metabolic rate during weight loss. Sports medicine, 36(3), 239-262.

26. Wadden, T. A., Crerand, C. E., & Brock, J. (2005). Behavioral treatment of obesity. Psychiatric Clinics of North America, 28(1), 151-170

27. Wadden, T. A., & Foster, G. D. (2000). Behavioral treatment of obesity. Medical Clinics of North America, 84(2), 441-461.

28. Wadden, T. A., Vogt, R. A., Foster, G. D., & Anderson, D. A. (1998). Exercise and the maintenance of weight loss: 1-year follow-up of a controlled clinical trial. Journal of Consulting and Clinical Psychology, 66(2), 429.

Referanslar

Benzer Belgeler

maddesinde yer alan aile hayatının korunması teminatının evlilik içi aile ve aile yaşamı için olduğu kadar evlilik dışı aile ve aile yaşamı için de geçerli

Bu yönlerle ilişkili oranlarının çevrimleri alınarak minör aralıkları temsil eden yönler belirlenir, örneğin minör altılı aralık büyük üçlü yönü- nün çevrimi

tam olarak tanıtmaya yetmediğinden, Baydar, tefrika ile kita­ bın çıkışı arasında geçen iki yıla yakın zamanda, Tanrıöver’i bü­ tün yönleriyle

[r]

Çok sesli müzik gelene­ ği olmayan, ama bin yıllık bir müzik geleneği olan bir ülkede bunu yapmak oldukça güç. Türkiye’de çoksesli müzik geleneğinin

Belediyeler tarafından yapı kullanma izin belgesi verilen ya- pıların 2016 yılının ilk üç ayında bir önceki yıla göre, bina sa- yısı %2,6, yüzölçümü %2,2, değeri

His research interests are Motion estimation, 3-D motion models, non-rigid motion analysis, Gibbs Random Field based models, object-based coding and very low bit rate

Sadece genel sa¤l›k alg›lamas› (GSA) de¤erleri yoga grubunda egzersiz grubuna göre daha yüksek bulundu (Tablo 2).Tedavi sonra- s›nda yap›lan de¤erlendirmede, sol ve sa¤