Journal of Risk Model Validation
ISSN: 1753-9579 (print), 1753-9587 (online)
Editor-in-chief: Steve Satchell
Need to know
- The paper presents a new model which allows the incorporation of the presence of an excessive number of zero counts and overdispersion phenomena (where the variance is larger than the mean).
- The new model is estimated by developing a Bayesian approach, which allows model validation to be conducted with the posterior distribution, ie, by taking prior beliefs and the sample data into account.
- The validation process includes a comparison with other standard and Bayesian methodologies, concluding that the new model outperforms the others.
Abstract
Generalized linear models (GLMs) that use a regression procedure to fit relationships between predictor and target variables are widely used with insurance risk data. It is crucial to detect the risk factors that significantly affect the number of claims, as this will eventually allow the insurer to fix premiums more precisely. We focus on power series distributions, instead of the exponential family, and develop a Bayesian methodology as an alternative to traditionally used maximum-likelihood-based methods. We use sampling-based methods in order to detect relevant risk factors in an automobile insurance data set. This new model allows us to incorporate the presence of an excessive number of zero counts and overdispersion phenomena (where the variance is larger than the mean). Then, we validate this model by comparing the results with other standard and Bayesian models. As part of the process of validation, information criteria such as the deviance information criterion (DIC), Akaike information criterion (AIC) and Bayesian information criterion (BIC) have been considered. For real data collected from 2004 to 2005 by an Australian insurance company, an example is provided using the Markov chain Monte Carlo method, implemented in the WinBUGS package. The results show that the new Bayesian method outperforms the previous models.
1 Introduction
There are many applications involving discrete data in which the observed frequency of zeros is significantly higher than that predicted by the assumed model. The problem of observing a high proportion of zeros has been of interest in data analysis and modeling in many fields, such as medicine, engineering applications, manufacturing, economics, public health, road safety, epidemiology and, in particular, actuarial data. Models that account for this excess of zeros are known as zero-inflated models. Poisson regression models provide a standard framework for the analysis of count data. However, count data is often overdispersed relative to the Poisson distribution. One frequent source of overdispersion is that the incidence of zero counts is greater than expected for the Poisson distribution; this is of interest, because zero counts frequently have special status. For example, in counting claims from policyholders, a policyholder may have no claims either because they are a good driver or simply because no risk events have occurred while they were driving. This is the distinction between structural zeros, which are (almost) inevitable, and sampling zeros, which occur by chance. On the other hand, as has been pointed out by Denuit et al (2009), overdispersion leads to underestimates of standard errors and overestimates of chi-squared statistics. This could have serious consequences. For example, some explanatory variables may no longer be significant after overdispersion has been accounted for.
Over the last few decades, there has been considerable interest in models for count data that allow for excess zeros, particularly in the econometric literature. Mullahy (1986) explores the specification and testing of some modified count data models. Lambert (1992) provides a manufacturing defects application of these models and discusses the case of zero-inflated Poisson (ZIP) models. Gupta et al (1996) provide a general analysis of zero-inflated models. Gurmu (1997) develops a semi-parametric estimation method for hurdle (two-part) count regression models. Ridout et al (1998) consider the problem of modeling count data with excess zeros and review some possible models. Hall (2000) adapts Lambert’s methodology to an upper-bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. Ghosh et al (2006) use a Bayesian estimation method to introduce a flexible class of zero-inflated models that includes other familiar models, such as ZIP models, as special cases. An overview of count data in econometrics, including zero-inflated models, is provided in Cameron and Trivedi (1998). In insurance, Yip and Yau (2005) obtain a better fit to their insurance data by using zero-inflated count models. Boucher et al (2007) review zero-inflated and hurdle models with applications to a Spanish insurance company. More recently, Mouatassim and Ezzahid (2012) analyzed zero-inflated models with an application to private health insurance data.
In this paper, we use power series distributions to develop a novel and flexible zero-inflated Bayesian methodology. We employ sampling-based methods in order to model an automobile insurance data set. This model allows us to incorporate the presence of an excessive number of zero counts and overdispersion phenomena. The Bayesian approach allows model validation to be conducted with the posterior distribution, ie, by taking prior beliefs and the sample data into account. Recently, this methodology has been used as an alternative to traditional approaches to validating models in credit risk portfolios (see Jacobs et al 2015; Parnes 2015). The validation process of a model includes a comparison with other standard and Bayesian models by analyzing the significant factors and some information criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the deviance information criterion (DIC). The DIC is particularly useful in Bayesian model selection problems where the posterior distributions of the models have been obtained by Markov chain Monte Carlo (MCMC) simulation, as occurs in our model (see Spiegelhalter et al (2002, 2014) for further details).
The structure of this paper is as follows. Section 2 presents the zero-inflated power series distributions and the new Bayesian model proposed here. Section 3 looks at an automobile insurance application, and Section 4 briefly concludes.
2 Modeling zero-inflated data
It is known that power series distributions form a useful subclass of one-parameter discrete exponential families suitable for modeling count data. Since the original works of Kosambi (1949) and Noack (1950), the power series distribution has been very popular in the statistical literature dealing with discrete distributions that belong to this simple class. Two references concerning these features are Patil (1962a, 1962b). A review of the power series distribution can be found in Johnson et al (2005), Chapter 2.
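The probability function of the power series distribution is
$$\Pr(X = x) = \frac{a(x)\,\theta^{x}}{f(\theta)}, \qquad x = 0, 1, 2, \dots, \tag{2.1}$$
where $a(x) \geq 0$ is a function of $x$ or constant, $f(\theta) = \sum_{x=0}^{\infty} a(x)\,\theta^{x}$ is convergent and $\theta > 0$ is the power parameter of the distribution. The family of discrete distributions defined in (2.1) includes a broad class of known distributions, including the Poisson, binomial, negative binomial, logarithmic series and Conway–Maxwell–Poisson distributions. After computing the probability generating function, which is given by $G_X(s) = f(\theta s)/f(\theta)$, $|s| \leq 1$, it is easy to see that the mean and variance of the power series distribution are as follows:
$$\mu = E(X) = \frac{\theta f'(\theta)}{f(\theta)}, \tag{2.2}$$
$$\sigma^{2} = \operatorname{var}(X) = \frac{\theta^{2} f''(\theta)}{f(\theta)} + \mu(1 - \mu). \tag{2.3}$$
Thus, the index of dispersion,
$$\mathrm{ID} = \frac{\sigma^{2}}{\mu} = 1 + \theta\left[\frac{f''(\theta)}{f'(\theta)} - \frac{f'(\theta)}{f(\theta)}\right], \tag{2.4}$$
accommodates overdispersion when $\mathrm{ID} > 1$. For example, when the Poisson distribution is considered, we have that $f(\theta) = \exp(\theta)$ and $\mathrm{ID} = 1$, ie, we get equidispersion. If $f(\theta) = (1 - \theta)^{-r}$, $r > 0$, the distribution in (2.1) reduces to the negative binomial distribution; from (2.4), we get that $\mathrm{ID} = 1/(1 - \theta) > 1$, and the overdispersion phenomenon is obtained. Observe that for the binomial and negative binomial cases the corresponding additional integer parameters, usually called $n$ and $r$, are considered to be nuisance parameters.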
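The dispersion behavior described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper name `ps_moments` and the truncation at a fixed number of terms are illustrative assumptions. It evaluates the mean, variance and index of dispersion of a power series distribution directly from its coefficients $a(x)$.

```python
import math

def ps_moments(a, theta, terms=100):
    """Mean, variance and index of dispersion of a power series
    distribution with coefficients a(x) and power parameter theta.
    The infinite series is truncated after `terms` terms (an
    approximation that is adequate when the tail is negligible)."""
    weights = [a(x) * theta ** x for x in range(terms)]
    f = sum(weights)                       # normalising series f(theta)
    probs = [w / f for w in weights]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var, var / mean

# Poisson: a(x) = 1/x!, so f(theta) = exp(theta)
poisson = ps_moments(lambda x: 1.0 / math.factorial(x), theta=0.5)

# Geometric: a(x) = 1, so f(theta) = 1/(1 - theta), with 0 < theta < 1
geometric = ps_moments(lambda x: 1.0, theta=0.5)

print("Poisson   (mean, var, ID):", poisson)
print("Geometric (mean, var, ID):", geometric)
```

For the Poisson case the index of dispersion is 1, while for the geometric case (a negative binomial with $r = 1$) it equals $1/(1 - \theta)$, in line with the discussion above.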
Starting with a distribution belonging to the power series family, a more flexible model can be constructed by adding a parameter that inflates the probability of a zero when the empirical data shows an excess of zeros. Thus, a zero-inflated power series distribution contains two parameters. The first parameter indicates an inflation of zeros, and the other parameter is that of the power series distribution. A zero-inflated power series distribution is a mixture of a power series distribution and a degenerate distribution at zero, with a mixing probability for the degenerate distribution. As Johnson et al (2005) point out, a very simple alternative for modeling this setting is to add an arbitrary proportion of zeros, decreasing the remaining frequencies in an appropriate manner. In conclusion, zero-inflated models deal with the problem that the data displays a higher number of zeros (nonclaims in our case), and they are therefore appropriate for modeling counts that encounter disproportionately large frequencies of zeros.
If we start with a discrete distribution $p(y)$, we can build a zero-inflated distribution in a simple form (see Cohen 1966) by assuming
$$\Pr(Y = y) = \begin{cases} \omega + (1 - \omega)\,p(0), & y = 0, \\ (1 - \omega)\,p(y), & y = 1, 2, \dots, \end{cases} \tag{2.5}$$
where $p(y)$, $y = 0, 1, \dots$, is the parent distribution, and
$$-\frac{p(0)}{1 - p(0)} \leq \omega < 1. \tag{2.6}$$
This last inequality allows the distribution to be well defined for certain negative values of $\omega$. The downside to using the support in (2.6) instead of the usual $0 < \omega < 1$ is that the mixing interpretation is lost; however, in practice, the parameter $\omega$ can take negative values within the support given in (2.6), and therefore (2.5) is a genuine probability distribution (see, for example, Bhattacharya et al 2008). Later, we will see that this is the case for the data considered here.
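So, the probability mass function of the zero-inflated power series distribution results in
$$\Pr(Y = y) = \begin{cases} \omega + (1 - \omega)\,\dfrac{a(0)}{f(\theta)}, & y = 0, \\[2ex] (1 - \omega)\,\dfrac{a(y)\,\theta^{y}}{f(\theta)}, & y = 1, 2, \dots, \end{cases} \tag{2.7}$$
where $\theta > 0$ and $\omega$ satisfies (2.6) with $p(0) = a(0)/f(\theta)$. The mean and variance are
$$E(Y) = (1 - \omega)\,\mu, \tag{2.8}$$
$$\operatorname{var}(Y) = (1 - \omega)\,(\sigma^{2} + \omega\mu^{2}), \tag{2.9}$$
where $\mu$ and $\sigma^{2}$ denote the mean and variance of the power series distributions given in (2.2) and (2.3), respectively.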
Now, zero-inflated forms assuming different count distributions belonging to the power series distribution can be defined easily. Gupta et al (1996) and Ghosh et al (2006), for example, investigated the zero-inflated form of the generalized Poisson distribution.
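As an illustration of how a zero-inflated form is built from a parent count distribution, the sketch below wraps a Poisson parent and checks numerically that the resulting probabilities sum to one and that the mean and variance equal $(1-\omega)\mu$ and $(1-\omega)(\sigma^2 + \omega\mu^2)$. The function names `zi_pmf` and `poisson_pmf` and the parameter values are hypothetical choices made for this example only.

```python
import math

def zi_pmf(parent_pmf, omega):
    """Return the zero-inflated version of `parent_pmf`.

    omega may be negative as long as the probability of a zero stays
    non-negative, ie, omega >= -p(0) / (1 - p(0))."""
    p0 = parent_pmf(0)
    lower = -p0 / (1.0 - p0)               # smallest admissible omega
    if not (lower <= omega < 1.0):
        raise ValueError("omega outside the admissible range")

    def pmf(y):
        base = (1.0 - omega) * parent_pmf(y)
        return omega + base if y == 0 else base

    return pmf

def poisson_pmf(theta):
    """Poisson parent distribution with mean theta."""
    return lambda y: math.exp(-theta) * theta ** y / math.factorial(y)

theta, omega = 2.0, 0.3                    # illustrative values
zip_pmf = zi_pmf(poisson_pmf(theta), omega)

support = range(100)                       # truncated support
total = sum(zip_pmf(y) for y in support)
mean = sum(y * zip_pmf(y) for y in support)
var = sum((y - mean) ** 2 * zip_pmf(y) for y in support)

# Closed forms for a Poisson parent, where mu = sigma^2 = theta
expected_mean = (1.0 - omega) * theta
expected_var = (1.0 - omega) * (theta + omega * theta ** 2)
print(total, mean, expected_mean, var, expected_var)
```

Passing a slightly negative `omega` (for instance `-0.1` with `theta = 2.0`) still yields a valid distribution, which corresponds to zero deflation rather than inflation.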
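Maximum-likelihood estimators of $\omega$ and $\theta$ can be obtained by maximizing the likelihood $L(\omega, \theta; y)$, $y = (y_1, \dots, y_n)$, with respect to $\omega$ and $\theta$, where
$$L(\omega, \theta; y) = \left[\omega + (1 - \omega)\,\frac{a(0)}{f(\theta)}\right]^{n_0} \prod_{i:\, y_i > 0} (1 - \omega)\,\frac{a(y_i)\,\theta^{y_i}}{f(\theta)}. \tag{2.10}$$
Here, $n$ is the sample size and $n_0$ is the number of zero counts in the sample. Following Ghosh et al (2006), and using binomial expansion, the likelihood function in (2.10) can be written as
$$L(\omega, \theta; y) = \sum_{j=0}^{n_0} \binom{n_0}{j}\, \omega^{j} (1 - \omega)^{n - j}\, a(0)^{n_0 - j}\, \frac{\theta^{\sum_i y_i}}{f(\theta)^{n - j}} \prod_{i:\, y_i > 0} a(y_i). \tag{2.11}$$
After obtaining the normal equations, we have to solve
$$\frac{\theta f'(\theta)}{f(\theta) - a(0)} = \frac{n\bar{y}}{n - n_0}$$
to get the maximum-likelihood estimate $\hat{\theta}$ of $\theta$, where $\bar{y}$ is the sample mean. Once $\hat{\theta}$ is obtained, the parameter $\omega$ is obtained from
$$\hat{\omega} = 1 - \frac{\bar{y}\, f(\hat{\theta})}{\hat{\theta} f'(\hat{\theta})}.$$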
Therefore, the maximum-likelihood estimation of the parameters under the power series distribution is simple. In a similar manner, the regression coefficients when covariates are implemented in the model can also be obtained in a simple way.
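To make the estimation step concrete, here is a minimal sketch for the Poisson case without covariates. It assumes the likelihood equations reduce to matching the mean of the positive counts to the zero-truncated Poisson mean, $\theta/(1 - e^{-\theta})$, and then recovering $\omega$ from the overall mean. The function name `zip_mle`, the bisection solver and the toy data are illustrative assumptions, not taken from the paper.

```python
import math

def zip_mle(counts):
    """Maximum-likelihood estimates (omega, theta) of a zero-inflated
    Poisson model without covariates.

    Requires at least one count of 2 or more, so that the mean of the
    positive counts exceeds 1 and the equation below has a root."""
    n = len(counts)
    n0 = sum(1 for y in counts if y == 0)
    ybar = sum(counts) / n
    target = n * ybar / (n - n0)           # mean of the positive counts

    # Solve theta / (1 - exp(-theta)) = target by bisection. The left-hand
    # side increases from 1 (as theta -> 0) and is never below theta, so
    # the root lies in (0, target].
    lo, hi = 1e-12, target
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - math.exp(-mid)) < target:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)

    omega = 1.0 - ybar / theta             # from E(Y) = (1 - omega) * theta
    return omega, theta

# Hypothetical sample: 70 zeros and 30 positive counts
toy_counts = [0] * 70 + [1] * 12 + [2] * 10 + [3] * 5 + [4] * 3
omega_hat, theta_hat = zip_mle(toy_counts)
fitted_zero_prob = omega_hat + (1.0 - omega_hat) * math.exp(-theta_hat)
print(omega_hat, theta_hat, fitted_zero_prob)
```

At the solution, the fitted probability of a zero coincides with the observed proportion of zeros, which gives a simple check on the estimates.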
2.1 Including covariates
In practice, practitioners usually have a data set with commonly available exogenous covariates with which to explain the variable $Y$, known in this case as the endogenous variate. That is, suppose that for the $i$th observation the covariate vectors $x_i$ and $z_i$ are available. In order to adapt the model to this framework, we need to relate these covariates to the endogenous variable via the parameters $\theta$ and $\omega$. This can be done through link functions applied to linear predictors in the covariates, with $\beta$ and $\gamma$ vectors of unknown regression parameters associated with the covariates. Of course, in practice it is common to suppose that the two design matrices are the same.
A nice reformulation of the zero-inflated model above was proposed recently by Ghosh et al (2006), who considered that the zero-inflated model can be represented as $Y = V(1 - B)$, where $B$ is a Bernoulli random variable, $B \sim \text{Bernoulli}(\omega)$, and $V$, independently of $B$, has a discrete distribution in the power series family with parameter $\theta$. Under this representation, the mean and variance can be rewritten as
$$E(Y) = (1 - \omega)\,\mu, \tag{2.12}$$
$$\operatorname{var}(Y) = E(Y)\,(\mathrm{ID}_V + \omega\mu), \tag{2.13}$$
where $\mathrm{ID}_V = \sigma^{2}/\mu$ denotes the coefficient of dispersion of the latent random variable, $V$. If $V$ does not have an underdispersed distribution (ie, $\mathrm{ID}_V \geq 1$), then the distribution of $Y$ is overdispersed. On the other hand, if $V$ does have an underdispersed distribution (ie, $\mathrm{ID}_V < 1$), then $Y$ has an underdispersed distribution if and only if $\omega < (1 - \mathrm{ID}_V)/\mu$. In their paper, Ghosh et al (2006) suppose that $V$ follows the power series distribution.
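The dispersion condition can be traced in a few lines. The following derivation is a sketch under the stated representation, writing $\mu$ and $\sigma^2$ for the mean and variance of the latent variable $V$ and $\omega$ for the Bernoulli probability (notation assumed here); it uses only the independence of $V$ and $B$ and the fact that $(1-B)^2 = 1-B$.

```latex
\begin{align*}
E(Y)   &= E(V)\,E(1-B) = (1-\omega)\,\mu, \\
E(Y^2) &= E(V^2)\,E\{(1-B)^2\} = (1-\omega)\,(\sigma^2 + \mu^2), \\
\operatorname{var}(Y)
       &= (1-\omega)(\sigma^2 + \mu^2) - (1-\omega)^2\mu^2
        = (1-\omega)\,\mu\left(\frac{\sigma^2}{\mu} + \omega\mu\right), \\
\frac{\operatorname{var}(Y)}{E(Y)}
       &= \frac{\sigma^2}{\mu} + \omega\mu .
\end{align*}
```

Hence the index of dispersion of $Y$ exceeds that of $V$ by $\omega\mu$, so any $\omega > 0$ pushes the observed counts toward overdispersion.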
In our paper, two discrete distributions belonging to the power series distributions will be considered. They are the Poisson distribution with parameter $\theta > 0$ and the geometric distribution with parameter $\theta$, $0 < \theta < 1$. These models will be denoted as the ZIP model and the zero-inflated geometric (ZIG) model.
2.2 The Bayesian model
In this section, a Bayesian methodology is carried out; this allows us to estimate the model above in a simple way, facilitating the process of incorporating covariates and providing exact posterior inference up to a Monte Carlo error. This model can easily accommodate multiple continuous and categorical predictors.
From a Bayesian point of view, prior distributions for $\omega$ and $\theta$ will be required. In this sense, and looking at the likelihood in (2.11), it is adequate to assume a beta prior distribution for $\omega$ and the natural conjugate prior (Ghosh et al 2006) for the parameter $\theta$ of the power series distribution.
Both assumptions establish a congruent model, present important computational advantages and, in addition, have a tradition in the Bayesian statistical literature. However, in this paper, the covariates that affect $\omega$ and $\theta$ are fixed. So, we specify independent uniform prior distributions for the parameters of the regression models, ie, $\beta$ and $\gamma$, whose bounds are constants that are assumed to be known and are chosen to express our lack of knowledge about the regression parameters. These noninformative uniform distributions are appropriate if no prior knowledge is available about the likely range of values of the parameters (see Lempers (1971) and Mitchell and Beauchamp (1988), among others).
Given that the prior distributions for the parameters have been assessed, the next step is to combine the likelihood function in (2.11) with the priors in order to make Bayesian inference. Since no closed forms are available for the marginal posterior distributions, numerical approaches have to be used to generate them. The numerical approaches used are based on simulations from the posterior distributions, which are proper, since we consider proper priors, although with somewhat large variances. The simulation approach used is MCMC, which can be conducted using the WinBUGS software. MCMC is a method of posterior simulation: it generates samples from an arbitrary posterior density, even when that density does not correspond to a known distribution, and uses these samples to approximate expectations of quantities of interest, such as the mean or second-order moment. In this context, Gibbs sampling is a natural estimation method. Reasonable choices for starting values of $\beta$ and $\gamma$ for the MCMC simulation can be obtained from standard Poisson and negative binomial regression models using any statistical software package, such as Stata. In this work, all simulations were done using WinBUGS (Spiegelhalter et al 1999). We ran three parallel chains and a single long chain for diagnostic assessment (checked using CODA software). A total of 100 000 iterations were carried out (after another 100 000 iterations of burn-in). A complete Gibbs sampling algorithm is outlined in Ghosh et al (2006).
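To give a feel for how Gibbs sampling works in this setting, the sketch below implements a data-augmentation sampler for a ZIP model without covariates. It is not the WinBUGS model used in the paper: the priors (a beta prior for $\omega$ and a gamma prior for $\theta$, which is the conjugate choice in the Poisson case), the restriction of $\omega$ to $(0, 1)$, the hyperparameter values, the function names and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def gibbs_zip(counts, n_iter=2000, burn_in=500, a=1.0, b=1.0,
              shape=0.01, rate=0.01, seed=0):
    """Gibbs sampler for a ZIP model without covariates.

    Assumed priors (illustrative): omega ~ Beta(a, b) and
    theta ~ Gamma(shape, rate). Returns the retained (omega, theta) draws."""
    rng = random.Random(seed)
    n = len(counts)
    n0 = sum(1 for y in counts if y == 0)
    total = sum(counts)
    omega, theta = 0.5, max(total / n, 0.1)    # crude starting values
    draws = []
    for it in range(n_iter):
        # 1. Latent step: an observed zero is structural with this probability.
        p_struct = omega / (omega + (1.0 - omega) * math.exp(-theta))
        s = sum(1 for _ in range(n0) if rng.random() < p_struct)
        # 2. omega | s ~ Beta(a + s, b + n - s)
        omega = rng.betavariate(a + s, b + n - s)
        # 3. theta | s, y ~ Gamma(shape + sum(y), rate + n - s);
        #    random.gammavariate takes a scale, hence the reciprocal.
        theta = rng.gammavariate(shape + total, 1.0 / (rate + n - s))
        if it >= burn_in:
            draws.append((omega, theta))
    return draws

# Synthetic data with known parameters (hypothetical values)
true_omega, true_theta = 0.4, 2.0
data_rng = random.Random(1)
counts = [0 if data_rng.random() < true_omega
          else sample_poisson(data_rng, true_theta) for _ in range(1000)]

draws = gibbs_zip(counts, n_iter=600, burn_in=200, seed=2)
post_omega = sum(d[0] for d in draws) / len(draws)
post_theta = sum(d[1] for d in draws) / len(draws)
print("posterior means:", post_omega, post_theta)
```

Each sweep alternates between imputing which zeros are structural and updating the two parameters from their full conditionals. Adding covariates would replace the two conjugate updates with steps for the regression coefficients, which is where a general-purpose tool such as WinBUGS becomes convenient.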
If Bayesian estimation is considered for the ZIP and ZIG models, these models will be denoted by BZIP and BZIG, respectively.
3 Experiments with insurance data
In this section, an application of the different models considered in the previous sections is developed in order to see how the proposed Bayesian method works. The data set considered was taken from the webpage of Macquarie University in Sydney, Australia, where the different data sets used by de Jong and Heller (2008) are available. This page contains numerous data sets that can be used in an actuarial setting.
3.1 The data
In particular, the data studied contains information from the policyholders of an Australian insurance company between 2004 and 2005. It describes certain characteristics related to the vehicles and the policyholders. The database contains 67 856 policies, of which 63 232 (93.18%) have no claims. This is high-frequency data, although the methodology proposed here is also valid for models with low-frequency data. Table 1 shows a descriptive summary of the dependent and independent variables.
Variables | Mean | Variance | Minimum | Maximum |
---|---|---|---|---|
Number of claims | 0.07275 | 0.07739 | 0 | 4 |
Vehicle value | 1.77702 | 1.45258 | 0 | 34.56 |
Gender | 0.43110 | 0.24525 | 0 | 1 |
Young age | 0.27436 | 0.19908 | 0 | 1 |
Medium age | 0.57492 | 0.24439 | 0 | 1 |
Old age | 0.46081 | 0.24846 | 0 | 1 |
Vehicle age | 0.57492 | 0.24439 | 0 | 1 |

Measure | Sample | Poisson | Geometric | ZIP | ZIG |
---|---|---|---|---|---|
Proportion of zeros | 0.93180 | 0.92983 | 0.93218 | 0.93180 | 0.93186 |
Cumulant | 0.08757 | 0.07276 | 0.08941 | 0.08573 | 0.08705 |
Zero-inflation index | 0.03000 | 0.00000 | 0.03470 | 0.03000 | 0.03000 |
Third central moment inflation index | 0.20367 | 0.00000 | 0.22886 | 0.17832 | 0.19640 |
Table 2 shows some measures that lead us to consider departures from the Poisson distribution. These measures are the proportion of zeros, the cumulant, the zero-inflation index and the third central moment inflation index (see Puig and Valero (2006) for details). In this table, we can see the sample values of these measures and their corresponding estimated values; these are obtained using the Poisson, geometric, ZIP and ZIG distributions after estimating the corresponding parameters using the maximum-likelihood method. We can see that the geometric distribution (in both its zero-inflated and non-inflated versions) outperforms the Poisson one. In order to test whether there are too many zeros for the Poisson distribution, a Van den Broek (1995) score test has been conducted. This statistic tests the null hypothesis of no zero inflation, $H_0\colon \omega = 0$, and is defined by
$$S = \frac{(n_0 - n\hat{p}_0)^{2}}{n\hat{p}_0(1 - \hat{p}_0) - n\bar{y}\,\hat{p}_0^{2}},$$
where $n_0$ is the number of zeros, $\hat{p}_0 = \exp(-\bar{y})$ is the estimated probability of zero in the Poisson regression, and $\bar{y}$ is the average of the count observations. Under the null hypothesis, this statistic has an asymptotic chi-squared distribution with 1 degree of freedom. The score statistic is given by 121 392.61, which provides evidence that the observed zeros exceed the zeros limit of the Poisson distribution.
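A score test of this kind is straightforward to compute. The sketch below assumes the no-covariate form of the Van den Broek statistic, $S = (n_0 - n\hat p_0)^2 / \{n\hat p_0(1-\hat p_0) - n\bar y\,\hat p_0^2\}$ with $\hat p_0 = \exp(-\bar y)$, and a chi-squared reference distribution with 1 degree of freedom. The function name and the toy counts are hypothetical; the toy counts are not the insurance data.

```python
import math

def zero_inflation_score(counts):
    """Score statistic and p-value for excess zeros relative to a
    Poisson distribution fitted without covariates."""
    n = len(counts)
    n0 = sum(1 for y in counts if y == 0)
    ybar = sum(counts) / n
    p0 = math.exp(-ybar)                   # Poisson probability of a zero
    stat = (n0 - n * p0) ** 2 / (n * p0 * (1.0 - p0) - n * ybar * p0 ** 2)
    # Upper tail of a chi-squared(1) variable: P(X > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical sample with many more zeros than a Poisson would predict
toy_counts = [0] * 70 + [1] * 12 + [2] * 10 + [3] * 5 + [4] * 3
score, p_value = zero_inflation_score(toy_counts)
print(score, p_value)
```

A large value of the statistic, and hence a small p-value, indicates more zeros than the fitted Poisson distribution can accommodate.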
For each policy, the initial information for the period considered and the existence (or otherwise) of at least one claim are reported within this yearly period. In total, four explanatory variables are considered, together with a dependent variable representing the number of claims. Vehicle value is expressed in units of 10 000 Australian dollars. Vehicle age is equal to 1 if the vehicle is relatively young (seven years old or less). Gender is equal to 1 if the policyholder is a man. This variable is included in the model for didactic purposes, but, as expected, it is not relevant in any of the models considered. Finally, a categorical variable represents the age of the policyholder, divided into three dummies: young, medium and old ages. In this way, we try to identify whether there are age groups with a higher propensity to make claims. Several authors have previously used the age variable in a dichotomous way (see Boucher et al (2007), Bermúdez (2009) and Pérez et al (2014), among others).
3.2 Results, diagnostic and validation
Table 3 shows results under the Bayesian estimation of the ZIP model. As we can see, vehicle value and older policyholders (in relation to medium-aged policyholders) are relevant for the chance of being in the zero state. The positive value of the first coefficient indicates that this chance increases with the value of the vehicle (with significance at 1%). The negative slope of old age indicates that the chance of being in the zero state decreases for older policyholders in relation to medium-aged policyholders, at 5% significance. However, the intercept of the count part (with significance at 1%) indicates that the average number of claims is lower for medium-aged policyholders. Further, the higher the vehicle value, the lower the average number of claims expected, again at 1% significance. These results are consistent with the zero-state coefficients.
Variable | Zero state | Average number of claims |
---|---|---|
Intercept | 0.482 (0.295) | 1.883*** (0.122) |
Vehicle value | 0.543*** (0.096) | 0.102*** (0.026) |
Gender | 0.005 (0.181) | 0.028 (0.077) |
Young age | 0.0008 (0.237) | 0.095 (0.093) |
Old age | 0.497** (0.212) | 0.018 (0.101) |
Vehicle age | 0.070 (0.236) | 0.035 (0.097) |
Table 4 shows results under the Bayesian estimation of the ZIG model. Now, being in the medium-aged class increases the chance of the zero state (the zero-state intercept is relevant at 10%). The average number of claims increases if the vehicle value is high (at 1% significance) and decreases in the medium- and old-aged classes (more for older policyholders). For the youngest people, the average number of claims is expected to be greater than for the other two groups of policyholders (with 1% significance). Finally, the average number of claims decreases if the age of the vehicle increases.
Variable | Zero state | Average number of claims |
---|---|---|
Intercept | 5.142* (3.199) | 2.637*** (0.051) |
Vehicle value | 3.097 (3.328) | 0.046*** (0.022) |
Gender | 2.055 (3.962) | 0.020 (0.032) |
Young age | 0.311 (4.628) | 0.096*** (0.040) |
Old age | 2.421 (4.705) | 0.211*** (0.043) |
Vehicle age | 2.672 (4.441) | 0.054* (0.037) |
Finally, it is interesting to compare the results from a Bayesian perspective with those obtained using standard methodology. Table 5 shows the results under the standard Poisson and negative binomial models, along with their zero-inflated versions. We observe that they detect the same relevant factors as the BZIG model, but these models do not distinguish between the factors affecting the zero state and those affecting the average number of claims.
Although there are a variety of methodologies to validate several models for a given data set, the DIC, which is a generalization of the AIC and BIC, is useful in Bayesian model selection problems where the posterior distributions of the models have been obtained by MCMC simulation. Recall that the DIC is only valid when the posterior distribution is approximately multivariate normal, which is the case considered here. Under each criterion, the model that best fits the data set is the one with the smallest value. As we can see in Table 6, the frequentist models fit worse than the BZIP and BZIG models. The deviance of BZIP is the smallest, but its confidence interval includes the deviance of the BZIG model, so there is no significant difference in fit between these two models. By comparing Tables 4 and 5, we can observe that the estimated coefficients differ considerably, although the signs and the relevant factors remain the same. So, we can say that the results from the BZIG model are more consistent (than those from the BZIP model) with the results from the classical models in Table 5, while providing a much better fit.
Variable | Poisson | Negative binomial | ZIP | ZINB |
---|---|---|---|---|
Intercept | 2.6282*** (0.0407) | 2.6305*** (0.0412) | 2.0540*** (0.0696) | 2.6306*** (0.0412) |
Vehicle value | 0.0381*** (0.0114) | 0.0392*** (0.0117) | 0.0392*** (0.0118) | 0.0393*** (0.0117) |
Gender | 0.0159 (0.0300) | 0.0161 (0.0300) | 0.0164 (0.0301) | 0.0162 (0.0301) |
Young age | 0.0958*** (0.0337) | 0.0955*** (0.0337) | 0.0949*** (0.0337) | 0.0956*** (0.0337) |
Old age | 0.2081*** (0.0385) | 0.2083*** (0.0385) | 0.2082*** (0.0385) | 0.2083*** (0.0385) |
Vehicle age | 0.0617* (0.0334) | 0.0606* (0.0335) | 0.0604* (0.0335) | 0.0606* (0.0335) |
Inflation constant | | | 0.2849* (0.1283) | 14.1262*** (0.7096) |

Model | DIC | AIC | BIC |
---|---|---|---|
Poisson | 36 118.353 | 36 130.353 | 36 185.105 |
Negative binomial | 36 019.507 | 36 033.507 | 36 097.383 |
ZIP | 36 026.881 | 36 040.881 | 36 104.757 |
ZINB | 36 019.507 | 36 035.507 | 36 108.508 |
BZIP | 9 611.000 | 9 623.000 | 9 744.502 |
BZIG | 9 871.000 | 9 893.000 | 10 004.501 |
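The information criteria used in the comparison follow simple formulas, sketched below. Two assumptions are made for the worked numbers: the first column of the Poisson row is treated as the deviance (minus twice the maximized loglikelihood), and the Poisson regression is taken to have $k = 6$ parameters (an intercept and five covariates). Under these assumptions the code reproduces the AIC reported for the Poisson model and, to within rounding, its BIC. The DIC helper follows the usual definition (posterior mean deviance plus the effective number of parameters) and is shown with made-up deviance draws, since the MCMC output is not available here.

```python
import math

def aic(deviance, k):
    """Akaike information criterion from the deviance (-2 log L) and
    the number of parameters k."""
    return deviance + 2 * k

def bic(deviance, k, n):
    """Bayesian information criterion for a sample of size n."""
    return deviance + k * math.log(n)

def dic(deviance_draws, deviance_at_mean):
    """Deviance information criterion from MCMC output.

    deviance_draws: deviance evaluated at each posterior draw.
    deviance_at_mean: deviance evaluated at the posterior mean."""
    mean_dev = sum(deviance_draws) / len(deviance_draws)
    p_d = mean_dev - deviance_at_mean      # effective number of parameters
    return mean_dev + p_d

n = 67856                                  # number of policies in the data
poisson_aic = aic(36118.353, k=6)
poisson_bic = bic(36118.353, k=6, n=n)
example_dic = dic([10.0, 12.0, 14.0], deviance_at_mean=11.0)  # made-up draws
print(poisson_aic, poisson_bic, example_dic)
```

For all three criteria, smaller values indicate a better trade-off between fit and complexity.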
4 Final remarks
In this paper, we develop a Bayesian methodology using sampling-based methods in order to model an automobile insurance data set using discrete distributions belonging to the power series distributions. As a consequence, we get a new and flexible model when overdispersion and an inflation of zeros are present in the data set.
In order to validate this new Bayesian model, we present the results of an experiment using real data, collected between 2004 and 2005 from an Australian insurance company, and the MCMC method, which is implemented using the WinBUGS package. The validation process includes comparisons with standard and Bayesian ZIP and ZIG models, in terms of parameter estimates and information criteria such as DIC, AIC and BIC. The results obtained here show that the new Bayesian method outperforms the previous standard and Bayesian models.
It would be of interest for future research to modify the model in order to take into account truncated and censored data, which is often seen in insurance claim data.
Declaration of interest
The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.
Acknowledgements
The authors would like to thank the editor, associate editor and the anonymous referees for their relevant and useful comments. The authors also thank the Ministerio de Economía y Competitividad (project ECO2013–47092, Ministerio de Economía y Competitividad, Spain) for partial support of this work.
References
Bermúdez, L. (2009). A priori ratemaking using bivariate Poisson regression models. Insurance: Mathematics and Economics 44(1), 135–141 (http://doi.org/fvhphv).
Bhattacharya, A., Clarke, B. S., and Datta, G. (2008). A Bayesian test for excess zeros in a zero-inflated power series distribution. Institute of Mathematical Statistics 1, 89–104 (http://doi.org/cmhd6b).
Boucher, J., Denuit, M., and Guillén, M. (2007). Risk classification for claim counts: a comparative analysis of various zero-inflated mixed Poisson and hurdle models. North American Actuarial Journal 11(4), 110–131 (http://doi.org/bqxr).
Cameron, C., and Trivedi, P. (1998). Regression Analysis of Count Data. Cambridge University Press (http://doi.org/bqxs).
Cohen, A. C. (1966). A note on certain discrete mixed distributions. Biometrics 22(3), 566–572 (http://doi.org/fm6p24).
Denuit, M., Maréchal, X., Pitrebois, S., and Walhin, J.-F. (2009). Actuarial Modelling of Claim Counts: Risk Classification, Credibility and Bonus-Malus Systems. Wiley (http://doi.org/ftd6rt).
Ghosh, S., Mukhopadhyay, P., and Lu, J.-C. (2006). Bayesian analysis of zero-inflated regression models. Journal of Statistical Planning and Inference 136(4), 1360–1375 (http://doi.org/bw9fb2).
Gupta, P., Gupta, R., and Tripathi, R. (1996). Analysis of zero-adjusted count data. Computational Statistics and Data Analysis 23(2), 207–218 (http://doi.org/bhn54p).
Gurmu, S. (1997). Semi-parametric estimation of hurdle regression models with an application to Medicaid utilization. Journal of Applied Econometrics 12, 225–242 (http://doi.org/ckj98m).
Hall, D. (2000). Zero-inflated Poisson and binomial regression with random effects: a case study. Biometrics 56(4), 1030–1039 (http://doi.org/b7g7s7).
Jacobs, M., Karagozoglu, A., and Sensenbrenner, F. (2015). Stress testing and model validation: application of the Bayesian approach to a credit risk portfolio. The Journal of Risk Model Validation 9(3), 41–70 (http://doi.org/bq5q).
Johnson, N., Kemp, A., and Kotz, S. (2005). Univariate Discrete Distributions. Wiley (http://doi.org/ct345v).
de Jong, P., and Heller, G. (2008). Generalized Linear Models for Insurance Data. Cambridge University Press (http://doi.org/df2fxf).
Kosambi, D. (1949). Characteristic properties of series distributions. Proceedings of the National Institute for Science, India 15, 109–113.
Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics 34(1), 1–14 (http://doi.org/fp557w).
Lempers, F. (1971). Posterior Probabilities of Alternative Linear Models. Rotterdam University Press.
Mitchell, T., and Beauchamp, J. (1988). Bayesian variable selection in linear regression. Journal of the American Statistical Association 83(404), 1023–1032 (http://doi.org/bqxt).
Mouatassim, Y., and Ezzahid, E. (2012). Poisson regression and zero-inflated Poisson regression: application to private health insurance data. European Actuarial Journal 2(2), 187–204 (http://doi.org/bqxv).
Mullahy, J. (1986). Specification and testing of some modified count data models. Journal of Econometrics 33(3), 341–365 (http://doi.org/b6ff6h).
Noack, A. (1950). A class of random variables with discrete distributions. Annals of Mathematical Statistics 21(1), 127–132 (http://doi.org/bcf2b7).
Parnes, D. (2015). Bayesian synthesis of portfolio credit risk with missing ratings. The Journal of Risk 18(1), 1–25 (http://doi.org/bq5r).
Patil, G. (1962a). Estimation by two-moments method for generalized power series distribution and certain applications. Sankhya: The Indian Journal of Statistics B 24(3/4), 201–214.
Patil, G. (1962b). Maximum likelihood estimation for generalized power series distributions and its application to a truncated binomial distribution. Biometrika 49(1/2), 227–237 (http://doi.org/dhr26j).
Pérez, J., Negrín, M., García, C., and Gómez-Déniz, E. (2014). Bayesian asymmetric logit model for detecting risk factors in motor ratemaking. ASTIN Bulletin 44(2), 445–457 (http://doi.org/bqxw).
Puig, P., and Valero, J. (2006). Count data distributions: some characterizations with applications. Journal of the American Statistical Association 101(473), 332–340 (http://doi.org/dxbdpf).
Ridout, M., Demétrio, C., and Hinde, J. (1998). Models for count data with many zeros. Working Paper, December, International Biometric Conference, Cape Town, South Africa.
Spiegelhalter, D., Thomas, A., and Best, N. (1999). WinBUGS (Version 1.2). MRC Biostatistics Unit, Cambridge.
Spiegelhalter, D., Best, N., Carlin, B., and van der Linde, A. (2002). Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society B 64(4), 583–639 (http://doi.org/dfrgt6).
Spiegelhalter, D., Best, N., Carlin, B., and van der Linde, A. (2014). The deviance information criterion: 12 years on (with discussion). Journal of the Royal Statistical Society B 76(3), 485–493 (http://doi.org/bqxx).
Van den Broek, J. (1995). A score test for zero inflation in a Poisson distribution. Biometrics 51, 738–743 (http://doi.org/bpkr4h).
Yip, K., and Yau, K. (2005). On modeling claim frequency data in general insurance with extra zeros. Insurance: Mathematics and Economics 36(2), 153–163 (http://doi.org/fbhjpk).