7.  a) more generally replacing 3000 by d,

In terms of E(X), E(min(X,d)), S(d)=P(X>d), we have

E(min(X,d)^k) = int_{x=0}^d x^k f(x) dx + d^k S(d)

Also, E(X^k) = int_{x=0}^d x^k f(x) dx + int_{x=d}^\infty x^k f(x) dx

so int_{x=d}^\infty x^k f(x) dx = E(X^k) - E(min(X,d)^k) + d^k S(d)

b) (i) the probability that a claim goes to a reinsurer is P(X>d)=S(d).
The number of claims going to the re-insurer
is Poisson (lambda S(d)), so the expected number is lambda S(d)

(ii) The expected payment from the reinsurer, per claim that reaches it, is

E(X-d | X>d) = \int_{x=d}^\infty (x-d) f(x) dx / S(d)

(iii) The expected aggregate claim cost for the reinsurer is
  lambda S(d) E(X-d | X>d) = lambda E((X-d)_+),  so just plug in from (i) and (ii)

(iv) the direct insurer's expected aggregate claim without reinsurance
is lambda E(X). Subtracting the mean for the reinsurer, the expected
aggregate claim for the direct insurer is
lambda E(X) - lambda E((X-d)_+) = lambda E(min(X,d))

c) in general V(S) = E(N) V(X) + V(N) (E(X))^2

Here N is Poisson(lambda S(d)), and use the conditional moments
E(X-d | X>d) and E((X-d)^2 | X>d) in the calculations
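
A minimal R sketch of parts (b) and (c). The exponential severity and the values of lambda, d and theta are assumptions made only for illustration (amounts in units of $1000); they are not taken from the question.

```r
# Assumed inputs (hypothetical): Poisson rate, retention d, exponential mean theta
lambda <- 100; d <- 3; theta <- 2
f   <- function(x) dexp(x, rate = 1 / theta)
S_d <- pexp(d, rate = 1 / theta, lower.tail = FALSE)   # S(d) = P(X > d)

# Conditional moments of the reinsurer's payment, E((X-d)^k | X > d)
m1 <- integrate(function(x) (x - d)   * f(x), d, Inf)$value / S_d
m2 <- integrate(function(x) (x - d)^2 * f(x), d, Inf)$value / S_d

n_re    <- lambda * S_d                 # (i)   expected number of reinsurer claims
mean_re <- n_re * m1                    # (iii) expected aggregate cost to the reinsurer
mean_di <- lambda * theta - mean_re     # (iv)  direct insurer, using E(X) = theta
# (c) V(S) = E(N) V(Y) + V(N) E(Y)^2 with E(N) = V(N) = lambda S(d)
var_re  <- n_re * (m2 - m1^2) + n_re * m1^2
c(n_re = n_re, mean_re = mean_re, mean_di = mean_di, var_re = var_re)
```

For the exponential the memoryless property gives m1 = theta and m2 = 2 theta^2, which is a convenient check on the numerical integrals.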

d) the expected cost to the re-insurer is (1-alpha) lambda E(X).
Equate this to the expected cost from (iii) and solve for alpha

e) The re-insurer's liability is (1-alpha) S, with variance
(1-alpha)^2 V(S), where V(S) follows from the general formula in (c)
applied to the full claims (N Poisson(lambda))

f)  the means are the same by setup.  The direct insurer will
prefer the situation with lower variance to them, which equates
to lower risk.

15.

a) standard.  

\int_lambda F(y|lambda) pi(lambda) d lambda

b) standard.  Evaluate P_N (P_X(z)) where N~N.B.(r,beta)
and X has a Bernoulli distribution with success probability
S(100).

c)
P(Y^P > y) = P(Y-d > y | Y>d) = P(Y>y+d)/P(Y>d) = S(y+d)/S(d)

d)
Calling the discretized variable K, let
f_0 = P[K=0] = P(Y^P <= delta/2)
f_1 = P[K=1] = P(delta/2 < Y^P <= 3 delta/2)
f_2 = P[K=2] = P(3 delta/2 < Y^P <= 5 delta/2)
...

where delta=50 in the example.

e)
from (b), the number of payments is negative binomial, which is a
member of the (a,b,0) class, so a and b can be found if needed.

More generally, g_0 = P_N(f_0) = P(S=0)
then
P(S=1) = (a+b/1)f_1 P(S=0) / (1-a f_0)
P(S=2) = ((a+ 1b/2)f_1 P(S=1)+(a+2b/2) f_2 P(S=0) )/(1-a f_0)

and so on
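
The recursion above can be sketched in R as follows. The function name panjer_ab0, the parameter values r and beta, and the discretized probabilities f are all hypothetical, chosen only to make the example run.

```r
# Panjer recursion for an (a,b,0) frequency; f[j + 1] holds f_j = P(K = j)
panjer_ab0 <- function(a, b, g0, f, kmax) {
  g <- numeric(kmax + 1)
  g[1] <- g0                                   # g_0 = P(S = 0) = P_N(f_0)
  for (k in 1:kmax) {
    j <- 1:min(k, length(f) - 1)
    g[k + 1] <- sum((a + b * j / k) * f[j + 1] * g[k - j + 1]) / (1 - a * f[1])
  }
  g
}

r <- 2; beta <- 0.5                            # hypothetical negative binomial parameters
f <- c(0.2, 0.5, 0.3)                          # hypothetical f_0, f_1, f_2
a <- beta / (1 + beta)                         # (a,b,0) constants for N.B.(r, beta)
b <- (r - 1) * beta / (1 + beta)
g0 <- (1 - beta * (f[1] - 1))^(-r)             # negative binomial pgf evaluated at f_0
g <- panjer_ab0(a, b, g0, f, kmax = 60)
g[1:3]                                         # P(S=0), P(S=1), P(S=2)
```

Summing g over a long enough range should return a value very close to 1, which is a quick sanity check on a, b and g_0.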


#Q1:

a)  standard
b)
x=c(200,1000,5000,10000,100000)
observed=c(2400,3400,2400,1500,300); n=sum(observed)
xbar=sum((0:4)*observed)/n #MLE of the Poisson mean
probs=dpois(0:3, xbar); probs=c(probs,1-sum(probs)); probs
expected=probs*n
chi2comp=((observed-expected)^2/expected); chi2comp
1-pchisq(sum(chi2comp), length(chi2comp)-1-1) #subtract 1 df for parameter estimated
#extra grouping is unnecessary (no small expected counts); shown for comparison
probs2=dpois(0:2, xbar); probs2=c(probs2,1-sum(probs2))
expected2=probs2*sum(observed)
observed2=c(observed[1:3])
observed2=c(observed2,n-sum(observed2))
chi2comp2=((observed2-expected2)^2/expected2); chi2comp2
1-pchisq(sum(chi2comp2), length(chi2comp2)-1-1) #subtract 1 df for parameter estimated

c)
Fn=((1:5)/5)
Fnminus=((0:4)/5)
Fx=1-(75000/(75000+x))^4
KSobs=max(c(abs(Fn-Fx),abs(Fnminus-Fx))); KSobs
#compare to given percentiles

#Q11.
#part a is same as Q1a)
#b)
x=c(0,1,2,3,4)
obs=c(1625,307,58,9,1)
pihat=1/(1+(sum(x*obs)/sum(obs))) #mle for negative binomial probability (size r=1, i.e. geometric)
phat=dnbinom(x[1:4],1,pihat) #probabilities of 0,1,2,3 claims
expected=sum(obs)*phat
expected=c(expected,sum(obs)-sum(expected)); expected
#need to pool last cell
obs2=c(1625,307,58,10)
expected2=expected[1:3];expected2=c(expected2,sum(obs2)-sum(expected2))
chi2nb=sum((obs2-expected2)^2/expected2); chi2nb

Q6.
a) immediate from form of
E(X^M) on formula sheet
b) L is prod_{i=1}^97 f(x_i) S(m)^3
c) and d) are standard


Q8.

a) f(x)=d/dx F(x) = 2exp(-2x)(x^2+x+1)-exp(-2x)(2x+1)
    S(x) = 1-F(x) = exp(-2x)(x^2+x+1)
    hazard function is h(x)=f(x)/S(x) =  2-(2x+1)/(x^2+x+1) = (2x^2+1)/(x^2+x+1)
b)  h(x) is increasing in x for x > (sqrt(3)-1)/2 (approx 0.37), so certainly for x>1, so right tail is lighter than exponential
c) Mean excess loss is
e(d)=E(X-d | X>d) = \int_{x=d}^\infty S(x) dx / S(d).  Need to use integration by parts.
Will find e(d)= (.5 d^2+ d+1 )/(d^2 + d +1)
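
A quick numerical check of the closed form, as a sketch in R. The helper names e_numeric and e_closed are introduced here for illustration; the integral is written in terms of S(d+u)/S(d) so that the integrand stays of order 1.

```r
S <- function(x) exp(-2 * x) * (x^2 + x + 1)            # survival function from (a)

# e(d) = int_0^Inf S(d + u) / S(d) du, evaluated numerically
e_numeric <- function(d) integrate(function(u) S(d + u) / S(d), 0, Inf)$value
e_closed  <- function(d) (0.5 * d^2 + d + 1) / (d^2 + d + 1)

d <- c(0, 0.5, 1, 5, 20)
cbind(d, numeric = sapply(d, e_numeric), closed = e_closed(d))
```

The two columns should agree to within the tolerance of integrate(), and both drift towards .5 as d grows.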

d) lim_{d \rightarrow \infty} e(d) = .5, and lim_{d \rightarrow \infty} e'(d) = 0.

From notes, example 5.5 "The important point to note from this example is that for the GPD distribution, the MEL function is a straight line. If ξ ∈ (0,1), it has slope ξ/(1 −ξ), and if ξ = 0, the MEL function is flat."

Have seen that MEL tends to a constant .5, so ξ = 0.


Q10.
a) $l(\lambda) = \log L(\lambda) = 2n \log(\lambda) + \sum_i \log(X_i) - \lambda \sum_i X_i$

differentiate and solve for lambda: 2n/\lambda - \sum_i X_i = 0, so \hat \lambda = 2/\bar X

c) find  -E(2nd derivative of l(lambda))

d) standard

b)   M_{\sum X_i}(t) = (1-t/\lambda)^{-2n}
     M_{2n \lambda \bar X}(t) = M_{\sum X_i} ( 2 \lambda t) = (1-2t)^{-2n}

      is the mgf of $\chi_{4n}^2$.

  leads to CI      (qchisq(alpha/2,4n)/(2n \bar X),      qchisq(1-alpha/2,4n)/(2n \bar X))
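
A sketch of the interval in R on simulated data. The sample size, the true rate and the seed are hypothetical; the gamma(2, lambda) model is the one implied by the log-likelihood in (a).

```r
set.seed(1)
n <- 50; lambda_true <- 0.4                   # hypothetical values for illustration
x <- rgamma(n, shape = 2, rate = lambda_true) # density lambda^2 x exp(-lambda x)

alpha <- 0.05
lambda_hat <- 2 / mean(x)                     # MLE from (a)
# 2 n lambda Xbar ~ chi^2_{4n}, so invert the pivot for lambda
ci <- qchisq(c(alpha / 2, 1 - alpha / 2), df = 4 * n) / (2 * n * mean(x))
c(lambda_hat = lambda_hat, lower = ci[1], upper = ci[2])
```

Because the pivot is exact, this interval does not rely on a large-sample approximation.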


Q12.
a) plot data, examine empirical distribution, carry out hypothesis tests,  evaluate model selection criteria
b)
  they are standard errors, i.e. estimates of the variability of the MLE, calculated using the inverse of the Fisher Information, with parameters replaced by their MLEs
 
 c) Test statistic for H_0: \lambda = 1

   is T=(\hat \lambda - 1)/se(\hat \lambda)

   p-value is 2 P(Z>|T|)

   BIC=-2 log L + p log(n)
   SBC= log L - (p/2) log(n)

where p is number of model parameters, and n is sample size.

e) could use the test above (called Wald's test), the likelihood ratio test, BIC, or AIC

 for Likelihood ratio test, p-value is  P(chi^2_1 > -2 log LR)
  where log LR is log(L0/L1) = log L0 - log L1

Q13.  # claims is binomial with parameters n and
   p=P(T > t_0) = e^{-beta t_0}

    b) MLE of p is M/n

     then \hat \beta = -log(M/n)/t_0

    c) get the CI for p, then transform  to a CI for beta

     d) average time to claim is 1/\beta, estimated as 1/\hat \beta

    e) use delta method

    
Q17)
   f(x)=(1/theta)exp(-x/theta)
   S(x) = exp(-x/theta)
    P(X>d) = exp(-d/theta)
   P(X>x | X > d) = exp(-(x-d)/theta)
  
  L=f(120)f(640)f(700)f(700)
       exp(-(820-100)/theta)
        exp(-(1220-100)/theta)
          exp(-1500/theta)

b) standard
c)  CR lower bound is 1/(-E(d^2 log L/d theta^2)), the inverse of the Fisher information; it applies also to this case with independent but non-identically distributed rv's.

d) and e)

   E(Y) = E(X^1000) = 

    evaluate and use delta method.

Q18)
a)
BIC=-2 log L + p log(n)
SBC=logL-(p/2)log(n)
b)LRT statistic is

  -2 log (L0/L1) = -2 (-427.8+426.2) = 3.2

 c)p-value = P(chi_1^2 > 3.2) > .05
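
The p-value in (c) can be computed directly in R; this short sketch uses only the two log-likelihood values quoted in (b).

```r
logL0 <- -427.8; logL1 <- -426.2               # log-likelihoods quoted in (b)
lrt <- -2 * (logL0 - logL1)                    # likelihood ratio statistic
p_value <- pchisq(lrt, df = 1, lower.tail = FALSE)
c(lrt = lrt, p_value = p_value)
```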

d)e) read solutions

16.
a) f_S(0) = P_N(f_X(0)) = P_N (.3) = exp(2(.3-1))= exp(-1.4)
   f_S(1) = (2/1) 1 f_X(1) f_S(0) = 0 as f_X(1)=0

Q19.
a) b) standard

   write down likelihood, the estimates of mu and sigma^2 are same as for normal(mu,sigma^2), but with X_i replaced by log(X_i)

c) standard, using Fisher information matrix
d)  standard use of delta method with  g(mu,sigma^2) =   exp(mu+sigma^2/2)

Q25)
f(x) = alpha 100^alpha/(100+x)^(alpha+1)
 a)b)c) solve for MLE, large sample variance and associated CI in usual way.
  d)  from formula sheet, mean of X is g(alpha) = 100/(alpha-1)

   use delta method with the above g() and usual large sample CI.
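
A sketch of (a) to (d) in R on simulated data. The sample size, the true alpha and the seed are hypothetical; the MLE and its variance follow from the log-likelihood n log(alpha) + n alpha log(100) - (alpha+1) sum log(100+x_i).

```r
set.seed(2)
n <- 200; alpha_true <- 3                       # hypothetical values for illustration
u <- runif(n)
x <- 100 * (u^(-1 / alpha_true) - 1)            # inverse-cdf draws from the Pareto above

alpha_hat <- n / sum(log(1 + x / 100))          # solves n/alpha - sum log(1 + x/100) = 0
se_alpha  <- alpha_hat / sqrt(n)                # Fisher information is n / alpha^2
ci_alpha  <- alpha_hat + c(-1, 1) * qnorm(0.975) * se_alpha

g       <- function(a) 100 / (a - 1)            # mean of X, needs alpha > 1
se_mean <- 100 / (alpha_hat - 1)^2 * se_alpha   # delta method: |g'(alpha)| * se
ci_mean <- g(alpha_hat) + c(-1, 1) * qnorm(0.975) * se_mean
list(alpha_hat = alpha_hat, ci_alpha = ci_alpha, mean_hat = g(alpha_hat), ci_mean = ci_mean)
```

The delta-method interval is only sensible when \hat \alpha is comfortably above 1, since g() blows up there.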


Q16)
a) suppose payments are in units of 1$ (rather than units of 2$).
Let T be the aggregate payment.  Then

P(T=k)=\sum_{n=0}^\infty P(N=n) f^{*n} (k)

where P(N=n) is the Poisson probability, and f^{*n}(k) is the binomial probability
  of k successes in n trials, so the lower limit of the sum can be taken as n=k (f^{*n}(k)=0 for n<k).

Then write P(T=k) = P(S=2k) where S is the payment in $, taking into account
units are 2$

b)
This is an application of Theorem 9.7.  The distribution of total
claims is compound Poisson with Poisson parameter 
\lambda = \lambda_1 + \lambda_2 = (2+1)=3 and severity distribution
with probability .3(2/3)+.6*(1/3) at 0, probability .7(2/3) at 2, and
probability .4(1/3) at 4.

c)  looking for P(S=4). Payments are in multiples of 2$. Notationally
it's easier to work in units of 1$.  In the new units we are looking
for P(S=2) given the severity distribution of 1.2/3 at 0, 1.4/3 at 1
and .4/3 at 2.

f_S(0) = exp(3(1.2/3-1)) = exp(-1.8)

then

f_S(1) = (3/1) 1 (1.4/3) exp(-1.8)

and

f_S(2) = (3/2) 1 (1.4/3) f_S(1) + (3/2) 2 (.4/3) f_S(0)


d)  Working in the rescaled units from (c), we are

looking for E[(S-2)_+] = E[S] -  E[min(S,2)]

where E[min(S,2)] = 0 P(S=0)+1P(S=1)+2P(S>=2)

Converting back to the original units of 2$, this expectation is multiplied by 2.
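
The calculations in (c) and (d) can be reproduced with a few lines of R. This sketch uses only the Poisson parameter and severity probabilities stated above, in the rescaled units; the variable names are introduced here for illustration.

```r
lambda <- 3
f <- c(1.2, 1.4, 0.4) / 3                       # severity probabilities at 0, 1, 2

fS <- numeric(3)
fS[1] <- exp(lambda * (f[1] - 1))                                # f_S(0)
fS[2] <- (lambda / 1) * 1 * f[2] * fS[1]                         # f_S(1)
fS[3] <- (lambda / 2) * (1 * f[2] * fS[2] + 2 * f[3] * fS[1])    # f_S(2)

ES   <- lambda * sum((0:2) * f)                 # E[S] = lambda E[X]
Emin <- 1 * fS[2] + 2 * (1 - fS[1] - fS[2])     # E[min(S,2)], using P(S>=2)
stop_loss <- ES - Emin                          # E[(S-2)_+] in rescaled units
c(fS2 = fS[3], stop_loss = stop_loss, stop_loss_dollars = 2 * stop_loss)
```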


24) a) solve \bar X = \alpha \beta ; S^2 = \alpha \beta^2, giving
       \hat \beta = S^2/\bar X, then \hat \alpha = \bar X/\hat \beta
    b) standard, but need access to gamma function unless \hat \alpha is integer valued
    c) need access to one dimensional optimizer, otherwise standard
    d) standard

23)  This is empirical Bayes credibility estimation

mu=E(E(X|\Theta)); v= E(V(X|\Theta)); a= V(E(X|\Theta))

     V(X)=E(V(X|Theta))+V(E(X|Theta)) = a+v

b)   Z = T/(T+k) with k=v/a
c)  for Poisson dist'n, E(X|Lambda) = V(X|Lambda) = Lambda
       i) mu=E(X)=E(Lambda)
       ii)V(X)=E(Lambda)+V(Lambda)
       iii) a=V(X)-v= V(X) - E(Lambda) = V(X)-E(X)
d)  calculate \hat \lambda = \bar X
    then \hat \mu = \hat v = \bar X
    then \hat a from iii, replacing V(X) and E(X) with the sample variance and mean
e)  \hat k = \hat v / \hat a, then Z=1/(1+\hat k), then
     credibility premium is X Z + (1-Z) \bar X, where X=0,1,2,3 or 4

f) only need one observation per policy holder because
      Poisson assumption equates conditional mean and variance, so reduces
      number of parameters by 1
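
A sketch of steps (d) and (e) in R. The counts below are borrowed from Q11 purely as an illustration; they are an assumption and may not be the data for this question.

```r
x   <- 0:4
cnt <- c(1625, 307, 58, 9, 1)                   # assumed numbers of policies with x claims
n   <- sum(cnt)

xbar <- sum(x * cnt) / n                        # sample mean
s2   <- sum(cnt * (x - xbar)^2) / (n - 1)       # sample variance

mu_hat <- xbar; v_hat <- xbar                   # Poisson: conditional mean = variance
a_hat  <- s2 - xbar                             # from (c)(iii)
k_hat  <- v_hat / a_hat
Z      <- 1 / (1 + k_hat)                       # one observation per policyholder
premium <- Z * x + (1 - Z) * mu_hat             # credibility premium for X = 0,...,4
cbind(x, premium)
```

If the sample variance were below the sample mean, a_hat would be negative and the usual convention is to set Z to 0.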


21)
a)  E(N|theta) = 1, when theta=1;  2 when theta=2; and 5 when theta=5
Hence, for a single individual, E(N) = E(E(N|theta)) = .6(1)+.3(2)+.1(5) = 1.7
   for 20 individuals, multiply this by 20

b)  S=\sum_{i=1}^N X
    E(S|N) = N E(X)
    V[E[S|N]] = V[N] E[X]^2 = V[N]   (as E[X]=1; V[N] is evaluated below)
    
    V(S|N) = N V(X) = 4N
    E(S) = E[E(S|N)] = E[N] = 1.7
    V(S)= V[E(S|N)] + E[V(S|N)] = V[N]+4E[N]

     We know E[N|theta] = 1,  w.p. .6
                        = 2,  w.p. .3
                        = 5,  w.p. .1

     also V[N | theta] = 1, wp .6
                       = 2, wp .3
		       = 5, wp .1

     E[N]=E[E[N|theta]] = 1.7

        so V[N] = E[V[N|theta]] + V[E[N|theta]] = (.6+.6+.5) + .6(1-1.7)^2 + .3 (2-1.7)^2 + .1 (5-1.7)^2
	                = 1.7 + 1.41 = 3.11

     Then V[S] = V[N] (1) + E[N] 4 = 3.11+4(1.7) = 9.91

      so standard deviation of S is 3.148
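
The variance decomposition in (b) can be checked numerically. This R sketch uses the prior probabilities stated above together with E(X)=1 and V(X)=4, as used in the working.

```r
theta <- c(1, 2, 5); w <- c(0.6, 0.3, 0.1)      # values of theta and their probabilities
EN <- sum(w * theta)                            # E[N] = E[E(N|theta)]
VN <- sum(w * theta) + sum(w * (theta - EN)^2)  # E[V(N|theta)] + V[E(N|theta)]

EX <- 1; VX <- 4                                # severity mean and variance
VS <- VN * EX^2 + EN * VX                       # V[S] = V[N] E[X]^2 + E[N] V[X]
c(EN = EN, VN = VN, VS = VS, sdS = sqrt(VS))
```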


c)  Number of claims in 3 years is Poisson(3 theta) where theta is 1 wp .6, 2 wp .3, 5 wp .1

    pi(theta| X) \prop L(theta) pi(theta)

       
    here X=3

    and pi(1|3) \prop .6 3^3 exp(-3)/3!
        pi(2|3) \prop .3 6^3 exp(-6)/3!
        pi(5|3) \prop .1 15^3 exp(-15)/3!

     normalize to get posterior distribution

d)  Prior mean of N is 1.7.  Posterior mean of N is the mean calculated by replacing the prior distribution
of theta with this posterior distribution.
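
Parts (c) and (d) can be sketched in R as below, using the prior and the observation X=3 stated above; the variable names are introduced here for illustration.

```r
theta <- c(1, 2, 5); prior <- c(0.6, 0.3, 0.1)
lik  <- dpois(3, lambda = 3 * theta)            # P(X = 3 | theta), X ~ Poisson(3 theta)
post <- prior * lik / sum(prior * lik)          # normalized posterior
post_mean <- sum(post * theta)                  # posterior mean of annual claim frequency
rbind(theta, prior, post)
post_mean
```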


e)  focus is on Buhlmann estimate of claim frequency.
    i) Following Poisson assumption,
    mu(theta) is conditional mean of claim frequency given theta, equal to theta
    and v(theta) is conditional variance, equal to theta.
    ii) then mu=E(theta); v=E(theta); a=V(theta)
        so mu=v=1.7, and a= 1.41

f)  k=v/a = 1.206;  Z=3/(3+k) = .713	Note n=3 because 3 years of data.
    Xbar = sum_{i=1}^n X_i /n = 3/3 = 1
    credibility estimate is Z Xbar + (1-Z) mu = .713(1)+.287(1.7)
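
The Buhlmann quantities in (e) and (f) can be reproduced in R from the prior alone; a minimal sketch, with variable names chosen here for illustration.

```r
theta <- c(1, 2, 5); prior <- c(0.6, 0.3, 0.1)
mu <- sum(prior * theta)                        # mu = E(theta)
v  <- mu                                        # v = E(theta) under the Poisson assumption
a  <- sum(prior * (theta - mu)^2)               # a = V(theta)

n <- 3; xbar <- 3 / 3                           # 3 claims observed over 3 years
k <- v / a
Z <- n / (n + k)
cred <- Z * xbar + (1 - Z) * mu                 # Buhlmann credibility estimate
c(k = k, Z = Z, cred = cred)
```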

g)  analysis should be based on posterior probability


Q4.  Use units of $1000 for convenience
    
a)  the rate of type S claims is .5(.04)
    the rate of type NSS claims is .3(.1)
    the rate of type R claims is .2(.25)

    if a claim is chosen at random, the probability
    that it is type S is .02/(.02+.03+.05)=.2
    and P(NSS)=.3
    and P(R)=.5

    v =E(V(X|Theta))=.2(3)+.3(4)+.5(5)=4.3  is the expectation of the
    claim variance using the posterior distribution.  Note that v=mu
    when using units of $1000

b)  expectation of the square of the conditional mean:
    E(X|Theta)^2 = 3^2,4^2,5^2 with posterior probability .2,.3,.5
     so that E(E(X|Theta)^2)=3^2(.2)+4^2(.3)+5^2(.5)=19.1
     and a=V(E(X|Theta))=19.1-4.3^2 = .61

c)  k=v/a = 4.3/.61
    Z=(n/(n+k)) = .124 with n=1 for a single claim

    driver has one claim of 20 over several years
    Buhlmann credibility estimate is given as

    .124 (20) + .876 (4.3)

    there is something subtle here. The claim of 20 is large, but
    not so severe if spread over a large number of years.

    The subtlety is why we use 20000 without
    considering the number of years insured.  If we took into account
    that the individual was insured for n years with only a single
    claim, then Z->1 as n->infty, with 20000 replaced by 20000/n
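
The numbers in (a) to (c) can be reproduced with the following R sketch (units of $1000). It uses only the rates and conditional means quoted above; the variable names are introduced here for illustration.

```r
rate <- c(S = 0.5 * 0.04, NSS = 0.3 * 0.10, R = 0.2 * 0.25)   # claim rates by type
p <- rate / sum(rate)                           # P(type | a claim has occurred)
m <- c(3, 4, 5)                                 # conditional means (= variances) by type

v <- sum(p * m)                                 # v = mu
a <- sum(p * m^2) - v^2                         # a = V(E(X|Theta))
k <- v / a
Z <- 1 / (1 + k)                                # n = 1 for a single claim
cred <- Z * 20 + (1 - Z) * v                    # Buhlmann estimate after a claim of 20
c(v = v, a = a, k = k, Z = Z, cred = cred)
```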

d)  standard