Answers to the exam 9/6–10
The exam 10/6–09, with short answers.
The March exam is ALREADY corrected!
I will hand in the minutes and the exams tomorrow, Thursday 2/3. The results were good: 88% passed, 15% A's and 26% B's.
Answers to the exam
I'm sorry that this comes a bit late.
Kristoffer Högberg's notes
This is a self-extracting archive. Download it, put it in a suitable folder and click (or double-click) the file; all the notes will then be unpacked as pdf files (and a small text file).
First I talked about omitted and irrelevant variables; sections 5.8 and 5.9 in Hansen. I also covered the situation when X1 and X2 are uncorrelated. Then I talked about model selection, and described BIC (section 5.10).
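As a small illustration of the omitted-variables point (my own sketch, not from Hansen or the lecture): omitting a regressor biases the remaining coefficient only when the two regressors are correlated.

```python
import numpy as np

# Hypothetical simulation: true model y = x1 + x2 + e.
rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)   # x2 correlated with x1
y = x1 + x2 + rng.normal(size=n)

# Short regression omitting x2: the slope converges to 1 + 0.8*1 = 1.8.
b_short = (x1 @ y) / (x1 @ x1)

# If instead x2 is uncorrelated with x1, omitting it leaves the slope unbiased.
x2u = rng.normal(size=n)
yu = x1 + x2u + rng.normal(size=n)
b_short_u = (x1 @ yu) / (x1 @ x1)
```

The first estimate lands near 1.8 (the true coefficient plus the bias term), the second near the true value 1.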
Finally, I talked about the Tobit model (censored data, section 12.3) and read aloud some portions from Kennedy's book.
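A minimal sketch of mine (not from Hansen or Kennedy) of why censored data calls for something like the Tobit model: OLS on a censored outcome is biased toward zero.

```python
import numpy as np

# Simulate a latent outcome and censor it below at zero.
rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(size=n)
ystar = x + rng.normal(size=n)      # latent outcome, true slope 1
y = np.maximum(ystar, 0.0)          # we only observe the censored version

X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
# b_ols[1] comes out near 0.5 here, far below the true slope of 1;
# the Tobit MLE models the censoring explicitly and recovers the slope.
```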
I made a (final?) update of my comments. I added a section on "Self Selection Bias". The latest version is thus dated 24/2–09.
Last Friday I talked about Non-Linear Least Squares (NLLS); Ch. 5.4 in Hansen. Today we talked about Stefan Lundgren's and my article on estimating the demand elasticity for telephone calling time, as a case study. I also started to talk a little about model selection: Ch. 5.8 in Hansen; I will continue with that on Wednesday.
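A numpy sketch of NLLS (my own illustration; Hansen's Ch. 5.4 gives the general theory), fitting y = a·exp(b·x) by Gauss-Newton with step halving so the sum of squares never increases.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 2, size=500)
y = 2.0 * np.exp(0.5 * x) + 0.1 * rng.normal(size=500)

def sse(th):
    return np.sum((y - th[0] * np.exp(th[1] * x)) ** 2)

theta = np.array([1.0, 0.0])        # starting values for (a, b)
for _ in range(100):
    a, b = theta
    fx = a * np.exp(b * x)
    # Jacobian of the regression function in (a, b)
    J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
    step = np.linalg.lstsq(J, y - fx, rcond=None)[0]
    t = 1.0
    while sse(theta + t * step) > sse(theta) and t > 1e-8:
        t /= 2                       # step halving keeps SSE decreasing
    theta = theta + t * step
```

The estimates converge close to the true values (2, 0.5).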
Now I have updated my comments again, hopefully for the last time. I have added a section about model selection (Ch. 5.8, 5.9) and revised the section about Self Selection Bias, which was a bit confusing (to put it mildly).
It is about time you register for the exam! You do that on "Mina sidor".
I solved problem 8, Ch. 5.12 in Hansen. Then I talked about "logit" regression. Hansen has a very brief description of this in Ch. 12.1. I think it is at the very core of econometrics, so I put much more emphasis on it. There is a description in my comments (updated 15/2).
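A minimal logit estimator (my own sketch; Hansen's Ch. 12.1 gives the model, the algorithm is the standard Newton-Raphson iteration on the log-likelihood).

```python
import numpy as np

# Simulate a binary outcome from a logit model.
rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([-0.5, 1.0])
p = 1 / (1 + np.exp(-X @ beta_true))
yb = (rng.uniform(size=n) < p).astype(float)

beta = np.zeros(2)
for _ in range(25):                   # Newton steps
    p_hat = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (yb - p_hat)         # score vector
    W = p_hat * (1 - p_hat)
    H = X.T @ (X * W[:, None])        # negative Hessian (information)
    beta = beta + np.linalg.solve(H, grad)
```

The MLE lands close to the true coefficients (-0.5, 1.0).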
The lecture on March 4 is canceled
I'm going to I:s conference instead. We have plenty of time for this course, so no harm done.
I have had some health problems, so I didn't feel that fit today. Anyway, I talked about Least Absolute Deviations and Quantile Regression (ch. 5.5, 5.6). I pointed out that besides the fact that it might be desirable to estimate the median, or some other quantile, rather than the mean (as in OLS), LAD and QR have some good features:
I recommend bootstrap in these cases for hypothesis testing and calculating confidence intervals, rather than the awkward method described in Hansen; see my comments.
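One of LAD's good features, robustness to outliers, can be seen in a small simulation of mine (not from Hansen); here LAD is computed by iteratively reweighted least squares, which is only one of several ways to compute it.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[: n // 10] += 30.0                 # contaminate 10% of the observations

X = np.column_stack([np.ones(n), x])
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # intercept pulled up to ~4

b_lad = b_ols.copy()
for _ in range(100):                 # IRLS: weight each residual by 1/|r|
    r = y - X @ b_lad
    w = 1.0 / np.maximum(np.abs(r), 1e-6)
    Xw = X * w[:, None]
    b_lad = np.linalg.solve(X.T @ Xw, Xw.T @ y)
# The LAD intercept stays near the conditional median, close to 1.
```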
I then mentioned the lemma in my comments (currently on p.2)—please read the proof there— and derived the expressions for Generalised Least Squares (ch. 5.1 in Hansen; see also my comments).
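A sketch of mine of the GLS expression in the simplest case: with Ω diagonal and known, GLS is just weighted least squares with weights 1/σᵢ².

```python
import numpy as np

# Heteroskedastic errors with a *known* variance function.
rng = np.random.default_rng(5)
n = 10_000
x = rng.normal(size=n)
sigma = np.exp(0.5 * x)                     # known error std dev
y = 1.0 + 2.0 * x + sigma * rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
w = 1.0 / sigma**2                          # diagonal of Omega^{-1}
Xw = X * w[:, None]
b_gls = np.linalg.solve(X.T @ Xw, Xw.T @ y)  # (X'Ω⁻¹X)⁻¹ X'Ω⁻¹ y
```

The estimate is close to the true (1, 2), and by the Gauss-Markov argument it is more efficient than OLS here.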
Updated my comments with a few more lines on "NLLS with Instrumental Variables".
I have now finished bootstrap. I advised a method to estimate a confidence region and do hypothesis testing in the setting of "Percentile Intervals" when the parameter under study is multi-dimensional. I have updated my comments accordingly.
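For the one-dimensional case, a percentile interval looks as follows (my own sketch; the multi-dimensional construction in my comments generalizes this).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
b_hat = np.linalg.lstsq(X, y, rcond=None)[0]

B = 2_000
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)        # resample (x_i, y_i) pairs
    boot[b] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0][1]
lo, hi = np.percentile(boot, [2.5, 97.5])   # 95% percentile interval for the slope
```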
NOTE: we agreed that the assignments should be presented on Monday, March 2nd.
I have now finished Instrumental Variable Estimation (IV, 2SLS). We did some exercises in Hansen, notably 9.8:3, 9.8:6a,b, 4.18:1, 4.18:3, 4.18:7 (I said 4:18:3, then "corrected" to 4.18:5, but the exercise I did was in fact 4.18:7.)
I then talked about bootstrap and covered 6.1–6.5. I will continue with bootstrap on Friday.
One and a half credits are given for the compulsory assignment. You can do it either alone or in collaboration with one other student, but you may not be more than two! You will present it at a later lecture; we will agree on when.
The task is as follows: Here are data on wages and personal characteristics for 2215 persons in the USA in 1976, downloaded from Bruce Hansen's web page, and here is a description of these data. You should do the following:
Last Wednesday and Friday I talked about the instrumental variable method (IV) and two stage least squares (2SLS); Chapter 9 in Hansen. Note that the IV and 2SLS estimators are consistent but biased; they are only asymptotically unbiased. (The bias can be estimated by bootstrap methods; I will talk about this later.) We also talked about identification and derived the rank condition for identification. When there is only one endogenous regressor, a test for identification is as follows:
Regress the endogenous regressor on all exogenous variables (exogenous regressors plus all instruments), then test whether the coefficients on all the instruments are zero (a Wald test, if there is more than one instrument). If this null cannot be rejected, there is a problem with identification. I have now included this test in my comments.
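A sketch of mine of both points: a simple IV/2SLS estimate, and the first-stage Wald test for identification with one endogenous regressor.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
u = rng.normal(size=n)                       # unobserved confounder
z1 = rng.normal(size=n)
z2 = rng.normal(size=n)                      # two instruments
x = 0.5 * z1 + 0.3 * z2 + u + rng.normal(size=n)   # endogenous regressor
y = 2.0 * x + u + rng.normal(size=n)         # structural slope is 2

b_ols = (x @ y) / (x @ x)                    # inconsistent: picks up Cov(x, u)
Zi = np.column_stack([z1, z2])
xhat = Zi @ np.linalg.lstsq(Zi, x, rcond=None)[0]  # first-stage fitted values
b_2sls = (xhat @ y) / (xhat @ x)             # consistent

# First stage: regress x on a constant and the instruments, then Wald-test
# whether the instrument coefficients are jointly zero.
Z = np.column_stack([np.ones(n), z1, z2])
g = np.linalg.lstsq(Z, x, rcond=None)[0]
r = x - Z @ g
s2 = r @ r / (n - 3)
V = s2 * np.linalg.inv(Z.T @ Z)
R = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # picks out the instruments
Rg = R @ g
W = Rg @ np.linalg.solve(R @ V @ R.T, Rg)
# W is chi-square(2) under the null; here it is enormous, so we clearly
# reject the null and identification is not a problem.
```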
Last Friday I talked about prediction. There is a very short section about this in Hansen ch. 5.3. I described the trick with observation-specific dummies as described in exercise 33. I also talked about outliers, which are closely related to the influential observations Hansen briefly mentions in ch. 3.12. I described a method to identify outliers (and influential observations) using observation-specific dummies. I also described "cross validation" as a means of model selection, for your general knowledge. We also looked at exercises 2–6 of ch. 3.14. (We will eventually also solve 6, 9 and 13.)
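The dummy trick for outlier detection can be sketched as follows (my own illustration): the coefficient on a dummy for a suspect observation estimates its prediction error, and its t-ratio tests whether it is an outlier.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[0] += 10.0                                  # plant an outlier

d = np.zeros(n)
d[0] = 1.0                                    # dummy for observation 0
X = np.column_stack([np.ones(n), x, d])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e_hat = y - X @ b
s2 = e_hat @ e_hat / (n - 3)
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[2, 2])
t = b[2] / se        # a large |t| flags observation 0 as an outlier
```

Here b[2] estimates the planted shift of 10, and t is far beyond any conventional critical value.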
Yesterday I talked about the Instrumental Variable Method. First we identified the problem of "endogeneity"; I classified three types:
and I gave examples of these, many of which appear in the exercises. I started to describe the method of instrumental variable estimation, but will do this in much more detail tomorrow.
This Monday I talked about "residual regression", i.e., the Frisch-Waugh theorem; ch. 3.7 in Hansen, and I gave a proof based on the "normal equations" X'ê = 0. I also talked about "goodness of fit": R² and adjusted R² (last part of ch. 3.3) and solved problem 3.14:6.
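A quick numerical check of the Frisch-Waugh theorem (my own sketch): the coefficient on x2 in the full regression equals the slope from regressing the residuals of y on the residuals of x2, after both are purged of x1.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.lstsq(X, y, rcond=None)[0]

X1 = np.column_stack([np.ones(n), x1])
ry = y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]     # y purged of x1
rx2 = x2 - X1 @ np.linalg.lstsq(X1, x2, rcond=None)[0]  # x2 purged of x1
b_resid = (rx2 @ ry) / (rx2 @ rx2)
# b_resid equals b_full[2] up to floating-point error
```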
Yesterday I talked about confidence intervals and hypothesis testing; ch. 4.7–4.9. In particular, I made a very thorough derivation of the Wald test, and I solved exercises 15 and 23.
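The Wald test of a single linear restriction Rβ = r can be sketched as follows (my own illustration, here testing β₁ + β₂ = r).

```python
import numpy as np

rng = np.random.default_rng(10)
n = 10_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)   # so beta1 + beta2 = 3

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e_hat = y - X @ b
s2 = e_hat @ e_hat / (n - 3)
V = s2 * np.linalg.inv(X.T @ X)
R = np.array([0.0, 1.0, 1.0])

W_true = (R @ b - 3.0) ** 2 / (R @ V @ R)    # true restriction: W is small
W_false = (R @ b - 2.5) ** 2 / (R @ V @ R)   # false restriction: W is large
# Compare W to the chi-square(1) critical value 3.84 at the 5% level.
```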
Wednesday and Friday. I wrote down in more rigour the assumptions we need for the classical regression model. They are in essence (2.8) and (2.9) in Hansen, plus a little more: that the observations are essentially independent. In matrix form:
Note that the last assumption is more general than the one employed by Hansen! Indeed, we do not adopt Assumption 3.1.1 in Hansen! This renders the statement "uᵢ = xᵢeᵢ which is iid ..." on page 34 untrue: they are not iid, so the proof of Theorem 4.3.1 is somewhat more complicated. We just accept this theorem at face value. The OLS estimate is determined by the normal equations X'ê = 0, which is the Method of Moments Estimator (MME) corresponding to the relation E[xe] = 0.
We have previously seen that the assumption E[e|X] = 0 may be violated in many contexts where the equation has a structural interpretation. In some cases one or more relevant variables have been left out; in other situations one might use a different estimator than OLS. More on this later in the course.
We have also seen previously that we get a more efficient estimate if the model is (nearly) homoskedastic, so we should try to formulate the model with this in mind, but in any case the covariance matrix of the estimated coefficients should always be estimated by White's method.
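White's estimator next to the classical one can be sketched as follows (my own illustration): with error variance rising in |x|, the robust standard error for the slope is clearly larger than the classical one.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + np.abs(x) * rng.normal(size=n)  # heteroskedastic errors

X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e_hat = y - X @ b
XtX_inv = np.linalg.inv(X.T @ X)

V_classical = (e_hat @ e_hat / (n - 2)) * XtX_inv
meat = X.T @ (X * (e_hat**2)[:, None])
V_white = XtX_inv @ meat @ XtX_inv          # the "sandwich" estimator

se_c = np.sqrt(V_classical[1, 1])
se_w = np.sqrt(V_white[1, 1])
# se_w is substantially larger than se_c here; the classical formula
# understates the uncertainty under this heteroskedasticity.
```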
We looked at problems 3–10 of ch. 2 in Hansen. Then I talked about multicollinearity (ch. 3.11), and finally I commented on the model specification in ch. 4.16.
I have now covered essentially the following sections in Hansen: 1.1–1.3, 2.1, 3.2, 3.3, 3.5, 3.8, 3.9, 4.1. Today I explained somewhat sloppily how we should regard the covariates in an OLS regression (are they deterministic or random?). I will come back to this with more rigour on Thursday. You can also take a look at my comments on Hansen; I have just (19/1) updated them somewhat. I have spent some time on the "soft" parts of econometrics; more specifically, I have given examples of possible problems:
I will go on to talk also about
I will eventually update the exercises, but as you can see, we have already discussed some of them.
A possible remedy for the problem of endogeneity (simultaneity) and selection (self selection) bias is to employ an instrumental variable estimation, which we will deal with later.
I introduced the subject "econometrics". Then I went on to Ordinary Least Squares. For now we assume that the regressors are deterministic. That is not the typical situation in econometrics, but it is where we start. I defined the OLS estimate of the regression equation
yᵢ = xᵢ'β + eᵢ
as the value of (the vector) β which minimises the sum of squared residuals ∑ (yᵢ − xᵢ'β)².
I then went on to prove that this is equivalent to solving the normal equations
∑ xᵢêᵢ = 0
or, in matrix notation:
X'ê = 0
which leads to the following expression for β:
β = (X'X)⁻¹X'Y.
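The derivation above is easy to verify numerically (my own check): the closed-form solution satisfies the normal equations X'ê = 0.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)   # β = (X'X)⁻¹X'Y
e_hat = y - X @ beta
# X'ê is zero up to floating-point error, as the normal equations require
```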