
Current Information

Answers to the exam 9/6-10

- The estimator will converge (in probability) to cov[X,Y]/(V[X]+V[u]) instead of the correct value cov[X,Y]/V[X].
- The Wald test gives W = 1.995, which is an observation of a χ^{2}-variable with one degree of freedom. This is thus not significant at the 5% level.
- LAD corrects for the problem to a large extent, but not fully. The reason is that LAD targets the median instead of the mean. If, for example, 7 persons who had X=5 now deserve the grades 3,4,4,5,6,6,7, then we will observe the values 3,4,4,5,5,5,5, which have mean 4.4 although the true mean is 5.0. But LAD targets the median, and the median of the observed values is 5.0.
- We form a new variable (x_{1}+x_{2}) and run the regression

y = b_{0} + b_{1}(x_{1}+ x_{2}) + b_{2}x_{2} + ε

Then b_{2} = β_{2} - β_{1}.
- Define ε̂ through Y = X̂β̂ + ε̂.
We need to show that this solves the normal equations for OLS, i.e. that
X̂'ε̂ = 0. But

X̂'ε̂ = X̂'(Y - X̂β̂) = Γ̂'Z'(ûβ̂ + ê) = 0

since, by the normal equations for OLS, Z'û = 0 and Z'ê = 0.
- I suggest something along the lines of

(litres per 10 km) = β_{0} + (engine power)β_{1} + (weight)β_{2} + (automatic transmission)β_{3} + (hatchback)β_{4} + e

where (automatic transmission) and (hatchback) are dummy variables. β_{4} then measures how much more fuel hatchback models consume.

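The grade example in the LAD answer above is easy to check numerically (a small sketch, assuming numpy is available):

```python
import numpy as np

# True grades of the 7 persons with X = 5, and the values we actually
# observe after the top grades are censored down to 5.
true_grades = np.array([3, 4, 4, 5, 6, 6, 7])
observed = np.array([3, 4, 4, 5, 5, 5, 5])

print(true_grades.mean())    # 5.0  -- the true mean
print(observed.mean())       # ~4.43 -- the mean is biased by the censoring
print(np.median(observed))   # 5.0  -- the median, which LAD targets, is not
```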

The March exam is ALREADY corrected!

I will hand in the minutes and the exams tomorrow, Thursday 2/3. The results were good: 88% passed, with 15% As and 26% Bs.

I'm sorry that this comes a bit late.

Kristoffer Högberg's notes

This is a self-extracting archive. Download it, put it in a suitable folder, and click (or double-click) the file; all the notes will then be unpacked as pdf files (plus a small text file).

Wednesday 25/2-09

First I talked about omitted and irrelevant variables; sections
5.8 and 5.9 in Hansen. I also covered the situation when X_{1}
and X_{2} are uncorrelated. Then I talked about model selection,
and described BIC (section 5.10).
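As an illustration of BIC-based model selection, here is a minimal sketch (assuming numpy; the formula is the standard Gaussian-likelihood version n·log(SSR/n) + k·log(n), whose normalisation may differ slightly from Hansen's):

```python
import numpy as np

def bic(y, X):
    """Gaussian BIC for an OLS fit of y on X (columns = regressors incl. constant)."""
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    ssr = np.sum((y - X @ beta) ** 2)
    return n * np.log(ssr / n) + k * np.log(n)

# Tiny illustration with simulated data: the true model uses only x1.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                 # irrelevant regressor
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x2])
# BIC penalises the extra (irrelevant) regressor, so the small model usually wins.
print(bic(y, X_small), bic(y, X_big))
```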

Finally, I talked about the Tobit model (censored data, section 12.3) and read aloud some portions from Kennedy's book.

Tuesday 24/2-09

I made a (final?) update of my comments. I added a section on "Self Selection Bias". The latest version is thus dated 24/2-09.

Monday 23/2-09

Last Friday I talked about *Non-Linear Least Squares*
(NLLS); Ch. 5.4 in Hansen. Today we talked about Stefan Lundgren's and my
article
about estimating the demand elasticity for telephone calling time, as a case study.
I also started to talk a little about model selection: Ch. 5.8 in Hansen; I will
continue with that on Wednesday.

Now I have updated my comments again, hopefully for the last
time. I have added a section about model selection (Ch. 5.8, 5.9) and revised the
section about *Self Selection Bias*, which was a bit confusing (to put it mildly).

**It is about time you register for the exam!** You do that on
"Mina sidor".

Wednesday 18/2-09

I solved problem 8, Ch. 5.12 in Hansen. Then I talked about "logit" regression. Hansen has a very brief description of this in Ch. 12.1. I think it is at the very core of econometrics, so I put much more emphasis on it. There is a description in my comments (updated 15/2).
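A minimal sketch of how a logit regression can be fitted by Newton-Raphson (assuming numpy; this is the standard textbook algorithm, not necessarily the exact presentation in my comments):

```python
import numpy as np

def logit_fit(y, X, iters=25):
    """Logistic regression by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # P(y=1 | x)
        W = p * (1.0 - p)                     # weights on the diagonal
        grad = X.T @ (y - p)                  # score
        hess = X.T @ (X * W[:, None])         # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated example: true coefficients (intercept, slope) = (-0.5, 1.5).
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.5 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)

beta_hat = logit_fit(y, X)
print(beta_hat)   # should be near (-0.5, 1.5)
```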

The lecture March 4 is canceled

I'm going to I:s conference instead. We have plenty of time for this course, so no harm done.

Monday 16/2-09

I have had some health problems, so I didn't feel that fit today.
Anyway, I talked about *Least Absolute Deviations* (LAD), and noted two of its merits:

- Robust to outliers,
- Invariant under (monotonically increasing) transformations.

In these cases I recommend the *bootstrap* for hypothesis testing and for
calculating confidence intervals, rather than the awkward method
described in Hansen; see my comments.

I then mentioned the lemma in my comments
(currently on p. 2; please read the proof there) and derived
the expressions for *Generalised Least Squares*
(ch. 5.1 in Hansen; see also my comments).
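As a quick numerical companion to the GLS derivation, here is a minimal sketch (assuming numpy; the function name `gls` is my own). With Σ = I, GLS must reduce to OLS, which the check below confirms:

```python
import numpy as np

def gls(y, X, Sigma):
    """Generalised Least Squares: beta = (X' Sigma^-1 X)^-1 X' Sigma^-1 y."""
    Si = np.linalg.inv(Sigma)
    return np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)

# Sanity check on simulated data: with Sigma = I, GLS coincides with OLS.
rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_gls = gls(y, X, np.eye(n))
print(np.allclose(beta_ols, beta_gls))   # True
```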

Sunday 15/2-09

Updated my comments with a few more lines on "NLLS with Instrumental Variables".

Saturday 14/2-09

I have now finished the bootstrap. I described a method to estimate a confidence region and to do hypothesis testing in the setting of "Percentile Intervals" when the parameter under study is multi-dimensional. I have updated my comments accordingly.

**NOTE: we agreed that the assignments should be presented
Monday March 2nd.**

Wednesday 11/2-09

I have now finished Instrumental Variable Estimation (IV, 2SLS).
We did some exercises in Hansen, notably 9.8:3, 9.8:6a,b, 4.18:1, 4.18:3,
**4.18:7** (I said 4.18:3, then "corrected" to 4.18:5,
but the exercise I did was in fact 4.18:7.)

I then talked about bootstrap and covered 6.1–6.5. I will continue with bootstrap on Friday.

Assignment

One and a half credits are given for the compulsory assignment. You can do it either alone or together with one other student, but no more than two per group! You will present it at a later lecture; we will agree on when.

The task is as follows: Here are data on wages and personal characteristics for 2215 persons in the USA in 1976, downloaded from Bruce Hansen's web page, and here is a description of these data. You should do the following:

- Do exercise 8a, b, c in chapter 9 in Hansen's Lecture Notes (2008).
- Test for identification in parts b) and c).
- Now you should think of a regression model that answers the question
*Is the return to schooling the same for blacks as for non-blacks?* Try to maintain the assumption that *all other effects (coefficients)* are the same for all.
- Estimate the model.
- Interpret your results.
- The above is required. However, it would be nice if you chose to analyse the data in some other respect; some question you come up with yourself. But this is voluntary!
- Write a short report, as if you were to publish it at least in some internal forum at some imaginary place of work. You might want to look at this short article (from *Economics Letters* 1991) to see an example. Of course, your report may be shorter.
- Remember: You are doing this for your own sake, not mine!

Sunday 8/2-09

Last Wednesday and Friday I talked about the *instrumental variable method* (IV) and
*two stage least squares* (2SLS); Chapter 9 in Hansen. Note that the IV and 2SLS
estimators are consistent but biased; they are only asymptotically unbiased.
(The bias can be estimated by bootstrap methods; I will talk about this later.)
We also talked about *identification* and derived the *rank condition* for
identification. When there is only one endogenous regressor, a test for identification
is as follows:

Regress the endogenous regressor on *all* exogenous variables (the exogenous regressors
plus all instruments), then test whether all the coefficients on the instruments are
zero (a Wald test, if there is more than one instrument). If this null cannot be
rejected, there is a problem with identification. I have now included this test in my
comments.
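A sketch of this identification test, under the usual homoskedastic first-stage assumptions (numpy assumed; all variable and function names below are made up for illustration):

```python
import numpy as np

def first_stage_wald(x_endog, X_exog, Z_instr):
    """Wald statistic for H0: all instrument coefficients are 0 in the
    regression of the endogenous regressor on (exogenous regressors, instruments).
    Under H0 it is approximately chi-squared with (number of instruments) df."""
    W = np.column_stack([X_exog, Z_instr])   # all exogenous variables
    n, k = W.shape
    m = Z_instr.shape[1]                     # number of instruments
    gamma = np.linalg.lstsq(W, x_endog, rcond=None)[0]
    resid = x_endog - W @ gamma
    s2 = resid @ resid / (n - k)
    V = s2 * np.linalg.inv(W.T @ W)          # homoskedastic covariance estimate
    g = gamma[-m:]                           # coefficients on the instruments
    Vg = V[-m:, -m:]
    return g @ np.linalg.solve(Vg, g)

# Simulated example: one strong instrument, so H0 should be clearly rejected.
rng = np.random.default_rng(4)
n = 500
z = rng.normal(size=(n, 1))                  # instrument
X_exog = np.ones((n, 1))                     # just a constant
x_endog = 0.8 * z[:, 0] + rng.normal(size=n)
wald = first_stage_wald(x_endog, X_exog, z)
print(wald)   # large (far above the chi2(1) 5% critical value 3.84)
```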

Tuesday 3/2-09

Last Friday I talked about prediction. There is a very short section about this in Hansen,
ch. 5.3. I described the trick with observation-specific dummies, as described in
exercise 33. I also talked about *outliers*, which are closely
related to *influential observations*, which Hansen briefly mentions in ch. 3.12.
I described a method to identify outliers (and influential observations) using the method
of observation-specific dummies. I also described "cross validation" as a
means of model selection, for your general knowledge. We also looked at exercises
2–6 of ch. 3.14. (We will eventually also solve 6, 9 and 13.)

Yesterday I talked about the Instrumental Variable Method. First we identified the problem of "endogeneity"; I classified three types:

- self selection
- measurement errors in covariates
- simultaneity

and I gave examples of these, many of which appear in the exercises. I started to describe the method of instrumental variable estimation, but will do this in much more detail tomorrow.

Thursday 29/1-09

This Monday I talked about "residual regression", i.e., the Frisch-Waugh
theorem; ch. 3.7 in Hansen, and I gave a proof based on the "normal equations"
X'ê = 0. I also talked about "goodness of fit": R^{2}
and adjusted R^{2} (last part of ch. 3.3) and solved problem 3.14:6.
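The Frisch-Waugh theorem lends itself to a quick numerical check (a sketch with simulated data, assuming numpy):

```python
import numpy as np

# Frisch-Waugh: the coefficient on x1 in the full regression equals the
# coefficient from regressing the x2-residuals of y on the x2-residuals of x1.
rng = np.random.default_rng(5)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X2 = np.column_stack([np.ones(n), x2])

def resid(v, X):
    """Residual of v after OLS projection on the columns of X."""
    return v - X @ np.linalg.lstsq(X, v, rcond=None)[0]

# Full regression of y on (1, x1, x2): keep the coefficient on x1.
X_full = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.lstsq(X_full, y, rcond=None)[0][1]

# Residual regression (no intercept needed: residuals are orthogonal to it).
b_fw = np.linalg.lstsq(resid(x1, X2)[:, None], resid(y, X2), rcond=None)[0][0]

print(np.isclose(b_full, b_fw))   # True
```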

Yesterday I talked about confidence intervals and hypothesis testing; ch. 4.7–4.9. In particular, I made a very thorough derivation of the Wald test, and I solved exercises 15 and 23.

Friday 23/1-09

On Wednesday and Friday I wrote down in more rigour the assumptions we need for the classical regression model. They are essentially (2.8) and (2.9) in Hansen, plus a little more: that the observations are essentially independent. In matrix form:

- E[e|X] = 0
- E[ee'] = diag(σ_{1}^{2}, …, σ_{n}^{2})
- The data in the covariates may be any mixture of constants and random variables.

*Note that the last assumption is more general
than that employed by Hansen!* Indeed, **we do not adopt Assumption
3.1.1 in Hansen!**

We have previously seen that the assumption E[e|X] = 0 may be violated in many contexts where the equation has a structural interpretation. In some cases one or more relevant variables have been left out; in other situations one might use a different estimator than OLS. More on this later in the course.

We have also seen previously that we get a more efficient estimate if the model is (nearly) homoskedastic, so we should try to formulate the model with this in mind, but in any case the covariance matrix of the estimated coefficients should always be estimated by White's method.
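A minimal sketch of White's covariance estimator (assuming numpy; the function name `white_cov` is my own):

```python
import numpy as np

def white_cov(y, X):
    """White's heteroskedasticity-robust covariance of the OLS coefficients:
    (X'X)^-1 (sum_i e_i^2 x_i x_i') (X'X)^-1."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * (e ** 2)[:, None])   # sum of e_i^2 x_i x_i'
    return XtX_inv @ meat @ XtX_inv

# Simulated example with heteroskedastic errors: variance grows with |x|.
rng = np.random.default_rng(6)
n = 400
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * (1.0 + np.abs(x))

se_white = np.sqrt(np.diag(white_cov(y, X)))
print(se_white)   # robust standard errors for (intercept, slope)
```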

We looked at problems 3–10 of ch. 2 in Hansen. Then I talked about multicollinearity (ch. 3.11), and finally I commented on the model specification in ch. 4.16.

Monday 19/1-09

I have now covered essentially the following sections in Hansen: 1.1–1.3, 2.1, 3.2, 3.3, 3.5, 3.8, 3.9, 4.1. Today I explained somewhat sloppily how we should regard the covariates in an OLS regression (are they deterministic or random?). I will come back to this with more rigour on Thursday. You can also take a look in my comments to Hansen—I have just (19/1) updated them somewhat. I have spent some time on the "soft" parts of econometrics; more specifically, I have given examples of possible problems:

- heteroskedasticity
- omitted relevant covariate
- selection bias, in particular self selection bias
- endogeneity, simultaneity

I will go on to talk also about

- multicollinearity, and
- omitted non-linearity

I will eventually update the exercises, but as you can see, we have already discussed some of them.

A possible remedy for the problem of endogeneity (simultaneity) and selection
(self selection) bias is to employ an *instrumental variable* estimation,
which we will deal with later.

Wednesday 14/1-09

I introduced the subject "econometrics". Then I went on to Ordinary Least Squares. For now we assume that the regressors are deterministic. That is not the typical situation in econometrics, but we start with this situation. I defined the OLS estimate of the regression equation

y_{i} = x_{i}'β + e_{i}

as the value of (the vector) β which minimises the sum of squared residuals

Σ e_{i}^{2}.

I then went on to prove that this is equivalent to solving the *normal equations*

∑ x_{i}e_{i} = 0

or, in matrix notation:

X'e = 0

which leads to the following expression for the OLS estimate β̂:

β̂ = (X'X)^{-1}X'Y.
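This derivation is easy to verify numerically (a sketch assuming numpy): compute the estimate (X'X)^{-1}X'Y on simulated data and check that the residuals satisfy the normal equations X'ê = 0.

```python
import numpy as np

# Simulated data: y = X beta + noise, with an intercept and two regressors.
rng = np.random.default_rng(7)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^-1 X'Y
e_hat = y - X @ beta_hat
print(np.abs(X.T @ e_hat).max())               # ~0: the normal equations hold
```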