Transform the original response $y$ to $y' = a(y - c)$, where $a \neq 0$ and $c$ are fixed constants.

- What will happen to the OLS estimates of the $\beta$'s and to the residual sum of squares (RSS) after this linear transformation?
- Show that the $F$-statistic for testing $H_0: \beta_1 = \dots = \beta_s = 0$ ($1 \le s \le p$) will be the same in both cases.
- Where did you use the assumption of normality of the errors? How would you test the hypothesis $H_0$ (see above), at least asymptotically, when the distribution of the errors $\varepsilon$ is not normal?
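The claims in the items above can be sanity-checked numerically before proving them. The following is only a sketch on simulated data: the predictors `x1`, `x2`, `x3`, the constants `a` and `c0` (standing in for $a$ and $c$), and the choice $s = 2$ are all illustrative assumptions, and a numerical check is not a substitute for the derivation.

```r
set.seed(1)
n  <- 50
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)   # hypothetical predictors
y  <- 1 + 2 * x1 - x2 + rnorm(n)

a  <- 3; c0 <- 5                 # illustrative constants a and c
y2 <- a * (y - c0)               # transformed response y'

fit  <- lm(y  ~ x1 + x2 + x3)
fit2 <- lm(y2 ~ x1 + x2 + x3)
b  <- coef(fit)
b2 <- coef(fit2)

# Slopes are multiplied by a; the intercept is shifted by c and then scaled
slopes_ok    <- all.equal(unname(b2[-1]), unname(a * b[-1]))
intercept_ok <- all.equal(unname(b2[1]),  unname(a * (b[1] - c0)))

# RSS is multiplied by a^2 (deviance() of an lm fit is its RSS)
rss_ok <- all.equal(deviance(fit2), a^2 * deviance(fit))

# F-statistic for H0: beta1 = beta2 = 0, from a nested-model comparison
F1 <- anova(lm(y  ~ x3), fit)$F[2]
F2 <- anova(lm(y2 ~ x3), fit2)$F[2]
c(F_original = F1, F_transformed = F2)
```

The two F values agree because the numerator and denominator sums of squares are both multiplied by $a^2$.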

- If all $n$ observations $x_i$ are equidistant from their average, then $h_{ii} = 2/n$.
- If all but one of the $x_i$'s are identical, these will have $h_{ii} = 1/(n-1)$, while for the remaining observation $h_{ii} = 1$.
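Both leverage statements can be checked numerically. The sketch below assumes a simple linear regression with an intercept (the model is not restated in this item, so that is an assumption) and computes the diagonal of the hat matrix $H = X(X^\top X)^{-1}X^\top$ directly. The helper `leverage()` and the particular `x` values are hypothetical.

```r
# Diagonal of the hat matrix for simple linear regression with an intercept
leverage <- function(x) {
  X <- cbind(1, x)                               # design matrix
  diag(X %*% solve(crossprod(X)) %*% t(X))       # h_ii
}

n <- 10

# Case 1: every x_i is at distance 1 from the mean (which is 0)
x_eq <- rep(c(-1, 1), each = n / 2)
h_eq <- leverage(x_eq)

# Case 2: all but one x_i are identical
x_one <- c(rep(0, n - 1), 4)
h_one <- leverage(x_one)

rbind(h_eq, h_one)
```

For a fitted model the same quantities are returned by `hatvalues(fit)`.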

- Plot *Time* against *WBC* for each level of *AG*. Does the plot indicate that a linear model will be appropriate? Try the effect of log-transformations of *Time* and *WBC* on this plot.
- Fit a full linear regression model (with interaction) of *Time* on *WBC* and *AG*. Comment on the results. Test for parallel regression lines. Does this model fit the data?
- Re-fit the model on the log-log scale. Does the effect of *log(WBC)* on *log(Time)* depend on the presence of the *AG* factor? Check the adequacy of the resulting model and try to think of possible reasons for any problems you find. Compare this model with that of the previous item.
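The mechanics of the three items above might be set up as follows. This sketch assumes the data are the leukaemia survival data shipped as `leuk` in the *MASS* package, with lower-case columns `time`, `wbc` and `ag`; the course data file may be named or coded differently, so adjust the names accordingly. It shows only how to fit and compare the models, not how to interpret them.

```r
library(MASS)    # assumed source of the data: data frame `leuk`

# Scatterplots on the raw and log-log scales, plotting symbol by AG level
par(mfrow = c(1, 2))
plot(time ~ wbc, data = leuk, pch = as.integer(leuk$ag))
plot(log(time) ~ log(wbc), data = leuk, pch = as.integer(leuk$ag))
par(mfrow = c(1, 1))

# Full model with interaction vs. parallel-lines model, raw scale
fit_raw  <- lm(time ~ wbc * ag, data = leuk)
fit_par  <- lm(time ~ wbc + ag, data = leuk)
par_test <- anova(fit_par, fit_raw)    # F-test for parallel regression lines
par_test

# The same comparison on the log-log scale
fit_log     <- lm(log(time) ~ log(wbc) * ag, data = leuk)
fit_log_par <- lm(log(time) ~ log(wbc) + ag, data = leuk)
anova(fit_log_par, fit_log)
```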

- Fit the main effect model expressing the charges against age and the other variables (don't forget first to express them as suitable indicator variables where necessary). Is the linear model adequate for this data?
- Find the appropriate transformation of the dependent variable from the Box-Cox transformation family, re-fit the model and comment on its adequacy.
- Test the hypothesis that the attending physician has no effect on hospital charges (on the chosen scale).
- Some feminist organizations claim that there is sexual discrimination in the hospital and women suffer from higher hospital charges. Does their claim have any statistical ground?
- Point out influential observation(s) that strongly affected your model (if any). Remove them from the data and re-fit the model. Comment on the results. Repeat Step 3 and Step 4.
- Repeat Step 2 without influential observations you've found. Did you get the same scale for the response variable as before? Try to explain this phenomenon.
- Are you completely satisfied with the resulting model(s)? If "yes", *mazal tov!*; if "no", give an idea(s) for improving it.

- The function **lm**, used for fitting linear models, creates an object *lm.object* as its output that contains a lot of useful information you may need for the analysis of your model. See **help(lm.object)** for more details.
- After fitting a linear model with **lm** and storing its output as *lm.object*, the function **plot(lm.object)** gives useful plots, such as residuals vs. predicted values, a Q-Q plot, Cook's distance, etc.
- To find the optimal Box-Cox transformation, you can use the function **boxcox** from the package *MASS*, which you should download/attach first.
- To define factor variables, use the **factor** or **ordered** functions (see *help* for details).
- To use only part of the data in fitting models, use the parameters *subset* (preferable) or *weights* in the **lm** function (see **help(lm)** for details).
- The functions **update**, **add1** and **drop1** may be useful for modifying models (see *help* for details).
- If you want to put several plots on the same page, you can use **par(mfrow=c(...,...))** to set the number of rows and columns of plots per page.
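The hints above can be tried out together on a small simulated data set. Everything about the data here is hypothetical: the data frame `d`, its columns `age`, `sex` and `charges`, and the cut-off `age < 70` are made up purely to show the calls and do not correspond to the course data.

```r
library(MASS)                        # provides boxcox()

set.seed(2)
n <- 80
d <- data.frame(
  age = runif(n, 20, 80),
  sex = factor(sample(c("F", "M"), n, replace = TRUE))  # factor() defines a categorical variable
)
# Simulated positive, right-skewed response
d$charges <- exp(1 + 0.02 * d$age + rnorm(n, sd = 0.3))

fit <- lm(charges ~ age + sex, data = d)   # returns the fitted-model object

par(mfrow = c(2, 2))                 # 2 rows x 2 columns of plots on one page
plot(fit)                            # standard diagnostic plots for an lm fit
par(mfrow = c(1, 1))

# Profile log-likelihood of the Box-Cox parameter lambda over a grid
bc <- boxcox(fit, lambda = seq(-1, 1, by = 0.05))
lambda_hat <- bc$x[which.max(bc$y)]  # grid value maximising the likelihood

fit_log <- update(fit, log(charges) ~ .)        # new response, same predictors
fit_sub <- update(fit_log, subset = age < 70)   # re-fit on part of the data
drop1(fit_log, test = "F")                      # F-test for dropping each term
```

`update` re-evaluates the original call with the changed arguments, so the same pattern works for removing suspected influential observations through `subset`.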