This online support page will be updated frequently...since many browsers don't recognize changes, please refresh this page each time you visit or you might not see the updates!
You will find reviews, assignments, answers, and links to sites of interest, including the course website discussion page. When you first log in you'll be prompted to set up an account. If you don't have a current email address, make up something like "bob@bob.net". Our conference is titled Econ 3600. If you have any problems logging in, make sure to let me know.
Week One
It will be a good idea for you to read Chapter 0 ASAP. This chapter reviews algebraic concepts and will help bring you up to speed on math-jargon. I'll assume that you are comfortable with this material unless you let me know.
The first topic we study involves linear models. It's a good place to begin because these models are simple and also very useful in many applications. Now you might be thinking, "why is simple a good thing when the world is so complex?" I'm sure there are many answers to this question, and in our first meeting I'll ask you to help me answer it. But one thing for sure is that it's a good idea to build up to complexity in stages. In addition to pedagogical reasons, simplicity is kind of elegant. Many economists appreciate microeconomic theory for this reason alone. Linear models are often very good approximations to reality. At the Economic Time Series Page you can view many economic variables plotted against time (thus the name Time Series). As you browse through these, you'll see that some variables tend to trend upwards. If we stretched a rubber band along the data (do this on your screen!), we would find that some series fit the rubber band well. But others might not (for example, nominal GDP). As we become comfortable with using linear models we will find ways to "straighten" time series so that even wildly curved data can be modeled using linear methods.
In class we'll also talk about models graphically. Since you are Econ majors, you are VERY familiar with graphical presentations. I like graphs a lot because pictures convey so much information in a very concise way. If you have a chance, check out The Elements of Graphing Data by William S. Cleveland, AT&T Bell Laboratories, Murray Hill, New Jersey, 1994 (ISBN: 0-9634884-1-4). (William Cleveland is one of the world's top researchers in statistics.)
Assignment 1 (Linear Applications)
| Page | # |
| 69 | 1, 5, 23, 43, 43 |
| 78 | 29, 32 |
| 97 | 8, 14, 39, 56, 57, 59 |
| 113 | 43, 44, 54 |
| 122 | 1, 2, 3, 4, 9, 10, 25, 29 |
| 133 | 1, 3, 5, 43, 52, 53 |
| 156 | 1, 5, 11, 30, 40 |
Week Two
As you know from last week, one of the major economic applications of linear models is in the context of systems of equations. In micro theory, market equilibrium occurs when demand equals supply (Qd = Qs). For a single market we have a demand curve and a supply curve. If the law of demand holds (the slope is negative) and if the quantity supplied is either fixed or is directly related to price (a positive slope) we are assured of an equilibrium price. Intro texts don't talk much about general equilibrium, which is the theory that prices in all markets move to equilibrate the entire economic system.
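To make the single-market case concrete, here is a minimal Python sketch. It assumes linear curves Qd = a - bP and Qs = c + dP; the coefficient names and the numbers are made up for illustration and do not come from the text.

```python
def equilibrium(a, b, c, d):
    """Solve Qd = a - b*P and Qs = c + d*P for the price where Qd = Qs."""
    if b + d == 0:
        raise ValueError("curves are parallel; no unique equilibrium")
    # Setting a - b*P = c + d*P and solving for P.
    price = (a - c) / (b + d)
    quantity = a - b * price
    return price, quantity

# Hypothetical market: demand Qd = 100 - 2P, supply Qs = 10 + 4P.
price, quantity = equilibrium(a=100, b=2, c=10, d=4)
print("equilibrium price:", price)
print("equilibrium quantity:", quantity)
```

With a downward-sloping demand curve (b > 0) and a flat or upward-sloping supply curve (d >= 0), the denominator b + d is positive, so the two lines cross exactly once.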
Thinking about big sets of simultaneous systems of equations will motivate our discussion of matrices and linear algebra in Chapter 2. Linear algebra is important in macro, micro, and econometrics. Essentially we'll develop a notation that facilitates the solution of big systems. Here's an example. If you look in almost any intro econometrics text, discussions are pretty scary and difficult to understand. I think this is largely because authors at this level use SIGMA notation, i.e., "for i equal 1 to n, sum some function of i, blah, blah, etc." It's easy to get lost in the notation and distracted from the concepts. By spending just a few weeks on linear algebra, we can more easily talk about econometrics in a non-intimidating way.
This week we will become fluent with basic matrix operations including addition, scalar multiplication, matrix multiplication, and inversion. We won't spend too much time on finding matrix inverses (Sections 2.3 & 2.4) by hand since that's what a computer is for. But we will do a few, just so you convince yourself that it might even be worth waiting for a dial-up connection to the University of Utah system to get access to a computer. BTW, not all good ISPs are listed on the U's link to Utah Internet Service Providers. (If you have some suggestions, please post on our webboard.)
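The four operations named above can be sketched in a few lines of plain Python. This is only an illustration: the matrices A and B are made up, the helper names are hypothetical, and the inverse routine handles the 2x2 case only.

```python
def mat_add(A, B):
    """Add two matrices of the same shape, entry by entry."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scalar_mul(k, A):
    """Multiply every entry of A by the number k."""
    return [[k * a for a in row] for row in A]

def mat_mul(A, B):
    """Row-by-column product; A's column count must equal B's row count."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix using the determinant formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 3]]
B = [[0, 1], [4, 2]]

total = mat_add(A, B)
doubled = scalar_mul(2, A)
product = mat_mul(A, B)
A_inv = inverse_2x2(A)
identity = mat_mul(A, A_inv)  # should be the identity, up to rounding

print(total, doubled, product, A_inv, identity, sep="\n")
```

Multiplying A by its inverse and getting the identity matrix back is a quick way to check an inverse you worked out by hand.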
Week Three
This week we continue working with matrices. I've prepared a couple of documents that explain how to use Microsoft Excel to solve systems of equations. They are:
Now we look at one of the most important applications of linear models: linear programming. Linear programming is a subset of mathematical programming where a decision maker chooses values of decision variables in order to maximize or minimize an objective function, subject to constraints. Linear programming assumes that both the objective function and the constraints are linear. One of the nice features of this set-up is that if the problem has an optimal solution, standard solution methods are guaranteed to find it.
I was pleasantly surprised to find out that the "Solver" add-in in Excel can do both linear and non-linear programming, though the explanation was a bit hard to follow. But once I got the hang of it, setting up a problem was easy. We'll do this in class tonight beginning with a simple production problem. Take a look at the Excel spreadsheet: LinearProgram.xls and explore the values in the highlighted cells. If you cannot download it, you can view LinPro instead. In this example we have a manufacturer who produces two products (Type I and Type II). The manufacturer earns $3.00 profit from each Type I produced and $4.00 from each Type II produced. This translates into the objective function: Profit = $3X + $4Y, where X and Y are the number of units of Type I and Type II produced, respectively. Now let's introduce two constraints that are imposed by available time on two machines (1 and 2). Each unit produced, of either type, requires two hours on Machine 1 and there are a total of 10 hours available. This is the Machine 1 constraint: 2X + 2Y <= 10. Additionally each Type I product requires 1.5 hours on Machine 2 and each Type II product requires 2 hours on Machine 2. We assume there are 10 hours available on Machine 2 so this production constraint translates to: 1.5X + 2Y <= 10. Before moving on, here's how to summarize our decision maker's problem:
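In compact form: maximize 3X + 4Y subject to 2X + 2Y <= 10, 1.5X + 2Y <= 10, X >= 0, and Y >= 0. The Python sketch below is not what Solver does internally; it just checks every corner of the feasible region, which is enough for a two-variable problem. The function names are hypothetical, and the non-negativity of X and Y is an assumption the text leaves implicit.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*X + b*Y <= c.
# The last two rows encode X >= 0 and Y >= 0 as -X <= 0 and -Y <= 0 (assumed).
constraints = [
    (2.0, 2.0, 10.0),   # Machine 1 hours
    (1.5, 2.0, 10.0),   # Machine 2 hours
    (-1.0, 0.0, 0.0),   # X >= 0
    (0.0, -1.0, 0.0),   # Y >= 0
]

def profit(x, y):
    """Objective function: $3 per Type I unit plus $4 per Type II unit."""
    return 3.0 * x + 4.0 * y

def corner_points(cons, tol=1e-9):
    """Intersect every pair of constraint boundaries; keep the feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundaries never cross
        # Cramer's rule for the 2x2 system of boundary equations.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in cons):
            points.append((x, y))
    return points

best = max(corner_points(constraints), key=lambda p: profit(*p))
print("best corner:", best, "profit:", profit(*best))
```

For the numbers given here the best corner is X = 0, Y = 5, with a profit of $20.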
Here is a link to another linear programming problem for you to try.
Transportation problems are non-trivial and generalize into many other production/distribution problems. I've included three links below that discuss ways to formulate a shipping problem. We examine warehouses in Nevada, Illinois, and New Jersey which ship products to retail distribution sites in California, Colorado, Texas, and New York. One link is the Word document (if you have Word), one is the accompanying Excel spreadsheet which "solves" the problem, and the other discusses the problem using HTML (Java). For the mid-term exam you should expect a similar problem and I'll ask you to formulate the problem, but not actually solve it.
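Formulating a shipping problem mostly means writing down one decision variable per route and one constraint per warehouse and per retail site. The Python sketch below prints such a formulation. The supply and demand figures are hypothetical placeholders (the real ones are in the linked documents), and the objective, which would be to minimize the sum of per-unit shipping cost times units shipped over all routes, is left out because no costs are given here.

```python
warehouses = ["Nevada", "Illinois", "New Jersey"]
retailers = ["California", "Colorado", "Texas", "New York"]

# Hypothetical capacities and requirements, for illustration only.
supply = {"Nevada": 100, "Illinois": 150, "New Jersey": 120}
demand = {"California": 110, "Colorado": 60, "Texas": 90, "New York": 100}

def formulate(warehouses, retailers, supply, demand):
    """Return the decision variables and constraint rows as readable text."""
    # One decision variable per route: units shipped from warehouse w to site r.
    variables = [f"x[{w},{r}]" for w in warehouses for r in retailers]
    # A warehouse cannot ship more than it has.
    supply_rows = [
        " + ".join(f"x[{w},{r}]" for r in retailers) + f" <= {supply[w]}"
        for w in warehouses
    ]
    # Each retail site must receive at least what it needs.
    demand_rows = [
        " + ".join(f"x[{w},{r}]" for w in warehouses) + f" >= {demand[r]}"
        for r in retailers
    ]
    return variables, supply_rows, demand_rows

variables, supply_rows, demand_rows = formulate(warehouses, retailers, supply, demand)
print(len(variables), "decision variables")
for row in supply_rows + demand_rows:
    print(row)
```

Three warehouses and four retail sites give 12 decision variables, 3 supply constraints, and 4 demand constraints, plus the requirement that no shipment is negative.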
Weeks Four/Five
Chapters 4 and 5 in the text review non-linear functions that we will frequently use. Instead of spending time on those chapters I want you to use them as a reference. In class we now skip to Chapter 9 on derivatives. Section 9.1 explains how to evaluate the limit of a function (see definition p. 580). In this section you'll find that a limit either exists or not. Fortunately most of the functions we use in economics are "smooth" or "well-behaved" so limits do exist. This is fortunate because we use the concept of a limit to define the derivative, which is a function, derived from another function (the parent), that tells us about the rate of change of the parent function (see Section 9.3). Finding derivatives is quite simple and next week we will focus on Sections 9.4 through 9.9. By way of warm-up, you'll find the following assignment useful.
You should be comfortable with finding simple derivatives. Good practice problems are on page 631+. You should be able to do exercises 1-30. Note the text distinguishes between dy/dx and f'(x) in interpreting the slope of the line tangent to the curve and the instantaneous rate of change. (Assignment Three.)
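The link between limits and derivatives is easy to see numerically. This small Python sketch, using the example function f(x) = x^2 (my choice, not the text's), computes the slope between two nearby points and shrinks the gap h toward zero.

```python
def difference_quotient(f, x, h):
    """Average rate of change of f between x and x + h."""
    return (f(x + h) - f(x)) / h

def f(x):
    return x ** 2

# As h shrinks, the quotient settles toward the derivative f'(3) = 2 * 3 = 6.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(f, 3.0, h))
```

The value the quotient approaches as h goes to zero is the slope of the tangent line at x = 3, which is also the instantaneous rate of change there.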
Week Six
This week we use calculus to solve assorted optimization problems.
For example:
P(X) = -0.04X^2 + 240X - 10,000
How many units should be produced in order to maximize profits?
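One way to work this out is to set the first derivative to zero and check the sign of the second derivative. The Python sketch below does this for a general quadratic aX^2 + bX + c; the function names are hypothetical.

```python
def profit(x):
    """P(X) = -0.04 X^2 + 240 X - 10,000."""
    return -0.04 * x ** 2 + 240 * x - 10_000

def quadratic_optimum(a, b):
    """Stationary point of a*X^2 + b*X + c, found from 2*a*X + b = 0."""
    if a == 0:
        raise ValueError("not a quadratic; no interior optimum")
    return -b / (2 * a)

a, b = -0.04, 240.0
best_x = quadratic_optimum(a, b)
is_maximum = 2 * a < 0  # second derivative P''(X) = 2a

print("units:", best_x, "profit:", profit(best_x), "maximum:", is_maximum)
```

Setting P'(X) = -0.08X + 240 = 0 gives X = 3,000 units. Because P''(X) = -0.08 is negative, this is a maximum, and profit there is 350,000.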
Week Seven
Although the functions x^2 and 2^x look similar, there's a huge difference in their behaviors. At x=2, both functions evaluate to 4 (and at x=4, both evaluate to 16), but by the time we reach x=10, x^2 evaluates to 100 and 2^x reaches 1024. One thing we have learned is that by exploring the derivative we can better understand the basic shape and key features of the function. The derivative of x^2 is 2x, but the derivative of 2^x is (ln(2))2^x! Now...what is ln(2)? Remember that a logarithm is tied to the base of an exponential function. In economics we use "e," defined as the limit as x grows large of (1 + 1/x)^x. The derivative of e^x is e^x. (The derivative of an exponential function is: d(blah^x)/dx = (ln(blah))blah^x; ln(e) = 1; the derivative of a logarithmic function, logbaseblah(x), is (1/ln(blah))(1/x), so d(ln(x))/dx = 1/x.) Some exercises we will go over are on pages 764 & 771, but are a bit obscure in the text -- so don't get frustrated at first reading. Just remember that almost always in economics we use e as the base since e relates to growth and compounding. So here are some simple examples; look over them and you'll get the hang of it:
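The derivative rules for powers, exponentials, and logarithms can be checked numerically. The Python sketch below (not the original examples from this page) compares a numerical slope with each formula at the arbitrarily chosen point x = 3, and also approximates e from its limit definition.

```python
import math

def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 3.0  # arbitrary evaluation point

# Each entry pairs a numerical slope with the value given by the rule.
checks = {
    "x^2     -> 2x": (numeric_derivative(lambda t: t ** 2, x), 2 * x),
    "2^x     -> ln(2) * 2^x": (numeric_derivative(lambda t: 2 ** t, x), math.log(2) * 2 ** x),
    "e^x     -> e^x": (numeric_derivative(math.exp, x), math.exp(x)),
    "ln(x)   -> 1/x": (numeric_derivative(math.log, x), 1 / x),
    "log2(x) -> 1/(x ln 2)": (numeric_derivative(math.log2, x), 1 / (x * math.log(2))),
}

for rule, (numeric, formula) in checks.items():
    print(rule, numeric, formula)

# e as the limit of (1 + 1/n)^n for large n.
n = 1_000_000
e_approx = (1 + 1 / n) ** n
print("e is approximately", e_approx)
```

The two numbers on each printed line should agree to several decimal places, which is a handy way to catch a mis-remembered rule.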
Important! The second mid-term is next Wednesday. I've put together a handout on partial derivatives. The HTML is here or you can download it as a Word Document. Here is the link to the Sample Mid-Term.
Week Eight
Probability & Statistics. If you ask most students who have taken a class in stat, their likely reaction will be something very, very unpleasant. So why did Penthouse (The Madonna Issue) rank Prob&Stat one of the most important classes for college students to take? That's what we'll find out in the next few weeks. (Read Chapters 7 and 8.)
We begin by looking at data and trying to figure out ways to discover what, if anything, data can tell us. Sometimes just categorizing data in convenient ways is important...that's data summary. A simple example uses the data set: {1,4,5,21,3,1}. So...tell me about this set? First there are six observations. The data values range from 1 to 21. Most of the numbers range between 1 and 5. One of the numbers is much larger than the rest. If we sort the numbers from low (min) to high (max) we have the set: {1,1,3,4,5,21}. The two "most" middle values are 3 & 4. The "middle" of these two numbers we might call the median which we could say is the average of 3 & 4 or 3.5. What does the median tell us? Well, if we wanted just one number to tell us what our data are "like" we could use the median. So our six numbers could be represented by just one: 3.5. What about the average (or mean) of our six? It is 1+1+3+4+5+21 divided by six or 5.83. For this data set what is a better number to represent what our data are like: 3.5 or 5.83? I'd go with 3.5. Why? And why do you think the mean is larger than the median? What number (if any) occurs more often than any other? Answer: 1. That's called the mode of our data set. So we have five statistics (min, max, median, mean, and mode) that we can use to describe a list of numbers. What is a statistic? It's a function of the data.
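These summary statistics are easy to reproduce with Python's standard `statistics` module. This is just a sketch for checking the arithmetic above; the dictionary layout is my own choice.

```python
import statistics

data = [1, 4, 5, 21, 3, 1]

summary = {
    "n": len(data),
    "min": min(data),
    "max": max(data),
    "median": statistics.median(data),  # average of the two middle values
    "mean": statistics.mean(data),      # sum divided by the number of observations
    "mode": statistics.mode(data),      # most frequent value
}

for name, value in summary.items():
    print(name, value)
```

Try replacing 21 with 6 and rerunning: the median does not move, but the mean drops a lot, which is one hint about why the median can be the better single-number summary here.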
Now it's your turn: Due Wednesday.
Let's continue with our example from above using the modified data set {1,4,5,2,1,3,1} with seven observations of children's ages. Realistic data are typically more complicated. For example, these data can be matched with the child's name, sex, weight, and so on. Multivariate data are best arranged in a table, or matrix. Here's our new data set:
| Name | Age | Weight | Sex |
| Chris | 3 | 20 | M |
| Ann | 4 | 12 | F |
| Tony | 1 | 8 | M |
| Alice | 3 | 18 | F |
| Robert | 2 | 12 | M |
| Liz | 5 | 18 | F |
| Paul | 1 | 8 | M |
Notice that we now have four variables and seven observations. Two of the four variables are quantitative and two are qualitative. We can also categorize variables into two types: nominal and interval. Interval variables "vary" along a sensible scale. So age and weight are interval variables. Nominal variables are used to classify groups and order doesn't matter. Here is a way to present the same data using the nominal variable MALE. The variable takes on the value 1 when the observed sex is male and 0 otherwise. Here's something to notice: our sex variable takes on two possible values: Male or Female, but we only need one variable to indicate sex. This example might make it clearer.
| Name | Age | Weight | Male |
| Chris | 3 | 20 | 1 |
| Ann | 4 | 12 | 0 |
| Tony | 1 | 8 | 1 |
| Alice | 3 | 18 | 0 |
| Robert | 2 | 12 | 1 |
| Liz | 5 | 18 | 0 |
| Paul | 1 | 8 | 1 |
Let's say that with our data we also track the marital status of the child's parents. For convenience, let's say the parents are either married, divorced, or other -- three "states." I know when I first started doing statistics I would code these as something like 1, 2, or 3, with 1 being married, 2 divorced, and 3 other. That's fine, but there's a better way, like this:
| Name | Age | Weight | Divorced | Married | Sex |
| Chris | 3 | 20 | 0 | 1 | M |
| Ann | 4 | 12 | 0 | 1 | F |
| Tony | 1 | 8 | 0 | 0 | M |
| Alice | 3 | 18 | 1 | 0 | F |
| Robert | 2 | 12 | 1 | 0 | M |
| Liz | 5 | 18 | 0 | 0 | F |
| Paul | 1 | 8 | 0 | 1 | M |
Can you tell from these data what the marital status of Liz's and Tony's parents is? Yes, it's "Other" since they are neither divorced nor married. Note that the marriage state nominal variable takes on three values yet we only need two variables to completely describe it. For "readability" it is sometimes nice to add another column: OTHER.
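The coding scheme in the table above can be sketched in Python as follows. The function names are hypothetical; the marital statuses are read off the Divorced and Married columns, with "Other" as the state that gets no column of its own.

```python
# Parents' marital status for each child, as implied by the table above.
children = [
    ("Chris", "Married"), ("Ann", "Married"), ("Tony", "Other"),
    ("Alice", "Divorced"), ("Robert", "Divorced"), ("Liz", "Other"),
    ("Paul", "Married"),
]

LEVELS = ("Divorced", "Married")  # states that get their own 0/1 column
BASELINE = "Other"                # the state implied when every column is 0

def dummy_code(status):
    """Turn one three-state value into two 0/1 indicator variables."""
    return {level: int(status == level) for level in LEVELS}

def decode(dummies):
    """Recover the original state from the indicator variables."""
    for level, flag in dummies.items():
        if flag == 1:
            return level
    return BASELINE

for name, status in children:
    print(name, dummy_code(status))
```

Because `decode` recovers every original status, the two columns carry exactly the same information as the three-state variable.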
Here are some questions:
Are older children heavier?
Week Nine
This week we continue exploring ways to summarize data both numerically and visually. At this stage we'll make a distinction between population and sample information. We typically deal with samples and attempt to infer population characteristics. For example, we use survey information to help us understand what the population is like. Two important numerical population characteristics are the mean and variance. These are what we might want to infer from the sample data, using the mean and variance statistics as estimators.
Variance measures the spread of the data, and whether a given spread is large or small depends on the scale of the data. The positive square root of the variance is called the standard deviation...and that's an important concept to master. In Excel you can easily compute basic descriptive statistics using built-in functions. If I enter the seven age numbers into cells A1 to A7 in Excel and type " =average(A1:A7) " in another cell, the mean is returned. If I type " =stdev(A1:A7) " Excel returns the standard deviation for my sample. (The command to compute the population standard deviation is stdevp.) Similarly if I type " =median(A1:A7) " I'll get the median, etc. If you take the standard deviation and divide it by the mean you form the coefficient of variation which is a measure of "standardized" variation...a way to compare "relative" variation. For example, if the standard deviation is 10 and the mean is 5, the coefficient of variation is 2.
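If you don't have Excel handy, the same calculations can be sketched with Python's standard `statistics` module. I am assuming the seven ages are the Age column of the children table from Week Eight; the helper function name is hypothetical.

```python
import statistics

ages = [3, 4, 1, 3, 2, 5, 1]  # assumed: Age column of the children table

mean_age = statistics.mean(ages)          # like =average(A1:A7)
median_age = statistics.median(ages)      # like =median(A1:A7)
sample_sd = statistics.stdev(ages)        # like =stdev(A1:A7); divides by n - 1
population_sd = statistics.pstdev(ages)   # like =stdevp(A1:A7); divides by n

def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a multiple of the mean."""
    return sd / mean

print("mean:", mean_age, "median:", median_age)
print("sample sd:", sample_sd, "population sd:", population_sd)
print("CV of ages:", coefficient_of_variation(sample_sd, mean_age))
print("CV when sd = 10 and mean = 5:", coefficient_of_variation(10, 5))
```

The sample standard deviation is always a little larger than the population version because it divides the same sum of squared deviations by n - 1 instead of n.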
By using the "Analysis" add-in you can have Excel do many other statistical calculations and graphics. This Excel file includes a little over 3,000 records of parolees: http://www.econ.utah.edu/fowles/prison.xls You can download it and examine how I created histograms of the data values, etc. Find the mean, median, standard deviation, and coefficient of variation for each variable in the prison data. The variables are the Age at First Arrest, the Number of Prior Arrests, the Number of Prior Convictions, the Degree of the Felony, and the Number of Dependents.
Week Ten
This week we take a quick look at probability in a general way. In class we used an "urn" model to talk about probabilities. Consider the data of 100 people surveyed in the following table:
| | Employed | Not Employed |
| Young | 10 | 20 |
| Old | 50 | 20 |
If we "randomly" select a person the probability that s/he will be old (P(old)) is 70%, employed (P(employed)) is 60%, young & not employed (P(young & not employed)) is 20%, either employed or young (P(employed or young)) is 80% = P(employed) + P(young) - P(young & employed) = 60% + 30% - 10%, and old given employed (P(old|employed)) is 50/60, since 50 of the 60 employed people are old. The last probability is a conditional probability. For this to work note that the events young and old are mutually exclusive. Let's say the data look like this:
| | Owns Car | Owns Stereo | Owns Home |
| Employed Full Time | 100 | 50 | 10 |
| Employed Part Time | 50 | 75 | 25 |
| Not Employed | 25 | 35 | 10 |
Here things are a bit more difficult. First of all, let's say this table is based on 500 observations. Can we compute, for example, the probability that a person drawn is employed full time? No, because ownership is not mutually exclusive: the same person can own a car, a stereo, and a home, so we cannot add across a row to count people. We can, though, note that the probability of car ownership is 175/500 because our employment categories are mutually exclusive.
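Both tables in this section can be checked with a short Python sketch. It treats each table as a dictionary of counts; the helper name `prob` and the dictionary layout are my own, and the 500 observations for the ownership table are taken from the text.

```python
# First table: 100 people cross-classified by age group and employment.
counts = {
    ("Young", "Employed"): 10,
    ("Young", "Not Employed"): 20,
    ("Old", "Employed"): 50,
    ("Old", "Not Employed"): 20,
}
total = sum(counts.values())

def prob(event):
    """Share of people whose (age, employment) cell satisfies the event."""
    return sum(n for cell, n in counts.items() if event(cell)) / total

p_old = prob(lambda c: c[0] == "Old")
p_employed = prob(lambda c: c[1] == "Employed")
p_young_not_employed = prob(lambda c: c == ("Young", "Not Employed"))
p_employed_or_young = prob(lambda c: c[1] == "Employed" or c[0] == "Young")
# Conditional probability: restrict attention to the employed group.
p_old_given_employed = prob(lambda c: c == ("Old", "Employed")) / p_employed

# Second table: car owners by (mutually exclusive) employment category.
car_owners = {"Employed Full Time": 100, "Employed Part Time": 50, "Not Employed": 25}
survey_size = 500
p_car = sum(car_owners.values()) / survey_size

print(p_old, p_employed, p_young_not_employed, p_employed_or_young)
print(p_old_given_employed, p_car)
```

Note that the conditional probability divides by the 60 employed people, not by the 70 old people; dividing by 70 would instead give P(employed|old).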
Now it's your turn. The following table presents data from 1,000 unemployed adult males classified according to three levels of education (E, F, G) and three levels of job skills (R, S, T). Based on this calculate P(E), P(F and T), P(T|E), P(R or G), P(E or notF):
| | R | S | T | Total |
| E | 160 | 40 | 50 | 250 |
| F | 75 | 90 | 225 | 390 |
| G | 210 | 100 | 50 | 360 |
| Total | 445 | 230 | 325 | 1000 |
Week Eleven
This week I've put together a handout on confidence intervals. You can download the Word Document or try the Java Version but note that for some reason (on the Java version) the square roots don't show up! So when you see r(1-r)/n imagine that there's a big square root thing around it!
Week Twelve
During our final week we talked about Ordinary Least Squares Regression and how we use Excel to estimate the parameters of a linear model. In this Excel Spreadsheet I've put in a minimal (and silly) number of observations on height (in inches) and age (in years) of members of my family. In the spreadsheet you can see examples of two regressions. The first one simply models height as a function of age: H = a + bA + e, where H represents height, A represents age, a is the intercept parameter we want to estimate and b is the slope parameter we want to estimate. In this model, e is the assumed error term. The output looks like this.
In the next regression we add a new (computed) variable which is age squared so our model is H = a + bA + cA^2 + e where A^2 is simply Age Squared.
The output includes the parameter estimates (under the coefficients column), the standard errors of the coefficients, t-Stats, P-values, and lower and upper 95% confidence limits. R Square and Adjusted R Square are statistics that tell us about how well the data are fitted. In our class I want you to be comfortable with interpreting point and interval estimates and how to assess goodness of fit. For example, in Regression 2, the estimated coefficient on age is positive -- 3.569. (In Regression 1 it is also positive, estimated at 0.758.) In Regression 2 our 95% confidence interval estimate for the age variable is [-7.59,14.73] which covers zero. The fact that this interval covers zero casts some doubt on the "statistical significance" of our estimate. So-called "statistically significant" intervals do not cover zero.
When comparing Regression 1 and Regression 2 we do notice a large difference in R Square and Adjusted R Square. The adjusted R Square in Regression 1 is 66% and in Regression 2 it is 94%. We conclude, then, that Regression 2 does fit the data better, with a much lower residual or error sum of squares.
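For anyone curious about what Excel is computing, here is a minimal Python sketch of the first regression only, H = a + bA + e, using the textbook least squares formulas. The ages and heights below are invented for illustration; they are not the family data in the spreadsheet, so the estimates will not match the numbers quoted above.

```python
def ols_simple(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, R Square, Adjusted R Square)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)               # total sum of squares
    r_square = 1 - sse / sst
    # One explanatory variable, so the adjustment uses n - 2 degrees of freedom.
    adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - 2)
    return intercept, slope, r_square, adj_r_square

# Hypothetical observations: age in years, height in inches.
ages = [1, 5, 12, 30, 45]
heights = [29, 42, 58, 68, 70]

a, b, r_square, adj_r_square = ols_simple(ages, heights)
print("intercept:", a, "slope:", b)
print("R Square:", r_square, "Adjusted R Square:", adj_r_square)
```

Adding age squared as a second explanatory variable, as in Regression 2, needs the matrix version of these formulas, which is where the linear algebra from earlier in the course comes back in.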
...to be continued