Introduction

Orange consumption and prices are determined by the interaction of supply and demand. In the short run, supplies are relatively fixed, and prices adjust so that products clear the market: what is produced is consumed. When supplies go up, the price goes down and consumers buy more. Conversely, smaller supplies bring higher prices and smaller purchases. In the long run, farmers adjust production in response to market prices, producing more of higher-priced goods and less of lower-priced goods.

Demand for fruit in the aggregate is not very responsive to price changes because there is little room for substitution between fruit and nonfruit goods in the consumer's budget. However, demand for an individual fruit is more responsive to prices, as consumers substitute among alternative fruit commodities. Rising incomes increase expenditures on more expensive fruits, as consumers demand more convenience and quality. Short-period changes in consumption mostly reflect changes in supply rather than changes in consumer tastes. Demographic factors, such as changes in household size and in the age distribution of the population, can also bring about changes in consumption. Consumers vote every day in the marketplace with their dollars, and the market listens carefully to their votes. There is continuous feedback from consumers, who respond to the offerings of marketers trying to meet the perceived wants of consumers.

To estimate orange consumption per capita, we used several independent variables that would help us find out what drives orange consumption per capita higher or lower. In 1999, orange consumption per capita was at its lowest level since 1976 (81.40 pounds per year). Watermelon, a substitute fruit, reached in 1999 its third highest consumption per capita since 1976 (15.8 pounds); consumers have been gradually increasing their consumption of watermelon since 1976.
Grape and pear consumption have also held near their averages of 46.89 and 6.89, respectively, and have not shown major changes in the last 23 years. These facts suggest that orange, as a major fruit, has been partly replaced by other types of fruit or food. We assume that in order to explain orange consumption per capita, we need to consider the prices of substitute products (in this case watermelon, grape and pear prices) and also a variable for average income per capita. With these elements, we will try to estimate orange consumption and determine whether these variables are significant in explaining orange consumption per capita in the U.S.

Data Sources

The data used to estimate this model were gathered from the USDA (United States Department of Agriculture) internet site, Vegetable and Fruit Yearbook; the United States Census Bureau internet site, Historical average income tables-people; and Data Bases-Specialty Agriculture, Fruit and Vegetables, ERS, USDA, National Agriculture Service. The total number of observations is 24 (from 1976 to 1999). The data can be found in Table 1.

Estimated Model

For the estimation of this model we used a multiple regression model with five independent variables to predict the dependent variable: orange consumption per capita. The model is stated as follows:

\[ Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 \]
\(Y_i\) = orange consumption per capita
\(X_1\) = orange price ($/lb)
\(X_2\) = watermelon price ($/lb)
\(X_3\) = grape price ($/lb)
\(X_4\) = pear price ($/lb)
\(X_5\) = personal average income (per capita)
\(\beta_0\) = Y intercept
\(\beta_1\) = slope of Y with respect to \(X_1\), holding \(X_2, X_3, X_4, X_5\) constant
\(\beta_2\) = slope of Y with respect to \(X_2\), holding \(X_1, X_3, X_4, X_5\) constant
\(\beta_3\) = slope of Y with respect to \(X_3\), holding \(X_1, X_2, X_4, X_5\) constant
\(\beta_4\) = slope of Y with respect to \(X_4\), holding \(X_1, X_2, X_3, X_5\) constant
\(\beta_5\) = slope of Y with respect to \(X_5\), holding \(X_1, X_2, X_3, X_4\) constant

In order to validate the model, the following indicators are used:
• Coefficient of determination (R-squared): measures the proportion of the variation in the dependent variable that is explained by the multiple regression equation.
• F-test: allows us to test the overall statistical significance of the regression equation.
• Test of significance for individual parameters (T-test): determines whether the individual right-hand variables have explanatory power.
• Correlation matrix: used to determine how strongly the independent variables are related to each other.
• Durbin-Watson test for autocorrelation: tests for and measures the presence of autocorrelation in the model.

The summary outputs of the regression analysis on the consumption of orange are located in Table 2.

Analysis of Orange Consumption

1.- R-squared (goodness of fit)

Again, this coefficient measures the proportion of the variation in the dependent variable that is explained by the multiple regression equation. For this analysis the R-squared obtained was .6222, which means that 62.22% of the variation in orange consumption per capita is explained by the independent variables (orange, watermelon, grape and pear prices, and income). This is a reasonably good fit if we take into account that the model includes only price variables and an income variable. Looking at it the other way, 37.78% of the variation in orange consumption per capita cannot be explained by the combination of the independent variables.

2.- Coefficients
The intercept (\(\beta_0\)) is 107.73, or about 108, which means that when all independent variables are equal to zero, orange consumption per capita is 108 pounds per year. The coefficient for orange price (\(\beta_1\)) is -42.79, meaning that if the orange price increases by one dollar per pound, orange consumption per capita will decrease by 42.79 pounds per year, holding the other variables constant. The coefficient for watermelon price (\(\beta_2\)) is .4623, which means that if the price of watermelon increases, orange consumption will also increase. The coefficient for grape price (\(\beta_3\)) is 13.18, meaning that if the price of grape increases by one dollar per pound, orange consumption per capita will increase by about 13 pounds per year. The coefficient for pear price (\(\beta_4\)) is -9.14, which means that if the price of pear increases, orange consumption per capita will decrease. The last coefficient, average income (\(\beta_5\)), is -0.00000623, which means that if average income increases, orange consumption per capita will decrease. The effect is almost negligible: average income would need to change by a huge amount to move orange consumption per capita noticeably.

3.- F-test

The F-test assesses whether the proportion of the variation in the dependent variable that is explained by the multiple regression equation is statistically significant. The null and alternative hypotheses are listed below:

H0: \(\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0\) (the independent variables do not affect orange consumption)
H1: at least one of the slope coefficients is not 0 (the regression is significant)

The F-statistic from our regression model was found to be 5.93. At the 5% level of significance (95% level of confidence), the upper critical value of the F distribution with 5 and 18 degrees of freedom is \(F_U = 2.77\). Since the F-statistic of 5.93 is greater than 2.77, we reject the null hypothesis that the independent variables do not affect orange consumption.
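As a cross-check, the reported F-statistic can be recovered from the R-squared alone, using the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)) for a regression with an intercept. The sketch below is only an illustration: the function name `f_statistic` is an assumption, while the inputs (R-squared of .6222, 24 observations, 5 independent variables, critical value 2.77) are taken from the text.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F-statistic of a regression with an intercept, from its R-squared."""
    df_model = n_predictors              # numerator degrees of freedom
    df_resid = n_obs - n_predictors - 1  # denominator degrees of freedom
    f_stat = (r_squared / df_model) / ((1.0 - r_squared) / df_resid)
    return f_stat, df_model, df_resid


# Values reported in the text
f_stat, df1, df2 = f_statistic(r_squared=0.6222, n_obs=24, n_predictors=5)

F_CRITICAL = 2.77  # upper critical value quoted in the text for (5, 18) df
reject_null = f_stat > F_CRITICAL

print(f"F = {f_stat:.2f} with ({df1}, {df2}) degrees of freedom")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The result agrees with the F-statistic of 5.93 and the degrees of freedom (5 and 18) used above.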
This means that at least one of the independent variables contributes significantly to orange consumption.

4.- T-test

The T-test determines whether the individual right-hand variables have explanatory power. To do that, we set the null and alternative hypotheses for each of the slopes of the independent variables:

T-test for the slope of the X1 variable (orange price): H0: \(\beta_1 = 0\), H1: \(\beta_1 \neq 0\)
T-test for the slope of the X2 variable (watermelon price): H0: \(\beta_2 = 0\), H1: \(\beta_2 \neq 0\)
T-test for the slope of the X3 variable (grape price): H0: \(\beta_3 = 0\), H1: \(\beta_3 \neq 0\)
T-test for the slope of the X4 variable (pear price): H0: \(\beta_4 = 0\), H1: \(\beta_4 \neq 0\)
T-test for the slope of the X5 variable (average income): H0: \(\beta_5 = 0\), H1: \(\beta_5 \neq 0\)

At the 95% level of confidence, the level of significance is .05, and because this is a two-tailed test it is split between the two tails. The critical value used with 18 degrees of freedom is 1.73, so t-critical = +/- 1.73. (This is the tabulated t value that leaves .05 in each tail at 18 degrees of freedom; the value that leaves .025 in each tail is 2.10.) The rejection region is anything to the right of the upper value (+1.73) and anything to the left of the lower value (-1.73). We also know that if the P-value is greater than our level of significance (.05), we fail to reject the null. The following table displays the results of the individual tests of the independent variables and the P-value for each coefficient.

Variable            T-statistic   P-value    Decision
Orange price        -2.851        0.010599   |ts| > tc, we reject the null at the 95% level of confidence
Watermelon price     0.44594      0.660956   |ts| < tc, we fail to reject the null at the 95% level of confidence
Grape price          1.7484       0.097413   |ts| > tc, we reject the null at the 95% level of confidence
Pear price          -2.3868       0.028178   |ts| > tc, we reject the null at the 95% level of confidence
Average income      -0.11106      0.912793   |ts| < tc, we fail to reject the null at the 95% level of confidence
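The decision rule applied in the table can be sketched in a few lines of Python. The t statistics, P-values, critical value (1.73) and level of significance (.05) are the ones reported in the text; the names `results` and `decide` are illustrative assumptions and not part of the original analysis.

```python
# (t statistic, P-value) for each coefficient, as reported in the table
results = {
    "Orange price": (-2.851, 0.010599),
    "Watermelon price": (0.44594, 0.660956),
    "Grape price": (1.7484, 0.097413),
    "Pear price": (-2.3868, 0.028178),
    "Average income": (-0.11106, 0.912793),
}

T_CRITICAL = 1.73  # critical value used in the text (18 degrees of freedom)
ALPHA = 0.05       # level of significance used for the P-value rule


def decide(t_stat, p_value, t_crit=T_CRITICAL, alpha=ALPHA):
    """Return (reject by critical value, reject by P-value) for one coefficient."""
    reject_by_t = abs(t_stat) > t_crit
    reject_by_p = p_value < alpha
    return reject_by_t, reject_by_p


decisions = {name: decide(t, p) for name, (t, p) in results.items()}

for name, (by_t, by_p) in decisions.items():
    print(f"{name:17s} reject by |t| > tc: {by_t!s:5s}  reject by P < alpha: {by_p}")
```

On these numbers the two rules agree for every variable except grape price, whose t statistic exceeds 1.73 while its P-value (.097) is above .05.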
From this table, we can now say that the prices of orange, grape and pear are significant in determining the variability in the consumption of orange, although grape price is only marginally so (its P-value of .097 is above .05). On the other hand, watermelon price and average income are not significant in determining the variability in orange consumption.

5.- Correlation Matrix

In the next table, we show the correlation among the independent variables. The correlation coefficient ranges from +1 to -1. A coefficient of zero or close to zero indicates no correlation, and a coefficient close to +1 or -1 indicates strong correlation.
                   Orange Price   Watermelon Price   Grape Price   Pear Price   Average Income
Orange Price        1
Watermelon Price   -0.4837         1
Grape Price         0.6755        -0.4378             1
Pear Price         -0.0369        -0.2122            -0.1764        1
Average Income      0.8424        -0.5525             0.7587        0.0906       1
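The correlation table can be screened mechanically for pairs that are strongly correlated. The sketch below uses the coefficients from the table and the +/- .70 rule of thumb that the discussion applies to them; the names `correlations` and `flag_strong_pairs` are illustrative assumptions.

```python
# Pairwise correlations among the independent variables, from the table
correlations = {
    ("Orange price", "Watermelon price"): -0.4837,
    ("Orange price", "Grape price"): 0.6755,
    ("Orange price", "Pear price"): -0.0369,
    ("Orange price", "Average income"): 0.8424,
    ("Watermelon price", "Grape price"): -0.4378,
    ("Watermelon price", "Pear price"): -0.2122,
    ("Watermelon price", "Average income"): -0.5525,
    ("Grape price", "Pear price"): -0.1764,
    ("Grape price", "Average income"): 0.7587,
    ("Pear price", "Average income"): 0.0906,
}

THRESHOLD = 0.70  # rule-of-thumb limit on |r| among independent variables


def flag_strong_pairs(corr, threshold=THRESHOLD):
    """Return the pairs with |r| above the threshold, strongest first."""
    strong = [pair for pair, r in corr.items() if abs(r) > threshold]
    return sorted(strong, key=lambda pair: abs(corr[pair]), reverse=True)


strong_pairs = flag_strong_pairs(correlations)
for pair in strong_pairs:
    print(f"{pair[0]} / {pair[1]}: r = {correlations[pair]}")
```

Both flagged pairs involve average income, which is why that variable is the candidate for removal.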
Based on the correlation table, the correlations among the independent variables are:
• Watermelon and orange prices are negatively correlated (-.4837); the negative sign means that when the price of watermelon increases, we would expect the price of orange to decrease.
• Grape and orange prices are positively correlated (.6755); when the price of orange increases, we would expect the price of grape to increase as well.
• Pear and orange prices are negatively but very weakly correlated (-.0369); the coefficient is close to zero, so the two prices move almost independently.
• Average income and orange price show the strongest correlation (.8424); when the orange price increases, average income tends to increase as well.
• Watermelon and grape prices are negatively correlated (-.4378); when the price of watermelon increases, the price of grape tends to decrease.
• Watermelon and pear prices are also negatively correlated (-.2122); when the price of watermelon increases, we would expect the price of pear to decrease.
• Watermelon price and average income are negatively correlated (-.5525); when the price of watermelon increases, we would expect average income to decrease.
• Grape and pear prices are negatively correlated (-.1764); when the price of grape goes up, we would expect the price of pear to go down.
• Grape price and average income are highly correlated (.7587); when the price of grape increases, we would expect average income to increase as well.
• Pear price and average income are positively but weakly correlated (.0906); when the price of pear increases, average income tends to increase slightly.

A common rule of thumb is that correlations among the independent variables between -.70 and .70 do not cause problems (Lind, Douglas & Mason, Robert, Basic Statistics). Based on this rule, the strongest correlations are between average income and orange price (.84) and between average income and grape price (.76). Therefore, we would need to eliminate the variable that causes the correlation to be strong: average income.

6.- Elasticity

The elasticity for orange price was found to be -.33, which means that if the price of orange increases by 1%, orange consumption per capita would decrease by .33%. The elasticity for watermelon price was .033, which tells us that if the price of watermelon increases by 1%, orange consumption would increase by .033%. The elasticity for grape price was .24: for a 1% increase in the price of grape, we would expect orange consumption to increase by .24%. The elasticity for pear price was -.14; thus, if the price of pear increases by 1%, we would expect orange consumption to decrease by .14%. Finally, if average income increases by 1%, we would expect orange consumption to decrease by 0.01%. Values are shown in Table 1.

7.- Durbin-Watson Test

Whenever we use regression analysis we should test for autocorrelation to ensure that the errors are independent. The test statistic used is the Durbin-Watson statistic.
There are two critical values associated with the Durbin-Watson test, a lower value dL and an upper value dU. These values depend on the number of observations (n), the number of slopes we are estimating (p), and the level of significance. If the calculated D-statistic is less than the lower critical value dL, there is evidence of autocorrelation among the residuals, and we reject the null hypothesis that the residuals are not correlated. If the D-statistic is between dL and dU, there is no definite conclusion regarding autocorrelation; in such cases we treat the OLS regression as valid because there is not enough evidence to prove otherwise. If the D-statistic is above dU, there is no evidence of positive autocorrelation and the null hypothesis is not rejected; in such cases, the OLS regression results are valid.

Test Hypothesis
H0: The residuals are not autocorrelated
H1: The residuals are positively or negatively autocorrelated

Table 5. Durbin-Watson
Durbin-Watson = 1.76
D lower = .93
D upper = 1.90

DECISION: Since the DW statistic falls between D lower and D upper, the test is inconclusive. We fail to reject the null hypothesis and conclude that our OLS model is valid to interpret orange consumption in the U.S.
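The test described above can be sketched as follows. The statistic is the usual ratio of the sum of squared successive differences of the residuals to the sum of squared residuals. The function names are illustrative assumptions, the short residual series is made up purely to exercise the formula (the regression residuals are not reproduced in this report), and the upper-side bounds 4 - dU and 4 - dL for negative autocorrelation are the standard extension of the rule rather than something stated in the text.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic against the critical bounds."""
    if d < d_lower:
        return "evidence of positive autocorrelation"
    if d > 4 - d_lower:
        return "evidence of negative autocorrelation"
    if d_upper < d < 4 - d_upper:
        return "no evidence of autocorrelation"
    return "inconclusive"


# Values reported in the text: DW = 1.76, D lower = .93, D upper = 1.90
decision = dw_decision(1.76, d_lower=0.93, d_upper=1.90)
print(decision)

# Made-up residuals, only to show the formula at work (not the report's data)
toy_dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(toy_dw)
```

With the reported values the statistic lies between D lower and D upper, so the sketch returns the same inconclusive result as the decision above.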
Conclusion

Now that we have performed all these statistical tests in our study, we can conclude that the prices of pears, watermelons, grapes and oranges and personal average income can be used in predicting orange consumption in the U.S. We base this conclusion on the statistical tests performed, where we obtained a goodness of fit of .622, which tells us that 62.22% of the variability of orange consumption is explained by our five independent variables. The results from the T-test turned out to be positive in terms of the validity of the model. We found that the prices of orange, grape and pear are significant enough to determine the variability in orange consumption. Although the price of watermelon and average income are not significant in this model, we still believe this model is valid. We also performed a correlation analysis that showed us the correlation among the independent variables. We found that the strongest correlations were between personal average income and orange price (.84) and between personal average income and grape price (.76), and concluded that we needed to eliminate the variable that causes the correlation to be strong: personal average income. In the F-test, we tested for the significance of the entire multiple regression model and rejected the null hypothesis that the independent variables do not affect orange consumption in the U.S. We conclude that the model is valid and that at least one of the independent variables contributes significantly to orange consumption in the U.S.
In addition to these tests, we also tested for autocorrelation among the errors to ensure they are independent (Durbin-Watson test). In this test, we found that our Durbin-Watson statistic was 1.76, with D lower = .93 and D upper = 1.90. Since our DW statistic was between these two values, there is no definite conclusion regarding autocorrelation, and we treat our OLS regression as valid because there is not enough evidence to prove otherwise. Finally, based on these results we can conclude that orange consumption in the U.S. can be predicted using the multiple regression model and our independent variables (personal average income perhaps less strongly than the others). Our findings for the F-test, T-test, correlation analysis and the Durbin-Watson test give enough evidence to support this conclusion.

Orange Consumption in the U.S.