From the March 01, 2010 issue of Futures Magazine

Exploiting the home sales/lumber link

The U.S. Census Bureau publishes data on new home sales on roughly the 17th day of every month (www.census.gov/const/newressales). Data are presented both as seasonally adjusted and in raw form. (Periodically, the Department of Housing changes the seasonal adjustment, which has an effect on the reported number and may introduce a degree of error in this analysis.) The housing market, as represented by this data, is a major source of demand for the lumber market. It stands to reason, then, that the price of lumber is linked to new home sales.


Indeed, in looking at “New home sales & inventory”, which covers 2002 through the present, we’re vaguely reminded of the futures market represented in “Lumber prices”. Whether that correlation is strong enough for one to predict the other, however, will require more extensive analysis.



Today, both predictive directions are relevant — home sales as predictive of lumber prices and vice versa. It is, after all, possible to trade lumber prices or exchange-traded funds (ETFs) based upon the real estate index, home builders, real estate prices, etc. However, as traders, we’re only interested if that relationship is measurable and tradable. To begin such analysis, consider the following questions:

• Is there a correlation between sales of new homes and lumber prices?
• Is there a linear regression equation that could estimate lumber prices based upon new home sales with any confidence?
• Is there a time series adjustment that could project home sales — that is, can this month’s lumber prices predict next month’s home sales?
• Conversely, is there a time series adjustment that could project lumber prices — that is, can this month’s home sales project next month’s lumber price?

Before we embark on such analyses, we must consider whether there is a fundamental reason why this should work. Statisticians are aware that a correlation is nothing more than a comparison of two sets of data. A positive correlation indicates that as one set of data increases, the other tends to increase as well, all else being equal. It does not mean the first set causes the second set to increase; hence the statistical maxim “correlation does not imply causation.” All too often, traders make the mistake of confusing correlation with causation. If we have reason to conclude there may be a causal relationship between our data sets, it will give us more confidence in any predictive ability of our results.

In this case, the causality is obvious. Lumber’s primary use is in new home construction. If sales of new homes decrease, builders are less likely to begin new homes and the demand for lumber will decrease. Because it is assumed the supply of lumber is relatively constant, absent some global catastrophe, it logically follows that lumber prices are more dependent upon demand than supply factors.

THE EYE HAS IT
A good way to begin even complex data analysis is with simple graphs. The first challenge is to construct a time series overlay of lumber and home sales. The easiest way to compare the two is to normalize the data. We can do this by subtracting the mean of the respective data set from each data point and dividing the result by that data set’s standard deviation. This is shown in “Lumber vs. home sales: Normalized” (page 38).
A cursory inspection of this chart suggests some hypotheses worth testing: One, that there is a correlation between the two data sets and two, that lumber prices appear to lead home sales.
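The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the chart: the function name `zscores` is our own, the sample uses only six months from the data table, and we assume the population form of the standard deviation because the article does not say which form was used.

```python
from statistics import mean, pstdev

def zscores(values):
    """Return (value - mean) / standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]

# Feb-01 through Jul-01, copied from the data table at the end of the article.
home_sales = [936, 963, 939, 909, 885, 882]           # thousands of homes
lumber = [202.2, 230.5, 288.4, 319.7, 293.3, 314.1]   # $/TBF

home_z = zscores(home_sales)
lumber_z = zscores(lumber)

# Both series are now unitless and can share one chart axis.
for h, l in zip(home_z, lumber_z):
    print(f"{h:+.2f}  {l:+.2f}")
```

After this step each series has a mean of zero and a standard deviation of one, which is what allows the two to be overlaid.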



The data set consisted of 103 points: the monthly close of lumber prices from January 2001 through July 2009 and the corresponding seasonally adjusted new home sales data in the same period. The average lumber price was $278.37 with a standard deviation of $64.94. The average number of homes sold was 930,490 with a standard deviation of 283,490. The period analyzed is essentially one full economic cycle.
The correlation (Pearson) between the two data sets is 76.87%. A linear regression model for predicting lumber prices based upon home sales data is:

Lumber Price =
$114.522 + 0.17609 * Home Sales

This model has an F statistic of 145.84 and a p-value less than 0.001. Both the fitted constant and the home sales variable are significant. Interestingly, the model projects $114 per thousand board feet (TBF) as the minimum price for lumber, even if no homes sell.
If new home sales returned to 600,000 (entered in the equation as 600, because the data are in thousands), the model projects that lumber prices will close that month at about $220, with a 95% confidence range of $136 to $303. This has some predictive utility because the home sales report is issued about 13 days prior to the close of the month.
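As a worked check on the arithmetic, the sketch below evaluates the published line at 600 (thousand) homes and shows how such a line could be fitted by ordinary least squares. The function names are our own, and the six-month sample is only a subset of the data table, so its fitted coefficients should not be expected to match the published ones. The 95% range is not reproduced because the residual standard error is not given.

```python
def predict_lumber_close(home_sales_thousands):
    """Point estimate from the published line; home sales in thousands."""
    return 114.522 + 0.17609 * home_sales_thousands

def ols_fit(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

estimate = predict_lumber_close(600)
print(f"Estimate at 600,000 homes: ${estimate:.2f}")   # about $220

# Illustration only: a six-month subset (Feb-01 to Jul-01) of the data table.
home_sales = [936, 963, 939, 909, 885, 882]
lumber = [202.2, 230.5, 288.4, 319.7, 293.3, 314.1]
a, b = ols_fit(home_sales, lumber)
print(f"Subset fit: lumber = {a:.2f} + {b:.4f} * home sales")
```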

TIME TO TRADE
Having addressed our first two questions, we can turn our attention to the next two. We shall start by offsetting the lumber price by one month. That is, we will use June’s price to estimate July’s home sales. (This will result in a trivial change in the mean and standard deviation of the lumber data set, so we deem the two sets virtually equivalent and any correlation comparisons remain valid.)

The correlation falls slightly to 75.04%. However, if we force the constant to 0, so that projected home sales are 0 when lumber prices are 0 (which makes some logical sense), the correlation jumps to 98.14%. The linear regression equation for sales in month x+1 based upon lumber price in month x is:

Home Sales (x + 1) =
3.3846 * Lumber Price (x)

The model has an F statistic of 2,670 and a p-value less than 0.001 and is highly significant.
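The offset and the zero-constant fit can be sketched as follows. The helper names are our own and the seven-month sample is far too short to say anything about the full 103-month relationship. One point worth keeping in mind, which we state as general statistics rather than as a description of the software used here: when a regression has no constant, many packages report a fit statistic measured around zero instead of around the means, so it is not directly comparable with the ordinary Pearson figure, and that may account for much of the jump.

```python
from math import sqrt

def lagged_pairs(predictor, target, lag=1):
    """Pair predictor[t] with target[t + lag]; lag must be at least 1."""
    return list(zip(predictor[:-lag], target[lag:]))

def slope_through_origin(pairs):
    """Least-squares slope b for y = b * x, with the constant forced to 0."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)

def centered_r(pairs):
    """Ordinary Pearson correlation (deviations measured from the means)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

def uncentered_r(pairs):
    """Same ratio with deviations measured from zero (no-constant fits)."""
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    syy = sum(y * y for _, y in pairs)
    return sxy / sqrt(sxx * syy)

# Feb-01 through Aug-01 from the data table (illustration only).
lumber = [202.2, 230.5, 288.4, 319.7, 293.3, 314.1, 318.4]  # $/TBF
homes = [936, 963, 939, 909, 885, 882, 880]                 # thousands

pairs = lagged_pairs(lumber, homes, lag=1)   # lumber in x, homes in x + 1
slope = slope_through_origin(pairs)
r_centered = centered_r(pairs)
r_uncentered = uncentered_r(pairs)
print(f"slope through origin: {slope:.4f}")
print(f"centered r: {r_centered:.4f}, uncentered r: {r_uncentered:.4f}")
```

Because both series are large positive numbers, the uncentered figure sits close to 1 almost regardless of how the month-to-month changes line up.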
Now, let’s offset the home sales data, using June’s data to predict July’s lumber price. Recall that the actual data point released in July by the Census Bureau for new homes sold is June’s number. So, backing up July’s number by one month places June’s number in the month of June and brings the data more in line with reality.
The correlation between the two sets is 77.56%, if we fit a constant. If we force the constant to be fixed at the origin (0, $0), the correlation climbs to 98.37%. The two linear regression equations are:

Lumber Price (x + 1) =
$110.056 + 0.1804 * Homes (x)
F statistic: 150.2

or

Lumber Price (x + 1) =
0.28847 * Homes (x)
F statistic: 3,019

Assuming June 17’s home sales data release (end of May data) is 600,000, then the prediction for July’s closing price of lumber is $173 using the second model, with a 95% confidence range of $70 to $275.
The model improves even more if we offset by two months, allowing May’s data point to predict July’s closing price.

Lumber Price (x + 2) =
0.28840 * Homes (x)
F statistic: 3,316

A reading of 600,000 on May 17 (April’s sales) estimates a price of $173 with a slightly narrower range of $74 to $271.
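The point estimates quoted for a 600,000 reading follow directly from the three equations above. The short sketch below simply evaluates them. The function names are our own, home sales are entered in thousands, and the confidence ranges are not reproduced because they depend on residual statistics that are not given.

```python
def one_month_with_constant(homes_thousands):
    """Lumber Price (x + 1) with a fitted constant."""
    return 110.056 + 0.1804 * homes_thousands

def one_month_through_origin(homes_thousands):
    """Lumber Price (x + 1) with the constant forced to 0."""
    return 0.28847 * homes_thousands

def two_month_through_origin(homes_thousands):
    """Lumber Price (x + 2) with the constant forced to 0."""
    return 0.28840 * homes_thousands

reading = 600  # 600,000 homes, expressed in thousands
print(f"{one_month_with_constant(reading):.2f}")
print(f"{one_month_through_origin(reading):.2f}")   # about 173
print(f"{two_month_through_origin(reading):.2f}")   # about 173
```

Note that the model with a fitted constant gives a noticeably higher estimate at this reading than the two zero-constant models.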

Because we now see that we can use the point two months back, we can test the addition of two more variables to the model, such as the home sales figure from last month and the lumber price last month, both of which are also known to us. In other words, to project July’s lumber price, we shall use the home sales data from May and June and the lumber closing price from June.

Running various regression models, we find the following to produce the best adjusted correlation with all the data set forth:

Lumber Price (x + 1) =
0.83515 * Lumber Price (x) + 0.04435 * Homes (x-1)
F statistic: 5,389

Or, by way of example, on Oct. 17, 2005, the Census Bureau released the seasonally adjusted new home sales number as 1,346,000 for the end of September 2005, and the closing price of lumber on Oct. 31, 2005 was $310.50. Using the above model, we estimate the closing price of lumber for November to be $323.72 with a 95% confidence interval for price estimate ranging from $269 to $378. The actual closing price was $326.50. Our estimate proved to be less than $3 low. We also tested a three-month moving average of lumber price and found this was of no assistance to the model.

This model was developed in early August 2009. We can use the first available post-development monthly closing price, August’s close, as a quick test. In July 2009, the home sales data released on July 27 was 384,000 for June 2009; July’s closing lumber price was $196.10. Based on this, our model estimates a closing price for lumber in August 2009 of $181.29, with a 95% confidence range of $127 to $236. If we reduce the confidence level to 90%, the range contracts to $136 to $227.
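The two-variable model can be checked against the data table at the end of the article. In the sketch below the function name is our own. The inputs come from the table rows, where home sales are in thousands; the table lists 395 in its Jun-09 row, and that value, rather than the 384,000 quoted in the text, is what reproduces the $181.29 estimate.

```python
def project_next_close(lumber_close, homes_prev_month):
    """Lumber Price (x + 1) = 0.83515 * Lumber Price (x) + 0.04435 * Homes (x - 1)."""
    return 0.83515 * lumber_close + 0.04435 * homes_prev_month

# Mar-01 row: lumber close 230.5, with the Feb-01 home sales figure of 936.
mar_2001 = project_next_close(230.5, 936)

# Jul-09 row: lumber close 196.1, with the Jun-09 home sales figure of 395.
jul_2009 = project_next_close(196.1, 395)

print(f"Projection for Apr-01 close: {mar_2001:.4f}")   # table shows 234.0137
print(f"Projection for Aug-09 close: {jul_2009:.4f}")   # table shows 181.2912

# The table's Error column is the actual close minus the prior row's projection.
apr_2001_error = 288.4 - mar_2001
print(f"Apr-01 error: {apr_2001_error:.2f}")            # table shows 54.39
```

Read this way, each Projection entry in the table is the estimate for the following month's close.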

One way to exploit this information — supplemented by additional analysis that maximizes timing and risk management — would be with options. In the case of the August prediction, perhaps an option trader could elect to write a $225 call and a $135 put, and with a high degree of certainty both would expire worthless. If trading futures, a lumber trader could watch prices during August, and if they suddenly fell to $154, go long one contract with a reasonable likelihood of seeing prices climb back by $30 before month’s end, while assuming the downside risk was not more than another $18. Most would consider this an acceptable risk/reward ratio. For the record, either approach would have worked. Lumber closed August at $176.80, around $5 off our predicted price.
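The two trade ideas above reduce to simple arithmetic, sketched below with function names of our own. This is a deliberately stripped-down illustration: it ignores option premiums, commissions, margin and the possibility of early exercise, and it treats the futures trade as a fixed target and a fixed stop.

```python
def short_strangle_expires_worthless(close, put_strike, call_strike):
    """True if the close lies between the strikes, so both options expire worthless."""
    return put_strike <= close <= call_strike

def reward_to_risk(target_gain, max_loss):
    """Ratio of the hoped-for gain to the assumed downside."""
    return target_gain / max_loss

august_close = 176.80   # actual August 2009 close quoted in the text

both_worthless = short_strangle_expires_worthless(august_close, 135, 225)
print(f"Both written options expire worthless: {both_worthless}")

entry = 154.0
ratio = reward_to_risk(target_gain=30.0, max_loss=18.0)
print(f"Long at {entry}: target {entry + 30.0}, stop {entry - 18.0}")
print(f"Reward/risk ratio: {ratio:.2f}")
print(f"Gain if held to the close: {august_close - entry:.2f}")
```

The assumed stop of $136 coincides with the lower end of the 90% range, and the $184 target sits just above the model's $181.29 estimate.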

We set out to discover if there was a predictive relationship between home sales and lumber prices. Ultimately, we built a model incorporating the home sales data refined by the prior month’s lumber price to achieve a statistically significant regression equation for the following month’s close. The reader may wish to explore other similar variables, such as new home starts or average home sale prices as other potential independent variables.

Arthur M. Field has a Ph.D. in management science from Clemson and a J.D. from Rutgers. He is a former commodity broker and was co-editor of Fidelity’s Pacific Fund and in-house commodity fund. He wrote “The Magic Eight: The Only 8 Indicators You Ever Need to Make Millions.” E-mail him at TheMagicEight@hotmail.com

Underlying data for this piece:

Month  New home sales (000s)  Closing lumber ($/TBF)  Projection ($/TBF)  Error ($/TBF)
Jan-01 n/a 224.5
Feb-01 936 202.2
Mar-01 963 230.5 234.0137
Apr-01 939 288.4 283.5663 54.39
May-01 909 319.7 308.6421 36.13
Jun-01 885 293.3 285.2636 -15.34
Jul-01 882 314.1 301.5704 28.84
Aug-01 880 318.4 305.0285 16.83
Sep-01 866 242.5 241.5519 -62.53
Oct-01 853 225 226.3159 -16.55
Nov-01 871 222.9 223.9855 -3.42
Dec-01 924 245.1 243.3241 21.11
Jan-02 979 266.7 263.7139 23.38
Feb-02 876 311 303.1503 47.29
Mar-02 949 295.2 285.3869 -7.95
Apr-02 917 286.3 281.1916 0.91
May-02 916 281 275.3461 -0.19
Jun-02 981 287.5 280.7302 12.15
Jul-02 959 258.1 259.0596 -22.63
Aug-02 961 228.3 233.1964 -30.76
Sep-02 1025 218.6 225.1841 -14.6
Oct-02 1057 222.9 231.6137 -2.28
Nov-02 1005 236.5 244.3909 4.89
Dec-02 1022 217.7 226.3839 -26.69
Jan-03 1052 273.6 273.8227 47.22
Feb-03 1009 232.6 240.9121 -41.22
Mar-03 935 230.6 237.3347 -10.31
Apr-03 1008 233.5 236.4748 -3.83
May-03 1004 253.5 256.4153 17.03
Jun-03 1081 285.5 282.9627 29.08
Jul-03 1200 279.5 281.3668 -3.46
Aug-03 1145 349.5 345.1049 68.13
Sep-03 1190 309.5 309.2597 -35.6
Oct-03 1129 279.98 286.6018 -29.28
Nov-03 1149 313.8 312.1412 27.2
Dec-03 1111 312.6 312.026 0.46
Jan-04 1125 333 327.3778 20.97
Feb-04 1165 385 371.4265 57.62
Mar-04 1159 381.1 369.9434 9.67
Apr-04 1276 434.2 414.0238 64.26
May-04 1186 405.5 395.2439 -8.52
Jun-04 1241 376.5 367.0331 -18.74
Jul-04 1180 422.3 407.7222 55.27
Aug-04 1088 440.5 420.2166 32.78
Sep-04 1175 336.2 329.0302 -84.02
Oct-04 1214 319.5 318.9417 -9.53
Nov-04 1305 339.1 337.0403 20.16
Dec-04 1179 356.4 355.5242 19.36
Jan-05 1242 384.3 373.2368 28.78
Feb-05 1193 418.5 404.593 45.26
Mar-05 1252 400.7 387.5542 -3.89
Apr-05 1324 356.4 353.1737 -31.15
May-05 1270 369.5 367.3073 16.33
Jun-05 1311 325.3 327.9988 -42.01
Jul-05 1272 302.9 311.1098 -25.1
Aug-05 1367 297 304.4528 -14.11
Sep-05 1271 304.9 315.2637 0.45
Oct-05 1253 310.5 315.6829 -4.76
Nov-05 1346 326.5 328.247 10.82
Dec-05 1236 359 359.514 30.75
Jan-06 1259 355.8 351.963 -3.71
Feb-06 1173 330.5 331.8537 -21.46
Mar-06 1020 324.9 323.3628 -6.95
Apr-06 1142 344.7 333.1132 21.34
May-06 1198 301.3 302.2784 -31.81
Jun-06 1087 294.5 299.083 -7.78
Jul-06 1073 273 276.2044 -26.08
Aug-06 969 288.5 288.5283 12.3
Sep-06 1009 240.6 243.9122 -47.93
Oct-06 1004 243.9 248.4422 -0.01
Nov-06 952 266.6 267.1784 18.16
Dec-06 987 268 266.0414 0.82
Jan-07 1019 251.7 253.9807 -14.34
Feb-07 890 253.3 256.7361 -0.68
Mar-07 840 240.5 240.3251 -16.24
Apr-07 830 232.2 231.1758 -8.13
May-07 907 277.8 268.8152 46.62
Jun-07 861 279.8 273.9004 10.98
Jul-07 796 279.5 271.6098 5.6
Aug-07 702 261 253.2768 -10.61
Sep-07 694 248.7 238.8355 -4.58
Oct-07 723 228.5 221.6107 -10.34
Nov-07 629 254.4 244.5272 32.79
Dec-07 600 234.5 223.7388 -10.03
Jan-08 597 216.4 207.3365 -7.34
Feb-08 572 219.2 209.5418 11.86
Mar-08 513 222.1 210.855 12.56
Apr-08 542 210.3 198.3836 -0.56
May-08 514 246.3 229.7351 47.92
Jun-08 503 242 224.9022 12.26
Jul-08 500 253 233.601 28.1
Aug-08 444 252 232.6328 18.4
Sep-08 436 203.5 189.6444 -29.13
Oct-08 409 188.6 176.8459 -1.04
Nov-08 390 193.5 179.7407 16.65
Dec-08 374 169.4 158.7709 -10.34
Jan-09 329 148.1 140.2726 -10.67
Feb-09 354 147 137.3582 6.73
Mar-09 332 171.3 158.7611 33.94
Apr-09 345 161 149.1834 2.24
May-09 362 191.7 175.399 42.52
Jun-09 395 191 175.5684 15.6
Jul-09 433 196.1 181.2912 20.53