This blog is written to claim Problem Solving Work from Engineering Economy, Chapter 3 CostEstimation Techniques.
1. Problem Recognition, Definition and Evaluation
A switchroom is one of the big non electrical construction components in a substation. The switchroom cost is usually estimated using the following formula:
Switchroom estimate = a x [Area], where a= swichroom estimate unit rate
Here we try to find a formula to estimate a switchroom cost.
Unfortunately, the actual switchroom project cost is buried in the contract which makes it very hard to determine the cost base on the actual. But usually in the pricing schedule, it is very easy separate the switchroom cost from other part of the project. So, in this case, instead of using actual construction cost, for this paper, the cost from tender pricing schedule is used.
The data was gathered from several tenders, which happen in the same year. It is assume there is no impact on the data due because of escalation.
The switchroom sizes for each project are shown in Table 1:
Projects  Switchroom Size (m2) 
A 
254.92 
B 
295.92 
C 
250.03 
D 
630.78 
Table 1. Switchroom sizes
The tender submissions for projects are shown in Table 2
Cost $ (k)  Switchroom Size (m2) 
1,191 
254.92 
1,009 
254.92 
1,379 
254.92 
1,024 
254.92 
1,220 
295.63 
1,281 
295.63 
949 
250.03 
1,004 
250.03 
1,908 
630.78 
2,050 
630.78 
2,335 
630.78 
2,136 
630.78 
2,092 
630.78 
Table 2. Tender information for the switchroom
 Development of feasible alternatives and the outcomes
There are 2 methodologies that can be used to develop the formula
 Averaging
In this methodology, the tender submission price is divided by the size of the building, to determine the $/m2 value for the each submission. Then an average value of the overall is determined by averaging all the result. The standard deviation of this methodology is determined by comparing the expected/estimated price (calculated based on the overall average multiply by the building size) with the tender prices using the following formula:
σ = √[(Σ (y – ỹ)^{2}) / n]
The result is shown in Table 3.
Cost $ (k)  Building Area (m2)  Average Area/m2  Estimated Cost $ (k) 
1,191 
254.92 
4.672 
1,000 
1,009 
254.92 
3.958 
1,000 
1,379 
254.92 
5.409 
1,000 
1,024 
254.92 
4.017 
1,000 
1,220 
295.63 
4.127 
1,160 
1,281 
295.63 
4.333 
1,160 
949 
250.03 
3.795 
981 
1,004 
250.03 
4.015 
981 
1,908 
630.78 
3.025 
2,475 
2,050 
630.78 
3.250 
2,475 
2,335 
630.78 
3.702 
2,475 
2,136 
630.78 
3.386 
2,475 
2,092 
630.78 
3.317 
2,475 
General Average 
3.924 

Std Deviation 
275.033 
Table 3. Averaging result
The result is as per the following formula:
Switchroom estimate = 3.924 x [Area]
2. Linier Regression
In this methodology, the tender prices are plot against the building areas, and a regression analysis is performed to produce regression formula. The formula is as per the following arrangement:
Y = a x [Area] + b
Two approaches were used. The first one is forcing the regression line to pass the point of origin (0, 0), which represent by setting b = 0. With this arrangement the formula becomes:
Y = a x [Area]
This it can be seen, the above formula is similar to the result from the averaging method.
The second approach is by letting b = constant, as per common regression formula.
For this analysis, LINEST() function in Excel is used. The Excel graph Trendlines function cannot be used in this matter as it produces wrong Rsquare value when the constant is set to 0 ^{[2]}.
Plotting
Figure 1. Plotting switchroom area to tender prices
The regression results for both cases are shown at Table 4.
b=constant 
b=0 

X Variable 
2.66 
3.55 
Intercept 
429.17 
0.00 
R Square 
0.929 
0.981 
Standard Error 
143.62 
227.03 
Formula  2.66 x [Area] + 429.17  3.55 x [Area] 
Table 4. Regression result
The Predicted Y for b=constant is shown at Table 5, and Table 6 shows Predicted Y for b=0.
Building Area (m2) 
Predicted Y 
254.92 
1106.98 
254.92 
1106.98 
254.92 
1106.98 
254.92 
1106.98 
295.63 
1215.21 
295.63 
1215.21 
250.03 
1093.98 
250.03 
1093.98 
630.78 
2106.34 
630.78 
2106.34 
630.78 
2106.34 
630.78 
2106.34 
630.78 
2106.34 
Table 5. Predicted Y for b=constant
Building Area (m2) 
Predicted Y 
254.92 
903.77 
254.92 
903.77 
254.92 
903.77 
254.92 
903.77 
295.63 
1048.07 
295.63 
1048.07 
250.03 
886.44 
250.03 
886.44 
630.78 
2236.28 
630.78 
2236.28 
630.78 
2236.28 
630.78 
2236.28 
630.78 
2236.28 
Table 6. Predicted Y for b=0
3. Selection of Criterion
In regression analysis, coefficient of determination (a.k.a Rsquared) usually is used to determine how much the data fit to the formula. The closer Rsquared value to 1, the better the formula. But unfortunately the Rsquare value produces by Excel cannot be used to compare between these 3 equations, and to select which equation is more representing the data. It is because when a regression is set to start from the point of origin, the Rsquared is calculated based on the comparison of (Σ(ŷ)^{2}/ Σ(y)^{2}) instead of Σ(ŷ – ỹ)^{2}/ Σ(y – ỹ)^{2} , resulting the Rsquared this type of regression is always higher compare to Rsquared for regression with b=constant, despite the regression equation with b=constant has better connection to the data.
Because of that John PerezGarcia suggests using different formula to calculate Rsquared for regression formula which start from the point of origin^{[3]}. The formula is as follow:
Rsquared = 1 – (Σ(y – a x [Area])^{2}/ Σ(y – ỹ)^{2})
Using this equation, the Rsquared for the regression through the point of origin can be calculated. As the averaging methodology produces basically the same set of equation as the regression with b=0, then the above equation can be used to calculate the Rsquared value for the averaging methodology.
5. Analysis and Comparison of the Alternatives
The Rsquared for b=constant is 0.929. The Rsquared value for b=0 based on the above equation is 0.805, and the Rsquared for averaging is 0.690
6. Selection of Preferred Alternatives
As the Rsquared for regression equation with b=constant is the biggest compare to the Rsquared with b=0 and Rsquared for averaging method, then the regression equation with b=constant, should be used as switchroom cost estimating formula. The formula is as follow:
Switchroom cost = 2.66 x [Area] + 429.17
7. Performance monitoring and post evaluation result
The formula is developed based on limited number of project. As the formula surprisingly has quite high Rsquared, it gives confidence that the formula is quite good for this level of estimate. With more projects available is we can check and recalibrate the formula with new data.
Reference
 Carlberg, C.(Feb 12, 2013).Forcing the Constant in Regression to Zero: Understanding Excel’s LINEST() Error. Retrieved from: http://www.quepublishing.com/articles/article.aspx?p=2019170
 Microsoft Knowledge Base. (Jan 11, 2006). You will receive an incorrect Rsquared value in the chart tool in Excel 2003. Retrieved from: http://support.microsoft.com/kb/829249
 PerezGarcia, J. (2007). Lecture 20: Regression through the Origin and Its Pitfalls. In : QSCI 483: Statistical Inference in Applied Research – II Applied Linear Regression. Retrieved from: http://courses.washington.edu/qsci483/Lectures/20.pdf
 Sri, S. (Aug 3, 2010).6 common big Excel errors. Linear Regression Through The Origin, Excel 2000. Retrieved from: http://funwithexcel.blogspot.com.au/2010/08/linearregressionthroughoriginexcel.html
 Interpretationin Multiple Regression. Retrived from: http://www.stat.duke.edu/courses/Spring00/sta242/handouts/beesIII.pdf
 Oracle Crystal Ball Reference and Examples Guide. (n.d). Statistics, Standard Regression without Constant. Retrieved from: http://docs.oracle.com/cd/E17236_01/epm.1112/cb_statistical/frameset.htm?ch07s03s06s02.html
 Oracle Crystal Ball Reference and Examples Guide. (n.d). Statistics, Standard Regression with Constant. Retrieved from: http://docs.oracle.com/cd/E17236_01/epm.1112/cb_statistical/frameset.htm?ch07s03s06s01.html#predictor_stats_44
 STAT@MTSU. (n.d). Excel Interpretation. In Regression Tutorial. Retrieved from : http://mtweb.mtsu.edu/stats/regression/level3/multireg/useexcelinterp.htm

Mason, J. O. (Oct 2012). PERFORMING REGRESSION ANALYSIS USING MICROSOFT EXCEL. In International Journal of Arts and Commerce, Vol. 1 No. 5. Retrieved from: http://www.ijac.org.uk/images/frontImages/gallery/Vol._1_No._5/19.pdf

Sullivan, W. G., Wicks, E. M., & Koelling, C. P. (2012). Engineering Economy (15th Ed.) (pp.6796). New Jersey, United States: Prentice Hall.
AWESOME case study and I am truly amazed at the research you put into this!!
I didn’t realize there were two ways to calculate the R2 value, although I fully understand how starting at zero rather than starting at the first point will change the R2 calculations.
VERY well done!!!
And thank you for bringing to my attention the fact I missed a posting. Sorry about that!!
BR,
Dr. PDG, Jakarta