Linear Regression Analysis and Interpretation

Linear regression is an important tool in statistical analysis. It establishes the relationship between two or more variables, which are classified as dependent (response) and independent (predictor) variables. The dependent variable, which is continuous, is explained by the independent variables, which can be continuous (age), binary (male or female), or categorical (social status) (Schneider et al., 2010). Linear regression can also be either univariate or multivariate. Univariate linear regression involves the response variable and a single independent variable, while multivariate methods involve the response variable and two or more independent variables.

The properties of linear relationships give them diverse applications in fields ranging from medicine and business to banking. These properties and the relationships established are important because they help in forecasting future trends and in identifying prognoses and risks. The univariate linear relationship is of the form Y=B0+B1X, where Y represents the response variable, B0 is the y-intercept of the line, B1 is the slope of the line, and X is the independent variable. An X input in the regression line equation yields a corresponding Y output (Seber & Lee, 2012).


Various methods can be used to analyze a given data set through linear regression techniques. In this case, using the data set for customer traffic (X) and sales (Y) provided in the template, the main variable combinations required are XY, X^2, and Y^2. The product XY is computed by multiplying the number of customers in a particular month by the corresponding sales value. After calculating the XY values for all twelve months, their sum is found, which gives ∑XY equal to 1,122,727, as shown in the attached Excel sheet.

X^2, on the other hand, is found by squaring the number of customers for each of the twelve months, after which the sum over the months is taken to give ∑X^2, which totals 1,119,948. This is the sum of the squared monthly customer counts for the year, not the square of the annual total. Squaring each month's sales (in thousands of dollars) gives the values of Y^2 for the twelve months. Summing the monthly Y^2 values yields the required ∑Y^2, which equals 1,145,268.
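The column sums described above can be sketched in a few lines of Python. This is only an illustration: the three monthly values below are hypothetical placeholders (the twelve monthly figures live in the Excel sheet and are not reproduced here), and the function name `column_sums` is an assumption, not part of the original workbook.

```python
# Hypothetical three-month sample; NOT the essay's actual monthly data.
customers = [250, 300, 320]       # X: customers per month (illustrative)
sales = [270.0, 310.0, 325.0]     # Y: sales in thousands of dollars (illustrative)


def column_sums(x, y):
    """Return (sum of X*Y, sum of X^2, sum of Y^2) for paired observations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # multiply pairwise, then add
    sum_x2 = sum(xi ** 2 for xi in x)              # square each month, then add
    sum_y2 = sum(yi ** 2 for yi in y)
    return sum_xy, sum_x2, sum_y2


sum_xy, sum_x2, sum_y2 = column_sums(customers, sales)
print(sum_xy, sum_x2, sum_y2)
```

Note that each value is squared before summing, which is why ∑X^2 differs from the square of the annual total.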

The means of the two data sets, that is, the number of customers visiting the company every month and the month-to-month sales amounts, are needed to obtain the intercept, B0, and the slope, B1. The mean customer traffic for the twelve months is obtained by dividing the total number of customers for the entire year (∑X=3556) by twelve. This gives X-bar, which equals approximately 296.33. The same applies to the sales (Y) values, where the sum of the monthly sales (∑Y=3644) is divided by the total number of months to get the average sales per month for that year (Y-bar=303.67).

To find the slope of the regression line, B1, the formula B1 = (∑XY - n*X-bar*Y-bar) / (∑X^2 - n*(X-bar)^2) is entered into the formula bar of the Excel spreadsheet, while the y-intercept is found by applying the formula B0 = Y-bar - B1*X-bar. With the respective values of the slope and the intercept of the regression line, that is, 0.6480 and 111.65, the equation for the line becomes Y=111.65+0.6480X (Schneider et al., 2010).
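As a cross-check outside Excel, the same two formulas can be evaluated from the sums reported in the text (n=12, ∑X=3556, ∑Y=3644, ∑XY=1,122,727, ∑X^2=1,119,948). The sketch below is a minimal Python version; the function name `fit_line` is an assumption introduced for illustration.

```python
# Summary statistics as reported in the text.
n = 12
sum_x, sum_y = 3556, 3644
sum_xy, sum_x2 = 1_122_727, 1_119_948


def fit_line(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (b0, b1) for the least-squares line Y = b0 + b1*X."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    # Slope: B1 = (sum(XY) - n*X-bar*Y-bar) / (sum(X^2) - n*X-bar^2)
    b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    # Intercept: B0 = Y-bar - B1*X-bar
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(n, sum_x, sum_y, sum_xy, sum_x2)
print(f"Y = {b0:.2f} + {b1:.4f}X")
```

Rounded to the precision used in the text, the result should agree with the slope of 0.6480 and the intercept of 111.65.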

The data for the sales and the customer traffic were used to obtain a scatter plot with sales on the y-axis and the number of customers on the x-axis. A trend line was then fitted to the unconnected markers to get the line of best fit, which is the required linear regression line. The trend line equation, Y=111.65+0.648X, was then used to make predictions for the future. The number of customers given for each of the twelve months of year two was used to forecast the month-to-month sales. Each forecast is obtained by feeding the number of customers (X) in a particular month into the trend line equation to get the corresponding sales forecast (F(t)) for that month (Seber & Lee, 2012).
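The forecasting step amounts to evaluating the trend line at each month's customer count. A minimal Python sketch follows; the coefficients come from the text, while the customer counts in `year_two_customers` and the function name `forecast_sales` are hypothetical and serve only to show the mechanics.

```python
B0 = 111.65   # intercept from the trend line, in thousands of dollars
B1 = 0.648    # slope from the trend line, thousands of dollars per customer


def forecast_sales(customers):
    """Return forecasted sales F(t), in thousands of dollars, for a customer count."""
    return B0 + B1 * customers


# Hypothetical year-two customer counts; NOT the values from the template.
year_two_customers = [280, 300, 355]
forecasts = [round(forecast_sales(x), 2) for x in year_two_customers]
print(forecasts)
```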

With the actual data for the year-two sales provided, the variance for each month (here meaning the forecast error, not the statistical variance) was then obtained as the difference between the actual sales and the forecasted sales. From the variances obtained for the twelve months, it is evident that in most months the linear regression equation forecasted sales that were higher than the actual sales. This was the case in eight of the twelve months. In the remaining four months, March, April, September, and November, the actual sales exceeded the forecasted sales, with variances of $800, $43,000, $10,300, and $900 respectively.

In absolute terms, the difference between the actual and the forecasted sales for April was the largest variance ($43,000), while that of March was the smallest ($800). The actual sales in April were $384,000 compared to forecasted sales of $341,000. The mean monthly variance of -$6,800 implies that, on average, the actual sales were $6,800 lower than the forecasted sales in each of the twelve months. The average of the actual sales was obtained by summing the individual sales for each of the twelve months and dividing by the total number of months, in this case twelve. The average forecasted sales were obtained in a similar manner, by adding the forecasted sales for each of the twelve months and dividing the result by twelve. The mean variance for the twelve months was then obtained as the difference between the average of the actual sales and the average of the forecasted sales.
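The variance calculation can be sketched as follows. Only the first pair of values (April: actual 384, forecast 341, in thousands of dollars) comes from the text; the other two pairs are hypothetical, and the helper names `variances` and `mean` are assumptions. The sketch also shows why the mean variance can be obtained from the two averages: the mean of the differences equals the difference of the means.

```python
def variances(actual, forecast):
    """Return actual minus forecast for each month (positive means under-forecast)."""
    return [a - f for a, f in zip(actual, forecast)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# First pair is April from the text; the remaining pairs are hypothetical.
actual = [384.0, 300.0, 310.0]     # thousands of dollars
forecast = [341.0, 312.0, 325.0]   # thousands of dollars

diffs = variances(actual, forecast)
mean_variance = mean(diffs)

# Both routes to the mean variance give the same number.
print(mean_variance, mean(actual) - mean(forecast))
```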

The correlation coefficient for the data set was found to be 0.84735. It evaluates the degree of association between two data sets, in this case the sales and the number of customers, and represents the strength of the relationship between the two (Seber & Lee, 2012). The correlation coefficient of 0.84735 indicates a strong positive relationship between sales and customer traffic, which makes the corresponding linear regression equation a good fit.
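The reported coefficient can be reproduced from the sums given earlier in the text (∑X, ∑Y, ∑XY, ∑X^2, ∑Y^2) using the standard Pearson formula. The Python sketch below assumes that formula is the one behind the spreadsheet result; the function name `pearson_r` is introduced for illustration.

```python
import math

# Summary statistics as reported in the text.
n = 12
sum_x, sum_y = 3556, 3644
sum_xy, sum_x2, sum_y2 = 1_122_727, 1_119_948, 1_145_268


def pearson_r(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return the Pearson correlation coefficient from summary sums."""
    s_xy = sum_xy - sum_x * sum_y / n   # corrected sum of cross-products
    s_xx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares for X
    s_yy = sum_y2 - sum_y ** 2 / n      # corrected sum of squares for Y
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 5), round(r ** 2, 3))
```

Squaring r gives the share of the variation in sales that the line accounts for, roughly 0.72 for these figures.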


With a correlation coefficient of 0.85, the linear regression equation is thus reliable for making future sales predictions. ABC Furniture can therefore forecast sales for a particular future date, provided that the data for the corresponding number of customers (X) is available.

The regression equation, Y=111.65+0.648X, implies that each additional customer would increase sales by $648. If no customers visited the company, the predicted sales would be $111,650 (111.65 in thousands of dollars), which is the y-intercept of the linear regression equation (Schneider et al., 2010). ABC Furniture Company should therefore aim to increase customer traffic. There are various ways of doing this, including investing in sales promotion methods such as advertisements and discount offers, and in product quality improvements.


Schneider, A., Hommel, G., & Blettner, M. (2010). Linear regression analysis. Dtsch Ärztebl Int, 107, 776-782.

Seber, G. A., & Lee, A. J. (2012). Linear regression analysis (Vol. 936). John Wiley & Sons.