## A Novel Ensemble Linear Regression Scheme Based On Incorporation of the Linear Model Strength Contributed by Each Data Point Co-ordinate – An Example of Construction Accident (Fatal Falls) Prediction

Rahul Konda *

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India.

Ramesh Chandra Bagadi

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India and Ramesh Bagadi Consulting LLC (R042752), Madison, 53715, Wisconsin, USA.

V. S. S. Kumar

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India and NITTR Chennai, Ministry of HRD, Government of India.

Suresh Kumar N.

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India.

*Author to whom correspondence should be addressed.

### Abstract

The objective of this research investigation is to present a novel ensemble linear regression scheme based on incorporation of the linear model strength contributed by each data point co-ordinate. An example computing construction accident (fatal falls) forecasts in the proposed fashion is presented to this end. First, we fit a linear regression model to all the given data points of concern and note its R^{2} value. For each data point co-ordinate, the method then fits a leave-one-out linear regression model to the remaining co-ordinates that the parameter takes on, and notes its R^{2} value. We then ascribe to each independent variable co-ordinate a linear regression model strength coefficient, defined as 1 minus the ratio of (the change in R^{2} produced by omitting that co-ordinate, relative to the R^{2} of the model fitted on all the data point co-ordinates) to (the R^{2} of the model fitted on all the data point co-ordinates). Each dependent variable value forecast by the full model is then scaled by the strength coefficient of its independent variable co-ordinate. This gives a new set of dependent variable co-ordinates for the given independent variable co-ordinates of concern, for which we finally generate the general linear regression model. We note that the R^{2} value of the ensemble regression model fashioned in this way is higher than that of the ordinary general linear regression model. In this study, the authors have used adjusted R^{2} values for this analysis.
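The steps described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the authors' original code: it assumes simple one-variable data, uses plain R^{2} rather than the adjusted R^{2} the study employed, and all function names and sample values are hypothetical.

```python
# Sketch of the ensemble linear regression scheme described above
# (illustrative names and data; one independent variable assumed).

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line."""
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot

def ensemble_fit(xs, ys):
    # Step 1: full model on all data points and its R^2.
    slope, intercept = fit_line(xs, ys)
    r2_full = r_squared(xs, ys, slope, intercept)

    # Step 2: leave-one-out models; strength coefficient per co-ordinate
    #         = 1 - (change in R^2 on omitting it) / (full-model R^2).
    strengths = []
    for i in range(len(xs)):
        xs_i, ys_i = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        s_i, b_i = fit_line(xs_i, ys_i)
        r2_i = r_squared(xs_i, ys_i, s_i, b_i)
        strengths.append(1.0 - abs(r2_full - r2_i) / r2_full)

    # Step 3: scale each full-model forecast by its strength coefficient,
    #         then fit the final model to the scaled forecasts.
    scaled = [w * (slope * x + intercept) for w, x in zip(strengths, xs)]
    return fit_line(xs, scaled)
```

Points whose omission barely shifts R^{2} receive a strength coefficient near 1 and keep their full-model forecast almost unchanged, while influential points are down-weighted before the final fit.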

Keywords: Linear regression, ensemble linear regression

**How to Cite**

Konda, R., Bagadi, R. C., Kumar, V. S. S., & Suresh Kumar, N. A novel ensemble linear regression scheme based on incorporation of the linear model strength contributed by each data point co-ordinate – An example of construction accident (fatal falls) prediction. *Asian Research Journal of Current Science*, *5*(1), 12–19. Retrieved from https://globalpresshub.com/index.php/ARJOCS/article/view/1766
