A Novel Ensemble Linear Regression Scheme Based on Incorporation of the Linear Model Strength Contributed by Each Data Point Co-ordinate Using a Special Weighted Error Metrics Ensembling Scheme Spanning Error Metric Types and Considered Linear Regression Models – An Example of Construction Accident (Fatal Falls) Prediction


Published: 2023-04-01

Page: 84-106


Rahul Konda

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India.

V. S. S. Kumar

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India and NITTR Chennai, Ministry of HRD, Government of India, India.

Ramesh Chandra Bagadi

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India and Ramesh Bagadi Consulting LLC (R042752), Madison, 53715, Wisconsin, USA.

N. Suresh Kumar

Department of Civil Engineering, Osmania University, Hyderabad, Telangana, India.

*Author to whom correspondence should be addressed.


Abstract

This study presents a novel ensemble linear regression method that accounts for the contribution of each data point co-ordinate to the strength of the linear model. To illustrate it, a forecast of Construction Accidents (Fatal Falls) computed according to the proposed scheme is provided. For the given data points of interest, we first fit a linear regression model and record its R-squared value. Then, for each co-ordinate of the parameter of concern (at a given independent-variable co-ordinate), we leave that co-ordinate out, fit a linear regression model to the remaining co-ordinates, compute its forecast, and record its R-squared value. Each independent-variable co-ordinate is then assigned a Linear Regression Model Strength coefficient equal to 1 minus the difference between the R-squared value of the leave-one-out model and the R-squared value of the full model fitted to all the data point co-ordinates. Next, for each of the linear regression models so obtained, we consider nine error metrics, namely MAPE, MAE, MSE, MAD, RMSE, R-squared, adjusted R-squared, p-value, and bias; we range-normalize these values into the open interval (0, 1) and apply a special weighted formula to arrive at a weight coefficient spanning all nine error metrics. Finally, a special weighted ensembling scheme yields the one-step forecast, incorporating the effect of all nine error metrics across all the considered linear models.
The standard linear regression based prediction method forecasts the 2018 Construction Accidents – Fatal Falls figure as 421, whereas our proposed model forecasts 300.7881103 ≈ 301, a markedly better prediction than that of the standard linear regression based scheme.
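The leave-one-out strength coefficients and the error-metric-weighted ensemble forecast described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are our own, a single metric (RMSE) stands in for the paper's nine-metric weighting, and the "special weighted formula" itself is not reproduced here.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least-squares fit y = a*x + b; returns (a, b)."""
    a, b = np.polyfit(x, y, 1)
    return a, b

def r_squared(x, y, a, b):
    """Coefficient of determination for the fitted line."""
    resid = y - (a * x + b)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def loo_strengths(x, y):
    """Strength coefficient per data point co-ordinate:
    1 - (R^2 of the leave-one-out model - R^2 of the full model)."""
    a, b = fit_line(x, y)
    r2_full = r_squared(x, y, a, b)
    strengths = []
    for i in range(len(x)):
        xi, yi = np.delete(x, i), np.delete(y, i)
        ai, bi = fit_line(xi, yi)
        strengths.append(1.0 - (r_squared(xi, yi, ai, bi) - r2_full))
    return np.array(strengths)

def weighted_one_step_forecast(x, y, x_next):
    """Ensemble one-step forecast: each leave-one-out model forecasts
    x_next; models are weighted by range-normalized RMSE (lower error
    receives higher weight), normalization kept inside (0, 1)."""
    forecasts, rmses = [], []
    for i in range(len(x)):
        xi, yi = np.delete(x, i), np.delete(y, i)
        a, b = fit_line(xi, yi)
        forecasts.append(a * x_next + b)
        rmses.append(np.sqrt(np.mean((yi - (a * xi + b)) ** 2)))
    rmses = np.array(rmses)
    eps = 1e-9  # keeps the normalized values strictly inside (0, 1)
    norm = (rmses - rmses.min() + eps) / (rmses.max() - rmses.min() + 2 * eps)
    weights = 1.0 - norm          # lower error -> larger weight
    weights /= weights.sum()      # weights sum to 1
    return float(np.dot(weights, forecasts))
```

On exactly linear data every leave-one-out model coincides with the full model, so each strength coefficient comes out as 1 and the ensemble forecast matches the ordinary regression forecast; the scheme only departs from plain linear regression when individual points perturb the fit.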

Keywords: Linear regression, ensemble linear regression


How to Cite

Konda, R., Kumar, V. S. S., Bagadi, R. C., & Kumar, N. S. (2023). A Novel Ensemble Linear Regression Scheme Based on Incorporation of the Linear Model Strength Contributed by Each Data Point Co-ordinate Using a Special Weighted Error Metrics Ensembling Scheme Spanning Error Metric Types and Considered Linear Regression Models – An Example of Construction Accident (Fatal Falls) Prediction. Asian Research Journal of Current Science, 5(1), 84–106. Retrieved from https://globalpresshub.com/index.php/ARJOCS/article/view/1787

