# A Novel Heterogeneous Ensemble Approach to Variable Selection For Gas-Liquid Two-Phase CO$$_2$$ Flow Metering
Sun, Caiying, Wang, Lijuan, Yan, Yong, Zhang, Wenbiao, Shao, Ding (2021) A Novel Heterogeneous Ensemble Approach to Variable Selection For Gas-Liquid Two-Phase CO$$_2$$ Flow Metering. International Journal of Greenhouse Gas Control, 110. Article Number 103418. ISSN 1750-5836. (doi:10.1016/j.ijggc.2021.103418) (KAR id:89452)
Variable selection is an important preprocessing step in the development of effective data-driven models for CO$$_2$$ flow measurement in carbon capture and storage systems. To quantify the importance of potential input variables to the desired output, ensemble learning is incorporated into the variable selection methodology. This paper presents a tree-based heterogeneous ensemble approach to variable selection and its application to gas-liquid two-phase CO$$_2$$ flow measurement. The importance of each variable is determined by combining the importance scores from four tree-based algorithms: decision tree regression, bootstrap aggregating of regression trees, gradient boosting decision tree and gradient boosting random forest. A backward elimination algorithm is then applied to remove the relatively less important variables, leaving a small set of input variables for the data-driven models. The selection results demonstrate that the significant variables for CO$$_2$$ mass flow measurement include apparent mass flow rate, time shift, differential pressure and pressure drop, while those for prediction of gas volume fraction include observed density, density drop, observed flow velocity and outlet temperature. To assess the validity of the selected variables, data-driven models based on gradient boosting random forest are developed. Results suggest that, with the selected variables as model inputs, the relative error of the model output is mostly within 1% for CO$$_2$$ mass flow rate measurement and 5% for gas volume fraction prediction.
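
The two-stage procedure described in the abstract (combine importance scores from several tree-based models, then eliminate the least important variables) can be illustrated with a minimal sketch. The abstract does not state how the scores are combined or what stopping rule the backward elimination uses, so this sketch assumes a simple mean of per-model scores normalized to sum to 1, and an assumed rule that stops once the model error rises more than a tolerance above the full-set error. The names `combine_importances`, `backward_elimination` and `toy_error`, the variables `x1` to `x4`, and all numbers are hypothetical placeholders, not data or code from the paper.

```python
from statistics import mean
from typing import Callable, Dict, List


def combine_importances(scores_by_model: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Scale each model's scores to sum to 1, then average them per variable.

    Assumption: an unweighted mean; the paper may combine scores differently.
    """
    variables = list(next(iter(scores_by_model.values())))
    normalized = []
    for scores in scores_by_model.values():
        total = sum(scores.values())
        normalized.append({v: scores[v] / total for v in variables})
    return {v: mean(n[v] for n in normalized) for v in variables}


def backward_elimination(
    combined: Dict[str, float],
    model_error: Callable[[List[str]], float],
    tolerance: float,
) -> List[str]:
    """Drop the least important variable repeatedly while the error stays
    within `tolerance` of the error obtained with all variables (assumed rule)."""
    selected = sorted(combined, key=combined.get, reverse=True)
    baseline = model_error(selected)
    while len(selected) > 1:
        candidate = selected[:-1]  # remove the lowest-ranked variable
        if model_error(candidate) > baseline + tolerance:
            break
        selected = candidate
    return selected


# Synthetic placeholder scores from four hypothetical models (not from the paper).
scores_by_model = {
    "model_a": {"x1": 0.50, "x2": 0.30, "x3": 0.15, "x4": 0.05},
    "model_b": {"x1": 5.0, "x2": 3.0, "x3": 1.0, "x4": 1.0},
    "model_c": {"x1": 0.40, "x2": 0.40, "x3": 0.10, "x4": 0.10},
    "model_d": {"x1": 0.60, "x2": 0.20, "x3": 0.15, "x4": 0.05},
}

# Hypothetical error penalty incurred when a variable is left out.
PENALTY = {"x1": 5.0, "x2": 2.0, "x3": 0.05, "x4": 0.01}


def toy_error(variables: List[str]) -> float:
    """Stand-in for retraining and validating a model on a variable subset."""
    return 1.0 + sum(p for v, p in PENALTY.items() if v not in variables)


combined = combine_importances(scores_by_model)
selected = backward_elimination(combined, toy_error, tolerance=0.1)
print(combined)
print(selected)
```

In practice the per-model scores would come from fitted regressors (for example, the `feature_importances_` attribute of scikit-learn tree ensembles), and `toy_error` would be replaced by a cross-validated error of the data-driven model retrained on each candidate subset.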