Gas-Liquid Two-Phase Flow Measurement Using Coriolis Flowmeters Incorporating Artificial Neural Network, Support Vector Machine, and Genetic Programming Algorithms

Coriolis flowmeters are well established for the mass flow measurement of single-phase flow with high accuracy. In recent years, attempts have been made to apply Coriolis flowmeters to measure two-phase flow. This paper presents data driven models that are incorporated into Coriolis flowmeters to measure both the liquid mass flowrate and the gas volume fraction of a two-phase flow mixture. Experimental work was conducted on a purpose-built two-phase flow test rig on both horizontal and vertical pipelines for a liquid mass flowrate ranging from 700 to 14500 kg/h and a gas volume fraction between 0% and 30%. Artificial neural network (ANN), support vector machine (SVM), and genetic programming (GP) models are established through training with the experimental data. The performance of backpropagation-ANN (BP-ANN), radial basis function-ANN (RBF-ANN), SVM, and GP models is assessed and compared. Experimental results suggest that the SVM models are superior to the BP-ANN, RBF-ANN, and GP models for two-phase flow measurement in terms of robustness and accuracy. For liquid mass flowrate measurement with the SVM models, 93.49% of the experimental data yield a relative error less than ±1% on the horizontal pipeline, while 96.17% of the results are within ±1% on the vertical installation. The SVM models predict the gas volume fraction with a relative error less than ±10% for 93.10% and 94.25% of the test conditions on the horizontal and vertical installations, respectively.

flowrate of a two-phase mixture is challenging in industry. Significant research based on traditional flowmeters for two-phase flow measurement has been conducted, such as Venturi, V-cone, turbine, vortex, and slotted orifice meters [1]-[3]. The determination of the gas volume fraction of two-phase flow is crucial for the optimization of some industrial processes. Resistive sensors, capacitive sensors, electrical capacitance tomography, electrical resistance tomography, and microwave probes have been proposed for the phase fraction measurement of two-phase flow [4]-[6]. These techniques are often referred to as direct methods, since the systems are designed to measure the desired two-phase flow characteristics directly. Due to the difficult nature of two-phase flow and the complexity of the sensing systems, such direct two-phase flowmeters have achieved limited success in industry.
Indirect techniques based on traditional sensors incorporating soft-computing algorithms, such as artificial neural network (ANN), support vector machine (SVM), least-squares SVM, and extreme learning machine together with genetic algorithms or particle swarm optimization, have also been applied to two-phase or multiphase flow measurement or flow regime identification [7]-[10]. Coriolis flowmeters, as one of the most accurate single-phase mass flowmeters, have been successfully applied to a range of industrial applications. In recent years, many researchers have attempted to use Coriolis flowmeters for two-phase or multiphase flow measurement [11]. However, despite recent progress in sensor and transmitter technologies, improving the accuracy of mass flow metering of liquid with entrained gas remains a challenge. A bubble effect model was proposed to study gas-liquid two-phase flow for Coriolis flowmeters [12], but it cannot deal with positive errors in the mass flow measurement. Subsequently, Liu et al. [13] used a neural network to correct mass flow errors in a Coriolis mass flowmeter; the work was based on a horizontal flow tube, and the flowrate was limited to 1.5-3.6 kg/s. The multilayer perceptron and radial basis function (RBF) networks take four inputs, i.e., temperature, damping, density drop, and flowrate, to estimate mass flow errors. Although most of the mass flow errors were reduced to within ±2%, the gas entrainment was not quantified and different installation conditions were not considered. A method based on fuzzy inference was proposed to correct the mass flow errors of a Coriolis flowmeter for the measurement of two-phase flow [14]. The fuzzy system accepts damping, drop in density, and apparent mass flowrate as inputs to generate the corrected mass flowrate.
Lari and Shabaninia [15] applied a neuro-fuzzy algorithm to the error correction of a Coriolis mass flowmeter for air-water two-phase flow measurement. However, the experimental data and the results were not explained in detail in [14] and [15]. Hou et al. [16] developed a digital Coriolis flow transmitter and tested a commercial Coriolis flowmeter. The measurement errors observed under gas-liquid two-phase flow conditions were corrected using a feedforward neural network with two inputs: apparent liquid mass flowrate and apparent drop in density. Xing et al. [17] applied a Coriolis flowmeter in combination with an ultrasonic flowmeter to measure the individual mass flowrates of gas-liquid two-phase flow under low liquid loading. The root-mean-square errors of gas and liquid mass flowrates were 3.09% and 12.78%, respectively. Ma et al. [18] used a 25-mm bore Coriolis flowmeter together with SVM algorithms to measure the overall mass flowrate of oil-water two-phase flow and achieved relative errors within ±1%. The mass flowrate of each individual phase was obtained with a maximum error of ±8%. However, it is known that the gas entrained in a liquid flow significantly affects the performance of Coriolis flowmeters, especially under different flow regimes [11]. Moreover, very little research has been undertaken to date to predict the gas volume fraction from the outputs of a Coriolis flowmeter.

This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
Owing to the good reproducibility of the measurement errors of Coriolis flowmeters under two-phase flow conditions, data driven models, such as ANN, SVM, and genetic programming (GP), have the potential to correct the liquid mass flowrate and predict the gas volume fraction. In this paper, experimental work was undertaken on a purpose-built 1-in (25 mm) bore air-water two-phase flow test rig. Coriolis flowmeters (KROHNE OPTIMASS 6400 S25) in conjunction with differential-pressure (DP) transducers were applied to obtain the liquid mass flowrate and gas volume fraction on both the horizontal and vertical pipes. Parametric dependence along with input variable selection for the data driven models is investigated based on the partial mutual information (PMI) algorithm [19], [20]. Four data driven models based on backpropagation-ANN (BP-ANN), RBF-ANN, SVM, and GP, respectively, are established and validated through training and testing with the experimental data. The performances of the four models are evaluated and compared in terms of robustness and accuracy. The basic principle of BP-ANN modeling with some preliminary results was reported at the 2016 IEEE International Instrumentation and Measurement Technology Conference [21]. This paper presents in detail the principles, structures, training, and performance comparisons of the BP-ANN, RBF-ANN, SVM, and GP models.

A. Overall Measurement Strategy
ANN, SVM, and GP are common data driven models for modeling a nonlinear system with multiple inputs and outputs [22]-[26]. These techniques learn from historical data and given examples by constructing an input-output mapping in order to estimate the desired outputs. Fig. 1 shows the principle and structure of the measurement system. The data driven models accept variables from a Coriolis flowmeter and a DP transducer, while the output gives the corrected mass flowrate or the predicted gas volume fraction. The analysis of parametric dependence and input variable selection for the data driven models based on the experimental data is presented in Section III-C. Since the volume of data is often limited in practice, it is appropriate to design a separate model for each desired output. The structure of each data driven model based on ANN, SVM, and GP is explained in detail in Sections II-B to II-E.

B. BP-ANN
BP-ANN is a multilayer feedforward neural network trained with a BP learning algorithm, which is one of the most common neural networks. A BP-ANN consists of an input layer, one or more hidden layers, and an output layer. The hidden layer connects the input and output layers and represents their quantitative relationship. In general, a neural network with a single-hidden layer of sufficient neurons is able to represent any nonlinear problem. In consideration of the simplicity of the ANN structure, a single-hidden layer is chosen and investigated in this paper.
As shown in Fig. 2, $x = [x_1, x_2, \ldots, x_n]^T$ is an input sample and $y$ is the desired output. Assuming that $y$ is the linear output of the hidden neurons and that a transfer function $f(x)$ is used on the neurons, the ANN is modeled as

$$y = \sum_{j=1}^{L} \omega_j f\left(\sum_{i=1}^{n} \omega_{ij} x_i + a_j\right) + b \tag{1}$$

where $n$ and $L$ are the numbers of input variables and hidden nodes. $\omega_j$ is the weight connecting the $j$th hidden node and the output node, and $\omega_{ij}$ is the weight connecting the $i$th input node to the $j$th hidden node. $a_j$ and $b$ are the biases on the $j$th hidden node and the output node. In this paper, the hyperbolic tangent sigmoid function is used as the transfer function on the hidden neurons and is given by

$$f(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{2}$$

The learning algorithm is a procedure that adjusts the weights and biases of a network to minimize an error function between the network output and the desired output for a given set of inputs. The BP algorithm has been widely applied to solve practical problems. However, the BP algorithm has the disadvantages of slow convergence and long training time. In addition, the success of the BP algorithm depends on user-dependent parameters, such as the initialization and structure of the ANN.
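The single-hidden-layer model described above can be illustrated with a short forward pass. This is a minimal sketch rather than the authors' implementation: the function name `bp_ann_output`, the argument layout, and all numerical values are assumptions made for illustration, and the weights are simply given instead of being trained.

```python
import math

def bp_ann_output(x, w_in, a, w_out, b):
    """Output of a single-hidden-layer network with tanh hidden nodes.

    x     : list of n inputs
    w_in  : w_in[i][j] is the weight from input i to hidden node j
    a     : list of L hidden-node biases
    w_out : list of L weights from the hidden nodes to the output node
    b     : bias on the output node
    """
    y = b
    for j in range(len(w_out)):
        # Weighted sum of the inputs arriving at hidden node j, plus its bias.
        s = sum(w_in[i][j] * x[i] for i in range(len(x))) + a[j]
        # Hyperbolic tangent sigmoid transfer function on the hidden node,
        # followed by the linear output combination.
        y += w_out[j] * math.tanh(s)
    return y

# Hypothetical example: n = 2 inputs, L = 2 hidden nodes, made-up weights.
x = [0.5, -1.0]
w_in = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0]
w_out = [2.0, -1.0]
b = 0.1
y = bp_ann_output(x, w_in, a, w_out, b)
print(y)
```

The hyperbolic tangent sigmoid written as 2/(1 + exp(-2x)) - 1 is algebraically identical to tanh(x), which is why `math.tanh` is used directly here.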

C. RBF-ANN
RBF-ANN has a fixed three-layer structure (Fig. 3) and uses a type of RBF as the activation function of the hidden nodes. The output of the network is a linear combination of RBFs of the inputs and neuron parameters. The RBF measures the distance between the input vectors and the weight vectors and is typically taken to be the Gaussian function. Thus, the output of the network is given by

$$y = \sum_{j=1}^{L} \omega_j \exp\left(-\frac{\|x - C_j\|^2}{2\sigma^2}\right) \tag{3}$$

where $C_j$ is the center vector for the $j$th hidden node and is determined by the K-means clustering method. $\|x - C_j\|$ is the Euclidean norm and $\sigma^2$ is the variance of the Gaussian function.
An RBF network with enough hidden nodes can approximate any continuous function with arbitrary precision. Moreover, as a local approximation network, the RBF neural network has the advantages of a simple structure, fewer adjustable parameters, and fast training.
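The local nature of a Gaussian RBF network can be seen in a small numerical sketch. The code below assumes the common form in which the output is a weighted sum of Gaussians of the distance to each center, with a shared width `sigma` and no output bias. The centers and weights are made-up values; in the paper the centers come from K-means clustering, which is not reproduced here.

```python
import math

def rbf_ann_output(x, centers, weights, sigma):
    """Weighted sum of Gaussian RBFs centered at the given vectors."""
    y = 0.0
    for c_j, w_j in zip(centers, weights):
        # Squared Euclidean distance between the input and center j.
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c_j))
        y += w_j * math.exp(-dist_sq / (2.0 * sigma ** 2))
    return y

# Hypothetical centers and weights for a two-input network.
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, 2.0]

near = rbf_ann_output([0.0, 0.0], centers, weights, sigma=0.5)
far = rbf_ann_output([5.0, 5.0], centers, weights, sigma=0.5)
print(near, far)
```

An input far from every center produces an output close to zero regardless of the weights, so such a network has little information to work with for samples that lie away from its centers.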

D. SVM
SVM was developed by Cortes and Vapnik [27] to solve the classification problem based on statistical learning theory and structural risk minimization. The method was subsequently extended to the domain of regression and prediction problems [28]. As shown in Fig. 4, the input vector $x$ is first mapped into an $L$-dimensional feature space using transfer functions, and then, a linear model is constructed in this feature space.
The linear model in the feature space is given by

$$f(x) = \sum_{j=1}^{L} \omega_j \varphi_j(x) + b \tag{4}$$

where $\varphi_j(x)$ ($j = 1, 2, \ldots, L$) are the transfer functions that map the input into the feature space, $\omega = (\omega_1, \omega_2, \ldots, \omega_L)$ is the weight vector, and $b$ is the bias term. Regression estimates can be obtained by minimizing the empirical risk on the training data. SVM regression performs a linear regression in the high-dimensional feature space using the ε-insensitive loss and tends to reduce the model complexity by minimizing $\|\omega\|^2$. This can be described by introducing slack variables $\xi_i$ and $\xi_i^*$ ($i = 1, 2, \ldots, m$) to measure the deviation of training samples $(X^*, D)$ outside the ε-insensitive zone. $X^* = (x_1, x_2, \ldots, x_m)$ represents the $m$ input vectors of the training samples and $D = (d_1, d_2, \ldots, d_m)$ is the corresponding desired output. Thus, the optimization problem can be formulated as

$$\min \; \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{m} (\xi_i + \xi_i^*) \tag{5}$$

where $m$ is the number of training samples. $C$ is a positive constant that acts as a regularization parameter and allows tuning of the tradeoff between the flatness of the function and the tolerance of deviations larger than $\varepsilon$ (a constant). The risk function of (5) is minimized subject to the following constraints:

$$\begin{cases} d_i - f(x_i) \le \varepsilon + \xi_i \\ f(x_i) - d_i \le \varepsilon + \xi_i^* \\ \xi_i, \; \xi_i^* \ge 0, \quad i = 1, 2, \ldots, m \end{cases}$$

Equation (4) can be transformed into a dual problem and solved by the Lagrange functional

$$f(x) = \sum_{i=1}^{m} (\alpha_i - \alpha_i^*) K(x, x_i) + b$$

where $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers and $K(x, x_i)$ is a kernel function. There are several optional kernel functions for SVM, such as the linear, polynomial, RBF, and sigmoid functions. One of the most widely used kernel functions is the RBF. The final product of a training process in the SVM method is this regression function expressed in terms of the support vectors, i.e., the training samples for which $\alpha_i - \alpha_i^*$ is nonzero.
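Two ingredients of SVM regression, the ε-insensitive loss and the kernel expansion used for prediction, can be sketched as follows. This is an illustration under stated assumptions, not a trained model: the support vectors, the coefficients (standing in for the differences of the Lagrange multipliers), the bias, and the Gaussian kernel width `sigma` are all made-up values, and the quadratic programming step that would produce them is omitted.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian (RBF) kernel between two vectors."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))

def eps_insensitive_loss(d, f, eps):
    """Zero inside the eps-tube around the target d, linear outside it."""
    return max(0.0, abs(d - f) - eps)

def svr_predict(x, support_vectors, coeffs, b, sigma):
    """Kernel expansion: sum of coeff_i * K(x, x_i) over support vectors, plus b."""
    return sum(c * rbf_kernel(x, sv, sigma)
               for sv, c in zip(support_vectors, coeffs)) + b

# Hypothetical one-dimensional model with two support vectors.
support_vectors = [[0.0], [1.0]]
coeffs = [0.5, -0.25]   # stand-ins for (alpha_i - alpha_i*)
bias = 1.0

prediction = svr_predict([0.0], support_vectors, coeffs, bias, sigma=1.0)
inside_tube = eps_insensitive_loss(1.0, 1.05, eps=0.1)
outside_tube = eps_insensitive_loss(1.0, 1.3, eps=0.1)
print(prediction, inside_tube, outside_tube)
```

Because the prediction depends only on the stored support vectors and coefficients, repeating the evaluation with the same inputs always yields the same output.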

E. GP
GP as an evolutionary computation technique is an extension of genetic algorithms and is widely applied to symbolic data mining (symbolic regression, classification, and optimization) [29]- [31]. Unlike the traditional regression analysis, GP-based symbolic regression automatically evolves both the structure and the parameters of the mathematical model from the available data. Meanwhile, it is superior to other machine learning techniques due to the ability to generate an empirical mathematical equation without assuming prior form of the existing relationships. In this paper, multigene symbolic regression is applied to establish a model for two-phase flow measurement. The structure of a multigene symbolic regression model is shown in Fig. 5.
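A multigene model combines the outputs of several evolved expression trees (genes) through a bias term and one scaling weight per gene. The sketch below only evaluates such a combination for a single input sample. The two genes are hand-written stand-ins for evolved trees, and the weights are made-up values rather than fitted ones.

```python
import math

def multigene_output(x, genes, b):
    """Bias b[0] plus the weighted sum of gene outputs, b[i] * t_i(x)."""
    return b[0] + sum(b_i * gene(x) for b_i, gene in zip(b[1:], genes))

# Hypothetical genes built from operations such as product, minus, and tanh.
genes = [
    lambda x: x[0] * x[1],
    lambda x: math.tanh(x[0] - x[1]),
]
b = [0.5, 2.0, -1.0]   # bias followed by one scaling weight per gene

y_gp = multigene_output([1.0, 3.0], genes, b)
print(y_gp)
```

In multigene symbolic regression the evolutionary search typically acts on the tree structures, while the bias and scaling weights are linear parameters that can be fitted by least squares once the trees are fixed.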
The GP model can be regarded as a linear combination of lower order nonlinear transformations of the input variables. The output $y_{GP}$ is defined as a vector output of $n$ trees modified by the bias term $b_0$ and scaling parameters $b_1, \ldots, b_n$

$$y_{GP} = b_0 + b_1 t_1 + \cdots + b_n t_n \tag{12}$$

where $t_i$ ($i = 1, \ldots, n$) is the $(m \times 1)$ vector of outputs from the $i$th tree comprising a multigene individual. The evolutionary process starts with an initial population created from individuals containing GP trees with different, randomly generated genes. The evolutionary process continues with an evaluation of the fitness of the new population, two-point high-level crossover to acquire and delete genes, and low-level crossover on subtrees. Then, the created trees replace the parent trees or the unaltered individual in the next generation through mutation operators. The best program that appeared in any generation, the best-so-far solution, defines the output of the GP algorithm [30].

A. Test Rig and Experimental Conditions
Fig. 6 shows the schematic of the two-phase flow test rig that was used in this paper. The measurement data obtained on this rig and the subsequent conclusions drawn from the data are expected to be transferable to other gas-liquid two-phase flow conditions. The gas is introduced into the liquid flow through a bypass on the pipe. The liquid mass flowrate is controlled by adjusting the pump frequency from 15% to 80%. The gas flowrate is varied by adjusting the opening of the valve in a gas flow controller. Two independent Coriolis flowmeters (KROHNE OPTIMASS 6400 S25 and Bronkhorst mini CORI-FLOW M15) were installed before the mixer to provide references for the individual mass flowrates of the liquid and gas phases, respectively. The measurement uncertainties of both reference meters under single-phase conditions were verified according to the manufacturers' technical specifications. Downstream, two additional Coriolis flowmeters (see Fig. 7) of the same type as the liquid reference meter were installed in the vertical and horizontal test sections, respectively. These are the meters under test, used to assess the performance of the ANN, SVM, and GP models under two-phase flow conditions. In view of the effects of gravity and buoyancy on two-phase fluid, both the horizontal and vertical installations of the meters are considered. A DP transducer was used to record the DP value across each flowmeter under test. The data logging frequencies, as set in the data loggers for the mass flowrate, density, damping, and DP, are 50, 10, 2, and 20 Hz, respectively. Each parameter was logged over a period of 100 s, and a time-averaged value was generated under each experimental condition. Gas volume fraction $\alpha$ is defined and calculated as follows:

$$\alpha = \frac{q_{v,g}}{q_{v,g} + q_{v,l}} \times 100\% \tag{13}$$

where $q_{v,g}$ and $q_{v,l}$ are the volume flowrates of the gas and liquid phases, calculated from the reference flowmeter readings and the temperature and pressure upstream of the horizontal test meter. Density drop is determined from the density of the liquid flow ($\rho_l$) and the apparent density ($\rho$) from the Coriolis flowmeter under test

$$\text{Density drop} = \frac{\rho_l - \rho}{\rho_l} \times 100\% \tag{14}$$

Two series of experimental tests, Tests I and II, were conducted for liquid mass flowrates ranging from 700 to 14 500 kg/h and gas volume fractions from 0% to 30%. The fluid temperature during the tests was around 20°C. For the purpose of training, 237 data sets were collected from Test I, while 24 data sets were recorded from Test II for testing the performance of the data driven models.
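The two derived quantities used throughout the paper can be computed as in the sketch below. It assumes the usual definitions: gas volume fraction as the gas volume flowrate divided by the total volume flowrate, and density drop as the relative difference between the liquid density and the apparent density, both in percent. The function names and numerical values are illustrative only, and the conversion of the reference gas mass flowrate to a volume flowrate at line temperature and pressure is not shown.

```python
def gas_volume_fraction(q_v_gas, q_v_liquid):
    """Gas volume fraction in percent; both flowrates in the same volume units."""
    return 100.0 * q_v_gas / (q_v_gas + q_v_liquid)

def density_drop(rho_liquid, rho_apparent):
    """Observed density drop in percent, relative to the liquid density."""
    return 100.0 * (rho_liquid - rho_apparent) / rho_liquid

# Made-up readings: 1 m^3/h of gas in 9 m^3/h of liquid, and an apparent
# density of 898.2 kg/m^3 against a liquid density of 998.0 kg/m^3.
gvf = gas_volume_fraction(1.0, 9.0)
drop = density_drop(998.0, 898.2)
print(gvf, drop)
```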

B. Analysis of Original Errors
The typical original mass flow errors of the Coriolis flowmeters in Test I are shown in Fig. 8. The Coriolis flowmeter on the vertical section gives negative errors at flowrates below 4000 kg/h. At higher flowrates (>5500 kg/h), the mass flow errors become positive, then cross the zero line and return to negative values as the entrained gas increases. This is believed to be due to the flow regime effects on the fluid-tube coupling system at different flowrates. At lower flowrates (<2000 kg/h), the flow was nearly slug flow as observed during the test, while the flow regime gradually became dispersed bubbly flow as the flowrate and entrained gas increased. For the Coriolis flowmeter on the horizontal pipeline, the range of mass flow errors is different from that on the vertical pipeline, most likely due to the effects of gravity and buoyancy on the flow regime. Positive errors occur at the mass flowrates of 700 and 1000 kg/h when the gas volume fraction is below 6%. A comparison of the mass flow errors at the same flowrate in Figs. 8 and 9 shows that the errors are generally reproducible for the same installation, thanks to the new-generation flow transmitter [32]. The test data set, Test II, includes some experimental data that were collected at flowrates different from those in Test I. The new conditions in Test II, which was conducted on a different day and at flowrates different from those in Test I, are useful for assessing the generalization capability and reproducibility of the models. Fig. 10 shows the distribution of the relative errors of the measured liquid mass flowrate on both the horizontal and vertical pipelines. The two colors in the figure (blue and green) represent the training and test data sets, respectively. The Coriolis flowmeter on the horizontal pipeline yields the liquid mass flowrate with a relative error between −41% and 9%, while the meter on the vertical pipeline gives an error from −25% to 11%.
The difference in errors between the vertical and horizontal installations is due to the fact that the bubbles in a vertical flow are distributed evenly in the pipe cross section due to the effect of gravity, resulting in less interruption on the tube vibration inside the Coriolis flowmeter and hence different errors.

C. Analysis of Parametric Dependence
There are three important parameters from a Coriolis flowmeter: observed density drop, apparent mass flowrate, and damping. The DP value from the DP transducer is also included as a potential input variable in this paper. The apparent mass flowrate from a Coriolis flowmeter and the DP value across the meter correlate strongly with the liquid mass flowrate under two-phase conditions. In addition, when gas is entrained in the liquid flow, a rapid rise in damping occurs for the fluid-conveying tube and the mixture density deviates from the liquid density. This physical background of the fluid-tube coupling system means that these four input variables are more important than other variables. There exist strong nonlinearities between the outputs of a Coriolis flowmeter and the flowrate being measured under two-phase flow conditions, as observed by other researchers [12], [13]. Such nonlinearities are also shown in Fig. 8.
In order to investigate the parametric dependence of individual input parameters and the combined effect of multiple parameters on the output of a data model, PMI is utilized to measure the partial dependence between a potential input variable and the output, conditional on any inputs that have already been selected. The variable with the highest PMI score is added to the input set if the Akaike information criterion (AIC) value decreases as a result of the inclusion of this variable. The detailed definitions of PMI and AIC are available in [19] and [20]. With variables $x_1$, $x_2$, $x_3$, and $x_4$ representing observed density drop, apparent mass flowrate, damping, and DP, respectively, the variable selection procedures for the models for correcting the liquid mass flowrate and predicting the gas volume fraction are summarized in Tables I and II. H-L and V-L represent the models established for the horizontal and vertical pipelines, respectively, to correct the liquid mass flowrate, while H-G and V-G stand for the models for the horizontal and vertical pipelines, respectively, to predict the gas volume fraction. The selection sequence also represents the sensitivity level of each variable to the desired output. For the liquid mass flowrate, $x_2$ (apparent mass flowrate) has the most significant effect. The coefficient of determination, $R^2$, indicates the goodness of fit. A combination of the four variables gives the highest $R^2$, which illustrates that the combined effect of the variables on the output is more significant than that of an individual variable. For predicting the gas volume fraction, $x_1$ (observed density drop) plays a more important part than the other variables. Variable $x_3$ (damping) is not used in models H-G and V-G, since the AIC value increases and $R^2$ decreases with the inclusion of $x_3$.
As a result of these variable selection procedures, the models for correcting the liquid mass flowrate accept the four input variables (observed density drop, apparent mass flowrate, damping, and DP) and three variables (observed density drop, apparent mass flowrate, and DP) are taken as the inputs to the models for predicting the gas volume fraction.
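The selection loop described in this subsection, adding the highest-scoring candidate as long as the AIC keeps decreasing, can be sketched generically. The code below does not compute PMI or AIC: `score` and `aic` are placeholder callables, and the lookup tables are made-up numbers chosen only so that the toy run ends with three of four variables selected.

```python
def forward_select(candidates, score, aic):
    """Greedy forward selection with an AIC-based stopping rule.

    score(v, selected) : dependence score of candidate v given the selected set
    aic(subset)        : information criterion of a model using that subset
    """
    selected = []
    remaining = list(candidates)
    best_aic = float("inf")
    while remaining:
        # Candidate with the highest score, conditional on what is selected.
        best = max(remaining, key=lambda v: score(v, selected))
        new_aic = aic(selected + [best])
        if new_aic >= best_aic:
            break  # including this variable no longer lowers the AIC
        selected.append(best)
        remaining.remove(best)
        best_aic = new_aic
    return selected

# Made-up stand-ins for the PMI scores and AIC values.
score_table = {"x1": 0.9, "x2": 0.6, "x3": 0.1, "x4": 0.4}
aic_table = {
    ("x1",): -100.0,
    ("x1", "x2"): -150.0,
    ("x1", "x2", "x4"): -170.0,
    ("x1", "x2", "x4", "x3"): -165.0,
}

chosen = forward_select(
    ["x1", "x2", "x3", "x4"],
    score=lambda v, selected: score_table[v],
    aic=lambda subset: aic_table[tuple(subset)],
)
print(chosen)
```

In a real PMI procedure the score of each remaining candidate is re-estimated after every inclusion, which is why `score` receives the current selection as its second argument even though the toy table ignores it.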

D. Performance of the BP-ANN
The BP-ANN model is established through training with data set I and tested with data set II. For each installation condition, a separate model is established for the correction of the measured liquid mass flowrate and for the prediction of the gas volume fraction. The inputs of the BP-ANN for liquid mass flowrate correction include four variables, i.e., observed density drop, apparent mass flowrate, damping, and DP. The inputs of the BP-ANN for gas volume fraction prediction include observed density drop, apparent mass flowrate, and DP. The number of neurons ($L$) in the hidden layer is determined using (15) and (16), as proposed in [33]

$$L \le 2n + 1 \tag{15}$$

where $n$ and $m$ are the numbers of input variables and training samples, respectively. However, (15) and (16) give only the range of $L$ for BP-ANN models. The exact $L$ for a model can be selected by a trial-and-error method as a compromise between minimizing errors and achieving good generalization capability. The output layer has one neuron for each model, since there is only one output variable. The BP-ANN transfer function between the input and hidden layers is the hyperbolic tangent sigmoid function. The pure linear function is taken as the transfer function connecting the hidden layer to the output layer. The training function is Bayesian regularization, while the learning function is gradient descent with momentum for the weights and biases. Training stops when the maximum number of epochs is reached or the performance is minimized to the goal. In this paper, the normalized root-mean-square error (NRMSE) is used to assess the performance of a data driven model, which is defined as

$$\mathrm{NRMSE} = \frac{1}{\bar{y}} \sqrt{\frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2}$$

where $y_i$ is the reference mass flowrate of the liquid phase or the reference gas volume fraction, $\bar{y}$ is the mean of $y_i$, $\hat{y}_i$ is the corrected mass flowrate or the predicted gas volume fraction from the data driven model accordingly, and $m$ is the number of samples used.
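The NRMSE figure of merit can be computed as in the following sketch. It assumes one common normalization, the root-mean-square error divided by the mean of the reference values. Other normalizations exist, so this choice should be checked against the intended definition before use. The sample values are made up.

```python
import math

def nrmse(reference, predicted):
    """Root-mean-square error normalized by the mean of the reference values."""
    m = len(reference)
    mean_ref = sum(reference) / m
    rmse = math.sqrt(
        sum((p - r) ** 2 for r, p in zip(reference, predicted)) / m
    )
    return rmse / mean_ref

# Hypothetical reference and corrected mass flowrates in kg/h.
reference = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
value = nrmse(reference, predicted)
print(value)
```

Averaging this quantity over many networks trained from different random initializations, as done in the paper for 200 BP-ANNs per structure, reduces the influence of any single initialization on the comparison.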
As the weights and biases between the neurons are initialized randomly, a different BP-ANN is obtained for each training, resulting in different performances. A preliminary study of averaging NRMSE of more than 200 BP-ANNs did not show any noticeable difference. Therefore, in order to minimize the effect of random initialization of an ANN, the average NRMSE of 200 BP-ANNs with the same structure is calculated to assess the effect of the hidden neurons on the performance of the ANN.
For the models for liquid mass flowrate correction, the number of neurons in the hidden layer is set from 4 to 9 as per (15) and (16). The NRMSE values of the BP-ANNs are summarized in Fig. 11. The error bars indicate the maximum and minimum errors of 200 BP-ANNs for the same structure. In view of the errors on both training and test datasets, the BP-ANN with seven neurons in the hidden layer performs better than other structures under both the horizontal and vertical conditions. The BP-ANN used for gas volume fraction prediction has lower NRMSE when the number of the hidden neurons is 6.
Once the structure of a BP-ANN is determined, the trained neural network that has the minimum error with the test data set is selected. Fig. 12 shows the errors of the corrected liquid mass flowrate from the BP-ANNs. For the horizontal and vertical pipelines, the relative errors are mostly less than ±2% (the red dashed lines in Fig. 12) with the training data set, except for some larger errors at the low flowrates of 700 and 1000 kg/h. This is very likely due to larger bubbles or slugs appearing in the flow tubes at low flowrates, which cause the Coriolis flowmeter to behave differently than it does with smaller bubbles. The trained BP-ANN has relatively larger errors at low flowrates and hence gives unsatisfactory performance with the test data set under the same experimental conditions.

Fig. 13. Error of the predicted gas volume fraction from the trained BP-ANNs. (a) Errors of the predicted gas volume fraction on the horizontal pipeline with training data set. (b) Errors of the predicted gas volume fraction on the horizontal pipeline with test data set. (c) Errors of the predicted gas volume fraction on the vertical pipeline with training data set. (d) Errors of the predicted gas volume fraction on the vertical pipeline with test data set.
Because the gas volume fraction under the experimental conditions ranges from 0% to 30% and two-phase flow is intrinsically complex, the relative errors of the predicted gas volume fraction from the BP-ANNs are quite large when the gas volume fraction is below 5%. As the entrained gas increases, the errors from the training data set are mostly within ±10% (the red dashed lines in Fig. 13). For the test data set, however, all the errors are less than ±10% on the vertical pipeline, even under the low flowrate conditions.

E. Performance of the RBF-ANN
Fig. 14 shows the relative errors of the corrected liquid mass flowrate from the RBF-ANNs. In order to achieve more accurate results with the test data set, the RBF-ANN on the horizontal pipeline disregards the errors at lower flowrates (<2000 kg/h) and the network is trained to fit the higher flowrates (>4000 kg/h) well. Consequently, the errors at higher flowrates with the training data set and the errors with the test data set are reduced to within ±1%. Due to the insignificant difference in the original errors between the lower and higher flowrates on the vertical pipeline, the RBF-ANN yields errors within ±2% with the training data set and within ±1% with the test data set.
As shown in Fig. 15, the RBF-ANN for gas volume fraction prediction significantly outperforms the BP-ANN, particularly at low gas entrainment. When the gas volume fraction is below 5%, the maximum relative errors from the RBF-ANNs on both the horizontal and vertical pipelines are around ±30%. The remaining errors with the training data set are well within ±10%. The relative errors from the test data set are mostly less than ±10%, except at the flowrate of 1000 kg/h on the horizontal pipeline. This is probably due to the fact that the samples at the 1000-kg/h flowrate are far away from the center vectors in the network.

F. Performance of the SVM
SVM models are also established for both installation conditions. An important difference between the SVM and ANN models is that the SVM leads to a unique deterministic model for each data set, while ANNs depend on a random initial choice of synaptic weights and cannot produce fixed results. A direct comparison of the performance of the SVM with the four kinds of kernel function (Table III) shows that the SVM with the RBF kernel generates the smallest NRMSE among the four models.
From Fig. 16(a) and (c), the SVM model fits the training data well and limits the relative errors on the horizontal and vertical pipelines to ±1% or less, except for some points at 700 and 1000 kg/h, which is a common problem for the ANN and SVM models. The generalization ability of the SVM model is demonstrated in Fig. 16(b) and (d). Most errors from the SVM models with the test data are reduced to within ±1%. Fig. 17 shows that, for gas volume fraction prediction, fewer points from the SVM models have an error beyond ±10% with the training data set. Since the kernel function used in the SVM models is the RBF, the SVM models share a common problem with the RBF-ANN. The relative errors in the predicted gas volume fraction with the test data set at the flowrate of 1000 kg/h are larger than those for the other test data.

Fig. 17. Errors of the predicted gas volume fraction from the SVMs. (a) Errors of the predicted gas volume fraction on the horizontal pipeline with training data set. (b) Errors of the predicted gas volume fraction on the horizontal pipeline with test data set. (c) Errors of the predicted gas volume fraction on the vertical pipeline with training data set. (d) Errors of the predicted gas volume fraction on the vertical pipeline with test data set.

G. Performance of the GP
Four GP models are established in this paper for correcting the liquid mass flowrate and predicting the gas volume fraction, respectively, for the horizontal and vertical installations of Coriolis flowmeters. The parameters that were set in the GP algorithms include: a population size of 250, a tournament size of 25, an elitism of 0.7, a maximum of six genes per individual, the function set {×, −, +, tanh, mult3, add3}, and the terminal sets {$x_1$, $x_2$, $x_3$, $x_4$} for models H-L and V-L and {$x_1$, $x_2$, $x_4$} for models H-G and V-G.
The GP-based formulations for the four models are given in the following:

The errors of the corrected mass flowrate on the training data set using GP are as high as −15% and 25% under the horizontal and vertical installations, respectively [Fig. 18(a) and (c)], which results in larger errors on the test data set [Fig. 18(b) and (d)]. As can be seen, larger errors normally occur at low flowrates, which indicates that the GP models are unable to approximate all the data. As shown in Fig. 19, for the prediction of gas volume fraction, the outputs of the GP models have large errors for low gas entrainment and low flowrates. The relative errors with test data reach 25% and −50% on the horizontal and vertical pipes, respectively.

H-L and V-L. However, the SVM models are significantly better than the BP-ANN, RBF-ANN, and GP models for the prediction of gas volume fraction. Moreover, BP-ANN and RBF-ANN have uncertain parameters to optimize, which could result in differences in performance. In contrast, due to their fixed structure, the SVM models always produce repeatable results. This outcome suggests that the SVM models are superior to both the ANN and GP models in terms of robustness.
2) Accuracy: Fig. 21 shows the relative error histograms of the ANNs, SVMs, and GPs for the corrected liquid mass flowrate. It is clear that the error distributions of the GP and ANN models are much wider and more dispersed than those of the SVM models. Comparing the mean value and standard deviation of the errors across the eight error distributions (Table IV), we can see that the SVM models, with the lowest mean value and standard deviation, outperform the BP-ANN, RBF-ANN, and GP models for liquid mass flowrate measurement on both the horizontal and vertical pipelines. Moreover, the data driven models on the vertical pipeline (a mean value of 0.0008% and a standard deviation of 0.40%) perform better than those on the horizontal pipeline (a mean value of 0.0585% and a standard deviation of 0.66%). Fig. 22 shows the relative error histograms of the four types of models for gas volume fraction prediction. The GP models have a larger range of errors than all the other models. The error distribution of the SVM models is much narrower than that of the ANN models for the measurement of gas volume fraction, and most errors of the SVM models are concentrated around the zero line. Table V shows that the standard deviations of the SVM and RBF-ANN models are smaller than those of the BP-ANN and GP models on both the horizontal and vertical pipelines.
In order to assess the accuracy of the ANN, SVM, and GP models, the percentage of experimental data for which each model achieves an accuracy of ±2% and ±1% for liquid mass flowrate measurement, and ±10% for gas volume fraction prediction, is calculated and summarized in Table VI. For liquid mass flowrate measurement with the SVM models, 93.49% of the experimental data yield a relative error less than ±1% on the horizontal pipeline, while 96.17% of the results are within ±1% on the vertical installation. The SVM models predict the gas volume fraction with a relative error less than ±10% for 93.10% and 94.25% of the test conditions on the horizontal and vertical installations, respectively. Therefore, the SVM models perform significantly better than the BP-ANN, RBF-ANN, and GP models for two-phase flow measurement in terms of robustness and accuracy.
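The accuracy figures above (mean, standard deviation, and percentage of data within an error band) can be computed along the following lines. This is a minimal sketch: the function names are hypothetical, relative error is assumed to be (predicted − reference)/reference expressed in percent, the sample standard deviation is assumed, and the flowrate values are invented for illustration rather than taken from the experimental data.

```python
from statistics import mean, stdev


def relative_errors(predicted, reference):
    """Relative error in percent, assumed as 100 * (predicted - reference) / reference."""
    return [100.0 * (p - r) / r for p, r in zip(predicted, reference)]


def percent_within(errors, band):
    """Percentage of errors whose magnitude is less than the band (in percent)."""
    hits = sum(1 for e in errors if abs(e) < band)
    return 100.0 * hits / len(errors)


# Hypothetical liquid mass flowrates in kg/h (illustrative values only).
reference = [700.0, 1000.0, 5000.0, 10000.0, 14500.0]
predicted = [721.0, 1005.0, 4990.0, 10030.0, 14500.0]

errors = relative_errors(predicted, reference)

error_mean = mean(errors)             # mean relative error, in percent
error_std = stdev(errors)             # sample standard deviation, in percent
within_1 = percent_within(errors, 1.0)  # share of points within ±1%
within_2 = percent_within(errors, 2.0)  # share of points within ±2%
```

For gas volume fraction prediction the same helpers apply with a band of 10.0.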

IV. CONCLUSION
In this paper, experimental and analytical investigations have been carried out to assess the performance of BP-ANN, RBF-ANN, SVM, and GP models for gas-liquid two-phase flow measurement using Coriolis flowmeters. The results suggest that the SVM models are superior to the two ANN models and the GP models for two-phase flow measurement in terms of robustness and accuracy. The SVM models perform well consistently, while the performance of the ANN and GP models depends on the user-defined parameters. For liquid mass flowrate measurement, the SVM models outperform the BP-ANN, RBF-ANN, and GP models on both the horizontal and vertical pipelines, and most corrected errors (>93%) are within ±1%. For gas volume fraction prediction, the RBF-ANN and SVM models yield relative errors less than ±10% for most of the data (>90%) and outperform the BP-ANN and GP models. It must be stressed that the significantly reduced errors in mass flowrate measurement from the Coriolis mass flowmeters and the gas volume fraction prediction are achieved by using the existing data from the Coriolis flowmeters and a simple DP transducer, without the use of any other devices. SVM has consistently outperformed ANN and GP in the correction of liquid mass flow errors and the prediction of gas volume fraction. This outcome has effectively extended the applicability of Coriolis mass flowmeters to liquid flow measurement with a significant volume of entrained gas. In future work, the data driven models will be extended to the measurement of other liquids with different viscosities under two-phase or multiphase flow conditions.