
Using Fuzzy Classifier in Ensemble Method for Motor Imagery Electroencephalography Classification

International Journal of Fuzzy Systems

Abstract

In a motor imagery-based brain–computer interface system, an effective classifier is required. However, the effectiveness of a classifier is substantially influenced by artifacts and by individual differences among electroencephalography (EEG) signals. Therefore, in this study, we adopted an ensemble method that combines various classifiers, including a fuzzy classifier that can reduce the influence of artifacts, to improve the robustness and accuracy of classification across participants. Nine participants were recruited for the experiment and asked to perform a left- and right-hand motor imagery task. We calculated the classification rates obtained with the linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), naive Bayes, support vector machine (SVM), and fuzzy twin SVM (FTSVM) classifiers based on the spectral features extracted by an autoregressive (AR) model and the spectral–temporal features extracted by the Morlet wavelet from overlapped 1.024-s EEG segments. The fivefold cross-validation accuracies of the ensemble method for the 1.024-s EEG segments were 71.39% and 73.06% with the AR- and wavelet-extracted features, respectively. Among the individual classifiers, the Linear-FTSVM method performed best. In addition, the ensemble model that included FTSVM classifiers outperformed the ensemble models without FTSVM classifiers.



References

  1. Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G., Vaughan, T.M.: Brain-computer interfaces for communication and control. Clin. Neurophysiol. 113(6), 767–791 (2002). https://doi.org/10.1016/s1388-2457(02)00057-3

  2. Langhorne, P., Coupar, F., Pollock, A.: Motor recovery after stroke: A systematic review. Lancet Neurol. 8(8), 741–754 (2009). https://doi.org/10.1016/S1474-4422(09)70150-4

  3. Takalo, R., Hytti, H., Ihalainen, H.: Tutorial on univariate autoregressive spectral analysis. J. Clin. Monitor. Comp. 19, 401–410 (2005). https://doi.org/10.1007/s10877-005-7089-x

  4. Hearst, M.A., Dumais, S.T., Osuna, E., Platt, J., Schölkopf, B.: Support vector machines. IEEE Intell. Syst. Appl. 13(4), 18–28 (1998). https://doi.org/10.1109/5254.708428

  5. Huang, H., Wei, X., Zhou, Y.: Twin support vector machines: A survey. Neurocomputing 300(26), 34–43 (2018). https://doi.org/10.1016/j.neucom.2018.01.093

  6. Gao, B.-B., Wang, J.-J., Wang, Y., Yang, C.-Y.: Coordinate Descent Fuzzy Twin Support Vector Machine for Classification. 2015 IEEE 14th ICMLA (2015). https://doi.org/10.1109/ICMLA.2015.35

  7. Pfurtscheller, G., Lopes da Silva, F.H.: Event-related EEG/MEG synchronization and desynchronization: Basic principles. Clin. Neurophysiol. 110(11), 1842–1857 (1999). https://doi.org/10.1016/S1388-2457(99)00141-8

  8. Tang, Z., Sun, S., Zhang, S., Chen, Y., Li, C., Chen, S.: A brain-machine interface based on ERD/ERS for an upper-limb exoskeleton control. Sensors 16(12), 2050 (2016). https://doi.org/10.3390/s16122050

  9. Jayadeva, Khemchandani, R., Chandra, S.: Twin support vector machines for pattern classification. IEEE Trans. Pattern Anal. Mach. Intell. 29(5), 905–910 (2007). https://doi.org/10.1109/TPAMI.2007.1068

  10. Meng, J., Zhang, S., Bekyo, A., Olsoe, J., Baxter, B., He, B.: Noninvasive electroencephalogram based control of a robotic arm for reach and grasp tasks. Sci. Rep. 6, 38565 (2016). https://doi.org/10.1038/srep38565

  11. Ng, D.W.-K., Soh, Y.-W., Goh, S.-Y.: Development of an autonomous BCI wheelchair. 2014 IEEE symposium on computational intelligence in brain computer interfaces (2014). https://doi.org/10.1109/CIBCI.2014.7007784

  12. Wriessnegger, S.C., Brunner, C., Müller-Putz, G.R.: Frequency specific cortical dynamics during motor imagery are influenced by prior physical activity. Front. Psychol. 9, 1976 (2018). https://doi.org/10.3389/fpsyg.2018.01976

  13. Kay, S.M.: Fundamentals of statistical signal processing: estimation theory. Prentice Hall, Englewood Cliffs (1993)

  14. Delprat, N., Escudie, B., Guillemain, P., Kronland-Martinet, R., Tchamitchian, P., Torresani, B.: Asymptotic wavelet and Gabor analysis: Extraction of instantaneous frequencies. IEEE Trans. Inf. Theory 38(2), 644–664 (1992). https://doi.org/10.1109/18.119728

  15. Zhang, Y., Liu, J., Liu, J., Sheng, J., Lv, J.: EEG recognition of motor imagery based on SVM ensemble. 2018 5th ICSAI (2018). https://doi.org/10.1109/ICSAI.2018.8599464

  16. Luo, J., Gao, X., Zhu, X., Wang, B., Lu, N., Wang, J.: Motor imagery EEG classification based on ensemble support vector learning. Comput. Meth. Programs Biomed. 193, 105464 (2020). https://doi.org/10.1016/j.cmpb.2020.105464

  17. Ali, M.A., Üçüncü, D., Ataş, P.K., Özöğür-Akyüz, S.: Classification of motor imagery task by using novel ensemble pruning approach. IEEE Trans. Fuzzy Syst. 28(1), 85–91 (2020). https://doi.org/10.1109/TFUZZ.2019.2900859

  18. Xu, Q., Zhou, H., Wang, Y., Huang, J.: Fuzzy support vector machine for classification of EEG signals using wavelet-based features. Med. Eng. Phys. 31(7), 858–865 (2009). https://doi.org/10.1016/j.medengphy.2009.04.005

  19. Chen, S.-C., Chen, Y.-J., Zaeni, I.A.E., Wu, C.-M.: A single-channel SSVEP-based BCI with a fuzzy feature threshold algorithm in a maze game. Int. J. Fuzzy Syst. 19, 553–565 (2017). https://doi.org/10.1007/s40815-016-0289-3

  20. Hsu, W.-C., Lin, L.-F., Chou, C.-W., Hsiao, Y.-T., Liu, Y.-H.: EEG classification of imaginary lower limb stepping movements based on fuzzy support vector machine with kernel-induced membership function. Int. J. Fuzzy Syst. 19, 566–579 (2017). https://doi.org/10.1007/s40815-016-0259-9

  21. Theodoridis, S.: Machine learning: a Bayesian and optimization perspective. Academic Press, Amsterdam (2015)

Acknowledgements

This work was supported by the Ministry of Science and Technology, Taiwan (MOST106-2221-E-010-016-MY3 and MOST106-2314-B-010-058-MY2) and Ministry of Education, Taiwan (110BRC-B701). We thank all the participants in this study. This manuscript was edited by Wallace Academic Editing.

Author information

Corresponding author

Correspondence to Yu-Te Wu.

Appendices

Appendix A: Support Vector Machine (SVM)

SVM is based on the maximum-margin principle: among all hyperplanes in the feature space that separate samples of different categories, it selects the one with the largest margin [4].

Given a training set \({\text{D}} = \left\{ {\left( {x_{1} , y_{1} } \right), \left( {x_{2} , y_{2} } \right), \ldots , \left( {x_{m} , y_{m} } \right)} \right\}\) with labels \(y_{i} \in \left\{ { - 1, + 1} \right\}\), the hyperplane obtained from D can be represented as follows:

$$w^{T} x + b = 0$$
(A1)

where w and b denote the normal vector and the bias of the hyperplane, respectively.

In the feature space, the distance from \(x\) to the hyperplane is given by:

$$r = \frac{{\left| {w^{T} x + b} \right|}}{\left\| w \right\|}$$
(A2)

Suppose that a hyperplane classifies all samples correctly, so that the following conditions are satisfied:

$$\left\{ {\begin{array}{*{20}c} {w^{T} x_{i} + b \ge + 1, y_{i} = + 1} \\ {w^{T} x_{i} + b \le - 1, y_{i} = - 1} \\ \end{array} } \right.$$
(A3)

Then the margin between the two categories can be calculated as

$$\gamma = \frac{2}{\left\| w \right\|}$$

and maximizing the margin is equivalent to minimizing \(\Vert w\Vert\). In practice, however, a hyperplane that perfectly separates the two categories rarely exists. We therefore introduce slack variables (\({\xi }_{i}\)) that allow some samples to violate the margin, and the cost function can be expressed as:

$$\mathop {\min }\limits_{{w,b,\xi_{i} }} \frac{1}{2}\left\| w \right\|^{2} + C\mathop \sum \limits_{i = 1}^{m} \xi_{i}$$
(A4)

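The value of C in (A4) controls the trade-off between a wide margin and few margin violations: a larger C penalizes the slack variables more heavily, whereas a smaller C tolerates more violations.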
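As a rough illustration of (A2) and (A4), the following Python sketch evaluates the distance to a hyperplane, the margin, and the soft-margin cost for a fixed hyperplane. It is not the training procedure used in this study. It assumes the standard soft-margin constraints \(y_{i}(w^{T}x_{i}+b) \ge 1-\xi_{i}\), \(\xi_{i} \ge 0\), which are not written out above, so each slack takes its smallest feasible value (the hinge loss). The function names and the toy data are hypothetical.

```python
import math


def decision_value(w, b, x):
    """Return w^T x + b for one sample x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def distance_to_hyperplane(w, b, x):
    """Distance from x to the hyperplane w^T x + b = 0, as in (A2)."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return abs(decision_value(w, b, x)) / norm_w


def soft_margin_objective(w, b, samples, labels, C):
    """Value of (A4) for a fixed (w, b).

    Assumption: each slack is the smallest value allowed by the standard
    constraints, i.e. xi_i = max(0, 1 - y_i * (w^T x_i + b)).
    """
    slacks = [max(0.0, 1.0 - y * decision_value(w, b, x))
              for x, y in zip(samples, labels)]
    return 0.5 * sum(wj * wj for wj in w) + C * sum(slacks)


# Toy data, made up for illustration: the third sample lies on the wrong side.
samples = [(2.0, 0.0), (0.0, 0.0), (1.5, 0.0)]
labels = [1, -1, -1]
w, b = (2.0, 0.0), -2.0  # the hyperplane x1 = 1

margin = 2.0 / math.sqrt(sum(wj * wj for wj in w))
cost_small_c = soft_margin_objective(w, b, samples, labels, C=1.0)
cost_large_c = soft_margin_objective(w, b, samples, labels, C=10.0)
```

With the same hyperplane, raising C from 1 to 10 increases only the penalty term, which is why a large C pushes the optimizer toward fewer margin violations.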

Appendix B: Fuzzy Support Vector Machine (FSVM)

In the conventional SVM, each sample in the training data has an equal influence on the training process. However, samples usually belong to the different classes to different degrees. FSVM therefore assigns a degree of membership (\({s}_{i}\)) to each sample, thereby adjusting the weight of that sample's slack variable (\({\xi }_{i}\)). As a result, equation (A4) can be rewritten as:

$$\mathop {\min }\limits_{{w,b,\xi_{i} }} \frac{1}{2}\left\| w \right\|^{2} + C\mathop \sum \limits_{i = 1}^{m} s_{i} \xi_{i}$$
(A5)

Note that a sample on the fuzzy boundary has a greater influence on the construction of the classification boundary and is therefore assigned a larger value of \({s}_{i}\), whereas a sample far away from the boundary is assigned a smaller value of \({s}_{i}\) (see Fig. 9).
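The effect of the memberships in (A5) can be seen with a small Python sketch that evaluates only the penalty term \(C\sum s_{i}\xi_{i}\). The slack and membership values below are made up for illustration, and the function name is hypothetical.

```python
def fuzzy_penalty(slacks, memberships, C):
    """Penalty term of (A5): C * sum_i s_i * xi_i."""
    if len(slacks) != len(memberships):
        raise ValueError("slacks and memberships must have the same length")
    return C * sum(s * xi for s, xi in zip(memberships, slacks))


# Hypothetical slack values for three samples; the second violates the margin most.
slacks = [0.0, 2.0, 0.5]

# Equal memberships reproduce the penalty term of the conventional SVM in (A4).
equal_weights = fuzzy_penalty(slacks, [1.0, 1.0, 1.0], C=1.0)

# Giving the second sample a low membership shrinks its share of the penalty.
down_weighted = fuzzy_penalty(slacks, [1.0, 0.1, 1.0], C=1.0)
```

Here the penalty drops from 2.5 to 0.7 when the second sample receives a membership of 0.1, so that sample pulls the boundary far less during optimization.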

Appendix C: Twin Support Vector Machines (TSVM)

For binary classification tasks, SVM defines a single classification boundary, whereas TSVM uses two non-parallel decision planes (Fig. 10). TSVM defines one decision plane for each category, and an unknown sample is assigned to the category of the nearest decision plane. In their survey of TSVM, Huang et al. [5] described the decision planes for the two categories as follows:

$$\left\{ {\begin{array}{*{20}c} {x^{T} w_{1} + b_{1} = 0} \\ {x^{T} w_{2} + b_{2} = 0} \\ \end{array} } \right.$$
(A6)

where \({w}_{i}\) and \({b}_{i}\) denote the normal vector and the bias of the corresponding hyperplane, respectively.

In TSVM training, the goal is to minimize the following two objective functions:

$$\begin{array}{*{20}c} {\mathop {\min }\limits_{{w^{{\left( 1 \right)}} ,b^{{\left( 1 \right)}} ,\xi ^{{\left( 2 \right)}} }} \frac{1}{2}\Vert Aw^{{\left( 1 \right)}} + e_{1} b^{{\left( 1 \right)}}\Vert^ {2} + C_{1} e_{2}^{T} \xi ^{{\left( 2 \right)}} } \\ {{\text{subject~to}}~~~\left\{ {\begin{array}{*{20}c} { - \left( {Bw^{{\left( 1 \right)}} + e_{2} b^{{\left( 1 \right)}} } \right) + \xi ^{{\left( 2 \right)}} \ge e_{2} } \\ {\xi ^{{\left( 2 \right)}} \ge 0} \\ \end{array} } \right.} \\ \end{array}$$
(A7)

and

$$\begin{array}{*{20}c} {\mathop {\min }\limits_{{w^{{\left( 2 \right)}} ,b^{{\left( 2 \right)}} ,\xi ^{{\left( 1 \right)}} }} \frac{1}{2}\left\| {Bw^{{\left( 2 \right)}} + e_{2} b^{{\left( 2 \right)}} } \right\|^{2} + C_{2} e_{1}^{T} \xi ^{{\left( 1 \right)}} } \\ {{\text{subject~to}}~~~\left\{ {\begin{array}{*{20}c} { - \left( {Aw^{{\left( 2 \right)}} + e_{1} b^{{\left( 2 \right)}} } \right) + \xi ^{{\left( 1 \right)}} \ge e_{1} } \\ {\xi ^{{\left( 1 \right)}} \ge 0} \\ \end{array} } \right.} \\ \end{array}$$
(A8)

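where \(A\) and \(B\) are the matrices whose rows are the training samples of the first and second category, respectively, \({C}_{1}>0\) and \({C}_{2}>0\) are penalty parameters, \(\xi^{(1)}\) and \(\xi^{(2)}\) are the slack variables, and \({e}_{1}\) and \({e}_{2}\) are column vectors of ones whose dimensions equal the sample sizes of the corresponding categories. The pairs \((w^{(1)}, b^{(1)})\) and \((w^{(2)}, b^{(2)})\) correspond to \((w_{1}, b_{1})\) and \((w_{2}, b_{2})\) in (A6).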

An unknown sample x is assigned to the category whose decision plane is nearest, where the subscripts \(+\) and \(-\) index the two planes:

$$f\left( x \right) = \mathop {\arg \min }\limits_{ \pm } \frac{{\left| {x^{T} w_{ \pm } + b_{ \pm } } \right|}}{{\left\| {w_{ \pm } } \right\|}}$$
(A9)
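The decision rule in (A9) can be sketched in a few lines of Python. The two planes below are hypothetical values chosen for illustration, not planes fitted to EEG data, and breaking ties in favor of the positive class is an arbitrary choice that (A9) does not specify.

```python
import math


def plane_distance(x, w, b):
    """Distance from x to the plane x^T w + b = 0."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return abs(sum(wj * xj for wj, xj in zip(w, x)) + b) / norm_w


def tsvm_predict(x, plane_pos, plane_neg):
    """Return +1 or -1 according to the nearest decision plane, as in (A9).

    Each plane is a (w, b) pair. Ties go to +1 (an assumption).
    """
    d_pos = plane_distance(x, *plane_pos)
    d_neg = plane_distance(x, *plane_neg)
    return 1 if d_pos <= d_neg else -1


# Two non-parallel planes in 2-D, made up for illustration.
plane_pos = ((1.0, 0.0), -1.0)  # x1 = 1
plane_neg = ((0.0, 1.0), -1.0)  # x2 = 1

label_a = tsvm_predict((1.1, 3.0), plane_pos, plane_neg)  # close to x1 = 1
label_b = tsvm_predict((4.0, 0.9), plane_pos, plane_neg)  # close to x2 = 1
```

Because only absolute distances are compared, the rule does not depend on which side of a plane the sample falls.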

Appendix D: Fuzzy Twin Support Vector Machines (FTSVM)

During model training, fuzzy membership assigns different weights to samples. An appropriately designed membership function can reduce the influence of contaminated samples on the generated decision boundary and improve model robustness. The membership function is designed from the class centers and the distribution radii of the samples in the feature space. Let the center of the positive class (\({\varphi }_{pcen}\)) and the center of the negative class (\({\varphi }_{ncen}\)) be defined by:

$$\left\{ {\begin{array}{*{20}c} {\varphi_{pcen} = \frac{1}{{l_{ + } }}\mathop \sum \limits_{i = 1}^{{l_{ + } }} \varphi \left( {x_{i} } \right), {\text{for}}\, x_{i} \in X_{ + } } \\ {\varphi_{ncen} = \frac{1}{{l_{ - } }}\mathop \sum \limits_{i = 1}^{{l_{ - } }} \varphi \left( {x_{i} } \right), {\text{for}}\, x_{i} \in X_{ - } } \\ \end{array} } \right.$$
(A10)

where \(\varphi \left({x}_{i}\right)\) is the mapping of \({x}_{i}\) into the feature space, and \(l_{+}\) and \(l_{-}\) are the numbers of samples in the positive and negative classes, respectively.

The distribution radius of the positive class in the feature space is given by:

$$r_{{\varphi^{ + } }} = \max \left\| {\varphi (x_{i} ) - \varphi_{pcen} } \right\|,\;{\text{for}}\;x_{i} \in X_{ + }$$
(A11)

where \({X}_{+}\) represents the training data set of positive class.

The membership function of the positive class is defined by

$$s_{i+} = \begin{cases} \mu \left( 1 - \sqrt{\dfrac{\left\| \varphi \left( x_{i} \right) - \varphi_{pcen} \right\|^{2}}{r_{\varphi^{+}}^{2} + \delta}} \right), & \text{if}\;\left\| \varphi \left( x_{i} \right) - \varphi_{pcen} \right\| \ge \left\| \varphi \left( x_{i} \right) - \varphi_{ncen} \right\| \\ \left( 1 - \mu \right)\left( 1 - \sqrt{\dfrac{\left\| \varphi \left( x_{i} \right) - \varphi_{pcen} \right\|^{2}}{r_{\varphi^{+}}^{2} + \delta}} \right), & \text{if}\;\left\| \varphi \left( x_{i} \right) - \varphi_{pcen} \right\| < \left\| \varphi \left( x_{i} \right) - \varphi_{ncen} \right\| \end{cases}$$
(A12)

where \(\mu\) is a constant between 0 and 1, and \(\delta\) is a constant greater than 0 to avoid numerical divergence.
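To make (A10) to (A12) concrete, the following Python sketch computes the memberships of the positive-class samples. It assumes the identity mapping \(\varphi(x) = x\) (the linear case); a kernel-induced feature space would need the distances expressed through the kernel instead. The function names, the default values of \(\mu\) and \(\delta\), and the toy data are hypothetical.

```python
import math


def centroid(points):
    """Class center as in (A10), assuming phi(x) = x."""
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))


def positive_memberships(X_pos, X_neg, mu=0.2, delta=1e-4):
    """Memberships s_{i+} of the positive samples following (A10)-(A12).

    mu and delta are illustrative defaults, not values from the study.
    """
    c_pos = centroid(X_pos)
    c_neg = centroid(X_neg)
    d_pos = [math.dist(x, c_pos) for x in X_pos]  # distance to own center
    d_neg = [math.dist(x, c_neg) for x in X_pos]  # distance to other center
    r_sq = max(d_pos) ** 2                        # squared radius, (A11)
    memberships = []
    for dp, dn in zip(d_pos, d_neg):
        base = 1.0 - math.sqrt(dp ** 2 / (r_sq + delta))
        # First branch of (A12): the sample is at least as far from its own
        # center as from the other center; otherwise the second branch applies.
        weight = mu if dp >= dn else 1.0 - mu
        memberships.append(weight * base)
    return memberships


# Toy 2-D data: the last positive sample sits next to the negative class.
X_pos = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (9.0, 0.0)]
X_neg = [(10.0, 0.0), (11.0, 0.0)]
s_pos = positive_memberships(X_pos, X_neg)
```

In this toy set, the sample at (9, 0) is the farthest from the positive center and closer to the negative center, so it receives a membership near zero, while the sample nearest the positive center receives the largest one.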

The objective function for FTSVM includes the membership function, which reduces the influence of artifacts, and also the term \({\Vert w\Vert }^{2}\), which maximizes the margin. It is expressed as:

$$\begin{gathered} \mathop {\min }\limits_{{w_{ + } ,b_{ + } ,\xi_{ - } }} \frac{1}{2}C_{2} \left\| {w_{ + } } \right\|^{2} + \frac{1}{2}\left\| {X_{ + } w_{ + } + e_{ + } b_{ + } } \right\|^{2} + C_{3} s_{ - }^{T} \xi_{ - } \hfill \\ {\text{subject}}\;{\text{to}} \left\{ \begin{array}{lll} \left( {X_{ - } w_{ + } + e_{ - } b_{ + } } \right) + \xi_{ - } \ge e_{ - } \hfill \\ \xi_{ - } \ge 0 \hfill \\ \end{array} \right. \hfill \\ \end{gathered}$$
(A13)

and

$$\begin{gathered} \mathop {\min }\limits_{{w_{ - } ,b_{ - } ,\xi_{ + } }} \frac{1}{2}C_{2} \left\| {w_{ - } } \right\|^{2} + \frac{1}{2}\left\| {X_{ - } w_{ - } + e_{ - } b_{ - } } \right\|^{2} + C_{4} s_{ + }^{T} \xi_{ + } \hfill \\ {\text{subject}}\;{\text{to}} \left\{ \begin{gathered} \left( {X_{ + } w_{ - } + e_{ + } b_{ - } } \right) + \xi_{ + } \ge e_{ + } \hfill \\ \xi_{ + } \ge 0 \hfill \\ \end{gathered} \right. \hfill \\ \end{gathered}$$
(A14)
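As a hedged illustration of how the three terms of (A13) combine, the Python sketch below evaluates the objective for a fixed \((w_{+}, b_{+})\). It does not solve the optimization problem. It follows the constraint exactly as printed in (A13), so each slack is set to its smallest feasible value \(\max(0, 1 - (x^{T}w_{+} + b_{+}))\) for the negative samples; other twin SVM formulations place a minus sign in front of this term. The function name, parameter values, and data are made up.

```python
def ftsvm_objective_pos(w, b, X_pos, X_neg, s_neg, C2, C3):
    """Evaluate the objective of (A13) for a fixed plane (w_+, b_+).

    Assumption: each slack takes the smallest value allowed by the
    constraints as printed, xi = max(0, 1 - (x^T w + b)).
    """
    def f(x):
        return sum(wj * xj for wj, xj in zip(w, x)) + b

    regularizer = 0.5 * C2 * sum(wj * wj for wj in w)   # margin term
    fit = 0.5 * sum(f(x) ** 2 for x in X_pos)           # plane close to X_+
    slacks = [max(0.0, 1.0 - f(x)) for x in X_neg]
    penalty = C3 * sum(s * xi for s, xi in zip(s_neg, slacks))
    return regularizer + fit + penalty


# Hypothetical plane x1 = 0 and toy samples.
w_pos, b_pos = (1.0, 0.0), 0.0
X_pos = [(0.0, 1.0), (0.0, 2.0)]   # both lie on the plane, so fit = 0
X_neg = [(2.0, 0.0), (0.5, 0.0)]   # the second one violates the constraint
s_neg = [1.0, 0.4]                 # fuzzy memberships of the negative samples

value = ftsvm_objective_pos(w_pos, b_pos, X_pos, X_neg, s_neg, C2=1.0, C3=2.0)
```

In this example the violating sample has a membership of 0.4, so its slack of 0.5 adds only 0.4 to the objective instead of 1.0, which is how a low membership limits the influence of a suspect sample on the plane.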

About this article

Cite this article

Lin, CY., Lu, CF., Lu, HM. et al. Using Fuzzy Classifier in Ensemble Method for Motor Imagery Electroencephalography Classification. Int. J. Fuzzy Syst. 23, 2417–2431 (2021). https://doi.org/10.1007/s40815-021-01108-8

  • DOI: https://doi.org/10.1007/s40815-021-01108-8
