Methods & Study Design

Machine Learning Process:

Machine Learning Process Flow Chart.png

Data Collection

Data set obtained from Kaggle.com
Contains 11,500 observations of epileptic and non-epileptic EEGs from 500 patients
Response variable: seizure activity class (1-5). Explanatory variables: EEG readings at successive time points (178 total)

Data Preparation

Transformed to binary response variable (1: seizure, 0: no seizure)
Used randomization to split data into Training (80%) and Test (20%) sets
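
The two preparation steps above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the project's actual code: the helper names `binarize` and `train_test_split`, the seed, and the toy rows are all assumptions made for the example.

```python
import random

def binarize(label):
    """Map the original 1-5 label to 1 (seizure) or 0 (no seizure)."""
    return 1 if label == 1 else 0

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then split into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy stand-in for the real data: (178 readings, original label) pairs.
raw_rows = [([0.0] * 178, label) for label in [1, 2, 3, 4, 5] * 20]
rows = [(x, binarize(y)) for x, y in raw_rows]

train, test = train_test_split(rows)
print(len(train), len(test))
```

Fixing the random seed is a design choice for the sketch: it makes the 80/20 split repeatable, so every model is compared on the same test rows.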

Model Training

Three classification models were developed using the training data set: logistic regression, Support Vector Machines (SVM), and Long Short-Term Memory (LSTM)
Cross-validation was used on the training set to improve model performance and guard against over-fitting
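
The page does not say which cross-validation scheme was used. As one common possibility, the sketch below shows how k-fold cross-validation partitions the training rows so that each row is held out for validation exactly once. The function name `k_fold_indices` and the choice of k = 5 are assumptions for illustration only.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, k=5))
for train_idx, val_idx in folds:
    print("validate on", val_idx, "train on", len(train_idx), "rows")
```

A model would be fit on each `train_idx` subset and scored on the matching `val_idx` subset, with the scores averaged across folds.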

Model Evaluation

Model results were compared using the test data set
Confusion matrices and Receiver Operating Characteristic (ROC) curves were used to assess model sensitivity and false positive rate
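
To make the evaluation metrics concrete, here is a small sketch of how sensitivity and false positive rate follow from confusion-matrix counts, and how sweeping the decision threshold traces out ROC points. The labels and scores are made-up toy values, not project results, and all function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = seizure."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """Fraction of true seizures that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def false_positive_rate(fp, tn):
    """Fraction of non-seizure segments flagged as seizure."""
    return fp / (fp + tn) if (fp + tn) else 0.0

def roc_points(y_true, scores):
    """One (FPR, sensitivity) point per distinct threshold, strictest first."""
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in scores]
        tp, fp, tn, fn = confusion_counts(y_true, y_pred)
        points.append((false_positive_rate(fp, tn), sensitivity(tp, fn)))
    return points

# Toy values for illustration only.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.1]
counts = confusion_counts(y_true, [1 if s >= 0.5 else 0 for s in scores])
points = roc_points(y_true, scores)
print(counts, points)
```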

Raw Data.png

• Each observation consists of 178 data points recorded over a 23.6 sec interval (explanatory variables)
• The y column (response variable) classifies each observation as 1-5, which can be transformed to a binary response where 1 is seizure and 0 is no seizure
• Each individual data point is the recorded EEG value for a section of a patient’s brain at a given point in time

EEG Signal 1-5.png

The data set contains 23 segments of EEG data for each of the 500 patients

Each 23.6 sec segment contains 178 readings (0.133 sec interval)

Each segment was classified as 1-5, where 1 contained seizure activity, 2 and 3 were seizure-free intervals, and 4 and 5 were recorded from healthy volunteers

All EEG signals were recorded with the same 128-channel amplifier system and written continuously onto the disk of a data acquisition computer system

Our Classification Models

Logistic Regression

Traditional classification method for binary response variables

Data is fit using a sigmoid function

A decision threshold is applied to the predicted probability to assign each observation to a class

Simple method that generates easily interpretable results

Does not typically perform well with complex data
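
The sigmoid and decision-threshold steps described above can be sketched as follows. The weights, bias, and the 0.5 threshold are placeholder values chosen for illustration, not fitted parameters from this project, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(features, weights, bias):
    """Predicted probability of class 1 (seizure) for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def classify(probability, threshold=0.5):
    """Apply the decision threshold to turn a probability into a 0/1 label."""
    return 1 if probability >= threshold else 0

# Placeholder two-feature example; the real model would have 178 weights.
p = predict_proba([2.0, 4.0], weights=[0.5, -0.25], bias=0.0)
print(p, classify(p))
```

Lowering the threshold would flag more segments as seizures, trading a higher false positive rate for higher sensitivity.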

Logistic Regression.jpg

Support Vector Machines (SVM)

Powerful and flexible method for supervised learning classification

Goal is to find the boundary that separates the data classes with the maximum margin (the gap between the closest data points of different classes)

SVM typically offers accurate results and works well with high-dimensional data

May require longer training times
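
To illustrate the maximum-margin idea, the sketch below measures, for a linear boundary w·x + b = 0, the distance to the nearest correctly classified point. An SVM with a linear kernel searches for the boundary that makes this distance as large as possible. This is not a training algorithm; the toy points, the ±1 label encoding, and the two candidate boundaries are assumptions for the example.

```python
import math

def signed_distance(x, label, w, b):
    """Distance from x to the hyperplane w.x + b = 0.

    Positive when x lies on the side matching its label (+1 or -1).
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return label * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(points, labels, w, b):
    """Distance from the hyperplane to the closest point of either class."""
    return min(signed_distance(x, y, w, b) for x, y in zip(points, labels))

# Toy 2-D data with labels encoded as +1 / -1.
points = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -1.5)]
labels = [1, 1, -1, -1]

margin_a = margin(points, labels, w=(1.0, 1.0), b=0.0)  # diagonal boundary
margin_b = margin(points, labels, w=(1.0, 0.0), b=0.0)  # vertical boundary
print(margin_a, margin_b)
```

Both candidates separate the toy classes, but the diagonal boundary leaves more room around the closest points, so it is the one a maximum-margin method would prefer of the two.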

Support Vector Machines.png

Long Short-Term Memory (LSTM)

Complex method based on Recurrent Neural Networks (RNN)

Unlike traditional neural network methods, the outputs of RNNs depend on prior sequence elements (memory)

LSTM expands the memory of RNNs using memory blocks

LSTM is well suited for complex, time-series data
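
The memory mechanism can be sketched with a single-unit LSTM cell processing one reading at a time. This follows the standard LSTM gate equations with scalar weights, which is a simplification: it is not the architecture used in this project, and the parameter values are arbitrary and untrained.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with a scalar input.

    params maps each gate name to (input weight, recurrent weight, bias).
    """
    pre = {}
    for name in ("forget", "input", "output", "candidate"):
        w_x, w_h, b = params[name]
        pre[name] = w_x * x + w_h * h_prev + b
    f = sigmoid(pre["forget"])       # how much old memory to keep
    i = sigmoid(pre["input"])        # how much new information to write
    o = sigmoid(pre["output"])       # how much memory to expose
    g = math.tanh(pre["candidate"])  # candidate memory content
    c = f * c_prev + i * g           # updated cell state (long-term memory)
    h = o * math.tanh(c)             # updated hidden state (output)
    return h, c

# Arbitrary, untrained parameters for illustration only.
gate_names = ("forget", "input", "output", "candidate")
params = {name: (0.5, 0.5, 0.0) for name in gate_names}

h, c = 0.0, 0.0
for x in [0.1, -0.2, 0.4]:  # a short toy sequence of readings
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

Because `c` is carried from step to step and only partly overwritten, earlier readings in the sequence can still influence later outputs.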

SVM Model Picture.png
