Discussion: Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models #59
Replies: 24 comments 19 replies
-
@myduong-0420, thanks for initiating the discussion! I want to add to the discussion by sharing a summary of the trends in recent foundational models for EEG.
While I have listed references, they are not formatted as formal citations. I just wanted to make some quick notes and discuss them with the community. Happy to discuss this further in the forum! Other papers: Intro: email: rajpuraparam[at]iitgn[dot]ac[dot]in
-
Hi everyone,
-
Hi everyone, I'm really interested in contributing to the project "Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models." I’ve worked on a minor project under a doctor from Brown University (Providence, Rhode Island, on historic College Hill, U.S.A.), where I gained experience in EEG signal processing. In that project, I worked on data conversion, preprocessing, and dominant frequency analysis, with the results exported to CSV. This gave me a solid understanding of handling EEG signals and working with raw data.
I also explored models such as LGBM and RF for analyzing EEG signals and have some familiarity with CNN, KNN, XGBoost, and SVM for signal analysis. Additionally, I have experience with Graph Neural Networks (GNN) for EEG data, where I worked on EEG-based pattern recognition and feature extraction. Through this, I learned how to handle noise and complex signal variations in EEG datasets, which improved overall model accuracy.
Recently, I’ve been reading more about the biological aspects of EEG signals, including how brain waves are generated and the physiological significance of different EEG patterns. Understanding these biological foundations has helped me interpret EEG data more effectively and refine my preprocessing techniques.
I’m eager to learn more about the project and how I can contribute effectively. It would be helpful to get some insight into the current codebase and the specific dataset being used. I also plan to start posting content on EEG data processing, model implementation, and related insights on my GitHub profile, which might help others who are getting started. Looking forward to collaborating with you all!
Thanks,
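For anyone getting started with the dominant frequency analysis and CSV export mentioned above, here is a minimal sketch. It is not the code from that project: the function names, the channel names, the 256 Hz sampling rate, and the synthetic sine-wave "recordings" are all assumptions made for illustration. It takes the largest non-DC peak of the FFT amplitude spectrum per channel and writes one row per channel.

```python
import csv
import io

import numpy as np


def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest peak in the amplitude spectrum."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove the DC offset so bin 0 cannot win
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    return float(freqs[np.argmax(spectrum)])


def dominant_frequencies_to_csv(channels, fs):
    """Build CSV text with one row per channel: name, dominant frequency in Hz."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["channel", "dominant_hz"])
    for name, samples in channels.items():
        writer.writerow([name, f"{dominant_frequency(samples, fs):.2f}"])
    return buffer.getvalue()


# Synthetic stand-in for real recordings (assumed values, not real EEG).
fs = 256  # assumed sampling rate in Hz
t = np.arange(0, 4, 1 / fs)  # 4 seconds of samples
channels = {
    "C3": np.sin(2 * np.pi * 10 * t) + 0.3 * np.sin(2 * np.pi * 25 * t),
    "C4": np.sin(2 * np.pi * 6 * t),
}
csv_text = dominant_frequencies_to_csv(channels, fs)
print(csv_text)
```

On real data a windowed estimate such as Welch's method is usually more stable than a single FFT, but the idea is the same.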
-
Hi everyone, I'm Vaibhav Kanojia, a B.Tech student in Computer Science at Delhi Technological University (DTU), with a strong passion for machine learning, AI, and healthcare innovations.
Recently, I had the honor of winning Impulse 2025, a hackathon in collaboration with AIIMS Delhi, where my team and I developed an advanced EEG Seizure Classification model. The project involved the use of Generative Adversarial Networks (WGAN-GP) to generate synthetic EEG data, along with advanced signal processing techniques for feature extraction and model explainability using SHAP and saliency maps. This work significantly improved classification accuracy and ensured transparency in clinical applications, which was highly appreciated in the healthcare domain.
With my expertise in signal processing, machine learning, and AI model explainability, I am eager to contribute to the Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models project. I have hands-on experience with EEG data, advanced feature extraction methods (Fourier, Wavelet), and deep learning models, making me well-equipped to assist in enhancing and extending the visualizations and algorithms of the framework. I look forward to collaborating with you all to build a robust and impactful solution in the field of EEG data analysis.
Thanks,
-
As a part of transformer model training, this reminds me of masked word modelling in NLP, but I personally think that the masking approach is not super robust on EEG. For text, we have a very large number of corpora, so we can capture a broad range of semantic dependencies and do effective masked word prediction. For EEG, on the other hand, the available training data is quite limited while transformer models need a lot of it, and EEG data generally exhibits high non-stationarity, encodes both spatial and temporal information, and has low SNR, like you mentioned. This is partly why I myself am a bit hesitant to consider the transformer architecture, even though it is very popular in the current literature.
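To make the masking idea concrete, here is a minimal sketch of how masked modelling is typically set up on EEG, by analogy with masked word prediction: the recording is cut into fixed-length patches, a random subset is hidden, and the loss is computed only on the hidden patches. Everything here is an assumption for illustration (patch length, mask ratio, random noise in place of EEG, and a placeholder "model" that predicts zeros); it does not reproduce any specific paper's method.

```python
import numpy as np


def patchify(eeg, patch_len):
    """Split a (channels, samples) array into (channels, n_patches, patch_len)."""
    n_channels, n_samples = eeg.shape
    n_patches = n_samples // patch_len
    trimmed = eeg[:, : n_patches * patch_len]  # drop the incomplete tail patch
    return trimmed.reshape(n_channels, n_patches, patch_len)


def random_patch_mask(n_channels, n_patches, mask_ratio, rng):
    """Boolean mask (True = hidden), with the same number hidden per channel."""
    n_masked = int(round(mask_ratio * n_patches))
    mask = np.zeros((n_channels, n_patches), dtype=bool)
    for ch in range(n_channels):
        hidden = rng.choice(n_patches, size=n_masked, replace=False)
        mask[ch, hidden] = True
    return mask


def masked_mse(target, prediction, mask):
    """Mean squared error computed over the hidden patches only."""
    squared_error = (target - prediction) ** 2
    return float(squared_error[mask].mean())


rng = np.random.default_rng(0)
eeg = rng.standard_normal((4, 1000))  # random noise standing in for 4-channel EEG
patches = patchify(eeg, patch_len=100)
mask = random_patch_mask(4, patches.shape[1], mask_ratio=0.5, rng=rng)

# The model only sees visible patches; hidden ones are zeroed out here.
model_input = np.where(mask[..., None], 0.0, patches)

# Placeholder for a real network: it simply predicts zeros everywhere.
prediction = np.zeros_like(patches)
loss = masked_mse(patches, prediction, mask)
print(patches.shape, int(mask.sum()), loss)
```

The concerns raised above show up directly in this setup: with low SNR, much of what the model is asked to reconstruct in a hidden patch is noise, and with non-stationary signals the visible patches may say little about the hidden ones.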
-
Hi everyone, This project on EEG data analysis with pre-trained models sounds fascinating! While I don’t have direct experience with EEG data, I have worked on disease outbreak prediction using supervised classification models. I’m familiar with data preprocessing, feature engineering, and model training in frameworks like TensorFlow and Scikit-learn. I’m eager to learn more about EEG signal processing and would love to contribute, even if it means starting with smaller tasks like documentation, preprocessing pipelines, or model evaluation. Could you guide me on how best to get started? Looking forward to collaborating!
-
Given the technical complexity of the projects and accelerated project timelines (12 weeks), we require contributors submitting proposals to possess:
- Demonstrated expertise in EEG signal processing
- Prior hands-on experience with neurophysiological data analysis workflows
This eligibility criterion minimizes onboarding delays.
-
Hi everyone, I'm Shubham Vishwakarma, a recent ECE graduate from DJSCE, currently working as a Data Scientist at a private company that provides data-driven solutions to clients. I have a strong interest in the intersection of AI and Neurology and am eager to contribute to the project "Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models."
Previously, I worked as a research intern at IIT Patna, where I focused on finding the optimal rank for fine-tuning LLMs in a federated setting and implemented a research paper using PyTorch. I have published a research paper related to Stable Diffusion and am currently working on another paper focused on the early prediction of Alzheimer's disease using EEG and LLMs. I am one of the co-authors of this paper, along with my university professor.
My work has directly tackled complex feature extraction and pattern recognition in raw EEG signals, aligning closely with the objectives of this project. I have developed a strong understanding of EEG data characteristics, preprocessing techniques, and noise handling. Additionally, my capstone project involved using ultrasound imaging for the rapid diagnosis of ACL injuries using deep learning.
@zeydabadi, could you please provide a brief overview of any pre-tasks or additional steps I should complete before submitting my application? I would really appreciate any guidance on how to best prepare for this opportunity.
-
Is the data labeled?
-
How can Explainable AI (XAI) be integrated into EEG-based models to make predictions more interpretable for neuroscientists?
-
Hello everyone, I’m Ishaan Sharma, a B.Tech student in Computer Science and Biosciences at Manipal University Jaipur. I’m passionate about integrating AI into biosciences, especially in neural signal analysis. I previously worked on an EEG-based attention detection model, where I leveraged deep learning to classify attention levels from EEG data. The project involved preprocessing EEG signals, feature extraction, and building a CNN for classification.
Beyond this, I have also worked on multiple AI-driven projects, such as:
- Multiple Lung Disease Detection, using deep learning to classify lung diseases.
- Yield Prediction from the aqueous phase of bio-oils, using machine learning and artificial neural networks (ANNs).
- Robustness Analysis of Pretrained CNNs and ViTs, using AdvGANs and generative AI to evaluate model vulnerability against adversarial attacks.
- Cell classification based on genome expression, using the 10xGenomics dataset.
The Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models immediately caught my interest as it aligns with my expertise and passion. I’m excited about the opportunity to contribute to this project by applying my experience in AI-driven neuro-signal analysis and learning from the incredible community behind it!
-
Hi, I'm Maureen, a 500L undergraduate student at the University of Nigeria, Nsukka. I'm interested in the Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models project.
-
What kind of testing and validation strategy do we want to implement to ensure the robustness and reliability of the framework?
-
I am Nnaemeka, a pharmacology major and machine learning engineer interested in applied ML/AI in healthcare. I am interested in contributing to and helping develop the project Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models.
-
This is exactly the kind of initiative the EEG research community needs right now. Open accessibility, combined with cutting-edge foundation models, can transform how we process, analyze, and interpret neural signals, especially in data-scarce clinical settings.
From my end, I’ve been actively working on seizure prediction models using EEG, where I’ve implemented CNN-LSTM architectures to learn spatial-temporal patterns from raw recordings. I've also used Butterworth band-pass filters to clean the signal and isolate the frequency bands relevant to seizure dynamics, which has improved model performance significantly. I have been implementing this on the CHB-MIT dataset available on Kaggle.
I’m especially intrigued by this framework's potential to:
- Incorporate pre-trained foundation models for transfer learning and cross-subject generalization, tackling one of the biggest challenges in EEG analysis.
- Enable temporal anomaly detection and support real-world conditions like multi-channel synchronization and domain adaptation.
- Provide support for Spiking Neural Networks (SNNs), a powerful addition for temporal precision and energy-efficient modeling in EEG applications.
- Include explainability tools critical for clinical relevance, along with modular, interpretable pipelines for flexible experimentation.
- Offer streamlined model export (e.g., ONNX, TensorFlow Lite) for real-time deployment on edge devices.
While I’m continuing to deepen my understanding of advanced neural decoding techniques and SNNs, I bring hands-on experience in designing ML workflows for biosignals and evaluating models on EEG-based seizure datasets. I’m looking forward to contributing insights, testing cross-dataset generalizability, and learning from this project’s development.
This initiative has the potential to standardize and democratize advanced EEG analytics for both researchers and clinicians. Excited to be part of the community around it!
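For readers unfamiliar with the Butterworth band-pass step mentioned above, here is a minimal sketch using SciPy. It is not the poster's pipeline: the 1 to 40 Hz pass band, the filter order, the 256 Hz sampling rate, and the synthetic test signal are all assumptions chosen for illustration. It uses second-order sections with zero-phase (forward-backward) filtering, a common choice for offline EEG because it avoids phase distortion.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(eeg, fs, low_hz, high_hz, order=4):
    """Zero-phase Butterworth band-pass along the last (time) axis."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", output="sos", fs=fs)
    return sosfiltfilt(sos, eeg, axis=-1)


def amplitude_at(signal, fs, hz):
    """Amplitude of the spectral bin closest to the requested frequency."""
    spectrum = np.abs(np.fft.rfft(signal)) / (signal.size / 2)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    return float(spectrum[np.argmin(np.abs(freqs - hz))])


# Synthetic signal (assumed values): a 10 Hz rhythm plus 60 Hz interference.
fs = 256.0
t = np.arange(0, 8, 1 / fs)
raw = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)

filtered = bandpass(raw, fs, low_hz=1.0, high_hz=40.0)

print("10 Hz amplitude before/after:", amplitude_at(raw, fs, 10), amplitude_at(filtered, fs, 10))
print("60 Hz amplitude before/after:", amplitude_at(raw, fs, 60), amplitude_at(filtered, fs, 60))
```

Because `sosfiltfilt` runs the filter forward and backward, the effective attenuation is that of the filter applied twice; for real-time use a causal filter (`sosfilt`) would be needed instead.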
-
Hi everyone! I'm Shuting Xie. I'm excited to join this community as part of Google Summer of Code 2025, working on the project: Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models. I've submitted my proposal, and I would sincerely appreciate any feedback or suggestions.
About Me
I'm currently working as an ML Researcher at the Krembil Brain Institute, University Health Network, where I:
This experience sparked my passion for building generalizable neural representations from biomedical signals like EEG, which aligns closely with this GSoC project.
Skills & Tools
Thoughts on the EEG Foundation Model Project
This project aims to build an open-source EEG foundation model trained with self-supervised learning on large-scale unlabeled EEG data. I’ve compiled a summary table comparing key properties of existing EEG foundation models, please check the link: My draft for the key components of the EEG foundation model include:
Questions
Contact
Shuting Xie
-
Hello Everyone, At 22 years old, I am actively learning machine learning while also working on hands-on projects and teaching support classes. My goal is to become a professional in AI applied to cybersecurity and healthcare. This project would be a great opportunity to strengthen both my practical and scientific skills.
Project Summary (as I understand it): This project aims to build an open-source framework for advanced analysis of EEG signals using pre-trained foundation models. The idea is to leverage publicly available EEG datasets to train a deep model capable of automatically extracting reusable representations for various downstream tasks such as brain signal classification or anomaly detection. The project includes signal preprocessing, feature extraction, development of a deep model (such as an autoencoder or transformer), and integration into a usable open-source package.
Expected Contributions:
- Develop a complete EEG signal processing pipeline.
Technical Skills:
- Languages: Python (advanced), Bash (intermediate)
Even though I’m still building my experience, I am highly motivated, a fast learner, and truly excited about this opportunity. Thank you for your time and for working on such an exciting and impactful topic!
Best regards,
-
Hi everyone,
Technical Qualifications:
Alignment with Project Requirements:
Deliverables Commitment:
Availability:
Project Contributions Ideas
Project Timeline (Full-Time: 35h/week)
Email: 0amam3@gmail.com
-
Motivation
Relevant Background
Understanding of the Project
Proposed Contributions
Timeline
Best Regards,
-
Hi, I'm Sahithi Madas, a graduate student in Data Science at SUNY Albany and a Data Engineer Intern at Albany County Local Government, NY. I've been exploring for a while now, looking for projects where AI and data analytics meet the healthcare space. Your project on "Adaptive Closed-loop Neuromodulation" immediately grabbed my attention. The idea of using reinforcement learning in a closed-loop system to personalize brain stimulation, especially for Parkinson's disease, really resonates with me.
I've worked on public health-related data analysis with the New York State Department of Health as a Data Analyst Intern, where I built dashboards and machine learning models involving large-scale health and environmental datasets. Additionally, I've developed predictive models for "Disaster Risk" and "Resale Car Price Predictor" and have experience building clean ML pipelines with Python, PyTorch, and machine learning libraries. While I haven't worked directly with EEG data yet, I've done real-time signal processing using OpenCV, and I'm confident in my ability to apply similar principles here. I'm particularly excited by the idea of contributing to the data preprocessing and model training stages in reinforcement learning (RL).
This project feels like a meaningful step in bringing AI and healthcare together, something I have always talked about and am deeply passionate about contributing to.
Best,
-
Hi everyone! 👋 I’ve been exploring the “Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models” project and I’m really excited about its potential. I’ve worked on AI/ML-based mental health tools in hackathons and have experience with Python, PyTorch, and NLP models. I’m currently preparing my GSoC proposal and would love any guidance on:
Looking forward to learning from and contributing to this incredible project! 🙌
Best,
-
Hi, I’m P. Y. Rajkamal Tutu, an M.Tech Artificial Intelligence student at NIT Silchar, also pursuing a B.S. in Data Science and Applications from IIT Madras in parallel. I’m passionate about brain-computer interfaces, cognitive modeling, and the intersection of AI, neuroscience, and generative modeling. I’ve previously worked on explainable AI for Indic-language spam classification and a few other Kaggle projects. I enjoy working with LLMs, visual-language models, and models that combine structure and reasoning in biological contexts. I believe GSoC 2025 is an ideal opportunity for me to contribute to high-impact open-source research, collaborate with domain experts, and grow technically and intellectually through mentorship and community engagement.
E-mail: tutuponnekanty@gmail.com
Thank you!
-
Hello Everyone, My name is Abdeldjalil Lamara, and I’m currently pursuing a Master’s in Bioinformatics with a strong foundation in data engineering and AI. I’ve been deeply involved in projects that combine data processing, machine learning, and real-world impact, which is why I’m particularly excited about Project 2: Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models. I’ve worked with Python, PyTorch, and MNE for EEG data exploration and am especially interested in the intersection of neuroscience and deep learning. The opportunity to contribute to a foundation-model-based EEG framework aligns perfectly with my skills and aspirations. I’d love to get involved whether it’s through early contributions, discussions, or exploring relevant repositories. Could you please share how I can begin engaging with the project or if there are any resources you'd recommend reviewing? Looking forward to learning from and contributing to the community.
Best regards,
-
Hello Everyone, My name is Amaan Arif, and I am an undergraduate student in Bioinformatics with a growing interest in AI and machine learning, particularly in the application of these technologies to biological data analysis. I’ve worked with Python and have a foundational understanding of data engineering, machine learning, and deep learning algorithms. I am excited about the opportunity to contribute to the Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models project, as it aligns perfectly with my academic background and passion for neuroscience and AI. I’m eager to get involved, whether through discussions, code contributions, or by exploring relevant research and repositories. I would appreciate any guidance on how I can begin engaging with the project. Looking forward to collaborating and learning from the community. Best regards, |
-
Hi everyone,
I saw that a lot of people want to contribute to the project "Open-Source Framework for Advanced EEG Data Analysis Using Pre-trained Foundation Models", so I want to start this discussion thread by sharing a few resources (mostly papers) that I found on EEG/signal processing and the ML pipeline. Hopefully we can continue the conversation from here!
Book:
Papers:
The project description is pretty general, but one of the mentioned papers has pointed out several components of the data processing pipeline, which I think should help. Besides, the project we are exploring seems to focus on Pretrained Foundation Models, so I have been thinking about which architectures or models out there would best suit each of the components. I want to share this with anyone interested, and would love to learn from y'all too.
Cheers,
Amy