Analyzing BERT variants for NLP tasks

Project information

  • Project Title: ML Paradigms
  • Skills: AI/ML, NLP
  • Collaborators: Benjamin Wu, Joshua Davis, Michel
  • Project URL: [GITHUB]

Project Description

We explored Transfer Learning, Multi-Task Learning, and Few-shot Learning techniques for Natural Language Processing (NLP) tasks. We developed a deep learning model using PyTorch and pre-trained backbones to perform Named Entity Recognition (NER), Natural Language Inference (NLI), and Machine Reading Comprehension (MRC). We applied data pre-processing and post-processing steps to prepare the data for training and evaluation, including a no-trainer evaluation loop for MRC. Our experiments evaluated the model under different conditions, such as static versus dynamic task weights for multi-task learning and fine-tuning with different amounts of target-domain data.
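The difference between static and dynamic task weights can be illustrated with a minimal sketch. The write-up does not say which dynamic scheme was used, so the example below assumes Dynamic Weight Averaging (DWA), one common choice; the function names, task keys, and temperature value are hypothetical and chosen only for illustration. Plain floats stand in for the per-task loss tensors that a PyTorch training loop would produce.

```python
import math


def static_weighted_loss(losses, weights):
    """Combine per-task losses using fixed, hand-chosen weights."""
    return sum(weights[task] * losses[task] for task in losses)


def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic Weight Averaging (assumed scheme, not confirmed by the project).

    Each task's weight depends on how fast its loss fell over the last two
    epochs: a task whose loss is falling more slowly gets a larger weight.
    The weights always sum to the number of tasks.
    """
    tasks = list(prev_losses)
    # Ratio > 1 means the loss went up; ratio < 1 means it went down.
    ratios = {t: prev_losses[t] / prev_prev_losses[t] for t in tasks}
    exps = {t: math.exp(ratios[t] / temperature) for t in tasks}
    total = sum(exps.values())
    return {t: len(tasks) * exps[t] / total for t in tasks}


# Hypothetical per-task losses from two consecutive epochs.
epoch_1 = {"ner": 1.0, "nli": 1.0}
epoch_2 = {"ner": 0.5, "nli": 0.9}  # NLI is improving more slowly than NER

static_total = static_weighted_loss(epoch_2, {"ner": 0.5, "nli": 0.5})
dynamic = dwa_weights(epoch_2, epoch_1)
dynamic_total = static_weighted_loss(epoch_2, dynamic)
```

With static weights the balance between tasks is fixed before training; with a dynamic scheme such as this one, the slower-improving task (NLI in the toy numbers above) receives the larger weight in the next epoch.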

The experiments yielded promising results, demonstrating the effectiveness of Transfer Learning, Multi-Task Learning, and Few-shot Learning techniques for improving model performance across different domains and tasks. We achieved state-of-the-art results on the SQuAD v2 dataset for MRC using a few-shot transfer approach, where we froze the backbone parameters and used only 10% of the available data. We also gained valuable experience in deep learning, PyTorch, and experimental design and analysis.
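The "10% of the available data" part of the few-shot setup can be sketched as a reproducible random subsample. This is an illustrative assumption about how such a subset might be drawn, not the project's actual data pipeline; `sample_fraction`, the seed, and the placeholder example list are hypothetical.

```python
import random


def sample_fraction(examples, fraction=0.1, seed=0):
    """Return a reproducible random subset containing `fraction` of `examples`."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    examples = list(examples)
    if not examples:
        return []
    # Keep at least one example so very small datasets still yield a subset.
    k = max(1, round(len(examples) * fraction))
    rng = random.Random(seed)  # local RNG: does not disturb global random state
    return rng.sample(examples, k)


# Placeholder stand-ins for training examples (hypothetical IDs, not real data).
train_examples = [f"example-{i}" for i in range(100)]
few_shot_subset = sample_fraction(train_examples, fraction=0.1, seed=42)
```

Fixing the seed keeps the subset identical across runs, which makes comparisons between different data fractions fair. The other half of the setup, freezing the backbone, is typically done in PyTorch by setting `requires_grad = False` on the backbone's parameters so that only the task head is updated.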

Overall, this project was a great opportunity to apply and expand our knowledge of NLP and deep learning techniques, and to contribute to the growing body of research in this field.