Multi-Level Text Classification for Medical Insights Documents
Problem
Our client faced challenges classifying Medical Insights documents. Classification had previously been done manually, taking around 12 man-hours a day. The client had tried developing a model and ran into the following challenges:
- The dataset is highly imbalanced, and categories must be predicted at multiple levels
- Documents in the dataset are large and noisy, which makes classification complex
- The model needs to be trained for 4 products, and the current approach would end up with 8 models, which could be expensive to deploy
- A cost-saving deployment strategy is needed
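One common way to handle the class imbalance described above is to weight each class inversely to its frequency during training. The sketch below is illustrative only and is not taken from the client's solution: the function name `inverse_frequency_weights` and the sample labels are hypothetical. It uses the same formula as scikit-learn's `class_weight="balanced"` option, `n_samples / (n_classes * count)`.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    With n_samples / (n_classes * count), a perfectly balanced dataset
    gives every class a weight of 1.0, and rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical labels (assumption): 8 documents of a common category, 2 of a rare one.
labels = ["safety"] * 8 + ["dosing"] * 2
weights = inverse_frequency_weights(labels)
print(weights)
```

Weights like these would typically be passed to the loss function so that mistakes on rare categories cost more than mistakes on common ones.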
Solution
MResult developed a cutting-edge AI solution for multi-level text classification using technologies such as Python, NLP, NLTK, Transformers, BERT, deep learning, and fastai, along with pretrained BERT models such as Microsoft PubMedBERT and SapBERT.
A pipeline of solutions was built to meet multiple requirements, such as:
- Baseline the data sources and metrics for information
- Collect key points from these data sources into one location
- Apply NLP and deep learning techniques with Transfer Learning
- Developed a multi-task model using deep learning
- Developed the deployment strategy using SageMaker
- Reduced deployment cost by 50%
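To illustrate the multi-task idea in general terms, a single model can emit one set of scores per category level and be trained on the sum of the per-level losses, instead of training one model per level. The sketch below is a minimal, framework-free illustration under stated assumptions: two category levels, plain cross-entropy per level, and hypothetical names (`cross_entropy`, `multi_task_loss`, `level2_weight`). It does not describe the actual architecture or loss used in this project, and in practice the computation would be done by a deep learning framework on top of a shared encoder.

```python
import math


def cross_entropy(logits, target_index):
    """Cross-entropy for one example, computed from raw logits.

    Uses the log-sum-exp trick (subtracting the max logit) for numerical stability.
    """
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target_index]


def multi_task_loss(level1_logits, level1_target,
                    level2_logits, level2_target, level2_weight=1.0):
    """Joint loss for one document: level-1 loss plus weighted level-2 loss."""
    return (cross_entropy(level1_logits, level1_target)
            + level2_weight * cross_entropy(level2_logits, level2_target))


# Hypothetical head outputs (assumption): 2 level-1 classes, 3 level-2 classes.
loss = multi_task_loss([2.0, 0.5], 0, [0.1, 1.5, 0.3], 1)
print(round(loss, 4))
```

Because both levels share one set of model weights, a single deployed artifact can serve predictions for every level, which is one reason a multi-task design can reduce the number of models to host.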