Possessing an exceptional research background in Machine Learning and Artificial Intelligence, I am a consummate Machine Learning Engineer and Researcher with over five years of experience and a proven track record of exceeding expectations in developing and managing scalable production systems. My comprehensive grasp of the software development lifecycle empowers me to proficiently navigate projects from inception to execution, and on to diligent maintenance.
Complementing my technical acumen, my exemplary collaboration and communication skills have consistently facilitated seamless integration within interdisciplinary teams, ensuring a cohesive alignment of varied stakeholders around a shared vision. Driven by an unyielding thirst for knowledge, I actively pursue opportunities for continued learning to remain at the forefront of emerging trends and technologies.
As a proactive problem solver and innovative thinker, my adaptability allows me to thrive in dynamic environments, meeting challenges head-on and devising creative solutions. My unwavering commitment to excellence positions me as an ideal candidate for Machine Learning Engineering and Research-based roles, where I aspire to make a lasting, positive impact on the organizations I serve and contribute meaningfully to the advancement of the industry.
August 2021 – December 2023
Thesis: Generating Fluent sentences through Curriculum Learning and Disfluency Augmentation https://hdl.handle.net/1969.1/203038
Advisor: Prof. James Caverlee
Thesis Committee Members: Prof. Ruihong Huang, Prof. Jinsil Seo
Courses: Machine Learning, Software Engineering, Pattern Recognition, Deep Learning, Natural Language Processing, Analysis of Algorithms, Operating Systems, Data Mining, Object Oriented Programming, Parallel Computing
Holding a Master of Science in Computer Science, awarded with a Graduate Scholarship, my academic journey culminated with a pioneering thesis entitled "Generating Fluent Text Through Curriculum Learning and Disfluency Augmentation". This groundbreaking research, conducted at Texas A&M University's renowned InfoLab, achieved state-of-the-art (SOTA) results in its domain. Under the expert tutelage of Professor James Caverlee, my work contributed significantly to the field of natural language processing.
In parallel, my role as a Research Assistant at the Soft Interaction Lab at TAMU, guided by Professor Jinsil Hwaryoung Seo, involved spearheading innovative projects in the research and development of advanced conversational AI systems. This work primarily focused on the training of Large Language Models, pushing the boundaries of machine learning and human-computer interaction.
August 2015 – May 2019
Courses: Data Structure and Algorithms, Operating Systems, Computer Architecture and Organization, VLSI, Antenna Theory, Analog Circuits, Digital Signal Processing, Digital Systems
Activities and societies: Member of the Official Robotics Club of NIT Durgapur, Centre for Cognitive Activities (CCA)
Tech Stack: Python, PyTorch Lightning, NVIDIA NeMo, JAX, TensorFlow, Docker, AWS Bedrock, SageMaker, EKS, ECS
1. Architected and implemented the AGI organization’s first automated Model Lineage Tracking System, enabling 100% traceability of model ancestry and production datasets across multiple model variants during training, becoming critical infrastructure for training reproducibility, debugging, compliance, and audit approvals across 10k+ production training runs.
2. Prototyped a bridge between JAX and the Sharding Data Loader (SDL) infrastructure, creating a standalone DataModule that preserved SDL’s distributed capabilities while achieving performance comparable to the PyTorch Lightning-based implementation.
3. Designed and implemented a high-performance multi-modal dummy data loader achieving 2x faster data loading through a custom sample multiplexing algorithm compared to PyTorch’s SampleMultiplexer, enabling isolated testing across a combination of 5 modalities (text, image, video, audio, speech) and accelerating 1k+ MFU (Model FLOPs Utilization) optimization experiments.
4. Developed a batch consistency and reproducibility verification system via cumulative hashing and verification of training batches, ensuring 100% data integrity across all distributed training runs and reducing debugging time by 60% for data-related issues.
5. Integrated MLflow for Nova model training, centralizing experiment tracking and reducing experiment analysis time by 75% through automated logging of 50+ training metrics, git details, and system configurations across 10k+ training runs.
6. Developed a checkpoint management system enabling precise control over loading of optimizer states, learning rate scheduler states, and data checkpoints while resuming training. This was especially used for rampdown training across model variants.
7. Engineered a proactive CPU memory monitoring system providing real-time tracking and early OOM (out-of-memory) warning capabilities across all training toolkit executions.
8. Engineered an automated system that verifies training datasets against approved data manifests at the start of model training, ensuring proper dataset usage in training models and 100% validation coverage for all production runs.
9. Worked on pre-training (PT) and supervised fine-tuning (SFT) of Amazon Nova Sonic, a cutting-edge speech-to-speech model that seamlessly integrates speech recognition, understanding, and generation to power more natural, intelligent voice experiences.
10. Featured by Amazon Careers on LinkedIn for impactful work as an ML Engineer.
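As an illustration of the batch consistency check described in item 4 above, the following is a minimal sketch of cumulative hashing, not the production system. It assumes each batch is a plain list of integer token IDs; the function names `cumulative_batch_hashes` and `first_divergence` are hypothetical and chosen only for this example.

```python
import hashlib
from typing import Iterable, List, Optional, Sequence


def cumulative_batch_hashes(batches: Iterable[Sequence[int]]) -> List[str]:
    """Return one running SHA-256 digest per batch.

    Each digest covers every batch seen so far, so two runs agree at step k
    only if all batches up to and including step k were identical.
    """
    running = hashlib.sha256()
    digests = []
    for batch in batches:
        # Length prefix keeps batch boundaries unambiguous:
        # [1, 2], [3] must not hash the same as [1], [2, 3].
        running.update(len(batch).to_bytes(8, "little"))
        for token_id in batch:
            # Fixed-width, deterministic serialization of each ID.
            running.update(int(token_id).to_bytes(8, "little", signed=True))
        digests.append(running.hexdigest())
    return digests


def first_divergence(run_a: List[str], run_b: List[str]) -> Optional[int]:
    """Return the first step at which two digest sequences differ, or None."""
    for step, (a, b) in enumerate(zip(run_a, run_b)):
        if a != b:
            return step
    if len(run_a) != len(run_b):
        # One run produced more batches than the other.
        return min(len(run_a), len(run_b))
    return None


if __name__ == "__main__":
    reference = cumulative_batch_hashes([[1, 2], [3, 4], [5, 6]])
    resumed = cumulative_batch_hashes([[1, 2], [4, 3], [5, 6]])
    print(first_divergence(reference, resumed))
```

Because each digest depends on all earlier batches, comparing only the latest digest of two runs is enough to confirm that their entire data history matches, and scanning the digest sequences locates the first step where they part ways.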
Tech Stack: Java, JavaScript, TypeScript, AWS - Lambda, DynamoDB, Athena, S3, EC2, VPC, CDK, IAM
1. Developed OmniBot, an AI tool that boosted individual efficiency by 2-3 hours daily. Leveraging a Tampermonkey script for web extension integration and interfacing with AWS Bedrock models via AWS Lambda and API Gateways, OmniBot excelled in grammatical correction, comprehensive code analysis, issue identification, and contextual information retrieval directly from active web pages.
2. Led full-stack development initiatives, delivering 15+ features across Java, TypeScript, & Python, resulting in 40% latency reduction.
3. Developed automation scripts that reduced operational tracking time by 75%, handling data from 50+ microservices and enabling real-time monitoring of KPIs, leading to 30% faster operational decision-making and 60% decrease in manual reporting efforts.
4. Worked with cross-functional teams to align software solutions with business goals, from concept to deployment.
Tech Stack: Python, C#, TensorFlow, PyTorch, Hugging Face, GCP
1. Developed a Conversational AI-driven interview trainer web app by parameter-efficient fine-tuning of a quantized LLAMA 2 model with Low-Rank Adaptation (LoRA) using Lit-gpt. The system performs resume analysis, personalized questioning, & context retention in custom-built memory components for enhanced training through dialogue. Hosted on Google Cloud Platform.
2. Fine-tuned the LLAMA 2 model to produce more human-like responses by integrating appropriate emotional and disfluency cues.
3. Spearheaded the development of Conversational AI-integrated Virtual Reality & Web applications that serve as virtual patients for Texas A&M School of Nursing students, replacing manual training methods. Adopted & highly acclaimed by the Nursing School.
4. Led and mentored a team of 4 Computer Science graduate researchers across 3 concurrent research projects, resulting in 3 successful paper publications.
Tech Stack: Python, TensorFlow, PyTorch
1. Completed Master's thesis under Prof. James Caverlee's guidance, achieving state-of-the-art results in disfluency removal and fluent text generation tasks using lightweight Large Language Models with Disfluency Augmentation and Curriculum Learning. https://hdl.handle.net/1969.1/203038
2. Conducted research on harnessing Dense Passage Retrieval, Retrieval Augmented Generation, & Large Language Models to advance question-answering performance on the MultiDoc2Dial & Wizard of Wikipedia datasets under Prof. James Caverlee's guidance.
Tech Stack: Java, JavaScript, TypeScript, AWS - Lambda, DynamoDB, Athena, S3, EC2, VPC, CDK, IAM
1. Developed a full-stack application that collects run-time customer-data consumption details from internal services and analyzes them to show data consumption statistics and access limitations for each service in a dashboard.
2. Gave service owners a single view of data utilization details, access limitations, and possible security breaches, eliminating the manual effort previously needed to find them.
Tech Stack: Python, C++, SNPE, QNN, AIMET, TensorFlow, PyTorch, Hugging Face, ONNX
1. Optimized trained Neural Network models (of Samsung, OnePlus, & other Qualcomm customers) utilizing model compression, quantization, & pruning techniques, to run the models efficiently on the digital signal processor of Snapdragon chipsets.
2. Implemented critical feature requests in Snapdragon Neural Processing Engine SDK to enhance its functionalities.
3. Developed a new Recommendation System to give suggestions of similar Salesforce issues raised by customers in the past for newly raised customer issues, with a reported accuracy of 74% across various engineering divisions of Qualcomm.
4. Developed a widely used (more than 5000 internal users/month) Automation Software to automatically download (Selenium), intelligently parse, & generate error logs & reports from device crash dumps sent by customers in Salesforce.
5. Identified and fixed critical Docker, bokeh server, and documentation bugs in AIMET (Artificial Intelligence Model Efficiency Toolkit)
Tech Stack: Google Dialogflow, SAP, JavaScript
1. Developed an AI chatbot using Google Dialogflow and SAP (Systems, Applications & Products in Data Processing) Cloud Platform to exchange queries and data with the SAP cloud database in real time.
2. The chatbot allowed engineers to query and populate the database easily and efficiently, saving 70% of their manual labor.
3. Project Documentation Link is available here: Link
In Amazon Technical Reports (2024)
Worked on Amazon Nova Sonic Model as part of Amazon Artificial General Intelligence Team
Available at: https://www.amazon.science/publications/amazon-nova-sonic-technical-report-and-model-card
Master's Thesis published by Texas A&M University
Thesis defended in-person at Texas A&M University, College Station, Texas, on 14 November 2023
Available at: https://hdl.handle.net/1969.1/203038
In Proceedings of the 25th International Conference on Artificial Intelligence in Education (AIED 2024)
Paper presented in-person at the International Conference on Artificial Intelligence in Education 2024 (July 8-12, 2024), Recife, Brazil
Available at: https://link.springer.com/chapter/10.1007/978-3-031-64315-6_44
In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4311–4321, Torino, Italy. ELRA and ICCL.
Paper presented in-person at the LREC-COLING 2024 conference, May 20-25, 2024, Torino, Italy. Pre-recorded presentation link for the conference: https://www.youtube.com/watch?v=8VIDlocdaco
Available at: https://aclanthology.org/2024.lrec-main.385/
In arXiv Preprint, March 2024
Available at: https://arxiv.org/abs/2404.01339
In Human-Computer Interaction - INTERACT 2023 - 19th IFIP TC 13 International Conference. Lecture Notes in Computer Science, vol 14145. Springer, Cham.
Paper presented virtually at INTERACT 2023, York, UK.
Available at: https://link.springer.com/chapter/10.1007/978-3-031-42293-5_43
In International Conference on Artificial Intelligence in Education (pp. 701-707). Cham: Springer Nature Switzerland.
Paper presented virtually at the International Conference on Artificial Intelligence in Education 2023, Tokyo, Japan.
Available at: https://link.springer.com/chapter/10.1007/978-3-031-36272-9_59
In 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (pp. 660-663). IEEE.
Paper presented in-person at the ECTI-CON 2018 conference, in Chiang Rai, Thailand.
Available at: https://ieeexplore.ieee.org/document/8619860
Featured by Amazon Careers on LinkedIn for impactful work as an ML Engineer
Awarded funding from the Academy of Visual and Performing Arts (AVPA) of Texas A&M University to present my research paper at the 24th International Conference on Artificial Intelligence in Education, AIED 2023
Received a scholarship of $10,205/year for 2 years in a row from the Department of Computer Science and Engineering of Texas A&M University, College Station
Earned multiple Professional Excellence Awards from Qualcomm for independently developing two pivotal Natural Language Processing and automation-based software tools, and for enriching the Artificial Intelligence Model Efficiency Toolkit (AIMET) through significant open-source contributions, all while exceeding the scope of my core responsibilities