Welcome to My Blog
Hello and welcome to my blog! I’m Ujjwal Chowdhury, a data scientist and research analyst based in Kolkata, India. With a Master’s degree in Data Science and over a year of practical experience, I specialize in Natural Language Processing (NLP), Generative AI, and Stock Forecasting. My passion lies in designing and developing innovative solutions using Artificial Intelligence, implementing machine learning models, and collaborating with cross-functional teams to deliver impactful, data-driven insights.
Areas of Expertise
- Natural Language Processing (NLP)
- Generative AI
- Machine Learning Algorithms
- Time Series Analysis
- Algorithm Optimization
- Statistical Analysis
- Data Visualization
- Model Development
- Cross-functional Collaboration
Skills
Data Visualization
- Microsoft Power BI
- Excel
- Tableau
- Seaborn
- Plotly
- Matplotlib
Machine Learning and Deep Learning
- Feature Engineering
- Model Development
- Hyperparameter Tuning
- Neural Networks
- Reinforcement Learning
- Transfer Learning
- Optimization Techniques
- MLOps
Tools/Frameworks
- Python, R
- TensorFlow, PyTorch, Keras, TFLite, MLflow, PySpark
- PostgreSQL
- Azure, AWS
- LangChain, Streamlit, Docker, Pydantic
Natural Language Processing
- Text Generation
- Sentiment Analysis
- Speech Recognition
- Named Entity Recognition
- Text Classification
- LLM Prompt Engineering
Computer Vision
- Image Processing
- Object Detection
- Image Classification
- Image Segmentation
- Image Generation
Data Analysis and Mining
- Data Mining
- Web Scraping
- Statistical Analysis
- Time Series Analysis
- Anomaly Detection
- Predictive Analytics
- Survival Analysis
Soft Skills
- Problem-Solving
- Teamwork
- Active Listening
- Adaptability
- Communication
- Analytical Thinking
Professional Experience
Research Executive (AI & NLP)
Feedsense AI Private Limited, Kolkata, India (Mar 2024 - Present)
- Utilized reinforcement learning models integrating financial data to predict market movements and develop optimized trading strategies.
AI Researcher
Vista Intelligence Private Limited, Kolkata, India (Jan 2023 - Mar 2024)
- Led the NLP team, overseeing project developments and team operations.
- Fine-tuned an RNN-Transducer driven speech-to-text model to effectively capture Indian accents, reducing the Word Error Rate from 56.8% to 23.4%.
- Developed a live audio transcription model for real-time news analysis.
- Created a trade signal generator model integrating live audio, textual news articles, OHLC data, and quantitative techniques with over 75% directional accuracy.
- Developed an auto question generator program for applicant CVs.
- Built a large-document summarizer using the OpenAI API with LangChain.
- Employed a 4-bit quantized Mistral 7B LLM to summarize conference call conversations.
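For readers unfamiliar with the Word Error Rate figures quoted above for the speech-to-text model (56.8% down to 23.4%), here is a minimal sketch of how the metric is commonly computed: the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. This is an illustration only, not the evaluation code used in that project; the function names and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missed)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped relative to its starting value."""
    return (before - after) / before


# Hypothetical sample: one substitution ("sat" -> "sit") and one deletion ("the").
sample_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

With the figures from this post, `relative_reduction(56.8, 23.4)` is roughly 0.588, i.e. the fine-tuning removed a little under 59% of the original word errors.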
Research & Publication
- Investigate How Market Behaves: Toward an Explanatory Multitasking Based Analytical Model for Financial Investments
IEEE Access, March 2024
DOI: 10.1109/ACCESS.2024.3369033
Courses & Certifications
- Artificial Intelligence (AI) for Investments (April 2023) - NPTEL
- Cloud Computing and Distributed Systems (March 2023) - NPTEL
- NISM-Series-XV: Research Analyst (Feb. 2023) - National Institute of Securities Markets
- Database Management System (Oct. 2022) - NPTEL
- Deep Learning for Computer Vision (Oct. 2022) - NPTEL
- Data Science Math Skills (April 2020) - Duke University, Coursera
Education
MSc Data Science
RKMVERI, Belur, West Bengal, India (2021-2023)
BSc Mathematics
Vidyasagar University, Medinipur, West Bengal, India (2017-2020)
Personal Projects
Fin-Bot: Advanced Agent-based Financial Chatbot
Domain: NLP, LLM, Generative AI, Deep Learning, RAG
- Integrated web search functionality for comprehensive query responses.
- Implemented a custom vector database for efficient retrieval of financial news and transcripts.
- Employed an LLM-equipped agent for directed user queries, ensuring comprehensive insights.
Sales Forecasting and Anomaly Detection on Walmart Sales Dataset
Domain: Machine Learning, Time Series Analysis, Deep Learning
- Used Factor Analysis for feature extraction.
- Applied time series, machine learning, and deep learning for sales prediction.
- Used unsupervised techniques for anomaly detection.
Deep Bidirectional LSTM Network for Textual Sentiment Analysis
Domain: Deep Learning, Sentiment Analysis, NLP, Web Scraping
- Integrated Twitter API for real-time tweet scraping.
- Used AsyncHTMLSession to scrape news articles from Google News.
- Leveraged Bi-LSTM architecture for sentiment classification.
Brain Tumor Classification
Domain: Computer Vision, Deep Learning, Optimization Techniques
- Used Transfer Learning and fine-tuned several pre-trained models.
- Explored various optimization algorithms.
- Applied snapshot learning technique for ensemble predictive model construction.
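As a rough illustration of the snapshot idea mentioned above, the sketch below assumes it refers to snapshot ensembling with a cyclic cosine-annealed learning rate: the rate restarts at the beginning of each cycle, decays toward zero, and a model checkpoint is saved at the end of every cycle for later averaging. The function names, step counts, and base learning rate are hypothetical and are not taken from the project itself.

```python
import math


def snapshot_lr(step: int, total_steps: int, n_cycles: int, base_lr: float) -> float:
    """Cyclic cosine-annealed learning rate for a 0-indexed training step."""
    cycle_len = math.ceil(total_steps / n_cycles)
    position = step % cycle_len  # position inside the current cycle
    return base_lr / 2 * (math.cos(math.pi * position / cycle_len) + 1)


def snapshot_steps(total_steps: int, n_cycles: int) -> list[int]:
    """Steps at which a checkpoint would be saved (the last step of each cycle)."""
    cycle_len = math.ceil(total_steps / n_cycles)
    return [min(cycle_len * (k + 1), total_steps) - 1 for k in range(n_cycles)]


# Hypothetical run: 100 steps split into 5 cycles, base learning rate 0.1.
schedule = [snapshot_lr(s, 100, 5, 0.1) for s in range(100)]
checkpoints = snapshot_steps(100, 5)
```

At prediction time, the saved checkpoints would typically have their class probabilities averaged, which gives an ensemble for the cost of a single training run.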
Statistical Analysis of Diet, Exercise, and Fitness
Domain: EDA, Data Visualization, Data Analysis, Statistical Inference
- Collected data using online surveys.
- Employed descriptive statistics for dataset summarization.
- Used Power BI, Tableau, Excel, R, and Python for analysis and visualization.
Stay tuned for more insights, projects, and discussions on data science, AI, and beyond. Feel free to connect with me on LinkedIn or explore my projects on GitHub.
Thank you for visiting my blog!