🚀 Data Science & ML Projects
Welcome to my project showcase.
Below you’ll find a selection of technical work demonstrating my skills in machine learning, AI engineering, data visualization, and scientific computing.
🧠 1. LLM Scientific Assistant (RAG System)
Technologies: Python, LangChain, FAISS, OpenAI API, FastAPI
A domain-specific assistant built to:
- retrieve scientific knowledge from thousands of documents
- synthesize multi-source insights
- reduce hallucinations through citation-driven reasoning
- track temporal freshness of data
Key features:
- RAG pipeline with semantic search
- relevance re-ranking
- prompt engineering framework
- production-ready API
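To illustrate the semantic-search and citation idea, here is a minimal sketch in plain Python. It is not the project's code: the toy corpus, the hand-made 3-d vectors, and the helper names `retrieve` and `build_prompt` are all hypothetical, and a real pipeline would presumably use learned embeddings with a FAISS index instead of a brute-force scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, corpus, k=2):
    """Return the k documents most similar to the query, best first."""
    scored = [(cosine(query_vec, doc["vec"]), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_prompt(question, docs):
    """Number each source (with its year) so the answer can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({d['id']}, {d['year']}) {d['text']}"
        for i, d in enumerate(docs, 1)
    )
    return (
        "Answer using only the numbered sources and cite them as [n].\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )

# Hypothetical toy corpus; the vectors stand in for real embeddings.
corpus = [
    {"id": "doc-a", "year": 2021, "text": "T cells express CD3.", "vec": [0.9, 0.1, 0.0]},
    {"id": "doc-b", "year": 2023, "text": "B cells express CD19.", "vec": [0.1, 0.9, 0.0]},
    {"id": "doc-c", "year": 2019, "text": "Prophet models seasonality.", "vec": [0.0, 0.1, 0.9]},
]

top_docs = retrieve([1.0, 0.2, 0.0], corpus, k=2)
prompt = build_prompt("Which marker do T cells express?", top_docs)
print(prompt)
```

Carrying the year alongside each source is one simple way to surface how fresh the cited material is.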
👉 Repository coming soon
🔬 2. Single-Cell ML Pipeline (PhD Project)
Technologies: Python, PyTorch, Scanpy, scVI, UMAP, Leiden
Large-scale ML workflow analysing millions of single-cell RNA-seq datapoints.
Highlights:
- probabilistic models for cell-state inference
- dimensionality reduction
- graph-based clustering
- reproducible pipeline with Snakemake
Includes an interactive dashboard built with Dash + Plotly.
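The graph-based clustering step listed above can be sketched in a few lines. This is a deliberately simplified stand-in, assuming Euclidean distance on toy 2-d points: it builds a k-nearest-neighbour graph and labels connected components, whereas the actual pipeline uses Leiden community detection on a neighbour graph. The function names are hypothetical.

```python
import math

def knn_graph(points, k):
    """Map each point index to the set of its k nearest neighbours."""
    graph = {}
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        graph[i] = {j for _, j in dists[:k]}
    return graph

def connected_components(graph):
    """Label each node by its component, treating edges as undirected."""
    undirected = {i: set(nbrs) for i, nbrs in graph.items()}
    for i, nbrs in graph.items():
        for j in nbrs:
            undirected[j].add(i)
    labels = {}
    current = 0
    for start in undirected:
        if start in labels:
            continue
        stack = [start]
        while stack:
            node = stack.pop()
            if node in labels:
                continue
            labels[node] = current
            stack.extend(n for n in undirected[node] if n not in labels)
        current += 1
    return [labels[i] for i in range(len(graph))]

# Two well-separated toy "cell populations" (made-up coordinates).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = connected_components(knn_graph(points, k=2))
print(clusters)
```

Connected components only separate groups that share no edges at all; community detection such as Leiden can additionally split densely connected graphs, which is why it is preferred on real single-cell data.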
👉 Repository coming soon
📈 3. Time Series Forecasting for Resource Optimization
Technologies: Python, Prophet, LSTM, sklearn, Dash
Pipeline capable of:
- forecasting resource consumption
- evaluating model drift
- providing interactive scenario simulation via a Dash dashboard
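One simple way to evaluate model drift is to compare forecast error on a recent window against error on an earlier baseline window. The sketch below assumes mean absolute error as the metric and a threshold of 2.0; both choices, the function names, and the numbers are illustrative rather than taken from the project.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def drift_ratio(actual, predicted, baseline_size, recent_size):
    """Recent MAE divided by baseline MAE; values well above 1 suggest drift."""
    base = mean_absolute_error(actual[:baseline_size], predicted[:baseline_size])
    recent = mean_absolute_error(actual[-recent_size:], predicted[-recent_size:])
    if base == 0:
        # A perfect baseline: any recent error counts as unbounded drift.
        return float("inf") if recent > 0 else 1.0
    return recent / base

# Made-up consumption series: the forecast tracks well, then falls behind.
actual = [100, 102, 101, 103, 110, 115, 120, 126]
predicted = [101, 101, 102, 102, 104, 105, 106, 107]

ratio = drift_ratio(actual, predicted, baseline_size=4, recent_size=4)
drifting = ratio > 2.0  # hypothetical alert threshold
print(ratio, drifting)
```

A ratio like this is cheap to compute on every new batch of observations and can trigger retraining when it stays above the threshold.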
👉 Repository coming soon
📊 4. Dashboards & Visualization Collection
A collection of professional dashboards using:
- Dash / Plotly
- Altair
- Streamlit
Examples include:
- high-dimensional embeddings
- cluster exploration
- time series anomaly detection
🧪 5. Probabilistic Models for Biological Systems
From early internship work:
- Hidden Markov Models
- stochastic simulations
- parameter inference
Used in computational biology projects.
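As a small illustration of the Hidden Markov Model work, here is a sketch of the forward algorithm, which computes the likelihood of an observed sequence and is a common building block for parameter inference. The two-state DNA model and all probabilities are invented for the example, not taken from the original projects.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under the HMM using the forward algorithm."""
    # alpha[s] = P(observations so far, current hidden state == s)
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs]
            * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical two-state model of AT-rich and GC-rich DNA regions.
states = ("AT_rich", "GC_rich")
start_p = {"AT_rich": 0.5, "GC_rich": 0.5}
trans_p = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}
emit_p = {
    "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}

likelihood = forward("ATGC", states, start_p, trans_p, emit_p)
print(likelihood)
```

For long sequences the raw probabilities underflow, so practical implementations rescale `alpha` at each step or work in log space.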
👉 More projects available on my GitHub:
https://github.com/ton-username