Hi, I’m Muntaha Shams. I build AI products across GenAI (RAG/LLMs), Computer Vision, and ML systems, with production-focused APIs and UI demos.

Email GitHub

Projects

17 projects

#CV #NLP
German Invoice Extraction with OCR + Layout Models
Turn German invoices into structured, validation-ready data using OCR, layout-aware extraction, and API-friendly outputs.
#CV #NLP
Document AI for Structured Field Extraction
Extract business-critical fields from PDFs and scans, then convert them into structured outputs ready for automation.
#GenAI #NLP
Document RAG Assistant for Grounded Q&A
Build document Q&A systems that retrieve relevant context first and generate answers grounded in source material.
#GenAI #NLP
Video RAG for Searchable Long-Form Content
Make long videos searchable by turning transcripts and segments into a retrieval-powered Q&A experience.
#GenAI #NLP
Data-Grounded LLM Writing Assistant with Review UI
Generate domain-specific content from connected knowledge sources through a controllable, review-friendly LLM interface.
#GenAI #CV
Character Consistency with LoRA + Stable Diffusion
Train a LoRA-based image generation workflow that preserves a character’s identity across prompts, poses, and scenes.
#NLP #ML
CV–Job Description Matching with Semantic Ranking
Match CVs to job descriptions with semantic similarity, ranking features, and explainable skill-gap insights.
#CV #NLP
Address Extraction Review Tool for Scanned Documents
Extract address fields from scanned documents and route them through a review UI before export.
#GenAI #NLP #ML
Financial Data Assistant with PostgreSQL + GPT
Let users query structured financial data in natural language and receive clear, application-ready responses.
#ML #NLP
AI Workflow Automation with n8n, APIs & LLMs
Design AI-powered business automations in n8n that connect triggers, APIs, databases, and LLM-based processing.
#CV #ML
Surfing Pose Tracking & Maneuver Recognition
Analyze surfing videos with robust pose tracking and temporal maneuver recognition for coaching-style feedback.
#CV #ML
3D Dental Scan Alignment with Point Clouds
Align dental 3D scans through point-cloud registration and coarse-to-fine geometry matching.
#CV #ML
Real-Time YOLO Object Detection Pipeline
Develop real-time object detection pipelines with YOLOv8 and deployment-ready inference optimization.
#CV #ML
YOLO Segmentation for Visual Inspection
Build segmentation systems for visual inspection and measurement tasks using YOLO-based models.
#GenAI #CV
Diffusion-Guided 3D Asset Generation
Explore diffusion-guided generation of textured 3D assets through modern image and 3D modeling techniques.
#ML
ML Prediction Dashboard with API Serving
Serve machine-learning predictions through APIs and make model outputs easier to interpret in an interactive dashboard.
#CV #Deep Learning
Diabetic Retinopathy Screening with Transfer Learning
Classify retinal fundus images with CNNs and transfer learning to support diabetic retinopathy screening workflows.