Chinmay Sahu, Ph.D.

Senior AI Research Scientist

Thales Digital Identity & Security

#1 NIST IREX-10 Global Ranking (2023)

Computer Vision & Multi-Modal AI Researcher specializing in robust recognition systems, generative modeling, and large-scale deep learning deployed in production

Computer Vision, Generative AI, Multi-Modal Learning, Domain Adaptation

Impact By The Numbers

  • #1 NIST IREX-10 Global Ranking: among 28 international competitors
  • 1M+ Synthetic Images Generated: photorealistic training data
  • 10M+ Training Samples: large-scale distributed systems
  • 9 Patents & Inventions: innovation disclosures
  • 4 Production DL Modules: deployed at scale
  • 10+ Conference Presentations: IEEE & international venues

About Me

I'm a Computer Vision and Imaging Researcher with demonstrated success delivering state-of-the-art recognition systems deployed at scale. My work focuses on robust visual recognition under real-world capture variability, domain adaptation, generative modeling for photorealistic data synthesis, and multi-modal fusion.

Currently at Thales Digital Identity & Security, I've achieved the #1 global ranking in NIST IREX-10 among 28 international competitors, demonstrating state-of-the-art performance in fine-grained visual iris recognition. I've delivered 4 production-ready deep learning modules and generated over 1 million photorealistic synthetic images for training advanced AI systems.

My expertise spans generative models (Diffusion, GANs), vision transformers, multi-modal learning, and large-scale distributed training. I'm passionate about translating research innovations into production-ready systems that solve real-world problems.

Research Interests

Computer Vision, Generative Models, Vision-Language Models, Domain Adaptation, Multi-Modal Learning, Open-World Perception, Large-Scale ML Systems

Major Achievements

Innovation Portfolio

2021-Present

9 invention disclosures: 4 under internal review, 4 trade secrets, and 1 patent filed, demonstrating consistent research innovation across multiple domains.

Patents, Innovation

Production ML Deployment

2021-Present

Delivered 4 deep learning modules to production: segmentation, denoising, pose alignment, and feature extraction for Thales multi-modal biometric SDK (Fingerprint, Iris, Face).

Production ML, Multi-Modal

Best Poster Presentation

2019

Best Poster Presentation in Computational Methods (Graduate Category) at Clarkson University, recognizing excellence in research communication and technical innovation.

Research Excellence

Session Chair

2020

Session Chair for "Applications of Deep Learning I" at Asilomar Conference on Signals, Systems and Computers, demonstrating leadership in the research community.

Leadership

Professional Experience

Senior AI Research Scientist

Aug 2021 – Present

Thales Digital Identity & Security, Pasadena, CA

Multi-Modal Perception & Recognition Systems

  • Achieved #1 global ranking in NIST IREX-10 among 28 international competitors in iris recognition
  • Designed compact visual representations reducing feature dimensionality by >60% while maintaining accuracy
  • Improved fingerprint Rank-1 performance by 16% and reduced FRR by 39% at 1e⁻⁴ FAR
  • Developed multi-modal fusion strategies combining fingerprint, iris, and face recognition across heterogeneous sensors

Generative Modeling

  • Led research in diffusion models, CycleGAN, and adversarial augmentation for photorealistic image synthesis
  • Generated 1M+ photorealistic synthetic samples for training segmentation, denoising, and recognition models
  • Demonstrated synthetic-to-real transfer capabilities applicable to camera simulation and data augmentation

Large-Scale Deep Learning Systems

  • Designed and trained production systems using PyTorch DDP/FSDP on 10M+ sample datasets
  • Delivered 4 DL modules for production: segmentation, denoising, pose alignment, feature extraction
  • Built distributed training pipeline with experiment tracking and model validation

Model Optimization & Deployment

  • Quantized and optimized models with ONNX and OpenVINO for low-latency CPU inference
  • Designed embedding-level, score-level, and rank-level fusion strategies for ensemble models
  • Developed high-performance classical CV pipelines including dense correspondence and geometric alignment

PyTorch, Diffusion Models, GANs, Vision Transformers, ONNX, OpenVINO, Multi-Modal Fusion

Research Assistant

Jan 2019 – Aug 2021

Clarkson University, CoSiNe Lab, Potsdam, NY

  • Designed novel approaches to mitigate demographic bias in face recognition across 1M+ images
  • Developed multi-modal learning pipeline combining visual features with keystroke dynamics, achieving 98.87% accuracy
  • Created data-driven localization algorithms for biomedical and environmental applications
  • Published in top-tier venues including IEEE T-BIOM and IEEE Sensors Journal

Face Recognition, Fairness in AI, Multi-Modal Learning, Behavioral Biometrics

Research Data Scientist Intern

May 2020 – Aug 2020

Potsdam Sensors, Potsdam, NY

  • Built data-driven spatial modeling and localization algorithms for real-time sensor data analysis
  • Designed systems for handling temporal dynamics and spatial reasoning in complex environments

Sensor Fusion, Spatial Modeling, Real-Time Systems

Research Areas

Computer Vision

Image segmentation, object detection, pose estimation, dense correspondence, optical flow, and feature extraction for robust visual recognition systems.

  • CNNs & Vision Transformers
  • Attention Mechanisms
  • Feature Learning

Generative Models

Diffusion models, GANs (CycleGAN, StyleGAN), and neural rendering for photorealistic image synthesis and cross-domain style transfer.

  • Diffusion Models
  • Adversarial Training
  • Synthetic Data Generation

Multi-Modal Learning

Cross-modal fusion, sensor fusion, and multi-task learning combining visual features with metadata and behavioral data.

  • Vision-Language Models
  • Sensor Fusion
  • Multi-Task Learning

Domain Adaptation

Cross-sensor generalization, adversarial training, sim-to-real transfer, and open-set recognition under distribution shift.

  • Transfer Learning
  • Cross-Domain Recognition
  • Robust Perception

Model Optimization

Knowledge distillation, quantization, pruning, and ONNX/OpenVINO deployment for efficient edge inference.

  • Model Compression
  • Hardware Optimization
  • Edge Deployment

Large-Scale ML Systems

Distributed training with PyTorch DDP/FSDP, mixed precision, gradient accumulation, and production ML pipelines.

  • Distributed Training
  • MLOps
  • Production Systems

Selected Publications

10+ Conference Presentations
4+ Research Domains
Active Reviewer: CVPR, IJCB, IEEE T-BIOM

2024

Data-driven Source Localization in Complex and Nonlinear Signal Dynamics

C. Sahu, M. Banavar, J. Sun

IEEE Sensors Journal, 2024

Localization, Signal Processing, Data-Driven Methods

2022

A Novel Non Linear Transformation Based Multi User Classification Algorithm for Fixed Text Keystroke Behavioral Dynamics

C. Sahu, M. Banavar, S. Schuckers

IEEE Transactions on Biometrics, Behavior, and Identity Science (T-BIOM), 2022

Behavioral Biometrics, Multi-Modal Learning, Classification

2017

Explicit Model Predictive Control of Split-Type Air Conditioning System

C. Sahu, V. Kirubakaran, T. K. Radhakrishnan, N. Sivakumaran

Transactions of the Institute of Measurement and Control, vol. 39, no. 5, pp. 754–762, 2017

Control Systems, Model Predictive Control

Conference Presentations

10+ presentations at prestigious venues including:

Asilomar (2019-2023), IEEE FIE (2021), IWBF (2021)

Professional Service

Program Committee & Reviewer:

ACM FAccT (2023); CVPR PBVS Workshop (2023, 2024); IEEE ICME (2023); IWBF (2023); IEEE ICIP (2023); IJCB (2023, 2024, 2025); IEEE T-BIOM; IEEE Access; Pattern Recognition; IEEE IoT

Technical Expertise

Programming & Frameworks

PyTorch (DDP, FSDP), Python, C++, MATLAB, TensorFlow, Keras

ML/CV Libraries

OpenCV, ONNX, OpenVINO, NumPy, Pandas, SciPy, scikit-learn, DLIB, Albumentations

Computer Vision

Image Segmentation, Object Detection, Pose Estimation, Image Synthesis, Feature Extraction, Dense Correspondence, Optical Flow

Deep Learning

CNNs, Vision Transformers (ViT), Attention Mechanisms, Diffusion Models, GANs (CycleGAN, StyleGAN), Neural Rendering

Multi-Modal & Adaptation

Cross-Modal Fusion, Sensor Fusion, Multi-Task Learning, Domain Adaptation, Transfer Learning, Sim-to-Real Transfer

Model Optimization

Knowledge Distillation, Quantization, Pruning, ONNX Deployment, OpenVINO Optimization

Distributed Training

PyTorch DDP, FSDP, Mixed Precision Training, Gradient Accumulation

Tools & Platforms

Docker, Git, Weights & Biases, TensorBoard, LaTeX

Get In Touch

I'm always interested in discussing research collaborations, opportunities, or exciting projects in computer vision and AI.