Projects

Compression of 3D/4D Gaussian Splats
Sep 2024 - Dec 2024
- Designed an end-to-end neural network-based compression algorithm for 3D Gaussian splats during my internship at Dolby
- Developed techniques to compress spherical harmonic coefficients (~50% size) using inter-frame video coding tools, achieving significant storage and bandwidth efficiency
- Applied hybrid methods combining neural networks and conventional tools, contributing to advanced 3D content compression solutions

Joint Multi-modal Neural Field Representations for Audio and Video
May 2023 - Aug 2023
- Designed a novel Neural Field architecture for joint audio-video representation using time-stamp coordinates during my summer internship at Dolby
- Implemented model pruning and quantization techniques to optimize the representation, moving beyond modality-specific compression
- Contributed to advancing multi-modal compression techniques for audio-video data

Task Aware Image Quality Estimation for End-to-end Face Analytics
Jan 2023 - May 2024
- Designed the first task-specific unsupervised image quality estimator, correlating image quality with face detection performance using regularization techniques such as DropBlock
- Developed novel evaluation protocols for image quality estimators in face detection and recognition, significantly reducing computational complexity
- Explored masked vision transformers as image quality estimators in face detection and recognition
- This work forms an integral part of my Ph.D. thesis, advancing image quality assessment methods for robust computer vision applications

Task Aware Video Compression using Lightweight Edge-specific Neural Networks
May 2021 - May 2022
- Assessed the impact of video compression on deep learning models for tasks like pedestrian detection, face detection, and face recognition during my Ph.D., funded by Ford Motor Corp.
- Developed a task-aware frame partitioning algorithm for video encoders (e.g., HM and VVC) using edge-based deep learning models like MobileNets to optimize bit allocation for critical regions
- Achieved 6% bit-rate and 15% encoding time savings while maintaining video analytics performance under compression

Lightweight Compression of Intermediate Neural Network Features (Video Coding for Machines)
May 2021 - Aug 2021
- Investigated the capability of video codecs like HEVC to encode neural network intermediate features during a summer project at Purdue
- Assessed the feasibility of splitting neural networks for efficient encoding and transmission of intermediate features
- Explored Autoencoder models for Video Coding for Machines and Scalable Video Coding, advancing research in machine-centric video compression

Dataset Curation and Systematic Evaluation of End-to-end Face Analytics
Jan 2020 - May 2021
- Designed an end-to-end in-vehicle face analytics system for sequential face detection and recognition during the early years of my Ph.D., funded by Ford Motor Corp.
- Developed a data collection system to capture diverse in-vehicle face data across multiple camera modalities, angles, and lighting conditions
- Curated a balanced dataset preserving original properties and capturing task interdependence, enabling systematic evaluation of face analytics systems
- Performed a meaningful and interpretable evaluation of an end-to-end face analytics system using the carefully curated data to gain valuable performance insights

Background-Foreground Segmentation for Camera-Trap Images using RobustPCA
Aug 2019 - Dec 2019
- Developed an unsupervised robust saliency predictor using robust PCA to differentiate background and foreground in camera-trap images during my first research project at Purdue
- Achieved performance comparable to learning-based models (e.g., R3-Net) without requiring training
- Applied the system to track animal movements, calculate population densities, and analyze habitual patterns in wildlife activities

EdgeDetect - A lightweight framework to detect DDoS attacks on Edge nodes
July 2018 - Aug 2019
- Built a system for detecting DDoS attacks on edge devices using Recurrent Neural Networks (RNNs) under the supervision of Dr. Reshmi Mitra at the Indian Institute of Science
- Achieved state-of-the-art performance on the UNSW 2015 dataset with a minimal model architecture optimized for edge devices

Traffic Analytics Architecture and Dataset for Indian Roads using a Monocular Surveillance Camera Network
April 2018 - Aug 2019
- Designed a real-time front-end web server system for delivering live RTMP and HLS video streams with features like content sharing, routing, congestion management, and load balancing under the supervision of Dr. Abhay Sharma at the Indian Institute of Science
- Collaborated on traffic analytics solutions, including vehicle counting, license plate detection, speed computation, and queue-length estimation
- Successfully deployed the full framework in Electronic City, Bangalore, India