Sounak Paul

Research Statistician Developer | PhD in Statistics | AI/ML Specialist

Education

Academic Journey

University of Chicago

PhD in Statistics

Oct 2019 – Aug 2024

GPA: 3.92/4.00

University of Alberta

MSc in Mathematics

Sep 2017 – Aug 2019

GPA: 4.00/4.00

Indian Statistical Institute

Bachelor of Mathematics (B.Math)

Aug 2014 – May 2017

  • First Division with Distinction (Class Rank: 2)

Technical Skills

Technical Expertise

About Me:

My name is Sounak Paul. I am from Kolkata, India.

Research areas:

Deep learning, Computer vision, Generative AI, Applied statistics, Time series forecasting

Languages:

Python, R, C/C++, SQL, SAS, Bash

Libraries:

PyTorch, TensorFlow, numpy, scipy, pandas, scikit-learn, OpenCV, PyTorch3D, MLflow

Tools and frameworks:

Git, Docker, Kubernetes, JIRA, AWS, LangGraph, OpenAI Agents SDK

Professional Experience

Professional Journey

Research Statistician Developer

SAS Institute Inc.

Aug 2024 – Present

Cary, NC

  • Developed an AI-driven code triage system (provisional patent filed) that uses agents, RAG, and MCP to integrate static and dynamic analysis tools for automated identification of performance bottlenecks, security vulnerabilities, and code inefficiencies in C/C++ projects.
  • Streamlined multistep code triage processes, reducing manual effort, accelerating diagnostics, and improving code quality; targeted optimizations made computational tasks up to 14 times faster (a 93% reduction in run time).
  • Collaborated with customers, consultants, technical support, and testers to gather and analyze business requirements and to develop analytical components for forecasting and scientific computing.
  • Led a squad of 7 to develop TASK options for forecasting nodes using Python, C, and SAS, leveraging industry-standard technologies such as GitHub CI/CD, containers, Kubernetes, cloud platforms, and MLOps practices.

Forecasting R&D Intern

SAS Institute Inc.

Jun – Aug 2022, Jun – Aug 2023

Remote

  • Developed a novel multi-objective blackbox optimization method using genetic algorithms to autotune the seasonal and subset model parameters of general ARIMA models simultaneously with the Box-Cox parameter.
  • Achieved a significant improvement in out-of-sample fit statistics (40% for RMSE) over popular methods such as auto-arima and SAS Diagnose, averaged over four separate data sets.
  • This work resulted in a patent (US12380369B1) and a paper (under review).

Projects

Research Projects

Second order methods for stochastic ERM and EM algorithms in orbit recovery setting

  • Used second-order methods (Newton and quasi-Newton) to accelerate stochastic variance-reduced gradient descent and EM algorithms for orbit recovery problems.
  • Achieved ≈ 75% reduction in run time using variance-reduced methods on simulated signals.
  • Tools Used: numpy, scipy, PyTorch, MLflow, matplotlib

Deep learning priors for orbit recovery problems

  • Developed neural network architectures for supervised learning of signals and rotational distributions.
  • Demonstrated that the method accelerates convergence when reconstructing signals from their moments (by up to 83%).
  • Tools Used: numpy, scipy, PyTorch, OpenCV

Estimation of the amount of heavy-tailedness and long-range dependence in linear processes

  • Used Marcinkiewicz strong laws of large numbers (MSLLN) to find rates of convergence for heavy-tailed multivariate products of long-range dependent two-sided linear processes.
  • Developed a novel method to estimate how much long-range dependence (LRD) and heavy-tailedness (HT), if any, a sequential data set possesses, and tested it on real financial data using R.

Smaller projects

  • Instance-level object detection using SIFT descriptors vs. YOLO (Computer vision)
  • Does BCG vaccine have a protective effect against severe COVID-19? (Applied statistics)
  • Reimplementation of various machine learning and deep learning papers (ML and DL)

Publications

Research Work

M. A. Kouritzin and S. Paul. On almost sure limit theorems for heavy-tailed products of long-range dependent linear processes. Stochastic Process. Appl., 152 (2022), pp. 208-232. (arXiv)

Y. Khoo, S. Paul and N. Sharon. Deep Neural-network Prior for Orbit Recovery from Method of Moments. J. Comput. Appl. Math., 444 (2024), 115782. (arXiv)

S. Paul, I.V. Farahani, M.V. Joshi and Y. Park. On the Use of Derivative-Free Optimization for Autotuning ARIMA Models. Int. J. Forecast. Under review.

M.V. Joshi, S. Paul, I.V. Farahani and Y. Park. Hyperparameter tuning in autoregressive integrated moving average (ARIMA) models. US Patent 12,380,369. 5 Aug 2025.

Awards

Honors & Awards

Academic Honors

  • Dr. Josephine M. Mitchell Scholarship, University of Alberta (2018)
  • Pundit RD Sharma Memorial Graduate Award, University of Alberta (2017)
  • University of Alberta Master's Scholarship, University of Alberta (2017)
  • Visiting Student Research Program Fellowship, TIFR Bombay & TIFR CAM (2016)
  • KVPY Scholarship, IISc Bangalore and Dept. of Science and Technology, India (2013)