I am a Computer Science PhD student at the Courant Institute at NYU, where I am part of the CILVR group, focusing on deep learning applied to natural language processing and advised by Kyunghyun Cho and Sam Bowman. Before this, I completed a B.A. in Statistics and an M.S. in Computer Science at Northwestern University, focusing on machine learning and natural language processing under my advisor, Doug Downey. At Northwestern, I was part of the WEBSAIL group, where I worked on deep learning for language modeling.

Broadly, my research interests are:

  • Deep Learning
  • Machine Learning
  • Learning to Learn
  • Machine Translation
  • Autonomous Driving
  • Reinforcement Learning

I follow both international and club football (soccer), NBA basketball, and professional tennis very closely. I’m a huge supporter of Borussia Dortmund from the German Bundesliga.


Research Profiles: Google Scholar

PAG2ADMG (Statistics Bachelor's Thesis)

A novel methodology that enumerates the full set of causal graphs by converting any partial ancestral graph (PAG) into the set of all acyclic directed mixed graphs (ADMGs) belonging to the Markov equivalence class encoded by the PAG.

[ Conference Paper in Review at AISTATS 2018 | arXiv ]

[ Student Abstract Accepted to AAAI 2017 | AAAI | code ]

Active Hyperparameter Optimization for Deep Nets (CS Master's Thesis)

In this project, we are developing more efficient hyperparameter selection methods for a variety of deep nets, including LSTMs and ConvNets. We plan to submit to a leading ML conference in the next couple of months.

Targeted Dropout for Deep Nets

In this project, we are developing alternative dropout methodologies that increase model variance to improve deep neural network performance on a variety of tasks. We plan to submit to a leading ML conference in the next couple of months.

Deep Ensemble Methods for Language Modeling

In this project, we are developing alternative methodologies to improve state-of-the-art LSTM language model performance. We plan to submit to a leading ML conference in the next couple of months.

[Find Out More]

Head-Neck Cancer Subtype Prediction

This project aims to predict data-driven head-neck cancer subtypes from pathology images. We aim to leverage ConvNets and use pretrained meta-features from state-of-the-art image classification architectures.

[Find Out More]

Identifying Genetic Basis for Cellularity in Glioblastoma Patients

This project developed a pipeline that uses TCGA gene expression and molecular data as features and cellularity as the outcome variable to learn Bayesian network structures via max-min hill climbing. These networks are then combined using standard Bayesian model averaging and used for prediction.

[Find Out More]

The Boundary Searcher

A novel method that estimates the representative volume of a point cloud. This methodology extends the state of the art by being applicable in spaces of any dimension.

[Find Out More]

Improving Word Embeddings with Analogical Knowledge

In this project, we developed methods to inject pre-existing analogical knowledge into Google's word2vec models to improve word embeddings.

[Find Out More]
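The analogical structure this project builds on can be sketched with toy vectors. The embeddings and words below are illustrative stand-ins, not trained word2vec outputs:

```python
import numpy as np

# Toy 3-dimensional embeddings (illustrative values, not trained vectors).
emb = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "queen": np.array([0.8, 0.1, 0.9]),
    "man":   np.array([0.2, 0.9, 0.1]),
    "woman": np.array([0.2, 0.1, 0.9]),
}

def analogy(a, b, c, emb):
    """Return the word closest (by cosine) to vec(b) - vec(a) + vec(c),
    excluding the three query words -- the classic word2vec analogy test."""
    target = emb[b] - emb[a] + emb[c]
    best, best_sim = None, -np.inf
    for word, vec in emb.items():
        if word in (a, b, c):
            continue
        sim = vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target))
        if sim > best_sim:
            best, best_sim = word, sim
    return best

print(analogy("man", "king", "woman", emb))  # -> queen
```

Injecting analogical knowledge amounts to encouraging such offset relations (e.g. king - man + woman ≈ queen) to hold among the trained vectors.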

ICU 30-Day Readmission Prediction

In this project, we predicted ICU 30-day readmission from a multivariate panel of physiological measurements, utilizing Subgraph Augmented Non-Negative Matrix Factorization (SANMF) for this task.

[Find Out More]
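SANMF builds a patients-by-subgraph-features matrix from the physiological time series and factorizes it with standard non-negative matrix factorization; as a hedged illustration of the factorization step only (plain NMF with Lee-Seung multiplicative updates, not the full SANMF pipeline), one might write:

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) into W (n x k) @ H (k x m)
    using Lee-Seung multiplicative updates. In SANMF, V would be built
    from subgraph features of the physiological time series."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(iters):
        # Multiplicative updates keep W and H non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Usage: rows = patients, columns = (subgraph) features.
V = np.random.default_rng(1).random((6, 8))
W, H = nmf(V, k=3)
print(np.linalg.norm(V - W @ H))  # reconstruction error shrinks with iters
```

The rows of H then act as learned feature groups, and the rows of W as per-patient loadings fed to the downstream readmission classifier.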

Predicting Unmet Health Care Needs in Children with DBD

In this project, we identified the best predictors of unmet health care needs in children with disruptive behavior disorder (DBD). The dataset utilized was the National Survey of Children's Health, and we applied both general multivariate logistic regression and group-wise logistic regression methods.

[Find Out More]

How Evil are Turnovers?

In this project, we quantified the impact that turnovers and turnover differential have on winning and scoring in the NBA. We used NBA regular-season data from Basketball-Reference and estimated the impact of these features with various regression methods.

[Find Out More]

Mechanical TA

In this project, we developed and implemented methodologies that transform a set of peer-review grades and a set of TA grades into estimates of true grades and peer-reviewer quality for any set of assignments. These methods were used for EECS 349 and are currently being extended to other EECS courses at Northwestern.

[Find Out More]

Latest Blog Post

25 Jan 2017 · project · Deep Ensemble Methods for Language Modeling

Abstract: In this paper, we present two new ensemble methods, each with two variants, for Recurrent Neural Networks (RNNs) which have Long Short-Term Memory (LSTM) units. We propose two new methods: (1) AdaBoost Inspired Mini-Batch Sampling (ABIMBS) and (2) AdaBoost Inspired Sentence Sampling (ABISS) ensemble methods for the language modeling task. ABIMBS has a forward and backward variant, and ABISS has a standard deviation and square root variant. We show that all four of these methods, applied to both non-dropout and dropout LSTM architectures for language modeling, achieve lower perplexity than the current state-of-the-art...



  • September 2017 - Present

    Computer Science PhD Student in Machine Learning at New York University

  • March 2017 - August 2017

    Deep Learning Research Intern at Salesforce Research

  • March 2016 - March 2017

    Research Assistant in Deep Learning & NLP at Northwestern University

  • January 2016 - March 2017

    MS in Computer Science at Northwestern University

  • September 2015 - January 2016

    Master’s Exchange Student in Computer Science at ETH Zurich

  • June 2015 - January 2016

    Research Assistant in Biomedical Informatics at Stanford University

  • July 2014 - March 2015

    Research Assistant in Neural Network Language Modeling at Northwestern University

  • September 2013 - March 2017

    BA in Statistics at Northwestern University


Drop me an email if you are interested in collaborating on research or have any questions regarding my projects.