Learning Low-Dimensional Manifolds by Learning Distance

Part of my dissertation, successfully defended on November 22, 2019. The abstract is as follows:

High-dimensional data in machine learning applications often has intrinsic low-dimensional structure. Classic examples include video and images, which live in spaces with millions of dimensions, and yet most real-world images cluster close to low-dimensional manifolds. This property of real-world data is often called the manifold hypothesis. Because working with high-dimensional data is often quite difficult in terms of algorithmic complexity, there is great interest in dimensionality reduction methods for data satisfying the manifold hypothesis. In the case of linear manifolds, principal component analysis (PCA) is a very successful and well-known algorithm for dimensionality reduction. Unfortunately, real-world data quite often has a low-dimensional but nonlinear structure, so nonlinear methods for dimensionality reduction are needed in order to exploit the structure of the data. Manifold learning is the branch of machine learning concerned with data satisfying the manifold hypothesis.
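As a concrete illustration of the linear case mentioned above (not part of the dissertation itself), PCA can be computed from a singular value decomposition of the centered data. The signal direction and noise level below are invented for the example:

```python
import numpy as np

def pca(X, k):
    """Project data onto its top-k principal components.

    X: (n_samples, n_features) data matrix.
    Returns the k-dimensional coordinates of each sample.
    """
    Xc = X - X.mean(axis=0)                 # center the data
    # SVD of the centered data; rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                    # coordinates in the top-k subspace

# Points near a 1-dimensional linear manifold embedded in 3 dimensions
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 0.01 * rng.normal(size=(100, 3))
Z = pca(X, 1)                               # recovered 1-D coordinates
```

For genuinely nonlinear manifolds this linear projection fails, which is exactly the gap the dissertation's manifold learning methods address.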

Robust Matrix Completion: Extending the Orthogonal Rank-One Matrix Pursuit Algorithm and the Alternating Projection Algorithm

Part of my dissertation, successfully defended on November 22, 2019. The abstract is as follows:

This paper addresses the problem of recovering a low-rank matrix that has been corrupted by both erasures and errors. This extends the standard matrix completion problem, in which the matrix is corrupted only by erasures. Two algorithms for solving this problem are presented: the Robust Orthogonal Rank-One Matrix Pursuit (ROR1MP) algorithm, an extension of the Orthogonal Rank-One Matrix Pursuit (OR1MP) algorithm; and the Robust Alternating Projection (RAP) algorithm, an extension of the Alternating Projection (AP) algorithm. We first explain how the original algorithms work and collect the relevant results, then present the new algorithms and test their performance on both synthetic and real-world data.
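A minimal sketch of the underlying rank-one pursuit idea (the non-robust OR1MP core, without the error handling that ROR1MP adds) may help fix intuition. The matrix sizes and observation rate here are invented for illustration:

```python
import numpy as np

def rank_one_pursuit(M_obs, mask, n_iter=20):
    """Greedy rank-one matrix pursuit on the observed entries.

    M_obs: matrix of observed values (entries off the mask are ignored).
    mask:  boolean array, True where an entry is observed.
    Each step adds the top rank-one factor of the masked residual,
    then refits the weights of ALL basis elements by least squares.
    """
    m, n = M_obs.shape
    basis = []                                   # rank-one basis matrices
    X = np.zeros((m, n))
    for _ in range(n_iter):
        # top singular pair of the residual, zero-filled off the mask
        residual = np.where(mask, M_obs - X, 0.0)
        U, S, Vt = np.linalg.svd(residual, full_matrices=False)
        basis.append(np.outer(U[:, 0], Vt[0]))
        # orthogonal refit: least squares over the observed entries
        A = np.column_stack([B[mask] for B in basis])
        theta, *_ = np.linalg.lstsq(A, M_obs[mask], rcond=None)
        X = sum(w * B for w, B in zip(theta, basis))
    return X
```

The robust extension additionally separates a sparse error term from the observations, which this sketch omits.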

A Proximal Forward-Backward Splitting Algorithm for Affine Low-Rank Matrix Completion

Part of my dissertation, successfully defended on November 22, 2019. The abstract is as follows:

This document describes a numerical algorithm for the recovery of an affine low-rank matrix corrupted by noise, errors, and erasures, based on the proximal forward-backward splitting algorithm. We shall refer to this algorithm as the "affine low-rank matrix completion" algorithm, or ALRMC algorithm. This recovery problem arises when solving the rigid motion segmentation problem, i.e. the problem of identifying which feature tracks in a video sequence belong to a common rigid object. Under orthography, feature tracks that all belong to a particular rigid object lie in a 4-dimensional affine subspace. Hence the interest in an effective algorithm to recover a low-dimensional affine subspace from the data points. With this application in mind, we allow for three sources of corruption in the data common to video: noise (low magnitude), errors (high magnitude but sparse), and erasures (missing data or high magnitude corruptions with known locations).
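To give a flavor of the forward-backward splitting framework (a simplified sketch, not the ALRMC algorithm itself: it handles only erasures, via nuclear-norm-regularized completion, and omits the affine constraint and the sparse error term), each iteration takes a gradient step on the smooth data-fit term followed by the proximal step for the nuclear norm, which is singular value soft-thresholding:

```python
import numpy as np

def svt(X, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold singular values."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(S - tau, 0.0)) @ Vt

def fbs_completion(M_obs, mask, lam=0.1, step=1.0, n_iter=300):
    """Forward-backward splitting for
        minimize 0.5 * ||P_mask(X - M_obs)||_F^2 + lam * ||X||_*

    Forward step: gradient of the data-fit term (its Lipschitz constant
    is 1, so any step size <= 1 is safe).
    Backward step: singular value thresholding (the prox of the
    nuclear norm scaled by step * lam).
    """
    X = np.zeros_like(M_obs)
    for _ in range(n_iter):
        grad = np.where(mask, X - M_obs, 0.0)    # forward (gradient) step
        X = svt(X - step * grad, step * lam)     # backward (proximal) step
    return X
```

The full ALRMC algorithm replaces this simple objective with one that also models the affine offset, sparse errors, and noise.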

Singular Level Curves of Harmonic Functions, Conformal Mappings and Emerging Applications to Shape Recognition of Planar-domains

This paper is the product of joint research with Sa'ar Hersonsky, which appeared in the 2019 International Conference on Image Processing, Computer Vision, & Pattern Recognition. The abstract is as follows:

In spite of significant advances in computer graphics and computer vision for description and recognition of rigid shapes and objects, the problem of description and recognition of non-rigid shapes and objects is still open. In earlier studies [7, 8, 9] and [2], we proposed an approach, based on complex and harmonic analysis, aiming to describe and analyze non-rigid shapes. In this note, we describe a step in our program to supply a canonical set of shapes, very much like a dictionary with mathematical structures, that intrinsically determine the shape and its classification. Our ultimate goal is to provide a computational method of representing 2D shapes where every planar shape will be assigned a unique fingerprint, which is a conformal map of the shape to a canonical shape in the plane or in space. We believe that the study presented in this paper can be extended to describe and recognize both rigid and non-rigid shapes and objects. The main focus of this article is to provide an insight into the mathematical and geometrical fundamentals. We will restrict our attention to the case of planar domains of high genus, since the cases of simply-connected and doubly-connected domains were treated in the references mentioned above.

Reevaluative Subspace Clustering

This research was the subject of my oral exam as part of my doctoral program at the University of Georgia: successfully passed on April 27, 2017. The abstract is as follows:

The Reevaluative Subspace Clustering (RSC) Algorithm is an algorithm for solving the subspace clustering problem. It consists of four phases: (1) an initial clustering phase, during which some other primary clustering algorithm is applied; (2) a decomposition phase, in which the data is used as a self-expressive dictionary to represent itself, with a sparse error term and a noise term allowed; (3) a subspace estimation phase, in which new subspaces are estimated from the decomposition of the previous step, with errors and noise accounted for; and (4) a reevaluative phase, in which the class of each data point is reevaluated using the new subspace estimates. The algorithm is iterative and repeats steps 2 through 4 until a convergence criterion is met.
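The estimate-then-reevaluate loop (phases 3 and 4) can be sketched as follows. This is a simplified illustration, not the RSC algorithm itself: it skips the self-expressive decomposition of phase 2 and fits each cluster's subspace directly by SVD, then reassigns points by projection residual:

```python
import numpy as np

def reevaluative_loop(X, labels, dim=1, n_iter=10):
    """Alternate subspace estimation and class reevaluation.

    X:      (n_features, n_points) data matrix.
    labels: initial clustering in {0, ..., k-1} (phase 1).
    dim:    dimension assumed for each subspace.
    """
    labels = labels.copy()
    k = labels.max() + 1
    for _ in range(n_iter):
        bases = []
        for c in range(k):
            Xc = X[:, labels == c]
            if Xc.shape[1] == 0:             # empty cluster: degenerate basis
                bases.append(np.zeros((X.shape[0], dim)))
                continue
            U, _, _ = np.linalg.svd(Xc, full_matrices=False)
            bases.append(U[:, :dim])         # estimated subspace basis
        # reevaluate each point's class via its projection residual
        residuals = np.stack([
            np.linalg.norm(X - B @ (B.T @ X), axis=0) for B in bases
        ])
        new_labels = residuals.argmin(axis=0)
        if np.array_equal(new_labels, labels):   # convergence criterion
            break
        labels = new_labels
    return labels
```

The point of the sketch is the reevaluative structure: even a partly wrong initial clustering can be corrected once the subspaces are re-estimated.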

Pairs-Trading: An Optimal Selling Rule

I worked on this paper as part of a VIGRE research class during Spring 2013, along with my coauthors Kuo, Luu, Nguyen, Thompson, and Zhang. The abstract is as follows:

Pairs trading involves two cointegrated securities. When divergence is underway, i.e., one stock moves up while the other moves down, a pairs trade is entered consisting of a short position in the outperforming stock and a long position in the underperforming one. Such a strategy bets that the spread between the two will eventually converge. This paper is concerned with an optimal pairs-trade selling rule. In this paper, the difference of the pair is governed by a mean-reverting model. The trade is closed whenever the difference reaches a target level or a cut-loss limit. Given a fixed cut-loss level, the objective is to determine the optimal target so as to maximize the overall return. This optimization problem is related to an optimal stopping problem as the cut-loss level vanishes. Expected holding time and profit probability are also obtained. Numerical examples are reported to demonstrate the results.
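The exit rule can be illustrated with a Monte Carlo sketch (the paper derives analytic results; this simulation is only an illustration). Here the pair difference is modeled as an Ornstein-Uhlenbeck process, a standard mean-reverting model, and all parameter values are invented for the example:

```python
import numpy as np

def simulate_exit(x0, target, cut_loss, theta=2.0, mu=0.0, sigma=0.3,
                  dt=1 / 252, n_steps=5000, n_paths=500, seed=0):
    """Estimate profit probability and mean holding time of the exit rule.

    The pair difference follows the mean-reverting model
        dX_t = theta * (mu - X_t) dt + sigma dW_t.
    Entering at x0 (a diverged spread), the trade is closed when X first
    falls to `target` (profit: the spread converged) or rises to
    `cut_loss` (loss: the spread blew out).
    """
    rng = np.random.default_rng(seed)
    profits = 0
    times = []
    for _ in range(n_paths):
        x = x0
        for t in range(1, n_steps + 1):
            x += theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.normal()
            if x <= target:              # spread converged: take profit
                profits += 1
                times.append(t * dt)
                break
            if x >= cut_loss:            # spread blew out: cut the loss
                times.append(t * dt)
                break
    return profits / n_paths, float(np.mean(times))
```

Sweeping `target` over a grid and comparing expected returns would mimic, numerically, the optimization the paper solves in closed form.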

Separate Calvo Price-Stickiness Parameters in an Aggregate Supply and Demand Model

My Masters Thesis in Economics. The abstract is as follows:

This paper tests whether the degree of price-stickiness differs between supply and demand shocks. An aggregate supply and demand model is used to derive theoretical impulse response functions for both supply and demand shocks. Then the Blanchard-Quah long-run identifying restrictions are used to identify supply and demand shocks in monthly time series data of industrial production and the consumer price index from January 1975 to January 2000, from which empirical impulse response functions for supply and demand shocks are derived. The Calvo price-stickiness parameter is then calibrated separately for supply and demand shocks by fitting the theoretical impulse response functions to the empirical ones. This paper finds no significant improvement in the fit of the theoretical impulse response functions to the empirical ones from calibrating the Calvo parameter separately for supply and demand shocks, and hence suggests that firms take approximately the same amount of time to adjust prices in response to demand shocks as they do to supply shocks.
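The calibration step can be sketched schematically. This is a toy illustration only, not the paper's model: it stands in a stylized geometric-adjustment path for the theoretical impulse response, with the persistence parameter playing the role of the Calvo parameter (the fraction of firms not adjusting prices each period), and picks it by grid search against the empirical impulse response:

```python
import numpy as np

def calibrate_calvo(empirical_irf):
    """Fit a stylized theoretical IRF to an empirical IRF by grid search.

    The theoretical IRF of a unit shock is taken here to be the
    geometric path alpha**t (a stand-in, not the paper's model);
    alpha is chosen to minimize the squared distance to the data.
    """
    t = np.arange(len(empirical_irf))
    best_alpha, best_err = None, np.inf
    for alpha in np.linspace(0.01, 0.99, 99):
        theoretical = alpha ** t            # stylized impulse response
        err = np.sum((theoretical - empirical_irf) ** 2)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha
```

Running this separately on the supply-shock and demand-shock empirical impulse responses and comparing the two fitted parameters mirrors the paper's test: if the two values are close, the data does not support distinct degrees of price-stickiness.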