Analyzing Form D Submissions
Keywords: nonlinear dimension reduction, K-means clustering, nonparametric smoothing, natural language processing, hypothesis testing, Form D
This project was completed for the class 46-923 Financial Data Science II, together with my group members Dhruv Baid, Guolun Li, Shangyu Li, and Yi Xin Xiang. It can be found on GitHub.
Used nonlinear dimension reduction, clustering, nonparametric smoothing, and natural language processing to analyze Form D submissions from the SEC's website. Also learnt about working with messy data, which required extensive cleaning.
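As a rough illustration of the K-means clustering named in the keywords, here is a minimal sketch of Lloyd's algorithm in plain Python. It is not the project's code: the function name `kmeans`, the starting centroids, and the toy 2-D points are all assumptions made up for this example, and no Form D data is used.

```python
import math


def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep the centroid of an empty cluster where it was.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # nothing moved, so the assignment is stable
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy points forming two well-separated groups (illustration only).
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(toy_points, centroids=[toy_points[0], toy_points[3]])
print(labels)
```

In practice the inputs would be feature vectors derived from the cleaned filings, typically after dimension reduction, and a library implementation with several random restarts would usually be preferred over this hand-rolled version.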