Hi, my name is
Fang Cabrera.
I am a software engineer based in NYC.
About Me
Hi there! I'm Fang, an aspiring software engineer based in New York City.
I'm currently finishing my last semester at NYU Courant as a graduate student in Computer Science.
Courant is a magical place where ideas and inspirations bounce around like spells. I've most recently fallen victim to the charms of natural language processing. Prior to that, functional programming and the lovely semaphore.
Before I became a pair of googly eyes permanently glued to the screen of my laptop, I was a classical musician. I studied violin with Naoko Tanaka and viola with Karen Ritscher.
I also spent some happy years exploring the magnificent Chinese language and literature at Peking University.
Here are a few technologies I'm familiar with:
- Java
- Python
- Scala
- C
- CUDA
- SQL
- OCaml
- Spark
CS Courses I've Taken
Statistical NLP
Fall 2019
What I learnt:
- Language Models
- Naive Bayes and Maximum Entropy Model
- Conditional Random Fields and Hidden Markov Model
- Neural Networks for NLP
- Word Embeddings
- POS Tagging
- Word Alignment
- Machine Translation
- Parsing
Sample Works:
- Handcrafted MaxEnt Classifier in Python
- Experiment with FastText for Word Embeddings
- NER with HMM and Viterbi algorithm in Python
Some Things I've Built
Featured Project
PeachyDB
A miniature relational database implemented in Java that comes with a pretty-printer and supports a set of algebraic and aggregate queries. Two indexing methods (hash and B-tree) are offered to speed up operations.
- relational database
- Java
Featured Project
Loan Discrimination Analysis
The objective of this analysis is to draw actionable insights about US loan approval bias from Home Mortgage Disclosure Act data covering 2007 to 2017, and to create an application for determining biases in lender practices.
- Spark
- MLlib
- Scala
- Python
- NYU HPC Cluster
- Naive Bayes
Featured Project
Quantifying the Relationship Between Occupancy and Performance On CUDA
Occupancy is an important metric of Streaming Multiprocessor utilization in CUDA. Low occupancy oftentimes results in poor performance, but it remains unclear whether occupancy always correlates positively with performance. This paper tries to quantify the relationship between occupancy and performance on a set of benchmark applications, utilizing NVIDIA’s Nsight Compute profiling tools. It’s the author’s hope that this analysis will lead to better kernel optimization.
- CUDA
- Nsight Compute
Featured Project
Language Style Transfer
Language style transfer is the task of generating a new sentence that preserves the content of the source sentence while emulating the style of a target domain. It is an important component of natural language generation that facilitates many NLP applications.
- encoder-decoder
- transformer
- NLP
Other Projects
NER Tagger Based On HMM Implemented With Viterbi Algorithm
In this project, I explored the application of Hidden Markov Models to the task of named entity recognition. The project is structured into four parts:
- (1) Function to compute emission probability e(x|y)
- (2) Baseline tagger implemented as y* = argmax e(x|y)
- (3) Functions to generate trigrams and their corresponding log probability
- (4) Using maximum likelihood estimates for transitions and emissions, implement the Viterbi algorithm that computes argmax p(x1, ..., xn, y1, ..., yn)
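A minimal sketch of parts (1), (2) and (4), not the project code itself. It assumes a bigram HMM instead of the trigram model listed above, uses unsmoothed maximum likelihood counts, and every name in it (train, log_prob, baseline_tag, viterbi) as well as the toy corpus is hypothetical:

```python
import math
from collections import Counter, defaultdict

START, STOP = "<s>", "</s>"  # hypothetical boundary symbols


def train(tagged_sentences):
    """Collect counts for emissions e(x|y) and bigram transitions q(y|y_prev)."""
    emit = defaultdict(Counter)   # emit[tag][word]
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            emit[tag][word] += 1
            trans[prev][tag] += 1
            prev = tag
        trans[prev][STOP] += 1
    return emit, trans


def log_prob(counter, key):
    """Unsmoothed maximum likelihood estimate in log space; -inf if unseen."""
    total = sum(counter.values())
    count = counter.get(key, 0)
    return math.log(count / total) if count else float("-inf")


def baseline_tag(words, emit):
    """Baseline tagger: pick y* = argmax_y e(x|y) for each word independently."""
    return [max(emit, key=lambda t: log_prob(emit[t], w)) for w in words]


def viterbi(words, emit, trans):
    """Return the tag sequence maximizing p(x1..xn, y1..yn) under the bigram HMM."""
    if not words:
        return []
    tags = list(emit)
    # best[i][t] = (log prob of the best path ending in tag t at position i, previous tag)
    best = [{t: (log_prob(trans[START], t) + log_prob(emit[t], words[0]), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit[t], words[i])
            column[t] = max((best[i - 1][p][0] + log_prob(trans[p], t) + e, p)
                            for p in tags)
        best.append(column)
    # Account for the transition into STOP, then follow the back-pointers.
    last = max(tags, key=lambda t: best[-1][t][0] + log_prob(trans[t], STOP))
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


# Toy corpus, made up purely for illustration.
corpus = [
    [("John", "PER"), ("lives", "O"), ("in", "O"), ("Paris", "LOC")],
    [("Mary", "PER"), ("visited", "O"), ("London", "LOC")],
    [("Paris", "LOC"), ("is", "O"), ("big", "O")],
]
emit, trans = train(corpus)
print(baseline_tag(["John", "visited", "Paris"], emit))
print(viterbi(["John", "visited", "Paris"], emit, trans))
```

Working in log space avoids underflow on long sentences. Because the counts are unsmoothed, unseen words score -inf under every tag, so a real tagger would also need some treatment of rare and unknown words.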
In Progress!
Handcrafted MaxEnt Classifier
A MaxEnt proper name classifier that attempts to classify proper names on the basis of their surface strings alone. Performance is boosted to >= 80% with handcrafted feature engineering and hyperparameter tuning.
Foreign Exchange Rate Prediction
Autoregressive integrated moving average (ARIMA) has been one of the most widely used linear models in time series forecasting over the past decades. Recent progress in machine learning has seen recurrent neural networks (RNN) and long short-term memory (LSTM) gain popularity. This paper reviews these three models in depth and explores the efficacy of their application to foreign exchange rate prediction. Furthermore, sentiment analysis is incorporated into the LSTM model to show that there is a correlation between sentiments extracted from historical news and currency exchange rates.
Some of My Music Performances
My Tech Blog
I just started a tech blog where I record some cool stuff I've learnt recently. It could've been another page attached to this website but I have some Google domains to use... these squatters, I know!
So please check it out here at nichijou.co.
You're probably looking at the picture on the right and wondering how it has anything to do with a tech blog. Well, it doesn't! Haha. I just thought my husband looks cute ^.^
What's Next?
Get In Touch
I'm currently looking for full-time engineering opportunities!