About Me

I’m currently an MS CS student at Georgia Tech and expect to graduate in May 2020. Before this, I was an R&D Engineer at RaRe Technologies, a company that works on NLP and Machine Learning projects in a wide variety of domains. The company also maintains Gensim, a popular open source library for unsupervised learning in NLP.

I worked on both commercial and open source projects for the company. We often created well-designed, scalable implementations of practically useful models from research papers and open sourced them as part of Gensim. My most recent such project was an implementation of Poincaré Embeddings, a model from a Facebook AI Research paper that learns vector representations of the nodes of a graph with hierarchical information. We wrote a blog post about it here.

I was a mentor in Google Summer of Code 2017 for a project to implement FastText - a model that learns word representations from an unstructured text corpus using character n-grams - as part of Gensim. I’ve also written an analysis of how FastText embeddings compare with word2vec.

One of my primary interests is using NLP to analyze large-scale text corpora on the web in order to understand society at large. I’m particularly interested in social media data and news articles - I’ve published an open source library for exploratory analysis of news corpora that finds topics and trends in them. Here, you can see a demo of the library on a dataset of Hacker News articles. I’ve also worked on a fun side project: a method, novel to the best of my knowledge, for clustering classic literature from Project Gutenberg using word2vec. You can read about it here.

I graduated from the Indian Institute of Technology, Roorkee in 2015 with a degree in mechanical engineering and knowledge of a lot of unrelated things. I spent a lot of time in college with a student group, SDSLabs (Software Development Section), a bunch of smart geeks who make cool web and mobile applications. I participated in quite a few hackathons with them, winning a couple. I was also very interested in game development in college: I took part in Google’s open source program, Google Summer of Code 2014, with CodeCombat, and interned at a game dev startup, Edlogiq, in 2013.

When I’m not doing any of that, I love to read, especially classic sci-fi and dystopian fiction. I also like to trek, travel and listen to blues and classic rock. Feel free to contact me at jayantjain1992@gmail.com.