projects

stuff i do when i pretend to be smart

senior design - ShopEZ

spring '21

motivation

A quarter of a store's operating budget goes to customer service, and much of that cost comes from service at the checkout lane. ShopEZ seeks to reinvent the shopping experience by letting customers check out on their own, right from their shopping cart.

Current attempts to eliminate the checkout line have not been successful. Scanner modules seen at grocery stores still need to be brought to the checkout line in order to pay. Amazon Go seeks to solve the problem via a computer-vision-based solution; however, customer discovery found that many stores and customers take issue with the underlying privacy concerns. Amazon has also created a smart shopping cart to be used at Whole Foods; however, a full integration would require a store to replace all of its shopping carts and baskets.

ShopEZ seeks to solve all these problems via an easy-to-use and cheap add-on to any shopping cart. It provides both a hardware-based solution and a software-based solution in order to accommodate customers who may or may not be comfortable with paying from their phone. For further questions, feel free to email me.

what i did:

team co-lead

electrical system design (sw + wiring + power)

c++ drivers (lcd, ble, barcode scanner, rotary encoder)

pulled 2 weeks of all-nighters in a row

embedded systems - DrinkUp

spring '21

motivation

"Face it: COVID-19 has sucked out the fun of drinking with your friends due to social distancing. As a college kid however, I was stuck wondering: How am I going to get peer pressured to drink without my friends being a bad influence on me? Fear no more, fraternity brothers! DrinkUp is a social media web app that allows you to link your “DrinkUp Score” (proprietary score based on your BAC) to the internet for all of your friends and family to admire!"

what i did:

raspberry pi4 apache2 server

php dynamic web site

overall system design

documentation

"Navigating the Latent Space of Generative Models"

fall '19 - fall '20

motivation

Over the past 20 years, the field of deep learning has exploded in significance. In particular, generative models have advanced at an astronomical rate ever since the inception of the Generative Adversarial Network (GAN) in 2014. The project, sponsored by the Fall 2020 President’s Undergraduate Research Award, focused on the localization of points in the latent space of generative models. Our research sought to answer whether an agent could navigate the latent space of any generative model (Variational Autoencoders and GANs), given n pairwise comparison queries.

Because the latent space of a deep generative model encodes condensed features rather than the inputs themselves (i.e., images), humans cannot navigate it by means of a direct search. Instead, a human must manually search through the latent space for any desired features. This task quickly becomes infeasible as the number of dimensions k grows: the latent space of significant generative models (e.g., StyleGAN) reaches 512 dimensions. The problem is only worsened when dealing with GANs, as the architecture of a “vanilla” GAN does not include an encoder, which would allow for a direct translation from image to latent space. This means that any latent point serving as the target had to be randomly sampled from the latent space of the GAN, which heavily limited our ability to quantitatively measure how well images were reconstructed from latent points.

The most significant finding from my experimentation with GANs came from NVIDIA’s state-of-the-art StyleGAN v2. Through the use of random localization, where users are asked to select the preferred image from a query of two randomly generated images, I was able to achieve the results on the right, localizing a latent point that is qualitatively extremely similar to a desired target image! These results are enormously promising, as the 5000 comparisons were made via a simple mean-squared-error metric for images of size 1024x1024 in a 512-dimensional latent space. From a quantitative standpoint, the results still showed promise, as the mean-squared-error was reduced from 1.031 to 0.138. Additionally, the state-of-the-art metric Fréchet Inception Distance (FID), which compares images through the features of an Inception classifier trained on the ImageNet dataset, further showcased the viability of our localization scheme in the complex GAN latent space. Taking advantage of StyleGAN’s revolutionary projection method, which serves as a means of encoding images to the latent space of a GAN, I was able to implement a scheme for measuring the FID of our reconstructed images.
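
To make the comparison-query idea concrete, here is a minimal sketch. It is not the project's actual code, and it runs on several loudly stated assumptions: a toy linear generator with orthonormal columns stands in for StyleGAN (chosen so that image-space mean-squared-error ranks candidates exactly as latent-space distance does, which a real GAN does not guarantee), a simulated user always picks the candidate with the lower mean-squared-error to the target, and the update rule (projecting the estimate onto the halfspace each answer implies) is an assumed stand-in rather than the scheme used in the research. The names make_toy_generator and localize are illustrative only.

```python
import numpy as np


def make_toy_generator(latent_dim, image_dim, rng):
    """Hypothetical stand-in for a GAN generator (assumption, not StyleGAN).

    Uses a linear map with orthonormal columns, so distances between
    generated "images" equal distances between their latent points.
    """
    q, _ = np.linalg.qr(rng.standard_normal((image_dim, latent_dim)))
    return lambda z: q @ z


def mse(a, b):
    """Mean-squared-error between two flattened images."""
    return float(np.mean((a - b) ** 2))


def localize(generator, target_image, latent_dim, n_queries, rng):
    """Estimate the target's latent point from pairwise comparison answers."""
    estimate = np.zeros(latent_dim)
    for _ in range(n_queries):
        # One query: two randomly generated candidates.
        a, b = rng.standard_normal((2, latent_dim))
        # Simulated user: the candidate with lower MSE to the target wins.
        if mse(generator(a), target_image) > mse(generator(b), target_image):
            a, b = b, a  # after the swap, `a` is always the winner
        # "Target is closer to a than to b" is the halfspace
        # normal . z <= offset (assumes the isometric toy generator).
        normal = b - a
        offset = (b @ b - a @ a) / 2.0
        violation = normal @ estimate - offset
        if violation > 0:
            # Project the estimate back onto the halfspace it violated.
            estimate = estimate - violation * normal / (normal @ normal)
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    generator = make_toy_generator(latent_dim=8, image_dim=64, rng=rng)
    target_latent = rng.standard_normal(8)
    estimate = localize(generator, generator(target_latent), 8, 500, rng)
    print("latent error before:", np.linalg.norm(target_latent))
    print("latent error after: ", np.linalg.norm(estimate - target_latent))
```

Because every answer in this toy setup is consistent with the true target, each projection can only move the estimate closer to it. With a real generator and human answers, the comparisons are noisy and the image-to-latent geometry is not an isometry, so the same loop would need a more forgiving update.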

more info

google colab notebook

brief slides showing results