
GitHub Repo & Jupyter Notebooks

Data Science Projects

Wikipedia Scraping, Spotify Scraping;
REST to Python;
Audio Analysis, Librosa;
KMeans Clustering


PolReddit

Track political discourse on Reddit

Realtime Database, Firebase, MongoDB;
Spark, parallel processing;
Reddit Scraping, Reddit PRAW;
spaCy, named entity recognition;
Flask, HTML, JavaScript;
Word cloud


Neural Network, TensorFlow;
Image Transform, Data Augmentation;
Model Fine-tuning, LeNet-5 CNN, VGGNet CNN;
Regularization, Dropout, Batch Normalization, Convolutional layer


seaborn, matplotlib;
TensorFlow, Keras;
Feature Engineering, vectorization, periodicity and cyclicity mapping;
Data Windowing, window split, single step window;
CNN, RNN, LSTM


Text Preprocessing, Tokenization;
TensorFlow, Keras, Sklearn;
Naive Bayes; 
Word Embedding;
CNN, Global Average Pooling, Word2vec


[HACKATHON]

Covid-19 Vaccination Finder

Realtime Database, Firebase, JSON;
Backend to Frontend Pipeline;
Flask, HTML, CSS, JavaScript;
Realtime geopy query;
UI design


News website scraping, BeautifulSoup, Selenium with ChromeDriver;
Text preprocessing, deduplication, stemming, stopword removal, language detection;
LDA topic modeling;
Tableau


Linear Regression, Cross-Validation, Data Pipeline, Visualization, Feature Engineering, RMSE, LASSO, Ridge, ElasticNet, Regularization


Sklearn, Logistic Regression, LASSO, Ridge;
Feature Engineering, Dummy Variable Encoding;
Confusion Matrix, Precision score, Recall score, AUC score, ROC curve


KMeans clustering, dimensionality reduction, PCA visualization, performance measurement (centroids, Davies-Bouldin index, inertia curve, silhouette plot)


YouTube Scraping, Selenium with ChromeDriver
