GitHub Repo & Jupyter Notebooks

Data Science Projects

Wikipedia Scraping, Spotify Scraping;
REST to Python;
Audio Analysis, Librosa;
KMeans Clustering

PolReddit
Track political discourse on Reddit

Realtime Database, Firebase, MongoDB;

Spark, parallel processing;

Reddit Scraping, Reddit PRAW;

spaCy, named entity recognition;

Flask, HTML, Javascript;

wordcloud

Neural Network, TensorFlow;
Image Transform, Data Augmentation;
Model Fine-tuning, LeNet-5 CNN, VGGNet CNN;
Regularization, Dropout, Batch Normalization, Convolutional layer

seaborn, matplotlib;
TensorFlow, Keras;
Feature Engineering, vectorization, periodicity and cyclicity mapping;
Data Windowing, window split, single step window;
CNN, RNN, LSTM
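
As a rough illustration of two of the tags above (periodicity mapping and single step windowing), here is a minimal sketch in plain Python. The function names `encode_cyclic` and `single_step_windows` are hypothetical and are not taken from the project notebook; the sketch only assumes that a periodic feature such as hour of day is mapped to a sine/cosine pair, and that each window of past values is paired with the next value as its label.

```python
import math


def encode_cyclic(value, period):
    """Map a periodic value (e.g. hour of day, period=24) onto the unit circle.

    Hypothetical helper: values near the end of a period land close to
    values near the start, which a raw integer encoding would not capture.
    """
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def single_step_windows(series, input_width):
    """Split a sequence into (inputs, label) pairs that predict one step ahead.

    Hypothetical helper: each window holds `input_width` consecutive values,
    and the label is the single value that immediately follows them.
    """
    windows = []
    for start in range(len(series) - input_width):
        inputs = series[start:start + input_width]
        label = series[start + input_width]
        windows.append((inputs, label))
    return windows


if __name__ == "__main__":
    print(encode_cyclic(6, 24))
    print(single_step_windows([1, 2, 3, 4, 5], 3))
```

With this encoding, hour 23 and hour 1 end up equally close to hour 0, which is usually the point of mapping a cyclic feature before feeding it to a CNN, RNN, or LSTM.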

Text Preprocessing, Tokenization;
TensorFlow, Keras, Sklearn;
Naive Bayes; 
Word Embedding;
CNN, Global Average Pooling, Word2vec

[HACKATHON]
Covid-19 Vaccination Finder

Realtime Database, Firebase, JSON;
Backend to Frontend Pipeline;
Flask, HTML, CSS, Javascript;
realtime Geopy query;
UI design

News website scraping, BeautifulSoup, Selenium (ChromeDriver);
text preprocessing, deduplication, stemming, stopword removal, language detection;
LDA topic modeling;
Tableau

Linear Regression, Cross-Validation, Data Pipeline, Visualization, Feature Engineering, RMSE, LASSO, Ridge, ElasticNet, Regularization

Sklearn, Logistic Regression, LASSO, Ridge;
Feature Engineering, Dummy Variable Encoding;
Confusion Matrix, Precision score, Recall score, AUC score, ROC curve
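
For readers unfamiliar with the metrics tagged above, the following is a small sketch of how precision and recall fall out of confusion matrix counts for a binary classifier. It is written in plain Python for clarity; the function names are hypothetical, and in a Sklearn workflow the library's own metric functions would normally be used instead.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall(y_true, y_pred):
    """Return (precision, recall); a metric with a zero denominator is 0.0."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    print(precision_recall([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))
```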

KMeans clustering, dimensionality reduction, PCA visualization, performance measurement (centroids, Davies-Bouldin index, inertia curve, silhouette graph)

YouTube Scraping, Selenium (ChromeDriver)
