GitHub Repo & Jupyter Notebooks
Data Science Projects
Wikipedia Scraping, Spotify Scraping;
REST to Python;
Audio Analysis, Librosa;
KMeans Clustering

PolReddit
Track political discourse on Reddit
Realtime Database, Firebase, MongoDB;
Spark, parallel processing;
Reddit Scraping, Reddit PRAW;
spaCy, named entity recognition;
Flask, HTML, JavaScript;
wordcloud

Neural Network, TensorFlow;
Image Transform, Data Augmentation;
Model Fine-tuning, LeNet-5 CNN, VGGNet CNN;
Regularization, Dropout, Batch Normalization, Convolutional layer

seaborn, matplotlib;
TensorFlow, Keras;
Feature Engineering, vectorization, periodicity and cyclicity mapping;
Data Windowing, window split, single step window;
CNN, RNN, LSTM
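
To illustrate the cyclicity mapping and single-step windowing listed above, here is a minimal sketch in plain Python. The function names encode_cyclic and single_step_windows are hypothetical and chosen for this example; they are not taken from the project notebooks, which may use Keras or pandas utilities instead.

```python
import math


def encode_cyclic(value, period):
    """Map a periodic value (e.g. hour of day) onto the unit circle.

    Hypothetical helper: values at both ends of the cycle (23h and 0h)
    end up close together, which a raw integer feature cannot express.
    """
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def single_step_windows(series, input_width):
    """Split a sequence into (inputs, label) pairs.

    Each window holds `input_width` consecutive values, and the label
    is the single value that immediately follows the window.
    """
    return [
        (series[i:i + input_width], series[i + input_width])
        for i in range(len(series) - input_width)
    ]


if __name__ == "__main__":
    print(encode_cyclic(6, 24))
    print(single_step_windows([1, 2, 3, 4, 5], input_width=3))
```

The sine/cosine pair is one common way to encode periodic features; the windowing helper covers only the single-step case, where each input window predicts one following value.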

Text Preprocessing, Tokenization;
TensorFlow, Keras, scikit-learn;
Naive Bayes;
Word Embedding;
CNN, Global Average Pooling, Word2vec

[HACKATHON]
Covid-19 Vaccination Finder
Realtime Database, Firebase, JSON;
Backend to Frontend Pipeline;
Flask, HTML, CSS, JavaScript;
realtime Geopy query;
UI design

News website scraping, BeautifulSoup, Selenium with ChromeDriver;
text preprocessing, deduplication, stemming, stopword removal, language detection;
LDA topic modeling;
Tableau

Linear Regression, Cross-Validation, Data Pipeline, Visualization, Feature Engineering, RMSE, LASSO, Ridge, ElasticNet, Regularization

scikit-learn, Logistic Regression, LASSO, Ridge;
Feature Engineering, Dummy Variable Encoding;
Confusion Matrix, Precision score, Recall score, AUC score, ROC curve
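
As a reminder of how the precision and recall scores listed above relate to the confusion matrix, here is a small sketch in plain Python. The helper names confusion_counts and precision_recall are hypothetical; in practice the project most likely relies on library metrics rather than hand-written ones.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up labels, used only to exercise the functions.
    y_true = [1, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(tp, fp, fn, tn)
    print(precision_recall(tp, fp, fn))
```

Returning 0.0 when a denominator is zero is a design choice for this sketch; other conventions (raising an error, returning NaN) are equally defensible.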

KMeans clustering, dimensionality reduction, PCA visualization, performance measurement (centroids, Davies-Bouldin index, inertia curve, silhouette plot)
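
The inertia curve mentioned above plots, for each candidate number of clusters, the sum of squared distances from every point to its nearest centroid. A minimal sketch of that quantity in plain Python follows; the function name inertia and the sample points are assumptions made for illustration, not code or data from the project.

```python
def inertia(points, centroids):
    """Sum of squared Euclidean distances to the nearest centroid."""
    total = 0.0
    for point in points:
        # Squared distance from this point to each centroid; keep the smallest.
        total += min(
            sum((a - b) ** 2 for a, b in zip(point, centroid))
            for centroid in centroids
        )
    return total


if __name__ == "__main__":
    # Made-up 2-D points forming two obvious groups.
    points = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(inertia(points, [(5, 1)]))           # one centroid
    print(inertia(points, [(0, 1), (10, 1)]))  # two centroids
```

Inertia falls as centroids are added, so the curve is usually read for the point where the drop levels off rather than for its minimum.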

