GitHub Repo & Jupyter Notebooks
Data Science Projects
Wikipedia Scraping, Spotify Scraping;
REST to Python;
Audio Analysis, Librosa;
KMeans Clustering
PolReddit
Track political discourse on Reddit
Realtime Database, Firebase, MongoDB;
Spark, parallel processing;
Reddit Scraping, Reddit PRAW;
spaCy, named entity recognition;
Flask, HTML, Javascript;
wordcloud
Neural Network, TensorFlow;
Image Transform, Data Augmentation;
Model Fine-tuning, LeNet-5 CNN, VGGNet CNN;
Regularization, Dropout, Batch Normalization, Convolutional layer
seaborn, matplotlib;
TensorFlow, Keras;
Feature Engineering, vectorization, periodicity and cyclicity mapping;
Data Windowing, window split, single step window;
CNN, RNN, LSTM
Text Preprocessing, Tokenization;
TensorFlow, Keras, Sklearn;
Naive Bayes;
Word Embedding;
CNN, Global Average Pooling, Word2vec
[HACKATHON]
Covid-19 Vaccination Finder
Realtime Database, Firebase, JSON;
Backend to Frontend Pipeline;
Flask, HTML, CSS, Javascript;
realtime Geopy query;
UI design
News website scraping, BeautifulSoup, Selenium with ChromeDriver;
Text preprocessing, deduplication, stemming, stopword removal, language detection;
LDA topic modeling;
Tableau
Linear Regression, Cross-Validation, Data Pipeline, Visualization, Feature Engineering, RMSE, LASSO, Ridge, ElasticNet, Regularization
Sklearn, Logistic Regression, LASSO, Ridge;
Feature Engineering, Dummy Variable Encoding;
Confusion Matrix, Precision score, Recall score, AUC score, ROC curve
KMeans clustering, dimensionality reduction, PCA visualization, performance measurement (centroids, Davies-Bouldin index, inertia curve, silhouette graph)