Data Science: Machine Learning
Predicting STAAR Scores with ML - Github
- Developed a decision tree classification model to predict students’ 8th grade Math STAAR classification using their math district assessment scores, achieving a 71% accuracy rate, and documented findings in a Jupyter notebook.
- Cleaned and analyzed a dataset of 140 students with Pandas and NumPy, using scikit-learn for the machine learning algorithms and Matplotlib for visualizations.
Mail Marketing Campaign - Github
- Analyzed a marketing mail campaign using Logistic Regression, Random Forest, and Support Vector Machines to identify the customer groups most likely to engage with promotional emails.
GameVibe: Analysis and Classification of Video Game Reviews with ML - Github
- Developed a Multinomial Naive Bayes classifier that labels video game reviews as positive or negative, achieving 81% accuracy.
- Demonstrated expertise in natural language processing (NLP) and machine learning, producing a model that can assist in identifying the most impactful reviews for a given video game.
Tinnitus Correlation Study - Github
- Investigated correlations among factors associated with the presence of tinnitus using multiple linear regression and k-means clustering.
- Analyzed data to identify the factors that are most strongly associated with tinnitus, providing insights into the condition and potentially informing the development of new treatments and preventative measures.
Data Warehousing
HomeNeedsService: Connecting Home Service Providers with Homeowners in Need - Github
- Designed and built the data warehouse with MySQL and phpMyAdmin, ensuring efficient and reliable data storage for the HomeNeedsService platform.
- Led the creation and administration of a LAMP web server hosted on an AWS EC2 instance.
- Collaborated with team members to integrate SQL queries into the PHP code, ensuring that the website functioned seamlessly and according to specifications.
Data Visualization and Data Mining
Golden Boot Race: Qatar 2022 - Github
- Utilized data mining and data cleaning skills to create a Tableau dashboard of previous Golden Boot winners.
- Created infographics and bar chart races for social media platforms.
Programming: Automation
Automating Academic Interventions with Selenium and Python - Github
- Developed a Python and Selenium solution that automates the documentation of interventions in a student management system, reducing the time spent on this task by an average of 12-15 minutes per day.
- Built an ETL pipeline to enable data transformation and data cleaning using Pandas.
- Devised a user-friendly GUI with Tkinter to facilitate the use of the application by teachers while keeping their login credentials and student rosters secure.