TL;DR: An integrated analysis of climate change impacts through disaster frequency and cost, public sentiment, and urban land use changes, showcasing advanced data science and visualization techniques.
Overview
This project was conducted during graduate school, and its code is available in the accompanying GitHub repository. It delivers a multifaceted analysis of climate change and its impacts using interactive dashboards, Natural Language Processing (NLP), and computer vision. It is divided into three major sections:
- Natural Disasters in the United States: Frequency, Cost & Politics
  Analyzes disaster frequency and cost using Python and Dash, integrating data visualization and political analysis.
- RedditGoesGreen: Public Opinion on Climate Change
  Investigates public sentiment and discourse using advanced NLP techniques, including Transformer models and topic modeling.
- Urban Land Use Maps: Temporal Changes
  Converts static land use maps into interactive formats using computer vision and georeferencing for temporal analysis.
Each section showcases distinct coding skills and methodologies, providing a comprehensive view of climate change’s diverse impacts. The code for each project is organized in its respective subfolder.
Detailed Project Descriptions
Natural Disasters in the United States: Frequency, Cost & Politics
This section examines the increasing frequency and cost of natural disasters in the U.S. over recent decades, along with the political responses to these changes.
Technologies and Skills:
- Python
- Dash for interactive dashboard creation
- Data scraping and API usage
- Data analysis and visualization
Key Features:
- Visualize the frequency and cost of natural disasters over time.
- Highlight impacts on specific states and counties.
- Analyze voting patterns on climate-related legislation.
- Utilize FEMA data, Climate IRA bill voting records, and the Census API for socio-economic data.
Description:
Of the natural disasters recorded in the U.S. since 1980, 42.63% occurred in the last 12 years, highlighting the increasing frequency and cost of such events due to climate change. This dashboard visualizes these trends, examines state-specific impacts and political stances, and identifies high-risk counties based on socio-economic factors. The data sources are FEMA disaster data, public assistance data, Climate IRA bill voting records, and the Census API.
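A figure like the share of disasters falling in the last 12 years can be computed from a list of event years. The following is a minimal sketch only: the function name `recent_share`, its parameters, and the sample years are hypothetical and are not taken from the repository or from FEMA data.

```python
def recent_share(event_years, window=12, end_year=None):
    """Return the percentage of events whose year falls in the last `window` years.

    `event_years` is any iterable of integer years (hypothetical input format).
    """
    years = list(event_years)
    if not years:
        raise ValueError("no disaster records supplied")
    # Default to the most recent year present in the data.
    end = max(years) if end_year is None else end_year
    start = end - window + 1
    recent = sum(1 for y in years if start <= y <= end)
    return 100.0 * recent / len(years)


# Made-up example years for illustration, not real disaster records.
sample_years = [1985, 1992, 1999, 2005, 2011, 2014, 2018, 2020, 2021, 2022]
share = recent_share(sample_years, window=12, end_year=2022)
print(f"{share:.2f}% of events occurred in the last 12 years")
```

With real data, the list of years would come from the declaration dates in the FEMA disaster records rather than from a hand-written list.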
RedditGoesGreen: Public Opinion on Climate Change
This section explores public opinion on climate change through advanced NLP techniques, analyzing the Reddit Climate Change Dataset.
Technologies and Skills:
- PyTorch
- NLP techniques: sentiment analysis, topic modeling
- Transformer models (e.g., ClimateBERT)
- Data visualization
Key Features:
- Analyze sentiment trends over time.
- Compare the effectiveness of different NLP models.
- Extract and visualize key topics discussed in relation to climate change.
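As an illustration of the sentiment trend feature above, per-post scores can be averaged by month once a model has scored each post. This is a sketch under stated assumptions: the `(date, score)` record format, the score range of -1 to 1, and the function name `monthly_sentiment` are hypothetical, and the scores below are invented rather than model output.

```python
from collections import defaultdict
from statistics import mean


def monthly_sentiment(records):
    """Average sentiment per calendar month.

    `records` is an iterable of (iso_date_string, score) pairs,
    e.g. ("2021-01-03", 0.5). Returns a dict keyed by "YYYY-MM",
    ordered chronologically.
    """
    buckets = defaultdict(list)
    for iso_date, score in records:
        buckets[iso_date[:7]].append(score)  # first 7 chars are "YYYY-MM"
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Invented scores for illustration only.
sample = [
    ("2021-02-11", 0.25),
    ("2021-01-03", 0.5),
    ("2021-01-20", -0.5),
    ("2021-02-12", 0.75),
]
trend = monthly_sentiment(sample)
for month, avg in trend.items():
    print(month, avg)
```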
Description:
This project uses advanced NLP techniques to analyze sentiment and discourse on climate change from Reddit discussions. It evaluates the effectiveness of Transformer-based models like ClimateBERT compared to Word2Vec and uses topic modeling techniques such as LDA and BERTopic. The findings offer insights into public perceptions of climate change, valuable for policymakers, activists, and researchers.
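One way to compare models such as a ClimateBERT classifier and a Word2Vec-based baseline is to score their predictions against the same labeled sample. The sketch below uses macro-averaged F1 as the metric, which is an assumption: the README does not state which metric the project uses. The model names and labels are placeholders.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Placeholder labels and predictions, not results from the project.
gold = ["pos", "neg", "neg", "pos"]
predictions = {
    "model_a": ["pos", "neg", "neg", "pos"],
    "model_b": ["pos", "pos", "neg", "pos"],
}
results = {name: macro_f1(gold, pred) for name, pred in predictions.items()}
for name, score in results.items():
    print(f"{name}: macro F1 = {score:.3f}")
```

Macro averaging weights every sentiment class equally, which matters when one class dominates the labeled sample.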
File Structure:
- data/ folder: Contains filtered data. Note: Some datasets are stored in Google Drive due to size.
- eda/ folder: Contains exploratory data analysis scripts.
- models/ folder: Contains scripts for various NLP models and their evaluations.
Urban Land Use Maps: Temporal Changes
This section converts static images of land use maps into interactive and classified maps for analyzing temporal changes.
Technologies and Skills:
- Python
- Computer Vision (OpenCV)
- Georeferencing (QGIS)
- Data visualization
Key Features:
- Convert static land use maps into interactive formats.
- Classify land use categories and analyze changes over time.
- Provide code templates for similar processes in other cities.
Description:
Most land use maps in developing countries are stored as static images in PDFs. This section, developed for the Energy Policy Institute at Chicago (EPIC), converts static map images dating from 1972 to 2015 into interactive land use maps using computer vision and georeferencing techniques. While the resulting interactive maps cannot be shared, the provided code outlines how to perform similar processes for other cities.
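The core idea of classifying a scanned map can be illustrated by assigning each pixel to the legend color it is closest to and then comparing two years cell by cell. This is a simplified, standard-library sketch of that idea, not the repository's OpenCV pipeline: the legend colors, category names, and tiny 2x2 "maps" are all hypothetical.

```python
# Hypothetical legend: category name -> reference RGB color.
LEGEND = {
    "residential": (255, 255, 0),
    "industrial": (128, 0, 128),
    "green_space": (0, 128, 0),
}


def classify_pixel(rgb, legend=LEGEND):
    """Return the legend category whose color is nearest in squared RGB distance."""
    return min(
        legend,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, legend[name])),
    )


def classify_map(pixels, legend=LEGEND):
    """Classify a 2D grid (list of rows) of RGB tuples."""
    return [[classify_pixel(p, legend) for p in row] for row in pixels]


def changed_fraction(old, new):
    """Fraction of cells whose category differs between two same-sized grids."""
    cells = [(a, b) for row_old, row_new in zip(old, new) for a, b in zip(row_old, row_new)]
    return sum(a != b for a, b in cells) / len(cells)


# Invented pixel values standing in for two scanned maps of the same area.
map_1972 = [[(250, 250, 10), (5, 120, 5)], [(0, 130, 0), (0, 125, 3)]]
map_2015 = [[(255, 250, 0), (130, 5, 125)], [(0, 128, 0), (252, 255, 8)]]

classes_1972 = classify_map(map_1972)
classes_2015 = classify_map(map_2015)
change = changed_fraction(classes_1972, classes_2015)
print(f"{change:.0%} of cells changed category")
```

A cell-by-cell comparison like this only makes sense once both maps are georeferenced to the same grid, which is the role QGIS plays in this section.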