Introduction

Wildfire causes substantial damage to private property in the US each year. Property owners can reduce the risk of wildfire damage by taking defensive action, yet many do not, for a variety of reasons (other WiRe work). The literature has identified several community-level factors that encourage people to reduce risk on their property (information campaigns, action plans, etc.). Community organizations devote considerable effort and resources to introducing and administering programs and practices that encourage residents to reduce the risk of fire damage. Understanding which characteristics of a community make its residents more amenable to specific programs or strategies could therefore help increase the rate of uptake and reduce the damage caused by wildfire.

The objective of this project is to determine whether community-level socioeconomic and biophysical attributes can be used to explain a community’s willingness to adopt programs or practices. If so, can those observable characteristics be used to classify communities and identify the programs and actions with the highest probability of success? We pursue the project objectives in two phases. Phase 1 consists of compiling data on the socioeconomic and biophysical attributes of communities and using those data to develop community classifications. A secondary component of Phase 1 is to compare these purely data-driven classifications to the archetypes established in the literature. Phase 2 investigates weaknesses in the data-driven approach as a means of identifying important information that community practitioners could collect via survey.

Methods (for internal use)

We consider 74 communities, predominantly in western Colorado, that are involved with the WiRe project with the aim of becoming more fire prepared. The communities are socioeconomically diverse and are all located in fire-prone areas. We compile a dataset of socioeconomic, biophysical, and survey information collected by the WiRe team. We use a data-driven clustering approach to classify communities. The hope is that these community groupings may aid future efforts to promote fire preparedness by allowing a deeper understanding of community characteristics before any work begins. The communities included are shown in the map below. Despite large variation in the landscape covered, most are near large tracts of public land.

Data

We compile a diverse dataset for this analysis. First, we collect socioeconomic and demographic data on communities from the American Community Survey (U.S. Census Bureau, 2017 American Community Survey 5-Year Estimates). The data are collected at the highest spatial resolution available (tracts). However, the communities are self-defined and do not necessarily correspond to legal delineations. We use housing location data (MS Structures) to calculate the fraction of each community that intersects with each census geography, and we use these fractions as weights to associate the census data with each community. The variables are:
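The weighting step can be sketched as follows. This is a minimal illustration in Python rather than the project's own workflow, and it assumes the weights are each tract's share of a community's structures. The function name, the community "Example Mesa", the tract IDs, and the income figures are all hypothetical.

```python
def community_estimates(structure_counts, tract_values):
    """Weight tract-level values by each tract's share of a community's structures.

    structure_counts: {community: {tract_id: number of structures}}
    tract_values:     {tract_id: value of one census variable}
    Both inputs are hypothetical stand-ins for the real data.
    """
    estimates = {}
    for community, counts in structure_counts.items():
        total = sum(counts.values())
        if total == 0:
            continue  # no structures located, so no weights can be formed
        estimates[community] = sum(
            (n / total) * tract_values[tract] for tract, n in counts.items()
        )
    return estimates


# Hypothetical example: 30 of a community's 40 structures fall in tract "A".
structure_counts = {"Example Mesa": {"A": 30, "B": 10}}
median_income = {"A": 60000.0, "B": 40000.0}
estimates = community_estimates(structure_counts, median_income)
print(estimates)  # {'Example Mesa': 55000.0}
```

A weighted average of this kind is exact for counts and shares but only approximate for tract-level medians.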

We collect biophysical data from numerous sources to capture landscape-level features that may influence wildfire risk as well as people’s preferences for natural amenities. Since many of these features lie close to the community boundaries, we spatially merge biophysical data within a 5 km radius (and include the community itself). We calculate the area that each federal agency manages or has jurisdiction over with regard to wildfire management (WFDSS, 2010). We calculate the average ground evacuation time of the land within the buffer as a proxy for community ruralness (WFDSS). Ground evacuation time is the time in hours from a given point to a definitive care facility (hospital). We calculate the area with wilderness designation within 5 km of the community. We calculate the average wildfire hazard potential (Dillon, 2018) to capture the risk of fire near the community. We calculate the number of recreation opportunities (e.g., reservoirs, trailheads, and campgrounds) within 5 km of the community. We calculate the area of land by existing vegetation type (20 categories) from Landfire. Lastly, we construct the mean, median, and 25th and 75th percentiles of elevation, slope, aspect, and ruggedness based on a 30-meter digital elevation model (elevatr package).

We add the socioeconomic and biophysical data to survey data collected by the WiRe team from… more info on the surveys and the data.

Clustering

After the dataset is created, the ClustOfVar package is applied to different subsets of the above data (the census data; the census and biophysical data; and the census, biophysical, and WiRe community data) for dimensionality reduction, because the number of variables is larger than the number of observations. This is done through Principal Component Analysis (PCA).

To determine the optimal number of components, a bootstrap method is applied to the hierarchical clustering (with the number of bootstrap replicates set at around one half of the number of variables). The hierarchical clustering of the variables is recalculated repeatedly, each time on a set of observations resampled with replacement. This shows how much any one of the variables influences the groupings and indicates the optimal number of variable groupings. For the demographic and biophysical clusters, the optimal grouping has six groups, so the subsequent community clustering is carried out in six dimensions. The ClustOfVar package also supports categorical (qualitative) variables, which are used in the case of the WiRe community groupings, through Multiple Correspondence Analysis (MCA), the counterpart of PCA for categorical data.

Once the number of components is determined, the leading principal components of the dataset are retained up to that number. Each variable is then assigned to the direction with which it is most strongly correlated, which determines the group membership of each variable. A synthetic variable is then constructed for each group as a composite of all of its members. Together, the synthetic variables define a new space onto which the communities are mapped.
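The assignment and composite steps can be illustrated with a small Python sketch. It is not the ClustOfVar implementation: here the composite is simply a sign-aligned average of standardized members, which stands in for whatever composite the package builds. All function names, variable names, and numbers are hypothetical, and every variable is assumed to be numeric and non-constant.

```python
from statistics import mean, pstdev


def standardize(values):
    """Center and scale a list of numbers (assumes the values are not constant)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    zx, zy = standardize(x), standardize(y)
    return mean(a * b for a, b in zip(zx, zy))


def assign_variables(variables, components):
    """Map each variable name to the index of its most strongly correlated component."""
    membership = {}
    for name, values in variables.items():
        strengths = [correlation(values, comp) ** 2 for comp in components]
        membership[name] = strengths.index(max(strengths))
    return membership


def synthetic_variable(variables, members, component):
    """Average the standardized members, flipped to agree in sign with the component."""
    columns = []
    for name in members:
        sign = 1.0 if correlation(variables[name], component) >= 0 else -1.0
        columns.append([sign * z for z in standardize(variables[name])])
    return [mean(row) for row in zip(*columns)]


# Hypothetical component scores and variables for five communities.
components = [[-1.0, -0.5, 0.0, 0.5, 1.0], [1.0, -1.0, 0.0, -1.0, 1.0]]
variables = {
    "income": [30.0, 42.0, 50.0, 61.0, 70.0],
    "degree_share": [0.10, 0.18, 0.22, 0.30, 0.41],
    "slope": [12.0, 3.0, 7.0, 4.0, 11.0],
}
membership = assign_variables(variables, components)
composite = synthetic_variable(variables, ["income", "degree_share"], components[0])
```

In this toy example, income and degree share load on the first component and slope on the second, so each community's position in the new space is given by one composite score per group.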

This new mapping is the basis for clustering the communities, which is done using the k-means algorithm (cluster package). The algorithm chooses as many starting points as there are clusters and assigns each community to the closest one. The centroid of each group is then calculated, and each community is reassigned to the closest centroid. This process repeats until no centroid moves when recalculated. The algorithm is also run several times to avoid issues that may arise from unusual starting points and to ensure that the result does not depend on them.
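The procedure described above can be sketched in a few lines of Python. This is an illustration only, not the R routine used in the analysis. It assumes starting points are drawn from the data and that the best of several restarts is the one with the smallest within-cluster sum of squares. The function names and the six example points are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run from random starting points drawn from the data."""
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Assign each point to its closest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # no centroid moved, so the run has converged
        centroids = new_centroids
    cost = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, cost


def kmeans(points, k, restarts=10, seed=0):
    """Repeat k-means from several starts and keep the lowest-cost solution."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[2])


# Hypothetical communities in a two-dimensional synthetic-variable space.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centroids, cost = kmeans(points, k=2)
```

Keeping the lowest-cost run is one common way to reduce sensitivity to starting points; the numeric cluster labels themselves are arbitrary and can differ between runs.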

Communities in Clusters
Bachelor Groups/Ranching Communities: TIMBER RIDGE; TWINCREEK VILLAGE; Cedar Hill; Colby Canyon; Cottonwood; Fire Mountain; Hidden Valley; Highway 65 Corridor; Leroux; Long Gulch; Needle Rock; North Redlands; North Rogers Mesa; Northridge; South Redlands; South Rogers Mesa; Stucker Mesa; Surface Creek; Cash Canyon/Stinking Springs; Indian Camp Ranch; Kernan Creek Ranch; McElmo; Trail Canyon

Country Orchards: Cedar Mesa; Fruitland Mesa; North Hotchkiss; North Hotchkiss; TIMBERLINE VIEW; Log Hill Village / Fairway Pines; North Log Hill Mesa; Cedar Mesa

Educated Ski-Towns: Aldasoro; Hastings Mesa; Lawson Hill; Lower Valley; Ophir; San Bernardo/Priest Lake; Trout Lake; Two Rivers Subdivision; Lower Mountain Village; Upper Mountain Village

Small Working Town: LAKE HATCHER; LOMA LINDA; SAN JUAN RIV; Stoney Creek; Rainbow Estates; ASPEN TRAILS; BUENA VISTA RANCH; COLUMBINE; EDGEMONT RANCH; ENCHANTED FOREST; FALLS CREEK; LOS RANCHITOS; MESA VISTA ESTATES; SAILING HAWKS; TIMBERDALE; TRAPPERS CROSSING; Elk Springs Ranch; Elk Stream Ranch; Green Gate; Jackson Gulch; Oakview; Pine Ridge/Wapiti Rim; Radical Ridge; Road 42; Skyline; Sundance; Sunwest and 41.2; Brown Ranch; Illium Valley/Ames; Iron/Mackenzie Springs; Specie Mesa; DEER VALLEY

Telluride: Telluride/Hillside

Results

Grouping the communities into five clusters reveals distinct trends. The clusters are not all the same size; in fact, one is a singleton made up solely of the community of Telluride/Hillside. The other four show trends that lend themselves to representative characterization.

Educated Ski-Towns: Located close to ski resorts and high-amenity ski towns, these communities are largely made up of established, predominantly male white-collar professionals with advanced degrees.

Small Working Towns: Containing fewer than one thousand people, these communities are oriented toward the surrounding agriculture and resource extraction, but their populations are aging. Residents are largely established, middle-income couples with high school educations.

Bachelor Groups/Ranching Communities: Small pockets of extraction-based employment, largely for young men with some college experience, on the slopes along the southern edge of the state, mixed in with small groups of ranches.

Country Orchards: On the western slopes, these communities are oriented toward local tourism and agriculture. They depend on their scenic nature and are made up of the youngest and most diverse group of people. Local industry seems to provide moderate to high incomes to these young families.

The bachelor group seems to have the most heterogeneity: there appear to be groupings of young families mixed in with some extraction-based work as well as ranches. These may also be communities in the process of evolution; only a single snapshot in time is represented here.

Cluster Characteristics

The next set of graphics shows the distribution of community attributes within each data-driven cluster. The colors correspond to those in the clustering figures above. The size of each dot corresponds to the number of communities at a given level of the y-axis variable (e.g., household income). Some of the groups above are more cohesive on common grouping variables such as income, while others encompass more variation. This can be seen when boxplots are constructed by grouping. Elevation and slope seem to be correlated with higher-income communities, which may indicate a more leisure-oriented lifestyle.

These characterizations are based directly on observable statistics, unlike the archetypes, which are based more on attitudes and social interactions. Clearly, attitudes and social intricacies cannot be represented purely by the demographics associated with them; however, demographics can serve as an early indicator for those who are unfamiliar with an area of interest. There is a nonzero level of correlation between the two groupings: some of the groups fit nicely together, while others seem more disparate.

Once the WiRe community variables are included, a group’s propensity to participate in a preventative action can be assessed.

Probabilities by Socio/Biophysical Clusters
Cluster Defensible space Codes Any Building Code Comm Slash Pile Community Mitigation Projects Grant Interest Firewise Community Communication New Resident Communication Fire Danger Sign Communication Mechanism evacuation plan
Bachelor Groups/ Ranching Communities 0.00000 0.0434783 0.7826087 0.3043478 0.2608696 0.0000 0.8695652 0.1304348 0.000 0.9130435 0.7391304
Country orchards 0.12500 0.3750000 0.8750000 0.7500000 0.6250000 0.2500 1.0000000 0.0000000 0.375 0.8750000 0.7500000
Educated Ski-Towns 0.20000 0.1000000 1.0000000 0.3000000 0.7000000 0.0000 1.0000000 0.0000000 0.200 1.0000000 1.0000000
Small Working Town 0.03125 0.2187500 0.5000000 0.5625000 0.7500000 0.1875 0.5312500 0.2187500 0.125 0.6250000 0.3750000
Telluride 0.00000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000 1.0000000 0.0000000 0.000 1.0000000 1.0000000

Probabilities by Community Clusters
Cluster Defensible space Codes Any Building Code Comm Slash Pile Community Mitigation Projects Grant Interest Firewise Community Communication New Resident Communication Fire Danger Sign Communication Mechanism evacuation plan
1 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000000 1.0000000 1.0000000
2 0.0294118 0.1470588 0.4411765 0.5882353 0.7647059 0.1176471 0.5000000 0.2352941 0.0588235 0.5882353 0.2647059
3 0.1111111 0.4444444 0.7777778 0.7777778 0.6666667 0.2222222 1.0000000 0.1111111 0.3333333 0.8888889 0.7777778
4 0.0000000 0.0000000 1.0000000 0.1176471 0.0588235 0.0000000 1.0000000 0.0000000 0.0000000 1.0000000 1.0000000
5 0.1538462 0.2307692 0.9230769 0.3846154 0.6923077 0.1538462 0.9230769 0.0769231 0.3076923 1.0000000 0.9230769
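One way to read the tables above is that each entry is the share of communities in a cluster that have the program or practice in place. Under that assumption, the calculation can be sketched in Python as follows. The function name and the records are hypothetical and are not taken from the WiRe data.

```python
from collections import defaultdict


def adoption_shares(records):
    """Share of communities per cluster with a program in place.

    records: iterable of (cluster, has_program) pairs, where has_program is a bool.
    """
    totals = defaultdict(int)
    adopters = defaultdict(int)
    for cluster, has_program in records:
        totals[cluster] += 1
        adopters[cluster] += int(has_program)
    return {cluster: adopters[cluster] / totals[cluster] for cluster in totals}


# Hypothetical records: one of four communities in cluster "A" has an evacuation plan.
records = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", True),
]
shares = adoption_shares(records)
print(shares)  # {'A': 0.25, 'B': 1.0}
```

Shares computed from small clusters move in large steps, and a singleton cluster can only take the values 0 or 1.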

Conclusion