The Fire Model II

Who should read it

Anyone interested in simulation and those interested in an elegant solution for more efficient fire model code.

The missing part from the previous post

In my previous blog post, I introduced the fire model, a simplistic tool for simulating the spread of wildfires. The primary objective was to showcase the model’s key findings and offer insights on how beginners could approach this modeling task. However, one crucial aspect was ignored — the inherent uncertainty of systems modeling.

Modeling with unknown outcomes

Usually, the outcome of a model is unknown, requiring a thorough examination of the system’s behavior under varying inputs. In the case of the fire model, the best way to approach this task is to run simulations.

Every time we face a question regarding a probabilistic system simulations could be helpful. By running the model multiple times, often in the hundreds, thousands or even millions, with identical input parameters, we can study the system’s behavior.

In case of the fire model, we need to vary the tree density an run the model, let’s say hundred times for each density ranging from 1 to 100. This should give us a fairly clear picture of the behavior of the system.

As we already have the code we can just loop over it 10,000 times (100x100), right?

Simulations are time-intensive

There is one crucial problem: Our previous code was extremely slow, so running this loop would take ages. Code efficiency is extremely important when performing simulations. Let’s look at the fire model from a different perspective.

The forest from a different perspective

Instead of writing a detailed description I played around with manim a maths animation engine written in Python to explain the problem.

How it works

From the animation above, the general idea should be clear. The tricky part is to get from the binary matrix to the network representation. Below you find the code for calculating the area burnded and the loops for running the simulations. From the comments in the code everything should be clear. Here just a hint towards one issue with this solution. I use the igraph package, which is a package for network analysis, to calculate the size of the components in the network. The problematic part is the manhattan distance, which is a simple measure for the distance between two points in a matrix. This is the least efficient step in the code. Especially large matrices cause problems. If you familiarize yourself with the manhattan distance you should immediately see why it is inefficient in our setting.

#' Calculates the area burned in the fire model. 
#' 
#' @param mat matrix; binary matrix where 1's indicate trees
#' @description
#' The fire model explains the spread of wildfires. The function area_burned() 
#'    calculates the total area burned in the model. In the model you define a 
#'    tree density and then you randomly populate a matrix with 0's and 1's 
#'    given the tree density. The fire than spreads from the fire line (first 
#'    column) to other trees if they border a burning tree. Fire can spread up/
#'    down, left/right but not diagonally.

area_burned <- function( mat ) {
  
  # find the index of trees in the matrix
  pos <- which(mat == 1, arr.ind = TRUE) # array index is an extremely useful argument for this
  
  # calculate the manhatten distance between trees trees are connected if the 
  # manhatten distance equals 1. Manhattan distance is super simple: see for 
  # example https://de.wikipedia.org/wiki/Manhattan-Metrik
  distance_mat <- as.matrix( dist( pos, method = "manhattan", diag = T ) )
  distance_mat[ upper.tri( distance_mat ) ] <- 0
  
  # create edges (how nodes (pos) are connected)
  edge <- which( distance_mat == 1, arr.ind = TRUE )
  
  # create graph and find components
  # create a graph/network of trees that are connected (we stored the connections in edge)
  graph <- igraph::graph_from_edgelist( edge, directed = FALSE )
  
  # find components of graph using the igraph package
  comp <- igraph::components( graph )
  
  # find components that are hit by fire
  # the part which( pos[ , 2 ] == 1 ) refers to trees in the first column of
  # the matrix as only those are hit initially by the fire
  # comp$membership assigns each tree to a component (group) of the network
  comp_burned <- unique( comp$membership[ which( pos[ , 2 ] == 1 ) ] )
  
  # calculate area burned in percent
  # comp$csize is the size of each component
  area_burned <- sum( comp$csize[ comp_burned ] ) / sum( mat ) * 100
  
  return( area_burned )
  
}

#' simulate the spread of wildfires

  # install and load igraph package
  if( system.file( package = "igraph" ) == "" ) install.packages( "igraph" )
  library( "igraph" )


  set.seed( 20230526 )

  # create matrix to store results
  df_store <- as.data.frame( matrix( 0, nrow = 100, ncol = 99 ) )
  
  time <- numeric( 99 )
  
  # define matrix/forest size
  matrix_size <- 50
  
  for( density in 1:99 ){ # loop through each density
    
    print( density )
    
    # save time for each step
    start_density <- Sys.time()
    
    for( i in 1:100 ){ # run simulation 100 times for each density
      
      # populate grid with trees
      mat <- matrix( sample( c( 0, 1 ),
                             replace = TRUE,
                             size = matrix_size^2,
                             prob = c( 1-density/100,
                                       density/100 ) ),
                     ncol = matrix_size )
    
      # run area_burned() see function above and store result
      df_store[ i, density ] <- area_burned( mat )
        
    }
    
    # store time it took to run simulation for one specific density
    time[ density ] <- difftime( Sys.time(), start_density, units = "mins" )
    
    
  }
  
  plot( time*60)
  
  plot( colMeans( df_store ) )