Setting a distributions over large dimensions in @defsim

jrising · May 28, 2024, 2:16am

I have 183 regions, and I would like to set a parameter for each region to be drawn from a Uniform(0, 1) distribution. I am using a standard @defsim macro. In theory, I understand that I could do this by creating a large list like MyComponent_uniforms = ["USA" => Uniform(0, 1), "SOM" => Uniform(0, 1), etc.] or many commands like MyComponent_uniforms["USA"] = Uniform(0, 1). But I have tried both:

MyComponent_uniforms = [iso => Uniform(0, 1) for iso in allisos]

and

for iso in get_countryinfo().ISO3
    RCPSSPScenario.rateuniforms[iso] = Uniform(0, 1)
end

and these both give an “Unrecognized expression” error.

How can I do this?

lrennels · June 2, 2024, 4:33pm

Hi @jrising, good question I’ve struggled with this as well. I think the primary issue here is the rigidity of the @defsim macro, which cannot parse as wide of an array of expressions as I’d like. It’s something I’d like to work on and can add to the Issues list.

In the meantime, what I’ve done in models like GIVE that really need more flexilbility is use the underlying helper functions that @defsim actually uses to build up my mcs. You can even use the macro for the simpler cases and then add to it as needed. The GIVE code could be an example. That is here: MimiGIVE.jl/src/main_mcs.jl at main · rffscghg/MimiGIVE.jl · GitHub

First we construct a small mcs function, which could be empty but some of our random variables are pretty simple so we start with that.

 # define the Monte Carlo Simulation and add some simple random variables
    mcs = @defsim begin
        dice2016R2_damage.a2 = Normal(0.00236, 0.00236/2) # Nordhaus (2017, PNAS) DICE2016 RV 
    end

and then use small functions to add random variables to it with more flexibility.

The agriculture would be a good example for you I think, where we have one RV per region per coefficient:

for coef in [1,2,3] # three coefficients defined with an anonymous dimension
    for (i, region) in enumerate(["USA","CAN","WEU","JPK","ANZ","EEU","FSU","MDE","CAM","LAM","SAS","SEA","CHI","MAF","SSA","SIS"]) # fund regions for ag
         rv_name = Symbol("rv_gtap_coef$(coef)_$region")
         add_RV!(mcs, rv_name, ag_sample_stores[i, coef])
         add_transform!(mcs, :Agriculture, :gtap_df, :(=), rv_name, [region, coef])
    end
end

So in your case I think you’d have something along the lines of:

 for (i, country) in enumerate(get_countryinfo().ISO3)
    rv_name =  rv_name = Symbol("rv_uniform_$country")
    add_RV!(mcd, rv_name, Uniform(0,1))
    add_transform!(mcs, :MyComponent, :MyParameter, :(=), rv_name, [country]) 
end

Where the first line makes a local variable for the random variable, the second lines adds a random variable with that name to the Monte Carlo simulation definition, and the third line attaches your parameter to that random variable, sets it equal (:(=) there are other options but they are rarely used), and then connects to the country dimension of your model. You can change that if you call it something else?

Let me know if that helps and be in touch with any follow up!

I always appreciate your questions, and I’m looking to carve out some time to make some feature additions to Mimi

jrising · June 4, 2024, 2:28am

This sounds like a great solution, but I think there might be a bug in _perturb_param!. I have my parameter, rateuniforms = Parameter(index=[country]), and your code (with name-changes), but I get the error “ArgumentError: indexed assignment with a single value to possibly many locations is not supported; perhaps use broadcasting .= instead?” on this line:

github.com

mimiframework/Mimi.jl/blob/6217072492aa4a723f4bdf78b52efc598770ff0b/src/mcs/montecarlo.jl#L381


      
          indices = _param_indices(param, md, i, trans)
          
          if op == :(=)
              
              # first we check for a time index
              ti = get_time_index_position(param)
          
              # If there is no time index we have all methods needed to broadcast normally
              if isnothing(ti)
                  broadcast_flag = sum(map(x -> length(x) > 1, indices)) > 0
                  broadcast_flag ? pvalue[indices...] .= rvalue : pvalue[indices...] = rvalue
              
              else
                  indices1, ts, indices2 = indices[1:ti - 1], indices[ti], indices[ti + 1:end]
                  non_ts_indices = [indices1..., indices2...]
                  broadcast_flag = isempty(non_ts_indices) ? false : sum(map(x -> length(x) > 1, non_ts_indices)) > 0
                  
                  # Loop over the Array of TimestepIndex 
                  if isa(ts, Array) 
                      for el in ts
                          broadcast_flag ? pvalue[indices1..., el, indices2...] .= rvalue : pvalue[indices1..., el, indices2...] = rvalue

When I look at the indices array that is being used to set broadcast_flag, I see Any[[1]] for this case. I think your code works because you have [region, coef], which probably comes in as [[1, 1]], so I’m guessing that I could make this work by adding a dummy dimension.

I tried to figure out from the defsim macro code how this is supposed to work normally, but it looks like I would end up with essentially what you wrote, so I’m not sure what to do.

lrennels · June 13, 2024, 8:10am

@jrising sorry I just saw this! Will get to it tomorrow and post back.

lrennels · June 19, 2024, 8:09am

@jrising working on debugging this, am I right that the country dimension of your model has something like 184 elements? So the minimum model to recreate would be to have a single component model with two dimensions, time and country, a rateuniforms Parameter with the country dimension, and then the mcs code I added?

lrennels · June 19, 2024, 8:16am

Ah ok @jrising I lead you astray, the argument at the end there is not the dimension name it’s the actual label. For example from GIVE:

  # add one transform per country asigning each to the appropriate regional random variable
    for row in 1:size(cromar_mapping_raw, 1)
        rv_name = Symbol("rv_β_mortality_$(cromar_mapping_raw.cromar_region[row])")
        add_transform!(mcs, :CromarMortality, :β_mortality, :(=), rv_name, [cromar_mapping_raw.ISO3[row]])
    end

So I think my example code was correct but my explanation incorrect which perhaps caused a problem?

I’ll build a mini example and try it out though!

jrising · June 20, 2024, 12:14pm

I didn’t think that country in your code was the dimension name-- I understood that it was the specific region label. Were you able to get a minimum working example for this? I could make one for you, if not. Within the context of the whole of PAGE, I’ve made a single set of changes that I imagine is intended to work, but causes the error above:

lrennels · June 20, 2024, 7:04pm

Thanks @jrising perfect, I’ll dig into this now.

lrennels · June 20, 2024, 11:53pm

Thanks for the small example @jrising super helpful, I’ve been comparing to what we do in MimiGIVE etc. You might have sort of figured this out so sorry for verbosity but you know Julia and Mimi so well I figure I’ll explain

So what’s going on here is that we do not want the broadcast_flag to be true, because we assigning one value (rvalue) to one spot (pvalue[indices...]) which for example for the country MDV would be pvalue[174].

However, for some reason indices in your implementation is a vector of vectors …

julia> indices = Mimi._param_indices(param, m.md, i, trans)
1-element Vector{Any}:
 [174]

Whereas for my implementation in MimiGIVE of similar work indices is a Vector of Floats

indices = Mimi._param_indices(param, m.md, i, trans)
1-element Vector{Any}:
 12

So we get this error because we do

pvalue[indices...] .= rvalue

where pvalue is a Vector of Floats, rvalue is a Float, BUT indices is a Vector of vectors. More succinctly in an example:

julia> a = zeros(1)
1-element Vector{Float64}:
 0.0

julia> b = [[1]]
1-element Vector{Vector{Int64}}:
 [1]

julia> a[b...] = 3
ERROR: ArgumentError: indexed assignment with a single value to possibly many locations is not supported; perhaps use broadcasting `.=` instead?
Stacktrace:
 [1] setindex_shape_check(::Int64, ::Int64)
   @ Base ./indices.jl:261
 [2] _unsafe_setindex!(::IndexLinear, A::Vector{Float64}, x::Int64, I::Vector{Int64})
   @ Base ./multidimensional.jl:953
 [3] _setindex!
   @ Base ./multidimensional.jl:944 [inlined]
 [4] setindex!(A::Vector{Float64}, v::Int64, I::Vector{Int64})
   @ Base ./abstractarray.jl:1393
 [5] top-level scope
   @ REPL[132]:1

lrennels · June 20, 2024, 11:56pm

I now realize that

indices = Mimi._param_indices(param, m.md, i, trans)

in my model gives me.a Vector of Integers, but yours gives me a Vector of Vectors of (single) Integers – so broadcasting is not flagged but then errors.

It seems like this might have something to do with your dimension :country is of type String15 instead of String. I haven’t seen this as I tend to use CSVFiles which only generates the typical Julia String.

Our code should be robust to this, I’m working on a simple way to clean it up here: WIP Use Different Approach to Check for Broadcasting in Perturb Param by lrennels · Pull Request #999 · mimiframework/Mimi.jl · GitHub which I should finish tomorrow.

The main issue is that I need to decide when to broadcast and when not to, can’t just overhaul and always broadcast or you get errors the other way. There maybe a nice solution with recursion too.

jrising · June 23, 2024, 10:15pm

Thank you so much for your help with this! I feel like we need an “advanced MC API” part of the documentation to explain how to do this, and also the use case for drawing integers to initialize more complicated variable sets, and the non-sim-based MC setup that we’ve both been using with FAIR.

lrennels · June 23, 2024, 11:39pm

Yes I totally agree, with GIVE etc. we now have enough of a code base to produce some solid examples as well.

I’d never run into these needs until GIVE but now find them really helpful. It’s also helpful because you can use these techniques for a post_mcs_creation_function! that modifies a given mcs.

Topic		Replies	Views
Using conditionals in @defsim, and add_RV! for vectors	3	800	January 27, 2023
MCS - Assigning random variables to indexed variable	5	635	December 10, 2020
Make MCS @defmcs accept constrained distributions?	36	780	March 11, 2019
Monte Carlo Simulation Update	7	531	May 3, 2019
Set a parameter as a distribution	13	1091	September 11, 2020

Setting a distributions over large dimensions in @defsim

Related topics