Using init for a subset of years

I am trying to add components to the MimiFAIRv2 model that are designed to run from 2010 onward. I added my component with the first=... parameter in add_comp!, and I was under the impression that Mimi would then only call my component for a subset of the years. However, the init function appears to be called for all years.

Also, within my init function, I am filling in various variables, and these do not complain when called with years prior to the intended Timestep. That is, when called with a Timestep for the year 1750, I would expect the variable setting to fail, since that would correspond to a negative index for the variables in this component. Instead, the year 1750 appears to start filling in the variables at index 1.

How is this intended to work?

Actually, I realized that init being called with times doesn’t make sense. Within my init function I have a loop:

for tt in dd.time
    ...
end

and this is where I get all times. And I ask for the .t attribute of those entries, so that’s why I’m getting bad indexes. But how can I just get the subsetted times for my for loop?

Hi @jrising, thanks for the question! When we call run_timestep(p,v,d,t), the function has access to richer time information (from t) than init(p,v,d) does. I have to double check the d object, but I think that, as you said, it just gives you the time dimension of the model without nuances like which times that component runs for, and thus you are getting all times.

The init function is run only once, as described in Build and Init Functions · Mimi.jl and quoted below, and is technically not meant to be used for situations where you need access to time. That said, we’ve done it before for speed and data needs so if you’re in a similar situation we can certainly work it out (see below re. MimiRFFSPs and MimiSSPs).


The init function

The init function can optionally be called within @defcomp and before run_timestep. Similarly to run_timestep, this function is called with parameters init(p, v, d), where the component state (defined by the first three arguments) has fields for the Parameters, Variables, and Dimensions of the component you defined.

If defined for a specific component, this function will run before the timestep loop, and should only be used for parameters or variables without a time index e.g. to compute the values of scalar variables that only depend on scalar parameters. Note that when using init, it may be necessary to add special handling in the run_timestep function for the first timestep, in particular for difference equations. A skeleton @defcomp script using both run_timestep and init would appear as follows:


So out of curiosity, what is the need to do this in the init function instead of run_timestep, given that you do need the time dimension? We ran into this with MimiSSPs and MimiRFFSPs, where we wanted to fill in data in the init function rather than run it through the run_timestep function. Is your situation similar?

If so, we can certainly talk about the best approach. Those two components show examples. Likely the largest difference is that the MimiSSPs approach is far slower because it uses lots of DataFrames filtering etc., while MimiRFFSPs is heavily optimized but a bit less flexible to different countries and time dimensions.

One little note on changing the time dimension of MimiRFFSPs that I’ll tag here in case it’s relevant for us later: Updating the RFFSPs Component Time Dimension · Issue #23 · rffscghg/MimiGIVE.jl · GitHub

Thanks for this response. I would be interested if there’s a good alternative to the init function for my case. I have parameter-dependent input timeseries. So, I have an RCP component, which is intended to provide baseline emissions data to other components. The way I have it set up, that component just has one Parameter{String} to specify the scenario, and then it fills in a collection of Variable vectors on init. I could fill those in one-year-at-a-time, but the data is stored in files and I want to avoid extra I/O.

By the way, I now have everything working, by having the component check if gettime(tt) >= 2010 and filling in my Variables by constructing an index TimestepIndex(gettime(tt) - 2009). It’s not elegant though.
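For anyone following along, the offset arithmetic behind this workaround can be sketched in plain Julia. This is only an illustration under stated assumptions: it uses no Mimi types, it assumes contiguous annual timesteps, and the names FIRST_YEAR, local_index, and fill_from, as well as the data values, are hypothetical.

```julia
# Plain-Julia sketch of the year-to-index offset; none of these names come from Mimi.
const FIRST_YEAR = 2010   # assumed first year of the component

# 1-based position of `year` in a vector whose first entry is FIRST_YEAR
# (the same arithmetic as gettime(tt) - 2009 when FIRST_YEAR is 2010)
local_index(year::Int) = year - FIRST_YEAR + 1

# Fill a component-length vector from a year => value lookup,
# skipping model years before the component starts.
function fill_from(model_years, data::Dict{Int,Float64})
    n = count(>=(FIRST_YEAR), model_years)
    out = fill(NaN, n)
    for year in model_years
        year >= FIRST_YEAR || continue      # ignore e.g. 1750 through 2009
        out[local_index(year)] = get(data, year, NaN)
    end
    return out
end

model_years = 1750:2015                                 # full model time dimension
data = Dict(y => float(y - 2000) for y in 2010:2015)    # made-up values
emissions = fill_from(model_years, data)
```

Inside a real init function, the guard and the index computation would play the same role as the gettime(tt) >= 2010 check and the TimestepIndex construction described above.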

Ah yes, ok, so that’s exactly the use case those two links above cover: filling in a parameter-dependent input time series. They use global variables to avoid reloading data (a clever fix by David, although it might feel convoluted, or there may be a better solution now that we’ve had time to step back) and then load everything in init, so I think your idea is very similar to ours at the time.
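As a rough illustration of the global-variable idea, here is a minimal plain-Julia sketch of a module-level cache that reads each scenario only once. It is not the actual MimiSSPs or MimiRFFSPs code; SCENARIO_CACHE, LOAD_COUNT, read_scenario_file, and load_scenario are hypothetical names, and the "file read" is a stand-in that returns made-up numbers.

```julia
# Hypothetical module-level cache, keyed by scenario name.
const SCENARIO_CACHE = Dict{String,Vector{Float64}}()
const LOAD_COUNT = Ref(0)   # counts how often the expensive read actually runs

# Stand-in for an expensive file read; a real version would parse a data file.
function read_scenario_file(scenario::String)
    LOAD_COUNT[] += 1
    return [Float64(length(scenario) + i) for i in 1:3]
end

# Return the cached series, reading the file only on the first request.
load_scenario(scenario::String) =
    get!(() -> read_scenario_file(scenario), SCENARIO_CACHE, scenario)

a = load_scenario("rcp45")   # triggers the read
b = load_scenario("rcp45")   # served from the cache
```

An init function could then call a loader like this with the scenario parameter, so repeated model runs with the same scenario would not touch the disk again.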

I think we similarly used parameters for start_year and end_year and then used those in our component to filter the dataset (in MimiRFFSPs using the Arrow file format because we really needed speed, in MimiSSPs using just DataFrames and Query because we weren’t as focused on speed), so that sounds a lot like what you did too?

We’ve been thinking it would be wise to have some sort of data loading helper for exactly this case as well, but haven’t built that out or designed it yet.

@jrising proof I’ve been thinking about this; maybe I can fix it up sometime this week: Add access to time information in the `init` function; Add Dimensions keys to `d` · Issue #948 · mimiframework/Mimi.jl · GitHub

Ha, that GitHub issue is exactly for this case. Don’t worry about it for my sake. It’s just good to know that I wasn’t misusing the system.

@jrising glad we are aligned. It’s on my todo list; I’m hoping for some time to just work on Mimi this summer, which is my favorite way to procrastinate writing my dissertation.