A question motivated by agent-based modeling articles I've been reading...

After reading several articles on agent-based modeling, I am struggling to answer a question that I've been pondering: How do you know that your agent-based model has a complete set of information?

When modeling customer behavior at a supermarket or amusement park, traders' behavior at NASDAQ, driver behavior in traffic, or any other human phenomenon, there are countless variables of human behavior that must be taken into account. When these models form a conclusion about how people act, and about how theoretical changes in the structure of an institution will affect human actions, how do they know that they have not left out a crucial piece of our thought processes, one that may be lost among the myriad of processes that have already been accounted for?

For example, in the supermarket model discussed in one of my earlier blogs, a supermarket owner makes management decisions based upon his judgement of the human action simulated by the model. But how does he know that, once shoppers enter the supermarket, their behavior won't be influenced by a factor that has not been accounted for? Perhaps the shopper is in a hurry, or prefers some brands over others. If these "forgotten factors" become prevalent enough, the findings of the model could become useless.
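
To make the concern concrete, here is a minimal sketch (a hypothetical shopper agent with made-up numbers, not the supermarket model from the earlier blog) of how a single omitted factor, such as being in a hurry, can shift a model's aggregate findings once it becomes prevalent enough:

    import random

    class Shopper:
        # A toy shopper agent; "in_a_hurry" is the kind of factor a model might omit.
        def __init__(self, budget, in_a_hurry=False):
            self.budget = budget
            self.in_a_hurry = in_a_hurry

        def shop(self, prices):
            # Hurried shoppers browse fewer aisles, so they buy fewer items.
            aisles = 3 if self.in_a_hurry else 10
            spent, items = 0.0, 0
            for _ in range(aisles):
                price = random.choice(prices)
                if spent + price <= self.budget:
                    spent += price
                    items += 1
            return items

    def average_basket(hurried_fraction, n=10000, seed=42):
        # Average items bought per shopper, given some fraction of hurried shoppers.
        random.seed(seed)
        prices = [1.5, 3.0, 4.5, 7.0]
        total_items = 0
        for _ in range(n):
            shopper = Shopper(budget=30.0, in_a_hurry=random.random() < hurried_fraction)
            total_items += shopper.shop(prices)
        return total_items / n

    print("0% of shoppers hurried :", average_basket(0.0))
    print("50% of shoppers hurried:", average_basket(0.5))

If the real shopper population is half hurried and the model assumes no one is, the owner is making decisions from the wrong average basket size.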

It seems to me that unless a model really strives to replicate human behavior in its entirety, its findings can be called into question.

Comments

"Another issue has to do with the very nature of the systems one is modeling with ABM in the social sciences: they most often involve human agents, with potentially irrational behavior, subjective choices, and complex psychology – in other words, soft factors, difficult to quantify, calibrate, and sometimes justify. Although this may constitute a major source of problems in interpreting the outcomes of simulations, it is fair to say that in most cases ABM is simply the only game in town to deal with such situations. Having said that, one must be careful, then, in how one uses ABM: for example, one must not make decisions on the basis of the quantitative outcome of a simulation that should be interpreted purely at the qualitative level. Because of the varying degree of accuracy and completeness in the input to the model (data, expertise, etc.), the nature of the output is similarly varied, ranging from purely qualitative insights all the way to quantitative results usable for decision-making and implementation."

From: http://www.pnas.org/cgi/content/full/99/suppl_3/7280

Jalel, I fully understand and appreciate your concern. This is what I call the modeller's "Deer in the Headlights Effect". If we worry too much about precision and validation, we talk ourselves out of even starting down the road that will lead us to a useful result.

All models are approximations, especially simulation models. A model is an abstraction and a simplification of a "real world" system. The challenge is to find a balance between the overly simple and the overly complex model. These extremes are anchored by the application of the model. If, for example, the model is to be used in the development of a control system for a nuclear reactor, adequate complexity is a necessity. So, too, would you want rigorous accuracy if you were developing a simulation model for training nuclear reactor operation technicians. The lives of potentially millions are at stake based on the accuracy and completeness of the model embedded in the control system and underlying the training simulator.

But such applications are on one extreme of potential uses of agent-based simulation. There are many applications of simulation modeling where "good enough" is, well, good enough. In exploratory learning applications, model rigor (that is, how accurately it renders the "countless variables" you mention of its target real world system) has to be tempered with learner interest and abilities. A nuclear reactor simulation for junior high students wanting to understand what happened at Three Mile Island would, for example, be considerably less rigorous than the adult operations technician training simulator mentioned earlier. In this case, model simplification is a requirement rather than a failure point.

Think about your experience playing the SimFarm game. You certainly would not expect that a real life farmer could learn useful farming technique by playing this game. But countless non-farmers, young and old, can gain a deeper appreciation and understanding of the challenges and satisfactions of farming by walking a virtual mile in the virtual boots of a farmer by playing SimFarm.

So, what we want to accomplish with the Local Food Economy Game is something more akin to SimFarm than to the nuclear reactor training simulation. We need enough rigor to be authentic to the dynamics of a local food economy, but it does not need to be so rigorous as to be useful in economic forecasting or policy decision-making.

The challenge, then, as you so rightly are wrestling with, is to determine the sufficient combination of those countless variables you are thinking about so as to be useful from an exploratory learning perspective. Our initial "user"/player/learner is a casual learner interested in the impact of his or her participation in the local food economy. We need our agent-based simulation to be rigorous enough to be instructive, but "relaxed" enough to be fun and engaging, so that these folks stick with it long enough to learn by doing rather than learn by being told.

There are two threads I want to add to this hopper. A few weeks back, as Jalel was eager to jump into the modeling of the local food economy, I put the brakes on the modeling and required that we do more background "homework" on farmers markets in general, on the 8 counties surrounding Grinnell and Fairfield (and their farmers' markets), on the nitty gritty of activity that takes place in the farmers' market environment, etc., etc.

My point was to put the modeling aside while we really focused on determining the wide range of variables that interact in these two local food systems (in any system, for that matter) for the very reasons both Jalel and Jim are discussing here. So the frustration Jalel raises and the caveats on ABM are exactly what we should expect.

Systems are complex; abstractions are ... well, abstractions. The more you know about the complexity and the interplay of the many, many variables in a system, the more valid are the abstractions you design to model that system.

The same thing is true in any abstraction. How you intend to use that abstraction (model) determines the number of variables you must capture in that model in order for it to be valid for the use intended.

Econometrics, statistical analysis, forecasting, scientific research, and the like require incredible attention to identifying the sheer volume of variables in the system being modeled, and rigorous concern about which of those many variables need to be included in a given algorithm. That, in turn, generates endless debate about the importance each selected variable carries within the system (the weighting coefficients, for example) once it is included in the abstraction. This is the stuff that drives new discoveries and questions or debunks established ones.
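
As a toy illustration only (the variables, values, and weights below are made up, not drawn from any data on the Grinnell or Fairfield markets), the debate comes down to which variables enter the abstraction and what coefficient each one carries:

    # Hypothetical variables selected for inclusion in the abstraction
    selected_variables = {
        "distance_to_market_miles": 4.0,
        "price_premium_pct": 12.0,
        "weekly_market_hours": 6.0,
    }

    # Hypothetical weighting coefficients: the part that generates the debate
    weights = {
        "distance_to_market_miles": -0.30,
        "price_premium_pct": -0.05,
        "weekly_market_hours": 0.20,
    }

    # A simple weighted sum standing in for "the algorithm"
    score = sum(weights[name] * value for name, value in selected_variables.items())
    print(f"participation score: {score:.2f}")

Change which variables are selected, or nudge the weights, and the score (and any conclusion built on it) changes with them.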

So for our purpose here, as we continue to gather more and more information about the complexity of variables that interplay in this local food economy system, and as we begin to select those variables that get included in our system model, we must continually be willing to iterate as more information becomes available and the intended use of the model evolves.

But give up in frustration in the face of complexity? No good researcher is willing to do that; you plow ahead, armed with a growing understanding of the system you are modeling and of the purposes that model can serve.