The Scientific Organization: Organizing U.S. Climate Modeling

September 30th, 2011, by Richard Rood


Summary: In order to address the need to provide climate-model products, a new type of organization is needed. This organization needs to focus on, and to be organized to support, the unifying branch of the scientific method. This requires application-driven model development. It will require the organization as a whole to develop hypotheses, design experiments, and document methods of evaluation and validation. In such an organization the development of standards and infrastructure supports controlled experimentation, and hence the scientific method, in contrast to the arguments that have been used in the past to resist the development of standards and infrastructure. Such an organization, in which a collection of scientists behaves as a “scientist,” requires governance structures to support decision making and management structures to support the generation of products. It must be envisioned as a whole and developed as a whole.

Introduction

Over the past 25 years there have been many reports written about climate and weather modeling (example), climate and weather observing systems (example), high performance computing (example), and how to improve the transition from research to operations (example). A number of common themes emerge from these reports. First, the reports consistently conclude with commendation of the creativity and quality of U.S. scientific research. Second, the reports call for more integration across the federal agencies to address documented “needs” for climate-science products. De facto, the large number of these reports suggests that there is a long-held perception that U.S. activities in climate science are not as effective as they need to be or could be. The fact that there are reports with consistent messages for more than two decades suggests that our efforts at integration are not as effective as required.

Integration within existing organizations and across organizations is always difficult. Within institutions when there is a push towards more integration of research, there is both individual and institutional resistance (earlier entry). This resistance occurs for many reasons, both good and bad, both structural and cultural. I want to focus on those reasons that appeal to the sanctity of “the science.”

There are attributes of science-based research that, perhaps, pose additional challenges. These arguments are often based on the notion that creativity and innovation cannot be managed. Further arguments rely on the observation that many innovations come from unexpected places and cannot be anticipated. Therefore, the creative edge of science needs to be left unencumbered by the heavy hand of management needed to assure integration.

Another argument used to resist integration of scientific activities maintains that complying with the standards required to integrate component pieces into a whole hurts the integrity of “the science.” Two lines of reasoning support this. The first points to examples where management focused attention and resources on, for example, facilitating technology (cyberinfrastructure), and the result was of dubious scientific integrity, or amounted to a diversion of resources that could have been used to support “the science.” The second is that by the time a particular component, say the algorithm that calculates ice particles in clouds, is integrated into an operational weather or climate model, that algorithm is no longer state-of-the-art. Therefore, again, the integrating activity is a step behind the “best science.”

Such arguments serve to benefit the dominant type of scientific effort in the U.S.: the efforts associated with individual scientists, who focus (or reduce) their problems in such a way as to isolate something specific and to determine cause and effect. This reductionist approach to investigation is central to the classic scientific method and is an effective method of discovery or knowledge generation.

The focus on reductionist investigation, however, comes at the expense of the unifying path of science; that is, how do all of the pieces fit together? This unifying path requires a synthesis of knowledge. This synthesis does, in fact, lead to new discoveries because when the pieces do not fit together, then we are required to ask – why not? The synthesis of scientific knowledge is also required to, for example, forecast the weather or climate or to adapt to sea level rise – or more generally, problem solving. (An excellent reference on this: Consilience by E.O. Wilson, and A complex book review)

My ultimate thesis is that a focus on unified science does not come at the expense of “the science,” and does not undermine the scientific method or the integrity of “the science.” If we are going to address the recommendations and ambitions expressed in the reports linked above, then we must develop a practice of science that supports the synthesis and unification of science-generated knowledge.

The Scientific Method and The Scientific Organization

Core to the scientific method is checking. In a good scientific paper most of the text is spent describing the results and how those results are defended – determined to be correct. A scrupulous reader looks for independence in the testing and validation; that is, how is unbiased information brought into the research to evaluate the results. Then the paper is subjected to peer review, which is another form of verification. Once a paper is published, it becomes fair game for all to question, and there is, ultimately, a requirement that the result be verified by independent investigation. If the result cannot be reproduced, then there is little acceptance of the result as correct (see Wikipedia Cold Fusion).

This process of checking is ingrained into scientists, and those who develop a sustaining legacy as quality researchers are always expert on how to check results in multiple ways. It is also true that on the individual level, it is ingrained into the scientist to question the results of others – to be skeptical. Therefore, at the level of the individual, the culture of scientific investigation does not promote synthesis, integration, or unification. Quite the contrary, what is promoted is the creation of many high quality granules of knowledge. These granules may or may not fit together to form a consistent body of knowledge.

The synthesis, the fitting together, of scientific knowledge to address specific problems does not happen organically. It requires attention; it requires some form of organization or coordination or management. Within the field of climate science, much effort is spent on “assessment” (IPCC Assessments, National Climate Assessment). These assessments are the synthesis and evaluation of the body of knowledge. The ability to provide modeling information for these assessments is one of the motivations for the reports listed at the beginning of the article. Many scientists view these assessments as a halting of research, and they are widely viewed as a tax on the science community. Some hold that they impede model development.

Building climate models is also a matter of synthesis and integration. Tangibly, software components are brought together in a unified environment: software components that represent physical, chemical, or biological processes; software components that provide connectivity; software components that represent computational infrastructure. The software components embody science-based research; hence, the intellectual capital of research is integrated. When a model is built, it is the function of the model as a whole, rather than the performance of individual components, that becomes the focus. It is the science of the collected, rather than the science of the individual. Therefore, the focus needs to be brought to this integrated system. This integrated system poses a different set of scientific challenges than those posed by investigation of the individual components; there is science associated with integration, with coupling components. The integrated system also poses a set of information technology challenges that are crucial to the execution of the model and that also affect which components can be used in a particular configuration of the model. Like assessments, model building requires organization or coordination or management.
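
To make the notion of coupling components concrete, here is a minimal sketch, in Python, of the pattern described above: science components that conform to a common interface, and infrastructure that advances them on a shared model state. The class names, the placeholder physics, and the interface itself are hypothetical illustrations, not the API of any actual coupling framework.

```python
# A minimal, hypothetical sketch of component coupling; not any real framework's API.
# Science components share one interface; infrastructure runs them on a common state.

class Component:
    """A science component (e.g., cloud microphysics, radiation, surface exchange)."""
    def initialize(self, state: dict) -> None: ...
    def run(self, state: dict, dt: float) -> None: ...
    def finalize(self, state: dict) -> None: ...

class CloudMicrophysics(Component):
    def run(self, state, dt):
        # Placeholder physics: grow cloud ice when humidity exceeds a threshold.
        state["cloud_ice"] += max(0.0, state["humidity"] - 0.8) * dt

class SurfaceExchange(Component):
    def run(self, state, dt):
        # Placeholder physics: moisten the atmosphere from the surface.
        state["humidity"] += 0.01 * dt

def run_coupled(components, state, dt, nsteps):
    """Infrastructure: advance all components, in a fixed order, on a shared state."""
    for c in components:
        c.initialize(state)
    for _ in range(nsteps):
        for c in components:
            c.run(state, dt)
    for c in components:
        c.finalize(state)
    return state

final = run_coupled([SurfaceExchange(), CloudMicrophysics()],
                    {"humidity": 0.75, "cloud_ice": 0.0}, dt=1.0, nsteps=10)
print(final)
```

The point of the sketch is the design constraint, not the physics: once components must share an interface and a state, which components can be combined becomes a property of the integrated system rather than of any single component.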

Framing integrating activities such as assessment and model building, and more generally the synthesis path of the scientific method, as scientific activities, and formally recognizing them as such, is a necessary step toward resolving the organizational issues involved in delivering climate products.

Validation, verification, evaluation, certification, confirmation, calibration: all of the words in this list have been used in discussions of how to assess the quality of models. For some, there are nuanced differences between the words, but in general discussion they are all likely to take on the same meaning: some quantitative measure of model quality. This quantitative measure of quality is at the core of the scientific method. If climate modeling requires formal organization, then there needs to be an organization that, as a whole, honors the principles of the scientific method. This requires, then, a process that builds trust among the individuals of the organization. It requires structuring checking and validation in a form that supports the transfer of knowledge (and computer code) from one individual to another. It requires the development of validation strategies that test the combined knowledge, the combined algorithms, in a quantitative and repeatable way. Such an organization is far different from an organization composed of many individual, excellent scientists, each following their own expression of the scientific method.

What does it take for an organization to adhere to the scientific method? First, as introduced above, such an organization has to recognize, formally, the synthesis or unification of scientific knowledge as a path of the scientific method. Second, the organization has to develop strategies to evaluate and validate collected, rather than individual, results.

A Type of Scientific Organization: In May I attended a seminar by David Stainforth. Stainforth is one of the principals in the community project climateprediction.net. From their website: “Climateprediction.net is a distributed computing project to produce predictions of the Earth’s climate up to 2100 and to test the accuracy of climate models.” In this project people download a climate model and run it on their personal computers; the results are then communicated back to a data center, where they are analyzed in concert with results from many other people.

Figure 1: Location of participants in climateprediction.net. From the BBC, a sponsor of the experiment.

This is one example of community science or citizen science. Other citizen science programs are Project Budburst and the Globe Program. There are a number of reasons for projects like this. One is to extend the reach of observations. In Project Budburst people across the U.S. observe the onset of spring as indicated by different plants: when do leaves and blossoms emerge? A scientific motivation for doing this is to increase the number of observations, to try to assure that the Earth’s variability is adequately observed and to develop statistical significance. In these citizen science programs people are taught how to observe; a protocol is developed.

There is another goal of these citizen science activities: education about the scientific method. In order to follow the scientific process, we need to know the characteristics of the observations. If, as in Project Budburst, we are looking for the onset of leafing, then we need to make sure that the tree is not sitting next to a warm building or in the building’s atrium. Perhaps there is a requirement for a measurement, for example, that the buds on a particular type of tree have expanded to a certain size or burst in some discernible way. Quantitative measurement and adherence to practices of measurement are at the foundation of developing a controlled experiment. A controlled experiment is one where we try to investigate only one thing at a time; this is a difficult task in climate science. If we are not careful about our observations and the design of our experiments, then it is difficult, perhaps impossible, to evaluate our hypotheses and arrive at conclusions. And the ability to test hypotheses is fundamental to the scientific method. Design, observations, hypothesis, evaluation, validation: in a scientific organization these things need to be done by the organization, not each individual.

Let’s return to climateprediction.net. A major goal is to obtain simulations from climate models to examine the range of variability that we might expect in 2100. The strategy is to place relatively simple models in the hands of a whole lot of people. With this strategy it is possible to do many more experiments than, say, one scientist or even a small team of scientists could do. Many hundreds of thousands of simulations have been completed.

One of the many challenges faced in model-based experiments is how to manage the model simulations to provide controlled experiments. If you think about a climate model as a whole, there are a number of things that can be changed. We can change something “inside” the model; for example, we can change how rough we estimate the Earth’s surface to be, perhaps grassland versus forest. We can change something “outside” the model, the energy balance, perhaps some estimate of how the Sun varies or how carbon dioxide will change. And, still “outside” the model, we can change the details of what the climate looks like when the model simulation is started: do we start it with January 2003 data or July 2007? When you download a model from climateprediction.net, it has a unique set of these parameters. If you do a second experiment, it will also have a unique set of parameters. Managing these model configurations and documenting this information allows hundreds of thousands of simulations to be run, with a systematic exploration of model variability. Experiment strategy is explained here.
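
To make the bookkeeping concrete, here is a minimal sketch, in Python, of how such an ensemble of configurations might be recorded and enumerated. The parameter names, the values, and the enumeration scheme are hypothetical illustrations of the three kinds of changes described above (“inside” parameters, “outside” forcing, and initial conditions); this is not climateprediction.net’s actual implementation.

```python
# Hypothetical sketch of recording and enumerating an ensemble of experiment
# configurations; illustrative only, not climateprediction.net's system.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class ExperimentConfig:
    surface_roughness: float   # "inside" the model: a perturbed physics parameter
    co2_scenario: str          # "outside" the model: the forcing pathway
    start_date: str            # "outside" the model: the initial conditions
    experiment_id: str         # documents the run so results can be traced back

def build_ensemble():
    """Enumerate every combination of the perturbations, so the ensemble explores
    model variability systematically and every run is documented."""
    roughness = [0.05, 0.1, 0.5]            # e.g., grassland ... forest (hypothetical values)
    scenarios = ["low_co2", "high_co2"]     # hypothetical forcing pathways
    starts = ["2003-01-01", "2007-07-01"]   # initial-condition dates from the text
    return [ExperimentConfig(z0, sc, t0, f"exp{i:06d}")
            for i, (z0, sc, t0) in enumerate(product(roughness, scenarios, starts))]

if __name__ == "__main__":
    runs = build_ensemble()
    print(len(runs), "configurations; first:", runs[0])
```

Whatever the actual mechanics, the design choice is the same one the project makes: each downloaded model carries a unique, documented set of parameters, so the collection of runs amounts to a controlled exploration rather than an ad hoc pile of simulations.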

climateprediction.net has designed a volunteer organization that allows rigorous investigation. Protocols have been set up to verify that the results are what they should be; there is confidence in the accuracy of the information collected. Here is an example where scientists are able to define an organization where the scientific method permeates the organization. Is this proof that a formalized scientific organization is possible? What are the attributes that contribute to the success of a project like climateprediction.net? Are they relevant to a U.S. climate laboratory?

Bringing this back to the scale of U.S. climate activities: in 2008 there was a Policy Forum in Science Magazine by Mark Schaefer, Jim Baker, and a number of equally distinguished co-authors. All of these co-authors had worked at high levels in the government, and they all struggled with the desire and need to integrate U.S. climate activities. Based on their experience they proposed an Earth System Science Agency formed by combining the USGS and NOAA. In their article they pointed out: “The synergies among our research and monitoring programs, both space- and ground-based, are not being exploited effectively because they are not planned and implemented in an integrated fashion. Our problems include inadequate organizational structure, ineffective interagency collaboration, declines in funding, and blurred authority for program planning and implementation.” Planning and implementation in an integrated fashion, and, I will add, consistent with the scientific method.

Validation and the Scientific Organization

In a philosophical sense there is controversy about whether or not climate models can be validated. The formal discussion of whether or not climate models can be validated often starts with a widely cited paper by Naomi Oreskes et al. entitled Verification, Validation, and Confirmation of Numerical Models in the Earth Sciences. In fact, quoting the first two sentences of the abstract:

“Verification and validation of numerical models of natural systems is impossible. This is because natural systems are never closed and because model results are always nonunique.”

Oreskes et al. argue that the performance of the models can be “confirmed” by comparison with observations. However, if the metric of “validation” is a measure of absolute truth, then such absolute validation is not possible. By such a definition little of the science of complex systems, which would include most biological science, medical science, and nuclear weapons management, can stand up to formal validation. This points out a weakness in the development of models of natural systems: adjusting the models to represent a historical situation does not assure that the model correctly represents the physics of cause and effect. In fact, this is a general problem with modeling complex natural systems: if you get the answer “right,” that does not mean you get it right for the right reason. Hence, in the spirit of Oreskes et al., validation is not possible; there is no absolute to be had.

Weather models, river forecast models, and storm surge models all suffer from the fact that their correctness cannot be assured in any absolute sense. Yet, aren’t storm surge models, weather models, climate models, etc., useful and usable? Their evaluation is usually cast as a set of predictions. Predictions do not represent a complete set of metrics for evaluating models, and the success or failure of these predictions does not determine in any absolute sense whether or not the models contain usable information.

It is easy, therefore, to establish that models that cannot be formally validated can be both useful and usable. The results of these models might not be certain, but the degree of confidence that can be attributed to their calculations is very high. This confidence is, in general, established by many forms of model evaluation and, in addition to the ability to predict, the use of additional sources of relevant information, most importantly, observations and basic physical principles.

Validation is, therefore, both controversial and important. I pose that validation is at the center of the development of the scientific organization. (Validation and the Scientific Organization) Climate scientists need to develop a standard process out of all the nuanced meanings of validation and evaluation. The evaluation of climate models can be structured and quantified as “validation.”

The definition I have posed for the scientific organization is an organization that, as a whole, functions according to the scientific method. Therefore, if it is a climate modeling organization, the model development path, the modeling problems that are being addressed, are determined in a unified way. In that determination, it is required that ways to measure success be identified. This leads to a strategy of evaluation that is determined prior to the development and implementation of model software. With an evaluation strategy in place, a group of scientists who are independent of the developers can be formed to serve as the evaluation team. Both of these practices, a pre-determined evaluation strategy (hypothesis) and evaluation by an independent validation group, are consistent with the practice of the scientific method.

The development of an evaluation plan requires that a fundamental question be asked: What is the purpose of the model development? What is the application? If the model is being developed to do “science,” then there is no real constraint that balances the interests of one scientific problem against another. There is little or no way to set up a ladder of priorities.

The scientific organization to support the synthesis of knowledge requires developing organizational rather than individual goals. It is a myth to imagine that if a group of individuals are each making the “best” scientific decisions, the accumulation of their activities will be the best integrated science. Science and scientists are not immune to the Tragedy of the Commons. If one wants to achieve scientifically robust results from a unified body of knowledge, then one needs to manage the development of that body of knowledge so that, as a whole, the scientific method is honored. A scientific organization requires governance and management.

When I was at NASA I had a programmatic requirement to develop a validation plan. And, yes, my friends and colleagues would tell me that validation was “impossible.” But I am stubborn, and not so smart, so I persisted and still persist with the notion. That old plan can still be found here in Algorithm Theoretical Basis Document for Goddard Earth Observing System Data Assimilation System (GEOS DAS) with a Focus on Version 2.

The software we produced was an amalgam of weather forecasting and climate modeling. For the validation plan, the strategy taken was to define a quantitative baseline of model performance for a set of geophysical phenomena. These phenomena were broadly studied and simulated well enough that they described a credibility threshold for system performance. They were chosen to represent the climate system. Important aspects of this validation approach were that it was defined by a specific suite of phenomena, formally separated validation from development, and relied on both quantitative and qualitative analysis.

The validation plan separated “scientific” validation from “systems” validation. It included routine point-by-point monitoring of simulations and observations, formal quality assessment by measures of fit between simulations and observations, and calculation of skill scores against a set of “established forecasts.” There was a melding of methodologies from the practices of the study of weather and the study of climate. We distinguished the attributes of the scientific validation from the systems validation. The systems validation, focused on the credibility threshold mentioned above, used simulations of longer time scales than the established forecasts and brought attention to a wider range of variables important to climate. The scientific validation was a more open-ended process, often requiring novel scientific investigation of new problems. The modeling software system was released for scientific validation and use only after a successful systems validation.
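
As an illustration of the kind of quantitative measures involved, here is a minimal sketch of a measure of fit (root-mean-square error) and a skill score computed against a reference, or “established,” forecast. This is a generic formulation for illustration only; the specific metrics, variables, and thresholds used in the GEOS DAS plan are in the document linked above, and the numbers below are hypothetical.

```python
# Generic sketch of a measure of fit (RMSE) and a skill score against an established
# reference forecast; illustrative only, not the GEOS DAS plan's specific metrics.
import numpy as np

def rmse(simulation: np.ndarray, observations: np.ndarray) -> float:
    """Root-mean-square measure of fit between a simulation and observations."""
    return float(np.sqrt(np.mean((simulation - observations) ** 2)))

def skill_score(simulation: np.ndarray, reference: np.ndarray,
                observations: np.ndarray) -> float:
    """Skill relative to the reference forecast: 1 is perfect, 0 matches the
    reference, and negative values are worse than the reference."""
    mse_sim = np.mean((simulation - observations) ** 2)
    mse_ref = np.mean((reference - observations) ** 2)
    return float(1.0 - mse_sim / mse_ref)

# Hypothetical numbers: compare a new model release and an established baseline
# against the same observations.
obs = np.array([288.1, 288.4, 288.9, 289.3])
new_release = np.array([288.0, 288.5, 289.0, 289.2])
baseline = np.array([287.6, 288.9, 289.5, 288.8])
print("fit (RMSE):", rmse(new_release, obs))
print("skill vs. baseline:", skill_score(new_release, baseline, obs))
```

Computed over an agreed set of phenomena and repeated for each release, measures of this kind are what make the comparison from one model version to the next quantitative and repeatable, which is the point of the systems validation.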

The end result of this process was a quantitative description of the modeling system against a standard set of measures, carried from one model release to the next. Did it meet the criterion of absolute validation? No. Did it provide a defensible quantitative foundation for scientific software and its application? Yes.

Summary: In order to address the need to provide climate-model products, a new type of organization is needed. This organization needs to focus on, and to be organized to support, the unifying branch of the scientific method. This requires application-driven model development. It will require the organization as a whole to develop hypotheses, design experiments, and document methods of evaluation and validation. In such an organization the development of standards and infrastructure supports controlled experimentation, and hence the scientific method, in contrast to the arguments that have been used in the past to resist the development of standards and infrastructure. Such an organization, in which a collection of scientists behaves as a “scientist,” requires governance structures to support decision making and management structures to support the generation of products. It must be envisioned as a whole and developed as a whole.

Figure 2: Chaos and order, 2008. Galvanized wire, 60x60x60 cm. Barbara Licha, finalist of the Willoughby Sculpture Prize 2009. (From Ultimo Project Studios.)

