Troy's Scratchpad

December 21, 2011

Measuring Sensitivity with the GFDL-CM 2.1 Control Run, Part 2

Filed under: Uncategorized — troyca @ 6:43 pm

In part one, I noted a number of potential issues with using the Forster and Gregory (2006) method over a period dominated by ENSO activity to diagnose climate sensitivity. And yet, as shown in that post, using the method on GFDL CM2.1 seemed to yield fairly accurate results, even slightly underestimating climate sensitivity (it appears to do the same with ECHAM-MPI as well). So, if there are indeed issues that can lead to overestimates of the sensitivity (as it does in 12 of the other models), why doesn’t it overestimate sensitivity in these other two models?

One obvious answer would be that some of these potential sensitivity-inflating issues are not present in the models, such as errors in the independent variable (i.e. temperature measurements) and unknown radiative forcings (the Spencer and Braswell argument). But that still leaves other issues, such as the timing offset between atmospheric and sea surface temperature changes (and hence the measured temperature radiative response being off with respect to surface temperature changes), along with the large difference between short-term and long-term cloud feedbacks in the models, both of which ARE included in the GFDL CM2.1 and ECHAM-MPI models.

To investigate, I used the Soden GFDL 2.1 kernels, along with the last 100 years of the GFDL CM2.1 pre-industrial control run from the PCMDI archive, to separate out the radiative responses by climate component (water vapor, surface albedo, and temperature), then ran regressions against combined tos (sea surface temperature) + land tas (2 meter air temperature) for 11 year periods (roughly matching the 2000 – 2010 CERES data we have) to see what the method would yield for these instantaneous feedbacks. As the actual sensitivity and feedbacks per doubling of CO2 for this model are known, we can compare these to the "instantaneous" feedbacks as diagnosed by the FG06 method. The long-term feedbacks for GFDL CM2.1 are from Soden and Held (2006). The remaining feedbacks come from the median of the estimators from my different regression periods in the control experiment. Units are in W/m^2/K.

Feedback	12 mo avg	3 mo avg	1 mo avg	Long Term
Overall	-1.78	-1.76	-1.58	-1.37
Temperature (Planck + Lapse Rate)	-3.72	-3.75	-3.81	-4.36
Water Vapor	1.76	1.77	1.82	1.97
Surface Albedo	0.21	0.21	0.20	0.21
Cloud	0.53	0.53	0.59	0.81
Residual	-0.56	-0.55	-0.38	?

Please note that this does not imply the GFDL CM2.1 has a negative net climate feedback by the typical definition, since I am including the Planck response in the overall feedback presented.
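The windowed-regression procedure described above the table can be sketched roughly as follows. This is not the script linked from this post: the data are synthetic white noise, the "true" slope of -1.4 W/m^2/K is an arbitrary placeholder, and the use of non-overlapping block averages and non-overlapping 11-year windows is an assumption (the post does not say how the averaging or the regression periods were laid out). The helper names `block_average` and `median_window_slope` are made up for illustration.

```python
import numpy as np

def block_average(series, n):
    """Average consecutive groups of n monthly values (drops any remainder)."""
    series = np.asarray(series, dtype=float)
    usable = len(series) - len(series) % n
    return series[:usable].reshape(-1, n).mean(axis=1)

def median_window_slope(temp, flux, window):
    """Median OLS slope of flux on temp over consecutive, non-overlapping windows."""
    slopes = []
    for start in range(0, len(temp) - window + 1, window):
        stop = start + window
        # polyfit returns [slope, intercept] for a degree-1 fit
        slopes.append(np.polyfit(temp[start:stop], flux[start:stop], 1)[0])
    return float(np.median(slopes)), len(slopes)

# Synthetic stand-ins for 100 years of monthly model output (assumed values).
rng = np.random.default_rng(0)
months = 100 * 12
temp = rng.normal(0.0, 0.15, months)               # temperature anomalies (K)
flux = -1.4 * temp + rng.normal(0.0, 0.5, months)  # TOA flux anomalies (W/m^2)

for n in (12, 3, 1):
    t_avg, f_avg = block_average(temp, n), block_average(flux, n)
    estimate, n_windows = median_window_slope(t_avg, f_avg, window=11 * 12 // n)
    print(f"{n:>2} mo avg: median slope {estimate:.2f} over {n_windows} windows")
```

Taking the median rather than the mean keeps one badly constrained 11-year window from dominating the estimate, which matters when each window holds only 11 annual points.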

Anyhow, the temperature, water vapor, surface albedo, and cloud flux contributions determined using this technique seem to explain about 97.5% of the variance in TOA radiative fluxes in the GFDL model over this period of the control run. Unfortunately, the variance in the residuals seems to be fairly well correlated with temperature (r^2 = 20%, higher than the cloud and surface albedo portions), so that there seems to be a substantial leftover instantaneous response apart from these other feedbacks. The “Residual” row listed in the table above is merely the difference between the “Overall” diagnosed feedback (from overall flux anomalies) and that of the sum of the individual feedbacks (using the kernel technique). To ensure that this is not merely a statistical artifact, I also regressed the residual fluxes after removing the various climate components against temperature, yielding values near those in the “residual” row.

Nonetheless, this may go a bit towards solving a couple of those mysteries. The fairly large underestimate of the temperature response ( ~ 0.5 to 0.6 W/m^2/K) is very likely the result of the timing offset / atmospheric temperature lag time previously discussed, and this can have serious consequences on estimated sensitivity (the difference, for example, between 3K and 2.15 K sensitivity). However, we don’t see an overestimate of sensitivity because a) there appears to be a unique short-term response going on here, and b) the cloud feedback is underestimated, being significantly more positive in the long-term for this model than it is in the short-term. From Dessler (2010), we see that only one other model significantly underestimates the positive cloud feedback in the short-term: ECHAM-MPI. Part (b) leads me to consider that the reason the FG06 method does NOT overestimate the sensitivity in these two particular models is because of this relationship between short-term and long-term cloud feedbacks. It’s worth noting that Dessler (2010) calculated an even smaller short-term cloud feedback from GFDL CM2.1 than here…I use a different part of the control run, but otherwise I’m not sure how to explain the difference.

The water vapor calculation is pretty close, although perhaps a bit underestimated using the instantaneous method. Surprisingly, the short-term albedo and long-term albedo estimates are about the same. I think this is surprising since typically the albedo feedbacks are considered slower feedbacks that won’t fully manifest in the short-term. Finally, one may notice that the GFDL CM2.1 overall feedback of –1.37 W/m^2/K from Soden and Held (2006), if converted to ECS in a typical manner (3.8 W/m^2 / 1.37 W/m^2/K ≈ 2.77 K) does not correspond to the published ECS of that model (3.4 K). You get closer if you use the 4.3 W/m^2 TOA forcing described in Soden and Held (2006) for a doubling of CO2 instead of 3.8, but it still does not seem to explain how CM2.1 can have a significantly higher radiative response to surface temperature changes while also having a higher sensitivity to a CO2 doubling than CM2.0. Unless the estimated CO2 forcings are that much different? This is why I have left a “?” in the residual row for the long-term column.
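For reference, the feedback-to-sensitivity conversion used in the paragraph above is just the 2xCO2 forcing divided by the magnitude of the overall feedback. A minimal sketch, using only the numbers quoted in this post (the function name `ecs_from_feedback` is made up for illustration):

```python
def ecs_from_feedback(forcing_2xco2, feedback):
    """Equilibrium sensitivity (K) = 2xCO2 forcing (W/m^2) / |feedback| (W/m^2/K)."""
    return forcing_2xco2 / abs(feedback)

# Overall long-term feedback from the table, with the two forcing values
# mentioned in the text (3.8 and 4.3 W/m^2).
for forcing in (3.8, 4.3):
    ecs = ecs_from_feedback(forcing, -1.37)
    print(f"forcing {forcing} W/m^2 -> ECS {ecs:.2f} K")
```

Either forcing value gives a sensitivity (roughly 2.8 K or 3.1 K) below the published 3.4 K, which is the gap noted above.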

Regardless, I will be investigating this method in some other control runs, along with more periods in the GFDL control run. From these results alone, however, my tentative conclusion would be that using the radiative fluxes over a period similar to 2000-2010 to measure climate sensitivity, in the absence of errors in the regressors and no noise due to unknown radiative forcings, would lead to:

1) Likely underestimates of the temperature response (due to the timing offset)

2) Inaccuracy in the cloud feedback, although the direction is unknown (the models are split on this, at least according to Dessler 2010).

3) Some “residual” response component, whose magnitude and sign are unknown with respect to the different models

Of course, testing the method on more models may change things. There’s a lot left to do.

Code and data for this post are available here.

December 12, 2011

Foster and Rahmstorf 2011 lends support to…Spencer and Braswell?

Filed under: Uncategorized — troyca @ 6:24 pm

A new paper is out by Foster and Rahmstorf (2011), and while I may later do a more in-depth analysis, I want to point out a rather interesting implication of this paper, if indeed one were to take it at face value: it supports Spencer and Braswell (2008, 2010, and 2011 to some degree). Allow me to explain. (Note: to avoid confusion, there is Grant Foster, a.k.a. Tamino, and there is Piers Forster, whose papers, referenced below, attempt to measure sensitivity from radiation fluxes.)

As you may recall, I performed some sensitivity tests related to the multiple regression a while back. Looking that post over again, there are a few errors on my part (I believe I used actual surface T for the S-B/Planck response), but there are a few interesting tidbits: 1) leaving the adjustments for TSI/solar out affects the conclusions, and 2) the estimated solar response is around 4 times greater than the volcanic response.

Let’s take a closer look at #2, which may have changed a bit from the post to the paper. From figure #3 of the FR11 paper, we see the coefficient for TSI at around 0.1 C for the land data, which, after adjusting for planetary albedo and shadow area / surface area, results in around a 0.57 C/(W/m^2) instantaneous surface temperature response for the actual solar forcing. Note that in Tamino’s original post, he had estimated about 0.39 C/(W/m^2) for solar, but that was when the solar influence range was only 0.08 C rather than the 0.12 C mentioned in the new paper.

[Aside]

For Aerosol Optical Depth (volcanic), the coefficient is around 2 C/tau. If we look up the approximate efficacy, we see that it is around -25 W/m^2/tau. Such an estimate would yield the instantaneous sensitivity of around 0.08 C/(W/m^2), which would put it at around 1/7 the efficacy of a solar forcing, both in W/m^2. Certainly, there are reasons to believe that the instantaneous surface temperature response to the larger forcings may be damped (thank you SteveF) by the ocean heat uptake, but it seems that a factor of 7x (or 4 times) remains far too big of a discrepancy to be considered a reasonable physical result. Furthermore, the longer-term response may be expected to manifest itself over the course of say 8-12 years, but for the FR11 paper anything beyond the instantaneous response is ignored.

[End Aside]
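The unit conversions behind the solar and volcanic numbers can be checked with a short sketch. The planetary albedo of 0.3 is an assumption (the post only says it adjusted "for planetary albedo and shadow area / surface area"), and the function names are made up for illustration. Only magnitudes are compared, since the post quotes the volcanic coefficient without a sign.

```python
ALBEDO = 0.3      # assumed planetary albedo (not stated in the post)
GEOMETRY = 0.25   # shadow area / surface area for a sphere

def solar_response_per_forcing(coef_per_tsi, albedo=ALBEDO):
    """Convert C per (W/m^2 of TSI) into C per (W/m^2 of actual solar forcing)."""
    return coef_per_tsi / ((1.0 - albedo) * GEOMETRY)

def volcanic_response_per_forcing(coef_per_tau, forcing_per_tau):
    """Magnitude of C per (W/m^2) from a C/tau coefficient and a W/m^2/tau efficacy."""
    return abs(coef_per_tau) / abs(forcing_per_tau)

solar = solar_response_per_forcing(0.1)               # TSI coefficient of about 0.1 C
volcanic = volcanic_response_per_forcing(2.0, -25.0)  # about 2 C/tau and -25 W/m^2/tau

print(f"solar:    {solar:.2f} C/(W/m^2)")
print(f"volcanic: {volcanic:.2f} C/(W/m^2)")
print(f"ratio:    {solar / volcanic:.1f}")
```

With these inputs the solar response per unit forcing comes out around seven times the volcanic one, matching the factor quoted in the aside.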

Anyhow, according to FR11, the time between the solar forcing anomaly and the surface temperature response is estimated to be around 1 month. Remember that for later.

Relation to Spencer and Braswell

For more on the background of attempting to measure climate sensitivity and where the Spencer and Braswell arguments fit in, please see this page. But as a quick summary, I’ll note that in Forster and Gregory (2006), the authors comment (my insert in brackets):

The X terms [radiative noise or unknown radiative forcings] are likely to contaminate the result for short datasets, but provided the X terms are uncorrelated to Ts, the regression should give the correct value for Y, if the dataset is long enough.

Spencer and Braswell argue that the unknown radiative forcing (fluctuations in cloud cover, which we know to exist AT LEAST on short timescales, per Dessler (2010)) would necessarily influence the Ts and hence lead to an underestimate of the radiative response. The counter-argument has been two-fold:

The decorrelation time of this radiative noise is shorter than the surface temperature response time. From Murphy et al. (2009), we read:

If temperature variations are changing outgoing radiation then temperature should be the independent variable whereas if radiation variations are affecting temperature then temperature should be the dependent variable. Although both are true to some extent, they can be partially separated by time response: outgoing radiation changes are mostly immediate whereas surface temperatures lag radiative forcing. Autocorrelation analyses of global temperatures suggest that the surface ocean portion of the Earth’s climate response has a time constant of about 8–12 years [Scafetta, 2008; Schwartz, 2008].

I believe this response misses the mark, as you might very well expect significant surface temperature responses to forcings on much shorter time-scales, even if the full forcing response is not realized for several more years. A better argument might be that the decorrelation time of this noise used by SB is too long, and that for cloud fluctuations it is actually on the scale of 2-3 months, whereas the temperature response is (for example) about 5 months later. However, the Foster and Rahmstorf (2011) paper implies a lag in temperature response of only around 1 month for these smaller fluctuations, which, even with only intraseasonal fluctuations (such as the Madden–Julian oscillation) in cloud cover, would suggest a strong correlation between these unknown radiative fluctuations (X) and T_surface!

The second major argument against the Spencer and Braswell result, as advanced by Murphy and Forster (2010) and Dessler (2011), is that the effective heat capacity of the ocean on these timescales is too high for the unknown radiative forcing to have any significant effect on surface temperatures. They attribute almost all of the surface temperature fluctuations during this recent decade to internal, non-radiative forcings (e.g., heat exchange between the ocean layers). I have explained before why their estimates of heat capacity are inappropriate for these monthly timescales. Nonetheless, using a similar method to Dessler (2011), I’ll point out that the standard deviation in surface temperature anomaly from 2000-2010 is around 0.1 C. Dessler (2011) calculates the standard deviation of the cloud forcing/noise to be around 0.5 W/m^2. So, can this 0.5 W/m^2 cloud fluctuation force any significant amount of the 0.1 C surface temperature changes? According to Dessler (2011), the answer is a strong NO (~5%). But according to Foster and Rahmstorf (2011), with its 0.57 C/(W/m^2) instantaneous response to solar forcing, such cloud forcing fluctuations (if the response scales) could result in 0.28 C changes! Now one may argue that the responses to slower solar cycles don’t experience the same damping, but even if the cloud forcing efficacy is only 1/5 that amount, this would imply that the measurements of climate sensitivity from radiative fluxes have been greatly overestimated.
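The scaling argument in the paragraph above is simple arithmetic, sketched here with the quoted numbers. It assumes, as the post does, that the response scales linearly with the forcing; `implied_temp_sd` is a made-up helper name.

```python
def implied_temp_sd(response_per_forcing, forcing_sd, efficacy=1.0):
    """Temperature fluctuation (C) implied by a forcing fluctuation (W/m^2)."""
    return response_per_forcing * efficacy * forcing_sd

observed_sd = 0.1                              # surface temperature sd, 2000-2010 (C)
full = implied_temp_sd(0.57, 0.5)              # cloud noise treated like solar forcing
fifth = implied_temp_sd(0.57, 0.5, efficacy=0.2)  # efficacy cut to 1/5

print(f"full efficacy: {full:.3f} C ({full / observed_sd:.0%} of observed sd)")
print(f"1/5 efficacy:  {fifth:.3f} C ({fifth / observed_sd:.0%} of observed sd)")
```

Even at one fifth of the solar efficacy, the implied fluctuation is more than half of the observed 0.1 C standard deviation, far above the ~5% figure attributed to Dessler (2011).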

Overall, I don’t think a proper analysis will support either the high Dessler (2011) heat capacity over these short periods, or the high instantaneous effect of changes in TSI from FR11 that contradicts it. Indeed, I suspect the latter is likely an artifact of fitting to an underlying linear trend, as the effect of the solar minimum is overestimated in order to counter the flattening in the early 21st century. I think this point highlights the larger problems with such a methodology. Nonetheless, if one were to take the FR11 results at face value, Spencer and Braswell could very well point out that this peer-reviewed paper suggests a short lag time for the large surface temperature response (1 month) to a small forcing, lending credence to their argument that unknown radiative forcing “noise” will correlate with surface temperature. Heck, even using a T_s response midway between the FR11 values for TSI and AOT per W/m^2 would strongly support the SB case.

December 1, 2011

Katsman and van Oldenborgh 2011 Update

Filed under: Uncategorized — troyca @ 8:32 am

Two months ago I posted on a couple issues I had with Katsman and van Oldenborgh 2011: 1) They assumed overlapping 8-year trends were independent when calculating the likelihood of a single 8-yr negative or 0 trend occurring in the upper ocean heat content over 30 years, and 2) The ENSO observations did not support the theory developed by the model that El Nino was the cause of the extra radiation escape.

Thanks to commenter Howard on that thread, who pointed out that a correction was published soon afterwards (readable draft version), I am pleased to see that issue #1 was addressed. The probability of an 8-yr negative trend (according to the ECHAM-MPI model) during that 30 year period starting in 1990 has been appropriately reduced from 57% to 25-30%, and the probability of a 9-yr negative trend has been reduced from 48% to 5-15%. However, it appears the 9-year trend will now likely be positive, given the recent uptick in UOHC. I would love to flatter myself and think I had something to do with the correction, EXCEPT that the draft was received by GRL on September 29th, a couple weeks before my post.

Finally, KO (2011) mention the following:

The computational error has no impact at all on the analysis in the remainder of the paper, from which we concluded that such a period without upper ocean warming is explained by increased radiation to space, largely as a result of El Nino variability on decadal timescales, and by increased ocean warming at larger depths, partly due to a decrease in the strength of the Atlantic meridional overturning circulation.

Bold mine. As I pointed out in that previous post and in the 2nd issue above, their model suggests that 8-yr average El Nino conditions with a four year lead would explain part of the negative trend, but the ENSO index was actually negative in that four year lead to the current flattening. It may be that El Nino variability is the cause, but if that’s the case then the ECHAM-MPI model would seem to have the radiative response to El Nino wrong (at least with respect to the time lag). This, I think, would affect the conclusions.

November 14, 2011

New page on the observation-based estimate of climate sensitivity

Filed under: Uncategorized — troyca @ 9:02 am

You may have noticed my new page, which can be accessed on the right. I created this because I’ve been posting often lately on the method originally introduced in Forster and Gregory (2006), which has led to the recent disagreement between the latest Dessler and Spencer papers. I will simply link to this page in future posts so that readers can find the relevant history, papers, and resources, rather than burdening each post on this topic with the same links/introductions.

November 6, 2011

Estimated Sensitivity and Climate Response with TLS

Filed under: Uncategorized — troyca @ 6:05 pm

As discussed in a previous post, Forster and Gregory (2006) and Murphy et al. (2009) use OLS regressions of temperatures versus the difference between measured TOA flux and known/estimated forcings to try and determine the overall climate feedback. As the inverse of this feedback is assumed to be the climate sensitivity, any underestimate in this climate feedback will hence lead to an overestimate in the climate sensitivity from this method.

Now, one possible issue with the method is that there are measurement errors in the monthly temperature anomalies, and OLS will lead to underestimates of the regression coefficient when there are errors in the independent variable. A possible way to combat this regression attenuation is to use total least squares, which fits the line considering errors in the independent variable as well. Of course, the trouble with using this method is that you need to have some idea of the relative variance of the “errors”, otherwise you could swing the opposite direction and get a huge overestimate.

As I don’t have a strong statistics background, implementing this Deming regression (a simple, specialized case of TLS) is rather new to me. However, one thing that is clear is that the assumed value for δ – the ratio of variance of errors in the dependent variable over the variance of errors in the independent variable – greatly affects the resulting estimate. Below, I show the Deming regression estimates for the overall feedback response (lambda, or Y, the inverse of sensitivity) based on the assumed δ. I use the HadCRUT3 and GISS datasets and the CERES net TOA flux measurements from March 2000 through December 2010 (the length of the CERES dataset):

Using OLS, which assumes no errors in the independent variable (or a δ = infinity), yields climate feedbacks of 1.16 and 1.19 W/m^2/K for GISS and HadCRUT respectively (equal to a sensitivity of around 3.2 C).
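For readers who want to experiment, here is a minimal sketch of the closed-form Deming slope and of its OLS limit as δ grows large. It is not the script linked at the end of this post. The data are synthetic, the "true" response of 1.2 W/m^2/K and the noise levels are arbitrary placeholders, and the sample is deliberately much longer than the roughly 130 CERES months so that the demo is stable.

```python
import numpy as np

def ols_slope(x, y):
    """Ordinary least squares slope of y on x (treats x as error-free)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def deming_slope(x, y, delta):
    """Deming slope, with delta = var(errors in y) / var(errors in x)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    s_xx = np.var(x, ddof=1)
    s_yy = np.var(y, ddof=1)
    s_xy = np.cov(x, y)[0, 1]
    diff = s_yy - delta * s_xx
    return (diff + np.sqrt(diff ** 2 + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)

# Synthetic example (all values assumed for illustration).
rng = np.random.default_rng(0)
n = 2000
temp_true = rng.normal(0.0, 0.1, n)                 # "true" anomalies (K)
flux = 1.2 * temp_true + rng.normal(0.0, 0.5, n)    # Q-N with radiative noise
temp_obs = temp_true + rng.normal(0.0, 0.075, n)    # anomalies with measurement error

for delta in (1.0, 75.0, 1e6):
    print(f"delta = {delta:g}: slope = {deming_slope(temp_obs, flux, delta):.2f}")
print(f"OLS: slope = {ols_slope(temp_obs, flux):.2f}")
```

Setting δ = 1 gives orthogonal regression in the raw units, and a very large δ reproduces OLS, so the whole range of estimates in this post sits between those two limits. How far a small δ pulls the slope upward depends heavily on how noisy the flux is.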

The red lines in the plots above correspond to δ = variance in Q-N divided by the variance in temperatures. There is no strong reason to suggest that this assumption is correct, but for reference, using the specified values for δ would yield estimates of 5.10 and 6.77 W/m^2/K for the climate response in those temperature sets, corresponding to tiny values of 0.75 and 0.56 C for sensitivity.

As Nic mentions in the comments of that last post, and Forster and Gregory (2006) note, the errors leading to the low correlation are not necessarily coming from measurement errors, but rather from other radiative influences. However, it appears that even if we assume the variance of errors in Q-N is much larger (say, 75x) than that of variance of errors due to uncertainty in monthly temperature anomalies, the effect is still quite noticeable (about 2.5x the climate feedback estimated in HadCRUT, 1.5x in GISS). In fact, if we assume the 0.075 C for the 1 sd in the monthly temperature errors, and then use the variance of Q-N itself as an upper bound on its possible errors, that 75x is what we get.

I should note that Murphy et al. (2009) also use orthogonal regressions for comparison, and these (predictably) lead to higher estimates of the response, though lower than I would expect based on my tests. At the moment, however, I’m not able to reproduce the lower results using this orthogonal distance regression (which is basically the Deming regression above but with δ = 1, although they likely adjusted for units as well). They make a case on the grounds of cause and effect why OLS is more appropriate (surface temperature influences radiative flux at 0 time lag more than the opposite), and certainly it would seem that assuming the same errors in both variables is probably incorrect, but I fail to see why this means that we should not take into account ANY of the measurement errors in T, particularly when the errors on a monthly scale seem large relative to the monthly anomalies themselves. Of course, as I mentioned above, I am still in the process of learning these methods.

Anyhow, the script for this post is available here.

November 3, 2011

Measuring Sensitivity using the FG06 method with a GFDL CM2.1 control run

Filed under: Uncategorized — troyca @ 7:58 am

I originally posted on the Forster and Gregory 2006 method for determining climate sensitivity back in June. Since then, I’ve come across a number of issues (mostly discovered by others) on why this method may not constrain the sensitivity with the accuracy previously assumed. I will likely need to create a page of resources on this alone, because the number of links to blog posts and papers could easily grow unmanageable if I were to repeat them in every post. However, for now I’ll mention the recent Science of Doom and Isaac Held posts, and then begin listing some of the difficulties I see (particularly without using a limited time period):

1) The unknown radiative forcing. This is Dr. Spencer’s main objection (SB08, SB10, and part of SB11), which suggests the forcing will lead to an underestimate of the response (and hence an overestimate of sensitivity) if it is correlated with T. Murphy and Forster 2010 argue that the effect is small, but it doesn’t look like their use of such a deep mixed layer was appropriate. Science of Doom has a good introduction to this issue in the post I link to above.
2) There is not a strong reason to believe that the radiative response to a temperature change is both constant across all seasons and linear (also briefly mentioned in the SoD post). However, it theoretically could be a reasonable approximation.
3) Dessler 2010 concludes that there is unlikely to be a correlation between the short-term/instantaneous cloud feedback and the long-term cloud feedback (at least, it’s not present in models). The set-up of FG06 implicitly only includes the short-term feedback, and since the cloud feedback is one of the biggest uncertainties surrounding equilibrium climate sensitivity, it suggests that the FG06 method could not necessarily diagnose the ECS in this case.
4) There is a timing offset between the sea surface temperature changes and the bulk of the tropospheric temperature changes (1 – 3 months). When using a smaller period with interannual variations greater than the long-term trend (e.g., the CERES era), and with the bulk of the Planck response coming from the atmosphere, the timing offset would lead to an underestimate of the response.
5) Using the method in the control runs of models generally leads to a large overestimate of the climate sensitivity (which suggests major issues with the method, the models, or both).
6) Errors in the surface temperature measurements will yield underestimates of the response (and overestimates in the sensitivity). I specifically point to the surface temperature measurements rather than the satellite flux measurements because they act as the independent variable in the regressions, meaning they’ll lead to regression attenuation using traditional OLS methods (assuming Gaussian white noise for the flux measurements, we would get a lower correlation but not necessarily an underestimate).
7) The sampling error using only (for example) 10 year periods could make it difficult to diagnose accurately.

For this post, I will briefly mention #5, and then use the GFDL 2.1 500 year control run from PCMDI to explore #6 and #7.

General Inaccuracies using the GCM Control Runs

For point #5, I will point to a perhaps unexpected place…figure 2 of Dessler 2011:

The black lines are from the control runs of the models. Note the regression slopes at 0 lag (which corresponds to the FG06 method), which I’ve circled in green. Now, the average ECS of these models we know to be about 3 K, which corresponds to a radiative response of about (3.8 W/m^2 / 3 K) = 1.27 W/m^2/K, a number that we’d expect to see as the average regression slope. But instead, the average regression slope is closer to 0.5 W/m^2/K, which corresponds to a whopping 7.6 K ECS, more than double what is known for these models. Using the FG06 method in the control runs of the models thus overestimates the sensitivity in what appears to be 12 out of the 14 models. Dr. Spencer goes into some more issues of testing this against models.

Anyhow, there appear to be two models that actually show reasonable results when using the FG06 method. From the Trenberth, Fasullo, and Abraham response, it appears that one of them is ECHAM-MPI. The other one looks to be GFDL CM2.1 from my tests:

Closer Look at the GFDL CM2.1 Control Run

The GFDL CM2.1 has an ECS of 3.4 K according to the IPCC AR4 table. This corresponds to a response of 1.12 W/m^2/K, which is almost the exact value I get when using the 500 years of the control run and monthly temperature anomalies (r^2=.09). Of course, it is curious why the correlation would be so low if the FG06 method uses an appropriate model, considering there is no measurement noise in the flux or temperature outputs from the model.

Anyhow, before continuing further, I’d like to show a chart of the control run global surface air temperature, which has got me scratching my head a bit:

Now, my understanding is that the pre-industrial control experiment does not include any change in forcings, and that the model is run to “stabilize” prior to the start of the control experiment. The gray lines are the monthly anomalies, while the black line is the 30 year moving average. Note that we see climate-scale trends (using the 30 year averages) that are completely unforced (if I’m understanding it correctly); this is not “natural variability” in terms of solar or volcanic variation, but rather in the “no TOA forcing changes” sense. Whether this is simply model drift, or is actually supposed to be simulating long-term, unforced variability, I’m not sure…it’s on the scale of 0.2C for what appears to be some < 75 year periods, which seems like it could be significant compared to the 20th century rise.

Anyhow, I will proceed with some different trials in order to diagnose the climate feedback in the model’s control run and compare it to the known value. First, I’ll note that using annual averages over the 500 year period gives me a response of 1.37 W/m^2/K (r^2 = 0.26), which would be an underestimate of sensitivity. Dr. Held, in his response to my comment on his blog, mentioned that using 1000 years (which I don’t think was available at PCMDI), he got a response of 1.7 W/m^2/K. I was a bit surprised not to match his results, since 500 years seems like it would be enough to constrain it, until I broke it down into two 250-year periods and found responses of 1.56 and 1.7 W/m^2/K, which seems to suggest that there are periods of temperature change that are not met with corresponding radiative responses (perhaps this is what allows for the continuing drift?).

Using Absolute Values Rather Than Anomalies

Anyhow, in Murphy et al. (2009), they extend the FG06 method to use CERES monthly observations. One curious point is that in figure 2, they use interannual AND seasonal variations (that is, they don’t take monthly anomalies) to show the radiative response. Steve McIntyre has explored this as well. The result is that you get higher r^2 values, but I think this may inflate the confidence we should have in the Murphy (2009) method. After all, it doesn’t seem that seasonal changes in temperature and the radiative response would necessarily be equal to longer term changes, and in fact using absolute monthly temperatures and fluxes (rather than anomalies) with the 500 year control run results in a response of 4.55 W/m^2/K (way higher than the known long-term value) with a r^2 = 94%! However, that 94% clearly is not indicative that it is accurately reflecting the ECS. Of course, such a high value may only suggest that the GFDL CM2.1 model overestimates the flux response to seasons, or underestimates the seasonal temperature response.
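To make the anomalies-versus-absolute-values distinction concrete, here is a toy sketch in which the seasonal cycle and the interannual variations are given different flux responses. All numbers are invented for illustration (a seasonal response of 4.5 W/m^2/K and an interannual response of 1.1 W/m^2/K are chosen only to mimic the contrast described above), and the helper names are hypothetical.

```python
import numpy as np

def monthly_anomalies(series):
    """Remove the mean seasonal cycle from a monthly series (length a multiple of 12)."""
    values = np.asarray(series, dtype=float).reshape(-1, 12)
    climatology = values.mean(axis=0)      # one mean per calendar month
    return (values - climatology).ravel()

def slope_and_r2(x, y):
    """OLS slope of y on x and the squared correlation."""
    slope = np.polyfit(x, y, 1)[0]
    r = np.corrcoef(x, y)[0, 1]
    return slope, r ** 2

# Toy data: large seasonal cycle plus small interannual wiggles (assumed values).
rng = np.random.default_rng(0)
month = np.arange(50 * 12)
seasonal = 2.0 * np.sin(2 * np.pi * month / 12)
interannual = rng.normal(0.0, 0.15, month.size)
temp = seasonal + interannual
flux = 4.5 * seasonal + 1.1 * interannual + rng.normal(0.0, 0.5, month.size)

raw_slope, raw_r2 = slope_and_r2(temp, flux)
anom_slope, anom_r2 = slope_and_r2(monthly_anomalies(temp), monthly_anomalies(flux))
print(f"absolute values: slope {raw_slope:.2f}, r^2 {raw_r2:.2f}")
print(f"anomalies:       slope {anom_slope:.2f}, r^2 {anom_r2:.2f}")
```

In this toy setup the regression on absolute values is dominated by the seasonal cycle, so it returns the seasonal response with a very high r^2, while the anomaly regression returns the much smaller interannual response with a low r^2. A high r^2 on its own therefore says little about which response has been measured.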

Regression Attenuation from Uncertainty in Surface Temperatures

I added white noise with a standard deviation of .075 C into the surface air temperatures to simulate measurement/sampling noise, based on estimates of the uncertainty presented in the charts of the HadCRUTv3 paper towards the early 21st century (during the CERES era). This resulted in the expected regression attenuation in the 500 year monthly test, bringing the estimated response down to about 1.02 W/m^2/K, and thus yielding a slight overestimate of the sensitivity. It is curious that the use of TLS to avoid this attenuation is not examined in more detail. For instance, FG06 mention:

For less than perfectly correlated data, OLS regression of Q-N against T_s will tend to underestimate Y values and therefore overestimate the equilibrium climate sensitivity (see Isobe et al. 1990).

The main reasons that they give for sticking with OLS are two-fold: 1) the issue of cause and effect (T_s inducing the radiative flux changes at short time scales, not the opposite), also discussed in MF09, which is in dispute by Spencer in point #1 of the post, and 2) that using it on the model HadCM3 does not yield accurate results using different regressions.

I will point out that even IF no unknown radiative forcing is confounding the feedback signal in this way, this still does NOT mean that there is no “error” in the independent variable…as shown above, HadCRUTv3 includes estimates of uncertainties. Furthermore, that another method correcting for these errors (both in terms of cause and effect and measurement error) would yield overestimates of the response in a climate model that includes neither these specified variations in cloud forcing nor measurement error is unsurprising, but it says nothing about whether a different regression method is appropriate for real world data that DOES include such errors. Finally, I found the following statement from FG06 appendix quite interesting:

Another important reason for adopting our regression model was to reinforce the main conclusion of the paper: the suggestion of a relatively small equilibrium climate sensitivity. To show the robustness of this conclusion, we deliberately adopted the regression model that gave the highest climate sensitivity (smallest Y value). It has been suggested that a technique based on total least squares regression or bisector least squares regression gives a better fit, when errors in the data are uncharacterized (Isobe et al. 1990). For example, for 1985–96 both of these methods suggest Y_NET of around 3.5 ± 2.0 W/m^2/K (a 0.7–2.4 K equilibrium surface temperature increase for 2 x CO2), and this should be compared to our 1.0–3.6-K range quoted in the conclusions of the paper.

Murphy et al. (2009) explore orthogonal regression a bit as well, but I couldn’t find anything that explicitly takes into account the known uncertainties in the surface temperatures.
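The regression attenuation discussed in this section can be reproduced with a small synthetic experiment. This is only a sketch: the temperature series is white noise with an assumed standard deviation of 0.2 C (the control run's actual variability is not used here), the flux is left noise-free to isolate the effect, and `attenuation_demo` is a made-up helper name.

```python
import numpy as np

def attenuation_demo(true_response, temp_sd, noise_sd, n, seed=0):
    """Return (OLS slope with noisy temperatures, theoretical attenuated slope)."""
    rng = np.random.default_rng(seed)
    temp_true = rng.normal(0.0, temp_sd, n)
    flux = true_response * temp_true                     # noise-free flux
    temp_obs = temp_true + rng.normal(0.0, noise_sd, n)  # add measurement noise
    slope_obs = np.polyfit(temp_obs, flux, 1)[0]
    # Classical attenuation factor: var(signal) / (var(signal) + var(noise))
    expected = true_response * temp_sd ** 2 / (temp_sd ** 2 + noise_sd ** 2)
    return slope_obs, expected

# 6000 months (500 years), true response 1.12 W/m^2/K, 0.075 C measurement noise.
slope_obs, expected = attenuation_demo(1.12, temp_sd=0.2, noise_sd=0.075, n=6000)
print(f"observed slope {slope_obs:.2f}, theory {expected:.2f}, true 1.12")
```

The size of the attenuation depends on the ratio of the noise variance to the real temperature variance, which is why the same 0.075 C error matters more for monthly anomalies than it would for larger, slower temperature changes.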

Sampling Error in 10 year intervals

Finally, I will look at the different estimates we get for the climate radiative response when breaking the 500 year control run into 50 10-year periods (and still including noise). I set the standard deviation for Net TOA flux measurements to 0.33 W/m^2 per month when adding in the white noise (based on estimated CERES RMSE (SW + LW) / sqrt(2)). The red lines represent the “true” radiative response. Using monthly values, we get the following results over 10 year periods:

Based on those responses, this includes climate sensitivities from 1.7 C to 14.2 C (!) based on a 10-year period, when the known sensitivity is 3.4 C.

For annual data (which should reduce some of the measurement noise, but yield a smaller sample size), we get:

Which includes everything from 0.8 C to 13.1 C.

Interestingly, these sampling errors tend to lead towards overestimates of the response, or underestimates in climate sensitivity. I’m not quite sure at the moment why this should be the case.
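As a rough illustration of the procedure in this section (not the script linked from this post), the sketch below splits a 500-year monthly series into 50 ten-year chunks and regresses noisy flux on temperature in each one. Only the 0.33 W/m^2 noise level comes from the text. The temperature series is a synthetic AR(1) process standing in for the control run, and `TRUE_RESPONSE`, the AR(1) parameters, and the sign convention (flux increasing with temperature) are all assumptions made for the example.

```python
import random

def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    return sxy / sxx

def chunked_responses(temps, fluxes, months_per_chunk=120):
    """Estimate the radiative response separately in each non-overlapping chunk."""
    last_start = len(temps) - months_per_chunk
    return [
        ols_slope(temps[i:i + months_per_chunk], fluxes[i:i + months_per_chunk])
        for i in range(0, last_start + 1, months_per_chunk)
    ]

rng = random.Random(0)
TRUE_RESPONSE = 1.0   # hypothetical "true" response in W/m^2/K, not from the post
FLUX_NOISE_SD = 0.33  # monthly white-noise level quoted in the text, W/m^2

# Synthetic monthly temperature anomalies (assumed AR(1) stand-in for model output).
temps = []
t = 0.0
for _ in range(500 * 12):
    t = 0.9 * t + rng.gauss(0.0, 0.05)
    temps.append(t)

# Flux responds linearly to temperature, plus white measurement noise.
fluxes = [TRUE_RESPONSE * temp + rng.gauss(0.0, FLUX_NOISE_SD) for temp in temps]

estimates = chunked_responses(temps, fluxes)
print(len(estimates), min(estimates), max(estimates))
```

With these assumed inputs the individual 10-year estimates scatter widely around the true value even though their average stays close to it, which is the sampling issue being described. This toy setup has none of the other complications discussed in the post, so it should not be read as reproducing the figures.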

All code and data for this post is available here.


October 28, 2011

Will 2011 be one of the top ten hottest years?

Filed under: Uncategorized — troyca @ 12:22 pm

Disclaimer: This post is for entertainment purposes only…no scientific value is suggested.

In a previous post on OHC GISS-ER model projections, Bob Tisdale pointed out a RealClimate post from earlier this year, in order to reconcile some differences between my graph and the one over there (Layman Lurker has suggested one possibility, the other possibility being that it is showing GISS-EH projections in the RC graph rather than GISS-ER). And yet, the other tidbit from that RC post that piqued my interest is a prediction from Gavin Schmidt:

Consistent with that, I predict that 2011 will not be quite as warm as 2010, but it will still rank easily amongst the top ten warmest years of the historical record.

He stuck to his guns even when a commenter brought up the issue of La Nina, which I like.

Anyhow, there was quite an interest over whether 2010 would set a record in all the major indices, and so in order to try and emulate some of that “temperature race” excitement from last year, I now have an excuse to show a new race – for tenth place in each of the major indices.

In the following graph, I have shown how the current temperature for the year is shaping up, versus that of the 10th hottest year in each of the major indices (they are different). I have left each of the indices in their “native” reported baseline (which gives some separation), so this graph should NOT be used to compare the anomalies from one index against another. Furthermore, the reason you notice more variability in the graph towards the beginning of the year versus the end is because the average is cumulative…so that the value for January shows the anomaly for January only, whereas the value for June shows the average anomaly for January through June. The value for December is thus the average for all anomalies January through December, or the annual average.
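The cumulative averaging described above can be sketched in a few lines. The function name and the sample values are hypothetical and only illustrate the calculation, they are not taken from any of the indices.

```python
def year_to_date_means(monthly_anomalies):
    """Running mean from January through each reported month."""
    means = []
    total = 0.0
    for count, value in enumerate(monthly_anomalies, start=1):
        total += value
        means.append(total / count)
    return means

# Hypothetical anomalies for January through March (not real index values).
print(year_to_date_means([1.0, 3.0, 5.0]))
```

Each entry averages over more months than the one before it, which is why the curve settles down as the year progresses.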

So, how does the graph look through the first 9 months of the year?

This year, GISS and UAH are currently on pace to be in the top 10, while HadCRUT, RSS, and NOAA are outside of it.

The GISS average anomaly for the remaining months would have to be below 0.36 C in order to avoid the top ten, and even though it is on a downswing (the September anomaly was 0.48 C), the strong La Nina from earlier in the year only dropped the temperature down to 0.43 C, so this will likely be in the top ten.

UAH would need to average below –0.06 C for three months to avoid the top ten (September anomaly was 0.29), which, even with the second part of the double-dip La Nina, would appear to be out of reach.

On the other hand, the satellite temperatures from RSS would require an average anomaly of 0.332 C to crawl INTO the top-ten, and even though this would seem feasible given the September anomaly (0.288), the fact that daily MSU temperatures have dropped significantly suggests that it will not beat out 2004 for tenth hottest.

NOAA needs to average 0.57 C to beat out its competition of 2001 for a place in the top ten, but with the most recently posted anomaly (0.52 C) being down from the previous months, it may be difficult.

And finally, HadCRUT, which just posted for September, needs to average .534 C over three months to crack the top ten (and beat 2007). This also seems unlikely, given September’s drop down to .371 C.
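The "needs to average" figures above follow from simple arithmetic on the annual mean: the remaining months must make up whatever the reported months leave short of (or beyond) twelve times the tenth-place annual value. A minimal sketch, with a hypothetical function name and made-up inputs rather than the actual index data:

```python
def required_remaining_average(reported, tenth_place_annual, months_in_year=12):
    """Average anomaly needed over the unreported months to tie the threshold year."""
    remaining = months_in_year - len(reported)
    if remaining <= 0:
        raise ValueError("no months left to report")
    needed_total = tenth_place_annual * months_in_year - sum(reported)
    return needed_total / remaining

# Hypothetical example: nine months at 0.50 C against a 0.45 C tenth-place year.
print(required_remaining_average([0.50] * 9, 0.45))
```

In this made-up case the last three months would need to average about 0.30 C to tie the threshold. An index currently inside the top ten must stay above the returned value, and one outside it must exceed it.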

There you have it – with the hindsight of 3/4 of the year having reported, I would venture that GISS and UAH will be in the top ten, while NOAA, HadCRUT, and RSS will fall outside of it. Of course, things may change.

Script available for this post here.


October 21, 2011

More on the effective ocean mixed layer on multi-year timescales, and Dessler 2011

Filed under: Uncategorized — troyca @ 6:42 pm

In a previous post on Dessler 2011, I commented that the paper had a few important errors within it, most notably an incorrect use of the ocean mixed layer heat capacity to show that the forcings from ocean heat transport dominated that of the unknown radiative forcings when it comes to surface temperature changes. The paper has since been published. In that previous posting I noted (although Dr. Spencer was there first) that Dessler 2011 incorrectly used the 700 meter layer (down to 750m) from the Argo data, which could be confirmed by downloading the Douglass and Knox data. That fact has now been mentioned, with text explicitly added, in the published version:

“This can be confirmed by looking at the Argo ocean heat content data covering the top 750 m of the ocean over the period 2003–2008”.

I don’t want to go over the same ground as the previous post, but I will quickly note the following on why this is incorrect (slightly modifying one of my own comments from over at Bart’s):

The heat that ENSO distributes from the lower layers to the mixed layer is primarily what Dessler and Spencer are trying to estimate with the “non-radiative” forcing. But the way Dessler has set it up, heat redistributed from the 110-400 m layers to the mixed layer during ENSO is NOT counted towards this forcing, but heat redistributed from the 800 meter layer to the 700 meter layer (even though the 700 meter layer is uncorrelated with surface temperature changes) IS counted. Clearly, Dessler’s formulation is problematic.

The above image shows the relationship between the SST and the temperature measured by the floats at different depths for three-month anomalies (from 2000-2010). As one can see, the top 100 meters or so show a positive relationship between SST and the temperature at those depths, whereas from 110 – 400 meters we see the opposite. This is because during El Nino events, the heat leaves the 110m to 400m layer to make its way into the upper layers, and the opposite is true during La Nina events. Clearly, beyond 400m, the heat lost or gained in those layers does not show much of a relationship with SST when it comes to seasonal variations.

There is, however, another method used by Dessler 2011 to estimate the monthly fluxes, which involves simply taking a set heat capacity (presumably the mixed layer) and multiplying it by the monthly fluctuations in SST:

To evaluate the magnitude of the first term, C(dTs/dt), I assume a heat capacity C of 168 W-month/m^2/K, the same value used by LC11 (as discussed below, SB11’s heat capacity is too small). The time derivative is estimated by subtracting each month’s global average ocean surface temperature from the previous month’s value.

The choice for this heat capacity is particularly important, given that Murphy and Forster 2010 used a similar method in their response to Spencer and Braswell 2008. Science of Doom has been looking into this as well, and gives some background on the origins of why this is important related to the original Forster and Gregory topic.

So, why do we get such radically different results when using 168 W-month/m^2/K times SST differences (~ 9 W/m^2 per month) versus actual measurements of the top 100 meters in Argo (~ 2 W/m^2 per month)? Well, looking at the figure above, one reason should be fairly clear – a change in global SST does not result in a uniform change in all layers globally down to the 100 m layer on these short time scales; the graph would show a regression coefficient of about 1 all the way to 100 meters if that were the case. Since in fact the regression coefficient falls off pretty quickly after 50 meters (likely because parts of the globe don’t have a mixed layer depth deeper than 50 m), using a single constant heat capacity equivalent to 100 meters of water will overestimate the heat changes.

So, where does this 168 W-month/m^2/K come from? In our conversation over at Bart’s, Eli Rabett notes the following:

Dessler and Lindzen and Choi, and Schwartz and etc.’s heat capacity of 168 W-month/m^2/K is the standard choice corresponding to a depth of ~ 100 m as you say. Everyone is Galileo

Which was helpful in that it led me to Schwartz 2007, who appears to be the originator. In there, he notes:

The present analysis indicates that the effective heat capacity of the world ocean pertinent to climate change on this multidecadal scale may be taken as 14 ± 6 W yr m^-2 K^-1. The effective heat capacity determined in this way is equivalent to the heat capacity of 106 m of ocean water or, for ocean fractional area 0.71, the top 150 m of the world ocean. This effective heat capacity is thus comparable to the heat capacity of the ocean mixed layer.

The 168 appears to come from 14 W-yr/m^2/K * 12 months/year. But the regression in S07 was performed to estimate the heat capacity on multi-decadal scales, when heat slowly moves into the deeper ocean, not merely the short-term monthly, seasonal, or even annual variations during 10 year periods! Indeed, even Murphy and Forster 2010 note:

An appropriate depth for heat capacity depends on the length of the period over which Eq. (1) is being applied (Watterson 2000; Held et al. 2010).

But they seem to proceed with the 100m derived for a much longer period. We can get this approximate 14 W-yr/m^2/K as follows:

Specific heat of water * mass per m^2 / seconds per year =

4.18 (J/g/K) * (1000 g/kg) * (1000 kg/m^3) * (106 m depth) / [ (365.25 days/year) * (24 hrs/day) * (3600 s/hr) ] = 14 W-yr/m^2/K
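For anyone who wants to check the arithmetic, here is a small sketch of the same conversion in both directions. The function names are made up for this example, and the freshwater specific heat and 1000 kg/m^3 density are the same simplifications used in the calculation above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

SPECIFIC_HEAT = 4180.0  # J/kg/K, freshwater value (4.18 J/g/K) as used above
DENSITY = 1000.0        # kg/m^3, freshwater simplification

def heat_capacity_w_yr(depth_m):
    """Heat capacity of a water column of the given depth, in W-yr/m^2/K."""
    return SPECIFIC_HEAT * DENSITY * depth_m / SECONDS_PER_YEAR

def depth_for_capacity(capacity_w_yr):
    """Depth of water (m) whose heat capacity equals capacity_w_yr (W-yr/m^2/K)."""
    return capacity_w_yr * SECONDS_PER_YEAR / (SPECIFIC_HEAT * DENSITY)

annual = heat_capacity_w_yr(106.0)
print(annual)                    # roughly 14 W-yr/m^2/K
print(annual * 12)               # roughly 168 W-month/m^2/K
print(depth_for_capacity(14.0))  # roughly 106 m
```

Multiplying by 12 only changes the time unit from years to months. It says nothing about whether a 106 m column is the right effective depth on monthly time scales.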

I’ll note a few things in passing: first, that 4.18 is technically for freshwater, not sea water. Furthermore, at lower depths we start hitting some sea bottom, so the area fraction of the lower layers is not quite as large as that of the higher layers, although this should not make much of a difference in those first 110 meters. However, most importantly, we see that multiplying the results of that 14 W-year/m^2/K regression by 12 and applying this to monthly values would assume that the heat capacity is the same for any length of period, and that annual change in energy would be about 12 times that of the monthly change in energy, which is simply not the case when you have the monthly and seasonal fluctuations up and down. Indeed, using the outputs of ECHAM-MPI model (which will be free of the measurement noise that would affect short-term actual measurements), the three-month heat changes during the ENSO dominated periods are only about 2 times that of the single-month changes, and the annual heat changes are only about 4 times that of single month changes.

Since we’re interested in the seasonal and annual responses during a 10 year period (as in Forster and Gregory 2006 or Murphy et al. 2009), we can use this most recent decade for a new regression. Here I calculate a more relevant relationship between the diff of SST (dTs/dt) and the actual measured OHC flux (dH/dt) of the top 110 meters. As seen below, we get a regression coefficient of approximately 32 for the three month anomalies, which (if I’m not screwing this up) would theoretically correspond to about a 60m mixed layer depth appropriate for this time period (and indeed, on a monthly scale, 40m is more appropriate).

All of this ignores the fact that there are likely to be measurement errors within the ocean temperatures as well, and that, at one month, those errors will affect the average more than at 3 months, and certainly more than at a year. In discussions over at Isaac Held’s place, it’s becoming clearer to me – based on model estimates of that regression coefficient – that using monthly anomalies is not effective for calculating the feedbacks, and that we’ll probably need to look more at annual fluxes to reduce the noise (both measurement and atmospheric) when calculating that sensitivity. On those time scales, at least according to Dr. Spencer’s recent post, it appears that the “unknown” radiative forcing can contribute 60% of the forcing necessary for the ocean heat changes.

Taking a look at the control runs in Dessler 2011 (figure 2) at 0 lag, the models seem to show an average coefficient of around 0.5, which corresponds to a sensitivity per doubling of CO2 of about (3.8/0.5) = 7.6 C, when in fact we know the average sensitivity in those models is around 3 C. This would suggest that the method of Murphy et al. 2009 and Forster and Gregory 2006 using seasonal anomalies (of flux vs temp changes) shows no linear relationship with the actual sensitivity, or, if there is a relationship, it is a severe overestimate of sensitivity (applying this to the results from observations, correcting for such a theoretical overestimate would yield an ECS of ~ 1.3 C per doubling of CO2, not even accounting for the underestimate of the response caused by the issues in SB08/SB11). However, I tend to think that it is not an accurate way to diagnose feedback, as many of the regression coefficients are near 0. What remains to be seen is if using annual anomalies of TOA flux and temperature changes will yield better estimates.
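The (3.8/0.5) conversion used above is just the forcing for doubled CO2 divided by the regression coefficient. A minimal sketch, assuming 3.8 is that forcing in W/m^2 (which is how the paragraph uses it) and using a made-up function name:

```python
FORCING_2XCO2 = 3.8  # W/m^2 per doubling of CO2, the value used in the text

def sensitivity_from_response(response_w_m2_k, forcing=FORCING_2XCO2):
    """Implied equilibrium warming per CO2 doubling for a given radiative response."""
    if response_w_m2_k <= 0:
        raise ValueError("response must be positive for a finite sensitivity")
    return forcing / response_w_m2_k

def response_from_sensitivity(sensitivity_c, forcing=FORCING_2XCO2):
    """Radiative response (W/m^2/K) implied by a given sensitivity."""
    return forcing / sensitivity_c

print(sensitivity_from_response(0.5))  # the 0-lag coefficient discussed above
print(response_from_sensitivity(3.0))  # response implied by a 3 C sensitivity
```

Because the sensitivity is the reciprocal of the coefficient, regression coefficients near 0 map onto enormous or undefined sensitivities, which is part of why near-zero coefficients make the diagnosis unreliable.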

Code and data to reproduce the figure available here.


October 13, 2011

GISS-ER Ocean Heat Content, 20th Century Experiment and A1B projections

Filed under: Uncategorized — troyca @ 12:43 pm

In my first post on GISS-ER Ocean Heat Content projections, Bob Tisdale asked me if I could show the hindcast data as well. This included 9 runs of GISS-ER for the 20th century experiment, which I then downloaded and processed in the same way I mentioned in that previous post. Also, since the model runs actually spit out absolute temperatures, it’s possible to baseline even the projections – which start in 2003 – to the 1955 through 1998 period of overlap between the NODC observations and the 20th century hindcasts. As I mentioned in that last post, these calculations have a few simplifying assumptions and should not be taken as perfect, but the method certainly tracks extremely well with Levitus OHC calculations (r^2=.998). You’ll also notice that there is a gap from where the 20th century experiment ends and the A1B projections begin, between 1999 and 2003.

I must say, on first glance, this seems to vindicate Bob’s post that received so much flak for showing the GISS-ER and the NODC trends starting at the same point in 2003, as his choice would seem to be a better representation of the quantity of “missing heat” between the model projection and the observations.

All code and intermediate data is available from here. Raw ThetaO data for the CMIP3 model runs available from PCMDI. Observational data available from NOAA NODC.

Update (10/13): I noticed that I didn’t mention that this is ONLY for the top 700 meters, in line with what I’ve been looking at in the past few posts. I’ve updated the image to include that note.


October 11, 2011

ECHAM-MPI model runs, Ocean Heat Content, and Katsman and van Oldenborgh 2011

Filed under: Uncategorized — troyca @ 7:37 am

As I mentioned in my last post, the GISS-ER projections for ocean heat content did not seem to include any period of flattening like we’ve seen between 2003 and the present, so I wanted to see if some other CMIP3 models might include the variability in upper ocean heat content to explain such observations. Over at Real Climate, Dr. van Oldenborgh mentioned that his paper (I’ll call it KO11 from here on out) used 17 runs from ECHAM-MPI to explain the current flattening. The three key points for the paper listed at GRL are:

An 8-yr period without upper ocean warming is not exceptional
It is explained by more radiation to space (45%) and deep ocean warming (35%)
Recently-observed changes point to an upcoming resumption of upper ocean warming

First, I will note that the projections (these start in 2001) for the upper 700m DO seem to show more variability than GISS-ER, (despite there only being 2 for the SRESA1B scenario at PCMDI) which is a good thing when it comes to explaining the recent flattening:

I’ve once again baselined to the overlapping years, which now starts in 2001. As you can see, it still seems like ECHAM-MPI has difficulty reproducing the variability, but it is not quite as monotonic as GISS-ER.

However, while the Katsman and van Oldenborgh paper does some interesting analysis with respect to deep ocean warming and the model simulations, there are a few problems with it that lead me to question those three key points.

Issue #1 – A Statistical Problem of Independence

In KO11, they use 17 different runs, and then calculate all of the different overlapping 8-yr trends during the period for each of the runs, yielding their following conclusions:

From the distribution of linear trends in UOHC, it appears that 11% of all overlapping 8-yr periods in the time frame 1969–1999 have a zero or negative UOHC trend (Figure 2a). Over 1990–2020, around the time of the observed flattening, this is reduced to 3% (Figure 2b), corresponding to a probability of 57% of at least one zero or negative 8-yr trend in this time frame.

Bold mine. The 3% actually appears to be rounded up from 2.66%, which comes from 14 of these overlapping 8-year trends, out of "17 members × 31 overlapping 8-yr periods = 527 trend values", having a trend of zero or less. The 57% could then be calculated as the probability that the event occurs in at least one of the 31 periods, or 1 – (1 – .0266)^31 = 57%. Clearly, there appears to be a problem. These events are treated as independent, when the fact that the trends are from overlapping periods means there will be a high degree of autocorrelation (quite separate from year-to-year OHC exhibiting autocorrelation). In other words, it ignores the fact that a particular ensemble member containing one negative 8-year trend is more likely to have another negative 8-year trend (particularly if 7 of those years are overlapping), so that the probability of any particular run having "at least one zero or negative 8-yr trend in this time frame" is actually substantially less than that 57%.
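The two numbers in that calculation are easy to reproduce. This is only the arithmetic as I have reconstructed it from the counts quoted in the paper, with variable names invented for the example:

```python
negative_trends = 14      # zero-or-negative 8-yr trends, as counted above
total_trends = 17 * 31    # 17 members x 31 overlapping 8-yr periods = 527
periods_per_run = 31

p_single = negative_trends / total_trends

# Probability of at least one such trend in a run IF the 31 periods were independent.
p_at_least_one = 1 - (1 - p_single) ** periods_per_run

print(f"{p_single:.2%}")        # about 2.66%
print(f"{p_at_least_one:.0%}")  # about 57%
```

The second line is only valid for independent trials, which is exactly the assumption that overlapping periods violate.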

The degree to which it is different I believe depends largely on the autocorrelation/noise model. I’ve used an ARIMA(2,0,2) for the noise, which I got as a "best fit" from one of the ECHAM-MPI runs, and added in a trend for this simulation. I’ve tuned the model to give approximately the 2.66% negative trends presented in the paper. Anyhow, running the simulation 1000 times yields 797 negative 8-yr trends out of 31,000 (1000 runs * 31 8-yr periods), or 2.57% of the trends. By the logic in KO11, we should expect that there is a 55% chance [1-(1-.0257)^31] that we’d see a negative trend over any given run. However, if we look at our simulation, only 36% (362 out of 1000) of the runs actually contain a negative trend. A look at our histogram below shows why:

Out of the 362 runs that contain at least one negative trend, a whopping 197 of them contain 2 or more negative trends, far more than we would expect if they were independent trials, but quite what we’d expect with these overlapping periods.
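To show the mechanics of this kind of check without the original script, here is a simplified sketch. It swaps the ARIMA(2,0,2) noise for a plain AR(1) process, and the trend, `phi`, and `sigma` values are arbitrary assumptions rather than anything fitted to ECHAM-MPI, so the percentages it prints will not match the ones above. What it does illustrate is the comparison between the independence formula and the fraction of runs that actually contain a zero or negative 8-yr trend.

```python
import random

def ols_trend(values):
    """Least squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    xbar = (n - 1) / 2.0
    ybar = sum(values) / n
    num = sum((i - xbar) * (v - ybar) for i, v in enumerate(values))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

def simulate(n_runs=2000, n_windows=31, window=8,
             trend=1.0, phi=0.5, sigma=2.0, seed=0):
    """Compare independent-trial and simulated odds of a non-positive trend."""
    rng = random.Random(seed)
    n_years = n_windows + window - 1
    total_negative = 0
    runs_with_negative = 0
    for _ in range(n_runs):
        noise = 0.0
        for _ in range(50):  # burn-in so the AR(1) noise starts near stationarity
            noise = phi * noise + rng.gauss(0.0, sigma)
        series = []
        for year in range(n_years):
            noise = phi * noise + rng.gauss(0.0, sigma)
            series.append(trend * year + noise)
        negatives = sum(
            1 for start in range(n_windows)
            if ols_trend(series[start:start + window]) <= 0
        )
        total_negative += negatives
        if negatives > 0:
            runs_with_negative += 1
    p_single = total_negative / (n_runs * n_windows)
    p_any_if_independent = 1 - (1 - p_single) ** n_windows
    p_any_simulated = runs_with_negative / n_runs
    return p_single, p_any_if_independent, p_any_simulated

p_single, p_indep, p_sim = simulate()
print(p_single, p_indep, p_sim)
```

Because negative trends cluster in neighboring windows, the simulated fraction of runs with at least one negative trend should come out below the figure from the independence formula, which is the same qualitative result as in the histogram.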

But really, much of this extra simulation is unnecessary. A look at figure 2b seems to show (if I’m reading the colors correctly) that 5 out of the 17 ensemble members during this period show a zero or negative trend, or 29%. Yes, this is a smaller sample size, but if KO11 had simply reported this they would have avoided this statistical pitfall, and the 29% is almost certainly a more correct figure than the 57%. Figure 2B from KO11 (each color represents a different member of the ensemble):

Does this issue "matter"? On the one hand, I would argue that yes, it matters whether the probability of this happening over the 31-year period is more likely than not (57%) vs. less than a 1 in 3 shot. On the other hand, using a length of 31 years to diagnose whether an 8-yr event is "exceptional" seems rather arbitrary, particularly when it happened right after the SRES projections began. Furthermore, why KO11 also prominently included the 1969-1999 centered 8-year trends (which includes 1966 as the first year) is a mystery to me, given that the change in anthropogenic forcings used for the model over that period does not resemble the magnitude present in estimates for the 2003-2010 period. I’m not quite sure how to interpret the "not exceptional" part: certainly, if I can choose a model with some of the highest variability, choose a length of time for the event to happen, and/or choose a period with a smaller increase in forcings, then run this model numerous times, it seems possible that the event may occur with some likelihood over that period. When it comes to examining the CMIP3 models as a whole, I might ask what percentage of the ensemble members showed an OHC trend resembling that 2003-2010 period? I won’t know until I process all the different runs for the different models.

Issue #2 – The ENSO observations do not bear out the theory

From KO11, under the "Recent Absence of Ocean Warming" section, we read:

During 2002–2007, a series of El Niño events occurred (www.cpc.noaa.gov/data/indices/), which probably yielded a larger than average upper ocean heat loss [Trenberth et al., 2002] caused by the (lagged) response through net outgoing TOA radiation (Figure 3b). This seems at odds with direct observations that indicate an opposing increase in the radiation from space [Trenberth and Fasullo, 2010], but the record has large uncertainties and is too short to separate trends from decadal variability.

KO11 readily admit that the TOA radiation observations do not match the behavior in the models that explain the flattening of UOHC, but chalk this up to uncertainties and a short record. That’s fine. But the comment about 2002-2007 "El Nino events" seems to make little sense given the theory that has been established throughout the paper. Basically, KO11 show that in the model simulations, El Nino events result in a heat loss that leads to a decrease in the ocean heat content trend in subsequent years, whereas La Ninas have the opposite effect (heat gain) and lead to an increase in the OHC trend. A very specific lagged correlation is shown between the mean Nino3.4 index over an 8-year interval and the 8-year OHC trend in figure 3d:

According to the graph of theory/model simulations, the OHC trend from 2003-2010 would be driven by the mean 8-year Nino3.4 index from 4 years prior, or 1999-2006 (that is, the running mean centered on 2002). Below, I’ve shown the Nino3.4, with this relevant portion highlighted:

The two red lines bound the ENSO period we’re talking about, and the green lines point to the specific point in the running mean that would be relevant according to the theory. Note that in 2002 it is actually BELOW 0, closer to La Nina conditions, which is the opposite of the strong decadal El Nino (positive Nino3.4 index) that the theory projected would have caused such a flattening! Yes, we see El Ninos between 2002-2007, but we also see a lot of fluctuations in the opposite direction…that’s the nature of ENSO. However, unless I’ve missed something (which is quite possible), or the chart is unclear, I cannot see how the combination of the flattening OHC trend and the observed Nino3.4 is at all consistent with the model findings, and in fact seems to contradict them.

So does this issue matter? In this case, I think the answer is “yes”. The current OHC trend would seem to be exceptional according to the model, if we experienced it without the corresponding ENSO variation that would have been expected to cause the extra 45% heat to radiate to space. Furthermore, the projection of an “upcoming resumption of upper ocean warming” depends largely on the shift in ENSO to La Nina conditions, but, as we’ve seen, the La Nina conditions already were present 4 years prior to the current flattening. If I had to bet, I would bet that OHC warming indeed picks up again, but this particular attribution to ENSO does not seem to have much observational support.

Anyhow, all code and intermediate data for this post can be found here.
