Instructions
User Manual:
Open the PDF directly: View PDF .
Page Count: 4
Download | |
Open PDF In Browser | View PDF |
Skills development: data analysis Project 1: Consider the following data: Timeseries of global annual temperature anomalies for 1880-2017 (reconstructed from climate models and observations) – plotted in Figure 1 Timeseries of measured annual CO2 concentrations in the atmosphere from Mauna Loa observatory for 1959-2017 – plotted in Figure 2 Temperature anomaly (K) 1.2 1 0.8 0.6 0.4 0.2 0 1860 1880 -0.2 1900 1920 1940 1960 1980 2000 2020 2040 2000 2020 -0.4 -0.6 Figure 1 420 Annual CO2 (ppm) 400 380 360 340 320 300 1950 Figure 2 1960 1970 1980 1990 2010 2030 Research questions/objectives: 1) Is there a long-term trend in the global temperature data and is the trend significant? How different is the trend over the last 50 years in comparison to the overall trend (1880-2017)? 2) Is there a long-term trend in CO2 data and is the trend significant? 3) How strongly are temperature and CO2 timeseries linearly related and what is the linear relation between the two? How good is the linear model that relates temperature to CO2 (i.e. how much variance in temperature is explained by this model)? 4) If CO2 is twice its 2017 value, how much the global temperature would change according to the linear model? If CO2 concentration continues to rise with the same rate per year until 2100 what would be the global temperature in 2100 according to the linear regression model? 5) What would be the CO2 concentration in year 1880 according to the linear regression model between CO2 and temperature? 6) Is there a relationship between year-to-year variability in global temperature data and year-to-year variability in CO2 data? Instructions and guidelines: Feel free to use any computing environment/software (e.g. Excel, Matlab, R, Python, ...) to perform the data analysis. The data is given in Data.xls file. In the worksheet labelled as 'Data', the fist column contain years, the second column contains temperature and the third contains CO2. The second sheet, labelled as 'Temperature', gives the solution to question 1: the trend calculation and its significance (T-statistics). Start by opening the data (or loading it in the computing software of your choice) and plotting it. See if you can reproduce the Figures 1 and 2. Specific instructions for each question + guidelines for students using Excel: 1) Use the linear regression (ordinary least square method) to calculate the trend. See the guidelines in 'Stats_recap.pdf' presentation from Day 5 (on Canvas) on how to derive the slope and intercept. Plot the trend-line on top of the original timeseries. Calculate the T-statistics and determine whether the trend is significantly different from zero at your choice of the confidence level (note: 5% and 1% are the most commonly used confidence levels). 2) Same steps as in (1) but for CO2 timeseries. 3) In a new sheet copy the temperature and CO2 timeseries for the period they overlap (1959-2017). It's always a good practice to keep the raw data (the first worksheet) intact. In this new sheet, perform the correlation analysis between the two timeseries. Plot a scatter-plot with temperature on y-axis and CO2 on x-axis and perform a linear regression (y=ax+b) on this data. In a new column calculate the regressed temperature (y=ax+b). Plot the regressed temperature (trendline) on top of the scatter-plot. Calculate the correlation coefficient (r) between the regressed temperature and the original temperature -> r^2 tells you how much variance is explained by the linear regression model. You can also plot a scatterplot with regressed temperature on y-axis and original temperate on x-axis to visually inspect the model performance. 4) Using the model from above (y=ax+b) try to answer the first question. For the second question: look back on the trend of CO2 from (2) and extrapolate that trendline further to 2100. What CO2 values do you get for year 2100? Once you estimate that value, use the model (y=ax +b) to determine the temperature for year 2100. 5) This question requires you to perform the linear regression between CO2 (yaxis) and temperature (x-axis). So you need to find a new linear regression model y=Ax+B. Once you calculate the values A (slope) and B (intercept), use the temperature values from year 1880 to determine CO2 value for that year. 6) Similarly to the question (3), in a new sheet copy the temperature and CO2 timeseries for the period they overlap (1959-2017). Observe that in both timeseries the long-term trend over the whole period is the most dominant signal. Therefore, we first need to get rid of the trend in order to see how the data fluctuates year-to-year. In other words, we need to perform de-trending of the original timeseries. Steps for detrending (option 1): • Calculate the trendline (linear regression) for both temperature and CO2 -> you have already done this in (1) and (2). Add a column that contains values of the trendline for temperature (T_trendline=trend_T*time+intercept_T) and add a column that contains values of the trendline for CO2 (CO2_trendline=trend_CO2*time+intercept_CO2). • Subtract the T_trendline values from original temperature (T) values (for each year) and save the values in a new column -> this is the detrended temperature signal. Perform the same for CO2. • Plot separately the detrended T and detrended CO2 timeseries. • Perform the correlation analysis (calculate r) between the two detrended timeseries. You can also plot the two timeseries in a scatter-plot (detrended T versus detrended CO2). Steps for detrending (option 2): Another approach for detrending is by using a moving-average (running-mean). This approach is probably better for the CO2 data as the detending approach above did not really produce year-to-year fluctuations for CO2 (assumption of the normal distribution of residuals in the linear regression does not really apply in this case). • Start by calculating a 5-year moving average for temperature and save the values in the centre of the averaging window, i.e. find the average temp for 1959-1963 and save that value for the year 1961, then find the average temp for 1960-1964 and save that value for the year 1962, then find the average temp for 1961-1965 and save that value for the year 1963, and so on. Your last centred window will be year 2015 and it will contain the average CO2 value for 2013-2017. In this way the moving-average will not have the values for the first two years at the beginning of the period and the last two years at the end of the original period, i.e. the moving average is given for the period 1961-2015. • Repeat the same procedure for the CO2 timeseries. • Plot separately the moving-average timeseries for T and CO2. • Similarity to the detrending in option 1, you need to remove the movingaverage timeseries from the original timeseries (this can be only done for the overlapping period of 1961-2015). This subtraction will give you the residual timeseries for 1961-2015. Plot those residual timeseries. If all goes well you should be getting the following plot for the residual CO2 timeseries: 1 CO2 residuals 0.8 0.6 0.4 0.2 0 1960 -0.2 1970 1980 1990 2000 2010 2020 -0.4 -0.6 -0.8 time Figure 3: Residual CO2 timeseries (when the 5-yr moving average is removed from the original timeseries). Finally, you can perform the correlation analysis between the detrended (residual) timeseries of T and CO2. Try also correlating the detrended temperature timeseries from option 1 with the residual CO2 timeseries from option 2. You can also test the sensitivity of your results (correlation coefficient) to the choice of the moving-averaging window (e.g. 3-yr, 7-yr moving average). Which movingaveraging window gives you the highest correlation coefficient between residual T and residual CO2 timeseries?
Source Exif Data:
File Type : PDF File Type Extension : pdf MIME Type : application/pdf PDF Version : 1.4 Linearized : No Page Count : 4 Language : en-CA Creator : Writer Producer : LibreOffice 4.2 Create Date : 2019:01:23 18:56:35-08:00EXIF Metadata provided by EXIF.tools