This chapter illustrates how to use NCO to process and analyze the results of a CCSM climate simulation.
************************************************************************
Task 0: Finding input files
************************************************************************
The CCSM model outputs files to a local directory like:
/ptmp/zender/archive/T42x1_40
Each component model has its own subdirectory, e.g.,
/ptmp/zender/archive/T42x1_40/atm
/ptmp/zender/archive/T42x1_40/cpl
/ptmp/zender/archive/T42x1_40/ice
/ptmp/zender/archive/T42x1_40/lnd
/ptmp/zender/archive/T42x1_40/ocn
within which model output is tagged with the particular model name:
/ptmp/zender/archive/T42x1_40/atm/T42x1_40.cam2.h0.0001-01.nc
/ptmp/zender/archive/T42x1_40/atm/T42x1_40.cam2.h0.0001-02.nc
/ptmp/zender/archive/T42x1_40/atm/T42x1_40.cam2.h0.0001-03.nc
...
/ptmp/zender/archive/T42x1_40/atm/T42x1_40.cam2.h0.0001-12.nc
/ptmp/zender/archive/T42x1_40/atm/T42x1_40.cam2.h0.0002-01.nc
/ptmp/zender/archive/T42x1_40/atm/T42x1_40.cam2.h0.0002-02.nc
...
or
/ptmp/zender/archive/T42x1_40/lnd/T42x1_40.clm2.h0.0001-01.nc
/ptmp/zender/archive/T42x1_40/lnd/T42x1_40.clm2.h0.0001-02.nc
/ptmp/zender/archive/T42x1_40/lnd/T42x1_40.clm2.h0.0001-03.nc
...
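The naming pattern in these listings can be reproduced in the shell, which helps when checking whether an expected file exists. This is a minimal sketch that assumes the pattern caseid.model.h0.YYYY-MM.nc seen above; the function name fl_nm_get and its argument order are hypothetical, not part of NCO or CCSM.

```shell
# Hypothetical helper: build a monthly history filename from its parts
# Usage: fl_nm_get caseid model year month
fl_nm_get() {
    # %04d and %02d zero-pad the year and month, e.g., 1 -> 0001 and 2 -> 02
    printf '%s.%s.h0.%04d-%02d.nc\n' "$1" "$2" "$3" "$4"
}

fl_nm_get T42x1_40 cam2 1 2     # Atmosphere, year 1, February
fl_nm_get T42x1_40 clm2 100 12  # Land, year 100, December
```

Pass the year and month as plain integers (2, not 02), since printf may reject zero-prefixed values such as 08 as invalid octal.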
************************************************************************
Task 1: Creating seasonal cycles
************************************************************************
The first task in data processing is often creating seasonal cycles.
Imagine a 100-year simulation with its 1200 monthly mean files.
Our goal is to create a single file containing 12 months of data.
Each month in the output file is the mean of 100 input files.
Normally, we store the "reduced" data in a smaller, local directory.
caseid='T42x1_40'
#drc_in="${DATA}/archive/${caseid}/atm"
drc_in="${DATA}/${caseid}"
drc_out="${DATA}/${caseid}"
mkdir -p ${drc_out}
cd ${drc_out}
Method 1: Assume all data in directory applies
for mth in {1..12}; do
mm=`printf "%02d" $mth`
ncra -O -D 1 -o ${drc_out}/${caseid}_clm${mm}.nc \
${drc_in}/${caseid}.cam2.h0.*-${mm}.nc
done # end loop over mth
Method 2: Use shell 'globbing' to construct input filenames
for mth in {1..12}; do
mm=`printf "%02d" $mth`
ncra -O -D 1 -o ${drc_out}/${caseid}_clm${mm}.nc \
${drc_in}/${caseid}.cam2.h0.00??-${mm}.nc \
${drc_in}/${caseid}.cam2.h0.0100-${mm}.nc
done # end loop over mth
Method 3: Construct input filename list explicitly
for mth in {1..12}; do
mm=`printf "%02d" $mth`
fl_lst_in=''
for yr in {1..100}; do
yyyy=`printf "%04d" $yr`
fl_in=${caseid}.cam2.h0.${yyyy}-${mm}.nc
fl_lst_in="${fl_lst_in} ${fl_in}"
done # end loop over yr
ncra -O -D 1 -o ${drc_out}/${caseid}_clm${mm}.nc -p ${drc_in} \
${fl_lst_in}
done # end loop over mth
Make sure the output file averages the correct input files!
ncks -M prints global metadata:
ncks -M ${drc_out}/${caseid}_clm01.nc
The input files ncra used to create the climatological monthly mean
will appear in the global attribute named 'history'.
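A complementary check, made before running ncra, is to count how many files a glob pattern matches. The sketch below creates empty placeholder files in a scratch directory (made with mktemp, so no real model output is involved) and counts the names matched by the two Method 2 patterns for January. The names drc_tst and fl_nbr are illustrative.

```shell
caseid='T42x1_40'
drc_tst=$(mktemp -d)   # Scratch directory standing in for ${drc_in}
# Create empty placeholders for 101 years of January files
yr=1
while [ ${yr} -le 101 ]; do
    touch "${drc_tst}/$(printf '%s.cam2.h0.%04d-01.nc' ${caseid} ${yr})"
    yr=$((yr+1))
done
# Count names matched by the Method 2 patterns: years 00?? plus year 0100
fl_nbr=$(ls "${drc_tst}"/${caseid}.cam2.h0.00??-01.nc \
            "${drc_tst}"/${caseid}.cam2.h0.0100-01.nc | wc -l | tr -d ' ')
echo "Matched ${fl_nbr} files"   # Year 0101 is not matched by either pattern
rm -r "${drc_tst}"
```

If the count differs from the number of years expected, the pattern is either too broad or files are missing.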
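Use ncrcat to aggregate the climatological monthly means.
The pattern [01][0-9] matches only the two-digit months, whereas ??
would also match later products such as ${caseid}_clm_x.nc on a re-run.
ncrcat -O -D 1 \
${drc_out}/${caseid}_clm[01][0-9].nc ${drc_out}/${caseid}_clm_0112.nc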
Finally, create climatological means for reference.
The climatological time-mean:
ncra -O -D 1 \
${drc_out}/${caseid}_clm_0112.nc ${drc_out}/${caseid}_clm.nc
The climatological zonal-mean:
ncwa -O -D 1 -a lon \
${drc_out}/${caseid}_clm.nc ${drc_out}/${caseid}_clm_x.nc
The climatological time- and spatial-mean:
ncwa -O -D 1 -a lon,lat,time -w gw \
${drc_out}/${caseid}_clm.nc ${drc_out}/${caseid}_clm_xyt.nc
This file contains only scalars, e.g., "global mean temperature",
used for summarizing global results of a climate experiment.
Climatological monthly anomalies = Annual Cycle:
Subtract the climatological mean from the climatological monthly means.
The result is the annual cycle, i.e., the climate mean has been removed.
ncbo -O -D 1 -o ${drc_out}/${caseid}_clm_0112_anm.nc \
${drc_out}/${caseid}_clm_0112.nc ${drc_out}/${caseid}_clm_xyt.nc
************************************************************************
Task 2: Correcting monthly averages
************************************************************************
The previous step approximates all months as being of equal length, so,
e.g., February weighs slightly too much in the climatological mean.
This approximation can be removed by weighting months appropriately.
We must add the number of days per month to the monthly mean files.
First, create a shell variable dpm:
unset dpm # Days per month
declare -a dpm
dpm=(0 31 28.25 31 30 31 30 31 31 30 31 30 31) # Allows 1-based indexing
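As a quick sanity check on these weights, the twelve values should sum to the mean year length of 365.25 days. The sketch below uses awk because shell arithmetic is integer-only; the variable names dpm_lst, dpy, and wgt_feb are illustrative and independent of the dpm array above.

```shell
# Days per month, with February averaged over the four-year leap cycle
dpm_lst='31 28.25 31 30 31 30 31 31 30 31 30 31'
# Sum the twelve values with awk
dpy=$(echo ${dpm_lst} | awk '{ for(i=1;i<=NF;i++) s+=$i; print s }')
echo "Days per year: ${dpy}"
# Fractional weight of February in a day-weighted annual mean
wgt_feb=$(echo "28.25 ${dpy}" | awk '{ printf "%.4f", $1/$2 }')
echo "February weight: ${wgt_feb} (equal weighting would give 0.0833)"
```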
Method 1: Create dpm directly in climatological monthly means
for mth in {1..12}; do
mm=`printf "%02d" ${mth}`
ncap2 -O -s "dpm=0.0*date+${dpm[${mth}]}" \
${drc_out}/${caseid}_clm${mm}.nc ${drc_out}/${caseid}_clm${mm}.nc
done # end loop over mth
Method 2: Create dpm by aggregating small files
for mth in {1..12}; do
mm=`printf "%02d" ${mth}`
ncap2 -O -v -s "dpm=${dpm[${mth}]}" ~/nco/data/in.nc \
${drc_out}/foo_${mm}.nc
done # end loop over mth
ncecat -O -D 1 -p ${drc_out} -n 12,2,1 foo_01.nc foo.nc
ncrename -O -D 1 -d record,time ${drc_out}/foo.nc
ncatted -O -h \
-a long_name,dpm,o,c,"Days per month" \
-a units,dpm,o,c,"days" \
${drc_out}/foo.nc
ncks -A -v dpm ${drc_out}/foo.nc ${drc_out}/${caseid}_clm_0112.nc
Method 3: Create small netCDF file using ncgen
cat > foo.cdl << EOF
netcdf foo {
dimensions:
time=unlimited;
variables:
float dpm(time);
dpm:long_name="Days per month";
dpm:units="days";
data:
dpm=31,28.25,31,30,31,30,31,31,30,31,30,31;
}
EOF
ncgen -b -o foo.nc foo.cdl
ncks -A -v dpm ${drc_out}/foo.nc ${drc_out}/${caseid}_clm_0112.nc
Another way to get correct monthly weighting is to average daily
output files, if available.
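To see what the dpm weights change, the sketch below compares an equally weighted annual mean with a dpm-weighted one, using awk on twelve made-up monthly values. The numbers in val_lst are hypothetical, not model output, and the names dpm_lst, val_lst, and avg are illustrative. With NCO itself, the analogous step would presumably supply dpm as the weight variable, much as gw is supplied to ncwa with -w in Task 1.

```shell
dpm_lst='31 28.25 31 30 31 30 31 31 30 31 30 31'
# Hypothetical monthly means of some field, January through December
val_lst='1 2 3 4 5 6 7 8 9 10 11 12'
# Line 1 of the awk input holds the weights, line 2 holds the values
avg=$(printf '%s\n%s\n' "${dpm_lst}" "${val_lst}" | awk '
    NR==1 { for(i=1;i<=NF;i++) w[i]=$i; n=NF }
    NR==2 { for(i=1;i<=n;i++) { sw+=w[i]; swx+=w[i]*$i; sx+=$i }
            printf "%.4f %.4f", sx/n, swx/sw }')
echo "Unweighted and dpm-weighted means: ${avg}"
```

The two means differ only slightly here, which is consistent with the equal-month approximation being a small but systematic error.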
************************************************************************
Task 3: Regional processing
************************************************************************
Let's say you are interested in examining the California region.
Hyperslab your dataset to isolate the appropriate latitudes and longitudes.
ncks -O -D 1 -d lat,30.0,37.0 -d lon,240.0,270.0 \
${drc_out}/${caseid}_clm_0112.nc ${drc_out}/${caseid}_clm_0112_Cal.nc
The dataset is now much smaller, which makes it easier
to examine particular metrics.
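The longitude bounds in the hyperslab above (240.0 and 270.0) suggest that the dataset stores longitude as 0 to 360 degrees east, so western-hemisphere longitudes would need converting before use. A small sketch of that conversion, under that assumption; the function name lon_east is hypothetical.

```shell
# Hypothetical helper: map a longitude in [-180,180] to degrees east in [0,360)
lon_east() {
    echo "$1" | awk '{ lon=$1; if (lon < 0) lon += 360; printf "%.1f\n", lon }'
}

lon_east -120   # 120 W
lon_east -90    # 90 W
lon_east 10     # Already in degrees east, unchanged
```

The output keeps a decimal point on purpose: NCO interprets hyperslab limits with a decimal point as coordinate values and bare integers as indices.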
************************************************************************
Task 4: Accessing data stored remotely
************************************************************************
OPeNDAP server examples:
UCI DAP servers:
ncks -M -p http://dust.ess.uci.edu/cgi-bin/dods/nph-dods/dodsdata in.nc
ncrcat -O -C -D 3 -p http://dust.ess.uci.edu/cgi-bin/dods/nph-dods/dodsdata \
-l /tmp in.nc in.nc ~/foo.nc
Unidata DAP servers:
ncks -M -p http://motherlode.ucar.edu:8080/thredds/dodsC/testdods in.nc
ncrcat -O -C -D 3 -p http://motherlode.ucar.edu:8080/thredds/dodsC/testdods \
-l /tmp in.nc in.nc ~/foo.nc
NOAA DAP servers:
ncwa -O -C -a lat,lon,time -d lon,-10.,10. -d lat,-10.,10. -l /tmp -p \
http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis.dailyavgs/surface \
pres.sfc.1969.nc ~/foo.nc
LLNL PCMDI IPCC OPeNDAP Data Portal:
ncks -M -p http://username:password@esgcet.llnl.gov/cgi-bin/dap-cgi.py/ipcc4/sresa1b/ncar_ccsm3_0 pcmdi.ipcc4.ncar_ccsm3_0.sresa1b.run1.atm.mo.xml
Earth System Grid (ESG): http://www.earthsystemgrid.org
caseid='b30.025.ES01'
CCSM3.0 1% increasing CO2 run, T42_gx1v3, 200 years starting in year 400
Atmospheric post-processed data, monthly averages, e.g.,
/data/zender/tmp/b30.025.ES01.cam2.h0.TREFHT.0400-01_cat_0449-12.nc
/data/zender/tmp/b30.025.ES01.cam2.h0.TREFHT.0400-01_cat_0599-12.nc
ESG supports password-protected FTP access by registered users.
NCO uses the .netrc file, if present, for password-protected FTP access.
The syntax for accessing a single file is, e.g.,
ncks -O -D 3 \
-p ftp://climate.llnl.gov/sresa1b/atm/mo/tas/ncar_ccsm3_0/run1 \
-l /tmp tas_A1.SRESA1B_1.CCSM.atmm.2000-01_cat_2099-12.nc ~/foo.nc
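For reference, a .netrc entry conventionally has the form "machine HOST login USER password PASS". The sketch below writes such an entry to a scratch file rather than to the real ~/.netrc; the helper name netrc_add and the placeholder credentials are hypothetical.

```shell
# Hypothetical helper: append one FTP credential entry to a netrc-style file
# Usage: netrc_add file machine login password
netrc_add() {
    printf 'machine %s login %s password %s\n' "$2" "$3" "$4" >> "$1"
    chmod 600 "$1"   # Credentials should be readable only by the owner
}

# Demonstrate on a scratch file rather than the real ~/.netrc
fl_netrc=$(mktemp)
netrc_add "${fl_netrc}" climate.llnl.gov my_username my_password
cat "${fl_netrc}"
rm -f "${fl_netrc}"
```

Restricting the file to owner-only permissions matters because many FTP clients refuse to read a .netrc that other users can read.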
# Average surface air temperature tas for SRESA1B scenario
# This loop is illustrative and will not work until NCO correctly
# translates '*' to FTP 'mget' all remote files
for var in 'tas'; do
for scn in 'sresa1b'; do
for mdl in cccma_cgcm3_1 cccma_cgcm3_1_t63 cnrm_cm3 csiro_mk3_0 \
gfdl_cm2_0 gfdl_cm2_1 giss_aom giss_model_e_h giss_model_e_r \
iap_fgoals1_0_g inmcm3_0 ipsl_cm4 miroc3_2_hires miroc3_2_medres \
miub_echo_g mpi_echam5 mri_cgcm2_3_2a ncar_ccsm3_0 ncar_pcm1 \
ukmo_hadcm3 ukmo_hadgem1; do
for run in '1'; do
ncks -R -O -D 3 \
-p ftp://climate.llnl.gov/${scn}/atm/mo/${var}/${mdl}/run${run} \
-l ${DATA}/${scn}/atm/mo/${var}/${mdl}/run${run} \
'*' ${scn}_${mdl}_${run}_${var}_${yyyymm}_${yyyymm}.nc
done # end loop over run
done # end loop over mdl
done # end loop over scn
done # end loop over var
cd sresa1b/atm/mo/tas/ukmo_hadcm3/run1/
ncks -H -m -v lat,lon,lat_bnds,lon_bnds -M tas_A1.nc | more
bds -x 096 -y 073 -m 33 -o ${DATA}/data/dst_3.75x2.5.nc # ukmo_hadcm3
ncview ${DATA}/data/dst_3.75x2.5.nc
# msk_rgn is California mask on ukmo_hadcm3 grid
# area is correct area weight on ukmo_hadcm3 grid
ncks -A -v area,msk_rgn ${DATA}/data/dst_3.75x2.5.nc \
${DATA}/sresa1b/atm/mo/tas/ukmo_hadcm3/run1/area_msk_ukmo_hadcm3.nc
Template for standardized data:
${scn}_${mdl}_${run}_${var}_${yyyymm}_${yyyymm}.nc
e.g., raw data
${DATA}/sresa1b/atm/mo/tas/ukmo_hadcm3/run1/tas_A1.nc
becomes standardized data
Level 0: raw from IPCC site--no changes except for name
Make symbolic link name match raw data
Template: ${scn}_${mdl}_${run}_${var}_${yyyymm}_${yyyymm}.nc
ln -s -f tas_A1.nc sresa1b_ukmo_hadcm3_run1_tas_200101_209911.nc
area_msk_ukmo_hadcm3.nc
Level I: Add all variables (but not standardized in time)
to file containing msk_rgn and area
Template: ${scn}_${mdl}_${run}_${yyyymm}_${yyyymm}.nc
/bin/cp area_msk_ukmo_hadcm3.nc sresa1b_ukmo_hadcm3_run1_200101_209911.nc
ncks -A -v tas sresa1b_ukmo_hadcm3_run1_tas_200101_209911.nc \
sresa1b_ukmo_hadcm3_run1_200101_209911.nc
ncks -A -v pr sresa1b_ukmo_hadcm3_run1_pr_200101_209911.nc \
sresa1b_ukmo_hadcm3_run1_200101_209911.nc
If you already have the file, then:
mv sresa1b_ukmo_hadcm3_run1_200101_209911.nc foo.nc
/bin/cp area_msk_ukmo_hadcm3.nc sresa1b_ukmo_hadcm3_run1_200101_209911.nc
ncks -A -v tas,pr foo.nc sresa1b_ukmo_hadcm3_run1_200101_209911.nc
Level II: Correct # years, months
Template: ${scn}_${mdl}_${run}_${var}_${yyyymm}_${yyyymm}.nc
ncks -d time,....... file1.nc file2.nc
ncrcat file2.nc file3.nc sresa1b_ukmo_hadcm3_run1_200001_209912.nc
Level III: Many derived products from level II, e.g.,
A. Global mean timeseries
ncwa -w area -a lat,lon \
sresa1b_ukmo_hadcm3_run1_200001_209912.nc \
sresa1b_ukmo_hadcm3_run1_200001_209912_xy.nc
B. California average timeseries
ncwa -m msk_rgn -w area -a lat,lon \
sresa1b_ukmo_hadcm3_run1_200001_209912.nc \
sresa1b_ukmo_hadcm3_run1_200001_209912_xy_Cal.nc
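The naming template used for the standardized data above can be captured in a small shell function, so that the long filenames are generated rather than typed. This is only a sketch; the function name fl_std_get and its argument order are hypothetical.

```shell
# Hypothetical helper: build a standardized filename from the template
# ${scn}_${mdl}_${run}_${var}_${yyyymm}_${yyyymm}.nc
# Usage: fl_std_get scn mdl run var yyyymm_srt yyyymm_end
fl_std_get() {
    printf '%s_%s_%s_%s_%s_%s.nc\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

fl_std=$(fl_std_get sresa1b ukmo_hadcm3 run1 tas 200101 209911)
echo "${fl_std}"
# A Level 0 link could then be made with: ln -s -f tas_A1.nc "${fl_std}"
```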