Version Compatibility
Reference examples tested with: Python 3.10+, rasterio 1.4+, pandas 2.0+
Before using the code patterns, check that the installed versions match the ones listed above:
pip show rasterio pandas openpyxl
If the code raises ImportError, install the missing packages:
pip install rasterio pandas openpyxl
Overview
WorldClim provides global climate data as GeoTIFF raster files. Each .tif file is a grid covering the entire Earth, where each grid cell stores a climate value (e.g., temperature in °C or precipitation in mm). This skill automates the process of extracting climate values for specific geographic coordinates.
How It Works
- Input: Excel or CSV file containing sample coordinates (longitude, latitude)
- Data: WorldClim 2.1 bioclimatic GeoTIFF files (19 BIO variables, 1970-2000 average)
- Process: For each coordinate, find the corresponding grid cell and read its value
- Output: Original data plus extracted climate columns appended
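The "find the corresponding grid cell" step can be sketched with plain arithmetic. This is a minimal illustration, assuming a north-up global grid whose top-left corner is at (-180, 90) and whose cells are 1/6° wide (the 10m resolution); `cell_index` is a hypothetical helper, not part of the script. In practice rasterio performs the equivalent lookup from the transform stored in each GeoTIFF.

```python
import math


def cell_index(lon, lat, cell_deg=1 / 6, west=-180.0, north=90.0):
    """Return (row, col) of the grid cell containing (lon, lat).

    Assumes a north-up grid: columns count eastward from `west`,
    rows count southward from `north`.
    """
    col = math.floor((lon - west) / cell_deg)
    row = math.floor((north - lat) / cell_deg)
    return row, col


# Under these assumptions a 10 arc-minute global grid is 2160 columns by 1080 rows.
n_cols = round(360 / (1 / 6))
n_rows = round(180 / (1 / 6))

row, col = cell_index(117.214052, 31.270421)
print(f"grid {n_rows} x {n_cols}, sample falls in row {row}, column {col}")
```

Every coordinate inside the same cell returns the same value, which is why nearby samples can end up with identical climate columns at coarse resolutions.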
Grid Resolution
| Resolution | Cell Size | Approx. Cell Width (at equator) | File Size |
|---|---|---|---|
| 10m (10 arc-minutes) | 0.167° | ~18.5 km | ~48 MB zip |
| 5m (5 arc-minutes) | 0.083° | ~9.3 km | ~170 MB zip |
| 2.5m (2.5 arc-minutes) | 0.042° | ~4.6 km | ~650 MB zip |
Default: 10m (10 arc-minutes), which is sufficient for most ecological and population genetics studies.
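As a rough check on how an arc-minute resolution translates to ground distance, the sketch below converts arc-minutes to degrees and then to kilometres. It assumes about 111.32 km per degree at the equator and a simple cosine correction for east-west width at other latitudes; `cell_size_km` is a hypothetical helper introduced only for illustration.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree at the equator


def cell_size_km(arc_minutes, lat=0.0):
    """Return approximate (width_km, height_km) of one grid cell at latitude `lat`."""
    deg = arc_minutes / 60
    height = deg * KM_PER_DEGREE
    # East-west width shrinks with the cosine of the latitude.
    width = height * math.cos(math.radians(lat))
    return width, height


for res in (10, 5, 2.5):
    width, height = cell_size_km(res)
    print(f"{res} arc-min: {res / 60:.3f} deg, about {width:.1f} km wide at the equator")
```

At the equator this gives roughly 18.5, 9.3, and 4.6 km for the three resolutions. Cells become narrower toward the poles, so a 10m cell at 60° latitude is only about half as wide.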
Quick Start
Using the CLI Script
A reusable Python script is provided at {baseDir}/extract_worldclim.py:
# Extract BIO1 (annual mean temp) and BIO12 (annual precipitation) — default
python3 {baseDir}/extract_worldclim.py \
-i samples.xlsx \
-o samples_with_climate.xlsx
# Extract all 19 bioclimatic variables
python3 {baseDir}/extract_worldclim.py \
-i samples.xlsx \
-o samples_all_bio.xlsx \
--bios 1-19
# Extract specific variables with custom column names
python3 {baseDir}/extract_worldclim.py \
-i coords.csv \
-o result.xlsx \
--bios 1,5,6,12,13 \
--res 2.5m \
--lon longitude \
--lat latitude
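The --bios values above use two forms: a range (1-19) and a comma-separated list (1,5,6,12,13). The script's own parser is not shown here, so the following is only a sketch of how such a specification could be expanded; `parse_bios` is a hypothetical name.

```python
def parse_bios(spec):
    """Expand a spec such as "1-19" or "1,5,6,12,13" into a sorted list of ints."""
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "1-19" includes both endpoints.
            start, end = (int(x) for x in part.split("-", 1))
            selected.update(range(start, end + 1))
        else:
            selected.add(int(part))
    invalid = sorted(n for n in selected if not 1 <= n <= 19)
    if invalid:
        raise ValueError(f"BIO numbers must be between 1 and 19, got {invalid}")
    return sorted(selected)


print(parse_bios("1,5,6,12,13"))
print(parse_bios("1-3,12"))
```

Validating against the 1 to 19 range up front avoids requesting a GeoTIFF that does not exist.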
Using Python Directly
For custom integration or programmatic use:
import pandas as pd
import rasterio


def extract_bio(tif_path, lon, lat):
    """Extract a single value from a GeoTIFF at given coordinates."""
    with rasterio.open(tif_path) as src:
        # sample() yields one array per point, with one element per band
        return next(src.sample([(lon, lat)]))[0]


# Read sample coordinates (经度 = longitude, 纬度 = latitude)
df = pd.read_excel("samples.xlsx")
coords = list(zip(df["经度"], df["纬度"]))  # rasterio expects (x, y) = (lon, lat)

# Extract BIO1 (Annual Mean Temperature) into the column 年均温度_C
with rasterio.open("wc2.1_10m_bio_1.tif") as src:
    df["年均温度_C"] = [v[0] for v in src.sample(coords)]

# Extract BIO12 (Annual Precipitation) into the column 年降水量_mm
with rasterio.open("wc2.1_10m_bio_12.tif") as src:
    df["年降水量_mm"] = [v[0] for v in src.sample(coords)]

df.to_excel("samples_with_climate.xlsx", index=False)
WorldClim Data Download
Automatic (script handles it)
The CLI script auto-downloads data on first run to the --cache directory (default: ./worldclim_data).
Manual Download
If automatic download fails (e.g., network issues):
# 10m resolution (~48 MB)
curl -O https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_10m_bio.zip
unzip wc2.1_10m_bio.zip -d ./worldclim_data/
# 2.5m resolution (~650 MB)
curl -O https://geodata.ucdavis.edu/climate/worldclim/2_1/base/wc2.1_2.5m_bio.zip
unzip wc2.1_2.5m_bio.zip -d ./worldclim_data/
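After a manual download it can help to confirm that the expected GeoTIFFs are in the cache directory. The sketch below builds names from the patterns seen in this document (wc2.1_10m_bio.zip and wc2.1_10m_bio_1.tif). It assumes the 5m files follow the same pattern and that the archive unpacks flat into the cache directory; the helper names are hypothetical.

```python
from pathlib import Path

BASE_URL = "https://geodata.ucdavis.edu/climate/worldclim/2_1/base"


def bio_zip_url(res="10m"):
    """Download URL for the bioclimatic archive at a given resolution."""
    return f"{BASE_URL}/wc2.1_{res}_bio.zip"


def bio_tif_name(bio, res="10m"):
    """Expected GeoTIFF file name for one BIO variable."""
    return f"wc2.1_{res}_bio_{bio}.tif"


def missing_tifs(cache_dir, bios, res="10m"):
    """List the expected file names that are not present in cache_dir."""
    cache = Path(cache_dir)
    return [
        bio_tif_name(b, res)
        for b in bios
        if not (cache / bio_tif_name(b, res)).exists()
    ]


print(bio_zip_url("2.5m"))
print(missing_tifs("./worldclim_data", [1, 12]))
```

An empty list from `missing_tifs` means every requested variable is already available locally.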
BIO Variable Reference
| BIO | Name | Unit | Description |
|---|---|---|---|
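| BIO1 | Annual Mean Temperature | °C | Annual mean temperature |
| BIO2 | Mean Diurnal Range | °C | Monthly mean of the diurnal temperature range |
| BIO3 | Isothermality | % | Isothermality (BIO2/BIO7 × 100) |
| BIO4 | Temperature Seasonality | SD × 100 | Temperature seasonality |
| BIO5 | Max Temp of Warmest Month | °C | Maximum temperature of the warmest month |
| BIO6 | Min Temp of Coldest Month | °C | Minimum temperature of the coldest month |
| BIO7 | Temperature Annual Range | °C | Annual temperature range (BIO5−BIO6) |
| BIO8 | Mean Temp of Wettest Quarter | °C | Mean temperature of the wettest quarter |
| BIO9 | Mean Temp of Driest Quarter | °C | Mean temperature of the driest quarter |
| BIO10 | Mean Temp of Warmest Quarter | °C | Mean temperature of the warmest quarter |
| BIO11 | Mean Temp of Coldest Quarter | °C | Mean temperature of the coldest quarter |
| BIO12 | Annual Precipitation | mm | Annual precipitation |
| BIO13 | Precipitation of Wettest Month | mm | Precipitation of the wettest month |
| BIO14 | Precipitation of Driest Month | mm | Precipitation of the driest month |
| BIO15 | Precipitation Seasonality | CV | Precipitation seasonality |
| BIO16 | Precipitation of Wettest Quarter | mm | Precipitation of the wettest quarter |
| BIO17 | Precipitation of Driest Quarter | mm | Precipitation of the driest quarter |
| BIO18 | Precipitation of Warmest Quarter | mm | Precipitation of the warmest quarter |
| BIO19 | Precipitation of Coldest Quarter | mm | Precipitation of the coldest quarter |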
Data Source: WorldClim 2.1 (1970-2000, 30-year average)
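Two of the variables in the table are derived from others: BIO7 = BIO5 − BIO6 and BIO3 = BIO2/BIO7 × 100. The sketch below works through that arithmetic. The input numbers are made up for illustration and do not come from any WorldClim file; the function names are hypothetical.

```python
def temperature_annual_range(bio5, bio6):
    """BIO7: max temp of warmest month minus min temp of coldest month (°C)."""
    return bio5 - bio6


def isothermality(bio2, bio7):
    """BIO3: mean diurnal range as a percentage of the annual range."""
    return bio2 / bio7 * 100


# Illustrative values only (°C), not real extracted data.
bio2, bio5, bio6 = 8.0, 32.0, -1.0

bio7 = temperature_annual_range(bio5, bio6)
bio3 = isothermality(bio2, bio7)
print(f"BIO7 = {bio7:.1f} °C, BIO3 = {bio3:.1f} %")
```

Recomputing BIO7 and BIO3 from extracted BIO2, BIO5, and BIO6 values is a quick sanity check that the columns were matched to the right files, allowing for small rounding differences.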
Input Format Requirements
Required Columns
- Longitude column: Decimal degrees, range [-180, 180]. Default column name: 经度 ("longitude"; override with --lon)
- Latitude column: Decimal degrees, range [-90, 90]. Default column name: 纬度 ("latitude"; override with --lat)
Supported Input Formats
- .xlsx: Excel workbook (recommended, handles Chinese headers well)
- .csv: Comma-separated values
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Coordinates read as text | Hidden special characters (e.g., \xa0 non-breaking space) | Script auto-cleans with pd.to_numeric(errors='coerce'); check for NA after conversion |
| Negative longitudes rejected | Using East/West format instead of signed decimal | Convert to signed decimal: 117°E → 117.0; 117°W → -117.0 |
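| Missing extracted values | Coordinate falls in ocean or outside raster bounds | Check coordinate validity; WorldClim covers land globally, so ocean cells carry the raster's nodata value (src.nodata in rasterio) rather than a climate value |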
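The first two issues in the table can be handled before extraction with a small cleaning step. The sketch below is an assumption about how that could look, not the script's actual logic: `parse_coordinate` is a hypothetical helper that strips non-breaking spaces, accepts an optional degree sign and hemisphere letter, and rejects out-of-range values.

```python
import re

# Signed decimal number, optional degree sign, optional hemisphere letter.
_COORD = re.compile(r"([+-]?\d+(?:\.\d+)?)\s*°?\s*([NSEWnsew]?)")


def parse_coordinate(raw, limit):
    """Return a signed decimal degree, or None if `raw` cannot be used.

    `limit` is 180 for longitude and 90 for latitude.
    """
    if raw is None:
        return None
    text = str(raw).replace("\xa0", " ").strip()  # drop non-breaking spaces
    match = _COORD.fullmatch(text)
    if match is None:
        return None
    value = float(match.group(1))
    if match.group(2).upper() in ("W", "S"):
        value = -abs(value)  # west and south are negative
    if not -limit <= value <= limit:
        return None
    return value


print(parse_coordinate("\xa0117.214052", 180))
print(parse_coordinate("117°W", 180))
print(parse_coordinate("95.0", 90))
```

Returning None for unusable input keeps bad rows easy to find and drop before sampling, in the same spirit as checking for NA after pd.to_numeric(errors='coerce').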
Output Format
The output file contains all original columns plus extracted BIO columns:
名称 经度 纬度 年均温度_C 年降水量_mm
NFAL10 117.214052 31.270421 16.15 1325.0
NFBJ1 116.591445 40.032115 11.88 542.0
Using R (terra) for Cross-Validation
If you need to validate results with R:
library(terra)

# Read raster stack
bio <- rast(list.files("./worldclim_data", pattern = "\\.tif$", full.names = TRUE))

# Read and clean coordinates (经度 = longitude, 纬度 = latitude)
pts <- readxl::read_excel("samples.xlsx")
pts$经度 <- as.numeric(gsub("\\s+", "", pts$经度))  # Remove hidden spaces
pts$纬度 <- as.numeric(gsub("\\s+", "", pts$纬度))  # Same cleaning for latitude
pts <- pts[!is.na(pts$经度) & !is.na(pts$纬度), ]    # Drop rows that failed conversion

# Extract values for every layer; the first column of the result is the point ID
v <- vect(pts, geom = c("经度", "纬度"), crs = "EPSG:4326")
result <- extract(bio, v)
write.csv(cbind(pts, result[, -1]), "output.csv", row.names = FALSE)
Note: R's as.numeric() returns NA (with a warning) for strings it cannot parse, which can include values containing hidden characters such as a non-breaking space. Always clean coordinates before conversion.
Decision Tree
Need to extract climate data for sample coordinates?
├── Have coordinates in Excel/CSV?
│ └── Use the CLI script: python3 extract_worldclim.py -i input.xlsx -o output.xlsx
├── Need only temperature and precipitation?
│ └── Default: --bios 1,12 (no need to specify)
├── Need all 19 bioclimatic variables?
│ └── Use: --bios 1-19
├── Need higher spatial resolution?
│ ├── ~9 km cells → --res 5m
│ └── ~4.6 km cells → --res 2.5m
└── Need to integrate into a Python pipeline?
└── Use the direct Python code pattern with rasterio.sample()
Related Skills
- bio-geo-data — For general geospatial data operations
- bio-read-sequences — For biological sequence file parsing
- bio-batch-processing — For processing multiple files in batch