Data Structure and Schema

GRAIN Attributes

The GRAIN Dataset is distributed as country-scale files in two formats, a lightweight GeoParquet format and as ESRI Shapefiles, to ensure compatibility with legacy GIS software. All files are projected to EPSG:4326 based on the WGS-84 datum as per standard geospatial data convention. The attribute scheme for each GRAIN canal is given in the table below.
Field Type Unit Description
grain_id String Unique identifier – format: ISO3_PfafstetterL6ID_seq.numbering.
osm_id String OpenStreetMap feature ID.
country String Country where the canal segment is located.
continent String Continent in which the canal lies.
country_iso String ISO-3 country code.
length_KM Float km Canal path length.
slope_mkm Float m km⁻¹ Longitudinal slope derived from SRTM DEM.
elev_diff_M Float m Elevation difference between start and end points.
predicted_class String ML-classified label ("canal" or "river").
confidence Float Prediction confidence of ML classifier (0–1).
osm_label String Original OSM waterway label.
osm_name String Canal name from OSM (if available).
alt_name String Alternate canal name detected from OSM tags.
tags String Raw OSM tags (JSON string).
canal_use String Canal use class (e.g., Agricultural, Urban, Navigation).
koppen_class_code String Köppen–Geiger climate-zone code.
update_date String Date of dataset creation or latest update.
version String GRAIN dataset release version number.