--- title: "Data Cubes with STAC" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Data Cubes with STAC} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- High-resolution satellites generate many snapshot images each with a limited field of view or spatial extent. In order to see a larger area in space, and/or observe changes across space and time, we need to assemble these many snapshots into a mosaic or "data cube" that we can analyze as a cohesive whole. [EARTHDATA STAC CATALOGS](https://radiantearth.github.io/stac-browser/#/external/cmr.earthdata.nasa.gov/stac/) ```r library(rstac) library(gdalcubes) gdalcubes_options(parallel = TRUE) ``` ### Earth Data Authentication First let's get EDL authentication out of the way. For cloud data from almost any other STAC catalog (NOAA, USGS, Planetary Computer, etc), authentication is either unnecessary or already provided by the STAC API, but NASA EDL is special. As usual, we can handle this with `edl_netrc()`. Because the `gdalcubes` package doesn't respect global environmental variables, we use a helper utility to export those into it's configuration as well. ```r library(earthdatalogin) edl_netrc() with_gdalcubes() ``` ## Search via STAC We will now use the `rstac` package to search one or more NASA collections for data that falls into our desired bounding box of space and time: ```r bbox <- c(xmin=-123, ymin=37.25, xmax=-122.0, ymax=38.25) start <- "2021-12-01" end <- "2022-05-31" # Find all assets from the desired catalog: items <- stac("https://cmr.earthdata.nasa.gov/stac/LPCLOUD") |> stac_search(collections = "HLSL30.v2.0", bbox = bbox, datetime = paste(start,end, sep = "/")) |> post_request() |> items_fetch() |> items_filter(properties[["eo:cloud_cover"]] < 20) #> | |==========================================================================================================================================================================================================================================================| 100% ``` Note that 98 features have matched our search criteria! Each feature represents a 'snapshot' image taken by the satellite as it passes by (this is a harmonized product so actually there's quite a lot of post-processing.) Each feature thus shares the same bounding box, projection, and timestamp, but may consist of many different 'assets', different COG files representing the different spectral bands on the satellite camera instrument. Each feature can potentially include quite extensive metadata about the feature, including details of instrument itself or from post-processing, such as cloud cover. Unfortunately, EarthData's STAC metadata tends to be quite sparse. ## Building a Data Cube ```r # Desired data cube shape & resolution v = cube_view(srs = "EPSG:4326", extent = list(t0 = as.character(start), t1 = as.character(end), left = bbox[1], right = bbox[3], top = bbox[4], bottom = bbox[2]), nx = 512, ny = 512, dt = "P1M") ``` ```r # RGB bands + cloud cover mask col <- stac_image_collection(items$features, asset_names = c("B02", "B03", "B04", "Fmask")) # use a cloud mask -- see # https://lpdaac.usgs.gov/documents/1326/HLS_User_Guide_V2.pdf cloud_mask <- image_mask("Fmask", values=1) ``` ```r # Here we go! note eval is lazy raster_cube(col, v, mask=cloud_mask) |> select_bands(c("B02","B03", "B04")) |> plot(rgb=3:1) ``` ![plot of chunk sf_bay](img/sf_bay-1.png)