When to use this

myIO’s default SVG path handles up to about 20,000 rendered marks comfortably. Beyond that, the big-data tier uses a coordinator, an engine adapter, and optional Canvas or WebGL rendering so charts can respond to brush and zoom interactions over millions of rows. Opt in per chart by calling setBigData(widget, source).

Installation

Big-data features require two optional components.

First, install the Suggested R packages used for Arrow encoding, DuckDB queries, downloads, checksums, and status output:

install.packages(c("arrow", "duckdb", "DBI", "base64enc", "cli", "curl", "openssl"))

Second, install the DuckDB-WASM runtime when you plan to use the browser engine. The runtime is downloaded on demand by myIO::install_duckdb_wasm() and is not bundled in the CRAN tarball. It is cached under tools::R_user_dir("myIO", "cache"). For airgapped machines, place duckdb-mvp.wasm and duckdb-browser-mvp.worker.js in a local directory and call install_duckdb_wasm(from = "/path/to/dir").

install.packages(c("arrow", "duckdb", "DBI", "base64enc", "cli", "curl", "openssl"))
myIO::install_duckdb_wasm()
myIO::duckdb_wasm_status()
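
For an airgapped install, it may help to confirm that both runtime files are present before pointing install_duckdb_wasm() at a directory. The following is a minimal sketch in base R only. The helper name check_wasm_dir() is hypothetical and not part of myIO; the two file names are the ones listed above.

```r
# Hypothetical helper (not part of myIO): report which of the two
# DuckDB-WASM runtime files named above are missing from a directory.
check_wasm_dir <- function(dir) {
  required <- c("duckdb-mvp.wasm", "duckdb-browser-mvp.worker.js")
  present <- file.exists(file.path(dir, required))
  required[!present]
}

# Demonstration with a temporary directory that holds only one of the files.
demo_dir <- file.path(tempdir(), "duckdb-wasm-demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "duckdb-mvp.wasm"))

missing_files <- check_wasm_dir(demo_dir)
print(missing_files)

# The cache location mentioned above (tools::R_user_dir needs R >= 4.0.0).
cache_dir <- tools::R_user_dir("myIO", "cache")
print(cache_dir)
```

When check_wasm_dir() returns an empty character vector, the directory should be ready to pass as install_duckdb_wasm(from = dir).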

Attaching big data

setBigData() accepts several source types. A data.frame is encoded as inline Arrow IPC. This is convenient for portable HTML, but it warns above 50 MB and hard-errors above 200 MB.

\dontrun{
library(myIO)

big <- data.frame(
  id = seq_len(1e6),
  x = rnorm(1e6),
  y = rnorm(1e6)
)

myIO(engine = "wasm") |>
  addIoLayer(type = "point", label = "points",
             mapping = list(x_var = "x", y_var = "y")) |>
  setBigData(big, rowkey_col = "id")
}
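
The 50 MB and 200 MB limits can be approximated before calling setBigData(). This is a rough sketch, not myIO's own check: classify_inline_size() is a hypothetical helper, 1 MB is assumed to mean 1024^2 bytes, and object.size() measures the in-memory R object rather than the encoded Arrow IPC payload.

```r
# Hypothetical helper (not part of myIO): classify a payload size against
# the 50 MB warning and 200 MB error limits described above.
# Assumption: 1 MB = 1024^2 bytes.
classify_inline_size <- function(bytes, warn_mb = 50, error_mb = 200) {
  mb <- bytes / 1024^2
  if (mb > error_mb) {
    "error"
  } else if (mb > warn_mb) {
    "warn"
  } else {
    "ok"
  }
}

# object.size() reports the in-memory size of the R object, which is only
# a rough proxy for the size of the encoded Arrow IPC stream.
small <- data.frame(id = seq_len(1000), x = rnorm(1000))
approx_bytes <- as.numeric(object.size(small))
status <- classify_inline_size(approx_bytes)
print(status)
```

A result of "warn" or "error" suggests writing the data to a file and passing the path instead.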

An arrow::Table uses the same inline IPC path.

\dontrun{
library(arrow)
library(myIO)

tab <- arrow_table(big)

myIO(engine = "wasm") |>
  addIoLayer(type = "point", label = "points",
             mapping = list(x_var = "x", y_var = "y")) |>
  setBigData(tab, rowkey_col = "id")
}

For larger static assets, pass a local path or URL ending in .parquet, .csv, .arrow, or .feather.

\dontrun{
myIO(engine = "wasm") |>
  addIoLayer(type = "histogram", label = "x",
             mapping = list(x_var = "x")) |>
  setBigData("data/observations.parquet", rowkey_col = "id")

myIO(engine = "wasm") |>
  addIoLayer(type = "point", label = "remote",
             mapping = list(x_var = "x", y_var = "y")) |>
  setBigData("https://example.org/observations.csv", rowkey_col = "id")
}
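
A quick pre-flight check on the extension can catch unsupported sources early. The sketch below uses tools::file_ext() from base R; has_supported_extension() is a hypothetical helper, not a myIO function, and it does not handle URLs that carry a query string.

```r
# Hypothetical helper (not part of myIO): check whether a path or URL ends
# in one of the extensions listed above. The comparison is case-insensitive,
# which is an assumption of this sketch.
has_supported_extension <- function(source) {
  supported <- c("parquet", "csv", "arrow", "feather")
  tolower(tools::file_ext(source)) %in% supported
}

sources <- c(
  "data/observations.parquet",
  "https://example.org/observations.csv",
  "data/observations.xlsx"
)
supported_flags <- has_supported_extension(sources)
print(supported_flags)
```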

A DBI connection works only with the "server" engine. Provide table = "..." so myIO can read the schema.

\dontrun{
library(DBI)
library(duckdb)
library(myIO)

con <- dbConnect(duckdb())
dbWriteTable(con, "observations", big)

myIO(engine = "server") |>
  addIoLayer(type = "point", label = "points",
             mapping = list(x_var = "x", y_var = "y")) |>
  setBigData(con, table = "observations", rowkey_col = "id")
}

The engine argument

Set engine on myIO() to "auto", "server", "wasm", or "svg".

"auto" is the default: a Shiny session resolves to "server"; otherwise it resolves to "wasm".

"server" runs queries in R with duckdb and streams Arrow batches to the browser, which is a good fit when a Shiny server already exists.

"wasm" runs DuckDB in the browser from the cached WASM runtime, which fits static Quarto or R Markdown HTML.

"svg" forces the legacy SVG path without the coordinator and is mainly useful for testing.
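
The "auto" rule can be written out as a small function. This is an illustrative sketch, not myIO's implementation: resolve_engine() and its in_shiny flag are hypothetical, and a real check for a running Shiny session is left to the caller.

```r
# Hypothetical sketch (not myIO's implementation) of the "auto" rule
# described above. `in_shiny` stands in for whatever session detection
# the package actually performs.
resolve_engine <- function(engine = c("auto", "server", "wasm", "svg"),
                           in_shiny = FALSE) {
  engine <- match.arg(engine)
  if (engine != "auto") {
    return(engine)  # an explicit choice is kept as-is
  }
  if (in_shiny) "server" else "wasm"
}

resolved_static <- resolve_engine("auto", in_shiny = FALSE)
resolved_shiny <- resolve_engine("auto", in_shiny = TRUE)
resolved_forced <- resolve_engine("svg", in_shiny = TRUE)
print(c(resolved_static, resolved_shiny, resolved_forced))
```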

Crosstalk threshold

By default, myIO broadcasts row keys to crosstalk::SharedData only when the selected row count is at or below 100,000. At or below the threshold, sibling htmlwidgets such as plotly, leaflet, and reactable can react to myIO brushes. Above it, upward broadcast is suppressed: myIO-to-myIO linking still works through predicates, a one-time informational message is written to the browser console, and the footer badge reads linked: predicate-only.

Tune the limit with:

options(myIO.crosstalk_threshold = 50000L)

The threshold is per selection, not per chart. A narrow brush on a million-row source can still broadcast if it matches few rows.
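
The per-selection rule can be sketched as follows. should_broadcast() is a hypothetical helper that mirrors the description above, not the package's internal logic; only the option name myIO.crosstalk_threshold and the 100,000 default come from the text.

```r
# Hypothetical sketch (not myIO's implementation): decide whether a
# selection of `n_selected` rows would be broadcast to crosstalk.
should_broadcast <- function(n_selected) {
  threshold <- getOption("myIO.crosstalk_threshold", default = 100000L)
  n_selected <= threshold
}

# Lower the limit temporarily, as shown above.
old <- options(myIO.crosstalk_threshold = 50000L)
narrow_brush <- should_broadcast(1200)    # few matching rows
wide_brush <- should_broadcast(75000)     # more rows than the lowered limit
options(old)                              # restore the previous setting

# With the option unset, the default of 100,000 applies again.
default_wide <- should_broadcast(75000)
print(c(narrow_brush, wide_brush, default_wide))
```

The source size never enters the comparison: only the number of rows matched by the brush does.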

File-protocol limitation

When a Quarto or R Markdown HTML file is opened directly from the file manager over the file:// protocol, Chromium blocks dynamic module imports. myIO detects this and falls back to the SVG path, writing a one-time informational message to the browser console. To use the WASM engine with a local static HTML file, serve it over HTTP with servr::httd() or quarto preview.
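
Serving the output directory from R might look like the sketch below. The directory name "docs" is a placeholder, and the call is guarded so it only runs in an interactive session where servr is installed.

```r
# Directory holding the rendered HTML; "docs" is a placeholder path.
site_dir <- "docs"

# servr::httd() serves a directory over http://, so the page is no longer
# loaded through file://. Guarded so that it only runs interactively and
# only when the servr package is available.
if (interactive() && requireNamespace("servr", quietly = TRUE)) {
  servr::httd(site_dir)
}
```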

Performance expectations

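| Input rows | Engine            | Renderer        | Interaction                                                      |
|------------|-------------------|-----------------|------------------------------------------------------------------|
| <= 20k     | svg (default)     | D3 SVG          | Full brush/zoom, publication-quality                             |
| 20k-100k   | svg + aggregation | D3 SVG          | Smooth; tooltips on pre-aggregated data                          |
| 100k-1M    | wasm or server    | Canvas or WebGL | Sub-200 ms brush re-aggregation (WASM), sub-500 ms (server Shiny) |
| 1M-10M     | wasm or server    | WebGL           | Target: 60 fps pan/zoom; brush re-aggregation < 300 ms           |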
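
The tiers above can be turned into a simple lookup when choosing settings programmatically. suggest_tier() is a hypothetical helper, not part of myIO. The table's ranges share their endpoints, so treating each upper bound as inclusive is an assumption of this sketch.

```r
# Hypothetical helper (not part of myIO): look up the engine and renderer
# listed in the table above for a given number of input rows.
# Assumption: each upper bound is inclusive.
suggest_tier <- function(n_rows) {
  if (n_rows <= 2e4) {
    list(engine = "svg", renderer = "D3 SVG")
  } else if (n_rows <= 1e5) {
    list(engine = "svg + aggregation", renderer = "D3 SVG")
  } else if (n_rows <= 1e6) {
    list(engine = "wasm or server", renderer = "Canvas or WebGL")
  } else if (n_rows <= 1e7) {
    list(engine = "wasm or server", renderer = "WebGL")
  } else {
    stop("No tier is listed above 10M rows.")
  }
}

tier_small <- suggest_tier(5000)
tier_mid <- suggest_tier(250000)
tier_large <- suggest_tier(5e6)
print(tier_mid)
```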

Limits and gotchas

Inline IPC above 200 MB hard-errors; use a file path or a DBI connection instead.

The Crosstalk threshold applies to the number of rows that match the current selection, not to the size of the source.

The WASM binary is about 22 MB, downloads once per user per version, and is cached indefinitely; clear it with clear_duckdb_wasm_cache().

On Posit Connect or shinyapps.io, use the "server" engine; install_duckdb_wasm() is not needed on the server.

Minimal complete example

\dontrun{
library(myIO)

# One-time setup; skip if the packages and the WASM runtime are already installed.
install.packages(c("arrow", "duckdb", "DBI", "base64enc", "cli", "curl", "openssl"))
myIO::install_duckdb_wasm()

set.seed(1)
events <- data.frame(
  id = seq_len(250000),
  time = as.POSIXct("2026-01-01", tz = "UTC") + seq_len(250000),
  x = rnorm(250000),
  y = rnorm(250000),
  group = sample(LETTERS[1:4], 250000, replace = TRUE)
)

myIO(engine = "wasm") |>
  addIoLayer(type = "point", label = "events",
             mapping = list(x_var = "x", y_var = "y", color = "group")) |>
  setBrush(direction = "xy") |>
  setBigData(events, rowkey_col = "id")
}