library(pipfun)

Intro

The pipfun logging system tracks messages, metadata, function arguments, and call traces across your R codebase. It is aimed especially at package development.

Logs are stored in memory and can be written to disk as .qs files for persistence.

Use this system to:

  • Log errors, warnings, and informative messages
  • Capture arguments, return values, and internal states
  • Filter, summarize, and debug across modules or packages
  • Persist logs between sessions or across users

Quick Start

log_name <- "mylog"
log_init(log_name)

log_info("This is an info message", name = log_name)
log_warn("Something looks odd", name = log_name)
log_error("An error occurred", name = log_name)
log_has_errors(log_name)
#> [1] TRUE
log_summary(log_name)
#>      event count
#>     <char> <int>
#> 1:    info     1
#> 2: warning     1
#> 3:   error     1
log_filter(log_name, event = "error")
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:12.403908] ERROR — `An error occurred`
#> Function: `eval(expr, envir)` (from global)
#> Trace: log_error("An error occurred", name = log_name)
#> 

Logging Concepts

Each log entry is a row in a data.table, with fields such as:

  • time: timestamp
  • event: usually one of “info”, “warning”, or “error” (log_add() also accepts custom strings)
  • message: custom message
  • fun, package: where it was logged from
  • args: captured arguments
  • output: result (optional)
  • trace: traceback object
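
To make the field list concrete, here is a minimal sketch of a single entry built by hand. It uses a base data.frame with list-columns rather than a data.table so that it runs without extra packages. The column names follow the list above, but the exact column types pipfun uses are an assumption, and "mypkg" is a hypothetical package name.

```r
# One hand-built log entry; list-columns hold arbitrary R objects per row.
entry <- data.frame(
  time    = Sys.time(),
  event   = "info",
  message = "Process started",
  fun     = "get_data",
  package = "mypkg",            # hypothetical package name
  stringsAsFactors = FALSE
)
entry$args   <- list(list(path = "input.csv", clean = TRUE))
entry$output <- list(NULL)      # no output recorded for this entry
entry$trace  <- list(quote(get_data("input.csv")))

str(entry$args[[1]])
```

Because args, output, and trace are list-columns, each row can carry a different kind of object without changing the table's shape.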

Creating and Managing Logs

Initialize a log

log_init("devlog")

Add entries

Use the helpers for most use cases:

log_info("Process started", name = "devlog", logmeta = list(step = 1))
log_warn("Unexpected format", name = "devlog", logmeta = list(column = "status"))
log_error("Failed to compute results", name = "devlog", logmeta = list(file = "input.csv"))

Filter and Query

log_filter(name = "devlog", event = "error")
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:12.686249] ERROR — `Failed to compute results`
#> Function: `eval(expr, envir)` (from global)
#> Trace: log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))
#> Metadata: list(file = "input.csv")
#> 
log_filter(name = "devlog", fun = "my_function")
#> 
#> ── Log entries: ──
#> 
#>  The log is empty.
log_filter(name = "devlog", after = Sys.Date() - 1)
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:12.683905] INFO — `Process started`
#> Function: `eval(expr, envir)` (from global)
#> Trace: log_info("Process started", name = "devlog", logmeta = list(step = 1))
#> Metadata: list(step = 1)
#> 
#> → [2025-07-29 15:03:12.685167] WARNING — `Unexpected format`
#> Function: `eval(expr, envir)` (from global)
#> Trace: log_warn("Unexpected format", name = "devlog", logmeta = list(column =
#> "status"))
#> Metadata: list(column = "status")
#> 
#> → [2025-07-29 15:03:12.686249] ERROR — `Failed to compute results`
#> Function: `eval(expr, envir)` (from global)
#> Trace: log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))
#> Metadata: list(file = "input.csv")
#> 

Using Logs in Package Development

pipfun is especially helpful when building packages.

1. Initialize at load

In your package’s zzz.R:

.onLoad <- function(libname, pkgname) {
  pipfun::log_init("pipapi_log")
}

2. Use in functions


# The log is initialized here for the example; in a package, do this in zzz.R or elsewhere.
log_init("pipapi_log", overwrite = TRUE)

get_data <- function(path, clean = TRUE) {
  log_info("Starting data load", name = "pipapi_log",
           logmeta = list(path = path, clean = clean))

  data <- tryCatch({
    read.csv(path)
  }, error = function(e) {
    log_error("Failed to read CSV", name = "pipapi_log",
              logmeta = list(error = e$message))
    stop(e)
  })

  log_info("Data loaded", name = "pipapi_log",
           logmeta = list(n = nrow(data)))
  return(data)
}

3. Use with tryCatch for robust pipelines

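# Assumes log_init("pipeline_log") was called earlier.
# `expr` must be a zero-argument function so it is evaluated inside tryCatch().
run_step <- function(step_name, expr) {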
  tryCatch({
    result <- expr()
    log_info("Step completed", 
             name = "pipeline_log", 
             logmeta = list(step = step_name))
    result
  }, error = function(e) {
    log_error("Step failed", 
              name = "pipeline_log", 
              logmeta = list(step = step_name, error = e$message))
    NULL
  })
}
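
The following sketch shows how such a step runner might be chained. To keep it runnable without pipfun, a hypothetical stand-in logger note() replaces the log_info() and log_error() calls; everything else follows the pattern above.

```r
# Stand-in logger so the sketch runs without pipfun;
# in real code these would be log_info() / log_error() calls.
events <- character()
note <- function(event, step) events <<- c(events, paste(event, step))

run_step <- function(step_name, expr) {
  tryCatch({
    result <- expr()
    note("info", step_name)
    result
  }, error = function(e) {
    note("error", step_name)
    NULL
  })
}

# Each step is wrapped in function() so it is evaluated inside tryCatch().
raw   <- run_step("load",  function() mtcars)
clean <- run_step("clean", function() stop("bad column"))

# A failed step returns NULL, so check before continuing.
if (is.null(clean)) clean <- raw
events
```

Returning NULL on failure keeps the pipeline alive, but it also means every downstream step has to check for NULL, as the last lines do.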

4. Enable or Disable Logging Dynamically

You can design your functions to respect a log = TRUE argument:

foo <- function(x, y, log = TRUE) {
  if (log) {
    log_info("Starting foo()",
             logmeta = list(x = x, y = y))
  }
  result <- x + y
  if (log) {
    log_info("Finished foo()",
             logmeta = list(result = result))
  }
  result
}

This makes it easy to disable logging in tests or benchmarks:

foo(3, 4, log = FALSE)
#> [1] 7
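
A variation on the same idea is to take the default for log from a session option, so logging can be switched off once rather than at every call. This is only a sketch: the option name mypkg.log is hypothetical, and message() stands in for log_info() so the block runs without pipfun.

```r
foo <- function(x, y, log = getOption("mypkg.log", TRUE)) {
  if (log) message("Starting foo()")   # stand-in for log_info()
  result <- x + y
  if (log) message("Finished foo()")   # stand-in for log_info()
  result
}

old <- options(mypkg.log = FALSE)  # silence logging for tests or benchmarks
foo(3, 4)
#> [1] 7
options(old)                       # restore the previous setting
```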

5. Check logs programmatically

if (log_has_errors("pipapi_log")) {
  cli::cli_alert_danger("Errors detected!")
  print(log_has_errors("pipapi_log", show = TRUE))
}

6. Save and Reload Logs

tf <- tempfile(fileext = ".qs")
log_save("pipapi_log", path = tf)
#>  Log pipapi_log saved to /tmp/RtmpEUEWBR/file20876f11a9c7.qs
log_reset("pipapi_log")
#>  Log pipapi_log has been reset.
log_load(path = tf, name = "pipapi_log", overwrite = TRUE)
#>  Log pipapi_log loaded from /tmp/RtmpEUEWBR/file20876f11a9c7.qs

Mastering log_add(): The Workhorse of the Logging System

The log_add() function is the core engine behind the logging system in pipfun. While log_info(), log_warn(), and log_error() are convenient wrappers, log_add() is designed for full transparency, flexibility, and programmatic control.

It allows you to:

  • Manually or automatically capture arguments from any function
  • Include custom metadata
  • Track outputs and computation results
  • Register messages across multiple logs
  • Define error levels or trace calls

This is the function you use when developing custom pipelines, internal packages, or diagnostics that need deep traceability.

Function Structure

log_add(
  event,
  message,
  name    = getOption("pipfun.log.default"),
  args    = NULL,
  logmeta = NULL,
  output  = NULL,
  .trace  = NULL,
  .env    = rlang::caller_env()
)

event — What kind of message is this?

This is the type of log entry: it defines how the log is categorized. The most common values are:

  • "info" – For successful operations or checkpoints
  • "warning" – When something may be problematic
  • "error" – When something failed or needs attention

You can also pass custom strings (e.g., "debug", "data") if your logging structure supports it.

Examples:

lname <- "event_log"
log_init(lname)
log_add("info", "Pipeline started", name = lname)
log_add("warning", "Missing values found", name = lname)
log_add("error", "Failed to merge datasets", name = lname)

message — What happened?

A short, descriptive message of the event. This field will be shown in logs and summaries, so it should be concise and meaningful.

Best practice: keep messages short but informative — they complement the structured metadata.

Examples:

log_add("info", "Reading household data", name = lname)
log_add("data", "Poverty line is missing", name = lname)

name — Which log to write to?

Each log is a uniquely named object stored in a dedicated environment .piplogenv. The name argument tells log_add() which log to append to.

If not provided, it defaults to the option set in getOption("pipfun.log.default").

Examples:

# Default log name (from options)
log_add("info", "Starting validation")

# Custom log name
log_init("step3_log")
log_add("info", "Custom step complete", name = "step3_log")

This allows multiple logs to coexist (e.g., one per module or process), and even be saved/loaded independently.
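
Because the default comes from an R option, one way to route every unnamed call to a particular log is to set that option. The sketch below runs only base R. It assumes the named log still has to be created with log_init() before entries are added.

```r
# Route calls that omit `name` to "step3_log", then restore the old default.
old <- options(pipfun.log.default = "step3_log")
getOption("pipfun.log.default")
#> [1] "step3_log"

# ... log_add("info", "some message") calls here would target "step3_log" ...

options(old)
```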

args — What were the function arguments?

If provided, this should be a named list of arguments. If omitted (args = NULL), then log_add() automatically captures the arguments from the calling environment.

This lets you store:

  • Inputs used
  • Function parameters
  • Execution settings

Examples:

# Explicitly pass arguments
log_add("info", "Manual log", args = list(country = "COL", year = 2022))

# Auto-capture from current function
calculate_stats <- function(x, method = "mean") {
  log_add("info", "summary of x")
  summary(x)
}
calculate_stats(rnorm(100))
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -2.61233 -0.35608  0.09104  0.07079  0.62341  2.75542

log_get()[]
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:13.782506] INFO — `Starting validation`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 
#> → [2025-07-29 15:03:13.863387] INFO — `Manual log`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 
#> → [2025-07-29 15:03:13.865178] INFO — `summary of x`
#> Function: `calculate_stats(rnorm(100))` (from )
#> Trace: calculate_stats(rnorm(100))
#> 
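
For intuition about what auto-capture involves, the base-R sketch below collects the values bound in a caller's frame. capture_args() is a hypothetical helper written for this illustration. It is not pipfun's implementation, which may differ in which objects it records.

```r
# Hypothetical helper: return every object bound in the caller's frame.
capture_args <- function(env = parent.frame()) {
  mget(ls(env), envir = env)
}

calculate_stats <- function(x, method = "mean") {
  # Called first, so only the arguments exist in the frame at this point.
  captured <- capture_args()
  captured
}

str(calculate_stats(c(1, 2, 3)))
```

Note that the default value of method is captured even though the caller did not supply it.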

If you need to log metadata not available in the function’s arguments, use logmeta (see below).

logmeta — Attach extra metadata

logmeta is a named list that gets merged with args and lets you attach metadata that might not be passed as a function argument.

Use it to attach:

  • Custom tags
  • Status flags
  • Source IDs
  • Anything else relevant to the context

Examples:

log_add("info", "Post-processing started",
        logmeta = list(step = "adjustment", dataset = "povcalnet"))

log_add("warning", "Missing PPP detected",
        args = list(country = "ZAF"),
        logmeta = list(stage = "pricing", source = "national_stats"))

log_get()[event == "warning"][.N]
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:14.007565] WARNING — `Missing PPP detected`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> Metadata: list(stage = "pricing", source = "national_stats")
#> 

This is particularly useful when logging from within a tryCatch() handler or from wrappers that don’t have access to the original arguments directly.
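
Inside an error handler, the condition object is often the most useful thing to record. Here is a minimal sketch using only base R; condition_meta() is a hypothetical helper that turns a condition into a named list suitable for logmeta.

```r
# Hypothetical helper: flatten a condition into a named list for `logmeta`.
condition_meta <- function(e) {
  list(
    error = conditionMessage(e),
    call  = paste(deparse(conditionCall(e)), collapse = " "),
    class = class(e)[1]
  )
}

meta <- tryCatch(
  stop("file not found"),
  error = function(e) condition_meta(e)
)
str(meta)

# In a real handler you would pass it on, for example:
# log_error("Failed to read CSV", name = "pipapi_log", logmeta = condition_meta(e))
```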

output — What did the function return?

You can attach any kind of output or result to the log. This makes the log entry a complete record of the input, processing, and result.

This is useful for:

  • Storing return values
  • Inspecting intermediate results
  • Comparing expected vs. actual outputs

Examples:

result <- stats::lm(mpg ~ wt, data = mtcars)
log_add("model", "Model finished", output = result)
log_filter(event = "model")
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:14.114901] MODEL — `Model finished`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> Output: structure(list(coefficients = c("(Intercept)" = 37.285126167342, , wt =
#> -5.34447157272268), residuals = c("Mazda RX4" = -2.28261064680868, , "Mazda RX4
#> Wag" = -0.91977039576432, "Datsun 710" = -2.08595211862542, , "Hornet 4 Drive"
#> = 1.29734993896137, "Hornet Sportabout" = -0.200143957176023, , Valiant =
#> -0.693254525721567, "Duster 360" = -3.90536265272207, , "Merc 240D" =
#> 4.16373814964331, "Merc 230" = 2.3499592867344, , "Merc 280" =
#> 0.299856042823977, "Merc 280C" = -1.10014395717602, , "Merc 450SE" =
#> 0.866873133639264, "Merc 450SL" = -0.0502472010864458, , "Merc 450SLC" =
#> -1.88302362245031, "Cadillac Fleetwood" = 1.17334958945202, , "Lincoln
#> Continental" = 2.10328764310577, "Chrysler Imperial" = 5.98107438886067, ,
#> "Fiat 128" = 6.87271129264786, "Honda Civic" = 1.7461954226051, , "Toyota
#> Corolla" = 6.42197916860408, "Toyota Corona" = -2.61100374058063, , "Dodge
#> Challenger" = -2.97258623135821, "AMC Javelin" = -3.72686631503964, , "Camaro
#> Z28" = -3.46235532808695, "Pontiac Firebird" = 2.46436702977666, , "Fiat X1-9"
#> = 0.356426325876353, "Porsche 914-2" = 0.152042998284501, , "Lotus Europa" =
#> 1.20105932218738, "Ford Pantera L" = -4.54315128181114, , "Ferrari Dino" =
#> -2.78093991090021, "Maserati Bora" = -3.20536265272208, , "Volvo 142E" =
#> -1.02749519517299), effects = c("(Intercept)" = -113.649737406208, , …, "Lotus
#> Europa", "Ford Pantera L", "Ferrari Dino", "Maserati Bora", , and "Volvo
#> 142E"), class = "data.frame")), class = "lm")
#> 

If the result is large, you might log a summary or just a pointer (e.g., file path or row count):

log_add("mtcars", "mtcars summary", 
        output = list(n = nrow(mtcars), 
                      vars = names(mtcars)))
log_filter(event = "mtcars")
#> 
#> ── Log entries: ──
#> 
#> → [2025-07-29 15:03:14.223464] MTCARS — `mtcars summary`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> Output: list(n = 32L, vars = c("mpg", "cyl", "disp", "hp", "drat", "wt", and
#> "qsec", "vs", "am", "gear", "carb"))
#> 
log_get()[event == "mtcars"][.N][, output]
#> [[1]]
#> [[1]]$n
#> [1] 32
#> 
#> [[1]]$vars
#>  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
#> [11] "carb"
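
One way to keep entries small is to reduce the object before passing it to output. The helper below is a hypothetical sketch in base R; which fields are worth keeping depends on your pipeline.

```r
# Hypothetical helper: keep a compact description instead of the full object.
summarize_output <- function(x) {
  list(
    class = class(x)[1],
    size  = format(utils::object.size(x), units = "auto"),
    dim   = dim(x)               # NULL for objects without dimensions
  )
}

fit <- stats::lm(mpg ~ wt, data = mtcars)
str(summarize_output(fit))
str(summarize_output(mtcars))

# For example:
# log_add("model", "Model finished", output = summarize_output(fit))
```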

.trace — What call triggered this?

By default, log_add() captures the immediate call that triggered the log using sys.call(-1).

You can override it by passing a custom expression or trace object:

log_add("lm", "weird reg", .trace = quote(stats::lm(am ~ wt, data = mtcars)))
log_get()[event == "lm"][, trace]
#> [[1]]
#> stats::lm(am ~ wt, data = mtcars)

This is useful when logging from deeply nested code and you want to reflect the origin of the issue, not just the logging wrapper.
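
To see what sys.call(-1) refers to, here is a base-R sketch that is independent of pipfun. who_called() is a hypothetical helper that returns the call of the function it was invoked from.

```r
# Hypothetical helper: sys.call(-1) is the call one frame above this one.
who_called <- function() sys.call(-1)

load_step <- function(path) {
  who_called()
}

load_step("input.csv")
#> load_step("input.csv")
```

If a logging wrapper sits between your function and log_add(), the default trace would point at the wrapper, which is the case where a custom .trace helps.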

.env — Where to look for arguments

This is the environment from which to capture arguments when args = NULL. It defaults to the environment of the function that called log_add() (via rlang::caller_env()), which is usually correct.

You only need to use this if you’re capturing from a different environment on purpose.

Example:

caller <- function(x, y) {
  inner()
}
inner <- function() {
  log_add("caller", "From caller", .env = parent.frame())
}
caller(1, 2)
log_get()[event == "caller"][.N][, args]

Without .env = parent.frame(), inner() would log nothing useful — this makes it possible to step one level up and get what we need.

Basic use — no .env needed

log_init("mylog", overwrite = TRUE)

myfun <- function(x = 1, y = 2) {
  log_add("info", "Called myfun", name = "mylog")
}

myfun(5, 10)
log_get("mylog")$args[[1]]
#> $x
#> [1] 5
#> 
#> $y
#> [1] 10

This works perfectly: x and y are captured automatically. No need to set .env.

Problem: wrapped function logs its own frame

wrapped <- function(a = 3) {
  log_add("info", "From wrapped", name = "mylog")
}

outer <- function() {
  a <- 42
  wrapped()
}

log_init("mylog", overwrite = TRUE)
outer()
log_get("mylog")$args[[1]]
#> $a
#> [1] 3

Oops! The logged arguments come from wrapped(), not outer() — the values may be defaulted or wrong.

Solution: pass .env = parent.frame()

wrapped <- function(a = 3) {
  log_add("info", "From wrapped", name = "mylog", .env = parent.frame())
}

outer <- function() {
  a <- 42
  wrapped()
}

log_init("mylog", overwrite = TRUE)
outer()
log_get("mylog")$args[[1]]

Now the values from outer()'s frame are logged, including a = 42.

Best Practices

Scenario                             Do you need to set .env?
Calling log_add() directly           ❌ No (it auto-captures from caller)
Calling from a wrapper               ✅ Yes, set .env = parent.frame()
Nested pipelines or metaprogramming  ✅ Yes, be explicit
Building custom logging tools        ✅ Always set .env for consistency

Using .env might look like a technical detail, but it makes your logs predictable and robust, especially in large pipelines or packages that delegate work across functions.

Setting .env explicitly gives you a log entry whose context you control entirely.
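
As an illustration of the "custom logging tools" case, a thin wrapper can forward both the caller's environment and the caller's call to log_add(). This is a sketch under assumptions: log_debug() is a hypothetical name, it relies only on the log_add() arguments documented above, and whether a "debug" event fits your setup depends on how you use the log.

```r
# Hypothetical wrapper around pipfun::log_add() for "debug" events.
# Defining it does not require pipfun; calling it does.
log_debug <- function(message, ..., .env = parent.frame()) {
  force(.env)                    # resolve the caller's frame before handing it on
  trace <- sys.call(-1)          # call of the function that invoked log_debug()
  pipfun::log_add("debug", message, ..., .trace = trace, .env = .env)
}
```

Inside a function you would then call log_debug("Checkpoint reached", name = "mylog"), and the arguments would be looked up in that function's frame rather than in the wrapper's.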

Key Takeaways

  • log_add() is designed to be smart and flexible.
  • Use it when you need more control than the helpers offer.
  • Combine args, logmeta, and output to produce rich logs.
  • Use .env to control which frame values are captured from, e.g. .env = parent.frame() inside a wrapper.
  • Use in conjunction with tryCatch() to log recoverable errors.

Let your team know: if they understand log_add(), they understand the whole logging system.

Best Practices for Developers

  • Initialize logs early (e.g., in .onLoad() or on first use)
  • Use structured helpers instead of raw messages. However, I highly recommend learning how log_add() works and using it as needed.
  • Prefer named logmeta fields to store arguments or flags
  • Inspect logs interactively or export them for review
  • Add log = TRUE in your functions for user control

What’s Next?

Planned improvements to the logging framework include:

  • Color-coded print.piplog() for interactive sessions
  • Log levels (debug, info, warning, error, critical)
  • Export to JSON, Markdown, or CSV for diagnostics
  • Timeline visualizations of log activity
  • log_args() and log_with() helpers to simplify metadata
  • Integration with rlang::caller_args() and current_env() for richer introspection