Skip to contents
#devtools::load_all(".")
library(pipfun)

Intro

The pipfun logging system provides a powerful, flexible way to track messages, metadata, function arguments, and error traces across your R codebase — especially when building packages.

Logs are stored in memory and can be written to disk as .qs files for persistence.

Use this system to:

  • Log errors, warnings, and informative messages
  • Capture arguments, return values, and internal states
  • Filter, summarize, and debug across modules or packages
  • Persist logs between sessions or across users

Quick Start

log_name <- "mylog"
log_init(log_name)

log_info("This is an info message", name = log_name)
log_warn("Something looks odd", name = log_name)
log_error("An error occurred", name = log_name)
log_has_errors(log_name)
#> [1] TRUE
log_summary(log_name)
#>      event count
#>     <char> <int>
#> 1:    info     1
#> 2: warning     1
#> 3:   error     1
log_filter(log_name, event = "error")
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:23.08109] ERROR — `An error occurred`
#> Function: `log_error("An error occurred", name = log_name)` (from global)
#> Trace: log_error("An error occurred", name = log_name)
#> 

Logging Concepts

Each log entry is a row in a data.table, with fields such as:

  • time: timestamp
  • event: one of “info”, “warning”, or “error”
  • message: custom message
  • fun, package: where it was logged from
  • args: captured arguments
  • output: result (optional)
  • trace: traceback object

Creating and Managing Logs

Initialize a log

log_init("devlog")

Add entries

Use the helpers for most use cases:

log_info("Process started", name = "devlog", logmeta = list(step = 1))
log_warn("Unexpected format", name = "devlog", logmeta = list(column = "status"))
log_error("Failed to compute results", name = "devlog", logmeta = list(file = "input.csv"))

Filter and Query

log_filter(name = "devlog", event = "error")
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:23.318414] ERROR — `Failed to compute results`
#> Function: `log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))` (from global)
#> Trace: log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))
#> 
log_filter(name = "devlog", fun = "my_function")
#> 
#> ── Log entries: ──
#> 
#>  The log is empty.
log_filter(name = "devlog", after = Sys.Date() - 1)
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:23.316422] INFO — `Process started`
#> Function: `log_info("Process started", name = "devlog", logmeta = list(step =
#> 1))` (from global)
#> Trace: log_info("Process started", name = "devlog", logmeta = list(step = 1))
#> 
#> → [2025-04-14 22:19:23.317496] WARNING — `Unexpected format`
#> Function: `log_warn("Unexpected format", name = "devlog", logmeta = list(column
#> = "status"))` (from global)
#> Trace: log_warn("Unexpected format", name = "devlog", logmeta = list(column =
#> "status"))
#> 
#> → [2025-04-14 22:19:23.318414] ERROR — `Failed to compute results`
#> Function: `log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))` (from global)
#> Trace: log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))
#> 

Using Logs in Package Development

pipfun is especially helpful when building packages.

1. Initialize at load

In your package’s zzz.R:

.onLoad <- function(libname, pkgname) {
  pipfun::log_init("pipapi_log")
}

2. Use in functions


# I init the log here, but you can do it in zzz.R, or somewhere else.
log_init("pipapi_log", overwrite = TRUE)

get_data <- function(path, clean = TRUE) {
  log_info("Starting data load", name = "pipapi_log",
           logmeta = list(path = path, clean = clean))

  data <- tryCatch({
    read.csv(path)
  }, error = function(e) {
    log_error("Failed to read CSV", name = "pipapi_log",
              logmeta = list(error = e$message))
    stop(e)
  })

  log_info("Data loaded", name = "pipapi_log",
           logmeta = list(n = nrow(data)))
  return(data)
}

3. Use with tryCatch for robust pipelines

run_step <- function(step_name, expr) {
  tryCatch({
    result <- expr()
    log_info("Step completed", 
             name = "pipeline_log", 
             logmeta = list(step = step_name))
    result
  }, error = function(e) {
    log_error("Step failed", 
              name = "pipeline_log", 
              logmeta = list(step = step_name, error = e$message))
    NULL
  })
}

4. Enable or Disable Logging Dynamically

You can design your functions to respect a log = TRUE argument:

foo <- function(x, y, log = TRUE) {
  if (log) {
    log_info("Starting foo()",
             logmeta = list(x = x, y = y))
  }
  result <- x + y
  if (log) {
    log_info("Finished foo()",
             logmeta = list(result = result))
  }
  result
}

This makes it easy to disable logging in tests or benchmarks:

foo(3, 4, log = FALSE)
#> [1] 7

5. Check logs programmatically

if (log_has_errors("pipapi_log")) {
  cli::cli_alert_danger("Errors detected!")
  print(log_has_errors("pipapi_log", show = TRUE))
}

6. Save and Reload Logs

tf <- tempfile(fileext = ".qs")
log_save("pipapi_log", path = tf)
#>  Log pipapi_log saved to /tmp/RtmpO3DfYh/file20354247c542.qs
log_reset("pipapi_log")
#>  Log pipapi_log has been reset.
log_load(path = tf, name = "pipapi_log", overwrite = TRUE)
#>  Log pipapi_log loaded from /tmp/RtmpO3DfYh/file20354247c542.qs

Mastering log_add(): The Workhorse of the Logging System

The log_add() function is the core engine behind the logging system in pipfun. While log_info(), log_warn(), and log_error() are convenient wrappers, log_add() is designed for full transparency, flexibility, and programmatic control.

It allows you to:

  • Manually or automatically capture arguments from any function
  • Include custom metadata
  • Track outputs and computation results
  • Register messages across multiple logs
  • Define error levels or trace calls

This is the function you use when developing custom pipelines, internal packages, or diagnostics that need deep traceability.

Function Structure

log_add(
  event,
  message,
  name    = getOption("pipfun.log.default"),
  args    = NULL,
  logmeta = NULL,
  output  = NULL,
  .trace  = NULL,
  .env    = rlang::caller_env()
)

event — What kind of message is this?

This is the type of log entry: it defines how the log is categorized. The most common values are:

  • "info" – For successful operations or checkpoints
  • "warning" – When something may be problematic
  • "error" – When something failed or needs attention

You can also pass custom strings (e.g., "debug", "data") if your logging structure supports it.

Examples:

lname <- "event_log"
log_init(lname)
log_add("info", "Pipeline started", name = lname)
log_add("warning", "Missing values found", name =lname)
log_add("error", "Failed to merge datasets", name = lname)

message — What happened?

A short, descriptive message of the event. This field will be shown in logs and summaries, so it should be concise and meaningful.

Best practice: keep messages short but informative — they complement the structured metadata.

Examples:

log_add("info", "Reading household data", name = lname)
log_add("data", "Poverty line is missing", name = lname)

name — Which log to write to?

Each log is a uniquely named object stored in a dedicated environment .piplogenv. The name argument tells log_add() which log to append to.

If not provided, it defaults to the option set in getOption("pipfun.log.default").

Examples:

# Default log name (from options)
log_add("info", "Starting validation")

# Custom log name
log_init("step3_log")
log_add("info", "Custom step complete", name = "step3_log")

This allows multiple logs to coexist (e.g., one per module or process), and even be saved/loaded independently.

args — What were the function arguments?

If provided, this should be a named list of arguments. If omitted (args = NULL), then log_add() automatically captures the arguments from the calling environment.

This lets you store: - Inputs used - Function parameters - Execution settings

Examples:

# Explicitly pass arguments
log_add("info", "Manual log", args = list(country = "COL", year = 2022))

# Auto-capture from current function
calculate_stats <- function(x, method = "mean") {
  log_add("info", "summary of x")
  summary(x)
}
calculate_stats(rnorm(100))
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -2.61233 -0.35608  0.09104  0.07079  0.62341  2.75542

log_get()[]
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:24.426521] INFO — `Starting validation`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 
#> → [2025-04-14 22:19:24.510745] INFO — `Manual log`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 
#> → [2025-04-14 22:19:24.533016] INFO — `summary of x`
#> Function: `calculate_stats(rnorm(100))` (from )
#> Trace: calculate_stats(rnorm(100))
#> 

If you need to log metadata not available in the function’s arguments, use logmeta (see below).

logmeta — Attach extra metadata

logmeta is a named list that gets merged with args and lets you attach metadata that might not be passed as a function argument.

Use it to attach: - Custom tags - Status flags - Source IDs - Anything else relevant to the context

Examples:

log_add("info", "Post-processing started",
        logmeta = list(step = "adjustment", dataset = "povcalnet"))

log_add("warning", "Missing PPP detected",
        args = list(country = "ZAF"),
        logmeta = list(stage = "pricing", source = "national_stats"))

log_get()[event == "warning"][.N]
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:24.678779] WARNING — `Missing PPP detected`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 

This is particularly useful when logging from within a tryCatch() handler or from wrappers that don’t have access to the original arguments directly.

output — What did the function return?

You can attach any kind of output or result to the log. This makes the log entry a complete record of the input, processing, and result.

This is useful for:

  • Storing return values
  • Inspecting intermediate results
  • Comparing expected vs. actual outputs

Examples:

result <- stats::lm(mpg ~ wt, data = mtcars)
log_add("model", "Model finished", output = result)
log_filter(event = "model")
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:24.788252] MODEL — `Model finished`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 

If the result is large, you might log a summary or just a pointer (e.g., file path or row count):

log_add("mtcars", "mtcars summary", 
        output = list(n = nrow(mtcars), 
                      vars = names(mtcars)))
log_filter(event = "mtcars")
#> 
#> ── Log entries: ──
#> 
#> → [2025-04-14 22:19:24.915967] MTCARS — `mtcars summary`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#> 
log_get()[event == "mtcars"][.N][, output]
#> [[1]]
#> [[1]]$n
#> [1] 32
#> 
#> [[1]]$vars
#>  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
#> [11] "carb"

.trace — What call triggered this?

By default, log_add() captures the immediate call that triggered the log using sys.call(-1).

You can override it by passing a custom expression or trace object:

log_add("lm", "weird reg", .trace = quote(stats::lm(am ~ wt, data = mtcars)))
log_get()[event == "lm"][, trace]
#> [[1]]
#> stats::lm(am ~ wt, data = mtcars)

This is useful when logging from deeply nested code and you want to reflect the origin of the issue, not just the logging wrapper.

.env — Where to look for arguments

This is the environment from which to capture arguments when args = NULL. It defaults to the function that called log_add() (using rlang::caller_env()), which is usually correct.

You only need to use this if you’re capturing from a different environment on purpose.

Example:

caller <- function(x, y) {
  inner()
}
inner <- function() {
  log_add("caller", "From caller", .env = parent.frame())
}
caller(1, 2)
log_get()[event == "caller"][.N][, args]
#> [[1]]
#> [[1]]$x
#> [1] 1
#> 
#> [[1]]$y
#> [1] 2

Without .env = parent.frame(), inner() would log nothing useful — this makes it possible to step one level up and get what we need.

Basic use — no .env needed

log_init("mylog", overwrite = TRUE)

myfun <- function(x = 1, y = 2) {
  log_add("info", "Called myfun", name = "mylog")
}

myfun(5, 10)
log_get("mylog")$args[[1]]
#> $x
#> [1] 5
#> 
#> $y
#> [1] 10

This works perfectly: x and y are captured automatically. No need to set .env.

Problem: wrapped function logs its own frame

wrapped <- function(a = 3) {
  log_add("info", "From wrapped", name = "mylog")
}

outer <- function() {
  a <- 42
  wrapped()
}

log_init("mylog", overwrite = TRUE)
outer()
log_get("mylog")$args[[1]]
#> $a
#> [1] 3

Oops! The logged arguments come from wrapped(), not outer() — the values may be defaulted or wrong.

Solution: pass .env = parent.frame()

wrapped <- function(a = 3) {
  log_add("info", "From wrapped", name = "mylog", .env = parent.frame())
}

outer <- function() {
  a <- 42
  wrapped()
}

log_init("mylog", overwrite = TRUE)
outer()
log_get("mylog")$args[[1]]
#> $a
#> [1] 42

Now the arguments from outer() are logged correctly — including a = 42.

Best Practices

Scenario Do you need to set .env?
Calling log_add() directly ❌ No (it auto-captures from caller)
Calling from a wrapper ✅ Yes, set .env = parent.frame()
Nested pipelines or metaprogramming ✅ Yes, be explicit
Building custom logging tools ✅ Always set .env for consistency

Using .env might look like a technical detail, but it makes your logs predictable and robust, especially in large pipelines or packages that delegate work across functions.

This creates a log entry with detailed context that you control entirely.

Key Takeaways

  • log_add() is designed to be smart and flexible.
  • Use it when you need more control than the helpers offer.
  • Combine args, logmeta, and output to produce rich logs.
  • Use .env = environment() to automatically capture local values.
  • Use in conjunction with tryCatch() to log recoverable errors.

Let your team know: if they understand log_add(), they understand the whole logging system.

Best Practices for Developers

  • Initialize logs early (e.g., in .onLoad() or on first use)
  • Use structured helpers instead of raw messages. However, I highly recommend you learn how function log_add() works and use it as needed.
  • Prefer named logmeta fields to store arguments or flags
  • Inspect logs interactively or export them for review
  • Add log = TRUE in your functions for user control

What’s Next?

Planned improvements to the logging framework include:

  • Color-coded print.piplog() for interactive sessions
  • Log levels (debug, info, warning, error, critical)
  • Export to JSON, Markdown, or CSV for diagnostics
  • Timeline visualizations of log activity
  • log_args() and log_with() helpers to simplify metadata
  • Integration with rlang::caller_args() and current_env() for richer introspection