Using pipfun's Logging System
Intro
The pipfun logging system provides a powerful, flexible way to track messages, metadata, function arguments, and error traces across your R codebase — especially when building packages. Logs are stored in memory and can be written to disk as .qs files for persistence.
Use this system to:
- Log errors, warnings, and informative messages
- Capture arguments, return values, and internal states
- Filter, summarize, and debug across modules or packages
- Persist logs between sessions or across users
Quick Start
log_name <- "mylog"
log_init(log_name)
log_info("This is an info message", name = log_name)
log_warn("Something looks odd", name = log_name)
log_error("An error occurred", name = log_name)
log_has_errors(log_name)
#> [1] TRUE
log_summary(log_name)
#> event count
#> <char> <int>
#> 1: info 1
#> 2: warning 1
#> 3: error 1
log_filter(log_name, event = "error")
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:23.08109] ERROR — `An error occurred`
#> Function: `log_error("An error occurred", name = log_name)` (from global)
#> Trace: log_error("An error occurred", name = log_name)
#>
Logging Concepts
Each log entry is a row in a data.table, with fields such as the following (a quick inspection sketch follows the list):
- time: timestamp
- event: one of "info", "warning", or "error"
- message: custom message
- fun, package: where it was logged from
- args: captured arguments
- output: result (optional)
- trace: traceback object
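Because each log behaves like a data.table, you can inspect these fields directly with standard data.table syntax. A minimal sketch, using the "mylog" log created in the Quick Start:
entries <- log_get("mylog")
entries[, .(time, event, message)]   # key columns only
entries[event == "error"]            # just the error rows
entries[.N]                          # the most recent entry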
Creating and Managing Logs
Initialize a log
log_init("devlog")
Filter and Query
log_filter(name = "devlog", event = "error")
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:23.318414] ERROR — `Failed to compute results`
#> Function: `log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))` (from global)
#> Trace: log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))
#>
log_filter(name = "devlog", fun = "my_function")
#>
#> ── Log entries: ──
#>
#> ℹ The log is empty.
log_filter(name = "devlog", after = Sys.Date() - 1)
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:23.316422] INFO — `Process started`
#> Function: `log_info("Process started", name = "devlog", logmeta = list(step =
#> 1))` (from global)
#> Trace: log_info("Process started", name = "devlog", logmeta = list(step = 1))
#>
#> → [2025-04-14 22:19:23.317496] WARNING — `Unexpected format`
#> Function: `log_warn("Unexpected format", name = "devlog", logmeta = list(column
#> = "status"))` (from global)
#> Trace: log_warn("Unexpected format", name = "devlog", logmeta = list(column =
#> "status"))
#>
#> → [2025-04-14 22:19:23.318414] ERROR — `Failed to compute results`
#> Function: `log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))` (from global)
#> Trace: log_error("Failed to compute results", name = "devlog", logmeta =
#> list(file = "input.csv"))
#>
Using Logs in Package Development
pipfun is especially helpful when building packages.
1. Initialize at load
In your package’s zzz.R:
.onLoad <- function(libname, pkgname) {
pipfun::log_init("pipapi_log")
}
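If you also want calls that omit name = to target this log, you can set the default-log option at load time. A sketch, using pipfun.log.default (the option consulted by log_add(), as shown in its signature later in this vignette):
.onLoad <- function(libname, pkgname) {
  pipfun::log_init("pipapi_log")
  # Only set the default if the user has not chosen one already
  if (is.null(getOption("pipfun.log.default"))) {
    options(pipfun.log.default = "pipapi_log")
  }
}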
2. Use in functions
# Initialize the log here, or do it in zzz.R (as above) or elsewhere.
log_init("pipapi_log", overwrite = TRUE)
get_data <- function(path, clean = TRUE) {
log_info("Starting data load", name = "pipapi_log",
logmeta = list(path = path, clean = clean))
data <- tryCatch({
read.csv(path)
}, error = function(e) {
log_error("Failed to read CSV", name = "pipapi_log",
logmeta = list(error = e$message))
stop(e)
})
log_info("Data loaded", name = "pipapi_log",
logmeta = list(n = nrow(data)))
return(data)
}
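A quick, self-contained way to exercise get_data() and confirm that the entries land in the log (a sketch using a temporary CSV):
tmp <- tempfile(fileext = ".csv")
write.csv(mtcars, tmp, row.names = FALSE)
dat <- get_data(tmp)
log_summary("pipapi_log")     # two "info" entries expected
log_has_errors("pipapi_log")  # FALSE when the read succeeds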
3. Enable or Disable Logging Dynamically
You can design your functions to respect a log = TRUE argument:
foo <- function(x, y, log = TRUE) {
if (log) {
log_info("Starting foo()",
logmeta = list(x = x, y = y))
}
result <- x + y
if (log) {
log_info("Finished foo()",
logmeta = list(result = result))
}
result
}
This makes it easy to disable logging in tests or benchmarks:
foo(3, 4, log = FALSE)
#> [1] 7
4. Check logs programmatically
if (log_has_errors("pipapi_log")) {
cli::cli_alert_danger("Errors detected!")
print(log_has_errors("pipapi_log", show = TRUE))
}
5. Save and Reload Logs
tf <- tempfile(fileext = ".qs")
log_save("pipapi_log", path = tf)
#> ✔ Log pipapi_log saved to /tmp/RtmpO3DfYh/file20354247c542.qs
log_reset("pipapi_log")
#> ✔ Log pipapi_log has been reset.
log_load(path = tf, name = "pipapi_log", overwrite = TRUE)
#> ✔ Log pipapi_log loaded from /tmp/RtmpO3DfYh/file20354247c542.qs
Mastering log_add(): The Workhorse of the Logging System
The log_add() function is the core engine behind the logging system in pipfun. While log_info(), log_warn(), and log_error() are convenient wrappers, log_add() is designed for full transparency, flexibility, and programmatic control.
It allows you to:
- Manually or automatically capture arguments from any function
- Include custom metadata
- Track outputs and computation results
- Register messages across multiple logs
- Define error levels or trace calls
This is the function you use when developing custom pipelines, internal packages, or diagnostics that need deep traceability.
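Conceptually, the helpers are thin layers over log_add(): the two calls below record essentially the same entry (an illustrative comparison, not the exact wrapper internals):
log_init("demo_log")
# Convenience wrapper
log_info("Data loaded", name = "demo_log")
# Roughly equivalent low-level call
log_add("info", "Data loaded", name = "demo_log")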
Function Structure
log_add(
event,
message,
name = getOption("pipfun.log.default"),
args = NULL,
logmeta = NULL,
output = NULL,
.trace = NULL,
.env = rlang::caller_env()
)
event — What kind of message is this?
This is the type of log entry: it defines how the log is categorized. The most common values are:
- "info" – For successful operations or checkpoints
- "warning" – When something may be problematic
- "error" – When something failed or needs attention
You can also pass custom strings (e.g., "debug", "data") if your logging structure supports it.
Examples:
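The calls below are illustrative sketches (not taken from the package documentation), assuming the default log set in getOption("pipfun.log.default") has been initialized; only the event string changes:
log_add("info", "Input file loaded")
log_add("warning", "Found unexpected NA values")
log_add("error", "Join produced zero rows")
log_add("debug", "Captured intermediate state")  # custom event string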
message — What happened?
A short, descriptive message of the event. This field will be shown in logs and summaries, so it should be concise and meaningful.
Best practice: keep messages short but informative — they complement the structured metadata.
Examples:
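Some illustrative messages (again a sketch, assuming a default log is set); each pairs a short human-readable summary with structured metadata:
log_add("info", "Loaded household survey records",
        logmeta = list(source = "survey", n = 1200))
log_add("warning", "Country code not found in lookup table",
        logmeta = list(country = "XYZ"))
log_add("error", "Aggregation step failed",
        logmeta = list(step = "poverty_rates"))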
name — Which log to write to?
Each log is a uniquely named object stored in a dedicated environment, .piplogenv. The name argument tells log_add() which log to append to.
If not provided, it defaults to the option set in getOption("pipfun.log.default").
Examples:
# Default log name (from options)
log_add("info", "Starting validation")
# Custom log name
log_init("step3_log")
log_add("info", "Custom step complete", name = "step3_log")
This allows multiple logs to coexist (e.g., one per module or process), and even be saved/loaded independently.
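For example, two modules can write to separate logs and be summarized independently (illustrative log names):
log_init("ingest_log")
log_init("model_log")
log_add("info", "Raw files downloaded", name = "ingest_log")
log_add("info", "Model fitted", name = "model_log")
log_summary("ingest_log")
log_summary("model_log")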
args — What were the function arguments?
If provided, this should be a named list of arguments. If omitted (args = NULL), then log_add() automatically captures the arguments from the calling environment.
This lets you store:
- Inputs used
- Function parameters
- Execution settings
Examples:
# Explicitly pass arguments
log_add("info", "Manual log", args = list(country = "COL", year = 2022))
# Auto-capture from current function
calculate_stats <- function(x, method = "mean") {
log_add("info", "summary of x")
summary(x)
}
calculate_stats(rnorm(100))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -2.61233 -0.35608 0.09104 0.07079 0.62341 2.75542
log_get()[]
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:24.426521] INFO — `Starting validation`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#>
#> → [2025-04-14 22:19:24.510745] INFO — `Manual log`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#>
#> → [2025-04-14 22:19:24.533016] INFO — `summary of x`
#> Function: `calculate_stats(rnorm(100))` (from )
#> Trace: calculate_stats(rnorm(100))
#>
If you need to log metadata not available in the function’s arguments, use logmeta (see below).
logmeta — Attach extra metadata
logmeta is a named list that gets merged with args and lets you attach metadata that might not be passed as a function argument.
Use it to attach:
- Custom tags
- Status flags
- Source IDs
- Anything else relevant to the context
Examples:
log_add("info", "Post-processing started",
logmeta = list(step = "adjustment", dataset = "povcalnet"))
log_add("warning", "Missing PPP detected",
args = list(country = "ZAF"),
logmeta = list(stage = "pricing", source = "national_stats"))
log_get()[event == "warning"][.N]
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:24.678779] WARNING — `Missing PPP detected`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#>
This is particularly useful when logging from within a tryCatch() handler or from wrappers that don’t have access to the original arguments directly.
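For instance, inside an error handler the only argument is the condition object, so auto-capture would miss the inputs you actually care about; logmeta fills that gap (a sketch, assuming a default log is set):
parse_file <- function(path) {
  tryCatch(
    read.csv(path),
    error = function(e) {
      # The handler's own argument is just `e`; record the original
      # input and the error message explicitly via logmeta
      log_add("error", "Could not parse file",
              logmeta = list(path = path, condition = e$message))
      NULL
    }
  )
}
parse_file("does_not_exist.csv")
log_filter(event = "error")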
output — What did the function return?
You can attach any kind of output or result to the log. This makes the log entry a complete record of the input, processing, and result.
This is useful for:
- Storing return values
- Inspecting intermediate results
- Comparing expected vs. actual outputs
Examples:
result <- stats::lm(mpg ~ wt, data = mtcars)
log_add("model", "Model finished", output = result)
log_filter(event = "model")
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:24.788252] MODEL — `Model finished`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#>
If the result is large, you might log a summary or just a pointer (e.g., file path or row count):
log_add("mtcars", "mtcars summary",
output = list(n = nrow(mtcars),
vars = names(mtcars)))
log_filter(event = "mtcars")
#>
#> ── Log entries: ──
#>
#> → [2025-04-14 22:19:24.915967] MTCARS — `mtcars summary`
#> Function: `eval(expr, envir)` (from global)
#> Trace: eval(expr, envir)
#>
log_get()[event == "mtcars"][.N][, output]
#> [[1]]
#> [[1]]$n
#> [1] 32
#>
#> [[1]]$vars
#> [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
#> [11] "carb"
.trace — What call triggered this?
By default, log_add() captures the immediate call that triggered the log using sys.call(-1).
You can override it by passing a custom expression or trace object:
log_add("lm", "weird reg", .trace = quote(stats::lm(am ~ wt, data = mtcars)))
log_get()[event == "lm"][, trace]
#> [[1]]
#> stats::lm(am ~ wt, data = mtcars)
This is useful when logging from deeply nested code and you want to reflect the origin of the issue, not just the logging wrapper.
.env — Where to look for arguments
This is the environment from which to capture arguments when args = NULL. It defaults to the function that called log_add() (using rlang::caller_env()), which is usually correct.
You only need to use this if you’re capturing from a different environment on purpose.
Example:
caller <- function(x, y) {
inner()
}
inner <- function() {
log_add("caller", "From caller", .env = parent.frame())
}
caller(1, 2)
log_get()[event == "caller"][.N][, args]
#> [[1]]
#> [[1]]$x
#> [1] 1
#>
#> [[1]]$y
#> [1] 2
Without .env = parent.frame(), inner() would log nothing useful; setting it lets log_add() step one level up and capture the caller’s arguments.
Basic use — no .env needed
log_init("mylog", overwrite = TRUE)
myfun <- function(x = 1, y = 2) {
log_add("info", "Called myfun", name = "mylog")
}
myfun(5, 10)
log_get("mylog")$args[[1]]
#> $x
#> [1] 5
#>
#> $y
#> [1] 10
This works perfectly: x and y are captured automatically. No need to set .env.
Problem: wrapped function logs its own frame
wrapped <- function(a = 3) {
log_add("info", "From wrapped", name = "mylog")
}
outer <- function() {
a <- 42
wrapped()
}
log_init("mylog", overwrite = TRUE)
outer()
log_get("mylog")$args[[1]]
#> $a
#> [1] 3
Oops! The logged arguments come from wrapped(), not outer() — the values may be defaulted or wrong.
Solution: pass .env = parent.frame()
wrapped <- function(a = 3) {
log_add("info", "From wrapped", name = "mylog", .env = parent.frame())
}
outer <- function() {
a <- 42
wrapped()
}
log_init("mylog", overwrite = TRUE)
outer()
log_get("mylog")$args[[1]]
#> $a
#> [1] 42
Now the arguments from outer() are logged correctly — including a = 42.
Best Practices
| Scenario | Do you need to set .env? |
|---|---|
| Calling log_add() directly | ❌ No (it auto-captures from the caller) |
| Calling from a wrapper | ✅ Yes, set .env = parent.frame() |
| Nested pipelines or metaprogramming | ✅ Yes, be explicit |
| Building custom logging tools | ✅ Always set .env for consistency |
Using .env might look like a technical detail, but it makes your logs predictable and robust, especially in large pipelines or packages that delegate work across functions.
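Putting the pieces together, a fully specified call might look like this (an illustrative sketch combining the arguments described above):
log_init("pipeline_log", overwrite = TRUE)
res <- mean(mtcars$mpg)
log_add(
  event   = "info",
  message = "Computed mean mpg",
  name    = "pipeline_log",
  args    = list(data = "mtcars", variable = "mpg"),
  logmeta = list(stage = "summary"),
  output  = res,
  .trace  = quote(mean(mtcars$mpg))
)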
This creates a log entry with detailed context that you control entirely.
Key Takeaways
- log_add() is designed to be smart and flexible.
- Use it when you need more control than the helpers offer.
- Combine args, logmeta, and output to produce rich logs.
- Use .env (e.g., .env = parent.frame() from a wrapper) to control where arguments are captured.
- Use it in conjunction with tryCatch() to log recoverable errors.
Let your team know: if they understand log_add(), they understand the whole logging system.
Best Practices for Developers
- Initialize logs early (e.g., in .onLoad() or on first use)
- Use structured helpers instead of raw messages. However, I highly recommend learning how log_add() works and using it as needed.
- Prefer named logmeta fields to store arguments or flags
- Inspect logs interactively or export them for review
- Add a log = TRUE argument to your functions for user control
What’s Next?
Planned improvements to the logging framework include:
- Color-coded print.piplog() output for interactive sessions
- Log levels (debug, info, warning, error, critical)
- Export to JSON, Markdown, or CSV for diagnostics
- Timeline visualizations of log activity
- log_args() and log_with() helpers to simplify metadata
- Integration with rlang::caller_args() and current_env() for richer introspection