This function is a wrapper around the system command ps that can be used to benchmark (peak) memory and CPU usage of parallel R code. By taking snapshots of the memory usage of R processes at a regular interval, the function dynamically builds up a profile of their usage of system resources.
syrup(expr, interval = 0.5, peak = FALSE, env = caller_env())
An expression.
The interval at which to take snapshots of resource usage. In practice, there's an overhead on top of each of these intervals.
Whether to return rows for only the "peak" memory usage. Interpreted as the id with the maximum rss sum. Defaults to FALSE, but it may be helpful to set peak = TRUE for potentially very long-running processes so that the tibble doesn't grow too large.
The environment to evaluate expr in.
A tibble with columns id and time and a number of columns from ps::ps() output describing memory and CPU usage. Notable columns include the process ID pid, parent process ID ppid, percent CPU usage, and resident set size rss (a measure of memory usage).
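As a rough illustration of how the "peak" snapshot can be read off these columns, the sketch below sums rss within each id and keeps the rows for the id with the largest total. It uses base R only and a made-up toy data frame with plain numeric rss values in place of real syrup() output, so the numbers and the names snapshots, rss_by_id, peak_id, and peak_rows are assumptions for illustration, not part of the package.

```r
# Toy stand-in for syrup() output; all values are invented for illustration.
snapshots <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3),
  pid = c(101, 102, 101, 102, 101, 102),
  rss = c(50, 60, 80, 90, 70, 65)  # resident set size, arbitrary units
)

# Total rss across all processes within each snapshot id.
rss_by_id <- tapply(snapshots$rss, snapshots$id, sum)

# The "peak" snapshot is the id with the largest total rss.
peak_id <- as.numeric(names(rss_by_id)[which.max(rss_by_id)])

# Keep only the rows belonging to that snapshot.
peak_rows <- snapshots[snapshots$id == peak_id, ]
peak_rows
```

The same idea should carry over to a real result, with the tibble returned by syrup() taking the place of snapshots.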
While much of the verbiage in the package assumes that the supplied expression will be distributed across CPU cores, nothing about this package requires that the expression provided to syrup() be run in parallel. Said another way, syrup() will work just fine with "normal," sequentially-run R code (as in the examples). That said, there are many better, more fine-grained tools for the job in the case of sequential R code, such as Rprofmem(), the profmem package, the bench package, and packages in the R-prof GitHub organization.
Loosely, the function works by:
Setting up another R process (call it sesh) that queries system information using ps::ps() at a regular interval,
Evaluating the supplied expression,
Reading the queried system information back into the main process from sesh,
Closing sesh, and then
Returning the queried system information.
Note that information on the R process sesh is filtered out from the results automatically.
# pass any expression to syrup. first, sequentially:
res_syrup <- syrup({res_output <- Sys.sleep(1)})
res_syrup
#> # A tibble: 132 × 8
#> id time pid ppid name pct_cpu rss vms
#> <dbl> <dttm> <int> <int> <chr> <dbl> <bch:b> <bch:by>
#> 1 1 2025-07-09 13:57:31 2825985 2818763 R NA 191.5MB 785.22MB
#> 2 1 2025-07-09 13:57:31 2825787 2825448 R NA 257.8MB 925.91MB
#> 3 1 2025-07-09 13:57:31 2825448 2825447 R NA 82.1MB 665.31MB
#> 4 1 2025-07-09 13:57:31 2818763 2814503 R NA 292.9MB 1.21GB
#> 5 1 2025-07-09 13:57:31 2814503 2814502 R NA 119.2MB 702.88MB
#> 6 1 2025-07-09 13:57:31 2788560 2788479 rsession NA 219.8MB 1.38GB
#> 7 1 2025-07-09 13:57:31 2788479 2788477 rsession-… NA 3MB 4.4MB
#> 8 1 2025-07-09 13:57:31 2766449 2766360 R NA 58.1MB 563MB
#> 9 1 2025-07-09 13:57:31 2766448 2766360 R NA 58.2MB 563MB
#> 10 1 2025-07-09 13:57:31 2766442 2766360 R NA 99.5MB 678.14MB
#> # ℹ 122 more rows
# to snapshot memory and CPU information more (or less) often, set `interval`
syrup(Sys.sleep(1), interval = .01)
#> # A tibble: 198 × 8
#> id time pid ppid name pct_cpu rss vms
#> <dbl> <dttm> <int> <int> <chr> <dbl> <bch:b> <bch:by>
#> 1 1 2025-07-09 13:57:33 2825985 2818763 R NA 207MB 926.83MB
#> 2 1 2025-07-09 13:57:33 2825787 2825448 R NA 267MB 935.19MB
#> 3 1 2025-07-09 13:57:33 2825448 2825447 R NA 83MB 666.22MB
#> 4 1 2025-07-09 13:57:33 2818763 2814503 R NA 292.9MB 1.21GB
#> 5 1 2025-07-09 13:57:33 2814503 2814502 R NA 119.2MB 702.88MB
#> 6 1 2025-07-09 13:57:33 2788560 2788479 rsession NA 219.8MB 1.38GB
#> 7 1 2025-07-09 13:57:33 2788479 2788477 rsession-… NA 3MB 4.4MB
#> 8 1 2025-07-09 13:57:33 2766449 2766360 R NA 58.1MB 563MB
#> 9 1 2025-07-09 13:57:33 2766448 2766360 R NA 58.2MB 563MB
#> 10 1 2025-07-09 13:57:33 2766442 2766360 R NA 99.5MB 678.14MB
#> # ℹ 188 more rows
# use `peak = TRUE` to return only the snapshot with
# the highest memory usage (as `sum(rss)`)
syrup(Sys.sleep(1), interval = .01, peak = TRUE)
#> # A tibble: 65 × 8
#> id time pid ppid name pct_cpu rss vms
#> <dbl> <dttm> <int> <int> <chr> <dbl> <bch:b> <bch:by>
#> 1 2 2025-07-09 13:57:35 2825787 2825448 R NA 267.8MB 936.1MB
#> 2 2 2025-07-09 13:57:35 2825448 2825447 R NA 83.9MB 667.12MB
#> 3 2 2025-07-09 13:57:35 2818763 2814503 R NA 312MB 1.24GB
#> 4 2 2025-07-09 13:57:35 2814503 2814502 R NA 119.2MB 702.88MB
#> 5 2 2025-07-09 13:57:35 2788560 2788479 rsession NA 219.8MB 1.38GB
#> 6 2 2025-07-09 13:57:35 2788479 2788477 rsession-… NA 3MB 4.4MB
#> 7 2 2025-07-09 13:57:35 2766449 2766360 R NA 58.1MB 563MB
#> 8 2 2025-07-09 13:57:35 2766448 2766360 R NA 58.2MB 563MB
#> 9 2 2025-07-09 13:57:35 2766442 2766360 R NA 99.5MB 678.14MB
#> 10 2 2025-07-09 13:57:35 2766441 2766360 R NA 83.5MB 661.89MB
#> # ℹ 55 more rows
# results from syrup are more---or maybe only---useful when
# computations are evaluated in parallel. see package README
# for an example.