Wraps strings to a specified width accounting for Control Sequences.
strwrap_ctl is intended to emulate strwrap closely except with respect to
the Control Sequences (see details for other minor differences), while
strwrap2_ctl adds features and changes the processing of whitespace.
strwrap_ctl is faster than strwrap.
Usage
strwrap_ctl(
x,
width = 0.9 * getOption("width"),
indent = 0,
exdent = 0,
prefix = "",
simplify = TRUE,
initial = prefix,
warn = getOption("fansi.warn", TRUE),
term.cap = getOption("fansi.term.cap", dflt_term_cap()),
ctl = "all",
normalize = getOption("fansi.normalize", FALSE),
carry = getOption("fansi.carry", FALSE),
terminate = getOption("fansi.terminate", TRUE)
)
strwrap2_ctl(
x,
width = 0.9 * getOption("width"),
indent = 0,
exdent = 0,
prefix = "",
simplify = TRUE,
initial = prefix,
wrap.always = FALSE,
pad.end = "",
strip.spaces = !tabs.as.spaces,
tabs.as.spaces = getOption("fansi.tabs.as.spaces", FALSE),
tab.stops = getOption("fansi.tab.stops", 8L),
warn = getOption("fansi.warn", TRUE),
term.cap = getOption("fansi.term.cap", dflt_term_cap()),
ctl = "all",
normalize = getOption("fansi.normalize", FALSE),
carry = getOption("fansi.carry", FALSE),
terminate = getOption("fansi.terminate", TRUE)
)
Arguments
- x
a character vector, or an object which can be converted to a character vector by as.character.
- width
a positive integer giving the target column for wrapping lines in the output.
- indent
a non-negative integer giving the indentation of the first line in a paragraph.
- exdent
a non-negative integer specifying the indentation of subsequent lines in paragraphs.
- prefix, initial
a character string to be used as prefix for each line except the first, for which initial is used.
- simplify
a logical. If TRUE, the result is a single character vector of line text; otherwise, it is a list of the same length as x, the elements of which are character vectors of line text obtained from the corresponding element of x. (Hence, the result in the former case is obtained by unlisting that of the latter.)
- warn
TRUE (default) or FALSE, whether to warn when potentially problematic Control Sequences are encountered. These could cause the assumptions fansi makes about how strings are rendered on your display to be incorrect, for example by moving the cursor (see ?fansi). At most one warning will be issued per element in each input vector. Will also warn about some badly encoded UTF-8 strings, but a lack of UTF-8 warnings is not a guarantee of correct encoding (use validUTF8 for that).
- term.cap
a character vector of the capabilities of the terminal; can be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with "38;2" or "48;2"), and "all". "all" behaves as it does for the ctl parameter: "all" combined with any other value means all terminal capabilities except that one. fansi will warn if it encounters SGR codes that exceed the terminal capabilities specified (see term_cap_test for details). In versions prior to 1.0, fansi would also skip exceeding SGRs entirely instead of interpreting them. You may add the string "old" to any otherwise valid term.cap spec to restore the pre-1.0 behavior. "old" will not interact with "all" the way other valid values for this parameter do.
- ctl
character, which Control Sequences should be treated specially. Special treatment is context dependent, and may include detecting them and/or computing their display/character width as zero. For the SGR subset of the ANSI CSI sequences, and OSC hyperlinks, fansi will also parse, interpret, and reapply the sequences as needed. You can modify whether a Control Sequence is treated specially with the ctl parameter.
"nl": newlines.
"c0": all other "C0" control characters (i.e. 0x01-0x1f, 0x7F), except for newlines and the actual ESC (0x1B) character.
"sgr": ANSI CSI SGR sequences.
"csi": all non-SGR ANSI CSI sequences.
"url": OSC hyperlinks
"osc": all non-OSC-hyperlink OSC sequences.
"esc": all other escape sequences.
"all": all of the above, except when used in combination with any of the above, in which case it means "all but".
- normalize
TRUE or FALSE (default), whether SGR sequences should be normalized such that there is one distinct sequence for each SGR code. Normalized strings will occupy more space (e.g. "\033[31;42m" becomes "\033[31m\033[42m"), but will work better with code that assumes each SGR code will be in its own escape, as crayon does.
- carry
TRUE, FALSE (default), or a scalar string, controls whether to interpret the character vector as a "single document" (TRUE or string) or as independent elements (FALSE). In "single document" mode, active state at the end of an input element is considered active at the beginning of the next vector element, simulating what happens with a document with active state at the end of a line. If FALSE, each vector element is interpreted as if there were no active state when it begins. If character, then the active state at the end of the carry string is carried into the first element of x (see "Replacement Functions" for differences there). The carried state is injected in the interstice between an imaginary zeroth character and the first character of a vector element. See the "Position Semantics" section of substr_ctl and the "State Interactions" section of ?fansi for details. Except for strwrap_ctl, where NA is treated as the string "NA", carry will cause NAs in inputs to propagate through the remaining vector elements.
- terminate
TRUE (default) or FALSE, whether substrings should have active state closed to avoid it bleeding into other strings they may be prepended onto. This does not stop state from carrying if carry = TRUE. See the "State Interactions" section of ?fansi for details.
- wrap.always
TRUE or FALSE (default), whether to hard wrap at the requested width if no word breaks are detected within a line. If set to TRUE then width must be at least 2.
- pad.end
character(1L), a single character to use as padding at the end of each line until the line is width wide. This must be a printable ASCII character or an empty string (default). If you set it to an empty string the line remains unpadded.
- strip.spaces
TRUE (default) or FALSE, if TRUE, extraneous white spaces (spaces, newlines, tabs) are removed in the same way as base::strwrap does. When FALSE, whitespaces are preserved, except for newlines as those are implicit boundaries between output vector elements.
- tabs.as.spaces
FALSE (default) or TRUE, whether to convert tabs to spaces. This can only be set to TRUE if strip.spaces is FALSE.
- tab.stops
integer(1:n) indicating position of tab stops to use when converting tabs to spaces. If there are more tabs in a line than defined tab stops the last tab stop is re-used. For the purposes of applying tab stops, each input line is considered a line and the character count begins from the beginning of the input line.
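The "all but" semantics of the ctl parameter can be sketched with strip_ctl, another fansi function that accepts the same ctl values. This is only an illustration on an arbitrary sample string, not part of the original examples; the variable names are placeholders.

```r
library(fansi)

# Sample input (an assumption for illustration): a newline plus SGR color codes
x <- "one\n\033[31mtwo\033[39m"

# ctl = "all" (the default): every Control Sequence is treated specially,
# so the newline is removed along with the SGR sequences
all.ctl <- strip_ctl(x)

# "all" combined with "nl" means "all but newlines"
all.but.nl <- strip_ctl(x, ctl = c("all", "nl"))

# Only SGR sequences are treated specially
sgr.only <- strip_ctl(x, ctl = "sgr")

print(list(all.ctl, all.but.nl, sgr.only))
```

The same ctl vectors can be passed to strwrap_ctl and strwrap2_ctl, where "special treatment" affects width computation and state handling rather than removal.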
Details
strwrap2_ctl can convert tabs to spaces, pad strings up to width, and
hard-break words if single words are wider than width.
Unlike base::strwrap, both these functions will translate any non-ASCII
strings to UTF-8 and return them in UTF-8. Additionally, invalid UTF-8
always causes errors, and prefix and indent must be scalar.
When replacing tabs with spaces the tabs are computed relative to the
beginning of the input line, not the most recent wrap point.
Additionally, indent, exdent, initial, and prefix will be ignored when
computing tab positions.
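As a minimal sketch of the hard-break behavior mentioned above (the sample string and width are arbitrary choices, not taken from the package examples), the following checks that no wrapped piece is wider than the requested width and that no visible characters are lost:

```r
library(fansi)

# A single 26-character "word" with no spaces, colored red
long <- "\033[31mabcdefghijklmnopqrstuvwxyz\033[39m"

# wrap.always = TRUE hard-breaks words that do not fit within the width
pieces <- strwrap2_ctl(long, 10, wrap.always = TRUE)

# Character count of each piece, ignoring Control Sequences
widths <- nchar_ctl(pieces)

# Visible text with Control Sequences removed, joined back together
rejoined <- paste(strip_ctl(pieces), collapse = "")

print(widths)
print(rejoined)
```

Checking widths with nchar_ctl rather than nchar matters here because each piece carries its own SGR sequences, which nchar would count as characters.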
Note
Non-ASCII strings are converted to and returned in UTF-8 encoding. Width calculations will not work properly in R < 3.2.2.
For the strwrap* functions the carry parameter affects whether
styles are carried across input vector elements. Styles always carry
within a single wrapped vector element (e.g. if one of the input elements
gets wrapped into three lines, the styles will carry through those three
lines even if carry=FALSE, but not across input vector elements).
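The difference can be sketched as follows, assuming a two-element input whose first element leaves a red foreground active (the input and variable names are illustrative only). has_ctl is used to detect whether the wrapped lines contain any Control Sequences:

```r
library(fansi)

# First element leaves the red foreground active; second has no styles
x <- c("\033[31mred text starts here", "and this is the next element")

# simplify = FALSE keeps the wrapped lines grouped by input element
indep <- strwrap_ctl(x, 15, simplify = FALSE, carry = FALSE)
doc   <- strwrap_ctl(x, 15, simplify = FALSE, carry = TRUE)

# Within the first element the style is carried across wrapped lines
within.first <- has_ctl(indep[[1]])

# Across elements the style is carried only when carry = TRUE
second.indep <- has_ctl(indep[[2]])
second.doc   <- has_ctl(doc[[2]])

print(list(within.first, second.indep, second.doc))
```

Per the description above, every line of the first element should carry the style in both cases, while lines of the second element should carry it only with carry = TRUE.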
Control and Special Sequences
Control Sequences are non-printing characters or sequences of characters.
Special Sequences are a subset of the Control Sequences, and include CSI
SGR sequences which can be used to change rendered appearance of text, and
OSC hyperlinks. See fansi for details.
Graphemes
fansi approximates grapheme widths and counts by using heuristics for
grapheme breaks that work for most common graphemes, including emoji
combining sequences. The heuristic is known to work incorrectly with
invalid combining sequences, prepending marks, and sequence interruptors.
The utf8 package provides a
conforming grapheme parsing implementation.
Output Stability
Several factors could affect the exact output produced by fansi
functions across versions of fansi, R, and/or across systems.
In general it is best not to rely on exact fansi output, e.g. by
embedding it in tests.
Width and grapheme calculations depend on the Unicode database version (see
fansi_unicode_version) and grapheme processing logic, among other
things (see "Graphemes"). Individual character widths are intended to match
R 4.5.1 definitions in an English locale, except for differences introduced by
Unicode Database Version updates and grapheme processing.
How a particular display format is encoded in Control Sequences is
not guaranteed to be stable across fansi versions. Additionally, which
Special Sequences are re-encoded vs transcribed untouched may change.
In general we will strive to keep the rendered appearance stable.
To maximize the odds of getting stable output set normalize_state to
TRUE and type to "chars" in functions that allow it, and
set term.cap to a specific set of capabilities.
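One way to follow this advice in tests, sketched here under the assumption that only the visible text matters to the caller, is to compare the output after removing Control Sequences instead of matching the exact escape encoding. The term.cap value shown is an arbitrary explicit choice:

```r
library(fansi)

x <- "hello \033[41mred\033[49m world"

# Pin the terminal capabilities and normalize SGR output explicitly
res <- strwrap_ctl(x, 12, term.cap = c("bright", "256"), normalize = TRUE)

# Compare visible text only; how the styles are encoded may vary by version
visible <- strip_ctl(res)
expected <- strwrap(strip_ctl(x), 12)

print(identical(visible, expected))
```

This trades away checking the styling itself, so it suits tests that care about line breaks rather than rendered appearance.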
Bidirectional Text
fansi is unaware of text directionality and operates as if all strings are
left to right (LTR). Using fansi functions with strings that contain mixed
direction scripts (i.e. both LTR and RTL) may produce undesirable results.
See also
?fansi for details on how Control Sequences are
interpreted, particularly if you are getting unexpected results,
normalize_state for more details on what the normalize parameter does,
state_at_end to compute active state at the end of strings,
close_state to compute the sequence required to close active state.
Examples
hello.1 <- "hello \033[41mred\033[49m world"
hello.2 <- "hello\t\033[41mred\033[49m\tworld"
strwrap_ctl(hello.1, 12)
#> [1] "hello \033[41mred\033[0m" "world"
strwrap_ctl(hello.2, 12)
#> [1] "hello \033[41mred\033[0m" "world"
## In default mode strwrap2_ctl is the same as strwrap_ctl
strwrap2_ctl(hello.2, 12)
#> [1] "hello \033[41mred\033[0m" "world"
## But you can leave whitespace unchanged; `warn` is
## set to FALSE as otherwise the tabs cause a warning
strwrap2_ctl(hello.2, 12, strip.spaces=FALSE, warn=FALSE)
#> [1] "hello\t\033[41mred\033[49m\t" "world"
## And convert tabs to spaces
strwrap2_ctl(hello.2, 12, tabs.as.spaces=TRUE)
#> [1] "hello \033[41mred\033[49m" " world"
## If your display has 8 wide tab stops the following two
## outputs should look the same
writeLines(strwrap2_ctl(hello.2, 80, tabs.as.spaces=TRUE))
#> hello red world
writeLines(hello.2)
#> hello red world
## tab stops are NOT auto-detected, but you may provide
## your own
strwrap2_ctl(hello.2, 12, tabs.as.spaces=TRUE, tab.stops=c(6, 12))
#> [1] "hello \033[41mred\033[49m " " "
#> [3] "world"
## You can also force padding at the end to equal width
writeLines(strwrap2_ctl("hello how are you today", 10, pad.end="."))
#> hello how
#> are you..
#> today....
## And a more involved example where we read the
## NEWS file, color it line by line, wrap it to
## 25 width and display some of it in 3 columns
## (works best on displays that support 256 color
## SGR sequences)
NEWS <- readLines(file.path(R.home('doc'), 'NEWS'))
NEWS.C <- fansi_lines(NEWS, step=2) # color each line
W <- strwrap2_ctl(NEWS.C, 25, pad.end=" ", wrap.always=TRUE)
writeLines(c("", paste(W[1:20], W[100:120], W[200:220]), ""))
#>
#> R News returns NaN but base_rdxrefs_db().
#> the correct values (0 or
#> CHANGES IN R 4.5.1: 1, or their logs for * It is now possible to
#> log.p = TRUE). set the background color
#> NEW FEATURES: This improves Mathlib's for row and column
#> C level bratio() and names in the data editor
#> * The internal method of hence also on Windows (Rgui).
#> unzip() now follows pnbinom(), etc..
#> unzip 6.00 in how it * Rterm on Windows now
#> handles extracted file CHANGES IN R 4.5.0: accepts input lines of
#> paths which contain unlimited length.
#> "../". With thanks to NEW FEATURES:
#> Ivan Krylov. * file.info() on Windows
#> * as.integer(rl) and now provides file owner
#> INSTALLATION: hence as.raw(rl) now name and domain.
#> work for a list of
#> * Standalone nmath can raw(1) * Sys.info() on Windows
#> be built with early-2025 elements, as proposed by now provides current
#> versions of Michael Chirico's user domain.
#> clang-based compilers PR#18696.
#> R News * findInterval() gets
#>