---
title: "Getting Started with mighty.component"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with mighty.component}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(mighty.component)
```

## What is a mighty component?

Components let you write a data transformation once and reuse it across studies
by swapping variable names at render time. Instead of copying and modifying code
for each new study, you maintain a single template.

A mighty component is a reusable code template for a single, well-defined data
transformation step. Components are commonly used to generate ADaM (Analysis
Dataset Model) programs, but the concept is general: any parameterized R code
snippet that reads a data set, modifies it, and writes it back can be expressed
as a component.

Components combine two ideas:

- **Mustache templating** — placeholders like `{{{ domain }}}` are filled in at
  render time, so the same logic works across different data sets and variables.
- **Roxygen-like documentation** — tags like `@title`, `@param`, and `@depends`
  describe what the component does, what it needs, and what it produces.

Think of components as reusable building blocks: each one handles a single
derivation or transformation, and you compose several of them to build a
complete program.

In the broader mighty ecosystem, `mighty.metadata` provides study-level
configuration (via [`mighty_study()`](https://novonordisk-opensource.github.io/mighty.metadata/reference/mighty_study.html) and `_study.yml`) that can drive which
components are rendered and with what parameters.

## Anatomy of a component template

Below is a minimal component that doubles a column. Every tag is visible at a
glance:

```r
#' @title Double a variable
#' @description
#' Creates a new column that is twice the value of an existing column.
#'
#' @param domain `character` Name of the domain (data frame)
#' @param input `character` Name of the existing column to double
#' @param output `character` Name of the new column to create
#' @type column
#' @origin Derived
#' @depends {{{domain}}} {{{input}}}
#' @outputs {{{output}}}
#' @code
{{{domain}}} <- {{{domain}}} |>
  dplyr::mutate(
    {{{output}}} = 2 * {{{input}}}
  )
```

### Tags reference

| Tag | Purpose |
|-----|---------|
| `@title` | One-line title (required) |
| `@description` | Multi-line description (required) |
| `@param name description` | Declares a Mustache placeholder the user must provide in metadata specifications |
| `@type` | Component type: `column`, `row`, `parameter`, or `internal` |
| `@origin` | CDISC origin (optional): `Assigned`, `Collected`, `Derived`, `Not Available`, `Other`, `Predecessor`, or `Protocol` |
| `@depends domain column` | Declares that the code reads `column` from `domain` (repeat for each) |
| `@outputs variable` | Declares a column the code creates (repeat for each) |
| `@code` | Everything below this tag is executable R code |
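
To make the split between documentation tags and executable code concrete, the following base R sketch cuts a template at the `@code` line and lists the tag names found above it. It is an illustration only and not the parser that mighty.component uses; the shortened template and all variable names are hypothetical.

```r
# A shortened template held as a character vector, one element per line
template <- c(
  "#' @title Double a variable",
  "#' @description",
  "#' Creates a new column that is twice the value of an existing column.",
  "#' @param domain `character` Name of the domain (data frame)",
  "#' @type column",
  "#' @code",
  "{{{domain}}} <- {{{domain}}} |>",
  "  dplyr::mutate(DOUBLED = 2 * AVAL)"
)

# Everything after the `@code` line is the code body
code_start <- grep("^#' @code", template)
header <- template[seq_len(code_start - 1)]
code <- template[-seq_len(code_start)]

# Tag names are the word that follows "#' @" in the header
tag_lines <- grep("^#' @", header, value = TRUE)
tags <- sub("^#' @(\\w+).*$", "\\1", tag_lines)

tags
code
```

The description text on the third line carries no tag of its own, which is why it does not appear in `tags`.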

### Mustache syntax

Components use [Mustache](https://mustache.github.io) — a simple, logic-less
templating language. Inside `@code`, `{{{ }}}` are Mustache placeholders, not R
syntax. They are text-replaced with concrete values before the code is parsed
as R. Rendering is done by the
[whisker](https://github.com/edwindj/whisker) R package.

The three patterns used in components are:

- **`{{ variable }}`**: replaced with the HTML-escaped value supplied at render
  time (for example, `"` becomes `&quot;`).
- **`{{{ variable }}}`**: unescaped replacement. Used when the value is literal
  R code (e.g., `{{{ value }}}` to insert `1`, `"text"`, or an expression).
- **`{{#list}}...{{/list}}`**: repeats its body once for each element of a
  vector parameter.

Because double braces HTML-escape their values and mighty renders R code rather
than HTML, we recommend using triple braces `{{{ }}}` in components.
See the [Mustache manual](https://mustache.github.io/mustache.5.html) for the
full syntax reference.
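
The three patterns can be tried directly with `whisker::whisker.render()`, outside of any component. The sketch below is a minimal illustration; the template strings and the `data` list are made up for this example and are not part of mighty.component.

```r
library(whisker)

# A value containing quotes, plus a vector to iterate over
data <- list(value = "\"text\"", vars = c("AVAL", "BASE"))

# Double braces HTML-escape the quotes; triple braces insert the value verbatim
escaped <- whisker.render("x <- {{ value }}", data)
verbatim <- whisker.render("x <- {{{ value }}}", data)

# A section repeats its body for each element; {{.}} is the current element
repeated <- whisker.render("{{#vars}}{{.}};{{/vars}}", data)

escaped   # x <- &quot;text&quot;
verbatim  # x <- "text"
repeated
```

The escaped result is no longer valid R, which is the practical reason for preferring triple braces in component code.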

### Conventions

1. The input data set is always called `{{{ domain }}}`.
2. The code must assign the result back to `{{{ domain }}}`.
3. Use explicit package namespaces (`dplyr::mutate()`, not `mutate()`).
4. Joins must always specify an explicit `by` argument — this is enforced by
   automatic validation (see [Automatic code validation]).

## Retrieve and inspect a component

List the example components shipped with the package:

```{r list-components}
path <- system.file("examples", package = "mighty.component")
list_components(path)
```

Retrieve one by file path:

```{r get-ady}
ady <- get_component(
  system.file("examples", "ady.mustache", package = "mighty.component")
)
ady
```

Access individual fields through the active bindings:

```{r inspect-fields}
ady$title
ady$type
ady$params
ady$depends
ady$outputs
ady$origin
```

## Render a component

Rendering fills in the Mustache placeholders with concrete values. The
`$render()` method takes parameters as named arguments and returns a
`mighty_component_rendered` object.

Rendering is textual substitution: mighty.component replaces placeholders with
the values you supply. Apart from the implicit-join check (see
[Automatic code validation]), it does not check whether the resulting code is
valid R or whether the referenced columns exist. Runtime correctness is your
responsibility; use `get_test_component()` (see [Testing components]) to verify
components against real data.

In the rendered output below, every `{{{ }}}` placeholder has been replaced by a
concrete name:

```{r render-ady}
ady_rendered <- ady$render(domain = "ADAE", variable = "ASTDY", date = "ASTDT")
ady_rendered
```

A convenience function combines retrieval and rendering in one step and returns
the same rendered component as above. Note that `get_rendered_component()`
takes parameters as a named `list`, unlike `$render()`, which takes them as
named arguments via `...`:

```{r shortcut}
get_rendered_component(
  system.file("examples", "ady.mustache", package = "mighty.component"),
  list(domain = "ADAE", variable = "ASTDY", date = "ASTDT")
)
```

If you omit a required parameter, you get an informative error:

```{r render-error, error=TRUE}
ady$render(domain = "ADAE")
```
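
Conceptually, a check like this compares the declared parameters with the supplied ones. The base R sketch below shows the idea with a set difference; it is not the package's implementation, and the parameter names are assumed from the `ady` render call used earlier.

```r
# Parameters the component is assumed to declare via @param
declared <- c("domain", "variable", "date")

# Parameters actually supplied by the caller
supplied <- list(domain = "ADAE")

# Anything declared but not supplied is missing
missing_params <- setdiff(declared, names(supplied))
if (length(missing_params) > 0) {
  message("Missing parameters: ", paste(missing_params, collapse = ", "))
}
```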

## Evaluate rendered code

Once rendered, call `$eval()` to execute the code in your current environment.
The component code contains an assignment (e.g., `ADAE <- ADAE |> ...`), and
`$eval()` evaluates that code in the calling environment via
`eval(envir = parent.frame())`. This means `$eval()` modifies the domain
variable in place — no assignment of the return value is needed.

```{r eval-setup}
ADAE <- pharmaverseadam::adae |>
  dplyr::select(USUBJID, ASTDT, TRTSDT)

names(ADAE)
```

The `ASTDY` column does not exist yet. Run the rendered component:

```{r eval-run}
ady_rendered$eval()
names(ADAE)
head(ADAE)
```

The calling environment is only the default. To run the rendered code
elsewhere, pass a different environment via the `envir` argument of `$eval()`.
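
The effect of the evaluation environment can be reproduced with base R alone. In the sketch below, `eval_code()` is a hypothetical stand-in for `$eval()`, and the study-day formula is only an illustrative snippet, not the code of the `ady` component.

```r
# Hypothetical stand-in for `$eval()`: parse code text and evaluate it in the
# caller's environment unless another environment is given
eval_code <- function(code, envir = parent.frame()) {
  eval(parse(text = code), envir = envir)
  invisible(NULL)
}

code <- "ADAE$ASTDY <- as.integer(ADAE$ASTDT - ADAE$TRTSDT) + 1L"
input <- data.frame(
  ASTDT = as.Date("2024-01-10"),
  TRTSDT = as.Date("2024-01-01")
)

# Default: the assignment lands in the environment that called eval_code()
ADAE <- input
eval_code(code)
names(ADAE)

# Explicit environment: only the copy inside `sandbox` is modified
sandbox <- new.env()
sandbox$ADAE <- input
eval_code(code, envir = sandbox)
names(sandbox$ADAE)
```

Evaluating in a separate environment is useful when you want to inspect the result without overwriting a data set of the same name in your workspace.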

If you want to save the rendered code to a script file instead of evaluating it
interactively, use `$stream(path)` to append the code to an R file:

```{r stream-example}
script_file <- tempfile(fileext = ".R")
ady_rendered$stream(script_file)
readLines(script_file)
```
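
Because the code is appended rather than overwritten, streaming several rendered components to the same file builds up a program step by step. The base R sketch below mimics that behaviour; `append_code()` is a hypothetical helper written for this illustration, not a function of the package.

```r
# Hypothetical helper that appends a code snippet to a script file
append_code <- function(code, path) {
  cat(code, "\n", file = path, append = TRUE, sep = "")
  invisible(path)
}

script <- tempfile(fileext = ".R")
append_code("x <- 2", script)      # first step
append_code("y <- x * 3", script)  # second step builds on the first

# Run the accumulated script in a fresh environment
env <- new.env()
source(script, local = env)
env$y  # 6
```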

## Writing a custom component

You can author your own components as `.mustache` files. Here is a realistic
example that derives a ratio of the current value to baseline (`R2BASE`) for a
lab parameter. Save the following template to a `.mustache` file:

```r
#' @title Ratio to baseline
#' @description
#' Derives the ratio of the analysis value to the baseline value.
#'
#' @param domain `character` Name of the domain
#' @param variable `character` Name of the new ratio variable
#' @type column
#' @origin Derived
#' @depends {{{domain}}} AVAL
#' @depends {{{domain}}} BASE
#' @outputs {{{variable}}}
#' @code
{{{domain}}} <- {{{domain}}} |>
  dplyr::mutate(
    {{{variable}}} = dplyr::if_else(BASE != 0, AVAL / BASE, NA_real_)
  )
```

```{r custom-write, include = FALSE}
r2base_file <- tempfile(fileext = ".mustache")
writeLines(c(
  "#' @title Ratio to baseline",
  "#' @description",
  "#' Derives the ratio of the analysis value to the baseline value.",
  "#'",
  "#' @param domain `character` Name of the domain",
  "#' @param variable `character` Name of the new ratio variable",
  "#' @type column",
  "#' @origin Derived",
  "#' @depends {{{domain}}} AVAL",
  "#' @depends {{{domain}}} BASE",
  "#' @outputs {{{variable}}}",
  "#' @code",
  "{{{domain}}} <- {{{domain}}} |>",
  "  dplyr::mutate(",
  "    {{{variable}}} = dplyr::if_else(BASE != 0, AVAL / BASE, NA_real_)",
  "  )"
), r2base_file)
```

Then load the file, render it, and run the result:

```{r custom-load}
r2base <- get_component(r2base_file)
r2base
```

```{r custom-render}
r2base_rendered <- r2base$render(
  domain = "ADLB",
  variable = "R2BASE"
)
r2base_rendered$code
```

```{r custom-eval}
ADLB <- pharmaverseadam::adlb |>
  dplyr::filter(PARAMCD == "ALB") |>
  dplyr::select(USUBJID, PARAMCD, AVISIT, AVAL, BASE)

head(ADLB)

r2base_rendered$eval()

ADLB |>
  dplyr::select(USUBJID, PARAMCD, AVISIT, AVAL, BASE, R2BASE) |>
  head()
```

## Automatic code validation

When a component is rendered, the generated code is automatically validated.
The package currently checks for **implicit joins** — any `dplyr::left_join()`,
`dplyr::inner_join()`, or similar call without an explicit `by` argument triggers
an error. This prevents a common source of bugs in clinical programming where
join columns change between studies.

Here is a component that fails validation:

```r
#' @title Bad join example
#' @description Implicit join that will fail validation.
#'
#' @param domain `character` domain name
#' @type row
#' @depends {{{domain}}} USUBJID
#' @outputs NEWCOL
#' @code
{{{domain}}} <- {{{domain}}} |>
  dplyr::left_join(other_data)
```

```{r validation-fail-setup, include = FALSE}
bad_template <- c(
  "#' @title Bad join example",
  "#' @description Implicit join that will fail validation.",
  "#'",
  "#' @param domain `character` domain name",
  "#' @type row",
  "#' @depends {{{domain}}} USUBJID",
  "#' @outputs NEWCOL",
  "#' @code",
  "{{{domain}}} <- {{{domain}}} |>",
  "  dplyr::left_join(other_data)"
)

bad_file <- tempfile(fileext = ".mustache")
writeLines(bad_template, bad_file)
```

```{r validation-fail, error=TRUE}
get_rendered_component(bad_file, list(domain = "ADAE"))
```

The fix is to specify the join key explicitly:

```r
#' @title Good join example
#' @description Explicit join that passes validation.
#'
#' @param domain `character` domain name
#' @type row
#' @depends {{{domain}}} USUBJID
#' @outputs NEWCOL
#' @code
{{{domain}}} <- {{{domain}}} |>
  dplyr::left_join(other_data, by = dplyr::join_by(USUBJID))
```

```{r validation-pass-setup, include = FALSE}
good_template <- c(
  "#' @title Good join example",
  "#' @description Explicit join that passes validation.",
  "#'",
  "#' @param domain `character` domain name",
  "#' @type row",
  "#' @depends {{{domain}}} USUBJID",
  "#' @outputs NEWCOL",
  "#' @code",
  "{{{domain}}} <- {{{domain}}} |>",
  "  dplyr::left_join(other_data, by = dplyr::join_by(USUBJID))"
)

good_file <- tempfile(fileext = ".mustache")
writeLines(good_template, good_file)
```

```{r validation-pass}
get_rendered_component(good_file, list(domain = "ADAE"))$code
```
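
To give an intuition for how such a check can work, the sketch below parses code text and walks the resulting expressions looking for join calls that have no `by` argument. It is a simplified illustration using base R only, not the validation code shipped in mighty.component; the function names and the list of join verbs are assumptions made for this example.

```r
# Join verbs to inspect (assumed list for illustration)
join_verbs <- c(
  "left_join", "right_join", "inner_join",
  "full_join", "semi_join", "anti_join"
)

# Extract a function name from `left_join` or `dplyr::left_join`
call_name <- function(fn) {
  if (is.symbol(fn)) return(as.character(fn))
  if (is.call(fn) && identical(fn[[1]], as.name("::"))) {
    return(as.character(fn[[3]]))
  }
  NA_character_
}

# Recursively collect join calls that lack a named `by` argument
find_implicit_joins <- function(expr) {
  found <- character()
  if (is.call(expr)) {
    name <- call_name(expr[[1]])
    if (!is.na(name) && name %in% join_verbs && !"by" %in% names(expr)) {
      found <- name
    }
    for (arg in as.list(expr)) {
      # Skip empty arguments such as the one in x[, 1]
      if (!missing(arg)) found <- c(found, find_implicit_joins(arg))
    }
  }
  found
}

check_code <- function(code) {
  exprs <- parse(text = code, keep.source = FALSE)
  unlist(lapply(exprs, find_implicit_joins))
}

check_code("ADAE <- ADAE |> dplyr::left_join(other_data)")
check_code(
  "ADAE <- ADAE |> dplyr::left_join(other_data, by = dplyr::join_by(USUBJID))"
)
```

The first call reports `"left_join"`, while the second returns an empty result because the `by` argument is present. Working on the parsed expression rather than on raw text means the native pipe and namespace prefixes do not need special handling.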

## Testing components

`get_test_component()` creates a component that runs in an **isolated R session**
with automatic code coverage tracking. This is useful both for interactive
exploration and for formal unit tests with `testthat`.

We set `check_coverage = FALSE` here because this code runs inside a vignette,
not inside a `test_that()` block. The default (`TRUE`) uses `withr::defer()` to
automatically verify coverage when a test finishes — use that default in your
actual tests.

```{r test-create}
ady_path <- system.file(
  "examples", "ady.mustache",
  package = "mighty.component"
)
ady_test <- get_test_component(
  component = ady_path,
  params = list(domain = "ADAE", variable = "ASTDY", date = "ASTDT"),
  check_coverage = FALSE # set TRUE in real tests
)
ady_test
```

Assign input data into the isolated session:

```{r test-assign}
ADAE_input <- pharmaverseadam::adae |>
  dplyr::select(USUBJID, ASTDT, TRTSDT)

ady_test$assign("ADAE", ADAE_input)
ady_test$ls()
```

Execute the component and retrieve the result:

```{r test-eval}
ady_test$eval()
ady_test$get("ADAE") |> head()
```

Check coverage — every line of the component code should have been executed:

```{r test-coverage}
# Normal print method
ady_test
# Percent coverage
ady_test$percent_coverage
# Line coverage in a data.frame
ady_test$line_coverage
```

When `check_coverage = TRUE` (the default), the package uses `withr::defer()`
to verify coverage automatically when the test object goes out of scope. If any
line was not executed, an error is raised. This integrates naturally with
`testthat` test files: create the test component inside a `test_that()` block,
assign data, evaluate, and assert on the results. Coverage checking then runs
when the test finishes.
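
Put together, a test file might look like the sketch below. It reuses only the calls shown earlier in this vignette, leaves `check_coverage` at its default, and assumes that `testthat`, `dplyr`, and `pharmaverseadam` are installed. The test description and the single expectation are illustrative.

```r
library(testthat)
library(mighty.component)

test_that("ady adds the requested study day column", {
  ady_path <- system.file(
    "examples", "ady.mustache",
    package = "mighty.component"
  )

  # check_coverage is left at its default (TRUE), so coverage is verified
  # when this test_that() block finishes
  ady_test <- get_test_component(
    component = ady_path,
    params = list(domain = "ADAE", variable = "ASTDY", date = "ASTDT")
  )

  # Same input columns as used earlier in this vignette
  ADAE_input <- pharmaverseadam::adae |>
    dplyr::select(USUBJID, ASTDT, TRTSDT)
  ady_test$assign("ADAE", ADAE_input)

  # Run the component in the isolated session and fetch the result
  ady_test$eval()
  result <- ady_test$get("ADAE")

  expect_true("ASTDY" %in% names(result))
})
```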