Extraction rules

wuchale does not lock you into its predefined extraction rules; they are only defaults. What gets extracted is decided by a heuristic function that you can customize. Because the heuristic is implemented by the adapters, you can configure it on a per-adapter basis. Sensible default heuristic functions are provided out of the box, and they implement the following rules (where applicable).

In all cases, if the message contains no letters used in any natural language (e.g., just numbers or symbols), it is ignored.

In markup, all textual content is extracted.

Examples:

<p>This is extracted</p>

In attributes:

  • If the first character is a lowercase English letter ([a-z]), it is ignored.
  • If the element is a <path>, it is ignored (e.g., for SVG d="M10 10..." attributes).
  • Otherwise, it is extracted.

Examples:

<img alt="Profile Picture" class="not-extracted" />

In scripts:

  • If the value is inside a console.*() call, it is ignored.
  • If the first character is a lowercase English letter ([a-z]) or any non-letter character, it is ignored.
  • Otherwise, it is extracted.

Examples:

const message = 'This is extracted'
function foo() {
    // extracted
    const extracted = 'Hello!'
    const nonExtracted = '-starts with non letter'
}
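
To make the rules above concrete, here is a minimal sketch of how they could be expressed as a single function. It is an illustration only, not wuchale's actual implementation: the function name `sketchHeuristic` and the `details` fields `scope`, `element` and `insideConsoleCall` are hypothetical names chosen for this example.

```javascript
// Illustrative sketch only: NOT wuchale's real default heuristic.
// `scope`, `element` and `insideConsoleCall` are hypothetical field names.

// General rule: the message must contain at least one letter of any script.
const hasLetter = (msg) => /\p{L}/u.test(msg)

function sketchHeuristic(msg, details) {
    if (!hasLetter(msg)) {
        return false // only numbers or symbols
    }
    if (details.scope === 'markup') {
        return true // all textual content is extracted
    }
    const startsLowercase = /^[a-z]/.test(msg)
    if (details.scope === 'attribute') {
        // lowercase-first values and anything on <path> are ignored
        return !startsLowercase && details.element !== 'path'
    }
    // script scope
    if (details.insideConsoleCall) {
        return false
    }
    const startsWithLetter = /^\p{L}/u.test(msg)
    return startsWithLetter && !startsLowercase
}
```

Under these assumptions, `sketchHeuristic('Profile Picture', { scope: 'attribute', element: 'img' })` returns `true`, while the same call with `'not-extracted'` returns `false`.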

In addition to the rules above, an adapter may apply further restrictions so that its default rules work as well as possible. These are described on each adapter’s page.

If you need more control, you can supply your own heuristic function in the configuration. Custom heuristics can return undefined or null to fall back to the default. For convenience, the default heuristic is exported by the package.

With that, it is easy to handle just one case or a few cases and let the default heuristic handle the rest. For example, if we want to ignore all title attributes that contain the + character, the custom heuristic would be:

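wuchale.config.js
//...
adapters: {
    main: svelte({
        heuristic: (msg, details) => {
            // ignore title attributes that contain '+'
            if (details.attribute === 'title' && msg.includes('+')) {
                return false
            }
            // nothing returned (undefined): fall back to the default heuristic
        }
    })
},
//...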

For every other message the function returns undefined, so the remaining checks are handled by the default heuristic.
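
The fallback behaviour can be tried in isolation with a small sketch. The custom heuristic below is the one from the configuration above; `standInDefault` and `decide` are hypothetical helpers written for this example and only mimic the described "undefined or null means use the default" behaviour. They are not part of wuchale.

```javascript
// The custom heuristic from the configuration, as a standalone function.
const customHeuristic = (msg, details) => {
    if (details.attribute === 'title' && msg.includes('+')) {
        return false
    }
    // implicit undefined: no decision, defer to the default
}

// Hypothetical stand-in for the default heuristic (not wuchale's real one):
// it extracts anything that starts with an uppercase English letter.
const standInDefault = (msg, details) => /^[A-Z]/.test(msg)

// Hypothetical helper mirroring the described fallback:
// undefined or null from the custom heuristic means "ask the default".
function decide(msg, details) {
    const verdict = customHeuristic(msg, details)
    return verdict ?? standInDefault(msg, details)
}

console.log(decide('Ctrl+S', { attribute: 'title' }))
console.log(decide('Save file', { attribute: 'title' }))
```

Using `??` rather than `||` matters here: an explicit `false` from the custom heuristic is a real decision and must not be overridden by the default.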

If you don’t want to modify the global heuristic but want to ignore or include just a few messages, you can use comment directives.

  • @wc-ignore — skips extraction
  • @wc-include — forces extraction
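
As a rough illustration of the directives in a script, the sketch below assumes that a directive is written in a comment placed directly before the message it affects. That placement is an assumption made for this example, and the variable names are hypothetical.

```javascript
// Starts with an uppercase letter, so the default rules would extract it;
// the directive below skips it. (Placement is an assumption.)
// @wc-ignore
const brandName = 'Wuchale'

// Starts with a lowercase letter, so the default rules would ignore it;
// the directive below forces extraction. (Placement is an assumption.)
// @wc-include
const remaining = 'items left'
```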