Writing footprint

Writing footprint is separate from templates: it describes style and variation in prose, not repeating log lines or JSON keys. It is computed only for text-like content (plain text, Markdown, HTML body, EPUB body), and only after template mining has run.

It does not appear for logs, JSON template runs, or metadata-only formats.

Where it appears in JSON

| Location | When |
| --- | --- |
| writing_footprint on MiningResult | Templates-only path for text-like files |
| Top-level writing_footprint on Output | Same metrics, also in templates-only export |
| html_metadata | HTML-specific counts plus footprint from body text |
| epub_metadata | EPUB bibliographic fields plus footprint from spine body |
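
As a rough illustration of the lookup order implied by the table, the sketch below searches a parsed output document for the footprint. The helper name `find_writing_footprint` is hypothetical, and the assumption that `html_metadata` and `epub_metadata` carry the footprint under a nested `writing_footprint` key is a guess, not something the table confirms.

```python
from typing import Any, Dict, Optional


def find_writing_footprint(output: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first writing footprint found in a parsed output dict.

    Hypothetical helper: the nested key name inside html_metadata and
    epub_metadata is an assumption made for this sketch.
    """
    # Top-level footprint on the output object.
    if isinstance(output.get("writing_footprint"), dict):
        return output["writing_footprint"]
    # Format-specific metadata blocks (assumed nesting).
    for key in ("html_metadata", "epub_metadata"):
        meta = output.get(key)
        if isinstance(meta, dict) and isinstance(meta.get("writing_footprint"), dict):
            return meta["writing_footprint"]
    # Logs, JSON template runs, and metadata-only formats have no footprint.
    return None
```

Returning `None` rather than raising keeps callers simple for the formats where the footprint is absent by design.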

UBLX shows this in the Writing pane (table layout, same family as Metadata).

How it is derived

Writing metrics are computed from the same sentences used for templates:

  1. Exact-pattern pass — group sentences by shared phrases / n-grams; build templates and example diversity.
  2. Shape fallback — if no templates emerge, group by sentence length and ending punctuation (period, ?, !).
  3. Aggregate — vocabulary, punctuation, template entropy, and optional SVO-style pivot analysis over template examples.
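
The shape fallback in step 2 can be sketched as follows. This is a minimal illustration, not the analyzer's actual code: the sentence splitter is deliberately naive, and the function names and the `bucket` width used to group sentence lengths are assumptions made for the example.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break after ., ?, or ! followed by whitespace.
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    return [p for p in parts if p]


def shape_key(sentence: str, bucket: int = 5) -> Tuple[int, str]:
    # Shape = (word count rounded down to a multiple of `bucket`, ending mark).
    # The bucket width is a hypothetical parameter, not taken from the source.
    words = sentence.split()
    ending = sentence.rstrip()[-1:]
    if ending not in {".", "?", "!"}:
        ending = ""
    return (len(words) // bucket * bucket, ending)


def group_by_shape(sentences: List[str]) -> Dict[Tuple[int, str], List[str]]:
    # Sentences that share a shape key fall into the same group.
    groups: Dict[Tuple[int, str], List[str]] = defaultdict(list)
    for sentence in sentences:
        groups[shape_key(sentence)].append(sentence)
    return dict(groups)
```

For example, `group_by_shape(split_sentences("The cat sat. Is it late? Run now! The dog ran."))` puts the two short period-terminated sentences in one group and the question and exclamation in groups of their own.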

Logs use line templates only; log_metadata covers line endings and timestamp heuristics instead.

WritingFootprint fields

```json
{
  "vocabulary_richness": 0.42,
  "avg_sentence_length": 18.3,
  "template_diversity": 27,
  "avg_entropy": 0.61,
  "punctuation": {
    "period_percent": 72.5,
    "question_percent": 4.1,
    "exclamation_percent": 0.8,
    "dialogue_percent": 12.0,
    "avg_commas_per_sentence": 1.4
  },
  "svo_analysis": {
    "svo_structure_percent": 38.0,
    "avg_subject_length": 5.2,
    "avg_object_length": 8.1,
    "common_pivots": ["is", "was", "has"]
  }
}
```

| Field | Meaning |
| --- | --- |
| vocabulary_richness | Unique words ÷ total words (0.0–1.0) |
| avg_sentence_length | Mean words per sentence |
| template_diversity | Count of distinct template patterns |
| avg_entropy | Mean placeholder variation across templates (0.0–1.0; higher = more varied wording in slots) |
| punctuation | Sentence-ending and dialogue punctuation profile (see below) |
| svo_analysis | Optional inferred subject–verb–object structure from template pivots |
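
The first two fields follow directly from their definitions, as the sketch below shows. The word tokenizer is an assumption. The `slot_entropy` function is more speculative: this page does not state how `avg_entropy` is computed, so the normalized Shannon entropy used here is only one plausible way to get a 0.0–1.0 score that rises with more varied slot fillers.

```python
import math
import re
from collections import Counter
from typing import List


def vocabulary_richness(text: str) -> float:
    # Unique words divided by total words; the tokenizer is an assumption.
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


def avg_sentence_length(sentences: List[str]) -> float:
    # Mean number of whitespace-separated words per sentence.
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def slot_entropy(values: List[str]) -> float:
    # ASSUMPTION: Shannon entropy of the values seen in one placeholder slot,
    # divided by log2(number of values) so the result lies in [0.0, 1.0].
    counts = Counter(values)
    if len(counts) <= 1:
        return 0.0  # zero or one distinct filler means no variation
    total = len(values)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(total)
```

Under this assumed definition, a slot that always holds the same word scores 0.0 and a slot whose fillers are all different scores 1.0; averaging the per-slot scores across templates would give a figure comparable to `avg_entropy`.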

punctuation object

| Field | Meaning |
| --- | --- |
| period_percent | Share of sentences ending with . |
| question_percent | Share ending with ? |
| exclamation_percent | Share ending with ! |
| dialogue_percent | Share containing quotes |
| avg_commas_per_sentence | Mean commas per sentence |
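
A minimal sketch of how such a profile could be computed from a list of sentences is shown below. It is illustrative only: which characters count as quotes, and the choice to ignore a closing quote when checking the sentence ending, are assumptions rather than documented behavior.

```python
from typing import Dict, List

# Assumed quote characters: straight and curly double quotes.
QUOTE_MARKS = ('"', "\u201c", "\u201d")
# Assumed closing characters to skip before inspecting the ending mark.
CLOSING_QUOTES = "\"'\u201d\u2019"


def punctuation_profile(sentences: List[str]) -> Dict[str, float]:
    keys = (
        "period_percent",
        "question_percent",
        "exclamation_percent",
        "dialogue_percent",
        "avg_commas_per_sentence",
    )
    n = len(sentences)
    if n == 0:
        return dict.fromkeys(keys, 0.0)

    # Drop trailing whitespace and closing quotes so that a sentence such as
    # 'She said, "Go now."' still counts as ending with a period.
    bare = [s.rstrip().rstrip(CLOSING_QUOTES) for s in sentences]

    def percent(count: int) -> float:
        return 100.0 * count / n

    return {
        "period_percent": percent(sum(s.endswith(".") for s in bare)),
        "question_percent": percent(sum(s.endswith("?") for s in bare)),
        "exclamation_percent": percent(sum(s.endswith("!") for s in bare)),
        "dialogue_percent": percent(
            sum(any(q in s for q in QUOTE_MARKS) for s in sentences)
        ),
        "avg_commas_per_sentence": sum(s.count(",") for s in sentences) / n,
    }
```

The three ending shares need not sum to 100, since sentences ending in other characters (a colon, an ellipsis character, no mark at all) fall outside all three buckets.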

svo_analysis object (optional)

Present when the analyzer finds enough pivot structure in template examples:

| Field | Meaning |
| --- | --- |
| svo_structure_percent | Share of templates with pivot-like structure |
| avg_subject_length | Mean words before pivot |
| avg_object_length | Mean words after pivot |
| common_pivots | Frequent pivot tokens (often verbs or structural words) |
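
To make the pivot idea concrete, the sketch below splits each example at the first known pivot word and summarizes the results. It is a simplified stand-in: the fixed `PIVOTS` set, the function names, and the choice to compute the share over individual examples rather than templates are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical pivot vocabulary; the real analyzer infers pivots from templates.
PIVOTS = {"is", "was", "has", "are", "were", "have"}


def pivot_split(sentence: str) -> Optional[Tuple[List[str], str, List[str]]]:
    # Split at the first pivot that has at least one word on each side.
    words = sentence.rstrip(".?!").split()
    for i, word in enumerate(words):
        if word.lower() in PIVOTS and 0 < i < len(words) - 1:
            return words[:i], word.lower(), words[i + 1:]
    return None


def svo_summary(examples: List[str]) -> Optional[Dict[str, object]]:
    hits = [split for split in map(pivot_split, examples) if split is not None]
    if not hits:
        return None  # mirrors the optional svo_analysis object
    pivots = Counter(pivot for _, pivot, _ in hits)
    return {
        # Share computed over examples here (assumption), not over templates.
        "svo_structure_percent": 100.0 * len(hits) / len(examples),
        "avg_subject_length": sum(len(subj) for subj, _, _ in hits) / len(hits),
        "avg_object_length": sum(len(obj) for _, _, obj in hits) / len(hits),
        "common_pivots": [pivot for pivot, _ in pivots.most_common(3)],
    }
```

Given `["The old house was very quiet.", "Her answer is correct.", "Run!"]`, two of the three examples split around a pivot, with an average of 2.5 words before it and 1.5 after.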

Related: Template mining overview, Metadata extraction.
