Metadata Enrichment

Overview

DocuDesk automatically enriches document metadata when documents are created or updated in OpenRegister. Enrichment includes language detection, keyword extraction, topic classification, document type standardization, and date normalization. All processing runs locally using heuristic algorithms.

Enrichment Pipeline

| Step | Description | Feature Toggle |
| --- | --- | --- |
| Language Detection | Detect nl/en via word frequency analysis | enable_language_detection |
| Keyword Extraction | Extract top 10 non-stop-word keywords | enable_keyword_extraction |
| Topic Classification | Classify as legal, financial, medical, or technical | enable_topic_classification |
| Document Type | Standardize types (doc -> word, xlsx -> spreadsheet) | Always on |
| Date Normalization | Normalize date fields to ISO 8601 | Always on |
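
To make the heuristics in the table more concrete, here is a minimal Python sketch of four of the steps. It is an illustration only, not DocuDesk's implementation: the stop-word lists, the type map beyond doc and xlsx, the accepted date formats, and all function names are assumptions chosen for the example.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical stop-word lists; the real word lists are not documented here.
STOP_WORDS = {
    "nl": {"de", "het", "een", "en", "van", "is", "dat", "op", "te", "in", "niet", "voor"},
    "en": {"the", "a", "an", "and", "of", "is", "that", "on", "to", "in", "not", "for"},
}

# Only doc -> word and xlsx -> spreadsheet come from the table; the rest is assumed.
TYPE_MAP = {"doc": "word", "docx": "word", "xls": "spreadsheet", "xlsx": "spreadsheet"}

# Assumed input formats, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d-%m-%Y", "%d/%m/%Y")


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def detect_language(text):
    """Return 'nl' or 'en' by counting stop-word hits, or None if nothing matches."""
    tokens = tokenize(text)
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOP_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_keywords(text, language, limit=10):
    """Return up to `limit` most frequent tokens that are not stop words."""
    stop = STOP_WORDS.get(language, set())
    counts = Counter(t for t in tokenize(text) if t not in stop and len(t) > 2)
    return [word for word, _ in counts.most_common(limit)]


def standardize_type(raw_type):
    """Map a file extension to a standard type name; unknown types pass through."""
    key = raw_type.lower().lstrip(".")
    return TYPE_MAP.get(key, key)


def normalize_date(value):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Counting stop-word hits is one simple form of word frequency analysis: it needs no model or network call, which fits the local, heuristic approach described in the overview.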

API

  • POST /apps/docudesk/api/metadata/enrich - Trigger enrichment for a document object

Request Body

{
  "objectId": "uuid-123",
  "register": "register-id",
  "schema": "schema-id",
  "objectData": { "text": "Document content..." }
}
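
As a rough illustration of calling the endpoint, the following Python sketch builds the POST request with the body shown above. The base URL, the optional Authorization header, and the helper name `build_enrich_request` are assumptions for the example; the authentication your instance requires, and the shape of the response, are not covered here.

```python
import json
import urllib.request

ENRICH_PATH = "/apps/docudesk/api/metadata/enrich"


def build_enrich_request(base_url, object_id, register, schema, object_data, auth_header=None):
    """Build (but do not send) a POST request for the enrichment endpoint."""
    payload = {
        "objectId": object_id,
        "register": register,
        "schema": schema,
        "objectData": object_data,
    }
    headers = {"Content-Type": "application/json"}
    if auth_header:
        # Assumption: the server accepts a standard Authorization header.
        headers["Authorization"] = auth_header
    return urllib.request.Request(
        base_url.rstrip("/") + ENRICH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical host name; replace with your own instance.
request = build_enrich_request(
    "https://cloud.example.com",
    "uuid-123",
    "register-id",
    "schema-id",
    {"text": "Document content..."},
)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before any network traffic occurs.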

Event-Driven Processing

Enrichment runs automatically via the DocuDeskEventListener when OpenRegister fires ObjectCreatedEvent or ObjectUpdatedEvent. Feature toggles in admin settings control which enrichments are active.
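
The interaction between events and feature toggles can be sketched as follows. This is a Python model of the behavior described above, not the actual DocuDeskEventListener code: the `EnrichmentSettings` class, the step names, and the `on_object_event` function are hypothetical, and only the event names and toggle names come from this page.

```python
from dataclasses import dataclass


@dataclass
class EnrichmentSettings:
    """Hypothetical container for the three admin feature toggles."""
    enable_language_detection: bool
    enable_keyword_extraction: bool
    enable_topic_classification: bool


# Steps gated by a toggle, in pipeline order (step names are illustrative).
TOGGLED_STEPS = {
    "language_detection": "enable_language_detection",
    "keyword_extraction": "enable_keyword_extraction",
    "topic_classification": "enable_topic_classification",
}

# Steps that run regardless of settings.
ALWAYS_ON_STEPS = ("document_type", "date_normalization")

HANDLED_EVENTS = ("ObjectCreatedEvent", "ObjectUpdatedEvent")


def on_object_event(event_name, settings):
    """Return the enrichment steps that would run for the given event."""
    if event_name not in HANDLED_EVENTS:
        return []
    enabled = [
        step for step, toggle in TOGGLED_STEPS.items() if getattr(settings, toggle)
    ]
    return enabled + list(ALWAYS_ON_STEPS)


settings = EnrichmentSettings(
    enable_language_detection=True,
    enable_keyword_extraction=False,
    enable_topic_classification=True,
)
steps = on_object_event("ObjectCreatedEvent", settings)
```

With keyword extraction switched off, `steps` contains the two remaining toggled steps followed by the two always-on steps.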