--- name: doc2math description: Convert narrative technical documents into grounded Mathematical Problem Specifications with variables, constraints, objectives, and uncertainty. risk: safe source: community date_added: "2026-05-31" --- # DOC2MATH — Document-to-Mathematics Problem Specification ## When to Use This Skill - "Formalize this problem statement into math" - "Extract the mathematical structure from this research paper section" - "What variables, constraints, and objectives are in this spec?" - "Convert this word problem to a structured MPS" - "Find what's missing in this problem formulation" ## Zero-Inference Protocol (Mandatory) 1. **Closed World** — if it is not stated in the document, it does not exist in output 2. **Grounding Rule** — every element must cite the exact source phrase (`"evidence"` field) 3. **No Silent Filling** — unknown values use `null`; ambiguous types use `"ambiguous"` 4. **Inference Tagging** — structural inferences tagged `"inferred": true` with `"inference_basis"` 5. **MISSING Markers** — elements mentioned but insufficiently defined get `"status": "MISSING"` with `"missing_reason"` 6. **No Hallucinated Math** — never introduce equations or values not in the source text ## Limitations - Does not invent missing equations, domains, values, or assumptions that are absent from the source document. - Requires enough source text to cite every extracted element; sparse prompts should be returned with explicit missing-information markers. - Produces a formal specification, not a solved optimization model or proof. ## How It Works ### Step 1 — Receive Document Accept the document text, research excerpt, problem description, or specification as input. ### Step 2 — Classify Identify `problem_class`: `optimization | classification | simulation | proof | estimation | other` ### Step 3 — Extract MPS Components **Variables** — `id`, `name`, `symbol`, `type`, `domain`, `units`, `role`, `evidence`, `inferred`, `status` **Operators** — `id`, `name`, `symbol`, `arity`, `acts_on`, `produces`, `evidence`, `inferred` **Constraints** — `id`, `type`, `expression`, `variables_involved`, `evidence`, `hardness`, `inferred`, `status` **Objectives** — `id`, `direction` (minimize/maximize/satisfy/find/prove), `expression`, `variables_involved`, `evidence`, `inferred` **Uncertainty** — `id`, `type` (stochastic/epistemic/measurement/model/none_stated), `affects`, `characterization`, `evidence`, `status` ### Step 4 — Surface Missing Information Identify what the document implies but doesn't state: `missing_information[]` with `element`, `needed_for`, `missing_reason`. ### Step 5 — Validate and Score `validation_flags`: - `has_complete_objectives`: true/false/partial - `has_bounded_variables`: true/false/partial - `has_evidence_for_all_elements`: true/false/partial - `inference_count`: integer - `missing_count`: integer - `overall_formalizability`: HIGH/MEDIUM/LOW ## Output Format Produce the complete MPS as a JSON object: ```json { "mps_version": "1.0", "source_title": "...", "problem_class": "optimization", "variables": [...], "operators": [...], "constraints": [...], "objectives": [...], "uncertainty": [...], "missing_information": [...], "validation_flags": { "overall_formalizability": "HIGH" } } ``` ## Best Practices - ✅ Apply all 6 Zero-Inference Protocol rules before outputting any element - ✅ Surface MISSING markers rather than silently inferring — incomplete formalization is valid output - ✅ Cite the exact source phrase in every `evidence` field - ❌ Never introduce mathematical relationships not grounded in the source text ## Additional Resources - Repository: [thebrierfox/doc2math-skill](https://github.com/thebrierfox/doc2math-skill) - Full BYOK tool: [ace-license-server-production.up.railway.app/byok/doc2math](https://ace-license-server-production.up.railway.app/byok/doc2math) - Built by [IntuiTek¹](https://intuitek.ai) (~K¹) — MIT License