Chinese Documentation Translation Design
Date: 2026-03-27
Status: Approved
Author: Design generated through brainstorming process
Type: Documentation infrastructure
Overview
Update docs_zh-CN/ to achieve full parity with English docs/ by translating ~50 missing files using a sequential glossary-building approach that ensures terminology consistency across all documentation.
Problem Statement
The Chinese documentation (docs_zh-CN/) is missing about 50 files that exist in the English version (docs/), including:
- Critical user-facing guides (tool-specific skills, troubleshooting)
- Contributor documentation (quality standards, security guidelines)
- Maintainer documentation (update guides, release processes)
- Root-level documentation files
This gap prevents Chinese users from accessing complete documentation and creates an inconsistent experience between English and Chinese audiences.
Goals
- Achieve documentation parity: Translate all 50+ missing files
- Ensure terminology consistency: Build and maintain a glossary of technical terms
- Maintain quality standards: Validate links, formatting, and content
- Create maintainable process: Establish workflow for future translations
Architecture
Workflow Phases
- Glossary Foundation - Translate high-priority user docs, extract consistent terminology
- Sequential Translation - Process remaining files using established glossary
- Validation - Link checking, markdown linting, terminology consistency verification
- Integration - Commit structure, generate translation status report
File Processing Order
Files are processed in dependency order to ensure the most important documentation sets the terminology foundation:
Priority 1: Core User Docs (sets terminology foundation)
→ README.md, getting-started.md, usage.md, faq.md
Priority 2: Tool-Specific Guides (uses established terminology)
→ claude-code-skills.md, cursor-skills.md, gemini-cli-skills.md, codex-cli-skills.md
Priority 3: Advanced User Docs
→ bundles.md, workflows.md, skills-vs-mcp-tools.md, agent-overload-recovery.md
Priority 4: Contributor Guides
→ contributors/quality-bar.md, contributors/security-guardrails.md
Priority 5: Maintainer Docs
→ maintainers/skills-update-guide.md, maintainers/repo-growth-seo.md
Key Design Decisions
- Sequential processing: Files translated in dependency order (user-facing first)
- English preservation: Technical terms in English stay in English (e.g., "Claude Code", not "克劳德代码")
- Glossary evolution: Starts with ~20 core terms, grows to ~100+ terms through translation process
- Incremental validation: Each batch validated before proceeding to next
Components
1. Glossary Manager
Purpose: Central terminology database for consistent translations
Structure: JSON file at docs_zh-CN/.glossary.json
{
  "skills": {
    "translation": "技能",
    "context": "Core concept - AI assistant capabilities",
    "examples": ["use skills", "skill library"]
  },
  "bundles": {
    "translation": "捆绑包",
    "context": "Curated skill collections",
    "examples": ["starter bundles", "bundle recommendations"]
  },
  "workflows": {
    "translation": "工作流",
    "context": "Step-by-step execution guides",
    "examples": ["workflow automation", "execution workflows"]
  }
}
Evolution:
- Foundation: ~20 core terms
- After Priority 1: ~60 terms
- After Priority 2-3: ~100 terms
- Final: ~100-150 terms covering all domains
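A minimal sketch of how the glossary could be read and queried, assuming the JSON structure shown above. The function names `parse_glossary` and `lookup` are hypothetical, and the case-insensitive matching is an assumption rather than something this design specifies.

```python
import json


def parse_glossary(json_text):
    """Parse glossary JSON text into a dict keyed by lowercase English term."""
    raw = json.loads(json_text)
    return {term.lower(): entry for term, entry in raw.items()}


def lookup(glossary, term):
    """Return the Chinese translation for a term, or None if the term is unknown."""
    entry = glossary.get(term.lower())
    return entry["translation"] if entry else None
```

Returning `None` for unknown terms lets the caller decide whether to queue the term as a glossary candidate instead of silently passing it through.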
2. Translation Engine
Input: English markdown file + current glossary
Process:
- Parse markdown structure (preserve headers, code blocks, links)
- Extract translatable content
- Apply glossary substitutions
- Translate content sections
- Reassemble with original formatting
Output: Chinese markdown file with consistent terminology
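The parse, translate, and reassemble steps could be sketched as below. This is an illustration under stated assumptions, not the engine itself: `split_segments` and `translate_document` are hypothetical names, the `translate` callable stands in for whatever translation step is used, and the fence detection is simplified (it does not check that a closing fence matches the opening one).

```python
FENCE_MARKERS = ("`" * 3, "~" * 3)


def split_segments(markdown):
    """Split markdown into (kind, text) pairs, where kind is 'code' or 'prose'."""
    segments = []
    buffer = []
    in_code = False
    for line in markdown.splitlines(keepends=True):
        if line.lstrip().startswith(FENCE_MARKERS):
            if in_code:
                # Closing fence: finish the code segment, fence line included.
                buffer.append(line)
                segments.append(("code", "".join(buffer)))
                buffer = []
                in_code = False
            else:
                # Opening fence: flush any pending prose first.
                if buffer:
                    segments.append(("prose", "".join(buffer)))
                buffer = [line]
                in_code = True
        else:
            buffer.append(line)
    if buffer:
        segments.append(("code" if in_code else "prose", "".join(buffer)))
    return segments


def translate_document(markdown, translate):
    """Apply `translate` to prose segments only and reassemble in original order."""
    return "".join(
        text if kind == "code" else translate(text)
        for kind, text in split_segments(markdown)
    )
```

Because the segments concatenate back to the exact input, code blocks survive unchanged even if the translation step rewrites every prose segment.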
3. Link Validator
Checks:
- Internal links (.md)
- External links (http/https)
Rules:
- Internal English links → Chinese equivalents (e.g., ../usage.md → ../usage.md)
- Keep external links unchanged
- Flag broken internal links for manual review
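One way the link checks might look, as a sketch. The names `classify_links` and `find_broken_links` are hypothetical, file paths are assumed to be POSIX-style and relative to the repository root, and `existing_files` is assumed to be a set of paths collected beforehand. Checking external links over HTTP is left out.

```python
import posixpath
import re

# Matches inline markdown links of the form [text](target).
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def classify_links(markdown):
    """Return (internal, external) lists of link targets found in the text."""
    internal, external = [], []
    for _text, target in LINK.findall(markdown):
        if target.startswith(("http://", "https://")):
            external.append(target)
        elif target.split("#", 1)[0].endswith(".md"):
            internal.append(target)
    return internal, external


def find_broken_links(source_file, markdown, existing_files):
    """Return internal link targets that do not resolve to a known file."""
    base = posixpath.dirname(source_file)
    internal, _external = classify_links(markdown)
    broken = []
    for target in internal:
        path = target.split("#", 1)[0]  # drop any #anchor before resolving
        resolved = posixpath.normpath(posixpath.join(base, path))
        if resolved not in existing_files:
            broken.append(target)
    return broken
```

The returned targets are the ones to flag for manual review.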
4. Quality Validator
Checks:
- Markdown linting: Format consistency
- Terminology consistency: Verify glossary terms used uniformly
- Placeholder verification: Ensure no [TRANSLATE ME] or similar placeholders remain
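The placeholder check is simple enough to sketch directly. The function name `find_placeholders` is hypothetical, and treating `[TODO]` and `[TBD]` as "similar placeholders" is an assumption; the marker list would need to match whatever the translators actually use.

```python
import re

# Assumed marker set; extend it to match the placeholders in real use.
PLACEHOLDER = re.compile(r"\[(TODO|TRANSLATE ME|TBD)\]", re.IGNORECASE)


def find_placeholders(markdown):
    """Return (line_number, marker) pairs for leftover placeholders."""
    hits = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((number, match.group(0)))
    return hits
```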
Data Flow:
English File → Glossary Lookup → Translation → Glossary Update → Validation → Chinese File
↓
Extract new terms
Add to glossary
Glossary Building Process
Phase 1: Foundation Glossary (Priority 1 Files)
1. Pre-translation Analysis
- Scan README.md, getting-started.md, usage.md, faq.md
- Extract recurring technical terms (frequency analysis)
- Identify brand names that stay in English (Claude Code, Cursor, GitHub, etc.)
- Create initial glossary with ~30-40 core terms
2. First Translation Pass
- Translate the 4 Priority 1 files using initial glossary
- Track new terms encountered during translation
- Document ambiguous terms (e.g., "skills" = 技能 vs 技巧)
- Expand glossary to ~60 terms
3. Glossary Refinement
- Review terminology consistency across the 4 files
- Resolve conflicts (choose one translation per term)
- Add context notes for ambiguous terms
- Lock foundation glossary
Example Glossary Evolution:
Initial: {skills, installation, repository}
→ After translation: {skills, installation, repository, bundles, workflows,
contributors, maintainers, cli, agent, mcp, ...}
→ Refined: Add context notes for ambiguous terms
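The frequency analysis in the pre-translation step could be approximated as follows. This is a rough sketch: `candidate_terms` is a hypothetical name, the stopword list is a tiny placeholder, and requiring a term to recur in at least two files is an assumed heuristic for "recurring".

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z][A-Za-z-]+")
# Placeholder stopword list; a real run would need a fuller one.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "you", "your", "are", "can"}


def candidate_terms(documents, min_files=2, top_n=40):
    """Rank words by total frequency, keeping those found in at least min_files documents."""
    total = Counter()
    file_count = Counter()
    for text in documents:
        words = [w.lower() for w in WORD.findall(text)]
        words = [w for w in words if w not in STOPWORDS]
        total.update(words)
        file_count.update(set(words))  # count each word once per document
    recurring = {w: n for w, n in total.items() if file_count[w] >= min_files}
    return sorted(recurring.items(), key=lambda item: (-item[1], item[0]))[:top_n]
```

The output is only a candidate list. Deciding which words are real technical terms, and which brand names stay in English, remains a manual step.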
Phase 2: Sequential Expansion
- Each new translation adds 5-10 terms to glossary
- Weekly glossary checkpoints ensure consistency
- Final glossary: ~100-150 terms covering all domains
Translation Process
Per-File Translation Pipeline
1. File Analysis (~1-2 minutes per file)
- Extract headings, code blocks, links, tables
- Identify translatable vs. non-translatable sections
- Check file dependencies (links to other docs)
- Estimate glossary term usage
2. Translation Execution (~3-5 minutes per file)
- Preserve markdown structure exactly
- Apply glossary substitutions first
- Translate content section by section
- Keep code blocks, commands, file paths in English
- Handle special cases:
- Image alt text: translate
- Code comments: translate if explanatory
- Inline code: keep in English
- URLs: keep unchanged
3. Link Processing
- Internal English links → Chinese equivalents
- Links to non-translated files → flag for later
- External links → unchanged
- Update table of contents if present
4. Glossary Update
- Extract new technical terms
- Check for terminology conflicts
- Add new terms with context
- Version the glossary update
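The glossary update step, including the conflict check, might be sketched like this. `merge_terms` is a hypothetical helper; it assumes glossary entries are dicts with a `translation` key, as in the glossary structure described earlier, and it leaves the existing entry in place when a conflict is found so that a reviewer can decide.

```python
def merge_terms(glossary, new_terms):
    """Add new terms to a copy of the glossary and report conflicting translations.

    Returns (updated_glossary, conflicts), where conflicts maps a term to
    (existing_translation, proposed_translation).
    """
    updated = dict(glossary)
    conflicts = {}
    for term, entry in new_terms.items():
        key = term.lower()
        existing = updated.get(key)
        if existing is None:
            updated[key] = entry
        elif existing["translation"] != entry["translation"]:
            # Keep the established translation; surface the clash for review.
            conflicts[key] = (existing["translation"], entry["translation"])
    return updated, conflicts
```

Working on a copy keeps the previous glossary intact, which fits the idea of versioning each update.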
Translation Rules
Translate:
- ✅ Explanatory text, headers, lists, prose
- ✅ User-facing comments in code examples
- ✅ Image alt text
Don't translate:
- ❌ Code blocks and inline code
- ❌ Commands and file paths
- ❌ URLs and links
- ❌ Proper nouns (Claude, GitHub, npm)
Context-dependent:
- 🔧 UI elements (keep quotes if original has them)
- 🔧 Technical comments in code
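One way to enforce the "don't translate" rules for inline code and URLs is to mask them before translation and restore them afterwards. The sketch below assumes the translation step leaves the `⟦n⟧` tokens untouched; the token format and the names `mask` and `unmask` are illustrative choices, not part of this design.

```python
import re

# Inline code spans and bare http(s) URLs are protected from translation.
PROTECTED = re.compile(r"`[^`\n]+`|https?://[^\s)]+")


def mask(text):
    """Replace protected spans with numbered tokens; return (masked_text, originals)."""
    originals = []

    def stash(match):
        originals.append(match.group(0))
        return f"⟦{len(originals) - 1}⟧"

    return PROTECTED.sub(stash, text), originals


def unmask(text, originals):
    """Put the protected spans back in place of their tokens."""
    return re.sub(r"⟦(\d+)⟧", lambda m: originals[int(m.group(1))], text)
```

A typical use would be `masked, saved = mask(paragraph)`, then translating `masked`, then calling `unmask` on the result.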
Batch Processing
- Process files in priority order
- After each batch, run validation
- Commit batch before starting next (checkpoint system)
- Track progress in translation-status.md
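Progress tracking could be derived from the two directory trees rather than maintained by hand. The sketch below is one possible shape, with hypothetical names (`list_markdown`, `translation_status`, `render_status`) and an assumed checklist layout for the status file.

```python
from pathlib import Path


def list_markdown(root):
    """Return sorted POSIX-style paths of all .md files under root, relative to root."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md"))


def translation_status(english_files, chinese_files):
    """Compare relative paths from docs/ and docs_zh-CN/ and summarize coverage."""
    english = set(english_files)
    chinese = set(chinese_files)
    done = sorted(english & chinese)
    missing = sorted(english - chinese)
    percent = 100.0 * len(done) / len(english) if english else 100.0
    return {"done": done, "missing": missing, "percent": round(percent, 1)}


def render_status(status):
    """Render the status dict as a markdown checklist."""
    lines = [f"Translated: {len(status['done'])} files ({status['percent']}%)", ""]
    lines += [f"- [x] {path}" for path in status["done"]]
    lines += [f"- [ ] {path}" for path in status["missing"]]
    return "\n".join(lines) + "\n"
```

Running `translation_status(list_markdown("docs"), list_markdown("docs_zh-CN"))` after each batch would show what remains before the next checkpoint commit.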
Validation & Quality Assurance
Standard Validation Checklist (Per File)
1. Link Verification
- ✅ All internal links point to existing files
- ✅ External links are valid (HTTP 200)
- ✅ Anchor links (#heading) work correctly
- ⚠️ Flag links to non-translated files
2. Markdown Structure
- ✅ Valid markdown syntax
- ✅ Proper heading hierarchy (H1 → H2 → H3)
- ✅ Code blocks properly fenced
- ✅ Tables formatted correctly
- ✅ No broken list formatting
3. Content Quality
- ✅ No placeholder text ([TODO], [TRANSLATE ME])
- ✅ Consistent terminology (matches glossary)
- ✅ Proper Chinese punctuation (full-width for Chinese text)
- ✅ No mixed English/Chinese sentences unless necessary
4. Formatting Consistency
- ✅ Code blocks use correct language tags
- ✅ Spacing around Chinese/English boundaries
- ✅ Bullet/numbered list formatting matches source
- ✅ Quote blocks properly formatted
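The heading-hierarchy item from the checklist above can be checked mechanically. The sketch below uses a hypothetical function, `heading_jumps`, that covers only ATX-style `#` headings and skips lines inside fenced code blocks. It also treats a document that starts below H1 as a jump, which is an assumption about how strict the check should be.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+\S")
FENCE = "`" * 3


def heading_jumps(markdown):
    """Return (line_number, previous_level, level) for headings that skip a level."""
    problems = []
    previous = 0
    in_code = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_code = not in_code
            continue
        if in_code:
            continue  # '#' lines inside code blocks are comments, not headings
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if level > previous + 1:
            problems.append((number, previous, level))
        previous = level
    return problems
```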
Validation Tools
- markdownlint for markdown structure
- Custom script for link checking
- Custom script for glossary consistency
- Manual review for ambiguous cases
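The custom glossary-consistency script is not specified here, so the following is only one possible interpretation. It assumes each glossary entry may carry a hypothetical `rejected` list of disfavored translations (for example 技巧 for "skills") and defines the consistency rate as approved occurrences divided by approved plus rejected occurrences. Both the field and the metric are assumptions.

```python
def consistency_report(text, glossary):
    """Count approved vs. rejected translations and compute an overall rate.

    Returns (rate, issues), where issues maps a term to {rejected_variant: count}.
    """
    approved_total = 0
    rejected_total = 0
    issues = {}
    for term, entry in glossary.items():
        approved_total += text.count(entry["translation"])
        for variant in entry.get("rejected", []):  # hypothetical field
            hits = text.count(variant)
            if hits:
                rejected_total += hits
                issues.setdefault(term, {})[variant] = hits
    total = approved_total + rejected_total
    rate = approved_total / total if total else 1.0
    return rate, issues
```

Plain substring counting is crude: a rejected variant that is a substring of another word is counted too, so flagged hits still need a manual look.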
Error Handling
Broken links → Log to docs_zh-CN/translation-issues.md
Glossary conflicts → Manual review, update glossary
Translation ambiguities → Add inline comments for review
Error Handling & Edge Cases
Common Edge Cases
1. Ambiguous Technical Terms
- Example: "agent" = 代理 vs 智能体 vs 代理程序
- Solution: Context notes in glossary, choose based on domain
- Documentation: Add usage_context field to .glossary.json
2. Code Comments in Examples
- Translatable if explanatory (# Set up the client)
- Not translatable if technical (// Initialize SDK)
- Rule: Translate user-facing comments, keep technical comments
3. Brand Names and Product Names
- Always keep in English: Claude Code, Cursor, GitHub, npm
- Only translate if official Chinese name exists
- Check: Official docs for preferred translations
4. Links to Non-Translated Files
- During transition: Some Chinese docs link to English
- Add indicator: (English) after link
- Track: In translation-status.md for future translation
5. Tables with Mixed Content
- Translate column headers
- Translate cell content unless technical
- Preserve code blocks within cells
6. Screenshots and Diagrams
- Keep as-is (no image editing)
- Add descriptive alt text in Chinese
- Note if screenshot contains translatable UI text
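For the fourth edge case (links to non-translated files), the "(English)" indicator could be added automatically. The sketch below is hedged: `mark_untranslated` is a hypothetical name, paths are assumed to be POSIX-style and relative to the docs_zh-CN/ root, and `translated_files` is assumed to be a set of already translated paths. It only appends the indicator and does not rewrite link targets.

```python
import posixpath
import re

# Inline links to .md files (optionally with an #anchor) not yet marked "(English)".
MD_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+\.md(?:#[^)\s]*)?)\)(?! \(English\))")


def mark_untranslated(markdown, source_file, translated_files):
    """Append ' (English)' after links whose target has no Chinese translation yet."""
    base = posixpath.dirname(source_file)

    def annotate(match):
        target = match.group(2)
        if target.startswith(("http://", "https://")):
            return match.group(0)  # external links stay unchanged
        path = posixpath.normpath(posixpath.join(base, target.split("#", 1)[0]))
        if path in translated_files:
            return match.group(0)
        return match.group(0) + " (English)"

    return MD_LINK.sub(annotate, markdown)
```

The negative lookahead makes the function safe to run repeatedly, so a file can be re-processed after each batch without stacking indicators.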
Recovery Strategy
- Glossary conflicts → Stop, resolve, continue
- Broken links → Log, flag in file, continue
- Translation errors → Revert file, fix, re-validate
Deliverables
Primary Deliverables
1. Translated Documentation (~50 new files in docs_zh-CN/)
- All missing user-facing docs
- All missing contributor docs
- All missing maintainer docs
- Proper directory structure matching docs/
2. Glossary Artifact
- docs_zh-CN/.glossary.json - Complete terminology database
- 100-150 terms with context notes
- Usage examples for ambiguous terms
3. Validation Reports
- docs_zh-CN/translation-status.md - Completion tracking
- docs_zh-CN/translation-issues.md - Known issues and edge cases
- Link validation results
- Terminology consistency report
4. Integration Artifacts
- Git commits organized by priority batch
- Commit messages following project conventions
- Pull request ready for review
Estimated Timeline
- Glossary Foundation: 45-60 minutes
- Translation Batches: 2-3 hours (50 files × ~3-4 min/file)
- Validation & Fixes: 30-45 minutes
- Final Review & Integration: 30 minutes
- Total: ~4-5 hours
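As a quick sanity check on these figures, the batch estimate is the per-file time multiplied by the file count. The helper name `minutes_range` is hypothetical; the numbers are the ones listed above.

```python
def minutes_range(files, low_per_file, high_per_file):
    """Return the (low, high) total minutes for a batch of files."""
    return files * low_per_file, files * high_per_file


# Figures from the timeline above: 50 files at ~3-4 minutes each.
batch_low, batch_high = minutes_range(50, 3, 4)

# Fixed phases in minutes: glossary foundation, validation and fixes, final review.
fixed_low = 45 + 30 + 30
fixed_high = 60 + 45 + 30

total_low = batch_low + fixed_low
total_high = batch_high + fixed_high
print(f"Translation batches: {batch_low}-{batch_high} minutes")
print(f"Whole project: {total_low}-{total_high} minutes")
```

The batch range of 150-200 minutes is roughly 2.5 to 3.3 hours, and the overall range of 255-335 minutes is broadly in line with the stated total.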
Success Criteria
- ✅ All 50+ missing files translated
- ✅ Zero broken internal links
- ✅ Terminology consistency ≥95%
- ✅ Markdown linting passes
- ✅ Ready for Chinese user review
Future Considerations
Maintenance:
- Glossary should be updated as new English docs are added
- Consider automated translation suggestions for future updates
- Periodic review of glossary for terminology updates
Automation:
- Potential for CI integration to check translation completeness
- Automated glossary consistency checks
- Link validation in CI pipeline
Community:
- Consider process for community-contributed translations
- Review workflow for suggested glossary improvements
- Translation memory database for reusable segments