# Coding Guide Best Practices
A well-designed coding guide is the foundation of reliable survey coding. This guide covers how to create categories that maximize inter-rater agreement and minimize the need for manual reconciliation.
## Designing Effective Categories
Good categories follow two fundamental principles: they should be mutually exclusive (no overlap between categories) and collectively exhaustive (every possible response fits somewhere).
**Start broad:** Begin with 5-10 categories. It's easier to split a broad category later than to merge overlapping ones. Analyze a sample of your data first to understand the range of responses.
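As a concrete sketch, a starter scheme can be kept as a simple list of name/description pairs. The category names and descriptions below are illustrative examples, not prescribed values:

```python
# Illustrative starter scheme: 5-10 broad categories, each with a
# one-line description. Names and descriptions are examples only.
starter_categories = [
    {"name": "Product Quality", "description": "Durability, materials, or build quality"},
    {"name": "Pricing", "description": "Cost, value for money, discounts"},
    {"name": "Customer Service", "description": "Interactions with support staff"},
    {"name": "Delivery", "description": "Shipping speed, tracking, packaging"},
    {"name": "Other", "description": "Relevant feedback that doesn't fit other categories"},
]

# Quick sanity checks: stay in the 5-10 range, and keep names unique
# so labels can't silently overlap.
assert 5 <= len(starter_categories) <= 10
names = [c["name"] for c in starter_categories]
assert len(names) == len(set(names))
```

Splitting a broad entry like "Delivery" into "Delivery Speed" and "Shipping Damage" later is a small edit; merging two overlapping entries means re-coding every response assigned to either.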
## Category Naming
Clear, descriptive names help both the AI raters and human reviewers understand what belongs in each category:
- **Be specific:** "Product quality complaints" is better than "Negative feedback"
- **Avoid jargon:** Unless your entire team uses the same terminology
- **Keep it concise:** 2-4 words is ideal for quick scanning
- **Use consistent grammar:** All nouns ("Pricing", "Quality") or all verbs ("Complained about pricing")
## Writing Category Descriptions
Descriptions are crucial for AI accuracy. They tell the raters exactly what belongs in each category:
- **Include examples:** "Responses mentioning product durability, materials, or build quality"
- **Define boundaries:** "Does NOT include complaints about shipping damage"
- **Be explicit:** The AI only knows what you tell it

**Ambiguous descriptions cause disagreements:** If you're vague about what belongs in a category, the two AI raters will interpret it differently, leading to more reconciliation work.
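To see why vague descriptions cost reconciliation work, here is a minimal sketch of how raw agreement between two raters could be measured. The `agreement_rate` helper and the label data are hypothetical, for illustration only:

```python
def agreement_rate(rater_a, rater_b):
    """Fraction of responses where both raters chose the same category."""
    assert len(rater_a) == len(rater_b)
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical labels from two AI raters on the same five responses.
# The raters disagree on response 3, where the description was vague.
rater_a = ["Pricing", "Quality", "Other", "Pricing", "Service"]
rater_b = ["Pricing", "Quality", "Quality", "Pricing", "Service"]

print(agreement_rate(rater_a, rater_b))  # 0.8 -- one response to reconcile
```

Every disagreement becomes a manual reconciliation task, so tightening a description that causes even a few percent of disagreements pays off quickly at scale.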
## Single-Label vs Multi-Label Mode
Choose the mode that matches your analysis needs:
### Single-Label Mode
Each response receives exactly one category. Use this when:
- Categories are truly mutually exclusive
- You want clean frequency counts for reporting
- Responses typically focus on one main topic
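With exactly one label per response, frequency counts fall out directly. A sketch using Python's standard library, with hypothetical label data:

```python
from collections import Counter

# Hypothetical single-label assignments, one per response.
labels = ["Pricing", "Quality", "Pricing", "Service", "Pricing"]

counts = Counter(labels)
total = len(labels)
for category, n in counts.most_common():
    print(f"{category}: {n} ({n / total:.0%})")
```

Because each response contributes exactly one label, the percentages sum to 100%, which is what makes single-label counts "clean" for reporting. In multi-label mode the same tally would double-count responses that carry several categories.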
### Multi-Label Mode
Responses can receive multiple categories. Use this when:
- Responses often mention multiple distinct topics
- You need to capture all themes, not just the primary one
- Your analysis requires understanding co-occurrence
**Multi-label increases complexity:** Agreement is calculated per category, and reconciliation requires reviewing each assigned category. Only use multi-label when your research genuinely requires it.
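Because agreement is computed per category, each category effectively becomes a separate yes/no decision for every response. A sketch with hypothetical rater data (the `per_category_agreement` helper is illustrative, not a product API):

```python
# Each rater assigns a *set* of categories per response (hypothetical data).
rater_a = [{"Pricing"}, {"Quality", "Service"}, {"Pricing", "Quality"}]
rater_b = [{"Pricing"}, {"Quality"}, {"Pricing", "Quality"}]

categories = ["Pricing", "Quality", "Service"]

def per_category_agreement(a_sets, b_sets, category):
    """Treat one category as a yes/no decision and compare raters per response."""
    matches = sum((category in a) == (category in b) for a, b in zip(a_sets, b_sets))
    return matches / len(a_sets)

for cat in categories:
    print(cat, round(per_category_agreement(rater_a, rater_b, cat), 3))
```

Here the raters agree fully on "Pricing" and "Quality" but disagree on "Service" for one response, so that category alone needs reconciliation. This is why multi-label review effort grows with the number of categories, not just the number of responses.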
## The Role of "Other" and "N/A" Categories
Not every response will fit your defined categories. How you handle these edge cases significantly impacts your results.
### Explicit "Other" Category
Add an "Other" category when you want to capture responses that don't fit elsewhere but still contain meaningful content:
- **Description:** "Relevant feedback that doesn't fit other categories"
- **Use case:** Discovering new themes you hadn't anticipated
- Review "Other" responses periodically to see if new categories are needed
### Implicit N/A Option
Enable the "Add implicit N/A category" setting when responses might be:
- Empty or meaningless (gibberish, test entries)
- Off-topic (doesn't answer the question asked)
- Genuinely not applicable to any category
**When to use implicit N/A:** If you're coding open-ended survey responses, some respondents will write things like "N/A", "none", or completely off-topic answers. The implicit N/A option gives the AI a valid way to classify these without forcing them into an inappropriate category.
### Keeping "Other" Under 10%
A high percentage of "Other" responses indicates your categories aren't capturing the data well:
| Other % | Interpretation | Action |
|---|---|---|
| <5% | Excellent coverage | No action needed |
| 5-10% | Good coverage | Review "Other" responses for patterns |
| 10-20% | Categories may be too narrow | Consider adding categories or broadening existing ones |
| >20% | Category scheme needs revision | Analyze "Other" responses and redesign categories |
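The thresholds in the table can be applied mechanically once coding is done. A sketch, with the thresholds taken directly from the table and a hypothetical `other_coverage_action` helper:

```python
def other_coverage_action(other_pct):
    """Map the share of "Other" responses to the recommended action."""
    if other_pct < 5:
        return "Excellent coverage - no action needed"
    if other_pct <= 10:
        return "Good coverage - review 'Other' responses for patterns"
    if other_pct <= 20:
        return "Categories may be too narrow - add or broaden categories"
    return "Category scheme needs revision - analyze 'Other' and redesign"

# E.g. 12 "Other" labels out of 80 coded responses:
pct = 100 * 12 / 80  # 15.0
print(other_coverage_action(pct))
```

Running this after each coding pass makes the "review Other periodically" advice above a routine check rather than an ad hoc one.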
## Examples of Good vs Poor Categories
### Poor Category Design
- "Positive" - Too vague. Positive about what?
- "Issues" - What kind of issues? Quality? Service? Delivery?
- "Feedback about our company" - This could be anything
### Good Category Design
- "Product Quality - Positive" - Clear topic and sentiment
- "Delivery Speed Complaints" - Specific issue type
- "Customer Service Experience" - Defined scope
**Next:** Learn how agreement rates and Cohen's Kappa are calculated to measure coding reliability.