Coding Guide Best Practices

A well-designed coding guide is the foundation of reliable survey coding. This guide covers how to create categories that maximize inter-rater agreement and minimize the need for manual reconciliation.

Designing Effective Categories

Good categories follow two fundamental principles: they should be mutually exclusive (no overlap between categories) and collectively exhaustive (every possible response fits somewhere).

Start broad: Begin with 5-10 categories. It's easier to split a broad category later than to merge overlapping ones. Analyze a sample of your data first to understand the range of responses.

Category Naming

Clear, descriptive names help both the AI raters and human reviewers understand what belongs in each category:

  • Be specific: "Product quality complaints" is better than "Negative feedback"
  • Avoid jargon: Use plain language unless your entire team shares the same terminology
  • Keep it concise: 2-4 words is ideal for quick scanning
  • Use consistent grammar: All nouns ("Pricing", "Quality") or all verbs ("Complained about pricing")

Writing Category Descriptions

Descriptions are crucial for AI accuracy. They tell the raters exactly what belongs in each category:

  • Include examples: "Responses mentioning product durability, materials, or build quality"
  • Define boundaries: "Does NOT include complaints about shipping damage"
  • Be explicit: The AI only knows what you tell it

Ambiguous descriptions cause disagreements: If you're vague about what belongs in a category, the two AI raters will interpret it differently, leading to more reconciliation work.
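
The advice above can be made concrete as a simple data structure. The sketch below is purely illustrative — the `Category` class and its field names are assumptions for this example, not this product's actual schema or API:

```python
from dataclasses import dataclass

@dataclass
class Category:
    """One coding-guide entry (hypothetical schema, for illustration only)."""
    name: str           # concise: 2-4 words
    description: str    # examples of what belongs in the category
    excludes: str = ""  # explicit boundary: what does NOT belong

guide = [
    Category(
        name="Product Quality Complaints",
        description="Responses mentioning product durability, materials, or build quality",
        excludes="Does NOT include complaints about shipping damage",
    ),
    Category(
        name="Delivery Speed Complaints",
        description="Responses about late, slow, or delayed shipping",
    ),
]

for c in guide:
    print(f"{c.name}: {c.description}")
```

Writing each category this way — name, positive examples, explicit exclusions — forces you to state the boundaries that would otherwise cause rater disagreement.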

Single-Label vs Multi-Label Mode

Choose the mode that matches your analysis needs:

Single-Label Mode

Each response receives exactly one category. Use this when:

  • Categories are truly mutually exclusive
  • You want clean frequency counts for reporting
  • Responses typically focus on one main topic
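
Clean frequency counts are the main payoff of single-label mode: since each response receives exactly one category, the counts always sum to the total number of responses. A minimal Python sketch (labels invented for illustration):

```python
from collections import Counter

# Single-label mode: exactly one code per response,
# so category counts partition the responses.
codes = ["Pricing", "Quality", "Pricing", "Customer Service", "Pricing"]
counts = Counter(codes)

for label, n in counts.most_common():
    print(f"{label}: {n}")

assert sum(counts.values()) == len(codes)  # counts sum to the response total
```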

Multi-Label Mode

Responses can receive multiple categories. Use this when:

  • Responses often mention multiple distinct topics
  • You need to capture all themes, not just the primary one
  • Your analysis requires understanding co-occurrence

Multi-label increases complexity: Agreement is calculated per-category, and reconciliation requires reviewing each assigned category. Only use multi-label when your research genuinely requires it.
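
To make "agreement is calculated per-category" concrete: treat each rater's output as a set of labels per response, then for each category count the responses where the two raters made the same include/exclude decision. This is a hedged sketch — the function name and data shapes are assumptions for illustration, not this product's API:

```python
def per_category_agreement(rater_a, rater_b, categories):
    """rater_a / rater_b: one set of category labels per response.

    Returns, for each category, the fraction of responses where both
    raters agreed on whether that category applies.
    """
    agreement = {}
    for cat in categories:
        matches = sum(
            (cat in a) == (cat in b)  # both assigned it, or both did not
            for a, b in zip(rater_a, rater_b)
        )
        agreement[cat] = matches / len(rater_a)
    return agreement

a = [{"Pricing"}, {"Pricing", "Quality"}, set()]
b = [{"Pricing"}, {"Quality"}, {"Pricing"}]
result = per_category_agreement(a, b, ["Pricing", "Quality"])
print(result)
```

Note that agreement can look high on rare categories simply because both raters usually leave them unassigned — one reason multi-label reconciliation takes more care.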

The Role of "Other" and "N/A" Categories

Not every response will fit your defined categories. How you handle these edge cases significantly impacts your results.

Explicit "Other" Category

Add an "Other" category when you want to capture responses that don't fit elsewhere but still contain meaningful content:

  • Description: "Relevant feedback that doesn't fit other categories"
  • Use case: Discovering new themes you hadn't anticipated
  • Review "Other" responses periodically to see if new categories are needed

Implicit N/A Option

Enable the "Add implicit N/A category" setting when responses might be:

  • Empty or meaningless (gibberish, test entries)
  • Off-topic (doesn't answer the question asked)
  • Genuinely not applicable to any category

When to use implicit N/A: If you're coding open-ended survey responses, some respondents will write things like "N/A", "none", or completely off-topic answers. The implicit N/A option gives the AI a valid way to classify these without forcing them into an inappropriate category.

Keeping "Other" Under 10%

A high percentage of "Other" responses indicates your categories aren't capturing the data well:

Other %   Interpretation                   Action
<5%       Excellent coverage               No action needed
5-10%     Good coverage                    Review "Other" responses for patterns
10-20%    Categories may be too narrow     Consider adding categories or broadening existing ones
>20%      Category scheme needs revision   Analyze "Other" responses and redesign categories
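
The thresholds above are easy to turn into a quick automated health check. A hedged sketch, assuming single-label output; the function names are illustrative, not part of any real API:

```python
def other_share(assignments, other_label="Other"):
    """Fraction of responses coded with the given label."""
    return sum(1 for a in assignments if a == other_label) / len(assignments)

def interpret(share):
    # Thresholds mirror the table above (boundary values approximate).
    if share < 0.05:
        return "Excellent coverage"
    if share <= 0.10:
        return "Good coverage - review 'Other' for patterns"
    if share <= 0.20:
        return "Categories may be too narrow"
    return "Category scheme needs revision"

codes = ["Pricing", "Other", "Quality", "Pricing", "Quality",
         "Pricing", "Quality", "Pricing", "Other", "Quality"]
share = other_share(codes)
print(f"{share:.0%}: {interpret(share)}")  # 2 of 10 responses are "Other"
```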

Examples of Good vs Poor Categories

Poor Category Design

  • "Positive" - Too vague. Positive about what?
  • "Issues" - What kind of issues? Quality? Service? Delivery?
  • "Feedback about our company" - This could be anything

Good Category Design

  • "Product Quality - Positive" - Clear topic and sentiment
  • "Delivery Speed Complaints" - Specific issue type
  • "Customer Service Experience" - Defined scope

Next: Learn how agreement rates and Cohen's Kappa are calculated to measure coding reliability.