Key Concepts

These core concepts explain how qualcode.ai delivers publication-ready inter-rater reliability from two independent AI raters — and how the system improves with every coding cycle.

Projects

A project is a container for your coding work. Think of it as a folder that organizes everything related to a single study or research question.

Each project contains:

  • Data files: Your uploaded CSV or Excel files with survey responses
  • Coding runs: The results of coding a specific column with a specific guide
  • History: A record of all coding activities and exports

Organization suggestion: Create one project per research study or survey wave. This keeps your data organized and makes it easy to find results later.

Coding Guides

A coding guide defines how responses should be classified. It's the set of categories and rules the AI raters use to code your data.

Each coding guide includes:

  • Categories: The codes or labels you want to assign (e.g., "Product Quality", "Customer Service", "Pricing")
  • Descriptions: Clear explanations of what belongs in each category
  • Mode: Single-label (one category per response) or multi-label (multiple categories allowed)
  • Training data: Optional examples that teach the AI your coding style
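
As an illustration only, the four parts of a coding guide listed above could be modeled like this. The class names, field names, and the validate helper are hypothetical and are not qualcode.ai's actual data model; the category descriptions are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class Category:
    name: str          # the code or label, e.g. "Pricing"
    description: str   # what belongs in this category


@dataclass
class CodingGuide:
    categories: list[Category]
    mode: Literal["single-label", "multi-label"] = "single-label"
    # Optional (response text, category names) pairs; an empty list means zero-shot.
    training_data: list[tuple[str, list[str]]] = field(default_factory=list)

    def validate(self, assigned: list[str]) -> bool:
        """Check that an assignment uses known categories and respects the mode."""
        known = {c.name for c in self.categories}
        if not assigned or not set(assigned) <= known:
            return False
        return self.mode == "multi-label" or len(assigned) == 1


# Hypothetical guide using the example categories from the text.
guide = CodingGuide(
    categories=[
        Category("Product Quality", "Comments about how well the product works"),
        Category("Customer Service", "Comments about support interactions"),
        Category("Pricing", "Comments about cost or value for money"),
    ],
)
```

In this sketch a single-label guide rejects an assignment with two categories, while the same assignment would pass if mode were set to "multi-label".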

Coding guides are reusable across projects. Create a guide once, then apply it to multiple data files or surveys that use the same categories.

If designing categories is the hardest part of getting started, the Auto-Suggest workflow can draft a first codebook for you. It runs two independent AI analyses, combines them in a third semantic merge pass, and leaves the result for you to refine.

Implicit N/A option: Enable this when responses might be empty, off-topic, or not applicable. It gives the AI a valid way to classify these cases without forcing them into an inappropriate category.

Dual-Rater Methodology

qualcode.ai's defining feature is its dual-rater approach. Two independent AI systems code each response in separate, isolated API calls — no shared context, no order effects, no cross-contamination. Solo researchers get the same dual-rater reliability that traditionally required hiring a second human coder.

Rater     Provider    Default Model
Rater A   OpenAI      GPT-4.1-mini
Rater B   Anthropic   Claude Haiku 4.5

Why Two Raters?

This approach mirrors traditional inter-rater reliability (IRR) studies in qualitative research:

  • Agreement metrics: Calculate Cohen's Kappa, Krippendorff's Alpha, and percent agreement
  • Genuine independence: Different architectures, different training data, and per-response isolation — each response is coded in its own API call with no shared state, which is what makes the agreement metrics methodologically valid
  • Disagreement detection: Cases where raters disagree are flagged for human review
  • Methodological credibility: Results meet academic standards for reliability reporting
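
To make the agreement metrics concrete, here is a minimal sketch of percent agreement and Cohen's Kappa for two raters in single-label mode. The function names and the sample codes are illustrative assumptions; this shows the standard textbook formulas, not qualcode.ai's internal implementation.

```python
from collections import Counter


def percent_agreement(a: list[str], b: list[str]) -> float:
    """Share of responses where both raters assigned the same category."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must code the same non-empty set of responses")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each rater's marginal proportions, per category.
    p_e = sum(count_a[c] * count_b[c] for c in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1 - p_e)


# Made-up codes for four responses.
rater_a = ["Pricing", "Pricing", "Customer Service", "Customer Service"]
rater_b = ["Pricing", "Pricing", "Customer Service", "Pricing"]

print(percent_agreement(rater_a, rater_b))  # 0.75
print(cohens_kappa(rater_a, rater_b))       # 0.5
```

The raters agree on three of four responses (0.75), but half of that agreement is expected by chance given how often each rater used each category, so Kappa comes out at 0.5.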

For publications: The dual-rater methodology provides the agreement metrics that reviewers and journals expect. See Agreement Calculation for details on interpreting these metrics.

Training Data

Training data consists of example responses with their correct category assignments. It teaches the AI how you want responses classified.

Zero-Shot Mode (No Training Data)

When you provide no training examples, qualcode.ai operates in zero-shot mode:

  • AI raters use only your category names and descriptions
  • Works well for straightforward, well-defined categories
  • Fastest way to get started
  • Good for initial exploration of your data

Few-Shot Mode (With Training Data)

Adding training examples switches to few-shot mode:

  • AI raters learn from your specific examples
  • Improves accuracy for nuanced or domain-specific categories
  • Helps handle edge cases and ambiguous responses
  • Training data is versioned, so you can track improvements
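
The difference between the two modes can be pictured as a difference in what each rater is shown. The sketch below is purely illustrative: build_prompt and its wording are hypothetical, and the prompts qualcode.ai actually sends are not documented here.

```python
from typing import Sequence


def build_prompt(
    categories: dict[str, str],
    response: str,
    examples: Sequence[tuple[str, str]] = (),
) -> str:
    """Assemble a classification prompt; with no examples this is zero-shot."""
    lines = ["Classify the response into exactly one category.", "", "Categories:"]
    lines += [f"- {name}: {description}" for name, description in categories.items()]
    if examples:
        # Few-shot: labeled examples are shown before the response to code.
        lines += ["", "Examples:"]
        lines += [f'Response: "{text}" -> {label}' for text, label in examples]
    lines += ["", f'Response: "{response}" ->']
    return "\n".join(lines)


# Hypothetical category descriptions for illustration.
categories = {
    "Pricing": "Comments about cost or value for money",
    "Customer Service": "Comments about support interactions",
}

zero_shot = build_prompt(categories, "Far too expensive for what you get")
few_shot = build_prompt(
    categories,
    "Far too expensive for what you get",
    examples=[("The agent was rude on the phone", "Customer Service")],
)
```

In zero-shot mode the prompt carries only category names and descriptions; in few-shot mode the same prompt also carries your labeled examples.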

No minimum required: Training data is always optional. Start with zero-shot, then add examples if you notice consistent misclassifications.

Credits

Credits are the currency for AI processing in qualcode.ai. Each coding run consumes credits based on a simple formula:

training_overhead = ceil(min(training_tokens, 100000) / 10000)
credits = ceil((responses + training_overhead) × tier_factor)

Factor              What It Affects                                   Range
Tier Factor         AI quality level (Budget, Standard, or Quality)   1.0x - 3.0x
Training Overhead   Small fixed cost based on training-data size      0 - 10 credits
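
A worked version of the formula, as a minimal sketch. The function and constant names are illustrative. The tier factors used in the examples are only the two ends of the documented 1.0x to 3.0x range; which factor belongs to which quality tier is not stated here and should be checked against the pricing page.

```python
import math

TRAINING_TOKEN_CAP = 100_000         # training tokens beyond this add no further cost
TOKENS_PER_OVERHEAD_CREDIT = 10_000  # one overhead credit per started 10,000 tokens


def training_overhead(training_tokens: int) -> int:
    """Fixed overhead in credits, between 0 and 10."""
    capped = min(training_tokens, TRAINING_TOKEN_CAP)
    return math.ceil(capped / TOKENS_PER_OVERHEAD_CREDIT)


def credits_for_run(responses: int, training_tokens: int, tier_factor: float) -> int:
    """Credits consumed by one coding run."""
    return math.ceil((responses + training_overhead(training_tokens)) * tier_factor)


print(credits_for_run(200, 0, 1.0))        # 200: zero-shot at the lowest factor
print(credits_for_run(200, 25_000, 1.0))   # 203: 25,000 tokens round up to 3 credits
print(credits_for_run(200, 500_000, 3.0))  # 630: overhead capped at 10, then x 3.0
```

Because the overhead is capped at 10 credits, the number of responses and the tier factor dominate the cost of any sizable run.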

Free Credits

Every new account can receive up to 500 free credits: 50 credits immediately on signup, and 450 more after verifying your email. That's enough to code approximately 330 responses at Standard quality. No credit card required.

Purchased credits never expire: Buy once, use whenever you need them. Free credits expire 12 months after issuance if unused.

Reconciliation

Reconciliation is the process of resolving disagreements between the two AI raters. When Rater A and Rater B assign different categories to a response, you decide the correct code.

The Reconciliation Process

  1. Review disagreements: See responses where raters disagree, along with both suggested codes
  2. Make decisions: Choose the correct category (or assign a different one)
  3. Build training data: Your decisions can become training examples for future runs
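
The three steps above can be sketched as follows. CodedResponse, find_disagreements, and to_training_examples are hypothetical names, and the responses are made up; in qualcode.ai these steps happen in the reconciliation interface, not in code you write.

```python
from dataclasses import dataclass


@dataclass
class CodedResponse:
    text: str
    rater_a: str  # category assigned by Rater A
    rater_b: str  # category assigned by Rater B


def find_disagreements(coded: list[CodedResponse]) -> list[CodedResponse]:
    """Step 1: keep only responses where the two raters differ."""
    return [r for r in coded if r.rater_a != r.rater_b]


def to_training_examples(
    disagreements: list[CodedResponse], decisions: dict[str, str]
) -> list[tuple[str, str]]:
    """Step 3: turn human decisions (step 2) into (text, category) examples.

    Disagreements without a decision yet are skipped.
    """
    return [(r.text, decisions[r.text]) for r in disagreements if r.text in decisions]


coded = [
    CodedResponse("Great value", "Pricing", "Pricing"),
    CodedResponse("Support never answered and it cost a fortune", "Customer Service", "Pricing"),
]

disagreements = find_disagreements(coded)
# Step 2: the human reviewer picks the correct category for each disagreement.
decisions = {"Support never answered and it cost a fortune": "Customer Service"}
examples = to_training_examples(disagreements, decisions)
```

Here only the second response is flagged, and the reviewer's choice yields one new training example.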

Active Learning Loop

Reconciliation creates an active learning loop:

  • Reconciled disagreements that you save as training data become examples for both AI raters in subsequent runs
  • As examples accumulate, the raters have more of your decisions to learn from, so coding improves with each cycle
  • Start with zero training data and build precision organically through use

Disagreements are valuable: High-disagreement cases often represent genuinely ambiguous responses or gaps in your category definitions. Use them to refine your coding guide.


Next Steps

Now that you understand the core concepts: