Auto-Suggest Coding Guide

Two independent AI models analyze your responses separately, then a third merges their findings semantically — producing a draft codebook with confidence ratings in minutes instead of days.

New to qualcode.ai? Auto-suggest is optional. You can always create coding guides manually or use existing ones. Auto-suggest eliminates the blank-page problem: when category discovery is the hardest part, it delivers a structured, confidence-rated starting codebook in minutes.

Already have a codebook? If you already have your categories in an Excel or CSV file, see Import & Export instead — auto-suggest is for discovering categories from your data, not for ingesting an existing list.

How It Works

Auto-suggest uses a three-AI workflow to analyze your survey responses and identify common themes:

You provide your data: Upload a CSV or Excel file with open-ended survey responses
Two AIs analyze independently: OpenAI GPT-5.2 and Anthropic Claude Opus 4.5 each analyze a random sample of your responses and suggest categories
Semantic merge: A third AI pass performs semantic reconciliation—matching categories by meaning, not just name. Categories identified by both AIs are flagged as high-confidence; categories from only one AI are flagged as low-confidence
You review and refine: Rename suggestions, edit descriptions, and delete or restore categories before creating the guide. Deeper editing can continue on the resulting coding guide later
Create your guide: Apply the suggestions to create a new coding guide, ready for coding runs
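The confidence and provenance labeling in the steps above can be illustrated with a minimal sketch. It is not qualcode.ai's implementation: in the product a third AI pass decides which categories match by meaning, whereas here that decision is delegated to a caller-supplied `same_meaning` function. The names `Suggestion`, `merge_suggestions`, and `shares_a_word` are hypothetical, and the word-overlap matcher is only a toy stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Suggestion:
    name: str
    source: str      # "Both", "OpenAI only", or "Anthropic only"
    confidence: str  # "High" or "Low"


def merge_suggestions(
    openai_cats: List[str],
    anthropic_cats: List[str],
    same_meaning: Callable[[str, str], bool],
) -> List[Suggestion]:
    """Label each category by whether both raters proposed it."""
    merged: List[Suggestion] = []
    matched = set()  # indices of Anthropic categories already paired
    for a in openai_cats:
        hit = next(
            (i for i, b in enumerate(anthropic_cats)
             if i not in matched and same_meaning(a, b)),
            None,
        )
        if hit is None:
            merged.append(Suggestion(a, "OpenAI only", "Low"))
        else:
            matched.add(hit)
            merged.append(Suggestion(a, "Both", "High"))
    # Anything the second rater proposed that was never paired is single-source.
    for i, b in enumerate(anthropic_cats):
        if i not in matched:
            merged.append(Suggestion(b, "Anthropic only", "Low"))
    return merged


def shares_a_word(a: str, b: str) -> bool:
    """Toy matcher (assumption): two names match if they share any word."""
    return bool(set(a.lower().split()) & set(b.lower().split()))


draft = merge_suggestions(
    ["Price concerns", "Customer support"],
    ["Support quality", "Delivery delays"],
    shares_a_word,
)
for s in draft:
    print(f"{s.name}: {s.source} ({s.confidence})")
```

In this toy run, "Customer support" and "Support quality" are paired and labeled Both (High), while the other two categories remain single-source and Low.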

Why Two AIs?

Using two independent AI models mirrors the inter-rater reliability approach used in traditional qualitative research. When two human coders independently analyze data and agree on themes, those themes are more likely to be meaningful and valid.

The same principle applies here:

Categories both AIs identified are marked "High" confidence: they are likely to represent clear, robust themes in your data
Categories only one AI found are marked "Low" confidence: they may represent valid but less obvious themes, or they may be false positives

Different perspectives: OpenAI and Anthropic models were trained on different data with different approaches. This genuine independence makes their agreement more meaningful than if we simply ran the same model twice.

Why a Third AI Pass?

The third pass is not a third rater for confidence scoring. It is a semantic merge step that reconciles overlapping category ideas by meaning, so you do not have to manually normalize near-duplicates before editing the draft codebook.

Less cleanup: Similar categories from the two independent raters are merged before you review them
Better starting structure: You see a cleaner draft codebook, not two disconnected suggestion lists
Clear provenance: Confidence still reflects whether the first two independent models agreed

Getting Started

To use auto-suggest, you need a data file with survey responses already uploaded to your project.

Navigate to your project and open your data file
Click Suggest Coding Guide (sparkle icon) in the data file header
Select the column containing your open-ended responses
Select Quick analysis mode (Thorough is shown as a coming-soon option)
Adjust the sample size if needed
Click Generate Suggestions to start the analysis

Premium AI Models

Auto-suggest always uses our most capable AI models to ensure high-quality category suggestions:

OpenAI GPT-5.2 with extended reasoning capabilities
Anthropic Claude Opus 4.5 with extended thinking

These premium models provide deeper analysis and more nuanced category identification than the standard models used for coding runs.

Choosing an Analysis Mode

Quick mode is available now. Thorough mode is planned for a later release and may appear in the interface as a disabled Soon option.

Mode	Availability	Description	Best For	Cost
Quick	Available	Direct dual-rater analysis with high reasoning effort	Most use cases, exploratory analysis	1.0x (base cost)
Thorough	Coming soon	Multi-step iterative analysis with deeper reasoning	Complex topics, nuanced themes, final codebook development	Planned: 2.0x

Note: Quick mode already uses extended reasoning/thinking. Thorough mode is not yet live; once available, it will spend more credits because it performs a deeper multi-step pass over the sampled responses.

Sample Size

You can adjust how many responses are analyzed (10 to 1,000). The default of 300 works well for most datasets. Larger samples can capture more themes but cost more credits.

Responses are randomly selected from your data file. Random selection keeps the sample from being biased toward responses at the beginning or end of the file, so it is more likely to be representative of the full dataset.
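As a rough illustration of sampling without positional bias, the sketch below draws a uniform random sample using Python's standard library. The function name `draw_sample` and the `seed` parameter are hypothetical. Capping the sample at the number of available responses is an assumption, since this guide does not say how the product handles files with fewer responses than the requested sample size.

```python
import random
from typing import List, Optional


def draw_sample(
    responses: List[str],
    sample_size: int = 300,
    seed: Optional[int] = None,
) -> List[str]:
    """Pick responses uniformly at random, without replacement."""
    if not 10 <= sample_size <= 1000:
        raise ValueError("sample_size must be between 10 and 1000")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # Assumption: never ask for more responses than the file contains.
    k = min(sample_size, len(responses))
    return rng.sample(responses, k)


responses = [f"response {i}" for i in range(2500)]
sample = draw_sample(responses, sample_size=300, seed=42)
print(len(sample))
```

Because every response has the same chance of selection, rows near the top of the file are no more likely to appear in the sample than rows near the bottom.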

Understanding Results

After analysis completes, you will see a list of suggested categories. Each category includes:

Category name: A short, descriptive label for the theme
Description: A definition explaining what belongs in this category
Example responses: Sample responses from your data that fit this category
Confidence badge: Whether both AIs agreed (High) or only one suggested it (Low)
Source: Which AI(s) identified this category

Confidence Levels

The confidence level indicates how robustly a category was identified:

Confidence	Meaning	Recommendation
High	Both OpenAI and Anthropic independently identified this theme	Strong candidate - likely a real pattern in your data
Low	Only one AI identified this theme	Review carefully - may be valid but less obvious, or could be a false positive

Provenance Badges

Each category shows where it came from:

Both: Category identified by both AIs (shown with a dual-user icon)
OpenAI only: Category identified only by the OpenAI model
Anthropic only: Category identified only by the Anthropic model

Editing Suggestions

The suggested categories are a starting point. You should review and refine them before creating your coding guide, then continue editing the guide itself if you want to reshape the codebook further.

Renaming Categories

Click on a category name to edit it. You can also modify the description to better match your research questions.

Combining Categories

If two suggested categories are similar or overlapping, you can remove one before guide creation and fold its wording into the category you keep. Once the guide exists, you can use the normal coding guide editor to add, rename, rewrite, split, or consolidate categories as your review develops.

Deleting Categories

Remove categories that do not fit your research needs. You can delete:

Overly broad categories that would catch too many responses
Categories outside the scope of your research question
Low-confidence categories you do not find useful

Reviewing Examples

Each category comes with example responses from your data. Review them as evidence for whether the suggestion makes sense; after creating the guide, you can add or edit training examples in the Coding Guides section.

Creating the Coding Guide

Once you are satisfied with your categories:

Review all categories one final time
Enter a name for your new coding guide
Choose whether to enable multi-label coding (if responses can belong to multiple categories)
Click Create Coding Guide

Your new guide will be created with:

All categories you selected
Descriptions for each category
Training examples from the AI-identified responses

You can then use this guide immediately for coding runs, or continue editing it in the Coding Guides section.

Best Practices

Before Running Auto-Suggest

Use representative data: The AI analyzes a sample of your responses. Make sure your data file represents the full range of responses you expect.
Larger samples are better: More responses give the AI more patterns to identify. A sample of at least 50-100 responses is recommended.
Consider your research question: Have a clear idea of what you are looking for - it helps you evaluate whether the suggestions are useful.

Reviewing Suggestions

Trust high-confidence categories: When both AIs agree, the theme is likely real and meaningful.
Scrutinize low-confidence categories: These may be valid but need human judgment to confirm.
Look for missing themes: AI suggestions are a starting point, not exhaustive. Add categories manually after creating the guide if important themes are missing.
Combine similar categories: The two AIs might use different names for the same concept. Remove duplicates before guide creation or consolidate them in the guide editor afterward.

After Creating the Guide

Run a test coding: Try the guide on a small subset of your data to see if categories work as expected.
Add training examples: If certain categories have low accuracy, add more training examples from your reconciled results.
Iterate: Coding guides improve over time as you add training data and refine categories.

Academic Validity

A common concern: Is AI-generated categorization academically valid? Here is why qualcode.ai's approach meets academic standards:

The Dual-Rater Principle

Inter-rater reliability is a cornerstone of qualitative research. When multiple coders independently analyze the same data and agree, this provides evidence that the coding scheme is reliable and not just one person's interpretation.

qualcode.ai applies this same principle using two genuinely independent AI systems:

OpenAI's models (trained by OpenAI)
Anthropic's models (trained by Anthropic)

These systems have different architectures, training data, and design philosophies. When they independently identify the same themes, this provides meaningful validation - similar to two human coders agreeing.

Human Review Required

Auto-suggest is a starting point, not a final answer. The workflow requires human review:

Researchers decide which suggested categories to keep, modify, or delete
Category names and descriptions are edited by humans
The final coding guide is a human-curated product, informed by AI suggestions

Transparent Provenance

Unlike black-box AI tools, qualcode.ai shows you exactly where each suggestion came from:

Which AI(s) identified each category
Confidence levels based on inter-AI agreement
Example responses that support each category

This transparency allows researchers to make informed decisions and report their methodology accurately.

Suggested Methods Section Text

When using auto-suggest, you might describe your methodology as:

"Coding categories were developed using an AI-assisted inductive approach via qualcode.ai's auto-suggest feature. Two independent large language models (OpenAI GPT-5.2 and Anthropic Claude Opus 4.5) analyzed a random sample of [N] responses to identify emergent themes. A semantic merge step reconciled categories by meaning across both models. Categories identified by both models were flagged as high-confidence; categories identified by only one model were flagged as low-confidence. All suggested categories were reviewed and refined by the research team before finalizing the coding guide."

Need more detailed templates? See our Citing qualcode.ai documentation for complete methods section templates, AI transparency statements, and supplementary materials checklists.

Frequently Asked Questions

How many responses does auto-suggest analyze?

Auto-suggest analyzes a random sample of your responses (configurable from 10 to 1,000, with a default of 300). Random selection keeps the sample from being biased toward any one part of the file, so it is more likely to reflect the full range of themes in your data. Larger samples may identify more themes but cost more credits.

What if important categories are missing?

AI suggestions are a starting point, not a complete solution. If you know certain themes should be present, create the suggested guide and add the missing categories in the normal coding guide editor.

Can I run auto-suggest multiple times?

Yes. If you have updated your data or want a fresh pass, you can run auto-suggest again. Each run produces fresh suggestions that you can review independently.

Does auto-suggest cost credits?

Yes. Auto-suggest uses credits because it runs two high-capability models plus a merge step on your data. The cost depends on sample size and analysis mode: max(5, floor((5 + 0.08 × sample_size) × mode_factor)). Quick uses 1.0x; at the default sample size of 300 responses, that is 29 credits. Thorough is planned but not live yet. The exact cost is shown before you confirm.
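The formula above can be checked with a short sketch. The function name `auto_suggest_cost` and the `MODE_FACTORS` table are hypothetical, and the 2.0x factor for Thorough is the planned value from this guide, not a live price. Exact fractions are used instead of floats so that `floor` is never affected by rounding error.

```python
import math
from fractions import Fraction

# Mode factors from this guide; "thorough" is planned and not yet live.
MODE_FACTORS = {"quick": Fraction(1), "thorough": Fraction(2)}


def auto_suggest_cost(sample_size: int, mode: str = "quick") -> int:
    """max(5, floor((5 + 0.08 * sample_size) * mode_factor)) in credits."""
    base = 5 + Fraction(8, 100) * sample_size
    return max(5, math.floor(base * MODE_FACTORS[mode]))


for n in (10, 100, 300, 1000):
    print(n, auto_suggest_cost(n))
```

For Quick mode this gives 5 credits at 10 responses, 13 at 100, 29 at the default 300, and 85 at 1,000. The cost shown in the interface before you confirm remains the authoritative figure.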

What if the two AIs disagree on everything?

Low agreement between AIs can indicate that your data has diverse or complex themes. In this case:

Review the low-confidence suggestions - many may still be valid
Use a larger sample if you have enough responses available
Use the suggestions as inspiration and create your own categories manually
