AI Code Detection: A Multi-Signal Approach
How ByteVerity achieves 95.6% F1 Score in detecting AI-generated code using an ensemble of machine learning models and heuristic signals.
1. Executive Summary
Problem: As AI coding assistants like GitHub Copilot, Claude Code, Cursor, and Devin become ubiquitous, organizations face a critical governance challenge: they cannot distinguish AI-generated code from human-written code in their repositories. This creates compliance risks, audit gaps, and security blind spots.
Solution: ByteVerity has developed a multi-signal AI code detection engine that combines machine learning with traditional heuristics. Our ensemble approach uses a fine-tuned Contrastive CodeBERT model as the primary signal, augmented by annotation detection, pattern analysis, timing heuristics, and git metadata analysis.
Results: Our production system achieves 95.6% F1 Score with 96.2% Precision and 95.0% Recall. The false positive rate is below 2%, making the system suitable for enterprise deployment where false alarms must be minimized.
2. Model Architecture
Base Model: CodeBERT
Our primary detection model is built on Microsoft's CodeBERT, a bimodal pre-trained model for programming language and natural language. We chose CodeBERT for its strong performance on code understanding tasks and its ability to capture semantic patterns in source code.
- Pre-trained on CodeSearchNet (2.1M bimodal code–documentation pairs and 6.4M unimodal code functions)
- 125M parameters with 12 transformer layers
- Supports 6 programming languages (Python, JavaScript, Java, Go, Ruby, PHP)
Fine-Tuning: Contrastive Learning
We apply contrastive learning to fine-tune CodeBERT for AI vs human code classification. The model learns to create embeddings where AI-generated code clusters separately from human-written code in the embedding space.
Loss = -log( exp(sim(z_i, z_j) / τ) / Σ_k exp(sim(z_i, z_k) / τ) )

where:
- z_i, z_j = a positive pair (same origin: AI or human)
- z_k = all samples in the batch other than i
- τ = temperature parameter (0.07)
- sim = cosine similarity
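As an illustration, here is a minimal sketch of this loss in PyTorch, assuming a batch of CodeBERT [CLS] embeddings and binary origin labels (1 = AI, 0 = human). The function name `contrastive_loss` and all hyperparameters are illustrative rather than ByteVerity's internal implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(embeddings: torch.Tensor, labels: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Supervised contrastive loss: same-origin (AI or human) embeddings are
    pulled together, different-origin embeddings pushed apart.

    embeddings: (batch, 768) CodeBERT [CLS] vectors
    labels:     (batch,) origin labels, 1 = AI-generated, 0 = human-written
    """
    z = F.normalize(embeddings, dim=1)                # unit vectors -> dot product = cosine similarity
    sim = z @ z.T / tau                               # (batch, batch) scaled similarity matrix

    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    sim = sim.masked_fill(self_mask, float("-inf"))   # exclude each anchor from its own denominator
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # average log-probability over an anchor's positives, then over anchors with at least one positive
    pos_counts = pos_mask.sum(dim=1)
    per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts.clamp(min=1)
    return per_anchor[pos_counts > 0].mean()
```

Positives are defined per origin label, so every same-origin pair in the batch is pulled together while cross-origin samples populate the denominator.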
Ensemble: XGBoost Classifier
The final classification uses an XGBoost classifier that combines the CodeBERT embedding with four additional signal features. This ensemble approach improves robustness and allows the model to leverage different types of evidence.
- Input features: 768-dim CodeBERT embedding + 4 signal scores
- Output: binary classification (AI/human) + confidence score
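A sketch of how this ensemble stage could be assembled, assuming precomputed embedding and signal-score matrices; `train_ensemble` and the hyperparameters shown are illustrative, not the production configuration.

```python
import numpy as np
import xgboost as xgb

def train_ensemble(X_embed: np.ndarray, X_signals: np.ndarray, y: np.ndarray) -> xgb.XGBClassifier:
    """Fit the final classifier on the concatenated feature vector.

    X_embed:   (N, 768) CodeBERT embeddings
    X_signals: (N, 4) annotation / pattern / timing / git scores
    y:         (N,) labels, 1 = AI-generated, 0 = human-written
    """
    X = np.hstack([X_embed, X_signals])    # 772 features per sample
    clf = xgb.XGBClassifier(
        n_estimators=400,                  # illustrative hyperparameters
        max_depth=6,
        learning_rate=0.05,
        eval_metric="logloss",
    )
    clf.fit(X, y)
    return clf
```

At inference time, `clf.predict_proba(X)[:, 1]` yields the per-snippet confidence score.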
3. Training Dataset
Data Sources
Human-Written Code
- Open-source repositories (pre-2021)
- Verified human-only contributions
- Code review history analysis
- Manual curation for quality
AI-Generated Code
- GitHub Copilot outputs
- Claude-generated samples
- Cursor completions
- GPT-4 code generation
Dataset Statistics
| Metric | Value |
|---|---|
| Total Samples | 2.4M code snippets |
| AI-Generated | 1.1M (46%) |
| Human-Written | 1.3M (54%) |
| Languages | 12 programming languages |
| Train/Val/Test Split | 70% / 15% / 15% |
Preprocessing
- Tokenization using CodeBERT tokenizer (max 512 tokens)
- Comment and docstring normalization
- Whitespace standardization
- Duplicate and near-duplicate removal
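To make the pipeline concrete, here is a simplified preprocessing sketch using the public `microsoft/codebert-base` tokenizer from Hugging Face. The normalization regexes are stand-ins for the production rules, and only exact-duplicate removal is shown (near-duplicate detection is omitted).

```python
import hashlib
import re
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")

def normalize(snippet: str) -> str:
    """Strip line comments and standardize whitespace (simplified stand-in)."""
    snippet = re.sub(r"(#|//).*$", "", snippet, flags=re.MULTILINE)
    return re.sub(r"\s+", " ", snippet).strip()

def preprocess(snippets: list[str]) -> list[dict]:
    """Normalize, deduplicate, and tokenize snippets to at most 512 tokens."""
    seen: set[str] = set()
    encoded = []
    for code in snippets:
        norm = normalize(code)
        digest = hashlib.sha1(norm.encode()).hexdigest()   # exact-duplicate removal only
        if not norm or digest in seen:
            continue
        seen.add(digest)
        encoded.append(tokenizer(norm, truncation=True, max_length=512))
    return encoded
```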
4. Detection Methods
Our multi-signal detection engine combines five independent methods, each contributing to a weighted confidence score. This ensemble approach provides robustness against adversarial attempts to disguise AI-generated code.
ML-Based Detection (CodeBERT)
Signal confidence: 98%. The primary detection signal: fine-tuned CodeBERT generates embeddings that capture semantic and stylistic differences between AI and human code, and the XGBoost classifier produces the final probability.
Strengths: High accuracy on syntactically valid code, language-agnostic features
Weaknesses: Requires minimum code length (~50 tokens), compute-intensive
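A sketch of the embedding step at inference time, assuming the fine-tuned encoder loads as a standard Hugging Face model; in production the weights would be the fine-tuned checkpoint rather than the public `microsoft/codebert-base` base model shown here.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
encoder = AutoModel.from_pretrained("microsoft/codebert-base")   # fine-tuned weights in production

@torch.no_grad()
def embed(code: str) -> torch.Tensor:
    """Return the 768-dim [CLS] embedding that feeds the XGBoost classifier."""
    inputs = tokenizer(code, truncation=True, max_length=512, return_tensors="pt")
    return encoder(**inputs).last_hidden_state[:, 0, :].squeeze(0)
```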
Annotation Detection
Signal confidence: 95%. Detects explicit markers left by AI tools, including comments, metadata, and tool-specific signatures that indicate AI involvement.
// Generated by Copilot
// @ai-generated
# Claude suggestion
/* Cursor autocomplete */
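A rule-based sketch of this signal: the patterns mirror the example markers above, while the production rule set is broader and tool-specific. The function name `annotation_score` is illustrative.

```python
import re

# Example marker patterns; the production rule set covers many more tools.
AI_MARKERS = [
    re.compile(r"generated by copilot", re.IGNORECASE),
    re.compile(r"@ai-generated", re.IGNORECASE),
    re.compile(r"claude suggestion", re.IGNORECASE),
    re.compile(r"cursor autocomplete", re.IGNORECASE),
]

def annotation_score(source: str) -> float:
    """Return 1.0 if any explicit AI marker appears in the source, else 0.0."""
    return 1.0 if any(p.search(source) for p in AI_MARKERS) else 0.0
```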
Pattern Detection
Signal confidence: 65%. Identifies structural patterns common in AI-generated code: consistent naming conventions, predictable formatting, template-like boilerplate, and characteristic variable naming.
Indicators: Excessive documentation, overly verbose variable names, repetitive error handling patterns, standardized import ordering
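A toy illustration of how such indicators could be folded into a single score; the specific heuristics, regexes, and equal weighting here are hypothetical simplifications of the production signal.

```python
import re

def pattern_score(source: str) -> float:
    """Crude structural-pattern score in [0, 1] from three toy indicators."""
    lines = source.splitlines() or [""]
    # Indicator 1: comment density (proxy for excessive documentation)
    comment_ratio = sum(l.strip().startswith(("#", "//")) for l in lines) / len(lines)
    # Indicator 2: long snake_case identifiers (proxy for overly verbose names)
    verbose_names = re.findall(r"\b[a-z]+(?:_[a-z]+){2,}\b", source)
    verbose_ratio = min(len(verbose_names) / max(len(lines), 1), 1.0)
    # Indicator 3: density of try blocks (proxy for repetitive error handling)
    try_blocks = source.count("try:") + source.count("try {")
    handling_ratio = min(try_blocks / max(len(lines) / 20, 1), 1.0)
    return (comment_ratio + verbose_ratio + handling_ratio) / 3
```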
Timing Heuristics
Signal confidence: 50%. Analyzes keystroke timing and code generation speed. AI-assisted code often appears in bursts that are too fast or too consistent for human typing patterns.
Metrics: Characters per second, pause patterns, edit velocity, bulk insertion detection
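A simplified sketch of burst detection over edit events; the `EditEvent` structure and the 40 characters-per-second threshold are assumptions for illustration, not the deployed heuristic.

```python
from dataclasses import dataclass

@dataclass
class EditEvent:
    timestamp: float       # seconds since the editing session started
    chars_inserted: int    # characters added in this edit

def timing_score(events: list[EditEvent], burst_cps: float = 40.0) -> float:
    """Fraction of inter-edit intervals whose insertion rate exceeds a plausible
    human typing speed (hypothetical threshold of 40 characters/second)."""
    if len(events) < 2:
        return 0.0
    bursts = 0
    for prev, cur in zip(events, events[1:]):
        dt = max(cur.timestamp - prev.timestamp, 1e-3)   # avoid division by zero
        if cur.chars_inserted / dt > burst_cps:
            bursts += 1
    return bursts / (len(events) - 1)
```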
Git Metadata Analysis
Signal confidence: 40%. Examines git commit patterns, author metadata, and change frequency for signatures that correlate with AI tool usage.
Signals: Commit message patterns, file change clustering, author activity timing, diff size distribution
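A toy example of one such signal, using `git log` to flag unusually dense commit bursts for an author; the burst definition and scoring are hypothetical and stand in for the broader metadata analysis described above.

```python
import subprocess
from collections import Counter

def git_signal_score(repo_path: str, author: str) -> float:
    """Toy proxy: share of an author's commits landing in dense bursts
    (>= 3 commits within the same hour)."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--author", author,
         "--pretty=%ad", "--date=format:%Y-%m-%d %H"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    if not log:
        return 0.0
    per_hour = Counter(log)                                    # commits grouped by hour
    bursty = sum(count for count in per_hour.values() if count >= 3)
    return bursty / len(log)
```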
Weighted Signal Aggregation
Final confidence score is computed as a weighted combination of all signals:
final_score = 0.45 * ml_score + 0.25 * annotation_score + 0.15 * pattern_score + 0.10 * timing_score + 0.05 * git_score
Threshold for positive classification: 0.65 (configurable per deployment)
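In code, the aggregation is a straightforward weighted sum; a minimal sketch using the weights and default threshold above (the function name and signature are illustrative).

```python
def aggregate(ml: float, annotation: float, pattern: float, timing: float, git: float,
              threshold: float = 0.65) -> tuple[float, bool]:
    """Combine the five signal scores (each in [0, 1]) into the final confidence
    and the binary AI-generated decision."""
    score = 0.45 * ml + 0.25 * annotation + 0.15 * pattern + 0.10 * timing + 0.05 * git
    return score, score >= threshold

# Example: a strong ML signal plus an explicit annotation marker clears the threshold.
# aggregate(0.92, 1.0, 0.4, 0.2, 0.1) -> (0.757, True)
```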
5. Evaluation Results
Overall Performance Metrics
| Metric | Value |
|---|---|
| F1 Score | 95.6% |
| Precision | 96.2% |
| Recall | 95.0% |
| False Positive Rate | 1.8% |
Confusion Matrix (Test Set: 360K samples)
| | Predicted: Human | Predicted: AI |
|---|---|---|
| Actual: Human | 186,540 (TN) | 3,460 (FP) |
| Actual: AI | 8,500 (FN) | 161,500 (TP) |
Performance by Programming Language
6. Agent Attribution
Beyond binary AI/human classification, our system identifies the specific AI coding assistant that generated the code. This is achieved through agent-specific signatures and behavioral patterns.
Attribution Accuracy
Attribution Signals
- Tool-specific comment patterns
- Naming convention fingerprints
- Error handling style patterns
- Documentation verbosity metrics
7. Limitations & Edge Cases
Short Code Snippets
Accuracy drops to ~75% for snippets under 50 tokens. Short utility functions may not contain enough signal for reliable classification.
Heavily Edited AI Code
Code that was AI-generated but significantly modified by humans may be classified as human-written. This is a feature (the human "owns" the code) but limits provenance tracking.
Unknown AI Tools
New or custom AI coding tools not in our training set may evade detection. We continuously update the model with emerging tools.
Cross-Language Detection
Performance varies across languages. Less common languages (Scala, Kotlin) have lower accuracy due to limited training data.
8. Future Improvements
Real-Time IDE Detection
Direct integration with IDEs to detect AI assistance at the moment of generation, before code is committed.
Continuous Learning Pipeline
Automated retraining on newly observed AI patterns to maintain accuracy as AI tools evolve.
Granular Attribution
Line-level attribution to identify exactly which portions of a file were AI-generated vs human-written.
Multi-Modal Analysis
Incorporating video/screen recordings and chat logs from AI tools for higher-confidence attribution.
Ready to detect AI-generated code in your repositories?
Deploy ByteVerity's ML detection engine and gain visibility into all AI activity across your codebase.