Virus Host Classifier
Step 1: Predict overall viral sequence origin (human vs non-human) and identify extreme regions.
Step 2: Explore subregions to see local feature influence, distribution, GC content, etc.
Step 3: Analyze gene features and their contributions.
Step 4: Compare sequences and analyze differences.
Color Scale: Negative values (toward non-human) = Blue, Zero = White, Positive values (toward human) = Red.
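The color scale described above can be sketched as a small mapping function. This is only an illustration, not the tool's actual rendering code: the function name `blue_white_red` and the `vmax` parameter (the largest absolute value shown) are assumptions introduced here.

```python
def blue_white_red(value, vmax):
    """Map a signed value to an (r, g, b) tuple on a symmetric blue-white-red scale.

    Hypothetical helper: `vmax` is the largest absolute value to display,
    so the range is symmetric around zero.
    """
    if vmax <= 0:
        return (255, 255, 255)
    # Clamp to [-1, 1] so values beyond vmax saturate instead of overflowing.
    t = max(-1.0, min(1.0, value / vmax))
    fade = round(255 * (1 - abs(t)))
    if t >= 0:
        return (255, fade, fade)  # white fading to red for positive values
    return (fade, fade, 255)      # white fading to blue for negative values
```

Using the same `vmax` for both signs is what keeps zero pinned to white.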
Subregion Analysis
Select start/end positions to view local feature importance, distribution, GC content, etc.
The heatmap uses the same Blue-White-Red scale.
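As a rough sketch of what a subregion summary might compute, the example below takes a start/end selection and reports GC content and mean feature importance. It assumes 1-based inclusive positions and one importance value per base; both are assumptions for illustration, and `summarize_subregion` is a hypothetical name.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize_subregion(seq, importance, start, end):
    """Summarize a subregion given 1-based inclusive start/end (assumed convention).

    `importance` is assumed to hold one value per base of `seq`.
    """
    if len(importance) != len(seq):
        raise ValueError("importance must have one value per base")
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("start/end must satisfy 1 <= start <= end <= len(seq)")
    region = seq[start - 1:end]
    scores = importance[start - 1:end]
    return {
        "length": len(region),
        "gc_content": gc_content(region),
        "mean_importance": sum(scores) / len(scores),
    }
```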
Analyze Gene Features
Upload a FASTA file and a corresponding gene features file to analyze feature importance values per gene.
Gene features should be in the format:
gene_name [gene=X] [locus_tag=Y] [location=start..end] or [location=complement(start..end)]
SEQUENCE

The genome viewer will show genes color-coded by their contribution:
- Red: Genes pushing toward human origin
- Blue: Genes pushing toward non-human origin
- Color intensity indicates strength of signal
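The gene feature header format given above could be parsed along the following lines. This is a minimal sketch, not the tool's own parser: `parse_gene_header` is a hypothetical name, a leading `>` is tolerated but not required, and only the two location forms listed above are handled.

```python
import re

# Matches either "start..end" or "complement(start..end)".
_LOCATION = re.compile(r"(?:complement\((\d+)\.\.(\d+)\)|(\d+)\.\.(\d+))")


def parse_gene_header(header):
    """Parse 'gene_name [gene=X] [locus_tag=Y] [location=...]' into a dict."""
    header = header.lstrip(">").strip()
    name = header.split("[", 1)[0].strip()
    # Collect every [key=value] pair found in the header.
    fields = dict(re.findall(r"\[(\w+)=([^\]]+)\]", header))
    match = _LOCATION.fullmatch(fields.get("location", ""))
    if match is None:
        raise ValueError(f"unrecognized or missing location in: {header!r}")
    if match.group(1) is not None:
        start, end, strand = int(match.group(1)), int(match.group(2)), "-"
    else:
        start, end, strand = int(match.group(3)), int(match.group(4)), "+"
    return {
        "name": name,
        "gene": fields.get("gene"),
        "locus_tag": fields.get("locus_tag"),
        "start": start,
        "end": end,
        "strand": strand,
    }
```

Here a `complement(...)` location is reported as strand `-` and a plain range as strand `+`; that labeling is a choice made for this example.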
Compare Two Sequences
Upload or paste two FASTA sequences to compare their feature importance patterns.
The sequences will be normalized to the same length for comparison.
Color Scale:
- Red: Sequence 2 more human-like
- Blue: Sequence 1 more human-like
- White: No substantial difference
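One way to put two importance profiles on the same length before comparing them is linear interpolation, sketched below. The source does not say which normalization method the tool uses, so the interpolation, the function names `resample` and `compare_importance`, and the choice of the longer sequence's length as the target are all assumptions.

```python
def resample(values, n):
    """Linearly interpolate `values` onto `n` evenly spaced points."""
    if not values or n < 1:
        raise ValueError("need at least one value and n >= 1")
    if n == 1 or len(values) == 1:
        return [values[0]] * n
    step = (len(values) - 1) / (n - 1)
    out = []
    for i in range(n):
        pos = i * step
        lo = min(int(pos), len(values) - 1)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def compare_importance(imp1, imp2, n=None):
    """Return per-position differences (sequence 2 minus sequence 1).

    Positive values mean sequence 2 is more human-like at that position,
    negative values mean sequence 1 is, matching the color scale above.
    """
    if n is None:
        n = max(len(imp1), len(imp2))
    a = resample(imp1, n)
    b = resample(imp2, n)
    return [y - x for x, y in zip(a, b)]
```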
Interface Features
- Overall Classification (human vs non-human) using k-mer frequencies
- Feature Importance Analysis shows which k-mers push classification toward or away from human
- White-Centered Gradient:
  - Negative (blue), 0 (white), Positive (red)
  - Symmetrical color range around 0
- Identify Subregions with the strongest push toward human or non-human
- Gene Feature Analysis:
  - Analyze individual genes' contributions
  - Interactive genome viewer
  - Gene-level statistics and classification
- Sequence Comparison:
  - Compare two sequences to identify regions of difference
  - Normalized comparison to handle different lengths
  - Statistical summary of differences
- Data Export:
  - Download results as CSV files
  - Download k-mer importance values
  - Save analysis outputs for further processing
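For readers unfamiliar with k-mer frequencies, the sketch below shows how such a feature vector might be computed and written out as CSV. The value of k, the handling of ambiguous bases, and the column names are assumptions chosen for illustration; the classifier's actual settings are not stated here.

```python
import csv
import io
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return relative frequencies of all 4**k DNA k-mers in `seq`.

    k=4 is an arbitrary default for this example.
    """
    seq = seq.upper()
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are left out of the total.
    total = sum(counts[kmer] for kmer in all_kmers)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def frequencies_to_csv(freqs):
    """Serialize a k-mer frequency dict to CSV text (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["kmer", "frequency"])
    for kmer, freq in sorted(freqs.items()):
        writer.writerow([kmer, freq])
    return buffer.getvalue()
```

Enumerating all possible k-mers up front keeps the vector the same length for every sequence, which is what a fixed-input classifier would need.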