Stanford University HIV Drug Resistance Database - A curated public database designed to represent, store, and analyze the divergent forms of data underlying HIV drug resistance.

Genotype-Phenotype Datasets

Last updated on 2019-2-20

High Quality Filtered Datasets
This genotype-phenotype correlation dataset contains isolates on which in vitro susceptibility tests were performed using the PhenoSense assay (Monogram, South San Francisco, USA) (Zhang 2005). Redundant viruses obtained from the same individual that contained the same pattern of major drug resistance mutations (defined below) were excluded to minimize bias that would result from over-representing highly similar viruses. Viruses with sequences containing mixtures at these major drug resistance mutation positions were also excluded because the presence of mixtures at these positions may confound genotype-phenotype correlations.

PI Major Drug Resistance Positions: 30, 32, 47, 48, 50, 54, 76, 82, 84, 88
NRTI Major Drug Resistance Positions: 41, 65, 70, 74, 75, 151, 184, 210, 215
NNRTI Major Drug Resistance Positions: 100I, 101P, 103N, 106A/M, 181C/I/A, 188C/L/H, 190A/E/S/Q, 230L
INI Major Drug Resistance Positions: 66A/I/K, 92Q, 118R, 143, 148H/R/K, 155H, 263K
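
The two exclusion rules described above (redundant viruses from the same individual, and mixtures at major positions) can be sketched roughly as follows for the PI class. This is only an illustration under stated assumptions, not the procedure HIVDB actually runs: the record layout (a dict with hypothetical keys `pt_id` and `positions`) and the function names are invented here, and a cell is treated as a mixture when it contains two or more amino acid letters, following the field description for P1...Pn.

```python
# Major PI positions as listed in the text above.
PI_MAJOR_POSITIONS = (30, 32, 47, 48, 50, 54, 76, 82, 84, 88)


def is_mixture(cell):
    """Two or more amino acid letters in one cell indicate a mixture."""
    return sum(ch.isalpha() for ch in cell) >= 2


def major_pattern(positions, major=PI_MAJOR_POSITIONS):
    """Return the amino acid cells at the major positions, in order.

    Assumption: a position missing from the mapping is treated as
    '.' (no sequence).
    """
    return tuple(positions.get(p, ".") for p in major)


def filter_isolates(isolates, major=PI_MAJOR_POSITIONS):
    """Drop isolates with mixtures at major positions, then keep only the
    first isolate per (individual, major-position pattern) pair.

    Each isolate is a dict with hypothetical keys 'pt_id' and 'positions'
    (position number -> cell in the P1...Pn encoding).
    """
    seen = set()
    kept = []
    for iso in isolates:
        pattern = major_pattern(iso["positions"], major)
        if any(is_mixture(cell) for cell in pattern):
            continue  # mixture at a major position
        key = (iso["pt_id"], pattern)
        if key in seen:
            continue  # redundant virus from the same individual
        seen.add(key)
        kept.append(iso)
    return kept
```

For the NNRTI and INI classes the list above names specific mutations rather than bare positions, so a faithful version would need to compare amino acids as well; that is left out of this sketch.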

In addition, the dataset can be read directly over the web by the R script provided here. The script contains a function that runs least squares regression on the dataset under cross-validation and, by default, generates two output files: (1) the coefficient and standard error of each input mutation, estimated from the cross-validation runs; and (2) the mean squared error (MSE) of each cross-validation run. The input parameters and additional options of this R function are documented in the script.
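
For readers who do not use R, the general idea of cross-validated least squares can be sketched in Python with NumPy. This is not a port of the HIVDB script, whose internals are not reproduced here. The function name, the number of folds, the intercept column, and the way the standard error is derived (spread of each coefficient across folds divided by the square root of the fold count) are all assumptions made for illustration. `X` is assumed to be a 0/1 matrix of isolates by mutations and `y` a numeric response such as log fold resistance.

```python
import numpy as np


def cross_validated_least_squares(X, y, n_folds=5, seed=0):
    """Fit ordinary least squares on each training split of a k-fold scheme.

    Returns (mean_coef, se_coef, mses):
      mean_coef: coefficient per column, averaged over folds (intercept first)
      se_coef:   spread of each coefficient across folds / sqrt(n_folds)
                 (an assumed definition, for illustration only)
      mses:      held-out mean squared error, one value per fold
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column

    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    coefs, mses = [], []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Least squares fit on the training rows only.
        beta, *_ = np.linalg.lstsq(design[train_mask], y[train_mask], rcond=None)
        residuals = y[test_idx] - design[test_idx] @ beta
        coefs.append(beta)
        mses.append(float(np.mean(residuals ** 2)))

    coefs = np.array(coefs)
    mean_coef = coefs.mean(axis=0)
    se_coef = coefs.std(axis=0, ddof=1) / np.sqrt(n_folds)
    return mean_coef, se_coef, mses
```

Evaluating the error only on held-out rows is what distinguishes the per-fold MSE from an ordinary goodness-of-fit figure; rare mutations may be absent from some training splits, in which case their coefficients are poorly determined.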

To access high quality filtered datasets from HIVDB by drug class, click the links below:

Drug Class    Data
PI            11771 phenotype results from 1951 isolates
NRTI          9682 phenotype results from 1707 isolates
NNRTI         4177 phenotype results from 1812 isolates
INI           1519 phenotype results from 659 isolates

Description of fields in the datasets

Field Name    Description
SeqID         Sequence identifier
Drug Fold     Fold resistance to Drug X relative to the wild type.
P1...Pn       Amino acid at this position. '-' indicates consensus; '.' indicates no sequence; '#' indicates an insertion; '~' indicates a deletion; '*' indicates a stop codon; and a letter indicates a one-letter amino acid substitution. Two or more amino acid codes indicate a mixture. The consensus B amino acid sequences can be found here.
CompMutList   Complete list of mutations present in the sequence.
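
The symbol encoding used in the P1...Pn columns can be turned into a small lookup. The sketch below is illustrative only: the function name and the returned labels are invented here, and any cell that is neither a listed symbol nor purely alphabetic is reported as "unrecognized" rather than guessed at.

```python
# Symbols documented for the P1...Pn columns.
SYMBOLS = {
    "-": "consensus",
    ".": "no sequence",
    "#": "insertion",
    "~": "deletion",
    "*": "stop codon",
}


def classify_cell(cell):
    """Classify one P1...Pn cell according to the documented encoding."""
    cell = cell.strip()
    if cell in SYMBOLS:
        return SYMBOLS[cell]
    if cell.isalpha():
        # One letter is a substitution; two or more letters are a mixture.
        return "substitution" if len(cell) == 1 else "mixture"
    return "unrecognized"
```

For example, `classify_cell("K")` returns `"substitution"` and `classify_cell("KR")` returns `"mixture"`.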

Complete Unfiltered Datasets
To access complete unfiltered datasets from HIVDB by drug class, click the links below:

Drug Class    Data                                          Method          Number of Isolates
PI            22749 phenotype results from 3905 isolates    PhenoSense      2166
                                                            Antivirogram    1427
                                                            Other           312
NRTI          30924 phenotype results from 5532 isolates    PhenoSense      3558
                                                            Antivirogram    1684
                                                            Other           290
NNRTI         8842 phenotype results from 3973 isolates     PhenoSense      2220
                                                            Antivirogram    1596
                                                            Other           157
INI           4505 phenotype results from 1719 isolates     PhenoSense      989
                                                            Other           730

Description of fields in the datasets

Field Name    Description
SeqID         Sequence identifier
PtID          Patient identifier
Subtype       Subtype of sequence
Method        Phenotype method
RefID         Published reference (see the References Table).
Type          Clinical vs. lab isolate. Lab isolates are site-directed mutants or results of in vitro passage.
IsolateName   Isolate identifier
SeqType       Complete vs. selective mutations. Complete mutation lists have been reported except for isolates annotated with 'PartialMutationList' in this column, which indicates that the authors reported only a subset of the mutations present in the isolate. The specific criteria for reporting mutations can be found in the associated publication in the References Table.
Drug Fold     Fold resistance to Drug X relative to the wild type.
P1...Pn       Amino acid at this position. '-' indicates consensus; '.' indicates no sequence; '#' indicates an insertion; '~' indicates a deletion; '*' indicates a stop codon; and a letter indicates a one-letter amino acid substitution. Two or more amino acid codes indicate a mixture. The consensus B amino acid sequences can be found here.
CompMutList   Complete list of mutations present, or the list of mutations reported (for isolates annotated with 'PartialMutationList' under 'SeqType').
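
Because the unfiltered datasets mix assay methods and include isolates with partial mutation lists, a common first step is to subset on the Method and SeqType fields. The sketch below assumes, without confirmation from this page, that a downloaded file is tab-delimited text with a header row whose column names match the field names above; the function name and its parameters are hypothetical.

```python
import csv


def select_isolates(lines, method="PhenoSense", allow_partial=False):
    """Return rows for one phenotype method from an unfiltered dataset.

    `lines` is any iterable of text lines, such as an open file object.
    Assumption: tab-delimited text with a header row containing
    'Method' and 'SeqType' columns.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    selected = []
    for row in reader:
        if row["Method"] != method:
            continue
        if not allow_partial and row["SeqType"] == "PartialMutationList":
            continue  # authors reported only a subset of mutations
        selected.append(row)
    return selected
```

Passing a file opened with `newline=""` follows the `csv` module's recommendation for reading; if the real files use a different delimiter or different header names, the `delimiter` argument and the column keys would need to change accordingly.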