Stanford University HIV Drug Resistance Database - A curated public database designed to represent, store, and analyze the divergent forms of data underlying HIV drug resistance.

Genotype-Phenotype Datasets

Last updated on 2014-9-28

High Quality Filtered Datasets
This genotype-phenotype correlation dataset contains isolates on which in vitro susceptibility tests were performed using the PhenoSense assay (Monogram, South San Francisco, USA) (Zhang 2005). Redundant viruses obtained from the same individual that contained the same pattern of major drug resistance mutations (defined below) were excluded to minimize bias that would result from over-representing highly similar viruses. Viruses with sequences containing mixtures at these major drug resistance mutation positions were also excluded because the presence of mixtures at these positions may confound genotype-phenotype correlations.

PI Major Drug Resistance Positions: 30, 32, 47, 48, 50, 54, 76, 82, 84, 88
NRTI Major Drug Resistance Positions: 41, 65, 70, 74, 75, 151, 184, 210, 215
NNRTI Major Drug Resistance Mutations: 100I, 101P, 103N, 106A/M, 181I/C/V, 188C/H/L, 190A/S/E/Q, 230L
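
The two exclusion rules described above (mixtures at major positions, and redundant viruses from the same individual) can be sketched roughly as follows. This is an illustration, not the procedure HIVDB actually ran: the record layout (`patient`, `residues`), the function names, treating a missing position as consensus, and keeping the first isolate seen per pattern are all assumptions made for the example. Only the list of PI positions comes from this page.

```python
# Major PI positions copied from the list above; everything else is illustrative.
PI_MAJOR_POSITIONS = [30, 32, 47, 48, 50, 54, 76, 82, 84, 88]


def major_pattern(residues, positions):
    """Return the cells at the major positions as a tuple, in position order.

    Assumption: a position missing from the dict is treated as consensus ('-').
    """
    return tuple(residues.get(pos, "-") for pos in positions)


def has_mixture(pattern):
    """A cell with two or more amino acid letters denotes a mixture."""
    return any(len(cell) > 1 and cell.isalpha() for cell in pattern)


def filter_isolates(isolates, positions=PI_MAJOR_POSITIONS):
    """Drop isolates with mixtures at major positions, then keep one isolate
    per (patient, major-position pattern) pair.

    Assumption: the first isolate encountered for each pair is the one kept.
    """
    seen = set()
    kept = []
    for iso in isolates:
        pattern = major_pattern(iso["residues"], positions)
        if has_mixture(pattern):
            continue
        key = (iso["patient"], pattern)
        if key in seen:
            continue
        seen.add(key)
        kept.append(iso)
    return kept
```

Two isolates from the same patient that differ only at non-major positions collapse to a single entry under this sketch, which matches the stated goal of not over-representing highly similar viruses.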

In addition, the R script provided here can read the dataset directly over the web. The script contains a function that runs least-squares regression on the dataset under cross-validation and, by default, generates two output files: (1) the coefficient and standard error of each input mutation, estimated from the cross-validation runs; and (2) the mean squared error (MSE) estimated in each cross-validation run. The input parameters and additional options of this R function are documented in the script.
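
For readers who want a feel for what least-squares regression under cross-validation involves, here is a minimal Python sketch. It is not a port of the R script: the function names, the fold assignment, the intercept term, and the choice of inputs (for example, 0/1 mutation indicators as `rows` and log fold resistance as `targets`) are assumptions. It returns the coefficients and test-set MSE of each fold; how the R script derives standard errors is not described on this page, so that step is left out.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(p)]
    return solve_linear(xtx, xty)


def predict(coefs, row):
    """Intercept plus the weighted sum of the predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def cross_validate(rows, targets, k=5, seed=0):
    """Return per-fold coefficient vectors and per-fold test MSE."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    all_coefs, all_mse = [], []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        coefs = fit_least_squares([rows[i] for i in train_idx],
                                  [targets[i] for i in train_idx])
        errors = [(targets[i] - predict(coefs, rows[i])) ** 2 for i in test_idx]
        all_coefs.append(coefs)
        all_mse.append(sum(errors) / len(errors))
    return all_coefs, all_mse
```

The normal-equations solver keeps the sketch dependency-free; it raises an error when a training fold leaves a mutation column constant or collinear, which a production analysis would need to handle (for instance by dropping rare mutations).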

Drug Class    High Quality Filtered Datasets
PI            10935 phenotype results from 1808 isolates
NRTI          8430 phenotype results from 1498 isolates
NNRTI         4048 phenotype results from 1765 isolates

Description of fields in the datasets

Field Name: Description
SeqID: Sequence identifier
Drug Fold: Fold resistance to Drug X compared with the wild type.
P1...Pn: Amino acid at this position. '-' indicates consensus; '.' indicates no sequence; '#' indicates an insertion; '~' indicates a deletion; '*' indicates a stop codon; and a letter indicates a one-letter amino acid substitution. Two or more amino acid codes indicate a mixture. The consensus B amino acid sequences can be found here.
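
The symbol key for the P1...Pn columns can be applied mechanically when parsing a dataset. The sketch below is one possible reading of that key; the function name, the returned labels, and the handling of cells that combine letters with symbols (reported as "unrecognized") are assumptions, since this page does not say whether such cells occur.

```python
def classify_cell(cell):
    """Map one P1...Pn cell to a (kind, residues) pair using the key above."""
    symbols = {
        "-": "consensus",
        ".": "no sequence",
        "#": "insertion",
        "~": "deletion",
        "*": "stop codon",
    }
    if cell in symbols:
        return symbols[cell], ""
    if cell.isalpha() and len(cell) == 1:
        # A single letter is a one-letter amino acid substitution.
        return "substitution", cell
    if cell.isalpha():
        # Two or more amino acid codes indicate a mixture.
        return "mixture", cell
    # Anything else is not covered by the documented key.
    return "unrecognized", cell
```

For example, `classify_cell("V")` yields `("substitution", "V")` and `classify_cell("AV")` yields `("mixture", "AV")`.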

 

Complete Unfiltered Datasets
To access complete unfiltered datasets from HIVDB by gene and phenotype assay, click the links below:

Gene    Method                      Complete Unfiltered Datasets
PR      PhenoSense (ViroLogic™)     14763 phenotype results from 2109 isolates
PR      Antivirogram (Virco™)       9965 phenotype results from 1425 isolates
PR      All Others                  701 phenotype results from 213 isolates
RT      PhenoSense (ViroLogic™)     17894 phenotype results from 1985 isolates
RT      Antivirogram (Virco™)       11633 phenotype results from 1663 isolates
RT      All Others                  1214 phenotype results from 330 isolates

Description of fields in the datasets

Field Name: Description
SeqID: Sequence identifier
PtID: Patient identifier
Subtype: Subtype of sequence
Method: Phenotype method
RefID: Published reference. View References Table
Type: Clinical vs. lab isolate. Lab isolates are site-directed mutants or results of in vitro passage.
IsolateName: Isolate identifier
Drug Fold: Fold resistance to Drug X compared with the wild type.
Drug FoldMatch: Fold match for Drug X. '=' for most results; '<' if the result is below the lower limit of the assay; '>' if the result is above the upper limit of the assay.
P1...Pn: Amino acid at this position. '-' indicates consensus; '.' indicates no sequence; '#' indicates an insertion; '~' indicates a deletion; '*' indicates a stop codon; and a letter indicates a one-letter amino acid substitution. Two or more amino acid codes indicate a mixture. The consensus B amino acid sequences can be found here.
Mutation Lists: The mutation lists at the far right of the output enumerate the mutations present in the sequence. Some lists contain all mutations; others contain specific sublists.
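
Because the FoldMatch column marks results that fall outside the assay's range, an analysis may want to treat those fold values as bounds rather than exact measurements. The sketch below shows one way to do so. The function name is hypothetical, and it assumes that the Fold column holds the assay limit itself whenever FoldMatch is '<' or '>', which this page does not state explicitly.

```python
def fold_bounds(fold, fold_match):
    """Return (lower, upper) bounds on fold resistance; None means unbounded."""
    if fold_match == "=":
        # Exact measurement within the assay's range.
        return fold, fold
    if fold_match == "<":
        # Below the assay's lower limit: only an upper bound is known.
        return None, fold
    if fold_match == ">":
        # Above the assay's upper limit: only a lower bound is known.
        return fold, None
    raise ValueError(f"unexpected FoldMatch value: {fold_match!r}")
```

Keeping the bounds separate makes it easy either to drop censored results or to pass them to a method that models censoring, depending on the analysis.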