Genotype-Phenotype Datasets
PI Major Drug Resistance Positions: 30, 32, 47, 48, 50, 54, 76, 82, 84, 88
NRTI Major Drug Resistance Positions: 41, 65, 70, 74, 75, 151, 184, 210, 215
NNRTI Major Drug Resistance Positions: 100I, 101P, 103N, 106A/M, 181C/I/A, 188C/L/H, 190A/E/S/Q, 230L
INI Major Drug Resistance Positions: 66A/I/K, 92Q, 118R, 143, 148H/R/K, 155H, 263K
CAI Major Drug Resistance Positions: 56I, 57S, 66I, 67H/Y/N/K, 70R/S/N/H, 74D/S/K, 105T/E/S, 107N/C
In addition, the dataset can be read directly over the web by the R script provided here. This R script contains a function that runs least squares regression on the dataset under cross-validation and, by default, generates two output files: (1) the coefficient and standard error of each input mutation, estimated from the cross-validation runs; (2) the mean squared error (MSE) estimated in each cross-validation run. The function's input parameters and additional options are documented in the script.
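The cross-validated regression described above can be illustrated with a minimal, dependency-free Python sketch. This is not the provided R script: the function names (`solve_linear`, `fit_ols`, `cv_least_squares`), the 0/1 mutation-indicator encoding, the fold assignment, and the definition of the standard error (spread of each coefficient across folds) are all assumptions made for illustration, and the demo data is synthetic.

```python
import random
from statistics import mean, stdev


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix; drop collinear columns")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(X, y):
    """Least squares with an intercept, via the normal equations X'X b = X'y."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


def predict(beta, x_row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x_row))


def cv_least_squares(X, y, n_folds=5, seed=0):
    """Return (mean coefficient, standard error, per-fold MSE).

    X holds hypothetical 0/1 mutation indicators; y is the response
    (assumed here to be something like log fold resistance).
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    coefs, mses = [], []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        coefs.append(beta[1:])  # drop the intercept
        mses.append(mean((y[i] - predict(beta, X[i])) ** 2 for i in test))
    per_mutation = list(zip(*coefs))
    coef_mean = [mean(c) for c in per_mutation]
    # Assumed definition: standard error of the coefficient across folds.
    coef_se = [stdev(c) / n_folds ** 0.5 for c in per_mutation]
    return coef_mean, coef_se, mses


# Synthetic demo: three "mutations" with true effects 1.0, 0.5 and 0.0.
rng = random.Random(1)
X_demo = [[rng.randint(0, 1) for _ in range(3)] for _ in range(200)]
y_demo = [1.0 * a + 0.5 * b + rng.gauss(0, 0.1) for a, b, _ in X_demo]
coef_mean, coef_se, mse = cv_least_squares(X_demo, y_demo)
```

In this sketch, `coef_mean` and `coef_se` play the role of the first output file and `mse` the second. A real analysis would more likely rely on an established regression routine than on hand-written elimination.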
To access high quality filtered datasets from HIVDB by drug class, click the links below:
Description of fields in the datasets
| Field Name | Description |
|---|---|
| SeqID | Sequence identifier |
| Drug Fold | Fold resistance of Drug X compared to the wild type. |
| P1...Pn | Amino acid at this position. '-' indicates consensus; '.' indicates no sequence; '#' indicates an insertion; '~' indicates a deletion; '*' indicates a stop codon; and a letter indicates a one-letter amino acid substitution. Two or more amino acid codes indicate a mixture. The consensus B amino acid sequences can be found here. |
| CompMutList | Complete list of mutations present in the sequence. |
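The position encoding in the table above can be decoded mechanically. The following Python sketch shows one way to do so; the function names, the returned labels, and the assumption that each row is a dictionary keyed by column names such as `P30` are hypothetical and chosen only for illustration. Cells that do not match any documented symbol are labelled `unknown` rather than guessed at.

```python
# Symbols documented for the P1...Pn columns.
SYMBOLS = {
    "-": "consensus",
    ".": "no_sequence",
    "#": "insertion",
    "~": "deletion",
    "*": "stop_codon",
}


def classify_position(cell):
    """Classify one P1...Pn cell according to the field description."""
    cell = cell.strip()
    if cell in SYMBOLS:
        return SYMBOLS[cell]
    if cell.isalpha():
        # One letter is a substitution; two or more letters are a mixture.
        return "substitution" if len(cell) == 1 else "mixture"
    return "unknown"


def mutations_from_row(row):
    """Collect (position, cell, kind) for every non-consensus position.

    `row` is assumed to map column names to strings, with position
    columns named 'P' followed by the position number.
    """
    found = []
    for key, cell in row.items():
        if key.startswith("P") and key[1:].isdigit():
            kind = classify_position(cell)
            if kind not in ("consensus", "no_sequence", "unknown"):
                found.append((int(key[1:]), cell.strip(), kind))
    return sorted(found)


# Made-up example row, not taken from the dataset.
example_row = {"SeqID": "1", "P10": ".", "P30": "N", "P46": "~", "P82": "AT", "P84": "-"}
example_mutations = mutations_from_row(example_row)
```

Keeping consensus and missing positions out of the result makes the output comparable in spirit to a mutation list, although the exact format of `CompMutList` is not specified here and is not reproduced by this sketch.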
| Drug Class | Data | Isolates (PhenoSense) | Isolates (Other) |
|---|---|---|---|
| PI | 27341 phenotype results from 4512 isolates | 2715 | 1797 |
| NRTI | 30331 phenotype results from 5524 isolates | 3370 | 2154 |
| NNRTI | 11776 phenotype results from 5034 isolates | 2815 | 2219 |
| INI | 5694 phenotype results from 2146 isolates | 1169 | 977 |
| CAI | 171 phenotype results from 171 isolates | 82 | 89 |
Description of fields in the datasets
| Field Name | Description |
|---|---|
| SeqID | Sequence identifier |
| PtID | Patient identifier |
| Subtype | Subtype of sequence |
| Method | Phenotype method |
| RefID | Published reference. View References Table |
| Type | Clinical vs. Lab Isolate. Lab isolates are site directed mutants or results of in vitro passage. |
| IsolateName | Isolate identifier |
| SeqType | Complete vs. selective mutations. Complete mutation lists have been reported except for isolates annotated with 'PartialMutationList' in this column, which indicates that the authors reported only a subset of the mutations present in the isolate. The specific criteria for reporting mutations can be found in the associated publication in the References Table. |
| Drug Fold | Fold resistance of Drug X compared to the wild type. |
| P1...Pn | Amino acid at this position. '-' indicates consensus; '.' indicates no sequence; '#' indicates an insertion; '~' indicates a deletion; '*' indicates a stop codon; and a letter indicates a one-letter amino acid substitution. Two or more amino acid codes indicate a mixture. The consensus B amino acid sequences can be found here. |
| CompMutList | Complete list of mutations present in the sequence, or the list of reported mutations for isolates annotated with 'PartialMutationList' in the 'SeqType' column. |
