Stanford University HIV Drug Resistance Database - A curated public database designed to represent, store, and analyze the divergent forms of data underlying HIV drug resistance.

Release Notes for HIVdb, HIVseq, HIValg

Last updated on May 16, 2014

Table of Contents

  1. Introduction
  2. Submission of Sequences and Mutations
  3. HIVdb
  4. Listing of Program Updates
  5. HIVseq
  6. HIValg
  7. User-Submitted Algorithms
  8. Program Code Download
  9. References
  10. Appendices
 
1. Introduction

The presence of HIV-1 drug resistance before starting a new antiretroviral (ARV) drug treatment regimen is an independent predictor of the virological response to that regimen. Several studies have shown that the use of genotypic resistance testing prior to the start of a new treatment regimen increases the likelihood of virological response to that regimen. However, interpreting the results of HIV-1 drug resistance tests is one of the most difficult tasks facing health care providers (TF Liu & RW Shafer, CID 2006). First, there are many different drug resistance mutations (RW Shafer & JM Schapiro, AIDS Rev 2008). Second, these mutations cause varying levels of decreased susceptibility to different ARVs. Third, standard genotypic resistance tests fail to detect drug-resistance mutations that are present at low levels within a patient's virus quasispecies.

More than 50 published studies have been performed to discover rules that correlate pre-therapy drug-resistance mutations with the clinical response to a salvage therapy ARV treatment regimen, including more than 30 studies of protease inhibitors, more than 20 studies of NRTIs, and additional studies of NNRTIs. However, rules-discovery studies are limited in power because of the large number of drug-resistance mutations, the large number of covariates that influence virologic response, and the different patient populations, optimized background regimens, and virological endpoints used in these studies (F Brun-Vezinet et al. Antivir Ther 2004). As a result, many academic and commercial groups have developed integrated genotypic resistance interpretation systems that supplement the rules discovered in these studies with other forms of data such as the results of in vitro susceptibility testing and of in vitro and in vivo drug selection studies.

The most commonly used publicly available integrated genotypic resistance interpretation systems include the HIVdb system found here, the ANRS (Agence Nationale de Recherches sur le Sida) system, the Rega Institute system, the Antiretroscan (Italian Antiretroviral Resistance Cohort), and the Geno2pheno (German National Reference Center) (Liu & Shafer, CID 2006). The most commonly used proprietary systems include the ViroSeq system, which is associated with Celera's FDA-approved HIV-1 RT and protease sequencing kit; the TrueGene system, which is associated with Siemens' FDA-approved HIV-1 RT and protease sequencing kit; the VircoType system developed by Virco Laboratories; and the GeneSeq system developed by Monogram Biosciences (Liu & Shafer CID 2006).

Each of these systems performs the same basic function: assess how active an ARV is likely to be against a particular mutant virus compared with the drug's activity against a wildtype virus. When combined with a sound understanding of the principles of antiretroviral therapy, these systems and other Web and printed drug-resistance summaries help health care providers better understand the results of HIV-1 genotypic resistance tests. However, because these systems do not explicitly consider the relative potencies of different ARV drugs and drug combinations or the results of other relevant clinical data such as previous drug-resistance test results, ARV treatment history, plasma HIV-1 RNA levels, CD4 counts, and drug toxicity, they do not have the logical power to instruct clinicians on which ARV drugs should be used when constructing a salvage therapy regimen (Liu & Shafer 2006).

There are three programs in the HIV Drug Resistance Database which share a common code base: HIVseq, HIVdb, and HIValg. HIVseq accepts user-submitted protease, RT, and integrase sequences, compares them to the consensus subtype B reference sequence, and uses the differences as query parameters for interrogating the HIV Drug Resistance database (Shafer, D Jung, & B Betts, Nat Med 2000; Rhee SY et al. AIDS 2006). The query result provides users with the prevalence of protease, RT and integrase mutations according to subtype and PI, nucleoside RT inhibitor (NRTI), non-nucleoside RT inhibitor (NNRTI), and integrase inhibitor (INI) exposure. This allows users to detect unusual sequence results immediately so that the person doing the sequencing can check the primary sequence output while it is still on the desktop. In addition, unexpected associations between sequences or isolates can be discovered by immediately retrieving data on isolates sharing one or more mutations with the sequence.

HIVdb is an expert system that accepts user-submitted HIV-1 pol sequences and returns inferred levels of resistance to 20 FDA-approved ARV drugs including 8 PIs, 7 NRTIs, 4 NNRTIs, and - with this update - one INI. In the HIVdb system, each HIV-1 drug resistance mutation is assigned a drug penalty score and a comment; the total score for a drug is derived by adding the scores of each mutation associated with resistance to that drug. Using the total drug score, the program reports one of the following levels of inferred drug resistance: susceptible, potential low-level resistance, low-level resistance, intermediate resistance, and high-level resistance.

With this update, the PIs have been renamed as follows: atazanavir/r (ATV/r), darunavir/r (DRV/r), fosamprenavir/r (FPV/r), indinavir/r (IDV/r), lopinavir/r (LPV/r), saquinavir/r (SQV/r), and tipranavir/r (TPV/r) where "/r" indicates co-administration with low-dose ritonavir (RTV) for pharmacological "boosting". Nelfinavir (NFV) which cannot be reliably boosted by ritonavir has not been changed. This change has been made to indicate unambiguously that the penalty scores and activity estimates for the PIs apply to their boosted form. Indeed, LPV is co-formulated with low-dose RTV; DRV and TPV are approved only with low-dose RTV; and ATV, FPV, SQV, and IDV are usually administered with low-dose RTV. Although ATV and FPV have been approved for administration without RTV boosting, these drugs are generally used in this manner for treating viruses lacking PI-resistance mutations.

HIValg is designed for users interested in comparing the results of different algorithms or who are interested in comparing and evaluating existing and newly developed algorithms. The ability to develop new algorithms that can be run on the HIV Drug Resistance Database depends on the Algorithm Specific Interface (ASI) compiler (Shafer & Betts JCM 2003).

 
2. Submission of Sequences and Mutations

2.1 User Interface

For each of the three programs, sequences can be entered using either the Sequence Analysis Form or the Mutation List Form. To use the Sequence Analysis Form, paste one or more non-interleaved sequences in fasta format into the textbox or upload a file containing up to 100 non-interleaved fasta sequences. Consistent with the fasta format, each sequence should be preceded by a line containing ">" followed by a sequence name and optionally followed by additional descriptors separated by pipes ("|").

To use the Mutation List Form, select mutations using the drop-down boxes or enter the mutations into the textboxes. When using the textboxes, it is essential that amino acid mutations are entered in UPPERCASE, whereas insertions and deletions should be entered using lowercase "ins" or "del". If there is a mixture of more than one amino acid at a position, write both amino acids (intervening slashes are optional). Mutations must be separated by either spaces or commas; preceding the amino acid position by the consensus amino acid residue is optional. When using the drop-down menu, choose the amino acid present in the sequence. If the amino acid is not on the list, select the asterisk, which will open a text box allowing you to enter an amino acid that is not on the drop-down list.
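
The textbox rules above can be illustrated with a short sketch. This is not the code used by the programs; the regular expression and the function name parse_mutation_list are assumptions made for illustration, and only the rules stated in this section are modeled (uppercase amino acids, lowercase "ins" or "del", optional consensus residue, optional slashes, space or comma separators).

```python
import re

# Hypothetical pattern for one textbox entry: optional consensus residue,
# position, then either lowercase "ins"/"del" or one or more uppercase
# amino acids (slashes between mixture residues are optional).
MUTATION_RE = re.compile(r"^([A-Z])?(\d+)(ins|del|[A-Z](?:/?[A-Z])*)$")


def parse_mutation_list(text):
    """Split a space- or comma-separated mutation list into
    (consensus, position, residues) tuples."""
    parsed = []
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        match = MUTATION_RE.match(token)
        if match is None:
            # Lowercase amino acids, for example, are rejected here.
            raise ValueError(f"unrecognized mutation: {token!r}")
        consensus, position, residues = match.groups()
        if residues not in ("ins", "del"):
            residues = residues.replace("/", "")
        parsed.append((consensus, int(position), residues))
    return parsed
```

For example, parse_mutation_list("M41L, 69ins K70K/Q 184V") yields one tuple per mutation, with the mixture at position 70 stored as "KQ".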

If you are a frequent user and typically enter many sequences at one time, it will be more convenient to use the Web service, Sierra. Sierra allows you to enter 1,000 sequences at one time and returns the results as an XML report that is easy to interpret and parse, making it unnecessary to manually inspect a large number of HTML results. Whichever interface you use (Mutation List Form, Sequence Analysis Form, or the Sierra Web service), your sequences and results are not stored on our servers.

2.2 Sequence Alignment and Amino Acid Translation

Nucleotide sequences are aligned to the consensus B HIV-1 pol amino acid sequence using a nucleotide to amino acid sequence local alignment program (X Huang Genomics 1996). Very short sequences or sequences containing multiple insertions, deletions, and frame-shifts may not be aligned successfully and may yield a warning. The current version of the program should be able to align all HIV-1 sequences. However, HIV-2 sequences usually produce warnings. We are planning to modify the code so that HIV-2 sequences can also be submitted. Amino acid insertions that appear in the region between RT amino acids 65 to 74 are hard-coded to appear at position 69, whereas amino acid deletions in this region are hard-coded to appear at position 67. This is consistent with how these mutations are most frequently described in published papers and with how the drug-resistance mutation penalties have been established.

Nucleotide triplets containing ambiguities are translated into each of the possible amino acids they encode. However, when the resulting list of possible amino acids is more than four, we replace this list with an 'X'. For example, WMC is translated to NTYS (N for AAC, T for ACC, Y for TAC, S for TCC), but WMS is translated to X instead of NTYSK* (N for AAC, T for ACC, Y for TAC, S for TCC, K for AAG, T for ACG, * for TAG, and S for TCG). All possible translations are explicitly defined in the triplets-table.txt file.
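
The translation rule above can be sketched as follows. The program itself looks translations up in triplets-table.txt; this sketch instead recomputes them from the IUPAC ambiguity codes and the standard genetic code. The function name translate_ambiguous and the max_residues parameter are assumptions made for illustration.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_ambiguous(triplet, max_residues=4):
    """Translate a triplet that may contain ambiguity codes.

    Returns every distinct amino acid encoded by the triplet, or "X"
    when there are more than max_residues possibilities.
    """
    residues = []
    for codon in product(*(IUPAC[base] for base in triplet.upper())):
        amino_acid = CODON_TABLE["".join(codon)]
        if amino_acid not in residues:
            residues.append(amino_acid)
    return "".join(residues) if len(residues) <= max_residues else "X"
```

With this sketch, translate_ambiguous("WMC") gives "NTYS" and translate_ambiguous("WMS") gives "X", matching the two examples above.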

2.3 Output Options

The Sequence Analysis method of the HIVdb interpretation program has 4 output options: HTML (default), Spreadsheet, Spreadsheet Fixed Width, and XML.

1) HTML output, the default, is best for analyzing sequences in the HTML browser.

2) Spreadsheet output is best for analyzing multiple sequences in a table format where all the results for one sequence are arranged on one line. Because the output has many columns, they are described below. Each row contains all the information on one sequence, and the columns are divided into 6 sections following the sequence id. Note that the number of columns will most likely differ from one set of sequences to another, since only the genes, positions, and mutations present in the set are shown.

  • gene summary information, some basic information about each protease, RT and integrase gene. For the given set, there were no protease genes so this section starts with the following RT column headers with the sample values given:
    GeneSummary  subtype  subtype match  firstAA  lastAA  Stop Codons, Frame Shifts  B,D,H,V,N  Unusual Residues  G-A Hypermutated
    RT           B        95.6           1        334     None                       None       None              None
    RT           B        93.9           30       247     None                       None       None              None
    RT           B        94.8           1        399     None                       None       None              None

  • insertions, deletions and frame shifts present in the entire sequence.
    indels                                       reading frame shifts
    There are no insertions or deletions         There are no reading frameshifts
    There are no insertions or deletions         There are no reading frameshifts
    RT AA Insertion: codon 69 AA: SG NA: AGTGGT  There are no reading frameshifts

  • drug resistance for PI, RTI, and INI drugs. For the given set, there were no protease genes so this section starts with the following RTI column headers with the (truncated) sample values given:
    NRTI                         NNRTI  OTHER                                                        3TC                      3TC  ABC                      ABC  AZT                      AZT
    None                         None   K122E,D123E,I135T,D177E,Q207N,V245M                          Susceptible              0    Susceptible              0    Susceptible              0
    T69NT                        None   K30Q,K32G,A33C,L34T,V35L,E36N,I37F,R83K,I135IMV,T200A,V245K  Susceptible              0    Susceptible              0    Susceptible              5
    M41L,D67G,T69Si,K70KQ,T215Y  K103N  P4PS,T7P,K20R,W88C,I135V,S162Y,I178L,Q207E,D237E,V245E       Intermediate resistance  50   High-level resistance    95   High-level resistance    125

  • mutations (all) for PR, RT, and IN genes. Each cell lists the amino acids present in the position indicated on the column header. For the given set, there were no protease genes so this section starts with the following RT column headers with the sample values given:
    Mutations  4    7    20   30   32   33   34   35   36   37   39   41   48   67   69   70   83   88   102  103  122  123  135  162  172  173  174  177  178  200  207  211  215  237  245
    RT                                                                                                             E    E    T                        E              N                   M
    RT         -    -    -    Q    G    C    T    L    N    F                        NT        K                             IMV                                A                        K
    RT         PS   P    R                                            L         G    i    KQ        C         N              V    Y                        L         E         Y    E    E

  • mutation binary (all) for PR, RT, and IN genes. Each cell indicates the presence or absence of the resistance mutation indicated on the column header; all the mutations present in this set are listed in this section. For the given set, there were no protease genes so this section starts with the following RT column headers with the sample values given:
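    AllMutationBinary  4P   4S   7P   20R  30Q  32G  33C  34T  35L  35T  36A  36N  37F  39E  41L  48S  48T  67G  69N  69S  69T  69i  70K  70Q  83K  88C  102K 102R 103N 122E 123D 123E 123N
    RT                 0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    1    0
    RT                 -    -    -    -    1    1    1    1    1    0    0    1    1    0    0    0    0    0    1    0    1    0    0    0    1    0    0    0    0    0    0    0    0
    RT                 1    1    1    1    0    0    0    0    0    0    0    0    0    0    1    0    0    1    0    1    0    1    1    1    0    1    0    0    1    0    0    0    0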

  • mutation binary (scored) for PR, RT, and IN genes. Each cell indicates the presence or absence of the resistance mutation indicated on the column header; the mutations listed in this section represent those mutations with a score. The list of mutations with their classifications can be found in the Tab-Delimited Files column in the comments section. For the given set, there were no scored Protease mutations so this section is composed of the following RT column headers with the sample values given:
    ScoredMutationBinary  41L  67G  69N  69i  70Q  103N 215Y
    RT                    0    0    0    0    0    0    0
    RT                    0    0    1    0    0    0    0
    RT                    1    1    0    1    1    1    1

3) Spreadsheet Fixed Width output is very similar to the Spreadsheet output above (and was the only spreadsheet option until May 2014). The first 3 sections are identical, but the last section shows a binary output only for drug-resistance mutations (DRMs). Another important difference is that the number of columns in each section stays the same regardless of the sequences submitted.

  • mutation binary (DRMs) for PR, RT, and IN genes. Each cell indicates the presence or absence of the resistance mutation indicated on the column header; the mutations listed in this section represent those mutations not classified as Other or polymorphic as discussed here. The list of mutations with their classifications can be found in the Tab-Delimited Files column in the comments section. Please note that within each position, the mutations are not in alphabetical order but arranged based on their classification; for example, PI Major mutations appear before PI Minor mutations. Protease starts with the following column headers with the sample values given:
    MutationBinary  10F  10I  10R  10V  10Y  11I  11L  23I  24I  30N  30A  30C  30E  30F  30G  30H  30I  30K  30L  30M  30P  30Q  30R  30S  30T  30V  30W  30Y
    PR              0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0

4) XML output is best for analyzing multiple sequences using a programmatic approach or an API that lets you import data from an XML-formatted file. This output is the same as the one returned by the Sierra web service and is detailed here. Please note that the HTML table that you see in the web browser is the result of applying an XML stylesheet; if you look at the underlying source by clicking "View->Page Source", you can see and save the XML output.

 
3. HIVdb

3.1 Quality control analysis

The quality control analysis reports four types of problem positions: (i) Positions containing stop codons or frame shifts; (ii) Positions containing highly ambiguous nucleotides: N (cannot distinguish between A, C, G, or T), B (contains a combination of C, G, and T), D (contains a combination of A, G, and T), H (contains a combination of A, C, and T), and V (contains a combination of A, C, and G); (iii) Highly unusual mutations, defined as mutations that are not associated with drug resistance and which are present in HIVDB at a frequency of <0.05% or in only a single reference. Tables containing non-highly-unusual mutations in protease, RT, and integrase can be found at these links: PR variation, RT variation, IN variation; (iv) Mutations strongly suggestive of APOBEC3G-induced G to A hypermutation. A table with such mutations in protease and RT can be found at this link.
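
As an illustration of check (ii), the sketch below scans a nucleotide sequence for the highly ambiguous codes B, D, H, V, and N. It is a minimal example rather than the program's own quality control code; the function name and the choice to report 1-based nucleotide positions are assumptions.

```python
# Codes that the quality control analysis treats as highly ambiguous.
HIGHLY_AMBIGUOUS = set("BDHVN")


def highly_ambiguous_positions(sequence):
    """Return (1-based nucleotide position, code) for each B, D, H, V, or N."""
    return [
        (index, base)
        for index, base in enumerate(sequence.upper(), start=1)
        if base in HIGHLY_AMBIGUOUS
    ]
```

Two-base ambiguity codes such as R or Y are not reported by this sketch, since only the five codes listed above are described as highly ambiguous.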

3.2 Subtyping

Each sequence is compared to a list of reference sequences for each of the Main group of HIV-1 sequences representing subtypes A, B, C, D, F, G, H, J, K, CRF01_AE, and CRF02_AG. The subtype of the closest reference sequence is assigned to the submitted sequence. This method will generally be accurate (Gifford R et al AIDS 2006); however, it will not accurately characterize circulating recombinant forms (CRFs) other than 01 and 02 or other non-CRF recombinants. Moreover, performance on protease sequences alone is often suboptimal because there is often insufficient phylogenetic signal to distinguish between subtypes B and D or between subtype A and CRF01_AE without additional sequence data.
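
The closest-reference approach can be sketched as below. The similarity measure used by the program is not stated in these notes, so percent identity over pre-aligned, equal-length sequences is an assumption, as are the function names and the toy reference sequences in the usage note.

```python
def percent_identity(a, b):
    """Percent of matching positions between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def assign_subtype(query, references):
    """Return (subtype, percent identity) of the closest reference.

    references maps a subtype name to an aligned reference sequence.
    """
    best = max(references, key=lambda name: percent_identity(query, references[name]))
    return best, percent_identity(query, references[best])
```

For instance, with the made-up references {"B": "ACGTACGTAC", "C": "ACGTTCGTTC"}, the query "ACGAACGTAC" is assigned to B because it differs from that reference at one position and from the C reference at three.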

Several other programs have increased accuracy for HIV-1 subtyping, including: (i) The Rega HIV-1 Subtyping Tool (de Oliveira T et al Bioinformatics 2005). This program follows the most rigorous approach for HIV-1 subtyping and uses boot-scanning to detect recombinant sequences. It can be found on this website or at the BioAfrica site maintained by Dr. de Oliveira; (ii) The STAR Subtype Analyzer, which uses a position-specific scoring matrix for HIV-1 subtyping (http://www.vgb.ucl.ac.uk/starn.shtml) (Myers RE et al, BioInformatics 2005); (iii) Virus subtyping tool, NCBI (http://www.ncbi.nlm.nih.gov/projects/genotyping/formpage.cgi) (Rozanov M et al Nucl Acids Res 2004); and (iv) Subtyping Distance Tool (SUDI) found at the Los Alamos HIV Sequence Database (http://www.hiv.lanl.gov/content/sequence/SUDI/sudi.html).

3.3 Mutation Categories

Mutations are defined as differences from the consensus B reference sequence (PR, RT, and IN). Mutations are further characterized as follows: (i) RT mutations - NRTI-resistance mutations, NNRTI-resistance mutations, and Other mutations; (ii) PR mutations - Major PI resistance mutations, Minor PI resistance mutations, and Other mutations; (iii) INI mutations - Major INI resistance mutations, Minor INI-resistance mutations. A partial explanation for how mutations are assigned to specific categories can be found in the HIVDB FAQ page. This categorization is occasionally modified as new drug resistance knowledge accrues. The latest categorization of mutations can be found in the tables linked here: PI Major and PI Minor, NRTI and NNRTI, and INI Major and INI Minor.

The "Other mutations" list may often contain mutations that are associated with drug resistance but which are primarily accessory and which are polymorphic (meaning they frequently occur even in untreated persons). The decision to move these mutations from the "Minor" category to the "Other" category was made for the following reasons: (i) these mutations have little effect on drug susceptibility, (ii) these mutations often represent the consensus sequence in non-B subtypes, (iii) including these mutations in the "Minor" mutation category would complicate the report and make it more difficult to identify mutations indicative of past selective drug pressure.

The "Other mutations" list may often contain rare non-polymorphic mutations that are associated with drug resistance but which have not yet been described in a peer-reviewed paper or received widespread recognition. Fortunately, these mutations are generally uncommon and generally emerge only after multiple "Major" and "Minor" resistance mutations have emerged (explaining why they have not been well studied). The decision to place these mutations in the "Other" category was made to simplify the report. Future versions of the report may indicate which of the mutations in the "Other" category may be indicative of past selective drug pressure and of uncertain clinical significance.

3.4 Mutation Penalty Scores

Mutation penalty scores are developed based on the following considerations: (i) Published studies and data linking mutations to ARV therapy; (ii) Published studies and data linking mutations to decreased ARV susceptibility; (iii) Published studies linking pre-therapy mutations with the virological response to a new ARV treatment regimen. Mutation penalty scores undergo repeated testing to ensure that the most common mutation patterns receive total scores consistent with the studies described above. Mutation penalty scores are frequently modified based on new papers, scientific presentations, and occasionally user feedback.

(i) Published studies and data linking mutations to ARV therapy. Mutations that are polymorphic (i.e. occur in the absence of selective drug pressure) generally do not receive scores or receive low scores even if the prevalence of these mutations increases during ARV therapy. The rationale for this approach is that polymorphic mutations have not been shown to significantly impair the response to a new ARV treatment regimen. Moreover, polymorphic mutations that are accessory mutations in one subtype often represent the consensus amino acid in another subtype. In contrast, nonpolymorphic mutations that emerge during ARV treatment failure are considered important ARV-resistance mutations and are assigned significant mutation penalty scores.

(ii) Mutations that decrease ARV susceptibility in vitro receive significant mutation penalty scores, particularly if these mutations have been shown to decrease ARV susceptibility in the absence of additional major mutations. Occasionally data of this type are available for site-directed mutants containing a single or a small set of mutations thus allowing an assessment of the precise contribution a mutation makes to decreased susceptibility. More often, however, such data are obtained from the statistical analysis of clinical isolates for which both genotypic and phenotypic data are available. Mutations that are associated with increased drug susceptibility generally receive a small negative score unless the mutation occurs in a mixed virus population.

(iii) Published studies linking pre-therapy mutations with virological response to a new ARV treatment regimen. As noted above, there have been more than 50 published studies of this type summarized in the Genotype-Clinical section of this website. These studies are often underpowered because of the large number of drug-resistance mutations, the large number of covariates that influence virologic response, and the different patient populations, optimized background regimens, and virological endpoints used in these studies. Therefore, our approach is to weigh the evidence from these studies carefully. Mutations that are associated with a decreased clinical response in large studies or in more than one small study are given more credence.

Some of the data from these studies such as those for tipranavir (RESIST study), darunavir (POWER studies), and etravirine (DUET study) have become so widely known that we provide a specific comment listing the number of specific mutations reported to be associated with resistance in these studies. Nonetheless, the mutations from these studies do not necessarily receive mutation penalties particularly if they are polymorphic. For example, many of the mutations from the original RESIST study list were highly polymorphic and were not assigned penalties. Two of the mutations in the current etravirine DUET study list (V90I and V106I) are polymorphic and are not assigned mutation penalties.

The most recent scores are available as tab-delimited files or tables sortable by position or drug:

Tab-Delimited Files
  • scores for PIs
  • scores for NRTIs
  • scores for NNRTIs
  • scores for INIs
  • scores for a combination of PI mutations
  • scores for a combination of NRTI mutations
  • scores for a combination of NNRTI mutations
  • scores for a combination of INI mutations

Sortable Tables
  • scores for PIs
  • scores for NRTIs
  • scores for NNRTIs
  • scores for INIs

To display the effect of our scoring in a concrete manner, we run our algorithm through a set of unique mutation patterns derived from HIV-1 RT sequences from >40,000 persons. The NRTI, NNRTI, and PI spreadsheets show the drug resistance levels for 2,081, 1,104, and 2,556 unique patterns of scored mutations using the latest version of our algorithm.

Throughout our website we refer to each drug by its abbreviation. The tables below list the different names for each drug.

Protease Inhibitors (PIs)
Generic Name Brand Name Abbreviation
tipranavir/r Aptivus TPV/r
indinavir/r Crixivan IDV/r
saquinavir/r Invirase SQV/r
lopinavir/r Kaletra LPV/r
fosamprenavir/r Lexiva FPV/r
atazanavir/r Reyataz ATV/r
nelfinavir Viracept NFV
darunavir/r Prezista DRV/r
Footnote: "/r" indicates ritonavir boosting. LPV is co-formulated with low-dose RTV; DRV and TPV are approved only with low-dose RTV; and ATV, FPV, SQV, and IDV are usually administered with low-dose RTV. Although ATV and FPV have been approved for administration without RTV boosting, these drugs are generally used in this manner for treating viruses lacking PI-resistance mutations

Nucleoside Reverse Transcriptase Inhibitors (NRTIs)
Generic Name Brand Name Abbreviation
emtricitabine Emtriva FTC
lamivudine Epivir 3TC
zidovudine Retrovir AZT
didanosine Videx ddI
tenofovir Viread TDF
stavudine Zerit d4T
abacavir Ziagen ABC

Non-Nucleoside Reverse Transcriptase Inhibitors (NNRTIs)
Generic Name Brand Name Abbreviation
efavirenz Sustiva EFV
etravirine Intelence ETR
nevirapine Viramune NVP
rilpivirine Edurant RPV

Integrase Inhibitors (INIs)
Generic Name Brand Name Abbreviation
raltegravir Isentress RAL
elvitegravir EVG
dolutegravir DTG

3.5 ARV Resistance Estimates

The drug resistance estimate for an ARV is obtained by adding together the scores for each of the mutations associated with resistance to that drug. The scores are titrated to fall within the following ranges: (i) 0 to 9: Susceptible. There is no evidence of reduced susceptibility compared with wildtype; (ii) 10 to 14: Potential low-level resistance. The virus is likely to be fully susceptible yet it contains mutations that may be indicative of previous exposure to the ARV class of the drug; (iii) 15 to 29: Low-level resistance. Virus isolates of this type have reduced in vitro drug susceptibility and/or patients with viruses of this genotype may have a suboptimal virologic response to treatment compared with the treatment of a wildtype virus; (iv) 30 to 59: Intermediate resistance. The genotype suggests a degree of drug resistance greater than low-level resistance but lower than high-level resistance; (v) >=60: High-level resistance. The genotype is similar to that of isolates with the highest levels of in vitro drug resistance and/or patients infected with isolates having similar genotypes generally have little or no virologic response to treatment with the drug.
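
The additive scoring and the five ranges can be sketched as follows. The penalty values in EXAMPLE_PENALTIES are made up for illustration and are not the scores used by HIVdb (the real values are in the score tables listed in section 3.4); the function names are likewise assumptions. Totals below 10, including negative totals, are treated as Susceptible in this sketch.

```python
# Hypothetical penalty scores, for illustration only.
EXAMPLE_PENALTIES = {
    "AZT": {"M41L": 15, "T215Y": 40},
    "3TC": {"M184V": 60},
}

# Lower bound of each range, highest first.
LEVELS = [
    (60, "High-level resistance"),
    (30, "Intermediate resistance"),
    (15, "Low-level resistance"),
    (10, "Potential low-level resistance"),
]


def resistance_level(total_score):
    """Map a total drug score to one of the five reported levels."""
    for threshold, level in LEVELS:
        if total_score >= threshold:
            return level
    return "Susceptible"


def score_drug(drug, mutations, penalties=EXAMPLE_PENALTIES):
    """Sum the penalties of the listed mutations for one drug."""
    drug_penalties = penalties.get(drug, {})
    total = sum(drug_penalties.get(mutation, 0) for mutation in mutations)
    return total, resistance_level(total)
```

With the example values, score_drug("AZT", ["M41L", "T215Y", "K103N"]) returns (55, "Intermediate resistance"): K103N carries no AZT penalty in the example table, so only the other two mutations contribute.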

At the end of every report is a table listing each of the ARV-resistance mutations present, their scores for each of the drugs, and the summary of scores for each of the drugs. This table is important to examine because it contains more information than the five categories listed at the top of the report. It is not uncommon for an isolate to have intermediate resistance to two PIs with one PI having a score of 31 (close to low-level resistance) and another having a score of 59 (close to high-level resistance). The scores themselves are also links to information in the database supporting the level of the mutation penalty.

As noted in the Introduction, the purpose of this program is to assess how active an ARV is likely to be against a particular mutant virus compared with its activity against wildtype virus. The program does little else to help a health care provider choose therapy. First, it is often wiser to use a highly potent drug assigned intermediate resistance than to use a less potent drug assigned low-level resistance. Second, some drugs such as 3TC and FTC continue to provide some degree of virological benefit even in the presence of high-level resistance, possibly because the mutations usually responsible for resistance, M184V/I, increase HIV-1 susceptibility to other NRTIs and because M184V/I are associated with decreased virus replication. Although a program that could select the appropriate treatment regimen for a patient would be desirable, no such program exists, making it necessary for all health-care providers to have a sound understanding of the principles of antiretroviral therapy (http://aidsinfo.nih.gov/Guidelines/Default.aspx?MenuItem=Guidelines).

3.6 Comments

Following the list of ARV Resistance Estimates, the HIVdb report contains a series of comments: (i) The first type of comment includes a listing of the mutations associated with the "GSSs" developed by Boehringer-Ingelheim for tipranavir (Baxter JD et al, J Virol 2006; Scherer J et al 11th European HIV Conf 2007) and by Tibotec for darunavir (De Meyer et al AIDS Res Hum Retrovirus 2008) and etravirine (Vingerhoets J et al HIVDRW 2007; Vingerhoets J et al HIVDRW 2008); (ii) Mutation-specific comments. These are brief 1 to 2 sentence synopses of XX number of protease, YY number of RT, and ZZ number of integrase mutations that have been associated with ARV resistance. (iii) A listing of mutations associated with hypersusceptibility. (iv) Highly unusual mutations defined as mutations that are not associated with drug resistance and which are present in HIVDB at a frequency of <0.05% or in only a single reference (These are also indicated in the Quality control analysis section of the report).

The most recent comments are available as tab-delimited files or web pages:

Tab-Delimited Files
  • comments for protease inhibitors
  • comments for RT inhibitors
  • comments for integrase inhibitors

Web Pages
  • comments for PIs
  • comments for NRTIs
  • comments for NNRTIs
  • comments for INIs
 
4. Program Updates

The scoring tables, comments, and programs are frequently updated; these updates are tracked in the Updates page. Below is a listing of our current and previous versions linking to the specific improvements since January 2003.

 
5. HIVseq

HIVseq allows users to examine new sequences in the context of previously published sequence data on RT, protease, and integrase (Shafer R, Jung D, and Betts B, Nature Med 2000; Rhee et al AIDS 2006). Like HIVDB, HIVseq can accept either mutations or complete sequences and produces an assessment of quality control.

HIVseq overview:
HIVseq accepts user-submitted RT, protease, and integrase sequences, compares them to a reference sequence (subtype B consensus) and uses the differences to query the database. The program output includes (i) a list of mutations defined as differences from the consensus B amino acid sequence, (ii) the frequency with which each mutation occurs in treated and untreated persons infected with viruses belonging to the eight most common subtypes (A, B, C, D, F, G, CRF01_AE, CRF02_AG), and (iii) hyperlinks to a table containing each report of those mutations associated with a particular treatment status and subtype (the mutation itself is a hyperlink).

Detailed description of the tabular output of HIVseq:
For RT sequences, the program provides an NRTI table containing mutation frequency for isolates from RTI-naive and NRTI-treated persons, and an NNRTI table containing mutation frequency for isolates from NNRTI-naive and NNRTI-treated persons. For protease sequences, the program provides a protease table containing mutation frequency for HIV-1 isolates from PI-naive and PI-experienced persons. For integrase sequences, the program provides an integrase table containing mutation frequency for HIV-1 isolates from INI-naive and INI-experienced persons.

Each table contains one row for each mutation and 20 columns. Columns 1 to 4 list the position, the position's consensus amino acid, the submitted nucleotide triplet, and the submitted mutation. Columns 5 to 12 list the frequency of each mutation in subtypes A, B, C, D, F, G, CRF01_AE, and CRF02_AG in drug-class-naive persons. Columns 13 to 20 list the frequency of each mutation in subtypes A, B, C, D, F, G, CRF01_AE, and CRF02_AG in drug-class-experienced persons. Each mutation is also a hyperlink to a separate web page with detailed information on each isolate, including literature references with Medline abstracts, the GenBank accession number, and complete sequence and treatment records.

Note: To minimize reporting bias, the mutation frequency tables contain one sequence per individual. For individuals in whom sequences from multiple isolates were published, the mutation tables include the earliest sequence from untreated persons and the latest sequence (while on therapy) from persons receiving antiretroviral therapy. To exclude technical sequencing errors and cases of circulating virus containing unusual variants, the mutation tables include only mutations present as the predominant form whenever multiple clones from the same isolate were sequenced. Sequences of poor quality and those considered to be possible laboratory contaminants are excluded from the data sets.
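The one-sequence-per-individual rule in the note above can be sketched as follows. The `Isolate` record and its field names are hypothetical, and the sketch assumes that a person with at least one on-therapy isolate is counted as treated; the actual database logic may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Isolate:
    """Hypothetical record; field names are assumptions for illustration."""
    person_id: str
    sampled_on: date
    on_therapy: bool  # True if the isolate was obtained while on therapy


def one_isolate_per_person(isolates: list[Isolate]) -> list[Isolate]:
    """Keep the earliest isolate for untreated persons and the latest
    on-therapy isolate for persons receiving antiretroviral therapy."""
    by_person: dict[str, list[Isolate]] = {}
    for isolate in isolates:
        by_person.setdefault(isolate.person_id, []).append(isolate)

    selected = []
    for person_isolates in by_person.values():
        treated = [i for i in person_isolates if i.on_therapy]
        if treated:
            # Treated person: latest sequence obtained while on therapy.
            selected.append(max(treated, key=lambda i: i.sampled_on))
        else:
            # Untreated person: earliest available sequence.
            selected.append(min(person_isolates, key=lambda i: i.sampled_on))
    return selected
```

Quality filtering (poor sequences, possible contaminants, minority clones) would be applied before this step and is not shown.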

The following table summarizes the number of persons whose sequences are used for the HIVseq output.

Subtype | protease | RT | integrase
Gene | PI-naïve | PI-treated | RTI-naïve | RTI-treated | INI-naïve | INI-treated
A15241181301363162 
B7407665155391094237811
C214528219791063451 
D5128232024880 
F60530726228159 
G619199368635761
CRF01_AE89061762728125 
CRF02_AG1415119102536293 
 
6. HIValg

6.1 Objectives

The objectives of this program are to 1) identify the extent of agreement between three commonly used genotypic drug resistance interpretation systems; and 2) identify sequences responsible for disagreements between these systems. It is important to note that two of the three algorithms have been simplified from a five- to six-level output (Rega) or a five-level output (HIVdb) to a three-level output so that all three algorithms can be roughly compared. It is also important to note that discrepancies of one level (e.g. susceptible vs low/intermediate resistance, or low/intermediate resistance vs high-level resistance) can frequently occur by chance if the level of resistance is on the borderline between two levels. Only discrepancies between fully susceptible and high-level resistance should be examined closely.
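The distinction between one-level and two-level discrepancies can be made concrete with a small sketch. It assumes each algorithm's result has already been reduced to a three-level S/I/R call; the function names and the numeric ranking are illustrative assumptions, not part of HIValg.

```python
# Rank the three normalized levels so that differences can be counted.
SIR_RANK = {"S": 0, "I": 1, "R": 2}


def discrepancy(calls: dict[str, str]) -> int:
    """Return the largest gap (0, 1, or 2 levels) among the algorithms' calls.

    `calls` maps an algorithm name to its normalized 'S', 'I', or 'R' call
    for one drug.
    """
    ranks = [SIR_RANK[call] for call in calls.values()]
    return max(ranks) - min(ranks)


def needs_close_review(calls: dict[str, str]) -> bool:
    """True only when one algorithm reports S and another reports R."""
    return discrepancy(calls) == 2


print(needs_close_review({"ANRS": "S", "HIVDB": "I", "Rega": "S"}))  # False
print(needs_close_review({"ANRS": "S", "HIVDB": "R", "Rega": "I"}))  # True
```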

6.2 Algorithms

The following algorithms are available online in their XML form on the "Algorithm Specification Interface" page. They are all encoded using the ASI format, which is described on the same page.

  • ANRS: Agence Nationale de Recherches sur le SIDA 4,5.
  • HIVDB: The current version of the drug-resistance interpretation program on this site is referred to as the "HIVdb" algorithm.
  • Rega Institute: Courtesy of Professor Anne-Mieke Vandamme 7.

Each of the algorithms reports its results differently. The table below shows how the results of the algorithms are normalized for comparison by the program. Users of HIValg can select whether they prefer to receive output with the original interpretation or with the normalized interpretation ('SIR' option).

Algorithm | S | I | R
ANRS | Susceptible | Possible resistance | Resistance
HIVDB | Susceptible; Potential low-level resistance | Low-level resistance; Intermediate resistance | High-level resistance
Rega Institute | Susceptible GSS 1; Susceptible GSS 1.5 | Intermediate Resistant GSS 0.75; Intermediate Resistant GSS 0.5; Intermediate Resistant GSS 0.25 | Resistant GSS 0
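The normalization in the table can be expressed as a lookup, sketched below. The dictionary and function names are hypothetical, the level strings are copied from the table, and the placement of "Potential low-level resistance" under S is an assumption about how the table columns are grouped.

```python
# Illustrative mapping from each algorithm's original levels to S/I/R.
# Assumption: HIVDB "Potential low-level resistance" is grouped under S.
SIR_MAP = {
    "ANRS": {
        "Susceptible": "S",
        "Possible resistance": "I",
        "Resistance": "R",
    },
    "HIVDB": {
        "Susceptible": "S",
        "Potential low-level resistance": "S",
        "Low-level resistance": "I",
        "Intermediate resistance": "I",
        "High-level resistance": "R",
    },
    "Rega Institute": {
        "Susceptible GSS 1": "S",
        "Susceptible GSS 1.5": "S",
        "Intermediate Resistant GSS 0.75": "I",
        "Intermediate Resistant GSS 0.5": "I",
        "Intermediate Resistant GSS 0.25": "I",
        "Resistant GSS 0": "R",
    },
}


def normalize(algorithm: str, level: str) -> str:
    """Return 'S', 'I', or 'R' for an algorithm's original level.

    Raises KeyError if the algorithm or level is not in the table.
    """
    return SIR_MAP[algorithm][level]


print(normalize("HIVDB", "Intermediate resistance"))  # I
print(normalize("Rega Institute", "Resistant GSS 0"))  # R
```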

 
7. User-Submitted Algorithms / ASI

Selecting which algorithms appear in the output report can be done in two different ways. The first technique is to select from the list of algorithms made available on our servers. The second technique allows you to upload an algorithm from your machine, assuming that the algorithm is in proper ASI format as described in the Algorithm Specification Interface page (Betts BJ & Shafer RW J Clin Microbiol 2003). These techniques can be used in combination.

 
8. Program Code Downloads

The latest code is available for download.

 
9. References
  1. Baxter, J. D., J. M. Schapiro, C. A. Boucher, V. M. Kohlbrenner, D. B. Hall, J. R. Scherer, and D. L. Mayers. 2006. Genotypic changes in human immunodeficiency virus type 1 protease associated with reduced susceptibility and virologic response to the protease inhibitor tipranavir. J Virol 80:10794-801.
  2. Betts, B. J., and R. W. Shafer. 2003. Algorithm specification interface for human immunodeficiency virus type 1 genotypic interpretation. J Clin Microbiol 41:2792-4.
  3. Brun-Vezinet, F., D. Costagliola, M. A. Khaled, V. Calvez, F. Clavel, B. Clotet, R. Haubrich, D. Kempf, M. King, D. Kuritzkes, R. Lanier, M. Miller, V. Miller, A. Phillips, D. Pillay, J. Schapiro, J. Scott, R. Shafer, M. Zazzi, A. Zolopa, and V. DeGruttola. 2004. Clinically validated genotype analysis: guiding principles and statistical concerns. Antivir Ther 9:465-78.
  4. De Meyer, S., T. Vangeneugden, B. van Baelen, E. de Paepe, H. van Marck, G. Picchio, E. Lefebvre, and M. P. de Bethune. 2008. Resistance profile of darunavir: combined 24-week results from the POWER trials. AIDS Res Hum Retroviruses 24:379-88.
  5. de Oliveira, T., K. Deforche, S. Cassol, M. Salminen, D. Paraskevis, C. Seebregts, J. Snoeck, E. J. van Rensburg, A. M. Wensing, D. A. van de Vijver, C. A. Boucher, R. Camacho, and A. M. Vandamme. 2005. An automated genotyping system for analysis of HIV-1 and other microbial sequences. Bioinformatics 21:3797-800.
  6. Gifford, R., T. de Oliveira, A. Rambaut, R. E. Myers, C. V. Gale, D. Dunn, R. Shafer, A. M. Vandamme, P. Kellam, and D. Pillay. 2006. Assessment of automated genotyping protocols as tools for surveillance of HIV-1 genetic diversity. AIDS 20:1521-1529.
  7. Gifford, R. J., S. Y. Rhee, N. Eriksson, T. F. Liu, M. Kiuchi, A. K. Das, and R. W. Shafer. 2008. Sequence editing by Apolipoprotein B RNA-editing catalytic component-B and epidemiological surveillance of transmitted HIV-1 drug resistance. AIDS 22:717-25.
  8. Hammer, S. M., J. J. Eron, Jr., P. Reiss, R. T. Schooley, M. A. Thompson, S. Walmsley, P. Cahn, M. A. Fischl, J. M. Gatell, M. S. Hirsch, D. M. Jacobsen, J. S. Montaner, D. D. Richman, P. G. Yeni, and P. A. Volberding. 2008. Antiretroviral treatment of adult HIV infection: 2008 recommendations of the International AIDS Society-USA panel. JAMA 300:555-70.
  9. Huang, X., and J. Zhang. 1996. Methods for comparing a DNA sequence with a protein sequence. Comput Appl Biosci 12:497-506.
  10. Liu, T. F., and R. W. Shafer. 2006. Web resources for HIV type 1 genotypic-resistance test interpretation. Clin Infect Dis 42:1608-18.
  11. Myers, R. E., C. V. Gale, A. Harrison, Y. Takeuchi, and P. Kellam. 2005. A statistical model for HIV-1 sequence classification using the subtype analyser (STAR). Bioinformatics 21:3535-40.
  12. Rhee, S. Y., R. Kantor, D. A. Katzenstein, R. Camacho, L. Morris, S. Sirivichayakul, L. Jorgensen, L. F. Brigido, J. M. Schapiro, and R. W. Shafer. 2006. HIV-1 pol mutation frequency by subtype and treatment experience: extension of the HIVseq program to seven non-B subtypes. AIDS 20:643-51.
  13. Rozanov, M., U. Plikat, C. Chappey, A. Kochergin, and T. Tatusova. 2004. A web-based genotyping resource for viral sequences. Nucleic Acids Res 32:W654-9.
  14. Scherer, J., C. A. Boucher, J. Baxter, J. Schapiro, V. Kohlbrenner, and D. Hall. 2007. Improving the prediction of virological response to tipranavir: the development of a tipranavir weighted score [Abstract P3.4/07]. 11th European AIDS Conference, Madrid, Spain, October 24-27, 2007.
  15. Shafer, R. W., D. R. Jung, and B. J. Betts. 2000. Human immunodeficiency virus type 1 reverse transcriptase and protease mutation search engine for queries. Nat Med 6:1290-1292.
  16. Shafer, R. W., and J. M. Schapiro. 2008. HIV-1 Drug Resistance Mutations: an Updated Framework for the Second Decade of HAART. AIDS Rev 10:67-84.
  17. US Department of Health and Human Services Panel on Clinical Practices for Treatment of HIV Infection. 2007. Guidelines for the use of antiretroviral agents in HIV-1-infected adults and adolescents (The living document, January, 2008), http://aidsinfo.nih.gov/.
  18. Vingerhoets, J., M. Buelens, M. Peeters, G. Picchio, L. Pambuyzer, H. Van Barck, G. de Smedt, B. Woodfall, and M. P. de Bethune. 2007. Impact of baseline mutations on the virological response to TMC125 in the phase III clinical trials DUET-1 and DUET-2 [abstract 32]. Antivir Ther 12:S34.
 
10. Appendices

Appendix 1. Consensus B Sequences

The subtype B consensus sequence is derived from an alignment of subtype B sequences maintained at the Los Alamos HIV Sequence Database (hiv-web.lanl.gov). The consensus B sequence is therefore a commonly used reference sequence to which new sequences are compared. Files containing the consensus PR, consensus RT, and consensus IN are also available.

Consensus B Sequences | Amino Acids
Protease PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGI
GGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF
RT PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKI
GPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGL
KKKKSVTVLDVGDAYFSVPLDKDFRKYTAFTIPSINNETPGIRYQYNVLP
QGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRT
KIEELRQHLLRWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKD
SWTVNDIQKLVGKLNWASQIYAGIKVKQLCKLLRGTKALTEVIPLTEEAE
LELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLK
TGKYARMRGAHTNDVKQLTEAVQKIATESIVIWGKTPKFKLPIQKETWEA
WWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRET
KLGKAGYVTDRGRQKVVSLTDTTNQKTELQAIHLALQDSGLEVNIVTDSQ
YALGIIQAQPDKSESELVSQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDK
LVSAGIRKVL
integrase FLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAM
HGQVDCSPGIWQLDCTHLEGKIILVAVHVASGYIEAEVIPAETGQETAYF
LLKLAGRWPVKTIHTDNGSNFTSTTVKAACWWAGIKQEFGIPYNPQSQGV
VESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERI
VDIIATDIQTKELQKQITKIQNFRVYYRDSRDPLWKGPAKLLWKGEGAVV
IQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED


Appendix 2. Sample Data Sets

A small data set (N=10) has been compiled to provide users with a sample input for running our programs. To view the results for these sequences, copy and paste them into the input form.

A large data set (N=2055) is also available. We ask users to restrict the number of sequences they process at a time using our programs to 100, so this data set cannot be directly submitted to our programs.

A very large data set (N=5838) is available. Again, we ask users to restrict the number of sequences they process at a time using our programs to 100, so this data set cannot be directly submitted to our programs.
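Because submissions are limited to 100 sequences at a time, a larger data set has to be split before use. The sketch below shows one way to do this; it assumes the data set is in FASTA format (the format is not stated here), and the function names are hypothetical.

```python
from typing import Iterator


def read_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs.

    Assumes standard FASTA: a '>' header line followed by sequence lines.
    """
    records: list[tuple[str, str]] = []
    header = None
    chunks: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records: list, size: int = 100) -> Iterator[list]:
    """Yield consecutive groups of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_fasta(records: list[tuple[str, str]]) -> str:
    """Format records back into FASTA text for pasting into the input form."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)
```

For example, `[to_fasta(b) for b in batches(read_fasta(text))]` would give a list of FASTA strings, each holding no more than 100 sequences.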