Stanford University HIV Drug Resistance Database - A curated public database designed to represent, store, and analyze the divergent forms of data underlying HIV drug resistance.

Algorithm Specification Interface version 2 (ASI2)

Last updated on June 24, 2011

Table of Contents

  1. Introduction
  2. Algorithms in XML Format
  3. Language Developed for ASI
  4. Compiler
  5. Converting original ASI to ASI2
  6. References
  7. Appendices
 

1. Introduction

HIV drug resistance interpretation algorithms will continue to evolve with the publication of new data and the introduction of new antiretroviral inhibitors. The software needed to implement these algorithms should remain stable to allow direct comparison between emerging algorithms and their component rules.

ASI is a common platform that we have developed for coding genotypic interpretation algorithms. It consists of an XML format for specifying an algorithm and a compiler that transforms the XML into executable code. ASI lets HIV drug resistance experts develop, test, and modify genotypic interpretation rules without the assistance of a computer programmer, so they can focus on the rules rather than on the software that encodes them.


Update to ASI2

Our collaborators at Frontier Science & Technology Research Foundation, Inc. (FSTRF) updated the original ASI grammar to ASI2. The update keeps all the qualities and goals of the original ASI grammar and improves it in several ways:

  • The new MAX keyword was added to explicitly define whether to sum scores triggered in a SCORE FROM rule or take the maximum score at each position; previously the presence or absence of the global processing directive "RESTRICTED_POSITION_SCORING" would determine the behavior to follow. The MAX keyword is clearer, localized and flexible; clearer because it is explicit; localized since only the specified portion of the rule is affected instead of the entire algorithm; and flexible because it can be applied to mutations at the same or different positions.
  • Conditional statements can now be embedded in SCORE FROM rules so combination scoring rules are possible.
  • Scoring ranges can no longer overlap each other; the new compiler will throw an error if they do, so there are no ambiguities. In the previous ASI algorithm, level 1 was defined from -∞ to 10 and level 2 from 10 to 15, so an interpretation score of 10 fell into two levels and Sierra took the higher one (level 2).
  • SableCC (sablecc.org), a Java compiler compiler, is used to generate an object-oriented framework with strictly typed abstract syntax trees and tree walkers, on which our Java compiler is built. The only input to SableCC is the grammar definition shown in Section 3: Language Developed for ASI, which shortens the development cycle when the grammar definition changes.
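
The difference between summing scores and taking a MAX, described in the first item above, can be illustrated with a small Python sketch. It is not the ASI2 compiler: the function name and the list-of-pairs representation are assumptions made for illustration, each entry is written as a single mutation (101HN is expanded to 101H and 101N), and nested MAX groups are not handled. The scores are the position-101 values used in the conversion example later in this document.

```python
def total_score(items, mutations):
    """Score a SCORE FROM list against a set of observed mutations.

    items holds (mutation, points) pairs, or ("MAX", [pairs]) for a MAX group.
    Plain pairs are summed; a MAX group contributes only its highest
    triggered score (0 if nothing in the group is triggered).
    """
    total = 0
    for head, value in items:
        if head == "MAX":
            triggered = [points for mutation, points in value
                         if mutation in mutations]
            total += max(triggered, default=0)
        elif head in mutations:
            total += value
    return total


POSITION_101 = [("101P", 40), ("101E", 30), ("101H", 15),
                ("101N", 15), ("101Q", 5)]

# Every triggered score is added.
summed = [("98G", 10), ("100I", 40)] + POSITION_101
# Only the highest triggered score at position 101 is added.
with_max = [("98G", 10), ("100I", 40), ("MAX", POSITION_101)]

sample = {"101H", "101P"}
print(total_score(summed, sample))    # 55
print(total_score(with_max, sample))  # 40
```

Because MAX wraps only part of the list, 98G and 100I are still summed as before.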

 
2. Algorithms in XML Format

The following links provide information about ASI and about existing programs that use ASI technology:

1. ANRS XML HTML
2. HIVDB XML HTML
3. Rega Institute XML HTML

Each algorithm is completely contained in an XML document that adheres to the ASI2 Document Type Definition (DTD). The DTD can also be viewed; its text is shown in Appendix 1.

 
3. Language Developed for ASI

Drug resistance algorithms can be specified in an XML document that adheres to an XML Document Type Definition (DTD) that we have developed (and refer to as the ASI2 DTD). The DTD is available online and the text of the DTD is shown in Appendix 1. It provides the basic framework that an algorithm must adhere to.

At the heart of an algorithm is a <DRUG> clause. An example clause taken from the ANRS algorithm is shown below:


<DRUG>
    <NAME>d4T</NAME>
    <FULLNAME>stavudine</FULLNAME>
    <RULE>
        <CONDITION>
            65R OR 151M OR 69i
        </CONDITION>
        <ACTIONS>
            <LEVEL>3</LEVEL>
        </ACTIONS>
    </RULE>
</DRUG>

In this example, the algorithm designer wants the drug d4T to receive a level of 3 if mutation 65R or 151M is present, or if an insertion is present at position 69. (The designer would have previously specified what level 3 means; in ANRS it means "Resistance".) That is, if the condition statement evaluates to true, the corresponding action is taken. Note that the <CONDITION> clause of each <RULE> is free-form text that adheres to a language grammar we have developed for the purpose. This grammar is described below.
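
As a rough illustration of how such a clause can be read and applied, the Python sketch below parses the <DRUG> example with the standard library and evaluates its condition against a set of mutations. It is not the ASI compiler: the helper names are hypothetical, and the evaluator only understands a flat list of single mutations joined by OR, not the full grammar.

```python
import xml.etree.ElementTree as ET

DRUG_XML = """
<DRUG>
    <NAME>d4T</NAME>
    <FULLNAME>stavudine</FULLNAME>
    <RULE>
        <CONDITION>
            65R OR 151M OR 69i
        </CONDITION>
        <ACTIONS>
            <LEVEL>3</LEVEL>
        </ACTIONS>
    </RULE>
</DRUG>
"""


def evaluate_flat_or(condition, mutations):
    """Return True if any OR-separated term is in the mutation set.

    Only handles single terms joined by OR; it is not a parser for
    the full ASI2 grammar.
    """
    terms = [term.strip() for term in condition.split(" OR ")]
    return any(term in mutations for term in terms)


def drug_levels(drug_xml, mutations):
    """Return the drug name and the LEVEL of every rule that fires."""
    drug = ET.fromstring(drug_xml)
    levels = []
    for rule in drug.findall("RULE"):
        condition = rule.findtext("CONDITION").strip()
        if evaluate_flat_or(condition, mutations):
            levels.append(int(rule.findtext("ACTIONS/LEVEL")))
    return drug.findtext("NAME"), levels


print(drug_levels(DRUG_XML, {"151M", "184V"}))  # ('d4T', [3])
print(drug_levels(DRUG_XML, {"184V"}))          # ('d4T', [])
```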


The grammar for the ASI2 language is described in the table that follows using BNF notation.

Clause            Definition
statement         booleancondition |
                  scorecondition
booleancondition  condition condition2*;
condition         l_par booleancondition r_par |
                  residue |
                  excludestatement |
                  selectstatement
condition2        logicsymbol condition;
logicsymbol       and |
                  or
residue           [originalaminoacid]:amino_acid? integer [mutatedaminoacid]:amino_acid+ |
                  not [originalaminoacid]:amino_acid? integer [mutatedaminoacid]:amino_acid+ |
                  [originalaminoacid]:amino_acid? integer l_par not [mutatedaminoacid]:amino_acid+ r_par
excludestatement  exclude residue
selectstatement   select selectstatement2
selectstatement2  exactly integer from l_par selectlist r_par |
                  atleast integer from l_par selectlist r_par |
                  notmorethan integer from l_par selectlist r_par |
                  atleast [atleastnumber]:integer logicsymbol notmorethan [notmorethannumber]:integer from l_par selectlist r_par
selectlist        residue listitems*
listitems         comma residue
scorecondition    score from l_par scorelist r_par
scorelist         scoreitem scoreitems*
scoreitem         booleancondition mapper min? number |
                  max l_par scorelist r_par

Legend
|                 separates possibilities
?                 zero or one
+                 one or more
*                 zero or more
l_par             (
r_par             )
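
To make the first alternative of the residue production concrete, here is a minimal Python sketch that splits a residue token into its optional original amino acid, its position, and its mutated amino acids. The regular expression, the function names, and the choice to treat any letter as an amino_acid token (including the lowercase i used for insertions in the examples) are assumptions; the table above does not define that token class. The two NOT forms of the production are not covered.

```python
import re

# Optional original amino acid, position, one or more mutated amino acids.
RESIDUE = re.compile(r"^([A-Z])?(\d+)([A-Za-z]+)$")


def parse_residue(token):
    """Split a residue token such as 'T69D' or '215FY' into its parts."""
    match = RESIDUE.match(token)
    if match is None:
        raise ValueError(f"not a residue: {token!r}")
    original, position, mutated = match.groups()
    return original, int(position), set(mutated)


def residue_matches(token, sample):
    """True if the sample carries any listed amino acid at the position.

    sample maps a position to the set of amino acids observed there.
    """
    _, position, mutated = parse_residue(token)
    return bool(mutated & sample.get(position, set()))


sample = {215: {"Y"}, 69: {"i"}}
print(parse_residue("T69D"))               # ('T', 69, {'D'})
print(residue_matches("215FY", sample))    # True
print(residue_matches("184VI", sample))    # False
```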

Here is a link to the previous version of the ASI

Some examples of valid clauses in the language follow. The language is largely self-explanatory because it reads much like an English sentence.

  1. 151M OR 69i
  2. SELECT ATLEAST 2 FROM (41L, 67N, 70R, 210W, 215FY, 219QE)
  3. SELECT ATLEAST 2 AND NOTMORETHAN 2 FROM (41L, 67N, 70R, 210W, 215FY, 219QE)
    What this really means is "choose exactly two from the following list". The EXACTLY keyword in the ASI2 grammar above expresses the same rule more compactly: SELECT EXACTLY 2 FROM (41L, 67N, 70R, 210W, 215FY, 219QE).
  4. 215FY AND NOT 184VI
    For this to be true, a mutation F or Y must be present at position 215 and a V or I mutation must not be present at position 184.
  5. SCORE FROM (65R => 20, 74V => 20, 184VI => 20)
    Scores are added for each score that is triggered by a matching mutation, so if 65R and 74V were present the total score would be 40. A <SCORERANGE> section of the <ACTIONS> would indicate how scores were to be mapped to resistance levels.
  6. 151M AND EXCLUDE 69i
    The mutation 69i must not be present for this clause to evaluate to true. That is, 151M would have to be present along with an arbitrary number of other mutations as long as one was not 69i.
  7. 69(NOT TDN)
    Any mutation at position 69 other than D or N (T69D, T69N). The consensus amino acid T is listed as well, so that only atypical mutations at this position are picked up.
  8. 215F OR 215Y
    A long way of saying 215FY.
  9. MAX ( 101P => 40, 101E => 30, 101HN => 15, 101Q => 5 )
    Use the maximum score in position 101 among scores triggered by a matching mutation: 40(P), 30(E), 15(H), 15(N), 5(Q)
  10. (184V AND 115F) => 20
    A mutation V at position 184 and a mutation F at position 115 must both be present for the score of 20 to be triggered.
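
The SELECT examples above (2 and 3) amount to counting how many list items are matched and comparing the count with the stated bounds. The Python sketch below shows that idea; the function names are hypothetical, mutations are represented as a set of strings such as "215Y", and it assumes that a list item like 215FY counts once even if both F and Y are present.

```python
import re


def matches(residue, mutations):
    """True if any listed amino acid is present at the residue's position.

    mutations is a set of single-mutation strings such as {"41L", "215Y"}.
    """
    position, amino_acids = re.fullmatch(r"(\d+)([A-Za-z]+)", residue).groups()
    return any(position + aa in mutations for aa in amino_acids)


def select(residues, mutations, at_least=0, not_more_than=None):
    """Check that the number of matched list items lies within the bounds."""
    hits = sum(matches(residue, mutations) for residue in residues)
    if not_more_than is None:
        not_more_than = len(residues)
    return at_least <= hits <= not_more_than


ITEMS = ["41L", "67N", "70R", "210W", "215FY", "219QE"]
sample = {"41L", "215Y", "184V"}

# SELECT ATLEAST 2 FROM (...)
print(select(ITEMS, sample, at_least=2))                             # True
# SELECT ATLEAST 2 AND NOTMORETHAN 2 FROM (...)
print(select(ITEMS, sample, at_least=2, not_more_than=2))            # True
print(select(ITEMS, sample | {"70R"}, at_least=2, not_more_than=2))  # False
```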

 
4. Compiler

As in the previous version, once an algorithm is specified in the ASI2 format it can be compiled into an executable set of routines. These routines accept a set of mutations as input and produce a list of drug resistance levels as output, along with other output (such as comment strings) as directed by the algorithm. As mentioned in the introduction, SableCC (a Java compiler compiler) was used to create a Java framework; the previous compiler was written in Perl.

 
5. Converting an algorithm in the original ASI (ASI_OLD) format to ASI2
There are 4 things you must change in an algorithm written in the original ASI grammar to make it comply with the ASI2 grammar and work with our HIValg tools.
  1. Delete processing directives, since they are no longer used. Processing directives look like the example below:
    <PROCESSING_DIRECTIVES>
        <RESTRICTED_POSITION_SCORING/>
    </PROCESSING_DIRECTIVES>
    

  2. Use the new MAX keyword to define the set of mutations among which to choose the highest score since the RESTRICTED_POSITION_SCORING processing directive is no longer used.
    For example, the old ASI (ASI_OLD) rule needs to be rewritten as shown below (ASI2):
    A) ASI_OLD   <CONDITION>SCORE FROM ( 98G => 10, 100I => 40,
                     101P => 40, 101E => 30, 101HN => 15, 101Q => 5 )
                 </CONDITION>
    B) ASI2      <CONDITION>SCORE FROM ( 98G => 10, 100I => 40,
                     MAX (101P => 40, 101E => 30, 101HN => 15, 101Q => 5) )
                 </CONDITION>

    In this example, if a mutation list containing just 101HP were submitted to the new ASI engine:
    A) with the original, unchanged ASI_OLD rule, it would score 55, the sum of both scores, 40 (101P) and 15 (101H);
    B) with the rewritten ASI2 rule, it would score 40, the higher of 40 and 15.

  3. Add gene information to link the drug class to a gene. The examples below link the protease and RT genes to their respective drug classes.
        	<GENE_DEFINITION>
        	   <NAME>PR</NAME>
        	   <DRUGCLASSLIST>PI</DRUGCLASSLIST>
        	</GENE_DEFINITION>
    
        	<GENE_DEFINITION>
        	   <NAME>RT</NAME>
        	   <DRUGCLASSLIST>NNRTI, NRTI</DRUGCLASSLIST>
        	</GENE_DEFINITION>
    

  4. You must rewrite your scoring ranges if your definitions overlap, since overlaps are no longer allowed. For example, the global range in the old ASI rule (ASI_OLD) had to be rewritten as shown below (ASI2):
    ASI_OLD  <GLOBALRANGE>
                   (-INF TO 10  => 1, 
                      10 TO 15  => 2, 
                      15 TO 30  => 3,
                      30 TO 60  => 4,
                      60 TO INF => 5)
             </GLOBALRANGE>
    
    ASI2    <GLOBALRANGE>
                (-INF TO 9 => 1, 
                 10 TO 14  => 2, 
                 15 TO 29  => 3,
                 30 TO 59  => 4,
                 60 TO INF => 5)
            </GLOBALRANGE>
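
The overlap rule from item 4 can be checked mechanically. The Python sketch below parses the text of a <GLOBALRANGE>, rejects overlapping ranges, and maps a score to its level. It is an illustration only, not the ASI2 compiler: the function names and regular expression are assumptions, and both bounds of each range are treated as inclusive, which is consistent with the rewritten ASI2 ranges above (9 followed by 10) but is not stated by the grammar.

```python
import math
import re

RANGE = re.compile(r"(-?INF|-?\d+)\s+TO\s+(-?INF|-?\d+)\s*=>\s*(\d+)")


def to_number(text):
    """Convert 'INF', '-INF', or an integer literal to a number."""
    if text.endswith("INF"):
        return -math.inf if text.startswith("-") else math.inf
    return int(text)


def parse_ranges(text):
    """Return sorted (low, high, level) tuples; raise on any overlap."""
    ranges = sorted((to_number(low), to_number(high), int(level))
                    for low, high, level in RANGE.findall(text))
    for (_, prev_high, _), (low, _, _) in zip(ranges, ranges[1:]):
        if low <= prev_high:
            raise ValueError(f"overlapping score ranges at {low}")
    return ranges


def level_for(score, ranges):
    """Return the level whose inclusive range contains the score."""
    for low, high, level in ranges:
        if low <= score <= high:
            return level
    raise ValueError(f"no range covers score {score}")


ASI_OLD_RANGE = ("(-INF TO 10 => 1, 10 TO 15 => 2, 15 TO 30 => 3, "
                 "30 TO 60 => 4, 60 TO INF => 5)")
ASI2_RANGE = ("(-INF TO 9 => 1, 10 TO 14 => 2, 15 TO 29 => 3, "
              "30 TO 59 => 4, 60 TO INF => 5)")

ranges = parse_ranges(ASI2_RANGE)
print(level_for(10, ranges))  # 2
print(level_for(55, ranges))  # 4

try:
    parse_ranges(ASI_OLD_RANGE)
except ValueError as error:
    print(error)  # overlapping score ranges at 10
```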
    
 
6. References
  1. Hirsch MS, Brun-Vezinet F, D'Aquila RT, et al. Antiretroviral drug resistance testing in adult HIV-1 infection: recommendations of an International AIDS Society-USA Panel. JAMA 2000;283:2417-2426.
  2. British HIV Association. British HIV Association (BHIVA) guidelines for the treatment of HIV-infected adults with antiretroviral therapy. HIV Med 2001;2(4):276-313.
  3. EuroGuidelines Group for HIV Resistance. Clinical and laboratory guidelines for the use of HIV-1 drug resistance testing as part of treatment management: recommendations for the European setting. AIDS 2001;15:309-320.
  4. US Department of Health and Human Services Panel on Clinical Practices for Treatment of HIV Infection A. Guidelines for the use of antiretroviral agents in HIV-1-infected adults and adolescents (The living document, February 4, 2002), http://www.aidsinfo.nih.gov/guidelines/, 2002.
 
7. Appendices

Appendix 1. ASI2 DTD


<!-- 
This DTD indicates the format the various drug resistance algorithms must use.
-->

<!ELEMENT ALGORITHM (ALGNAME, ALGVERSION?, DEFINITIONS, DRUG*, 
                     MUTATION_COMMENTS?)>

<!-- *************************************************************** -->
<!ELEMENT ALGNAME (#PCDATA)>
<!-- *************************************************************** -->

<!-- *************************************************************** -->
<!ELEMENT ALGVERSION (#PCDATA)>
<!-- *************************************************************** -->

<!-- *************************************************************** -->
<!ELEMENT DEFINITIONS (GENE_DEFINITION*, LEVEL_DEFINITION*, DRUGCLASS*, 
                       GLOBALRANGE?, COMMENT_DEFINITIONS?)>
<!ELEMENT GENE_DEFINITION (NAME, DRUGCLASSLIST?)>
<!ELEMENT DRUGCLASSLIST (#PCDATA)>  
<!ELEMENT LEVEL_DEFINITION (ORDER, ORIGINAL, SIR)>
<!ELEMENT ORDER (#PCDATA)>
<!ELEMENT ORIGINAL (#PCDATA)>
<!ELEMENT SIR (#PCDATA)>
<!ELEMENT DRUGCLASS (NAME, DRUGLIST)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT DRUGLIST (#PCDATA)>
<!ELEMENT GLOBALRANGE (#PCDATA)>
<!ELEMENT COMMENT_DEFINITIONS (COMMENT_STRING+)>
<!ELEMENT COMMENT_STRING (TEXT)>
<!ATTLIST COMMENT_STRING   id ID #REQUIRED>
<!ELEMENT TEXT (#PCDATA)>
<!-- *************************************************************** -->

<!-- *************************************************************** -->
<!ELEMENT DRUG (NAME, FULLNAME?, RULE*)>
<!ELEMENT FULLNAME (#PCDATA)>
<!ELEMENT RULE (CONDITION, ACTIONS)>
<!ELEMENT CONDITION (#PCDATA)>
<!ELEMENT ACTIONS (LEVEL?, SCORERANGE?, COMMENT?)>
<!ELEMENT LEVEL (#PCDATA)>
<!ELEMENT SCORERANGE (#PCDATA | USE_GLOBALRANGE)*>
<!ELEMENT USE_GLOBALRANGE EMPTY>
<!ELEMENT COMMENT EMPTY>
<!ATTLIST COMMENT   ref IDREF #REQUIRED>
<!-- *************************************************************** -->

<!-- *************************************************************** -->
<!ELEMENT MUTATION_COMMENTS (GENE*)>
<!ELEMENT GENE (NAME, RULE+)>
<!-- *************************************************************** -->
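
As a small illustration of reading a document that follows this DTD, the Python sketch below extracts the GENE_DEFINITION entries from a DEFINITIONS element and builds a lookup from drug class to gene. The function name is hypothetical, the sample values are the PR and RT definitions from Section 5, and the comma-separated reading of DRUGCLASSLIST is an assumption based on that example. The standard-library parser used here does not validate against the DTD.

```python
import xml.etree.ElementTree as ET

DEFINITIONS_XML = """
<DEFINITIONS>
    <GENE_DEFINITION>
        <NAME>PR</NAME>
        <DRUGCLASSLIST>PI</DRUGCLASSLIST>
    </GENE_DEFINITION>
    <GENE_DEFINITION>
        <NAME>RT</NAME>
        <DRUGCLASSLIST>NNRTI, NRTI</DRUGCLASSLIST>
    </GENE_DEFINITION>
</DEFINITIONS>
"""


def gene_for_drug_class(definitions_xml):
    """Map each drug class name to the gene that lists it."""
    root = ET.fromstring(definitions_xml)
    mapping = {}
    for gene in root.findall("GENE_DEFINITION"):
        name = gene.findtext("NAME").strip()
        # DRUGCLASSLIST is optional in the DTD, so default to empty text.
        classes = gene.findtext("DRUGCLASSLIST", default="")
        for drug_class in classes.split(","):
            if drug_class.strip():
                mapping[drug_class.strip()] = name
    return mapping


print(gene_for_drug_class(DEFINITIONS_XML))
# {'PI': 'PR', 'NNRTI': 'RT', 'NRTI': 'RT'}
```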